HyperMTA Processor Specifications

This is a preliminary release and is not complete.


#######################################################################
#######################################################################

User/System Access

Registers:
R0-R31: General Purpose Integer Registers
F0-F31: Floating Point Registers
C0(F0,F1)-C15(F30,F31): Complex Floating Point Registers

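The register list above suggests that each complex register Cn overlays the
float pair F(2n), F(2n+1). A minimal Python sketch of that index mapping
follows. The function name is hypothetical, and the specification does not say
which half of the pair holds the real part, so the sketch only returns the two
indices.

```python
def complex_to_float_regs(c_index):
    """Return the two F register indices overlaid by complex register C<c_index>."""
    if not 0 <= c_index <= 15:
        raise ValueError("complex register index must be in 0..15")
    # C0 -> (F0, F1), C1 -> (F2, F3), ..., C15 -> (F30, F31)
    return 2 * c_index, 2 * c_index + 1
```

For example, complex_to_float_regs(15) gives the pair (30, 31), matching
C15(F30,F31) in the list.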
Instruction Format and Next Instruction Data Placement:
| 42 Bits(I) | 42 Bits(I) | 42 Bits(I) | 2 Bits(11) |

| 64 Bits(D) | 20 Bits(D) | 42 Bits(I) | 2 Bits(01) |
  NID0(I/F)    NID1(I)

| 42 Bits(I) | 42 Bits(I) | 42 Bits(D) | 2 Bits(10) |
                            NID0(I)

| 126 Bits(D) | 2 Bits(00) | (Complex)
  NID0(C)
// Each float is stored as 63 bits; its least significant bit is set to zero
// or one by the instruction.

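The four layouts above all add up to 128 bits, with the last two bits acting
as a format tag. The Python sketch below splits a 128-bit value into its
fields. It assumes the fields are packed most-significant-first in the order
drawn, that the tag occupies the two lowest bits, and that the 63 stored bits
of each float are its upper bits. None of this is stated in the specification,
and the function and key names are hypothetical.

```python
MASK42 = (1 << 42) - 1
MASK63 = (1 << 63) - 1

def decode_bundle(bundle):
    """Split a 128-bit instruction into fields according to its 2-bit tag."""
    if not 0 <= bundle < (1 << 128):
        raise ValueError("bundle must fit in 128 bits")
    tag = bundle & 0b11
    body = bundle >> 2  # the remaining 126 bits
    if tag == 0b11:  # three 42-bit instruction molecules
        return {"tag": tag,
                "molecules": [(body >> 84) & MASK42,
                              (body >> 42) & MASK42,
                              body & MASK42]}
    if tag == 0b01:  # 64-bit NID0, 20-bit NID1, one molecule
        return {"tag": tag,
                "nid0": (body >> 62) & ((1 << 64) - 1),
                "nid1": (body >> 42) & ((1 << 20) - 1),
                "molecules": [body & MASK42]}
    if tag == 0b10:  # two molecules followed by a 42-bit NID0
        return {"tag": tag,
                "molecules": [(body >> 84) & MASK42, (body >> 42) & MASK42],
                "nid0": body & MASK42}
    # tag == 0b00: 126 data bits holding two 63-bit float payloads
    return {"tag": tag,
            "nid0_complex": ((body >> 63) & MASK63, body & MASK63)}

def expand_float63(payload, lsb):
    """Rebuild a 64-bit pattern from a 63-bit payload plus the LSB chosen by the instruction."""
    return ((payload & MASK63) << 1) | (lsb & 1)
```

Decoding a value whose low two bits are 11 yields three molecules, while a tag
of 00 yields two 63-bit payloads that expand_float63 widens back to 64 bits.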
Next Instruction Data:
Data inlined within the next instruction. Because the data arrives with the
instruction stream instead of through a separate memory load, this is one of
the ways in which we hide memory latency.

Internal VLIW Branching:
This is another way to hide latency.

The following scheme reduces branch misprediction penalties, which are more
costly in this system:

| Conditional Branch | ALU OP | ALU OP |
If the condition is true, the next instruction executes as:
| ALU | ALU | IGNORE THIS ALU OP |
else
| IGNORE THIS | ALU | ALU OP |

Both of those are the same instruction. The branch instruction contains
a mask selecting which molecules of the next instruction to execute. In a
standard pipelined system all three can be executed, and in the write back
stage only the selected molecules are written back. This can eliminate small
loops. A special return instruction then returns execution to normal, with all
molecules executing. As in the example, it is possible to have shared
molecules which are executed either way.

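The masking behaviour described above can be modelled in a few lines of
Python. This is a sketch under stated assumptions, not the hardware design. It
treats each three-molecule instruction as an "ATOM" tuple, models IVB as its
own step (the specification draws it as a molecule inside an instruction), and
assumes the else path is selected by a second IVB with the opposite condition.
The tuple format, the flag names and the run function are all hypothetical.

```python
ALL_SLOTS = frozenset({0, 1, 2})

def run(stream, flags):
    """Execute a stream of pseudo-instructions and return the molecules written back."""
    mask = ALL_SLOTS
    committed = []
    for op in stream:
        kind = op[0]
        if kind == "IVB":
            # ("IVB", condition_name, slots): take the mask only if the condition holds
            _, condition, slots = op
            if flags.get(condition, False):
                mask = frozenset(slots)
        elif kind == "IVRET":
            # return to normal execution: every molecule is written back again
            mask = ALL_SLOTS
        else:
            # ("ATOM", [m0, m1, m2]): all three execute, only masked slots write back
            _, molecules = op
            committed.extend(m for slot, m in enumerate(molecules) if slot in mask)
    return committed

STREAM = [
    ("IVB", "cond", {0, 1}),      # true path keeps slots 0 and 1
    ("IVB", "not_cond", {1, 2}),  # false path keeps slots 1 and 2
    ("ATOM", ["a", "b", "c"]),    # slot 1 ("b") is shared by both paths
    ("IVRET",),
    ("ATOM", ["d", "e", "f"]),
]
```

With flags {"cond": True} the first instruction writes back "a" and "b"; with
{"not_cond": True} it writes back "b" and "c". In both cases the instruction
after IVRET writes back all three molecules.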
ISA:

Arithmetic(64 bit):
ADD reg,reg,reg/imm
SUB reg,reg,reg/imm
MUL reg,reg,reg/imm
MULU reg,reg,reg/imm
DIV reg,reg,reg/imm
DIVU reg,reg,reg/imm
MOD reg,reg,reg/imm
MODU reg,reg,reg/imm
LMUL reg,reg,reg,reg/imm // Long multiply
LMULU reg,reg,reg,reg/imm // Long multiply unsigned

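LMUL and LMULU take four operands, which suggests a 64x64 multiply whose
128-bit product is split across two destination registers. The Python sketch
below shows that arithmetic. The (high, low) result order and the helper names
are assumptions; the specification does not define the operand roles.

```python
MASK64 = (1 << 64) - 1

def to_signed64(value):
    """Interpret a 64-bit pattern as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value

def lmulu(a, b):
    """Unsigned 64x64 -> 128-bit multiply, returned as (high, low) 64-bit halves."""
    product = (a & MASK64) * (b & MASK64)
    return product >> 64, product & MASK64

def lmul(a, b):
    """Signed 64x64 -> 128-bit multiply, returned as (high, low) two's-complement halves."""
    product = (to_signed64(a) * to_signed64(b)) & ((1 << 128) - 1)
    return product >> 64, product & MASK64
```

The two variants differ only in how the inputs are interpreted: the all-ones
pattern is 2^64 - 1 for lmulu but -1 for lmul, so the high halves of the
products differ.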
Logic(64 bit):
OR reg,reg,reg/imm
AND reg,reg,reg/imm
XOR reg,reg,reg/imm
NOT reg,reg
SHL reg,reg,reg/imm
SHR reg,reg,reg/imm
ROL reg,reg,reg/imm
ROR reg,reg,reg/imm
PCNT reg,reg
PCNTZ reg,reg
PCNTC reg,reg
CHG reg,reg
SB reg,imm // Set Bit
CB reg,imm // Clear Bit
TB reg,imm // Toggle Bit

Floating Point(64 Bit):
FADD reg,reg,reg
FSUB reg,reg,reg
FMUL reg,reg,reg
FDIV reg,reg,reg
FMOD reg,reg,reg
FABS reg,reg
FNEG reg,reg // Make Negative
FPOS reg,reg // Make Positive
FTSIGN reg,reg // Toggle Sign
FSQ reg,reg
FCMP reg,reg
FRND reg // Random Generator
FPI reg // Load PI
FE reg // Load E
FZERO reg // Load ZERO
FONE reg // Load ONE
FFLOOR reg,reg
FCEIL reg,reg
FINV reg,reg // 1/reg

Complex(128 Bit):
CADD reg,reg,reg
CSUB reg,reg,reg
CMUL reg,reg,reg
CDIV reg,reg,reg
CMOD reg,reg,reg // Do we really need this? I don't think so.
CSQ reg,reg
CCMP reg,reg // ?
CI reg // Load I

Branch: // Avoid if possible; use internal VLIW branching
JMP rel
JMP reg
JMP{condition} rel
JMP{condition} reg
CALL rel
CALL reg
CALL{condition} rel
CALL{condition} reg
CALL [reg+8*cccc]
CALL{condition} [reg+8*cccc]
RETURN
RETURN{condition}

Internal VLIW Branching:
// Selects which molecules of each atom execute until a return is reached
// This is another way to hide memory latency
IVB{condition} moleculemask(0,1,2, or any combination)
IVRET

Interrupt:
THROW reg/imm
RTI // Return from Interrupt

Data Movement:
MOV reg,reg // Move
MOVS reg,sreg // Move Special
MOVS sreg,reg
PREFETCH // Data Prefetch
PREFETCHI // Instruction Prefetch
LOADB(U)
LOADW(U)
LOADD(U)
LOADQ(U)
STOREB
STOREW
STORED
STOREQ
LOADF // Load/Store Float
STOREF
LOADC // Load/Store Complex
STOREC
LOADNID // Load from Next Instruction Data
LOADFNID // Load Float from Next Instruction Data
LOADCNID // Load Complex from Next Instruction Data
EXTRACT reg(dest),reg(src),imm(start),imm(stop)
DEPOSITE reg(dest),reg(src),reg(srcb),imm(start),imm(stop)

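EXTRACT and DEPOSITE look like bit-field operations over the range start..stop.
The Python sketch below shows one plausible reading. It assumes bit 0 is the
least significant bit, that start and stop are inclusive with start <= stop,
that EXTRACT right-justifies the field, and that DEPOSITE copies src with the
field replaced by the low bits of srcb. The specification confirms none of
these details, and the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1

def _field_mask(start, stop):
    """Mask of (stop - start + 1) one bits, right-justified."""
    if not 0 <= start <= stop <= 63:
        raise ValueError("need 0 <= start <= stop <= 63")
    return (1 << (stop - start + 1)) - 1

def extract(src, start, stop):
    """Return bits start..stop of src, shifted down to bit 0."""
    return ((src & MASK64) >> start) & _field_mask(start, stop)

def deposit(src, srcb, start, stop):
    """Return src with bits start..stop replaced by the low bits of srcb."""
    mask = _field_mask(start, stop)
    cleared = (src & MASK64) & ~(mask << start)
    return (cleared | ((srcb & mask) << start)) & MASK64
```

Under these assumptions extract(0xABCD, 4, 11) yields 0xBC, and
deposit(0xFFFF, 0, 4, 11) clears the same field to give 0xF00F.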
System:
TLBR reg(threadid),reg(tlbvalueh),reg(tlbvaluel)
TLBW reg(threadid),reg(tlbvalueh),reg(tlbvaluel)

Interrupts: -- Avoid this unless absolutely necessary
THROW reg/imm(vector) // Throw Exception
RETI // Return from Interrupt

System:
IFENCE // Instruction Fence
DFENCE // Data Fence
REGISTER reg(threadptr),imm(interrupt vector) // Registers an interrupt
SYSCALL // Syscall (Pauses Current Stream/Flags for Service)

Process Management: // Dispatched through MP Bus (8 Threads = 1 Process)
PROCESS.LOAD reg(addrptr),reg(processorid:processid)
PROCESS.STORE reg(addrptr),reg(processorid:processid)
PROCESS.START reg(processorid:processid)
PROCESS.STOP reg(processorid:processid)

Thread Management: // Dispatched through MP Bus
THREAD.LOAD reg(addrptr),reg(processorid:threadid) // Loads a thread's state
THREAD.STORE reg(addrptr),reg(processorid:threadid) // Saves a thread's state
THREAD.START reg(processorid:threadid) // Continues execution of a thread
THREAD.STOP reg(processorid:threadid) // Stops execution of a thread
BREAK // Debugger Support

Processor Management: // Dispatched through MP Bus
PROCESSOR.START reg(processorid) // Start the processor
PROCESSOR.STOP reg(processorid) // Stop the processor
PROCESSOR.PAUSE reg(processorid) // Pause a processor and all its streams
PROCESSOR.CONTINUE reg(processorid) // Resume a processor from pause
PROCESSOR.RESET reg(processorid) // Restart a processor
PROCESSOR.PING reg(processorid),reg(result/hop count) // Pings a processor
// result is the number of hops to the processor, or 0 if it does not exist

Processor IDs:
0000: Startup
0001: Master Processor (OS Only)
0002-FFFE: Slave Processors
FFFF: Broadcast ID

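The ID ranges above can be expressed as a small classifier. This Python sketch
assumes the IDs are 16-bit values written in hexadecimal, as the FFFE and FFFF
entries suggest. The function name and the returned labels are hypothetical.

```python
def classify_processor_id(processor_id):
    """Map a 16-bit processor ID to the role listed in the ID table."""
    if not 0x0000 <= processor_id <= 0xFFFF:
        raise ValueError("processor ID must fit in 16 bits")
    if processor_id == 0x0000:
        return "startup"
    if processor_id == 0x0001:
        return "master"
    if processor_id == 0xFFFF:
        return "broadcast"
    return "slave"  # everything in 0002-FFFE
```

For instance, classify_processor_id(0x0002) and classify_processor_id(0xFFFE)
both report a slave processor, the two ends of the slave range.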
Routing:
           |             |               |
           |             |               |
           |             |               |
           |             |               |
-----------1-------------2---------------3------------
           |             |               |
           |             |               |
           |             |               |
           |             |               |
-----------4-------------5 NO CONNECTION 6------------
           |             |               |
           |             |               |
           |             |               |
           |             |               |
-----------7-------------8---------------9------------
           |             |               |
           |             |               |
           |             |               |
           |             |               |

Each router automatically keeps track of processor IDs and their routing keys,
and each router tries to route to a specified processor by the best available
path. When a processor is assigned a processor ID it automatically tells the
router its ID, and from then on the router builds routing key tables as data
transfers occur. Routers also buffer memory transfers and cache for their own
memory banks. The routing processors must be capable of sustaining 1 memory
read/write to each processor every clock cycle. Instructions will have a small
special buffer so that small loops can be made without any memory access
penalty. (That is, loops not implemented with Internal VLIW Branching.)

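To make the hop counts in the routing diagram concrete, the Python sketch below
models the nine numbered nodes and the links drawn between them, with the 5-6
link absent. It uses a breadth-first search to count hops. That is a stand-in
for illustration only: the routers described above learn their tables from
traffic, and the specification does not name a path-finding algorithm. The
links that leave the edge of the diagram are ignored, and all names are
hypothetical.

```python
from collections import deque

LINKS = [
    (1, 2), (2, 3), (4, 5), (7, 8), (8, 9),          # horizontal (no 5-6 link)
    (1, 4), (2, 5), (3, 6), (4, 7), (5, 8), (6, 9),  # vertical
]

def build_adjacency(links):
    """Turn an undirected link list into a node -> neighbours mapping."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def hop_count(adjacency, source, target):
    """Hops from source to target; 0 if either node is unknown or unreachable.

    A self-ping also returns 0 here; the specification does not cover that case.
    """
    if source not in adjacency or target not in adjacency:
        return 0
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return distance[node]
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return 0
```

Because 5 and 6 are not directly connected, traffic between them has to detour
through 2 and 3 or through 8 and 9, which takes three hops instead of one.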
I/O Interfacing:
There are memory-based I/O chips connected to the memory routing network.
They are able to throw interrupts by signalling processors through the MP
Bus that a service request is pending.

CPU Bus Interface:
Consists of the MP Bus interface, which connects to the microkernel RISC
processor, and the memory I/O interface, which is 128 bits wide and carries
data transfers.

#######################################################################
#######################################################################
MicroKernel Support Processor's ISA (Small RISC core) -- Incomplete
This microprocessor runs part of the OS and manages the MP Bus.

Arithmetic:
ADD
SUB
SHR
SHL
ROR
ROL
RND // Random Number Generator
Arguments: reg,reg,reg
Arguments: reg,reg,imm16

Logic:
OR
AND
XOR
NOT
Arguments: reg,reg,reg

Memory:
LB/LW/LD(S)
SB/SW/SD
Arguments: reg,[reg+imm16]

Branch:
BEQ(L)
BNE(L)
BZ(L)
BNZ(L)
BC(L)
BNC(L)
J(L)
JR(L)

Interrupts/Special:
NOP // No Operation

MP(MultiProcessing) Interconnect:
MPIREAD // Write Buffer
MPIWRITE // Read Buffer
MPIREQ? // Branch on Request Pending

Threads:
TSREQ? // Branch on Thread Service Request

Local Processor Manipulation:
PSTOP // Processor
PSTART
TSTOP reg(threadid) // Thread