earlz |
==TinyCPU==

TinyCPU is an 8-bit processor designed to be small, yet fairly fast.

Goals:

The goal of TinyCPU is a small 8-bit processor that can be embedded with minimal logic, yet is fast
enough to do what's needed. With that in mind, I try to lay out instructions so that they are trivial
to decode. For instance, nearly all ALU opcodes fit within 2 opcode groups, and the ALU is arranged so
that no translation is needed to decode these groups. It is also designed to be fast. Because XST
failed to synthesize every multi-port register file I threw at it, I decided to keep things braindead
simple and just provide a port for every register. This means every register can be accessed at the
same time, so I don't have to worry about how many registers an opcode touches, which in turn enables
very rich opcodes. Also, with the standard opcode format, decoding should hopefully be a breeze,
involving only 2 or 3 states.

Features:

1. Single clock cycle for all instructions without memory access
2. Two clock cycles for memory access instructions
3. 7 general purpose registers arranged as 2 banks of 4 registers, as well as 2 fixed registers
4. IP and SP are treated as normal registers, enabling very intuitive opcodes such as "push and move" without polluting the opcode space
5. Able to use up to 255 input and output ports
6. Fixed opcode size of 2 bytes
7. Capable of addressing up to 65536 bytes of memory with 4 segment registers for "extended" memory accesses
8. Conditional execution is built into every opcode
9. Von Neumann machine, i.e. data is code and vice versa
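
To make feature 7 a bit more concrete: 256-byte segments and 8-bit segment registers reach 65536 bytes if the segment value supplies the high byte of a 16-bit address. The README does not state the exact formula, so the Python sketch below assumes `address = segment * 256 + offset`. The names `SEGMENT_SIZE` and `physical_address` are illustrative only and are not part of TinyCPU.

```python
SEGMENT_SIZE = 256  # bytes per segment, as stated in this README


def physical_address(segment: int, offset: int) -> int:
    """Combine an 8-bit segment value with an 8-bit offset.

    Assumption: the segment register supplies the high byte of the
    16-bit address. The real hardware may combine them differently.
    """
    if not (0 <= segment <= 0xFF and 0 <= offset <= 0xFF):
        raise ValueError("segment and offset must each fit in 8 bits")
    return segment * SEGMENT_SIZE + offset


if __name__ == "__main__":
    print(hex(physical_address(0x00, 0x10)))  # lowest segment
    print(hex(physical_address(0xFF, 0xFF)))  # last addressable byte
```

Under this assumption, 256 segments of 256 bytes cover exactly the 65536 bytes listed above, and each segment holds 128 two-byte opcodes.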

Plans:

Although much of the processor is already coded and well underway, some minor planning is still taking
place. The instruction list is not yet finalized and, as of this writing, there is still room for
3 "full" opcodes, plus 4 opcodes in a group that is not completely allocated.

Software:

I can already tell that getting software running on this will be difficult, though I have a plan for
loading software through the UART built into the Papilio One. I will also create a fairly awesome
assembler for this architecture using the DSL capabilities of Ruby. I created a prototype x86 assembler
in Ruby before, so it shouldn't be a big deal, and it should be a lot easier than writing an assembler
in, say, C. I have no immediate plans to port a C compiler, mainly because of the small segment size
(just 256 bytes), though I'm considering adding a way to "extend" segments without changing the opcode
format.

Oddities:

I used this opportunity to try out my "JIT/JIF" comparison mechanism. Basically, instead of doing something like:

    cmp r0,r1
    jgt .greater
    mov r0,0xFF
    .greater:
    mov r1,0x00

You can instead do:

    cmp_greater_than r0,r1
    jit .greater        ; jit = jump if true
    mov r0,0xFF
    .greater:
    mov r1,0x00

Or, because of the awesome conditional execution that's built in:

    cmp_greater_than r0,r1
    mov_if_true r0,0xFF
    mov r1,0x00
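
To illustrate how a comparison result can drive conditional execution, here is a minimal Python model of the last snippet. It is only a sketch: the class `TinyCPUModel`, the `tr` attribute (standing in for a one-bit truth flag set by comparisons), the register set r0 to r5, and the unsigned comparison are all assumptions made for this example, not a description of the actual hardware.

```python
class TinyCPUModel:
    """Toy model: a few 8-bit registers plus a one-bit truth flag."""

    def __init__(self):
        # Only the registers named in the README's examples (assumption).
        self.regs = {f"r{i}": 0 for i in range(6)}
        self.tr = False  # set by comparisons, read by conditional opcodes

    def mov(self, dst, value):
        self.regs[dst] = value & 0xFF  # registers are 8 bits wide

    def cmp_greater_than(self, a, b):
        # Assumed to be an unsigned comparison that only updates the flag.
        self.tr = self.regs[a] > self.regs[b]

    def mov_if_true(self, dst, value):
        # Conditional execution: the move happens only when the flag is set.
        if self.tr:
            self.mov(dst, value)


def run_example(r0, r1):
    """Run: cmp_greater_than r0,r1 / mov_if_true r0,0xFF / mov r1,0x00."""
    cpu = TinyCPUModel()
    cpu.mov("r0", r0)
    cpu.mov("r1", r1)
    cpu.cmp_greater_than("r0", "r1")
    cpu.mov_if_true("r0", 0xFF)
    cpu.mov("r1", 0x00)
    return cpu.regs["r0"], cpu.regs["r1"]


if __name__ == "__main__":
    print(run_example(5, 3))  # comparison true: r0 is overwritten
    print(run_example(3, 5))  # comparison false: r0 is left alone
```

In this model no jump is taken in either case; the comparison outcome only decides whether the conditional move has any effect.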

Shortcomings:

This truth-register mechanism is unlike anything I've ever seen, and I'm really curious how it will
behave in actual logic. Because of how it works, conditional jumps are needed a lot less often, which
in the future could mean fewer cache misses (if I ever implement a cache, that is). Its only bad part
is that multiple comparisons are needed when doing something like `if r0>0 and r0<10 then r3=0`:

    mov r4,0
    mov r5,10
    cmp_greater r0,r4
    jif .skip           ; jif = jump if false
    cmp_lessthan r0,r5
    mov_if_true r3,0
    .skip:
    ; continue on

Another apparent issue is that code size is going to be difficult to keep down, especially since each
segment can only contain 128 instructions. One possible solution is adding an "overflow into segment"
option where, when IP rolls over from 255 to 0, CS is also incremented by 1.