Intel 8080 CPU Verilog core

2006/9/12

PROJECT: 8080 CPU
LANGUAGE: VERILOG
TARGET: Xilinx xc3s1000-4

OVERVIEW

This core was implemented as a first project in Verilog by an old schematic
design engineer. There were a few reasons for implementing an 8080 processor.
First, it was the first true general purpose microprocessor available. Second,
it was, by nature, designed to be compact in instruction set and
implementation. Third, it has a rich set of software applications, including
assemblers, compilers and operating systems.

More often, a Z80 target is used in Verilog or VHDL, with the idea that the
Z80 is a superset of the 8080. However, the Z80 is significantly more
complex than the 8080. The 8080 can make a useful maintenance processor for
SOC systems, since it consumes a small amount of resources. The 8080 has a
significant body of support, since it was a primary 8-bit processor before
the Z80, and coding for the 8080 instruction set often continued even after
general Z80 use, because that was the common subset of both processors.

My own experience with 8080 coding lasted perhaps a year, then I switched
to the Z80. Although the Z80 was a significantly more usable processor to
code for than an 8080, it made a mildly non-orthogonal instruction set much
more so. I have worked on several processors through the series, including
working on the design for the Z280, a 16-bit Z80 replacement, at Zilog
Corporation, so hopefully nobody can accuse me of bias against the Z80 :-)

The core presented is completely compatible with the original 8080 instruction
set, although the exact handling of the two undefined bits in the status (flags)
register has not been verified to be identical to the original. This only
matters for "trick" code that expects a value popped into the PSW to be
preserved when subsequently pushed. This core preserves all bits, including
the undefined bits, which means such code would function correctly. Also
needing verification against the original are the illegal opcodes, which on
this core are treated as nops.

The pinout is decidedly not compatible with the original pinout. The original
8080 pinout was a multiplexed nightmare that was never really designed to be
directly used. Intel was attempting to save on pins, and much of the CPU status
was sent out on every cycle via the data pins. Intel subsequently came out with
"demultiplexer" chips that turned this into simple signals.

The cpu8080 signals are a simple unmultiplexed 16-bit address bus and a
separate, bidirectional 8-bit data bus. The read and write signals for each of
the memory and I/O spaces are all separately decoded. An interrupt request and
acknowledge pair is implemented, sufficient to allow an external interrupt
controller to be connected. A readint signal is implemented that is true for
the entire time that an interrupt fetch is occurring. This allows simple
implementation of full vectoring mode.

There were two vectoring modes on the 8080. The most famous one was the use of
a single instruction byte that was forced onto the data lines during an
interrupt acknowledge cycle. This instruction could be any valid instruction,
but was most likely a restart, which gave 8 possible vector locations for the
interrupt.

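To make the restart case concrete, here is a small behavioral sketch in Python
(not the core's Verilog). The function names are made up for illustration; the
encoding used, binary 11 nnn 111 with a call target of 8 times nnn, is the
standard 8080 definition of the restart instruction.

```python
def rst_opcode(n):
    """Return the one-byte RST n opcode (binary 11 nnn 111) for n in 0..7."""
    if not 0 <= n <= 7:
        raise ValueError("restart number must be 0..7")
    return 0xC7 | (n << 3)


def rst_vector(opcode):
    """Return the address an RST opcode calls: 8 times its nnn field."""
    if (opcode & 0xC7) != 0xC7:
        raise ValueError("not an RST opcode")
    return opcode & 0x38


# List the eight possible vector locations, one per restart number.
for n in range(8):
    op = rst_opcode(n)
    print("RST %d: opcode %02X, vector %04X" % (n, op, rst_vector(op)))
```

An external controller that wants restart 7 would force the byte FF onto the
data lines during the acknowledge cycle, and execution continues at 0038.
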
It was not as well known, but the original 8080 could accept a full 2 or 3 byte
instruction via the interrupt acknowledge cycle. This was used by advanced
Intel interrupt controllers to place a full CALL instruction on the data lines,
and thus achieve full arbitrary address vectoring. I have included such an
advanced interrupt controller as an accessory to the cpu8080 core, also in
Verilog.

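As a rough illustration of full vectoring, the Python sketch below builds the
three bytes a controller would have to supply for a CALL to an arbitrary
address. The function name is hypothetical and this is not taken from the
accessory interrupt controller; the only facts relied on are the standard 8080
CALL opcode (CD) and its low-byte-first address order.

```python
CALL = 0xCD  # 8080 opcode for an unconditional CALL


def call_vector_bytes(address):
    """Return the three bytes of CALL address, low address byte first."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    return [CALL, address & 0xFF, address >> 8]


# Bytes to feed the CPU, one per read, for a handler at 1234 hex.
print(["%02X" % b for b in call_vector_bytes(0x1234)])
```

A controller built this way would presumably hand out one byte per read for as
long as readint stays true, which is what makes the signal useful here.
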
IMPLEMENTATION

The 8080 is implemented with a standard finite state machine. At the next
positive edge after reset, the state and the PC are reset, all of the bus
signals are set to false, and the interrupt enable is set. All of the
registers, including the stack pointer, are left undefined at reset.

At the fetch instruction state, the current pc is placed on the address bus,
and a read instruction cycle occurs. In fetch2, the readmem signal is
deactivated, and the opcode is now on the data lines. We read it without
latching.

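The reset and fetch behavior described above can be sketched as a toy model in
Python. This is only a behavioral illustration, not the core's Verilog: the
class, the state names "fetch1" and "fetch2", and the list standing in for
external memory are all assumptions, and every opcode is treated as a one byte
instruction so that the model can return straight to the fetch state.

```python
class FetchModel:
    """Toy model of the reset and two fetch states; not the core's code."""

    def __init__(self, memory):
        self.memory = memory   # list of bytes standing in for external RAM
        self.reset()

    def reset(self):
        self.state = "fetch1"  # start with an instruction fetch
        self.pc = 0
        self.addr = 0
        self.readmem = False   # bus signals false
        self.ei = True         # interrupt enable set
        self.opcode = None

    def clock(self):
        """Advance one positive clock edge."""
        if self.state == "fetch1":
            self.addr = self.pc                   # PC goes on the address bus
            self.readmem = True                   # start the read cycle
            self.state = "fetch2"
        else:
            self.readmem = False                  # end the read cycle
            self.opcode = self.memory[self.addr]  # opcode is on the data lines
            self.pc = (self.pc + 1) & 0xFFFF      # assumed one byte instruction
            self.state = "fetch1"


cpu = FetchModel([0x00, 0x00])
cpu.clock()
cpu.clock()
print(cpu.opcode, cpu.pc)
```
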
The instruction decode starts by splitting the instruction by the top two bits.
This breaks down as follows:

00 - General instructions, including carry control, rotates, double adds,
     increment and decrement, and many types of loads.
01 - The "mov" instruction.
10 - Register or memory to accumulator operations.
11 - General instructions, including jumps and calls, interrupt enable and
     disable, and I/O.

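The split above can be sketched in a few lines of Python. This is an
illustration of the idea rather than the core's decoder; the function and table
names are invented for the example.

```python
GROUPS = {
    0b00: "general: carry control, rotates, double adds, inc/dec, loads",
    0b01: "mov",
    0b10: "register or memory to accumulator",
    0b11: "general: jumps, calls, interrupt enable/disable, I/O",
}


def opcode_group(opcode):
    """Split an opcode into (top two bits, lower six bits)."""
    return (opcode >> 6) & 0x3, opcode & 0x3F


# Classify a few standard 8080 opcodes: NOP, MOV A,B, ADD B and JMP.
for op in (0x00, 0x78, 0x80, 0xC3):
    top, low = opcode_group(op)
    print("%02X -> %s (low bits %02X)" % (op, GROUPS[top], low))
```
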
The move and accumulator operations are handled as one case. The other two
groups each break down into a further case based on the lower 6 bits of the
instruction.

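One reason the move and accumulator groups fit in a single case is that both
carry a source register in the low three bits and a second three bit field
above it. The Python sketch below shows this using the standard 8080 register
codes and operation order; the function name is made up, and the special
handling of opcode 76 (HLT, which occupies the "mov m,m" slot) is a standard
8080 fact rather than something stated in this document.

```python
REGS = ["b", "c", "d", "e", "h", "l", "m", "a"]   # 8080 register codes 0..7
ALU_OPS = ["add", "adc", "sub", "sbb", "ana", "xra", "ora", "cmp"]


def decode_mov_alu(opcode):
    """Decode an opcode from the 01 (mov) or 10 (accumulator) group."""
    group = opcode >> 6
    mid = (opcode >> 3) & 7        # destination register or ALU operation
    src = REGS[opcode & 7]         # source register, or m for memory via HL
    if group == 0b01:
        if opcode == 0x76:
            return ("hlt",)        # the slot that would be "mov m,m"
        return ("mov", REGS[mid], src)
    if group == 0b10:
        return (ALU_OPS[mid], src)
    raise ValueError("opcode is not in the mov or accumulator group")


print(decode_mov_alu(0x78))   # mov a,b
print(decode_mov_alu(0xBE))   # cmp m
```
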
Quite a few instructions are handled in a single case. For these instructions,
the only required finish is to set the next pc, then set the next state back
to fetch instruction.

The other class of instructions consists of those that require further bus read
or write transactions to complete. This includes instructions that read or
write memory, and those that have a byte or word parameter following.

To handle these, there are quite a few states I call "follow" states. When a
fetch2 state handler needs extra states to complete, it sets the next state to
one of these follow states, and that state will occur on the next positive
clock. The follow states can then chain further follow states, and so on,
until the entire instruction is complete.

To "talk" between states, several registers are employed as "mailboxes". For
example, the regd register tells the follow states where to put fetched data
meant for a register.

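The follow state and mailbox ideas can be sketched together with a toy Python
model that handles only "mvi r,byte" (binary 00 ddd 110). This is a behavioral
illustration, not the core's Verilog: the state names "imm1" and "imm2", the
class, and the decision to treat every other opcode as a nop are assumptions.
Only regd is a name taken from the text.

```python
class FollowModel:
    """Toy state machine: fetch, then follow states for an immediate byte."""

    def __init__(self, memory):
        self.memory = memory
        self.regs = [0] * 8    # indexed by 8080 register code, 7 = a
        self.pc = 0
        self.addr = 0
        self.readmem = False
        self.regd = 0          # mailbox: destination register code
        self.state = "fetch1"

    def clock(self):
        if self.state == "fetch1":
            self.addr, self.readmem, self.state = self.pc, True, "fetch2"
        elif self.state == "fetch2":
            self.readmem = False
            opcode = self.memory[self.addr]
            self.pc = (self.pc + 1) & 0xFFFF
            if (opcode & 0xC7) == 0x06 and opcode != 0x36:
                self.regd = (opcode >> 3) & 7   # leave a note for later states
                self.state = "imm1"             # chain to a follow state
            else:
                self.state = "fetch1"           # everything else: nop here
        elif self.state == "imm1":
            self.addr, self.readmem, self.state = self.pc, True, "imm2"
        elif self.state == "imm2":
            self.readmem = False
            self.regs[self.regd] = self.memory[self.addr]  # use the mailbox
            self.pc = (self.pc + 1) & 0xFFFF
            self.state = "fetch1"


cpu = FollowModel([0x3E, 0x42, 0x06, 0x07])   # mvi a,42h then mvi b,07h
for _ in range(8):
    cpu.clock()
print(cpu.regs)
```

The fetch2 handler never touches the destination register itself. It only
records the register code in regd and picks the next state, and the final
follow state completes the instruction.
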
Many multistate instructions are handled completely with follow states. However,
for more complex instructions there is a system I call the "macro state
generator". This is just a ROM that holds a series of follow states to be
chained up to complete instructions. These are much like subroutines, and they
use special follow states that get the next state from the macro table, then
advance the macro state selector. The chain of follow states ends in one of two
ways: either the table reaches a terminal state, that is, a state that ends by
selecting fetch instruction 1 as the next state, or the macro table
specifically selects fetch instruction 1 as the next state.

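A minimal Python sketch of the macro table idea follows. The table contents,
routine names and state names are invented for the example and do not reflect
the core's actual ROM; the sketch only models the second way a chain ends,
where the table itself selects the fetch state.

```python
# Hypothetical macro ROM: each routine is a run of follow states that ends
# with an entry selecting the fetch state.
MACRO_ROM = [
    "read_lo", "read_hi", "fetch1",                          # routine at 0
    "read_lo", "read_hi", "write_lo", "write_hi", "fetch1",  # routine at 3
]
ROUTINE_START = {"read_word": 0, "exchange": 3}


def run_macro(name):
    """Return the follow states visited for one routine, in order."""
    selector = ROUTINE_START[name]   # macro state selector
    visited = []
    while True:
        state = MACRO_ROM[selector]  # next state comes from the table
        selector += 1                # advance the selector
        if state == "fetch1":        # table hands control back to fetch
            return visited
        visited.append(state)


print(run_macro("exchange"))
```
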
The most commonly used macro states are the read and write states. It takes
four states to write, and two to read. An instruction can use a lot of them,
and do it repetitively. For example, the "xthl" instruction must perform two
byte reads, followed by two byte writes. The read and write routines use
general registers to keep the addresses they must read or write, and up to
two bytes of data to be read or written.

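To show why "xthl" needs both reads before either write, here is a purely
functional Python sketch of the instruction (exchange HL with the word on top
of the stack). It ignores states and bus timing entirely, and the function name
and the list used as memory are assumptions made for the example.

```python
def xthl(memory, sp, h, l):
    """Exchange HL with the word on top of the stack; return the new (h, l)."""
    lo = memory[sp]                   # byte read 1: old stack low byte
    hi = memory[(sp + 1) & 0xFFFF]    # byte read 2: old stack high byte
    memory[sp] = l                    # byte write 1: l goes to the stack
    memory[(sp + 1) & 0xFFFF] = h     # byte write 2: h goes to the stack
    return hi, lo


mem = [0] * 16
mem[4], mem[5] = 0x34, 0x12           # word 1234h on top of the stack
h, l = xthl(mem, 4, 0xAB, 0xCD)
print("%02X%02X" % (h, l), "%02X%02X" % (mem[5], mem[4]))
```

Both bytes read from the stack have to be held somewhere until the writes are
done, which is consistent with the routines keeping up to two bytes of data.
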
The only submodule in cpu8080 is the alu. It knows how to implement 8 kinds of
arithmetic and logic operations: add, add with carry, subtract, subtract with
borrow, and, or, xor and compare. All of the 8080 flags are calculated within
the alu. The compare operation is actually a subtract that discards the result,
and passes the A operand out as the result.

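The eight operations can be sketched as a small Python function. This is a
behavioral illustration only, not the alu module: the function signature and
flag names are assumptions, and the auxiliary carry handling for subtract and
for the logical operations is simplified and has not been checked against the
core or a real 8080.

```python
def parity_even(value):
    """True when the low 8 bits of value contain an even number of ones."""
    return bin(value & 0xFF).count("1") % 2 == 0


def alu(op, a, b, carry_in=False):
    """Return (result, flags) for one of the eight operations."""
    a &= 0xFF
    b &= 0xFF
    ac = False
    cy = False
    if op in ("add", "adc"):
        cin = 1 if (op == "adc" and carry_in) else 0
        total = a + b + cin
        ac = (a & 0x0F) + (b & 0x0F) + cin > 0x0F
        cy = total > 0xFF
        value = total & 0xFF
    elif op in ("sub", "sbb", "cmp"):
        bin_ = 1 if (op == "sbb" and carry_in) else 0
        total = a - b - bin_
        cy = total < 0                            # carry flag acts as borrow
        ac = (a & 0x0F) - (b & 0x0F) - bin_ < 0   # assumption: half borrow
        value = total & 0xFF
    elif op == "and":
        value = a & b
    elif op == "or":
        value = a | b
    elif op == "xor":
        value = a ^ b
    else:
        raise ValueError("unknown operation: " + op)
    flags = {"s": bool(value & 0x80), "z": value == 0, "ac": ac,
             "p": parity_even(value), "cy": cy}
    # Compare sets flags from the subtraction but passes A through unchanged.
    return (a if op == "cmp" else value), flags


print(alu("cmp", 5, 5))
```
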
To operate the ALU, the input operands and flags are placed on its input
registers, then the output result and flags are moved to the appropriate
registers. Each time this occurs, we give a full cycle for the alu to perform
this action.

Scott Moore