% Source: OpenCores Subversion repository light8080, trunk/doc/designNotes.tex (rev 72)
% URL: https://opencores.org/ocsvn/light8080/light8080/trunk
\documentclass[11pt]{article}
\usepackage{graphicx}    % needed for including graphics e.g. EPS, PS
\usepackage{multirow}
\usepackage{alltt}
\topmargin -1.5cm        % read Lamport p.163
\oddsidemargin -0.04cm   % read Lamport p.163
\evensidemargin -0.04cm  % same as oddsidemargin but for left-hand pages
\textwidth 16.59cm
\textheight 21.94cm
%\pagestyle{empty}       % Uncomment if don't want page numbers
\parskip 7.2pt           % sets spacing between paragraphs
%\renewcommand{\baselinestretch}{1.5} % Uncomment for 1.5 spacing between lines
\parindent 15pt          % sets leading space for paragraphs

\begin{document}

\section{Basic behavior}
\label{basics}

The microcoded machine ($\mu$M) is built around a register bank and an 8-bit
ALU with registered operands T1 and T2. It performs all its operations in two
cycles, so I have divided it into two stages: an operand stage and an ALU stage.
This is nothing more than a 2-stage pipeline. \\

In the operand stage, registers T1 and T2 are loaded with either the
contents of the register bank (RB) or the input signal DI.\\
In the ALU stage, the ALU output is written back into the RB or loaded
into the output register DO. In addition, flags are updated, or not, according to
the microinstruction ($\mu$I) in execution.\\

Every microinstruction controls the operation of the operand stage
and the succeeding ALU stage; that is, the execution of a $\mu$I extends over 2
successive clock cycles, and microinstructions overlap each other. This means
that the part of the $\mu$I that controls the 2nd stage has to be pipelined; in
the VHDL code, I have divided the $\mu$I into a field\_1 and a field\_2, the
latter of which is registered (pipelined) and controls the 2nd $\mu$M stage
(ALU). \\
Many of the control signals are encoded in the microinstructions in what I
have improperly called flags. You will see many references to flags in the
following text (\#end, \#decode, etc.). They are just signals that you can
activate individually in each $\mu$I; some are active in the 1st stage, some in
the 2nd. They are all explained in section~\ref{ucodeFlags}. \\
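
The overlap between successive microinstructions can be pictured with a small
Python sketch. It is only an illustration of the two-stage scheme described
above, not a model of the VHDL; the function name
\texttt{pipeline\_schedule} is made up for this example.

\begin{verbatim}
def pipeline_schedule(n_uinstructions):
    """Return one (cycle, operand_stage, alu_stage) tuple per clock cycle.

    Each stage entry is the index of the microinstruction occupying that
    stage in that cycle, or None when the stage is idle.
    """
    schedule = []
    for cycle in range(n_uinstructions + 1):
        # Microinstruction k is in the operand stage during cycle k ...
        operand = cycle if cycle < n_uinstructions else None
        # ... and in the ALU stage during cycle k + 1.
        alu = cycle - 1 if cycle >= 1 else None
        schedule.append((cycle, operand, alu))
    return schedule


for cycle, operand, alu in pipeline_schedule(3):
    print(cycle, operand, alu)
\end{verbatim}

With three microinstructions, the ALU stage of each one shares a cycle with
the operand stage of the next, so the sequence takes four cycles in total.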

Note that microinstructions are atomic: both stages are guaranteed to
execute in all circumstances. Once the 1st stage of a $\mu$I has executed,
the only thing that can prevent the execution of the 2nd stage is a reset.\\
It might have been easier to design the machine so that microinstructions
executed in one cycle, thus needing no pipeline for the $\mu$I itself. I
arbitrarily chose to `split' the microcode execution, figuring that it would be
easier for me to understand and program the microcode; in hindsight it may have
been a mistake but in the end, once the debugging is over, it makes little
difference.\\

The core as it is now does not support wait states: it does all its
external accesses (memory or i/o, read or write) in one clock cycle. It would
not be difficult to add them with a small modification to the
micromachine, without changes to the microcode.\\
Since the microcode ROM is the same type of memory as will be used for program
memory, the main advantage of microprogramming is lost. Thus, it would make
sense to develop the core a bit further with support for wait states, so it
could take advantage of the speed difference between the FPGA and external slow
memory.\\
The register bank reads asynchronously, while writes are synchronous. This
is the standard behaviour of a Spartan LUT-based RAM. The register bank holds
all the 8080 registers, including the accumulator, plus temporary, `hidden'
registers (x,y,w,z). Only the PSW register is held outside the register bank, in
a DFF-8 register.

\section{Micromachine control}
\label{umachineControl}

\subsection{Microcode operation}
\label{ucodeOperation}

There is little more to the core than what has already been said; all the
CPU operations are microcoded, including interrupt response, reset and
instruction opcode fetch. The microcode source code can be seen in file
\texttt{ucode/light8080.m80}, in a format I expect will be less obscure than a
plain VHDL constant table.\\

The microcode table is a synchronous ROM with 512 32-bit words, designed
to fit in a Spartan-3 block RAM. Each 32-bit word makes up a microinstruction.
The microcode `program counter' (uc\_addr in the VHDL code) is thus a 9-bit
register.\\
Out of those 512 words, 256 (the upper half of the table) are used as a
jump table for instruction decoding. Each entry at 256+NN contains a `JSR'
$\mu$I to the start of the microcode for the instruction whose opcode is NN.
This seemingly inefficient use of RAM is in fact an optimization for the
Spartan-3 architecture to which this design is tailored: the 2KB RAM blocks
are too large for the microcode so I chose to fill them up with the decoding
table.\\
This scheme is less than efficient where smaller RAM blocks are available (e.g.
Altera Stratix).\\
The jump table is built automatically by the microcode
assembler, as explained in section~\ref{ucodeAssembler}.\\
The upper half of the table can only be used for decoding; JSR
instructions can only point to the lower half, and execution from address 0x0ff
rolls over to 0x00 (or would; the actual microcode does not use this
`feature').\\

The ucode address counter uc\_addr has a number of possible sources: the
micromachine supports one level of micro-subroutine calls; it can also
return from those calls; uc\_addr gets loaded with some constant values upon
reset, interrupt or instruction fetch. And finally, there is the decoding jump
table mentioned above. So, in summary, these are the possible sources of
uc\_addr each cycle:

\begin{itemize}
\item Constant value of 0x0001 at reset (see VHDL source for details).
\item Constant value of 0x0003 at the beginning (fetch cycle) of every
instruction.
\item Constant value of 0x0007 at interrupt acknowledge.
\item uc\_addr + 1 in normal microinstruction execution.
\item Some 8-bit value included in JSR microinstructions (calls).
\item The return value preserved in the last JSR (used when flag \#ret is
raised).
\end{itemize}

All of this is readily apparent, I hope, by inspecting the VHDL source.
Note that there is only one jump microinstruction (JSR) which doubles as `call';
whenever a jump is taken the 1-level-deep `return stack' is loaded with
the return address (address of the $\mu$I following the jump). You just have to
ignore the return address when you don't need it (e.g. the jumps in the decoding
jump table). I admit this scheme is awkward and inflexible; but it was the first
I devised, it works and fits the area budget: more than enough in this project.
A list of all predefined, `special' microcode addresses follows.\\
\begin{itemize}
\item \textbf{0x001: reset}\\
After reset, the $\mu$I program counter (uc\_addr in the VHDL code) is
initialized to 0x00. The program counter works as a pre-increment counter when
reading the microcode ROM, so the $\mu$I at address 0 never gets executed (unless
`rolling over' from address 0x0ff, which the actual microcode does not). Reset
starts at address 1 and takes 2 microinstructions to clear PC to 0x0000. It does
nothing else. After clearing the PC the microcode runs into the fetch routine.
\item \textbf{0x003: fetch}\\
The fetch routine places the PC on the address output lines while postincrementing
it, and then enables a memory read cycle. In doing so it relies on
T2 being 0x00 (necessary for the ADC to behave like an INC in the oversimplified
ALU), which is always true by design. After the fetch is done, the \#decode flag
is raised, which instructs the micromachine to take the value in the DI signal
(data input from external memory) and load it into the IR and the microcode
address counter, while setting the high address bit to 1. At the resulting
address there will be a JSR $\mu$I pointing to the microcode for the 8080 opcode in
question (the microcode assembler will make sure of that). The \#decode flag will
also clear registers T1 and T2.
\item \textbf{0x007: halt}\\
Whenever a HLT instruction is executed, the \#halt flag is raised, which
when used in the same $\mu$I as flag \#end, makes the micromachine jump to this
address. The $\mu$I at this address does nothing but raise flags \#halt and \#end. The
micromachine will keep jumping to this address until the halt state is left,
something which can only happen by reset or by interrupt. The \#halt flag, when
raised, sets the halt output signal, which will be cleared when the CPU exits
the halt state.
\end{itemize}

\subsection{Conditional jumps}
\label{conditionalJumps}

There is a conditional branch microinstruction: TJSR. This instruction
tests a certain condition and, if the condition is true, performs exactly as JSR.
Otherwise, it ends the microcode execution exactly as if the flag \#end had been
raised. This microinstruction has been made for the conditional branches and
returns of the 8080 CPU and is not flexible enough for any other use.
The condition tested is encoded in the register IR, in the field ccc (bits
5..3), as encoded in the conditional instructions of the 8080; you can look
them up in any 8080 reference. Flags are updated in the 2nd stage, so a TJSR
cannot test the flags modified by the previous $\mu$I. But it is not necessary; this
instruction will always be used to test conditions set by previous 8080
instructions, separated at least by the opcode fetch $\mu$Is, and probably many
more. Thus, the condition flags will always be valid upon testing.
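
For reference, the condition test can be sketched in Python as follows. The
ccc encoding is the standard one from the 8080 instruction set (NZ, Z, NC, C,
PO, PE, P, M); the function name \texttt{condition\_true} and its argument
list are made up for this example and do not come from the VHDL source.

\begin{verbatim}
# Standard 8080 condition codes, indexed by the ccc field (IR bits 5..3).
CONDITION_NAMES = ("NZ", "Z", "NC", "C", "PO", "PE", "P", "M")


def condition_true(ir, zero, carry, parity, sign):
    """Return True when the 8080 condition encoded in IR(5..3) holds."""
    ccc = (ir >> 3) & 0b111
    # ccc(2..1) selects the flag under test: Z, CY, P or S.
    flag = (zero, carry, parity, sign)[ccc >> 1]
    # ccc(0) is the value the flag must have for the condition to hold.
    return bool(flag) == bool(ccc & 1)


# JZ is opcode 0xCA (ccc = 001): taken only when the zero flag is set.
print(CONDITION_NAMES[(0xCA >> 3) & 7],
      condition_true(0xCA, zero=True, carry=False, parity=False, sign=False))
\end{verbatim}

In terms of the text above, TJSR behaves like JSR when this test is true and
like a $\mu$I with flag \#end raised when it is false.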

\subsection{Implicit operations}
\label{implicitOperations}

Most micromachine operations happen only when explicitly commanded. But
some happen automatically and have to be taken into account when coding the
microprogram:

\begin{enumerate}
\item Register IR is loaded automatically when the flag \#decode is raised. The
microcode program counter is loaded automatically with the same value as
the IR, with the high address bit set to 1, as has been explained above. From
that point on, execution resumes
normally: the jump table contains normal JSR microinstructions.
\item T1 is cleared to 0x00 at reset, when the flag \#decode is active or when
the flag \#clrt1 is used.
\item T2 is cleared to 0x00 at reset, when the flag \#decode is active or when
the flag \#end is used.
\item Microcode flow control:
  \begin{enumerate}
  \item When flag \#end is raised, execution continues at $\mu$code address
        0x0003.
  \item When both flags \#halt and \#end are raised, execution continues at
        $\mu$code address 0x0007, unless there is an interrupt pending.
  \item Otherwise, when flag \#ret is raised, execution continues at the address
        following the last JSR executed. If such a return is tried before a JSR
        has executed since the last reset, the results are undefined; this
        should never happen with the microcode source as it is now.
  \item If none of the above flags are used, the next $\mu$I is executed.
  \end{enumerate}
\end{enumerate}
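
The flow control rules above can be summarized in a short Python sketch. This
is only my reading of the rules in this document, not a transcription of the
VHDL: the function name \texttt{next\_uc\_addr} and its arguments are made up
for the example, the relative priority of JSR and \#decode over the other
flags is an assumption, and reset, interrupt acknowledge and TJSR are left
out.

\begin{verbatim}
FETCH_ADDR = 0x003   # start of the opcode fetch routine
HALT_ADDR = 0x007    # halt loop


def next_uc_addr(uc_addr, flags=(), di=0, jsr_target=None,
                 ret_addr=None, irq_pending=False):
    """Return (next_address, saved_return_address) for one microinstruction.

    flags is a collection of flag names without the leading '#'.
    """
    if jsr_target is not None:
        # JSR doubles as 'call': the 1-level return stack is always loaded.
        return jsr_target & 0xFF, (uc_addr + 1) & 0xFF
    if "decode" in flags:
        # The opcode on DI indexes the jump table in the upper half.
        return 0x100 | (di & 0xFF), ret_addr
    if "end" in flags:
        if "halt" in flags and not irq_pending:
            return HALT_ADDR, ret_addr
        return FETCH_ADDR, ret_addr
    if "ret" in flags:
        return ret_addr, ret_addr
    # Plain increment; the lower half rolls over from 0x0ff to 0x00.
    return (uc_addr + 1) & 0xFF, ret_addr


# Decoding opcode 0x76 lands on jump table entry 0x176.
print(hex(next_uc_addr(0x004, flags={"decode"}, di=0x76)[0]))
\end{verbatim}

The return address is passed in and out explicitly so that the single-entry
`return stack' is visible: only a JSR changes it.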

Notice that both T1 and T2 are cleared at the end of the opcode fetch, so
they are guaranteed to be 0x00 at the beginning of the instruction microcode.
T2 is also cleared at the end of the instruction microcode, so it is
guaranteed clear for its use in the opcode fetch microcode. T1 can be cleared
if a microinstruction so requires. Refer to section~\ref{ucodeFlags} on
microinstruction flags.


\section{Microinstructions}
\label{uinstructions}

The microcode for the CPU is a source text file encoded in a format
described below. This `microcode source' is assembled by the microcode assembler
(described in section~\ref{ucodeAssembler}) which then builds a microcode table
in VHDL format. There's
nothing stopping you from assembling the microcode by hand directly on the VHDL
source, and in a machine this simple it might have been better.


\subsection{Microcode source format}
\label{ucodeFormat}

The microcode source format is more similar to some early assembly language
than to other microcodes you may have seen. Each non-blank,
non-comment line of code contains a single microinstruction in the format
informally described below:

% there must be some cleaner way to do this in TeX...

230
\begin{alltt}
231
\textless microinstruction line \textgreater :=
232
    [\textless label \textgreater]\footnote{Labels appear alone by themselves in a line} \textbar
233
    \textless operand stage control \textgreater ; \textless ALU stage control \textgreater [; [\textless flag list \textgreater]] \textbar
234
    JSR \textless destination address \textgreater\textbar TJSR \textless destination address \textgreater
235
    \\
236
    \textless label \textgreater := \{':' immediately followed by a common identifier\}
237
    \textless destination address \textgreater := \{an identifier defined as a label anywhere in the file\}
238
    \textless operand stage control \textgreater := \textless op\_reg \textgreater = \textless op\_src \textgreater \textbar NOP
239
    \textless op\_reg \textgreater := T1 \textbar T2
240
    \textless op\_src \textgreater := \textless register \textgreater \textbar DI \textbar \textless IR register \textgreater
241
    \textless IR register \textgreater := \{s\}\textbar\{d\}\textbar\{p\}0\textbar\{p\}1\footnote{Registers are specified by IR field}
242
    \textless register \textgreater := \_a\textbar\_b\textbar\_c\textbar\_d\textbar\_e\textbar\_h\textbar\_l\textbar\_f\textbar\_a\textbar\_ph\textbar\_pl\textbar\_x\textbar\_y\textbar\_z\textbar\_w\textbar\_sh\textbar\_sl
243
    \textless ALU stage control \textgreater := \textless alu\_dst \textgreater = \textless alu\_op \textgreater \textbar NOP
244
    \textless alu\_dst \textgreater := \textless register \textgreater \textbar DO
245
    \textless alu\_op \textgreater := add\textbar adc\textbar sub\textbar sbb\textbar and\textbar orl\textbar not\textbar xrl\textbar rla\textbar rra\textbar rlca\textbar rrca\textbar aaa\textbar
246
    t1\textbar rst\textbar daa\textbar cpc\textbar sec\textbar psw
247
    \textless flag list \textgreater := \textless flag \textgreater [, \textless flag \textgreater ...]
248
    \textless flag \textgreater := \#decode\textbar\#di\textbar\#ei\textbar\#io\textbar\#auxcy\textbar\#clrt1\textbar\#halt\textbar\#end\textbar\#ret\textbar\#rd\textbar\#wr\textbar\#setacy
249 64 ja_rd
          \#ld\_al\textbar\#ld\_addr\textbar\#fp\_c\textbar\#fp\_r\textbar\#fp\_rc\textbar\#clr\_acy \footnote{There are some restrictions on the flags that can be used together} \\
250 8 ja_rd
\end{alltt}


Please bear in mind that this is just an informal description; I made
it up from my personal notes and the assembler source. The ultimate reference is
the microcode source itself and the assembler source.\\
Due to the way that flags have been encoded (there's less than one bit per
flag in the microinstruction), there are restrictions on what flags can be used
together. See section~\ref{ucodeFlags}.

The assembler will complain if the source does not comply with the
expected format; but the syntax check is somewhat weak.
In the microcode source you will see words like \_\_reset, \_\_fetch, etc.
which don't fit the above syntax. Those were supposed to be assembler pragmas,
which the assembler would use to enforce the alignment of the microinstructions
to certain addresses. I finally decided not to use them and align the
instructions myself. The assembler ignores them but I kept them as a reminder.

The 1st part of the $\mu$I controls the ALU operand stage; we can load either
T1 or T2 with either the contents of the input signal DI, or the selected
register from the register bank. Or, we can do nothing (NOP).\\
The 2nd part of the $\mu$I controls the ALU stage; we can instruct the ALU to
perform some operation on the operands T1 and T2 loaded by this same
instruction, in the previous stage; and we can select where to load the ALU
result, either in the output register DO or in the register bank. Or we can do
none of the above (NOP).

The write address for the register bank used in the 2nd stage has to be the
same as the read address used in the 1st stage; that is, if both $\mu$I parts use the
RB, both have to use the same address (the assembler will enforce this
restriction). This is due to an early, silly mistake that I chose not to fix:
there is a single $\mu$I field that holds both addresses.\\
This is a very annoying limitation that unduly complicates the microcode
and wastes many microcode slots for no saving in hardware; I just did not want
to make any major refactors until the project is working. As
you can see in the VHDL source, the machine is prepared to use 2 independent
address fields with little modification. I may do this improvement and others
in a later version, but only when I deem the design `finished' (since the design
as it is already exceeds my modest performance target).


\subsection{Microcode ALU operations}
\label{ucodeAluOps}

\begin{tabular}{|l|l|l|l|}
\hline
\multicolumn{4}{|c|}{ALU operations} \\
\hline
Operation & Encoding & Result & Notes \\
\hline ADD & 001100 & T2 + T1 & \\
\hline ADC & 001101 & T2 + T1 + CY & \\
\hline SUB & 001110 & T2 - T1 & \\
\hline SBB & 001111 & T2 - T1 - CY & \\
\hline AND & 000100 & T1 AND T2 & \\
\hline ORL & 000101 & T1 OR T2 & \\
\hline NOT & 000110 & NOT T1 & \\
\hline XRL & 000111 & T1 XOR T2 & \\
\hline RLA & 000000 & 8080 RLC & \\
\hline RRA & 000001 & 8080 RRC & \\
\hline RLCA & 000010 & 8080 RAL & \\
\hline RRCA & 000011 & 8080 RAR & \\
\hline T1 & 010111 & T1 & \\
\hline RST & 011111 & 8*IR(5..3) & as per RST instruction \\
\hline DAA & 101000 & DAA T1 & but only after executing 2 in a row \\
\hline CPC & 101100 & UNDEFINED & CY complemented \\
\hline SEC & 101101 & UNDEFINED & CY set \\
\hline PSW & 110000 & PSW & \\
\hline
\end{tabular}


Notice that ALU operation DAA takes two cycles to complete; it uses a
dedicated circuit with an extra pipeline stage. So it has to be executed twice
in a row before taking the result; refer to the microcode source for an example.\\
The PSW register is updated with the ALU result at every cycle, whatever
ALU operation is executed, though every ALU operation computes flags by
different means, as is apparent in the case of CY. Which flags are updated,
and which keep their previous values, is defined by a microinstruction field
named flag\_pattern. See the VHDL code for details.
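
The result column of the ALU table can be mirrored with a small Python sketch
for the arithmetic and logic operations. It covers only the 8-bit result, not
the flags, the rotations, RST or DAA; the function name \texttt{alu} and its
argument order are made up for this example.

\begin{verbatim}
def alu(op, t1, t2, cy=0):
    """Return the 8-bit result of some ALU operations from the table."""
    results = {
        "add": t2 + t1,
        "adc": t2 + t1 + cy,
        "sub": t2 - t1,
        "sbb": t2 - t1 - cy,
        "and": t1 & t2,
        "orl": t1 | t2,
        "not": ~t1,
        "xrl": t1 ^ t2,
        "t1": t1,
    }
    # Results wrap around to 8 bits, as in the hardware.
    return results[op] & 0xFF


# With T2 = 0x00 and a carry input of 1, ADC behaves like an increment.
print(hex(alu("adc", 0x41, 0x00, cy=1)))
\end{verbatim}

The last line shows why the fetch microcode relies on T2 being 0x00: with that
operand cleared and a carry input of 1, ADC adds one to T1.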


\subsection{Microcode binary format}
\label{ucodeBinFormat}

\begin{tabular}{|l|l|l|}
\hline
\multicolumn{3}{|c|}{Microcode word bitfields} \\ \hline
POS & VHDL NAME & PURPOSE \\ \hline
31..29 & uc\_flags1 & Encoded flag of group 1 (see section on flags) \\ \hline
28..26 & uc\_flags2 & Encoded flag of group 2 (see section on flags) \\ \hline
25 & load\_addr & Address register load enable (note 1) \\ \hline
24 & load\_al & AL load enable (note 1) \\ \hline
23 & load\_t1 & T1 load enable \\ \hline
22 & load\_t2 & T2 load enable \\ \hline
21 & mux\_in & T1/T2 source mux control (0 for DI, 1 for reg bank) \\ \hline
20..19 & rb\_addr\_sel & Register bank address source control (note 2) \\ \hline
18..15 & ra\_field & Register bank address (used both for write and read) \\ \hline
14 & clr\_acy & Clear CY and AC; see explanation below (pipelined signal) \\ \hline
13..10 & (unused) & Reserved for write register bank address, unused yet \\ \hline
11..10 & uc\_jmp\_addr(7..6) & JSR/TJSR jump address, higher 2 bits (note 5) \\ \hline
9..8 & flag\_pattern & PSW flag update control (note 3) (pipelined signal) \\ \hline
7 & load\_do & DO load enable (note 4) (pipelined signal) \\ \hline
6 & we\_rb & Register bank write enable (pipelined signal) \\ \hline
5..0 & uc\_jmp\_addr(5..0) & JSR/TJSR jump address, lower 6 bits (note 5) \\ \hline
5..0 & (several) & Encoded ALU operation \\ \hline
\end{tabular}
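
As an illustration of the layout in the table, the following Python sketch
splits a 32-bit microcode word into its bitfields. The bit positions are taken
from the table; the helper name \texttt{unpack\_uinstruction} and the two
labels \texttt{jmp\_addr\_hi} and \texttt{alu\_op\_or\_jmp\_addr\_lo} are made
up for this example (those bits are shared between JSR/TJSR and regular
microinstructions). The reserved bits 13..12 are left out.

\begin{verbatim}
# (field name, most significant bit, least significant bit)
UCODE_FIELDS = (
    ("uc_flags1",    31, 29),
    ("uc_flags2",    28, 26),
    ("load_addr",    25, 25),
    ("load_al",      24, 24),
    ("load_t1",      23, 23),
    ("load_t2",      22, 22),
    ("mux_in",       21, 21),
    ("rb_addr_sel",  20, 19),
    ("ra_field",     18, 15),
    ("clr_acy",      14, 14),
    ("jmp_addr_hi",  11, 10),
    ("flag_pattern",  9,  8),
    ("load_do",       7,  7),
    ("we_rb",         6,  6),
    ("alu_op_or_jmp_addr_lo", 5, 0),
)


def unpack_uinstruction(word):
    """Split a 32-bit microcode word into a dict of named bitfields."""
    fields = {}
    for name, msb, lsb in UCODE_FIELDS:
        width = msb - lsb + 1
        fields[name] = (word >> lsb) & ((1 << width) - 1)
    return fields


# Example word: load T1 from register 7, then ADD and write the result back.
word = (1 << 23) | (1 << 21) | (0b0111 << 15) | (1 << 6) | 0b001100
print(unpack_uinstruction(word))
\end{verbatim}

The example word is hand-built for illustration and is not taken from the
actual microcode table.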

\begin{itemize}
\item {\bf Note 1: load\_al}\\
AL is a temporary register for the lower byte of the external 16-bit
address. The memory interface (and the IO interface) assumes external
synchronous memory, so the 16-bit address has to be externally loaded as
commanded by load\_addr.
Note that both halves of the address signal load directly from the
register bank output; you can load AL with PC, for instance, in the same cycle
in which you modify the PC; AL will load with the pre-modified value.

\item {\bf Note 2: rb\_addr\_sel}\\
A microinstruction can access any register as specified by ra\_field, or
the register fields in the 8080 instruction opcode: S, D and RP (the
microinstruction can select which register of the pair). In the microcode source
this is encoded like this:
\begin{description}
\item[\{s\}] $\Rightarrow$ 0 \& SSS
\item[\{d\}] $\Rightarrow$ 0 \& DDD
\item[\{p\}0] $\Rightarrow$ 1 \& PP \& 0 (HIGH byte of register pair)
\item[\{p\}1] $\Rightarrow$ 1 \& PP \& 1 (LOW byte of register pair)
\end{description}
\small SSS = IR(2 downto 0) (source register)\\
\small DDD = IR(5 downto 3) (destination register)\\
\small PP = IR(5 downto 4) (register pair)\\

\item {\bf Note 3: flag\_pattern}\\
Selects which flags of the PSW, if any, will be updated by the
microinstruction:
\begin{itemize}
\item When flag\_pattern(0)='1', CY is updated in the PSW.
\item When flag\_pattern(1)='1', all flags other than CY are updated in the PSW.
\end{itemize}

\item {\bf Note 4: load\_do}\\
DO is the data output register that is loaded with the ALU output, so the
load enable signal is pipelined.

\item {\bf Note 5: JSR-H and JSR-L}\\
These fields overlap existing fields which are unused in JSR/TJSR
instructions (fields which can be used with no secondary effects).

\end{itemize}
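
The register selection described in note 2 can be sketched in Python as
follows. The sketch assumes the standard 8080 opcode layout, in which the
destination field DDD occupies IR bits 5..3 and the source field SSS occupies
IR bits 2..0; the function name \texttt{rb\_address} is made up for this
example, and the mapping from the resulting 4-bit address to actual registers
is not modelled.

\begin{verbatim}
def rb_address(selector, ir):
    """Return the 4-bit register bank address for an IR-based selector."""
    sss = ir & 0b111            # IR(2..0), source register
    ddd = (ir >> 3) & 0b111     # IR(5..3), destination register
    pp = (ir >> 4) & 0b11       # IR(5..4), register pair
    if selector == "{s}":
        return sss                      # 0 & SSS
    if selector == "{d}":
        return ddd                      # 0 & DDD
    if selector == "{p}0":
        return 0b1000 | (pp << 1)       # 1 & PP & 0, high byte of the pair
    if selector == "{p}1":
        return 0b1000 | (pp << 1) | 1   # 1 & PP & 1, low byte of the pair
    raise ValueError("unknown selector: " + selector)


# MOV B,C is opcode 0x41: DDD = 000 (B), SSS = 001 (C).
print(rb_address("{d}", 0x41), rb_address("{s}", 0x41))
\end{verbatim}

For a register pair instruction such as LXI H (opcode 0x21, PP = 10), the two
selectors give the consecutive addresses 12 and 13.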

\subsection{Microcode flags}
\label{ucodeFlags}


Flags are what I have called those signals of the microinstruction that you
assert individually in the microcode source. Due to the way they have been
encoded, I have separated them into two groups. Only one flag in each group can be
used in any instruction. These are all the flags in the format they appear in
the microcode source:

\begin{itemize}
\item Flags from group 1: use only one of these
  \begin{itemize}
  \item \#decode : Load address counter and IR with contents of data input
  lines, thus starting opcode decoding.
  \item \#ei : Set interrupt enable register.
  \item \#di : Reset interrupt enable register.
  \item \#io : Activate io signal for 1st cycle.
  \item \#auxcy : Use aux carry instead of regular carry for this $\mu$I.
  \item \#clrt1 : Clear T1 at the end of 1st cycle.
  \item \#halt : Jump to microcode address 0x07 without saving return value,
  when used with flag \#end, and only if there is no interrupt
  pending. Ignored otherwise.
  \end{itemize}

\item Flags from group 2: use only one of these
  \begin{itemize}
  \item \#setacy : Set aux carry at the start of 1st cycle (used for ++).
  \item \#end : Jump to microinstruction address 3 after the present $\mu$I.
  \item \#ret : Jump to address saved by the last JSR or TJSR $\mu$I.
  \item \#rd : Activate rd signal for the 2nd cycle.
  \item \#wr : Activate wr signal for the 2nd cycle.
  \end{itemize}

\item Independent flags: no restrictions
  \begin{itemize}
  \item \#ld\_al : Load AL register with register bank output as read by
  operation 1 (used in memory and io access).
  \item \#ld\_addr : Load address register (H byte = register bank output as read
  by operation 1, L byte = AL).
  Activate vma signal for 1st cycle.
  \item \#clr\_acy : Clear PSW flags AC and CY, except for AND instructions
  (ALU operation = 000100), where AC is set.
  Meant to be used with flag \#fp\_rc for the logic instructions (AND, OR, XOR).
  See section~\ref{compatibility} for a note about compatibility with the
  original 8080.
  \end{itemize}

\item PSW update flags: use only one of these
  \begin{itemize}
  \item \#fp\_r : This instruction updates all PSW flags except for C.
  \item \#fp\_c : This instruction updates only the C flag in the PSW.
  \item \#fp\_rc : This instruction updates all the flags in the PSW.
  \end{itemize}

\end{itemize}
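
The effect of the PSW update flags can be sketched in Python as a masked
merge. The sketch assumes that \#fp\_c, \#fp\_r and \#fp\_rc map to
flag\_pattern values 01, 10 and 11 respectively, which is inferred from the
description of flag\_pattern and not checked against the assembler; it also
treats every PSW bit other than CY (bit 0 in the 8080 PSW) as one group. The
function name \texttt{update\_psw} is made up for this example.

\begin{verbatim}
CY_MASK = 0x01   # carry is bit 0 of the 8080 PSW

# Assumed encodings of the PSW update flags (see the text above).
FP_C, FP_R, FP_RC = 0b01, 0b10, 0b11


def update_psw(old_psw, alu_psw, flag_pattern):
    """Merge ALU-computed flags into the PSW as selected by flag_pattern."""
    mask = 0
    if flag_pattern & 0b01:          # flag_pattern(0): update CY
        mask |= CY_MASK
    if flag_pattern & 0b10:          # flag_pattern(1): update all but CY
        mask |= 0xFF & ~CY_MASK
    # Bits outside the mask keep their previous value.
    return (old_psw & ~mask & 0xFF) | (alu_psw & mask)


print(hex(update_psw(0x00, 0xFF, FP_C)), hex(update_psw(0x00, 0xFF, FP_R)))
\end{verbatim}

A flag\_pattern of 00 leaves the PSW untouched, which corresponds to a $\mu$I
that uses none of the three update flags.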

\section{Notes on the microcode assembler}
\label{ucodeAssembler}

The microcode assembler is a Perl script (\texttt{util/uasm.pl}). Please refer
to the comments in the script for a reference on the usage of the assembler.\\
I will admit up front that the microcode source format and the assembler
program itself are a mess. They were hacked quickly and then often retouched
but never redesigned, in order to avoid the `never ending project' syndrome.\\
Please note that use of the assembler, and the microcode assembly source,
is optional and perhaps overkill for this simple core. All you need to build the
core is the VHDL source file.\\

The Perl assembler itself accounted for more than half of all the bugs I caught
during development.
Though the assembler certainly saved me a lot of mistakes in the hand-assembly
of the microcode, a half-cooked assembler like
this one may do more harm than good. I expect that the program now behaves
correctly; I have made a lot of modifications to the microcode source for
testing purposes and I have not found any more bugs in the assembler. But you
have been warned: don't trust the assembler too much (in case someone actually
wants to mess with these things at all).\\
The assembler reads a
microcode text source file and writes to stdout a microcode table in the form of
a chunk of VHDL code. You are supposed to capture that output and paste it into
the VHDL source (actually, I used another Perl script to do that, but I don't
want to overcomplicate already messy documentation).\\
The assembler can do some other simple operations on the source, for debug
purposes. The invocation options are documented in the program file.\\
You don't need any extra Perl modules or libraries; any distribution of Perl 5
will do. Earlier versions should too but might not; I haven't
tested.

\section{CPU details}
\label{cpuDetails}

\subsection{Synchronous memory and i/o interface}
\label{syncMem}

The core is designed to connect to external synchronous memory similar to
the internal FPGA RAM blocks found in the Spartan series. It can be used with
asynchronous RAM provided that you add the necessary registers (I have used it
with external SRAM included on a development board with no trouble).

Signal `vma' is the master read/write enable. It is designed to be used as
a synchronous rd/wr enable. All other memory/io signals are only valid when vma
is active. Read data is sampled on the positive clock edge following deassertion
of vma. That is, the core expects external memory and io to behave as an
internal FPGA block RAM would.\\
I think the interface is simple enough to be fully described by the
comments in the header of the VHDL source file.

\subsection{Interrupt response}
\label{irqResponse}

Interrupt response has been greatly simplified, but it follows the outline
of the original procedure. The biggest difference is that inta is
active for the entire duration of the instruction, and not only the opcode fetch
cycle.

Whenever a high value is sampled on line intr at any positive clock edge,
an interrupt pending flag is internally raised. After the current instruction
finishes execution, the interrupt pending flag is sampled. If active, it is
cleared, interrupts are disabled and the processor enters an inta cycle. If
inactive, the processor enters a fetch cycle as usual.
The inta cycle is identical to a fetch cycle, with the exception that the inta
signal is asserted high.

The processor will fetch an opcode during the first inta cycle and will
execute it normally, except the PC increment will not happen and inta will be
high for the duration of the instruction. Note that though PC increment is
inhibited while inta is high, PC can be explicitly changed (rst, jmp, etc.).
After the special inta instruction execution is done, the processor
resumes normal execution, with interrupts disabled.\\
The above means that any instruction (even XTHL, which the original 8080
forbids) can be used as an interrupt vector and will be executed normally. The
core has been tested with rst, lxi and inr, for example.

Since there's no M1 signal available, feeding multi-byte instructions as
interrupt vectors can be a little complicated. It is up to you to deal with this
situation (e.g. use only single-byte vectors or make up some sort of cycle
counter).

\subsection{Instruction timing}
\label{timing}

This core is slower than the original in terms of clocks per instruction.
Since the original 8080 was itself one of the slowest micros ever, this does not
say much for the core. Yet, one of these clocked at 50 MHz would outperform an
original 8080 at 25 MHz, which is fast enough for many control applications,
except that there are possibly better alternatives.\\
A comparative table follows.


\begin{tabular}{|l|l|l|l|l|l|l|}
\hline
\multicolumn{7}{|c|}{Instruction timing in clock cycles (core vs. original)} \\ \hline
Instruction & Intel 8080 & Light8080 & & Instruction & Intel 8080 & Light8080 \\ \hline
MOV r1, r2 & 5 & 6 & & XRA M & 7 & 9 \\ \hline
MOV r, M & 7 & 9 & & XRI data & 7 & 9 \\ \hline
MOV M, r & 7 & 9 & & ORA r & 4 & 6 \\ \hline
MVI r, data & 7 & 9 & & ORA M & 7 & 9 \\ \hline
MVI M, data & 10 & 12 & & ORI data & 7 & 9 \\ \hline
LXI rp, data16 & 10 & 14 & & CMP r & 4 & 6 \\ \hline
LDA addr & 13 & 16 & & CMP M & 7 & 9 \\ \hline
STA addr & 13 & 16 & & CPI data & 7 & 9 \\ \hline
LHLD addr & 16 & 19 & & RLC & 4 & 5 \\ \hline
SHLD addr & 16 & 19 & & RRC & 4 & 5 \\ \hline
LDAX rp & 7 & 9 & & RAL & 4 & 5 \\ \hline
STAX rp & 7 & 9 & & RAR & 4 & 5 \\ \hline
XCHG & 4 & 16 & & CMA & 4 & 5 \\ \hline
ADD r & 4 & 6 & & CMC & 4 & 5 \\ \hline
ADD M & 7 & 9 & & STC & 4 & 5 \\ \hline
ADI data & 7 & 9 & & JMP & 10 & 15 \\ \hline
ADC r & 4 & 6 & & Jcc & 10 & 12/16 \\ \hline
ADC M & 7 & 9 & & CALL & 17 & 29 \\ \hline
ACI data & 7 & 9 & & Ccc & 11/17 & 12/30 \\ \hline
SUB r & 4 & 6 & & RET & 10 & 14 \\ \hline
SUB M & 7 & 9 & & Rcc & 5/11 & 5/15 \\ \hline
SUI data & 7 & 9 & & RST n & 11 & 20 \\ \hline
SBB r & 4 & 6 & & PCHL & 5 & 8 \\ \hline
SBB M & 7 & 9 & & PUSH rp & 11 & 19 \\ \hline
SBI data & 7 & 9 & & PUSH PSW & 11 & 19 \\ \hline
INR r & 5 & 6 & & POP rp & 10 & 14 \\ \hline
INR M & 10 & 13 & & POP PSW & 10 & 14 \\ \hline
INX rp & 5 & 6 & & XTHL & 18 & 32 \\ \hline
DCR r & 5 & 6 & & SPHL & 5 & 8 \\ \hline
DCR M & 10 & 14 & & EI & 4 & 5 \\ \hline
DCX rp & 5 & 6 & & DI & 4 & 5 \\ \hline
DAD rp & 10 & 8 & & IN port & 10 & 14 \\ \hline
DAA & 4 & 6 & & OUT port & 10 & 14 \\ \hline
ANA r & 4 & 6 & & HLT & 7 & 5 \\ \hline
ANA M & 7 & 9 & & NOP & 4 & 5 \\ \hline
ANI data & 7 & 9 & & & & \\ \hline
XRA r & 4 & 6 & & & & \\ \hline
\end{tabular}
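
As a worked example of how to read the table, the execution time of an
instruction is its cycle count divided by the clock frequency. Taking the
50 MHz and 25 MHz figures mentioned in this section purely as an illustration,
three rows of the table give:

\begin{eqnarray*}
\mbox{MOV r1, r2:} & & 6 / 50\,\mbox{MHz} = 120\,\mbox{ns}
   \quad \mbox{vs.} \quad 5 / 25\,\mbox{MHz} = 200\,\mbox{ns} \\
\mbox{CALL:} & & 29 / 50\,\mbox{MHz} = 580\,\mbox{ns}
   \quad \mbox{vs.} \quad 17 / 25\,\mbox{MHz} = 680\,\mbox{ns} \\
\mbox{XCHG:} & & 16 / 50\,\mbox{MHz} = 320\,\mbox{ns}
   \quad \mbox{vs.} \quad 4 / 25\,\mbox{MHz} = 160\,\mbox{ns}
\end{eqnarray*}

In other words, at twice the clock frequency the core comes out ahead on any
instruction whose cycle count is less than double the original's. That holds
for most rows of the table, with XCHG as the clearest exception.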

\clearpage

\subsection{Binary compatibility with the original 8080}
\label{compatibility}

Flag AC (auxiliary carry) does not work exactly as in the original 8080. In the
original 8080, ANI and ANA don't clear AC but set it to the OR'ing of
bits 3 of the ALU operands.

In this core, these two instructions instead set the AC flag to 1. In this, the
core is compatible with the 8085 and not with the 8080.

That is the only difference from the original 8080 that I am aware of.
Unfortunately, the only test bench that I have available right now is not
exhaustive enough to pick up that kind of detail. Until I develop a stronger test
bench, full compatibility with the 8080 can't be guaranteed.

\end{document}