OpenCores
URL https://opencores.org/ocsvn/light8080/light8080/trunk

Subversion Repositories light8080

[/] [light8080/] [trunk/] [doc/] [designNotes.tex] - Blame information for rev 63

Go to most recent revision | Details | Compare with Previous | View Log

Line No. Rev Author Line
1 8 ja_rd
\documentclass[11pt]{article}
2
\usepackage{graphicx}    % needed for including graphics e.g. EPS, PS
3
\usepackage{multirow}
4
\usepackage{alltt}
5
\topmargin -1.5cm        % read Lamport p.163
6
\oddsidemargin -0.04cm   % read Lamport p.163
7
\evensidemargin -0.04cm  % same as oddsidemargin but for left-hand pages
8
\textwidth 16.59cm
9
\textheight 21.94cm
10
%\pagestyle{empty}       % Uncomment if don't want page numbers
11
\parskip 7.2pt           % sets spacing between paragraphs
12
%\renewcommand{\baselinestretch}{1.5} % Uncomment for 1.5 spacing between lines
13
\parindent 15pt          % sets leading space for paragraphs
14
 
15
\begin{document}
16
 
17
\section{Basic behavior}
18
\label{basics}
19
 
20
The microcoded machine ($\mu$M) is built around a register bank and an 8-bit
21
ALU with registered operands T1 and T2. It performs all its operations in two
22
cycles, so I have divided it in two stages: an operand stage and an ALU stage.
23
This is nothing more than a 2-stage pipeline. \\
24
 
25
In the operand stage, registers T1 and T2 are loaded with either the
26
contents of the register bank (RB) or the input signal DI.\\
27
In the ALU stage, the ALU output is written back into the RB or loaded
28
into the output register DO. Besides, flags are updated, or not, according to
29
the microinstruction ($\mu$I) in execution.\\
30
 
31
Every microinstruction controls the operation of the operand stage
32
and the succeeding ALU stage; that is, the execution of a $\mu$I extends over 2
33
succeeding clock cycles, and microinstructions overlap each other. This means
34
that the part of the $\mu$I that controls the 2nd stage has to be pipelined; in
35
the VHDL code, I have divided the $\mu$I in a field\_1 and a field\_2, the
36
latter of which is registered (pipelined) and controls the 2nd $\mu$M stage
37
(ALU). \\
38
Many of the control signals are encoded in the microinstructions in what I
39
have improperly called flags. You will see many references to flags in the
40
following text (\#end,\#decode, etc.). They are just signals that you can
41
activate individually in each $\mu$I, some are active in the 1st stage, some in
42
the 2nd. They are all explained in section ~\ref{ucodeFlags}. \\
43
 
44
Note that microinstructions are atomic: both stages are guaranteed to
45
execute in all circumstances. Once the 1st stage of a $\mu$I has executed,
46
the only thing that can prevent the execution of the 2nd stage is a reset.\\
47
It might have been easier to design the machine so that microinstructions
48
executed in one cycle, thus needing no pipeline for the $\mu$I itself. I
49
arbitrarily chose to 'split' the microcode execution, figuring that it would be
50
easier for me to understand and program the microcode; in hindsight it may have
51
been a mistake but in the end, once the debugging is over, it makes little
52
difference.\\
53
 
54
The core as it is now does not support wait states: it does all its
55
external accesses (memory or i/o, read or write) in one clock cycle. It would
56
not be difficult to improve this with some little modification to the
57
micromachine, without changes to the microcode.\\
58
Since the microcode rom is the same type of memory as will be used for program
59
memory, the main advantage of microprogramming is lost. Thus, it would make
60
sense to develop the core a bit further with support for wait states, so it
61
could take advantage of the speed difference between the FPGA and external slow
62
memory.\\
63
The register bank reads asynchronously, while writes are synchronous. This
64
is the standard behaviour of a Spartan LUT-based RAM. The register bank holds
65
all the 8080 registers, including the accumulator, plus temporary, 'hidden'
66
registers (x,y,w,z). Only the PSW register is held out of the register bank, in
67
a DFF-8 register.
68
 
69
\section{Micromachine control}
70
\label{umachineControl}
71
 
72
\subsection{Microcode operation}
73
\label{ucodeOperation}
74
 
75
There is little more to the core that what has already been said; all the
76
CPU operations are microcoded, including interrupt response, reset and
77
instruction opcode fetch. The microcode source code can be seen in file
78
\texttt{ucode/light8080.m80}, in a format I expect will be less obscure than a
79
plain vhdl constant table.\\
80
 
81
The microcode table is a synchronous ROM with 512 32-bit words, designed
82
to fit in a Spartan 3 block ram. Each 32-bit word makes up a microinstruction.
83
The microcode 'program counter' (uc\_addr in the VHDL code) thus is a 9-bit
84
register.\\
85
Out of those 512 words, 256 (the upper half of the table) are used as a
86
jump-table for instruction decoding. Each entry at 256+NN contains a 'JSR'
87
$\mu$I to the start of the microcode for the instruction whose opcode is NN.
88
This seemingly unefficient use of RAM is in fact an optimization for the
89
Spartan-3 architecture to which this design is tailored –-- the 2KB RAM blocks
90
are too large for the microcode so I chose to fill them up with the decoding
91
table.\\
92
This scheme is less than efficient where smaller RAM blocks are available (e.g.
93
Altera Stratix).\\
94
The jump table is built automatically by the microcode
95
assembler, as explained in section ~\ref{ucodeAssembler}.\\
96
The upper half of the table can only be used for decoding; JSR
97
instructions can only point to the lower half, and execution from address 0x0ff
98
rolls over to 0x00 (or would; the actual microcode does not use this
99
'feature').\\
100
 
101
The ucode address counter uc\_addr has a number of possible sources: the
102
micromachine supports one level of micro-subroutine calls; it can also
103
return from those calls; the uc\_addr gets loaded with some constant values upon
104
reset, interrupt or instruction fetch. And finally, there is the decoding jump
105
table mentioned above. So, in summary, these are the possible sources of
106
uc\_addr each cycle:
107
 
108
\begin{itemize}
109
\item Constant value of 0x0001 at reset (see VHDL source for details).
110
\item Constant value of 0x0003 at the beginning (fetch cycle) of every
111
instruction.
112
\item Constant value of 0x0007 at interrupt acknowledge.
113
\item uc\_addr + 1 in normal microinstruction execution
114
\item Some 8-bit value included in JSR microinstructions (calls).
115
\item The return value preserved in the last JSR (used when flag \#ret is
116
raised)
117
\end{itemize}
118
 
119
All of this is readily apparent, I hope, by inspecting the VHDL source.
120
Note that there is only one jump microinstruction (JSR) which doubles as 'call';
121
whenever a jump is taken the the 1-level-deep 'return stack' is loaded with
122
the return address (address of the $\mu$I following the jump). You just have to
123
ignore the return address when you don't need it (e.g. the jumps in the decoding
124
jump table). I admit this scheme is awkward and inflexible; but it was the first
125
I devised, it works and fits the area budget: more than enough in this project.
126
A list of all predefined, 'special' microcode addresses follows.\\
127
\begin{itemize}
128
\item \textbf{0x001 –-- reset}\\
129
After reset, the $\mu$I program counter (uc\_addr in the VHDL code) is
130
initialized to 0x00. The program counter works as a pre-increment counter when
131
reading the microcode rom, so the $\mu$I at address 0 never gets executed (unless
132
'rolling over' from address 0x0ff, which the actual microcode does not). Reset
133
starts at address 1 and takes 2 microinstructions to clear PC to 0x0000. It does
134
nothing else. After clearing the PC the microcode runs into the fetch routine.
135
\item \textbf{0x003 –-- fetch}\\
136
The fetch routine places the PC in the address output lines while postincrementing
137
it, and then enables a memory read cycle. In doing so it relies on
138
T2 being 0x00 (necessary for the ADC to behave like an INC in the oversimplified
139
ALU), which is always true by design. After the fetch is done, the \#decode flag
140
is raised, which instructs the micromachine to take the value in the DI signal
141
(data input from external memory) and load it into the IR and the microcode
142
address counter, while setting the high address bit to 1. At the resulting
143
address there will be a JSR $\mu$I pointing to the microcode for the 8080 opcode in
144
question (the microcode assembler will make sure of that). The \#decode flag will
145
also clear registers T1 and T2.
146
\item \textbf{0x007 –-- halt}\\
147
Whenever a HALT instruction is executed, the \#halt flag is raised, which
148
when used in the same $\mu$I as flag \#end, makes the the micromachine jump to this
149
address. The $\mu$I at this address does nothing but raise flags \#halt and \#end. The
150
micromachine will keep jumping to this address until the halt state is left,
151
something which can only happen by reset or by interrupt. The \#halt flag, when
152
raised, sets the halt output signal, which will be cleared when the CPU exits
153
the halt state.
154
\end{itemize}
155
 
156
\subsection{Conditional jumps}
157
\label{conditionalJumps}
158
 
159
There is a conditional branch microinstruction: TJSR. This instruction
160
tests certain condition and, if the condition is true, performs exactly as JSR.
161
Otherwise, it ends the microcode execution exactly as if the flag \#end had been
162
raised. This microinstruction has been made for the conditional branches and
163
returns of the 8080 CPU and is not flexible enough for any other use.
164
The condition tested is encoded in the register IR, in the field ccc (bits
165
5..3), as encoded in the conditional instructions of the 8080 –-- you can look
166
them up in any 8080 reference. Flags are updated in the 2nd stage, so a TJSR
167
cannot test the flags modified by the previous $\mu$I. But it is not necessary; this
168
instruction will always be used to test conditions set by previous 8080
169
instructions, separated at least by the opcode fetch $\mu$Is, and probably many
170
more. Thus, the condition flags will always be valid upon testing.
171
 
172
\subsection{Implicit operations}
173
\label{implicitOperations}
174
 
175
Most micromachine operations happen only when explicitly commanded. But
176
some happen automatically and have to be taken into account when coding the
177
microprogram:
178
 
179
\begin{enumerate}
180
\item Register IR is loaded automatically when the flag \#decode is raised. The
181
microcode program counter is loaded automatically with the same value as
182
the IR, as has been explained above. From that point on, execution resumes
183
normally: the jump table contains normal JSR microinstructions.
184
\item T1 is cleared to 0x00 at reset, when the flag \#decode is active or when
185
the flag \#clrt1 is used.
186
\item T2 is cleared to 0x00 at reset, when the flag \#decode is active or when
187
the flag \#end is used.
188
\item Microcode flow control:
189
  \begin{enumerate}
190
  \item When flag \#end is raised, execution continues at $\mu$code address
191
        0x0003.
192
  \item When both flags \#halt and \#end are raised, execution continues at
193
        $\mu$code address 0x0007, unless there is an interrupt pending.
194
  \item Otherwise, when flag \#ret is raised, execution continues in the address
195
        following the last JSR executed. If such a return is tried before a JSR
196
        has executed since the last reset, the results are undefined –-- this
197
        should never happen with the microcode source as it is now.
198
  \item If none of the above flags are used, the next $\mu$I is executed.
199
  \end{enumerate}
200
\end{enumerate}
201
 
202
Notice that both T1 and T2 are cleared at the end of the opcode fetch, so
203
they are guaranteed to be 0x00 at the beginning of the instruction microcode.
204
And T2 is cleared too at the end of the instruction microcode, so it is
205
guaranteed clear for its use in the opcode fetch microcode. T1 can be cleared
206
if a microinstruction so requires. Refer to the section on microinstruction
207
flags.
208
 
209
 
210
\section{Microinstructions}
211
\label{uinstructions}
212
 
213
The microcode for the CPU is a source text file encoded in a format
214
described below. This 'microcode source' is assembled by the microcode assembler
215
(described later) which then builds a microcode table in VHDL format. There's
216
nothing stopping you from assembling the microcode by hand directly on the VHDL
217
source, and in a machine this simple it might have been better.
218
 
219
 
220
\subsection{Microcode source format}
221
\label{ucodeFormat}
222
 
223
The microcode source format is more similar to some early assembly language
224
that to other microcodes you may have seen. Each non-blank,
225
non-comment line of code contains a single microinstruction in the format
226
informally described below:
227
 
228
% there must be some cleaner way to do this in TeX...
229
 
230
\begin{alltt}
231
\textless microinstruction line \textgreater :=
232
    [\textless label \textgreater]\footnote{Labels appear alone by themselves in a line} \textbar
233
    \textless operand stage control \textgreater ; \textless ALU stage control \textgreater [; [\textless flag list \textgreater]] \textbar
234
    JSR \textless destination address \textgreater\textbar TJSR \textless destination address \textgreater
235
    \\
236
    \textless label \textgreater := \{':' immediately followed by a common identifier\}
237
    \textless destination address \textgreater := \{an identifier defined as a label anywhere in the file\}
238
    \textless operand stage control \textgreater := \textless op\_reg \textgreater = \textless op\_src \textgreater \textbar NOP
239
    \textless op\_reg \textgreater := T1 \textbar T2
240
    \textless op\_src \textgreater := \textless register \textgreater \textbar DI \textbar \textless IR register \textgreater
241
    \textless IR register \textgreater := \{s\}\textbar\{d\}\textbar\{p\}0\textbar\{p\}1\footnote{Registers are specified by IR field}
242
    \textless register \textgreater := \_a\textbar\_b\textbar\_c\textbar\_d\textbar\_e\textbar\_h\textbar\_l\textbar\_f\textbar\_a\textbar\_ph\textbar\_pl\textbar\_x\textbar\_y\textbar\_z\textbar\_w\textbar\_sh\textbar\_sl
243
    \textless ALU stage control \textgreater := \textless alu\_dst \textgreater = \textless alu\_op \textgreater \textbar NOP
244
    \textless alu\_dst \textgreater := \textless register \textgreater \textbar DO
245
    \textless alu\_op \textgreater := add\textbar adc\textbar sub\textbar sbb\textbar and\textbar orl\textbar not\textbar xrl\textbar rla\textbar rra\textbar rlca\textbar rrca\textbar aaa\textbar
246
    t1\textbar rst\textbar daa\textbar cpc\textbar sec\textbar psw
247
    \textless flag list \textgreater := \textless flag \textgreater [, \textless flag \textgreater ...]
248
    \textless flag \textgreater := \#decode\textbar\#di\textbar\#ei\textbar\#io\textbar\#auxcy\textbar\#clrt1\textbar\#halt\textbar\#end\textbar\#ret\textbar\#rd\textbar\#wr\textbar\#setacy
249
          \#ld\_al\textbar\#ld\_addr\textbar\#fp\_c\textbar\#fp\_r\textbar\#fp\_rc \footnote{There are some restrictions on the flags that can be used together} \\
250
\end{alltt}
251
 
252
 
253
Please bear in mind that this is just an informal description; I made
254
it up from my personal notes and the assembler source. The ultimate reference is
255
the microcode source itself and the assembler source.\\
256
Due to the way that flags have been encoded (there's less than one bit per
257
flag in the microinstruction), there are restrictions on what flags can be used
258
together. See section ~\ref{ucodeFlags}.
259
 
260
The assembler will complain if the source does not comply with the
261
expected format; but syntax check is somewhat weak.
262
In the microcode source you will see words like \_\_reset, \_\_fetch, etc.
263
which don't fit the above syntax. Those were supposed to be assembler pragmas,
264
which the assembler would use to enforce the alignment of the microinstructions
265
to certain addresses. I finally decided not to use them and align the
266
instructions myself. The assembler ignores them but I kept them as a reminder.
267
 
268
The 1st part of the $\mu$I controls the ALU operand stage; we can load either
269
T1 or T2 with either the contents of the input signal DI, or the selected
270
register from the register bank. Or, we can do nothing (NOP).\\
271
The 2nd part of the $\mu$I controls the ALU stage; we can instruct the ALU to
272
perform some operation on the operands T1 and T2 loaded by this same
273
instruction, in the previous stage; and we can select where to load the ALU
274
result, eiher in the output register DO or in the register bank. Or we can do
275
nothing of the above (NOP).
276
 
277
The write address for the register bank used in the 2nd stage has to be the
278
same as the read address used in the 1st stage; that is, if both $\mu$I parts use the
279
RB, both have to use the same address (the assembler will enforce this
280
restriction). This is due to an early, silly mistake that I chose not to fix:
281
there is a single $\mu$I field that holds both addresses.\\
282
This is a very annoying limitation that unduly complicates the microcode
283
and wastes many microcode slots for no saving in hardware; I just did not want
284
to make any major refactors until the project is working. As
285
you can see in the VHDL source, the machine is prepared to use 2 independent
286
address fields with little modification. I may do this improvement and others
287
in a later version, but only when I deem the design 'finished' (since the design
288
as it is already exceeds my modest performance target).
289
 
290
 
291
\subsection{Microcode ALU operations}
292
\label{ucodeAluOps}
293
 
294
\begin{tabular}{|l|l|l|l|}
295
\hline
296
\multicolumn{4}{|c|}{ALU operations} \\
297
\hline
298
Operation & encoding & result & notes \\
299
 
300
\hline ADD & 001100 & T2 + T1 & \\
301
\hline ADC & 001101 & T2 + T1 + CY & \\
302
\hline SUB & 001110 & T2 - T1 & \\
303
\hline SBB & 001111 & T2 – T1 - CY & \\
304
\hline AND & 000100 & T1 AND T2 & \\
305
\hline ORL & 000101 & T1 OR T2 & \\
306
\hline NOT & 000110 & NOT T1 & \\
307
\hline XRL & 000111 & T1 XOR T2 & \\
308
\hline RLA & 000000 & 8080 RLC & \\
309
\hline RRA & 000001 & 8080 RRC & \\
310
\hline RLCA & 000010 & 8080 RAL & \\
311
\hline RRCA & 000011 & 8080 RAR & \\
312
\hline T1 & 010111 & T1 & \\
313
\hline RST & 011111 & 8*IR(5..3) & as per RST instruction \\
314
\hline DAA & 101000 & DAA T1 & but only after executing 2 in a row \\
315
\hline CPC & 101100 & UNDEFINED & CY complemented \\
316
\hline SEC & 101101 & UNDEFINED & CY set \\
317
\hline PSW & 110000 & PSW & \\
318
\hline
319
 
320
\end{tabular}
321
 
322
 
323
 
324
Notice that ALU operation DAA takes two cycles to complete; it uses a
325
dedicated circuit with an extra pipeline stage. So it has to be executed twice
326
in a row before taking the result -- refer to microcode source for an example.\\
327
The PSW register is updated with the ALU result at every cycle, whatever
328
ALU operation is executed –- though every ALU operation computes flags by
329
different means, as it is apparent in the case of CY. Which flags are updated,
330
and which keep their previous values, is defined by a microinstruction field
331
named flag\_pattern. See the VHDL code for details.
332
 
333
 
334
\subsection{Microcode binary format}
335
\label{ucodeBinFormat}
336
 
337
\begin{tabular}{|l|l|l|}
338
\hline
339
\multicolumn{3}{|c|}{Microcode word bitfields} \\ \hline
340
POS & VHDL NAME & PURPOSE \\ \hline
341
31..29 & uc\_flags1 & Encoded flag of group 1 (see section on flags) \\ \hline
342
28..26 & uc\_flags2 & Encoded flag of group 2 (see section on flags) \\ \hline
343
25 & load\_addr & Address register load enable (note 1) \\ \hline
344
24 & load\_al & AL load enable (note 1) \\ \hline
345
23 & load\_t1 & T1 load enable \\ \hline
346
22 & load\_t2 & T2 load enable \\ \hline
347
21 & mux\_in & T1/T2 source mux control (0 for DI, 1 for reg bank) \\ \hline
348
20..19 & rb\_addr\_sel & Register bank address source control (note 2) \\ \hline
349
18..15 & ra\_field & Register bank address (used both for write and read) \\ \hline
350
14 & (unused) & Reserved \\ \hline
351
13..10 & (unused) & Reserved for write register bank address, unused yet \\ \hline
352
11..10 & uc\_jmp\_addr(7..6) & JSR/TJSR jump address, higher 2 bits \\ \hline
353
9..8 & flag\_pattern & PSW flag update control (note 3) (pipelined signal) \\ \hline
354
7 & load\_do & DO load enable (note 4) (pipelined signal) \\ \hline
355
6 & we\_rb & Register bank write enable (pipelined signal) \\ \hline
356
5..0 & uc\_jmp\_addr(5..0) & JSR/TJSR jump address, lower 6 bits \\ \hline
357
5..0 & (several) & Encoded ALU operation \\ \hline
358
\end{tabular}
359
 
360
\begin{itemize}
361
\item {\bf Note 1: load\_al}\\
362
AL is a temporary register for the lower byte of the external 16 bit
363
address. The memory interface (and the IO interface) assumes external
364
synchronous memory, so the 16 bit address has to be externally loaded as
365
commanded by load\_addr.
366
Note that both halves of the address signal load directly from the
367
register bank output; you can load AL with PC, for instance, in the same cycle
368
in which you modify the PC –- AL will load with the pre-modified value.
369
 
370
\item {\bf Note 2 : rb\_addr\_sel}\\
371
A microinstruction can access any register as specified by ra\_field, or
372
the register fields in the 8080 instruction opcode: S, D and RP (the
373
microinstruction can select which register of the pair). In the microcode source
374
this is encoded like this:
375
\begin{description}
376
\item[\{s\}] $\Rightarrow$ 0 \& SSS
377
\item[\{d\}] $\Rightarrow$ 0 \& DDD
378
\item[\{p\}0] $\Rightarrow$ 1 \& PP \& 0 (HIGH byte of register pair)
379
\item[\{p\}1] $\Rightarrow$ 1 \& PP \& 1 (LOW byte of register pair)
380
\end{description}
381
\small SSS = IR(5 downto 3) (source register)\\
382
\small DDD = IR(2 downto 0) (destination register)\\
383
\small PP = IR(5 downto 4) (register pair)\\
384
 
385
\item {\bf Note 3 : flag\_pattern}\\
386
Selects which flags of the PSW, if any, will be updated by the
387
microinstruction:
388
\begin{itemize}
389
\item When flag\_pattern(0)='1', CY is updated in the PSW.
390
\item When flag\_pattern(1)='1', all flags other than CY are updated in the PSW.
391
\end{itemize}
392
 
393
\item {\bf Note 4 : load\_do}\\
394
DO is the data ouput register that is loaded with the ALU output, so the
395
load enable signal is pipelined.
396
 
397
\item {\bf Note 5 : JSR-H and JSR-L}\\
398
These fields overlap existing fields which are unused in JSR/TJSR
399
instructions (fields which can be used with no secondary effects).
400
 
401
\end{itemize}
402
 
403
\subsection{Microcode flags}
404
\label{ucodeFlags}
405
 
406
 
407
Flags is what I have called those signals of the microinstruction that you
408
assert individually in the microcode source. Due to the way they have been
409
encoded, I have separated them in two groups. Only one flag in each group can be
410
used in any instruction. These are all the flags in the format thay appear in
411
the microcode source:
412
 
413
\begin{itemize}
414
\item Flags from group 1: use only one of these
415
  \begin{itemize}
416
  \item \#decode : Load address counter and IR with contents of data input
417
  lines, thus starting opcode decoging.
418
  \item \#ei : Set interrupt enable register.
419
  \item \#di : Reset interrupt enable register.
420
  \item \#io : Activate io signal for 1st cycle.
421
  \item \#auxcy : Use aux carry instead of regular carry for this $\mu$I.
422
  \item \#clrt1 : Clear T1 at the end of 1st cycle.
423
  \item \#halt : Jump to microcode address 0x07 without saving return value,
424
  when used with flag \#end, and only if there is no interrupt
425
  pending. Ignored otherwise.
426
  \end{itemize}
427
 
428
\item Flags from group 2: use only one of these
429
  \begin{itemize}
430
  \item \#setacy : Set aux carry at the start of 1st cycle (used for ++).
431
  \item \#end : Jump to microinstruction address 3 after the present m.i.
432
  \item \#ret : Jump to address saved by the last JST or TJSR m.i.
433
  \item \#rd : Activate rd signal for the 2nd cycle.
434
  \item \#wr : Activate wr signal for the 2nd cycle.
435
  \end{itemize}
436
 
437
\item Independent flags: no restrictions
438
  \begin{itemize}
439
  \item \#ld\_al : Load AL register with register bank output as read by opn. 1
440
  (used in memory and io access).
441
  \item \#ld\_addr : Load address register (H byte = register bank output as read
442
  by operation 1, L byte = AL).
443
  Activate vma signal for 1st cycle.
444
  \end{itemize}
445
 
446
\item PSW update flags: use only one of these
447
  \begin{itemize}
448
  \item \#fp\_r : This instruction updates all PSW flags except for C.
449
  \item \#fp\_c : This instruction updates only the C flag in the PSW.
450
  \item \#fp\_rc : This instruction updates all the flags in the PSW.
451
  \end{itemize}
452
 
453
\end{itemize}
454
 
455
\section{Notes on the microcode assembler}
456
\label{ucodeAssembler}
457
 
458
The microcode assembler is a Perl script (\texttt{util/uasm.pl}). Please refer
459
to the comments in the script for a reference on the usage of the assembler.\\
460
I will admit up front that the microcode source format and the assembler
461
program itself are a mess. They were hacked quickly and then often retouched
462
but never redesigned, in order to avoid the 'never ending project' syndrome.\\
463
Please note that use of the assembler, and the microcode assembly source,
464
is optional and perhaps overkill for this simple core. All you need to build the
465
core is the vhdl source file.\\
466
 
467
The perl assembler itself accounted for more than half of all the bugs I caught
468
during development.
469
Though the assembler certainly saved me a lot of mistakes in the hand-assembly
470
of the microcode, a half-cooked assembler like
471
this one may do more harm than good. I expect that the program now behaves
472
correctly; I have done a lot of modifications to the microcode source for
473
testing purposes and I have not found any more bugs in the assembler. But you
474
have been warned: don't trust the assembler too much (in case someone actually
475
wants to mess with these things at all).\\
476
The assembler is a Perl program (\texttt{util/uasm.pl}) that will read a
477
microcode text source file and write to stdout a microcode table in the form of
478
a chunk of VHDL code. You are supposed to capture that output and paste it into
479
the VHDL source (Actually, I used another perl script to do that, but I don't
480
want to overcomplicate an already messy documentation).\\
481
The assembler can do some other simple operations on the source, for debug
482
purposes. The invocation options are documented in the program file.\\
483
You don't need any extra Perl modules or libraries, any distribution of Perl 5
484
will do -– earlier versions should too but might not, I haven't
485
tested.
486
 
487
\section{CPU details}
488
\label{cpuDetails}
489
 
490
\subsection{Synchronous memory and i/o interface}
491
\label{syncMem}
492
 
493
The core is designed to connect to external synchronous memory similar to
494
the internal fpga ram blocks found in the Spartan series. It can be used with
495
asynchronous ram provided that you add the necessary registers (I have used it
496
with external SRAM included on a development board with no trouble).
497
 
498
Signal 'vma' is the master read/write enable. It is designed to be used as
499
a synchronous rd/wr enable. All other memory/io signals are only valid when vma
500
is active. Read data is sampled in the positive clock edge following deassertion
501
of vma. Than is, the core expects external memory and io to behave as an
502
internal fpga block ram would.\\
503
I think the interface is simple enough to be fully described by the
504
comments in the header of the VHDL source file.
505
 
506
\subsection{Interrupt response}
507
\label{irqResponse}
508
 
509
Interrupt response has been greatly simplified, but it follows the outline
510
of the original procedure. The biggest difference is that inta is
511
active for the entire duration of the instruction, and not only the opcode fetch
512
cycle.
513
 
514
Whenever a high value is sampled in line intr in any positive clock edge,
515
an interrupt pending flag is internally raised. After the current instruction
516
finishes execution, the interrupt pending flag is sampled. If active, it is
517
cleared, interrupts are disabled and the processor enters an inta cycle. If
518
inactive, the processor enters a fetch cycle as usual.
519
The inta cycle is identical to a fetch cycle, with the exception that inta
520
signal is asserted high.
521
 
522
The processor will fetch an opcode during the first inta cycle and will
523
execute it normally, except the PC increment will not happen and inta will be
524
high for the duration of the instruction. Note that though pc increment is
525
inhibited while inta is high, pc can be explicitly changed (rst, jmp, etc.).
526
After the special inta instruction execution is done, the processor
527
resumes normal execution, with interrupts disabled.\\
528
The above means that any instruction (even XTHL, which the original 8080
529
forbids) can be used as an interrupt vector and will be executed normally. The
530
core has been tested with rst, lxi and inr, for example.
531
 
532
Since there's no M1 signal available, feeding multi-byte instructions as
533
interrupt vectors can be a little complicated. It is up to you to deal with this
534
situation (i.e. use only single-byte vectors or make up some sort of cycle
535
counter).
536
 
537
\subsection{Instruction timing}
538
\label{timing}
539
 
540
This core is slower than the original in terms of clocks per instruction.
541
Since the original 8080 was itself one of the slowest micros ever, this does not
542
say much for the core. Yet, one of these clocked at 50MHz would outperform an
543
original 8080 at 25 Mhz, which is fast enough for many control applications ---
544
except that there are possibly better alternatives.\\
545
A comparative table follows.
546
 
547
 
548
\begin{tabular}{|l|l|l|l|l|l|l|}
549
\hline
550
\multicolumn{7}{|c|}{Instruction timing (core vs. original)} \\ \hline
551
 
552
Opcode & Intel 8080 & Light8080 & & Opcode & Intel 8080 & Light8080 \\ \hline
553
 
554
MOV r1, r2 & 5 & 6 & &        XRA M & 7 & 9 \\ \hline
555
MOV r, M & 7 & 9 & &          XRI data & 7 & 9 \\ \hline
556
MOV M, r & 7 & 9 & &          ORA r & 4 & 6 \\ \hline
557
MVI r, data & 7 & 9 & &       ORA M & 7 & 9 \\ \hline
558
MVI M, data & 10 & 12 & &     ORI data & 7 & 9 \\ \hline
559
LXI rp, data16 & 10 & 14 & &  CMP r & 4 & 6 \\ \hline
560
LDA addr & 13 & 16 & &        CMP M & 7 & 9 \\ \hline
561
STA addr & 13 & 16 & &        CPI data & 7 & 9 \\ \hline
562
LHLD addr & 16 & 19 & &       RLC & 4 & 5 \\ \hline
563
SHLD addr & 16 & 19 & &       RRC & 4 & 5 \\ \hline
564
LDAX rp & 7 & 9 & &           RAL & 4 & 5 \\ \hline
565
STAX rp & 7 & 9 & &           RAR & 4 & 5 \\ \hline
566
XCHG & 4 & 16 & &             CMA & 4 & 5 \\ \hline
567
ADD r & 4 & 6 & &             CMC & 4 & 5 \\ \hline
568
ADD M & 7 & 9 & &             STC & 4 & 5 \\ \hline
569
ADI data & 7 & 9 & &          JMP & 10 & 15 \\ \hline
570
ADC r & 4 & 6 & &             Jcc & 10 & 12/16 \\ \hline
571
ADC M & 7 & 9 & &             CALL & 17 & 29 \\ \hline
572
ACI data & 7 & 9 & &          Ccc & 11/17 & 12/30 \\ \hline
573
SUB r & 4 & 6 & &             RET & 10 & 14 \\ \hline
574
SUB M & 7 & 9 & &             Rcc & 5/11 & 5/15 \\ \hline
575
SUI data & 7 & 9 & &          RST n & 11 & 20 \\ \hline
576
SBB r & 4 & 6 & &             PCHL & 5 & 8 \\ \hline
577
SBB M & 7 & 9 & &             PUSH rp & 11 & 19 \\ \hline
578
SBI data & 7 & 9 & &          PUSH PSW & 11 & 19 \\ \hline
579
INR r & 5 & 6 & &             POP rp & 10 & 14 \\ \hline
580
INR M & 10 & 13 & & POP PSW & 10 & 14 \\ \hline
581
INX rp & 5 & 6 & & XTHL & 18 & 32 \\ \hline
582
DCR r & 5 & 6 & & SPHL & 5 & 8 \\ \hline
583
DCR M & 10 & 14 & & EI & 4 & 5 \\ \hline
584
DCX rp & 5 & 6 & & DI & 4 & 5 \\ \hline
585
DAD rp & 10 & 8 & & IN port & 10 & 14 \\ \hline
586
DAA & 4 & 6 & & OUT port & 10 & 14 \\ \hline
587
ANA r & 4 & 6 & & HLT & 7 & 5 \\ \hline
588
ANA M & 7 & 9 & & NOP & 4 & 5 \\ \hline
589
ANI data & 7 & 9 & & & & \\ \hline
590
XRA r & 4 & 6 & & & & \\ \hline
591
 
592
\end{tabular}
593
 
594
 
595
\end{document}
596
 

powered by: WebSVN 2.1.0

© copyright 1999-2025 OpenCores.org, equivalent to Oliscience, all rights reserved. OpenCores®, registered trademark.