1 |
2 |
jsauermann |
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
|
2 |
|
|
"http://www.w3.org/TR/html4/strict.dtd">
|
3 |
|
|
<HTML>
|
4 |
|
|
<HEAD>
|
5 |
|
|
<TITLE>html/Opcode_Decoder</TITLE>
|
6 |
|
|
<META NAME="generator" CONTENT="HTML::TextToHTML v2.46">
|
7 |
|
|
<LINK REL="stylesheet" TYPE="text/css" HREF="lecture.css">
|
8 |
|
|
</HEAD>
|
9 |
|
|
<BODY>
|
10 |
|
|
<P><table class="ttop"><th class="tpre"><a href="06_Data_Path.html">Previous Lesson</a></th><th class="ttop"><a href="toc.html">Table of Content</a></th><th class="tnxt"><a href="08_IO.html">Next Lesson</a></th></table>
|
11 |
|
|
<hr>
|
12 |
|
|
|
13 |
|
|
<H1><A NAME="section_1">7 OPCODE DECODER</A></H1>
|
14 |
|
|
|
15 |
|
|
<P>In this lesson we will describe the opcode decoder. We will also learn
|
16 |
|
|
how the different instructions provided by the CPU will be implemented. We
|
17 |
|
|
will not describe every opcode, but rather groups of instructions whose
|
18 |
|
|
individual instructions are rather similar.
|
19 |
|
|
|
20 |
|
|
<P>The opcode decoder is the middle stage of our CPU pipeline. Therefore its
|
21 |
|
|
inputs are defined by the outputs of the previous stage and its outputs
|
22 |
|
|
are defined by the inputs of the next stage.
|
23 |
|
|
|
24 |
|
|
<H2><A NAME="section_1_1">7.1 Inputs of the Opcode Decoder</A></H2>
|
25 |
|
|
|
26 |
|
|
<UL>
|
27 |
|
|
<LI><STRONG>CLK</STRONG> is the clock signal. The opcode decoder is a pure pipeline stage
|
28 |
|
|
so that no internal state is kept between clock cycles. The output of
|
29 |
|
|
the opcode decoder is a pure function of its inputs.
|
30 |
|
|
<LI><STRONG>OPC</STRONG> is the opcode being decoded.
|
31 |
|
|
<LI><STRONG>PC</STRONG> is the program counter (the address in the program memory from
|
32 |
|
|
which OPC was fetched).
|
33 |
|
|
<LI><STRONG>T0</STRONG> is '1' in the first cycle of the execution of the opcode. This allows
|
34 |
|
|
for output signals of two-cycle instructions that are different in the
|
35 |
|
|
first and the second cycle.
|
36 |
|
|
</UL>
|
37 |
|
|
<H2><A NAME="section_1_2">7.2 Outputs of the Opcode Decoder</A></H2>
|
38 |
|
|
|
39 |
|
|
<P>Most data buses of the CPU are contained in the data path. In contrast,
|
40 |
|
|
most control signals are generated in the opcode decoder. We start with
|
41 |
|
|
a complete list of these control signals and their purpose. There are
|
42 |
|
|
two groups of signals: select signals and write enable signals. Select
|
43 |
|
|
signals are used earlier in the execution of the opcode for controlling
|
44 |
|
|
multiplexers. The write enable signals are used at the end of the execution
|
45 |
|
|
to determine where results shall be stored. Select signals are generally
|
46 |
|
|
more time-critical than write enable signals.
|
47 |
|
|
|
48 |
|
|
<P>The select signals are:
|
49 |
|
|
|
50 |
|
|
<UL>
|
51 |
|
|
<LI><STRONG>ALU_OP</STRONG> defines which particular ALU operation (like <STRONG>ADD</STRONG>, <STRONG>ADC</STRONG>, <STRONG>AND</STRONG>,
|
52 |
|
|
...) the ALU shall perform.
|
53 |
|
|
<LI><STRONG>AMOD</STRONG> defines which addressing mode (like <STRONG>absolute</STRONG>, <STRONG>Z+</STRONG>, <STRONG>-SP</STRONG>, etc.) shall
|
54 |
|
|
be used for data memory accesses.
|
55 |
|
|
<LI><STRONG>BIT</STRONG> is a bit value (0 or 1) and a bit number used in bit instructions.
|
56 |
|
|
<LI><STRONG>DDDDD</STRONG> defines the destination register or register pair (if any)
|
57 |
|
|
for storing the result of an operation. It also defines the first
|
58 |
|
|
source register or register pair of a dyadic instruction.
|
59 |
|
|
<LI><STRONG>IMM</STRONG> defines an immediate value or branch address that is computed
|
60 |
|
|
from the opcode.
|
61 |
|
|
<LI><STRONG>JADR</STRONG> is a branch address.
|
62 |
|
|
<LI><STRONG>OPC</STRONG> is the opcode being decoded, or 0 if the opcode was invalidated
|
63 |
|
|
by means of <STRONG>SKIP</STRONG>.
|
64 |
|
|
<LI><STRONG>PC</STRONG> is the <STRONG>PC</STRONG> from which <STRONG>OPC</STRONG> was fetched.
|
65 |
|
|
<LI><STRONG>PC_OP</STRONG> defines an operation to be performed on the <STRONG>PC</STRONG> (such as branching).
|
66 |
|
|
<LI><STRONG>PMS</STRONG> is set when the address defined by <STRONG>AMOD</STRONG> is a program memory address
|
67 |
|
|
rather than a data memory address.
|
68 |
|
|
<LI><STRONG>RD_M</STRONG> is set for reads from the data memory.
|
69 |
|
|
<LI><STRONG>RRRRR</STRONG> defines the second register or register pair of a dyadic
|
70 |
|
|
instruction.
|
71 |
|
|
<LI><STRONG>RSEL</STRONG> selects the source of the second operand in the ALU.
|
72 |
|
|
This can be a register (on the <STRONG>R</STRONG> input), an immediate value (on
|
73 |
|
|
the <STRONG>IMM</STRONG> input), or data from memory or I/O (on the <STRONG>DIN</STRONG> input).
|
74 |
|
|
</UL>
|
75 |
|
|
<P>The write enable signals are:
|
76 |
|
|
|
77 |
|
|
<UL>
|
78 |
|
|
<LI><STRONG>WE_01</STRONG> is set when register pair 0 shall be written. This is used for
|
79 |
|
|
multiplication instructions that store the multiplication product in
|
80 |
|
|
register pair 0.
|
81 |
|
|
<LI><STRONG>WE_D</STRONG> is set when the register or register pair <STRONG>DDDDD</STRONG> shall be written.
|
82 |
|
|
If both bits are set then the entire pair shall be written and <STRONG>DDDDD[0]</STRONG>
|
83 |
|
|
is 0. Otherwise <STRONG>WE_D[1]</STRONG> is 0, and one of the registers (as defined by
|
84 |
|
|
<STRONG>DDDDD[0]</STRONG>) shall be written.
|
85 |
|
|
<LI><STRONG>WE_F</STRONG> is set when the status register (flags) shall be written.
|
86 |
|
|
<LI><STRONG>WE_M</STRONG> is set when the memory (including memory mapped general purpose
|
87 |
|
|
registers and I/O registers) shall be written. If set, then the <STRONG>AMOD</STRONG>
|
88 |
|
|
output defines how to compute the memory address.
|
89 |
|
|
<LI><STRONG>WE_XYZS</STRONG> is set when the stack pointer or one of the pointer register pairs
|
90 |
|
|
<STRONG>X</STRONG>, <STRONG>Y</STRONG>, or <STRONG>Z</STRONG> shall be written. Which of these registers is meant is
|
91 |
|
|
encoded in <STRONG>AMOD</STRONG>.
|
92 |
|
|
</UL>
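<P>How a register file could interpret <STRONG>WE_D</STRONG> together with <STRONG>DDDDD[0]</STRONG>
can be sketched as follows. This is only an illustration under stated assumptions:
the entity <STRONG>we_d_sketch</STRONG> and the function <STRONG>byte_enables</STRONG> are hypothetical
names that do not appear in the CPU sources, and the actual register file may
derive its write strobes differently.

<pre class="vhdl">
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical self-checking sketch, not taken from the CPU sources.
entity we_d_sketch is
end we_d_sketch;

architecture sim of we_d_sketch is

    -- Derive one write strobe per register of the addressed pair.
    -- result(1) enables the odd register, result(0) the even register.
    function byte_enables(we_d : std_logic_vector(1 downto 0);
                          d0   : std_logic)
             return std_logic_vector is
    begin
        if we_d = "11" then        -- 16-bit write: both registers
            return "11";
        elsif we_d = "01" then     -- 8-bit write: DDDDD[0] picks odd or even
            return d0 & (not d0);
        else                       -- "00" (and the unused "10"): no write
            return "00";
        end if;
    end function;

begin
    process
    begin
        assert byte_enables("00", '1') = "00" report "no write"   severity failure;
        assert byte_enables("01", '0') = "01" report "even byte"  severity failure;
        assert byte_enables("01", '1') = "10" report "odd byte"   severity failure;
        assert byte_enables("11", '0') = "11" report "whole pair" severity failure;
        wait;
    end process;
end sim;
</pre>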
|
93 |
|
|
<H2><A NAME="section_1_3">7.3 Structure of the Opcode Decoder</A></H2>
|
94 |
|
|
|
95 |
|
|
<P>The VHDL code of the opcode decoder consists essentially of a huge case
|
96 |
|
|
statement. At the beginning of the process there is a section
|
97 |
|
|
assigning a default value to each output. Then follows a case statement that
|
98 |
|
|
decodes the upper 6 bits of the opcode:
|
99 |
|
|
|
100 |
|
|
<P><br>
|
101 |
|
|
|
102 |
|
|
<pre class="vhdl">
|
103 |
|
|
|
104 |
|
|
66 process(I_CLK)
|
105 |
|
|
67 begin
|
106 |
|
|
68 if (rising_edge(I_CLK)) then
|
107 |
|
|
69 --
|
108 |
|
|
70 -- set the most common settings as default.
|
109 |
|
|
71 --
|
110 |
|
|
72 Q_ALU_OP <= ALU_D_MV_Q;
|
111 |
|
|
73 Q_AMOD <= AMOD_ABS;
|
112 |
|
|
74 Q_BIT <= I_OPC(10) & I_OPC(2 downto 0);
|
113 |
|
|
75 Q_DDDDD <= I_OPC(8 downto 4);
|
114 |
|
|
76 Q_IMM <= X"0000";
|
115 |
|
|
77 Q_JADR <= I_OPC(31 downto 16);
|
116 |
|
|
78 Q_OPC <= I_OPC(15 downto 0);
|
117 |
|
|
79 Q_PC <= I_PC;
|
118 |
|
|
80 Q_PC_OP <= PC_NEXT;
|
119 |
|
|
81 Q_PMS <= '0';
|
120 |
|
|
82 Q_RD_M <= '0';
|
121 |
|
|
83 Q_RRRRR <= I_OPC(9) & I_OPC(3 downto 0);
|
122 |
|
|
84 Q_RSEL <= RS_REG;
|
123 |
|
|
85 Q_WE_D <= "00";
|
124 |
|
|
86 Q_WE_01 <= '0';
|
125 |
|
|
87 Q_WE_F <= '0';
|
126 |
|
|
88 Q_WE_M <= "00";
|
127 |
|
|
89 Q_WE_XYZS <= '0';
|
128 |
|
|
90
|
129 |
|
|
91 case I_OPC(15 downto 10) is
|
130 |
|
|
92 when "000000" =>
|
131 |
|
|
<pre class="filename">
|
132 |
|
|
src/opc_deco.vhd
|
133 |
|
|
</pre></pre>
|
134 |
|
|
<P>
|
135 |
|
|
|
136 |
|
|
<P>...
|
137 |
|
|
|
138 |
|
|
<pre class="vhdl">
|
139 |
|
|
|
140 |
|
|
653 when others =>
|
141 |
|
|
654 end case;
|
142 |
|
|
655 end if;
|
143 |
|
|
656 end process;
|
144 |
|
|
<pre class="filename">
|
145 |
|
|
src/opc_deco.vhd
|
146 |
|
|
</pre></pre>
|
147 |
|
|
<P>
|
148 |
|
|
|
149 |
|
|
<P><br>
|
150 |
|
|
|
151 |
|
|
<H2><A NAME="section_1_4">7.4 Default Values for the Outputs</A></H2>
|
152 |
|
|
|
153 |
|
|
<P>The opcode decoder generates quite a few outputs. A typical instruction,
|
154 |
|
|
however, only sets a small fraction of them. For this reason we provide a
|
155 |
|
|
default value for all outputs before the top level case statement,
|
156 |
|
|
as shown above.<BR>
|
157 |
|
|
For each instruction we then only need to specify those outputs that
|
158 |
|
|
differ from the default value.
|
159 |
|
|
|
160 |
|
|
<P>Every default value is either constant or a function of an input.
|
161 |
|
|
Therefore the opcode decoder is a typical "stateless" pipeline stage.
|
162 |
|
|
The default values are chosen so that they do not change
|
163 |
|
|
anything in the other stages (except incrementing the PC, of course).
|
164 |
|
|
In particular, the default values for all write enable signals are '0'.
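<P>The "defaults first, overrides later" pattern relies on the VHDL rule that
the last assignment to a signal within a process is the one that takes effect.
The following reduced sketch shows the idea with only two outputs. The
entity name <STRONG>default_sketch</STRONG> is hypothetical and the fragment is not part of the
CPU sources; the real decoder handles many more outputs and opcodes.

<pre class="vhdl">
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical, reduced decoder with two outputs only.
entity default_sketch is
    port ( I_CLK  : in  std_logic;
           I_OPC  : in  std_logic_vector(15 downto 0);
           Q_WE_D : out std_logic_vector(1 downto 0);
           Q_WE_F : out std_logic);
end default_sketch;

architecture Behavioral of default_sketch is
begin
    process(I_CLK)
    begin
        if rising_edge(I_CLK) then
            -- defaults: write nothing
            Q_WE_D <= "00";
            Q_WE_F <= '0';

            -- a later assignment in the same process replaces the default
            case I_OPC(15 downto 10) is
                when "000011" =>        -- 0000 11rd dddd rrrr - ADD
                    Q_WE_D <= "01";
                    Q_WE_F <= '1';
                when others =>          -- keep the defaults (e.g. NOP)
                    null;
            end case;
        end if;
    end process;
end Behavioral;
</pre>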
|
165 |
|
|
|
166 |
|
|
<H2><A NAME="section_1_5">7.5 Checklist for the Design of an Opcode</A></H2>
|
167 |
|
|
|
168 |
|
|
<P>Designing an opcode starts with asking a number of questions. The answers
|
169 |
|
|
are found in the specification of the opcode. The answers identify the outputs
|
170 |
|
|
that need to be set other than their default values.
|
171 |
|
|
While the instructions are quite different, the questions are always the same:
|
172 |
|
|
|
173 |
|
|
<OL>
|
174 |
|
|
<LI>What operation shall the ALU perform?
|
175 |
|
|
Set <STRONG>ALU_OP</STRONG> and <STRONG>WE_F</STRONG> accordingly.
|
176 |
|
|
<LI>Is a destination register or destination register pair used?
|
177 |
|
|
If so, set <STRONG>DDDDD</STRONG> (and <STRONG>WE_D</STRONG> if written).
|
178 |
|
|
<LI>Is a second register or register pair involved?
|
179 |
|
|
If so, set <STRONG>RRRRR</STRONG>.
|
180 |
|
|
<LI>Does the opcode access the memory?
|
181 |
|
|
If so, set <STRONG>AMOD</STRONG>, <STRONG>PMS</STRONG>, <STRONG>RSEL</STRONG>, <STRONG>RD_M</STRONG>, <STRONG>WE_M</STRONG>, and <STRONG>WE_XYZS</STRONG> accordingly.
|
182 |
|
|
<LI>Is an immediate or implied operand used?
|
183 |
|
|
If so, set <STRONG>IMM</STRONG> and <STRONG>RSEL</STRONG>.
|
184 |
|
|
<LI>Is the program counter modified (other than incrementing it)?
|
185 |
|
|
If so, set <STRONG>PC_OP</STRONG> and <STRONG>SKIP</STRONG>.
|
186 |
|
|
<LI>Is a bit number specified in the opcode?
|
187 |
|
|
If so, set <STRONG>BIT</STRONG>.
|
188 |
|
|
<LI>Are instructions skipped?
|
189 |
|
|
If so, set <STRONG>SKIP</STRONG>.
|
190 |
|
|
</OL>
|
191 |
|
|
<P>Equipped with this checklist we can implement all instructions. We
|
192 |
|
|
start with the simplest instructions and proceed to the more complex
|
193 |
|
|
instructions.
|
194 |
|
|
|
195 |
|
|
<H2><A NAME="section_1_6">7.6 Opcode Implementations</A></H2>
|
196 |
|
|
|
197 |
|
|
<H3><A NAME="section_1_6_1">7.6.1 The NOP instruction</A></H3>
|
198 |
|
|
|
199 |
|
|
<P>The simplest instruction is the NOP instruction which does - nothing.
|
200 |
|
|
The default values set for all outputs do nothing either so there is
|
201 |
|
|
no extra VHDL code needed for this instruction.
|
202 |
|
|
|
203 |
|
|
<H3><A NAME="section_1_6_2">7.6.2 8-bit Monadic Instructions</A></H3>
|
204 |
|
|
|
205 |
|
|
<P>We call an instruction <STRONG>monadic</STRONG> if its opcode contains one register
|
206 |
|
|
number and if the instruction reads the register before computing
|
207 |
|
|
a new value for it.
|
208 |
|
|
|
209 |
|
|
<P>Only items 1. and 2. in our checklist apply. The default value for
|
210 |
|
|
<STRONG>DDDDD</STRONG> is already correct. Thus only <STRONG>ALU_OP</STRONG>, <STRONG>WE_D</STRONG>, and <STRONG>WE_F</STRONG>
|
211 |
|
|
need to be set. We take the <STRONG>DEC Rd</STRONG> instruction as an example:
|
212 |
|
|
|
213 |
|
|
<P><br>
|
214 |
|
|
|
215 |
|
|
<pre class="vhdl">
|
216 |
|
|
|
217 |
|
|
465 --
|
218 |
|
|
466 -- 1001 010d dddd 1010 - DEC
|
219 |
|
|
467 --
|
220 |
|
|
468 Q_ALU_OP <= ALU_DEC;
|
221 |
|
|
469 Q_WE_D <= "01";
|
222 |
|
|
470 Q_WE_F <= '1';
|
223 |
|
|
<pre class="filename">
|
224 |
|
|
src/opc_deco.vhd
|
225 |
|
|
</pre></pre>
|
226 |
|
|
<P>
|
227 |
|
|
|
228 |
|
|
<P><br>
|
229 |
|
|
|
230 |
|
|
<P>All monadic arithmetic/logic instructions are implemented in the same way;
|
231 |
|
|
they differ by their <STRONG>ALU_OP</STRONG>.
|
232 |
|
|
|
233 |
|
|
<H3><A NAME="section_1_6_3">7.6.3 8-bit Dyadic Instructions, Register/Register</A></H3>
|
234 |
|
|
|
235 |
|
|
<P>We call an instruction <STRONG>dyadic</STRONG> if its opcode contains two data sources
|
236 |
|
|
(a data source being a register number or an immediate operand).
|
237 |
|
|
As a consequence of the two data sources, dyadic instructions
|
238 |
|
|
occupy a larger fraction of the opcode space than monadic functions.
|
239 |
|
|
|
240 |
|
|
<P>We take the <STRONG>ADD Rd, Rr</STRONG> opcode as an example.
|
241 |
|
|
|
242 |
|
|
<P>Compared to the monadic functions now item 3. in the checklist applies
|
243 |
|
|
as well. This would mean we have to set <STRONG>RRRRR</STRONG> but by chance the default
|
244 |
|
|
value is already correct. Therefore:
|
245 |
|
|
|
246 |
|
|
<P><br>
|
247 |
|
|
|
248 |
|
|
<pre class="vhdl">
|
249 |
|
|
|
250 |
|
|
165 --
|
251 |
|
|
166 -- 0000 11rd dddd rrrr - ADD
|
252 |
|
|
167 --
|
253 |
|
|
168 Q_ALU_OP <= ALU_ADD;
|
254 |
|
|
169 Q_WE_D <= "01";
|
255 |
|
|
170 Q_WE_F <= '1';
|
256 |
|
|
<pre class="filename">
|
257 |
|
|
src/opc_deco.vhd
|
258 |
|
|
</pre></pre>
|
259 |
|
|
<P>
|
260 |
|
|
|
261 |
|
|
<P><br>
|
262 |
|
|
|
263 |
|
|
<P>The dyadic instructions do not use the I/O address space and therefore they
|
264 |
|
|
completely execute inside the data path. The following figure shows the
|
265 |
|
|
signals in the data path that are used by the <STRONG>ADD Rd, Rr</STRONG> instruction:
|
266 |
|
|
|
267 |
|
|
<P><img src="opcode_decoder_1.png">
|
268 |
|
|
|
269 |
|
|
<P>The opcode for <STRONG>ADD Rd, Rr</STRONG> is <STRONG>0000</STRONG> <STRONG>11rd</STRONG> <STRONG>dddd</STRONG> <STRONG>rrrr</STRONG>.
|
270 |
|
|
The opcode decoder extracts the 'd' bits into the <STRONG>DDDDD</STRONG> signal (blue),
|
271 |
|
|
the 'r' bits into the <STRONG>RRRRR</STRONG> signal (red), and computes <STRONG>ALU_OP</STRONG>, <STRONG>WE_D</STRONG>,
|
272 |
|
|
and <STRONG>WE_F</STRONG> from the remaining bits (green) as above.
|
273 |
|
|
|
274 |
|
|
<P>The register file converts the register numbers <STRONG>Rd</STRONG> and <STRONG>Rr</STRONG> that are
|
275 |
|
|
encoded in the <STRONG>DDDDD</STRONG> and <STRONG>RRRRR</STRONG> signals to the contents of the register
|
276 |
|
|
pairs at its <STRONG>D</STRONG> and <STRONG>R</STRONG> outputs. The lowest bit of the <STRONG>DDDDD</STRONG> and <STRONG>RRRRR</STRONG>
|
277 |
|
|
signals also goes to the ALU (inputs <STRONG>D0</STRONG> and <STRONG>R0</STRONG>) where the odd/even register
|
278 |
|
|
selection from the two register pairs is performed.
|
279 |
|
|
|
280 |
|
|
<P>The decoder also selects the proper <STRONG>ALU_OP</STRONG> from the opcode, which is
|
281 |
|
|
<STRONG>ALU_ADD</STRONG> in this example. With this input, the ALU computes the sum of the
|
282 |
|
|
<STRONG>D</STRONG> and <STRONG>R</STRONG> inputs and drives its <STRONG>DOUT</STRONG> (pink) with the sum.
|
283 |
|
|
It also computes the flags as defined for the <STRONG>ADD</STRONG> opcode.
|
284 |
|
|
|
285 |
|
|
<P>The decoder sets the <STRONG>WE_D</STRONG> and <STRONG>WE_F</STRONG> inputs of the register file
|
286 |
|
|
so that the <STRONG>DOUT</STRONG> and <STRONG>FLAGS</STRONG> outputs of the ALU are written back to the
|
287 |
|
|
register file.
|
288 |
|
|
|
289 |
|
|
<P>All this happens within a single clock cycle, so that the next instruction
|
290 |
|
|
can be performed in the next clock cycle.
|
291 |
|
|
|
292 |
|
|
<P>The other dyadic instructions are implemented similarly.
|
293 |
|
|
Two instructions, <STRONG>CP</STRONG> and <STRONG>CPC</STRONG>, deviate a little since they do not set
|
294 |
|
|
<STRONG>WE_D</STRONG>. Only the flags are set as a result of the comparison.
|
295 |
|
|
Apart from that, <STRONG>CP</STRONG> and <STRONG>CPC</STRONG> are identical to <STRONG>SUB</STRONG> and <STRONG>SBC</STRONG>;
|
296 |
|
|
they don't have their own <STRONG>ALU_OP</STRONG> but use those of the <STRONG>SUB</STRONG> and <STRONG>SBC</STRONG>
|
297 |
|
|
instructions.
|
298 |
|
|
|
299 |
|
|
<P>The <STRONG>MOV Rd, Rr</STRONG> instruction is implemented as a dyadic function.
|
300 |
|
|
It ignores its first argument and does not set any flags.
|
301 |
|
|
|
302 |
|
|
<H3><A NAME="section_1_6_4">7.6.4 8-bit Dyadic Instructions, Register/Immediate</A></H3>
|
303 |
|
|
|
304 |
|
|
<P>Some of the dyadic instructions have an immediate operand (i.e. the operand is
|
305 |
|
|
contained in the opcode) rather than using a second register. For such
|
306 |
|
|
instructions, for example <STRONG>ANDI</STRONG>, we extract the immediate operand from the
|
307 |
|
|
opcode and set <STRONG>RSEL</STRONG>. Since the immediate operand takes quite some space in
|
308 |
|
|
the opcode, the register range was restricted a little and hence the default
|
309 |
|
|
<STRONG>DDDDD</STRONG> value needs a modification.
|
310 |
|
|
|
311 |
|
|
<P><br>
|
312 |
|
|
|
313 |
|
|
<pre class="vhdl">
|
314 |
|
|
|
315 |
|
|
263 --
|
316 |
|
|
264 -- 0111 KKKK dddd KKKK - ANDI
|
317 |
|
|
265 --
|
318 |
|
|
266 Q_ALU_OP <= ALU_AND;
|
319 |
|
|
267 Q_IMM(7 downto 0) <= I_OPC(11 downto 8) & I_OPC(3 downto 0);
|
320 |
|
|
268 Q_RSEL <= RS_IMM;
|
321 |
|
|
269 Q_DDDDD(4) <= '1'; -- Rd = 16...31
|
322 |
|
|
270 Q_WE_D <= "01";
|
323 |
|
|
271 Q_WE_F <= '1';
|
324 |
|
|
<pre class="filename">
|
325 |
|
|
src/opc_deco.vhd
|
326 |
|
|
</pre></pre>
|
327 |
|
|
<P>
|
328 |
|
|
|
329 |
|
|
<P><br>
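<P>The field extraction for <STRONG>ANDI</STRONG> can be checked with a small self-checking
sketch. The entity name <STRONG>andi_sketch</STRONG> and the example operands
(<STRONG>ANDI R17, 0xA5</STRONG>) are chosen for illustration only and are not taken from the CPU
sources. Note that opcode bit 8 belongs to the immediate value here, which is why
the default <STRONG>DDDDD(4)</STRONG> must be overridden.

<pre class="vhdl">
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical self-checking sketch, not taken from the CPU sources.
entity andi_sketch is
end andi_sketch;

architecture sim of andi_sketch is
    -- ANDI R17, 0xA5: 0111 KKKK dddd KKKK = 0111 1010 0001 0101
    constant OPC : std_logic_vector(15 downto 0) := X"7A15";
begin
    process
        variable ddddd : std_logic_vector(4 downto 0);
        variable imm   : std_logic_vector(7 downto 0);
    begin
        ddddd    := OPC(8 downto 4);    -- default extraction ("00001")
        ddddd(4) := '1';                -- Rd = 16...31
        imm      := OPC(11 downto 8) & OPC(3 downto 0);

        assert ddddd = "10001" report "expected R17"  severity failure;
        assert imm   = X"A5"   report "expected 0xA5" severity failure;
        wait;
    end process;
end sim;
</pre>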
|
330 |
|
|
|
331 |
|
|
<H3><A NAME="section_1_6_5">7.6.5 16-bit Dyadic Instructions</A></H3>
|
332 |
|
|
|
333 |
|
|
<P>Some of the dyadic 8-bit instructions have 16-bit variants, for example <STRONG>ADIW</STRONG>.
|
334 |
|
|
The second operand of these 16-bit variants can be another register pair or an
|
335 |
|
|
immediate operand.
|
336 |
|
|
|
337 |
|
|
<P><br>
|
338 |
|
|
|
339 |
|
|
<pre class="vhdl">
|
340 |
|
|
|
341 |
|
|
499 --
|
342 |
|
|
500 -- 1001 0110 KKdd KKKK - ADIW
|
343 |
|
|
501 -- 1001 0111 KKdd KKKK - SBIW
|
344 |
|
|
502 --
|
345 |
|
|
503 if (I_OPC(8) = '0') then Q_ALU_OP <= ALU_ADIW;
|
346 |
|
|
504 else Q_ALU_OP <= ALU_SBIW;
|
347 |
|
|
505 end if;
|
348 |
|
|
506 Q_IMM(5 downto 4) <= I_OPC(7 downto 6);
|
349 |
|
|
507 Q_IMM(3 downto 0) <= I_OPC(3 downto 0);
|
350 |
|
|
508 Q_RSEL <= RS_IMM;
|
351 |
|
|
509 Q_DDDDD <= "11" & I_OPC(5 downto 4) & "0";
|
352 |
|
|
510
|
353 |
|
|
511 Q_WE_D <= "11";
|
354 |
|
|
512 Q_WE_F <= '1';
|
355 |
|
|
<pre class="filename">
|
356 |
|
|
src/opc_deco.vhd
|
357 |
|
|
</pre></pre>
|
358 |
|
|
<P>
|
359 |
|
|
|
360 |
|
|
<P><br>
|
361 |
|
|
|
362 |
|
|
<P>These instructions are implemented similarly to their 8-bit relatives, but
|
363 |
|
|
in contrast to them both <STRONG>WE_D</STRONG> bits are set. This causes the entire
|
364 |
|
|
register pair to be updated. <STRONG>LDI</STRONG> and <STRONG>MOVW</STRONG> are also implemented as
|
365 |
|
|
16-bit dyadic instructions.
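<P>The two 'd' bits of <STRONG>ADIW</STRONG> and <STRONG>SBIW</STRONG> can only address the register pairs
starting at R24, R26, R28, and R30. The following sketch applies the <STRONG>DDDDD</STRONG> and
<STRONG>IMM</STRONG> expressions from the fragment above to one example opcode. The entity name
<STRONG>adiw_sketch</STRONG> and the operands (<STRONG>ADIW R27:R26, 0x3F</STRONG>) are assumptions made for
illustration.

<pre class="vhdl">
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical self-checking sketch, not taken from the CPU sources.
entity adiw_sketch is
end adiw_sketch;

architecture sim of adiw_sketch is
    -- ADIW R27:R26, 0x3F: 1001 0110 KKdd KKKK = 1001 0110 1101 1111
    constant OPC : std_logic_vector(15 downto 0) := X"96DF";
begin
    process
        variable ddddd : std_logic_vector(4 downto 0);
        variable imm   : std_logic_vector(15 downto 0) := X"0000";
    begin
        ddddd           := "11" & OPC(5 downto 4) & "0";  -- even register of the pair
        imm(5 downto 4) := OPC(7 downto 6);
        imm(3 downto 0) := OPC(3 downto 0);

        assert to_integer(unsigned(ddddd)) = 26 report "expected R26"  severity failure;
        assert imm = X"003F"                    report "expected 0x3F" severity failure;
        wait;
    end process;
end sim;
</pre>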
|
366 |
|
|
|
367 |
|
|
<H3><A NAME="section_1_6_6">7.6.6 Bit Instructions</A></H3>
|
368 |
|
|
|
369 |
|
|
<P>There are some instructions that are very similar to monadic functions
|
370 |
|
|
(in that they refer to only one register) but have a small immediate operand
|
371 |
|
|
that addresses a bit in that register. Unlike dyadic functions with immediate
|
372 |
|
|
operands, these bit instructions do not use the register/immediate
|
373 |
|
|
multiplexer in the ALU (they don't have a register counterpart for the
|
374 |
|
|
immediate operand). Instead, the bit number from the instruction is provided
|
375 |
|
|
on the <STRONG>BIT</STRONG> output of the opcode decoder.
|
376 |
|
|
The <STRONG>BIT</STRONG> output has 4 bits; in addition to the (lower) 3 bits needed to
|
377 |
|
|
address the bit concerned, the fourth (upper) bit indicates the value
|
378 |
|
|
(bit set or bit cleared) of the bit for those instructions that need it.
|
379 |
|
|
|
380 |
|
|
<P>The ALU operations related to these bit instructions are <STRONG>ALU_BLD</STRONG> and
|
381 |
|
|
<STRONG>ALU_BIT_CS</STRONG>.
|
382 |
|
|
|
383 |
|
|
<P><STRONG>ALU_BLD</STRONG> stores the T bit of the status register into a bit in a general
|
384 |
|
|
purpose register; this is used to implement the <STRONG>BLD</STRONG> instruction.
|
385 |
|
|
|
386 |
|
|
<P><STRONG>ALU_BIT_CS</STRONG> is a dual-purpose function.
|
387 |
|
|
|
388 |
|
|
<P>The first purpose is to copy a bit in a general purpose register into the
|
389 |
|
|
<STRONG>T</STRONG> flag of the status register. This use of <STRONG>ALU_BIT_CS</STRONG> is selected by
|
390 |
|
|
setting (only) the <STRONG>WE_F</STRONG> signal so that the status register is updated
|
391 |
|
|
with the new <STRONG>T</STRONG> flag. The <STRONG>BST</STRONG> instruction is implemented this way. The
|
392 |
|
|
bit value in <STRONG>BIT[3]</STRONG> is ignored.
|
393 |
|
|
|
394 |
|
|
<P>The second purpose is to set or clear a bit in an I/O register.
|
395 |
|
|
The ALU first computes a bitmask where only the bit indicated
|
396 |
|
|
by <STRONG>BIT[2:0]</STRONG> is set. Depending on <STRONG>BIT[3]</STRONG> the register is then <STRONG>or</STRONG>'ed with
|
397 |
|
|
the mask or <STRONG>and</STRONG>'ed with the complement of the mask. This sets or clears
|
398 |
|
|
the bit in the current value of the register. This use of <STRONG>ALU_BIT_CS</STRONG>
|
399 |
|
|
is selected by <STRONG>WE_M</STRONG> so the I/O register is updated with the new value.
|
400 |
|
|
The <STRONG>CBI</STRONG> and <STRONG>SBI</STRONG> instructions are implemented this way.
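<P>The mask computation described above can be modelled in a few lines. This is a
sketch under stated assumptions, not the ALU code of the CPU: the entity
<STRONG>bit_cs_sketch</STRONG>, the function <STRONG>bit_cs</STRONG>, and its parameter names are hypothetical.

<pre class="vhdl">
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical self-checking sketch, not taken from the CPU sources.
entity bit_cs_sketch is
end bit_cs_sketch;

architecture sim of bit_cs_sketch is

    -- bit4(2 downto 0) is the bit number, bit4(3) the new bit value.
    function bit_cs(din  : std_logic_vector(7 downto 0);
                    bit4 : std_logic_vector(3 downto 0))
             return std_logic_vector is
        variable mask : std_logic_vector(7 downto 0) := (others => '0');
    begin
        -- mask with only the addressed bit set
        mask(to_integer(unsigned(bit4(2 downto 0)))) := '1';
        if bit4(3) = '1' then
            return din or mask;         -- set the bit
        else
            return din and not mask;    -- clear the bit
        end if;
    end function;

begin
    process
    begin
        assert bit_cs(X"00", "1101") = X"20" report "set bit 5"         severity failure;
        assert bit_cs(X"FF", "0101") = X"DF" report "clear bit 5"       severity failure;
        assert bit_cs(X"81", "1000") = X"81" report "bit 0 already set" severity failure;
        wait;
    end process;
end sim;
</pre>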
|
401 |
|
|
|
402 |
|
|
<P><STRONG>ALU_BIT_CS</STRONG> is also used by the skip instructions <STRONG>SBRC</STRONG> and <STRONG>SBRS</STRONG> that are
|
403 |
|
|
described in the section about branching.
|
404 |
|
|
|
405 |
|
|
<H3><A NAME="section_1_6_7">7.6.7 Multiplication Instructions</A></H3>
|
406 |
|
|
|
407 |
|
|
<P>There is a zoo of multiplication instructions that differ in the
|
408 |
|
|
signedness of their operands (<STRONG>MUL</STRONG>, <STRONG>MULS</STRONG>, <STRONG>MULSU</STRONG>) and in whether
|
409 |
|
|
the final result is shifted (<STRONG>FMUL</STRONG>, <STRONG>FMULS</STRONG>, and <STRONG>FMULSU</STRONG>) or not.
|
410 |
|
|
The opcode decoder sets certain bits in the IMM signal to indicate the
|
411 |
|
|
type of multiplication:
|
412 |
|
|
|
413 |
|
|
<TABLE>
|
414 |
|
|
<TR><TD>IMM(7) = 1</TD><TD>shift (FMULxx)
|
415 |
|
|
</TD></TR><TR><TD>IMM(6) = 1</TD><TD>Rd is signed
|
416 |
|
|
</TD></TR><TR><TD>IMM(5) = 1</TD><TD>Rr is signed
|
417 |
|
|
</TD></TR>
|
418 |
|
|
</TABLE>
|
419 |
|
|
<P>We also set the <STRONG>WE_01</STRONG> instead of the <STRONG>WE_D</STRONG> signal because the
|
420 |
|
|
multiplication result is stored in register pair 0 rather than in the
|
421 |
|
|
Rd register of the opcode.
|
422 |
|
|
|
423 |
|
|
<P><br>
|
424 |
|
|
|
425 |
|
|
<pre class="vhdl">
|
426 |
|
|
|
427 |
|
|
129 --
|
428 |
|
|
130 -- 0000 0011 0ddd 0rrr - _MULSU SU "010"
|
429 |
|
|
131 -- 0000 0011 0ddd 1rrr - FMUL UU "100"
|
430 |
|
|
132 -- 0000 0011 1ddd 0rrr - FMULS SS "111"
|
431 |
|
|
133 -- 0000 0011 1ddd 1rrr - FMULSU SU "110"
|
432 |
|
|
134 --
|
433 |
|
|
135 Q_DDDDD(4 downto 3) <= "10"; -- regs 16 to 23
|
434 |
|
|
136 Q_RRRRR(4 downto 3) <= "10"; -- regs 16 to 23
|
435 |
|
|
137 Q_ALU_OP <= ALU_MULT;
|
436 |
|
|
138 if I_OPC(7) = '0' then
|
437 |
|
|
139 if I_OPC(3) = '0' then
|
438 |
|
|
140 Q_IMM(7 downto 5) <= MULT_SU;
|
439 |
|
|
141 else
|
440 |
|
|
142 Q_IMM(7 downto 5) <= MULT_FUU;
|
441 |
|
|
143 end if;
|
442 |
|
|
144 else
|
443 |
|
|
145 if I_OPC(3) = '0' then
|
444 |
|
|
146 Q_IMM(7 downto 5) <= MULT_FSS;
|
445 |
|
|
147 else
|
446 |
|
|
148 Q_IMM(7 downto 5) <= MULT_FSU;
|
447 |
|
|
149 end if;
|
448 |
|
|
150 end if;
|
449 |
|
|
151 Q_WE_01 <= '1';
|
450 |
|
|
152 Q_WE_F <= '1';
|
451 |
|
|
<pre class="filename">
|
452 |
|
|
src/opc_deco.vhd
|
453 |
|
|
</pre></pre>
|
454 |
|
|
<P>
|
455 |
|
|
|
456 |
|
|
<P><br>
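<P>The constants used for <STRONG>IMM(7 downto 5)</STRONG> could be declared as shown below. The
four values for <STRONG>MULT_SU</STRONG>, <STRONG>MULT_FUU</STRONG>, <STRONG>MULT_FSS</STRONG>, and <STRONG>MULT_FSU</STRONG> are read off the
comments in the fragment above. The package name <STRONG>mult_sketch</STRONG> as well as the
names and values of <STRONG>MULT_UU</STRONG> and <STRONG>MULT_SS</STRONG> (for <STRONG>MUL</STRONG> and <STRONG>MULS</STRONG>) are assumptions
that merely follow the same bit scheme; the declarations in the CPU sources
may differ.

<pre class="vhdl">
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package, not taken from the CPU sources.
-- Bit 2 (IMM(7)): shift the result (FMULxx)
-- Bit 1 (IMM(6)): Rd is signed
-- Bit 0 (IMM(5)): Rr is signed
package mult_sketch is
    constant MULT_UU  : std_logic_vector(2 downto 0) := "000";  -- MUL    (assumed)
    constant MULT_SU  : std_logic_vector(2 downto 0) := "010";  -- MULSU
    constant MULT_SS  : std_logic_vector(2 downto 0) := "011";  -- MULS   (assumed)
    constant MULT_FUU : std_logic_vector(2 downto 0) := "100";  -- FMUL
    constant MULT_FSU : std_logic_vector(2 downto 0) := "110";  -- FMULSU
    constant MULT_FSS : std_logic_vector(2 downto 0) := "111";  -- FMULS
end mult_sketch;
</pre>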
|
457 |
|
|
|
458 |
|
|
<H3><A NAME="section_1_6_8">7.6.8 Instructions Writing To Memory or I/O</A></H3>
|
459 |
|
|
|
460 |
|
|
<P>Instructions that write to memory or I/O registers need to set <STRONG>AMOD</STRONG>.
|
461 |
|
|
<STRONG>AMOD</STRONG> selects the pointer register involved (<STRONG>X</STRONG>, <STRONG>Y</STRONG>, <STRONG>Z</STRONG>, <STRONG>SP</STRONG>, or none).
|
462 |
|
|
If the addressing mode involves a pointer register and updates it, then
|
463 |
|
|
<STRONG>WE_XYZS</STRONG> needs to be set as well.
|
464 |
|
|
|
465 |
|
|
<P>The following code fragment shows a number of store functions and how
|
466 |
|
|
<STRONG>AMOD</STRONG> is computed:
|
467 |
|
|
|
468 |
|
|
<P><br>
|
469 |
|
|
|
470 |
|
|
<pre class="vhdl">
|
471 |
|
|
|
472 |
|
|
333 --
|
473 |
|
|
334 -- 1001 00-1r rrrr 0000 - STS
|
474 |
|
|
335 -- 1001 00-1r rrrr 0001 - ST Z+. Rr
|
475 |
|
|
336 -- 1001 00-1r rrrr 0010 - ST -Z. Rr
|
476 |
|
|
337 -- 1001 00-1r rrrr 1000 - ST Y. Rr
|
477 |
|
|
338 -- 1001 00-1r rrrr 1001 - ST Y+. Rr
|
478 |
|
|
339 -- 1001 00-1r rrrr 1010 - ST -Y. Rr
|
479 |
|
|
340 -- 1001 00-1r rrrr 1100 - ST X. Rr
|
480 |
|
|
341 -- 1001 00-1r rrrr 1101 - ST X+. Rr
|
481 |
|
|
342 -- 1001 00-1r rrrr 1110 - ST -X. Rr
|
482 |
|
|
343 -- 1001 00-1r rrrr 1111 - PUSH Rr
|
483 |
|
|
344 --
|
484 |
|
|
345 Q_ALU_OP <= ALU_D_MV_Q;
|
485 |
|
|
346 Q_WE_M <= "01";
|
486 |
|
|
347 Q_WE_XYZS <= '1';
|
487 |
|
|
348 case I_OPC(3 downto 0) is
|
488 |
|
|
349 when "0000" => Q_AMOD <= AMOD_ABS; Q_WE_XYZS <= '0';
|
489 |
|
|
350 when "0001" => Q_AMOD <= AMOD_Zi;
|
490 |
|
|
351 when "0010" => Q_AMOD <= AMOD_dZ;
|
491 |
|
|
352 when "1001" => Q_AMOD <= AMOD_Yi;
|
492 |
|
|
353 when "1010" => Q_AMOD <= AMOD_dY;
|
493 |
|
|
354 when "1100" => Q_AMOD <= AMOD_X; Q_WE_XYZS <= '0';
|
494 |
|
|
355 when "1101" => Q_AMOD <= AMOD_Xi;
|
495 |
|
|
356 when "1110" => Q_AMOD <= AMOD_dX;
|
496 |
|
|
357 when "1111" => Q_AMOD <= AMOD_dSP;
|
497 |
|
|
358 when others =>
|
498 |
|
|
359 end case;
|
499 |
|
|
<pre class="filename">
|
500 |
|
|
src/opc_deco.vhd
|
501 |
|
|
</pre></pre>
|
502 |
|
|
<P>
|
503 |
|
|
|
504 |
|
|
<P><br>
|
505 |
|
|
|
506 |
|
|
<P><STRONG>ALU_OP</STRONG> is set to <STRONG>ALU_D_MV_Q</STRONG>. This causes the source register
|
507 |
|
|
indicated by <STRONG>DDDDD</STRONG> to be switched through the ALU unchanged so that it
|
508 |
|
|
shows up at the input of the data memory and of the I/O block. We set
|
509 |
|
|
<STRONG>WE_M</STRONG> so that the value of the source register will be written.
|
510 |
|
|
|
511 |
|
|
<P>Write instructions to memory execute in a single cycle.
|
512 |
|
|
|
513 |
|
|
<H3><A NAME="section_1_6_9">7.6.9 Instructions Reading From Memory or I/O</A></H3>
|
514 |
|
|
|
515 |
|
|
<P>Instructions that read from memory set <STRONG>AMOD</STRONG> and possibly <STRONG>WE_XYZS</STRONG> in the
|
516 |
|
|
same way as instructions writing to memory.
|
517 |
|
|
|
518 |
|
|
<P>The following code fragment shows a number of load functions:
|
519 |
|
|
|
520 |
|
|
<P><br>
|
521 |
|
|
|
522 |
|
|
<pre class="vhdl">
|
523 |
|
|
|
524 |
|
|
297 Q_IMM <= I_OPC(31 downto 16); -- absolute address for LDS/STS
|
525 |
|
|
298 if (I_OPC(9) = '0') then -- LDD / POP
|
526 |
|
|
299 --
|
527 |
|
|
300 -- 1001 00-0d dddd 0000 - LDS
|
528 |
|
|
301 -- 1001 00-0d dddd 0001 - LD Rd, Z+
|
529 |
|
|
302 -- 1001 00-0d dddd 0010 - LD Rd, -Z
|
530 |
|
|
303 -- 1001 00-0d dddd 0100 - (ii) LPM Rd, (Z)
|
531 |
|
|
304 -- 1001 00-0d dddd 0101 - (iii) LPM Rd, (Z+)
|
532 |
|
|
305 -- 1001 00-0d dddd 0110 - ELPM Z --- not mega8
|
533 |
|
|
306 -- 1001 00-0d dddd 0111 - ELPM Z+ --- not mega8
|
534 |
|
|
307 -- 1001 00-0d dddd 1001 - LD Rd, Y+
|
535 |
|
|
308 -- 1001 00-0d dddd 1010 - LD Rd, -Y
|
536 |
|
|
309 -- 1001 00-0d dddd 1100 - LD Rd, X
|
537 |
|
|
310 -- 1001 00-0d dddd 1101 - LD Rd, X+
|
538 |
|
|
311 -- 1001 00-0d dddd 1110 - LD Rd, -X
|
539 |
|
|
312 -- 1001 00-0d dddd 1111 - POP Rd
|
540 |
|
|
313 --
|
541 |
|
|
314 Q_RSEL <= RS_DIN;
|
542 |
|
|
315 Q_RD_M <= I_T0;
|
543 |
|
|
316 Q_WE_D <= '0' & not I_T0;
|
544 |
|
|
317 Q_WE_XYZS <= not I_T0;
|
545 |
|
|
318 Q_PMS <= (not I_OPC(3)) and I_OPC(2) and (not I_OPC(1));
|
546 |
|
|
319 case I_OPC(3 downto 0) is
|
547 |
|
|
320 when "0000" => Q_AMOD <= AMOD_ABS; Q_WE_XYZS <= '0';
|
548 |
|
|
321 when "0001" => Q_AMOD <= AMOD_Zi;
|
549 |
|
|
322 when "0100" => Q_AMOD <= AMOD_Z; Q_WE_XYZS <= '0';
|
550 |
|
|
323 when "0101" => Q_AMOD <= AMOD_Zi;
|
551 |
|
|
324 when "1001" => Q_AMOD <= AMOD_Yi;
|
552 |
|
|
325 when "1010" => Q_AMOD <= AMOD_dY;
|
553 |
|
|
326 when "1100" => Q_AMOD <= AMOD_X; Q_WE_XYZS <= '0';
|
554 |
|
|
327 when "1101" => Q_AMOD <= AMOD_Xi;
|
555 |
|
|
328 when "1110" => Q_AMOD <= AMOD_dX;
|
556 |
|
|
329 when "1111" => Q_AMOD <= AMOD_SPi;
|
557 |
|
|
330 when others => Q_WE_XYZS <= '0';
|
558 |
|
|
331 end case;
|
559 |
|
|
<pre class="filename">
|
560 |
|
|
src/opc_deco.vhd
|
561 |
|
|
</pre></pre>
|
562 |
|
|
<P>
|
563 |
|
|
|
564 |
|
|
<P><br>
|
565 |
|
|
|
566 |
|
|
<P>The data read from memory now comes from the <STRONG>DIN</STRONG> input. We therefore
|
567 |
|
|
set <STRONG>RSEL</STRONG> to <STRONG>RS_DIN</STRONG>. The data read from the memory is again switched
|
568 |
|
|
through the ALU unchanged, but we use <STRONG>ALU_R_MV_Q</STRONG> instead of <STRONG>ALU_D_MV_Q</STRONG>
|
569 |
|
|
because the data from memory is now routed via the multiplexer for <STRONG>R8</STRONG>
|
570 |
|
|
rather than via the multiplexer for <STRONG>D8</STRONG>. We generate <STRONG>RD_M</STRONG> instead of <STRONG>WE_M</STRONG>
|
571 |
|
|
since we are now reading and not writing. The result is stored in the
|
572 |
|
|
register indicated by <STRONG>DDDDD</STRONG>, so we set <STRONG>WE_D</STRONG>.
|
573 |
|
|
|
574 |
|
|
<P>One of the load instructions is <STRONG>LPM</STRONG> which reads from program store rather
|
575 |
|
|
than from the data memory. For this instruction we set <STRONG>PMS</STRONG>.
|
576 |
|
|
|
577 |
|
|
<P>Unlike store instructions, load instructions execute in two cycles. The reason
|
578 |
|
|
is that the internal memory modules need one clock cycle to produce a result.
|
579 |
|
|
We therefore generate the <STRONG>WE_D</STRONG> and <STRONG>WE_XYZS</STRONG> only on the second of the
|
580 |
|
|
two cycles.
|
581 |
|
|
|
582 |
|
|
<H3><A NAME="section_1_6_10">7.6.10 Jump and Call Instructions</A></H3>
|
583 |
|
|
|
584 |
|
|
<H4><A NAME="section_1_6_10_1">7.6.10.1 Unconditional Jump to Absolute Address</A></H4>
|
585 |
|
|
|
586 |
|
|
<P>The simplest case of a jump instruction is <STRONG>JMP</STRONG>, an unconditional jump to
|
587 |
|
|
an absolute address.
|
588 |
|
|
|
589 |
|
|
<P>The target address of the jump follows after the instruction. Due to our
|
590 |
|
|
odd/even trick with the program memory, the target address is provided on
|
591 |
|
|
the upper 16 bits of the opcode and we need not wait for it. We copy the
|
592 |
|
|
target address from the upper 16 bits of the opcode to the <STRONG>JADR</STRONG> output (this is already the default).
|
593 |
|
|
Then we set <STRONG>PC_OP</STRONG> to <STRONG>PC_LD_I</STRONG>:
|
594 |
|
|
|
595 |
|
|
<P><br>
|
596 |
|
|
|
597 |
|
|
<pre class="vhdl">
|
598 |
|
|
|
599 |
|
|
478 --
|
600 |
|
|
479 -- 1001 010k kkkk 110k - JMP (k = 0 for 16 bit)
|
601 |
|
|
480 -- kkkk kkkk kkkk kkkk
|
602 |
|
|
481 --
|
603 |
|
|
482 Q_PC_OP <= PC_LD_I;
|
604 |
|
|
<pre class="filename">
|
605 |
|
|
src/opc_deco.vhd
|
606 |
|
|
</pre></pre>
|
607 |
|
|
<P>
|
608 |
|
|
|
609 |
|
|
<P><br>
|
610 |
|
|
|
611 |
|
|
<P>The execution stage will then cause the <STRONG>PC</STRONG> to be loaded from its <STRONG>JADR</STRONG> input:

<P><br>

<pre class="vhdl">

when PC_LD_I => Q_LOAD_PC <= '1'; -- yes: new PC on I_JADR
<pre class="filename">
src/data_path.vhd
</pre></pre>
<P>

<P><br>

<P>The opcode following the <STRONG>JMP</STRONG> is already in the pipeline and would otherwise be
executed next. We therefore invalidate it so that it will not be
executed:

<P><br>

<pre class="vhdl">

when PC_LD_I => Q_SKIP <= '1'; -- yes
<pre class="filename">
src/data_path.vhd
</pre></pre>
<P>

<P><br>
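<P>The handling of <STRONG>JMP</STRONG> described above can be summarized in a small behavioural model.
This is only an illustrative sketch in Python, not part of the CPU sources: the function names
and the dictionary keys are hypothetical, and the model assumes that the fetch stage delivers both
instruction words as one 32-bit value with the target address in the upper 16 bits.

```python
def decode_jmp(opc32):
    """Decoder side: copy the upper 16 bits of the opcode to IMM.

    Assumption: opc32 holds the JMP word in bits 15..0 and the
    target address in bits 31..16 (the odd/even memory trick).
    """
    imm = (opc32 >> 16) & 0xFFFF
    return {"IMM": imm, "PC_OP": "PC_LD_I"}


def execute_pc_op(pc_op):
    """Execution side: PC_LD_I loads the PC and invalidates the next opcode."""
    load_pc = pc_op == "PC_LD_I"
    skip = pc_op == "PC_LD_I"
    return load_pc, skip


decoded = decode_jmp(0x1234940C)        # JMP 0x1234
print(hex(decoded["IMM"]), execute_pc_op(decoded["PC_OP"]))
```

<P>The model deliberately ignores timing; it only shows which value ends up as the new <STRONG>PC</STRONG>
and that the opcode already in the pipeline is dropped.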

<P>An instruction similar to <STRONG>JMP</STRONG> is <STRONG>IJMP</STRONG>. The difference is that the target
address of the jump is not provided as an immediate address following the
opcode, but is the content of the <STRONG>Z</STRONG> register. This case is handled by a
different <STRONG>PC_OP</STRONG>:

<P><br>

<pre class="vhdl">

--
-- 1001 0100 0000 1001 IJMP
-- 1001 0100 0001 1001 EIJMP -- not mega8
-- 1001 0101 0000 1001 ICALL
-- 1001 0101 0001 1001 EICALL -- not mega8
--
Q_PC_OP <= PC_LD_Z;
<pre class="filename">
src/opc_deco.vhd
</pre></pre>
<P>

<P><br>

<P>The execution stage, which contains the <STRONG>Z</STRONG> register, performs the
selection of the target address, as we have already seen in the discussion
of the data path.

<H4><A NAME="section_1_6_10_2">7.6.10.2 Unconditional Jump to Relative Address</A></H4>

<P>The <STRONG>RJMP</STRONG> instruction is similar to the <STRONG>JMP</STRONG> instruction. The target
address of the jump is, however, relative to the current <STRONG>PC</STRONG>
(plus 1). We sign-extend the 12-bit relative address (by replicating <STRONG>OPC(11)</STRONG>
until a 16-bit value is reached) and add it to the current <STRONG>PC</STRONG> plus 1.

<P><br>

<pre class="vhdl">

--
-- 1100 kkkk kkkk kkkk - RJMP
--
Q_JADR <= I_PC + (I_OPC(11) & I_OPC(11) & I_OPC(11) & I_OPC(11)
                & I_OPC(11 downto 0)) + X"0001";
Q_PC_OP <= PC_LD_I;
<pre class="filename">
src/opc_deco.vhd
</pre></pre>
<P>

<P><br>

<P>The rest of <STRONG>RJMP</STRONG> is the same as for <STRONG>JMP</STRONG>.
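<P>The address computation for <STRONG>RJMP</STRONG> can be checked with a short Python model. It is a
sketch for illustration only; <code>sign_extend</code> and <code>rjmp_target</code> are hypothetical
helper names, and the 16-bit wrap-around mirrors the width of <STRONG>JADR</STRONG> in the VHDL code.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def rjmp_target(pc, opc):
    """JADR = PC + sign-extended OPC(11 downto 0) + 1, truncated to 16 bits."""
    offset = sign_extend(opc & 0x0FFF, 12)
    return (pc + offset + 1) & 0xFFFF


# An offset of -1 (0xCFFF) jumps back to the RJMP itself (an endless loop).
print(hex(rjmp_target(0x0100, 0xCFFF)))
```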

<H4><A NAME="section_1_6_10_3">7.6.10.3 Conditional Jump to Relative Address</A></H4>

<P>There are a number of conditional jump instructions that differ in the
bit of the status register that controls whether the branch is taken or not.
<STRONG>BRCS</STRONG> and <STRONG>BRCC</STRONG> branch if bit 0 (the carry flag) is set or cleared, respectively.
<STRONG>BREQ</STRONG> and <STRONG>BRNE</STRONG> branch if bit 1 (the zero flag) is set or cleared, respectively,
and so on.

<P>There is also a generic form where the bit number is an operand of the
opcode. <STRONG>BRBS</STRONG> branches if a status register bit is set, while <STRONG>BRBC</STRONG>
branches if the bit is cleared. This means that <STRONG>BRCS</STRONG>, <STRONG>BREQ</STRONG>, ... are
just different names for the <STRONG>BRBS</STRONG> instruction, while <STRONG>BRCC</STRONG>, <STRONG>BRNE</STRONG>, ...
are different names for the <STRONG>BRBC</STRONG> instruction.

<P>The relative address (i.e. the offset from the <STRONG>PC</STRONG>) for <STRONG>BRBC</STRONG>/<STRONG>BRBS</STRONG> is
shorter (7 bits) than for <STRONG>RJMP</STRONG> (12 bits). Therefore the sign bit of the
offset is replicated more often in order to get a 16-bit signed offset
that can be added to the <STRONG>PC</STRONG>.

<P><br>

<pre class="vhdl">

--
-- 1111 00kk kkkk kbbb - BRBS
-- 1111 01kk kkkk kbbb - BRBC
--       v
-- bbb: status register bit
-- v: value (set/cleared) of status register bit
--
Q_JADR <= I_PC + (I_OPC(9) & I_OPC(9) & I_OPC(9) & I_OPC(9)
                & I_OPC(9) & I_OPC(9) & I_OPC(9) & I_OPC(9)
                & I_OPC(9) & I_OPC(9 downto 3)) + X"0001";
Q_PC_OP <= PC_BCC;
<pre class="filename">
src/opc_deco.vhd
</pre></pre>
<P>

<P><br>

<P>The decision whether or not to branch is taken in the execution stage, because
at the time when the conditional branch is decoded, the relevant bit
in the status register is not yet valid.
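<P>The two halves of a conditional branch (target computation in the decoder, branch decision in
the execution stage) can be modelled as below. This Python sketch is illustrative only: the
function names are hypothetical, and it assumes, as the opcode layout above suggests, that bit 10 of
the opcode distinguishes <STRONG>BRBS</STRONG> (0) from <STRONG>BRBC</STRONG> (1).

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def branch_target(pc, opc):
    """Decoder: JADR = PC + sign-extended OPC(9 downto 3) + 1 (16 bits)."""
    offset = sign_extend((opc >> 3) & 0x7F, 7)
    return (pc + offset + 1) & 0xFFFF


def branch_taken(opc, sreg):
    """Execution stage: compare the selected status bit with the wanted value."""
    flag = (sreg >> (opc & 0x07)) & 1      # bbb selects the status register bit
    branch_if_clear = (opc >> 10) & 1      # assumed: 0 = BRBS, 1 = BRBC
    return flag != branch_if_clear


BREQ_MINUS_2 = 0xF3F1                      # BRBS, bit 1 (zero flag), offset -2
print(hex(branch_target(0x0020, BREQ_MINUS_2)), branch_taken(BREQ_MINUS_2, 0b10))
```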

<H4><A NAME="section_1_6_10_4">7.6.10.4 Call Instructions</A></H4>

<P>Many unconditional jump instructions have a "call" variant. The "call"
variants are executed like the corresponding jump instructions. In
addition (and at the same time), the <STRONG>PC</STRONG> after the instruction is pushed
onto the stack. We take <STRONG>CALL</STRONG>, the counterpart of <STRONG>JMP</STRONG>, as an example:

<P><br>

<pre class="vhdl">

--
-- 1001 010k kkkk 111k - CALL (k = 0)
-- kkkk kkkk kkkk kkkk
--
Q_ALU_OP <= ALU_PC_2;
Q_AMOD <= AMOD_ddSP;
Q_PC_OP <= PC_LD_I;
Q_WE_M <= "11"; -- both PC bytes
Q_WE_XYZS <= '1';
<pre class="filename">
src/opc_deco.vhd
</pre></pre>
<P>

<P><br>

<P>The new element is an <STRONG>ALU_OP</STRONG> of <STRONG>ALU_PC_2</STRONG>: the ALU adds 2 to the <STRONG>PC</STRONG>,
since the <STRONG>CALL</STRONG> instruction is 2 words long. The <STRONG>RCALL</STRONG> instruction,
which is only 1 word long, would use <STRONG>ALU_PC_1</STRONG> instead. <STRONG>AMOD</STRONG> selects a
pre-decrement of the <STRONG>SP</STRONG> by 2 (since the return address is 2 bytes long).
Both bits of <STRONG>WE_M</STRONG> are set since we write 2 bytes.
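<P>The combined effect on <STRONG>PC</STRONG> and <STRONG>SP</STRONG> can be sketched as follows. The Python model
is an illustration under stated assumptions: <code>call_effect</code> is a hypothetical name, all
values are 16 bits wide, and the byte order in which the return address is written to memory is
deliberately not modelled.

```python
def call_effect(pc, sp, target, words=2):
    """Return (new_pc, new_sp, return_address) for a call instruction.

    words=2 models CALL (ALU_PC_2), words=1 models RCALL (ALU_PC_1).
    """
    return_address = (pc + words) & 0xFFFF   # PC after the instruction
    new_sp = (sp - 2) & 0xFFFF               # pre-decrement by 2 bytes
    return target, new_sp, return_address


new_pc, new_sp, ret = call_effect(pc=0x0100, sp=0x045F, target=0x0200)
print(hex(new_pc), hex(new_sp), hex(ret))
```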

<H4><A NAME="section_1_6_10_5">7.6.10.5 Skip Instructions</A></H4>

<P>Skip instructions do not modify the <STRONG>PC</STRONG>, but they invalidate the next
instruction. As with conditional branch instructions, the condition
is checked in the execution stage.

<P>We take <STRONG>SBIC</STRONG> as an example:

<P><br>

<pre class="vhdl">

--
-- 1001 1000 AAAA Abbb - CBI
-- 1001 1001 AAAA Abbb - SBIC
-- 1001 1010 AAAA Abbb - SBI
-- 1001 1011 AAAA Abbb - SBIS
--
Q_ALU_OP <= ALU_BIT_CS;
Q_AMOD <= AMOD_ABS;
Q_BIT(3) <= I_OPC(9);   -- set/clear

-- IMM = AAAAAA + 0x20
--
Q_IMM(4 downto 0) <= I_OPC(7 downto 3);
Q_IMM(6 downto 5) <= "01";

Q_RD_M <= I_T0;
if ((I_OPC(8) = '0') ) then     -- CBI or SBI
    Q_WE_M(0) <= '1';
else                            -- SBIC or SBIS
    if (I_T0 = '0') then        -- second cycle.
        Q_PC_OP <= PC_SKIP_T;
    end if;
end if;
<pre class="filename">
src/opc_deco.vhd
</pre></pre>
<P>

<P><br>

<P>First of all, <STRONG>AMOD</STRONG>, <STRONG>IMM</STRONG>, and <STRONG>RSEL</STRONG> are set such that the value
from the I/O register indicated by <STRONG>IMM</STRONG> reaches the ALU.
<STRONG>ALU_OP</STRONG> and <STRONG>BIT</STRONG> are set such that the relevant bit reaches
<STRONG>FLAGS_98(9)</STRONG> in the data path. Accessing the bit and then taking the
skip decision would take too long for a single cycle.
We therefore extract the bit in the first cycle and store it in
the <STRONG>FLAGS_98(9)</STRONG> signal in the data path. In the next cycle,
the decision to skip or not is taken.

<P>The <STRONG>PC_OP</STRONG> of <STRONG>PC_SKIP_T</STRONG> causes the <STRONG>SKIP</STRONG> output of the execution stage
to be raised if <STRONG>FLAGS_98(9)</STRONG> is set:

<P><br>

<pre class="vhdl">

when PC_SKIP_T => Q_SKIP <= L_FLAGS_98(9); -- if T set
<pre class="filename">
src/data_path.vhd
</pre></pre>
<P>

<P><br>
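<P>The two-cycle behaviour of <STRONG>SBIC</STRONG> and <STRONG>SBIS</STRONG> can be illustrated with a small Python
model. It is a sketch only: the function names are hypothetical, and it assumes that the value
latched in <STRONG>FLAGS_98(9)</STRONG> is '1' exactly when the selected I/O bit equals the value requested
by <STRONG>OPC(9)</STRONG> (0 for <STRONG>SBIC</STRONG>, 1 for <STRONG>SBIS</STRONG>); the data path may encode this differently.

```python
def io_address(opc):
    """IMM = A + 0x20: OPC(7 downto 3) in bits 4..0, "01" in bits 6..5."""
    return 0x20 | ((opc >> 3) & 0x1F)


def skip_next(opc, io_value):
    """Model both cycles of SBIC/SBIS for a given I/O register value."""
    bit_number = opc & 0x07
    wanted = (opc >> 9) & 1                  # 0: skip if cleared, 1: skip if set
    # First cycle: extract the bit and latch the comparison result.
    latched = ((io_value >> bit_number) & 1) == wanted
    # Second cycle (T0 = '0'): PC_SKIP_T raises SKIP if the latched bit is set.
    return latched


SBIC = 0x9982                                # SBIC 0x10, 2
SBIS = 0x9B82                                # SBIS 0x10, 2
print(hex(io_address(SBIC)), skip_next(SBIC, 0x00), skip_next(SBIS, 0x00))
```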

<P>A similar instruction is <STRONG>CPSE</STRONG>, which skips the next instruction when a
comparison (rather than a bit in an I/O register) indicates equality.
It works like a <STRONG>CP</STRONG> instruction, but raises <STRONG>SKIP</STRONG> in the execution stage
rather than updating the status register.

<H4><A NAME="section_1_6_10_6">7.6.10.6 Interrupts</A></H4>

<P>We have seen earlier that the opcode fetch stage inserts "interrupt
instructions" into the pipeline when an interrupt occurs. These interrupt
instructions are similar to <STRONG>CALL</STRONG> instructions. In contrast to <STRONG>CALL</STRONG>
instructions, however, we use <STRONG>ALU_INTR</STRONG> instead of <STRONG>ALU_PC_2</STRONG>. This
copies the <STRONG>PC</STRONG> (rather than <STRONG>PC</STRONG> + 2) to the output of the ALU, because
we have overridden a valid instruction and want to continue
with exactly that instruction after returning from the interrupt. In addition,
<STRONG>ALU_INTR</STRONG> clears the <STRONG>I</STRONG> flag in the status register.

<P>The interrupt opcodes are implemented as follows:

<P><br>

<pre class="vhdl">

--
-- 0000 0000 0000 0000 - NOP
-- 0000 0000 001v vvvv - INTERRUPT
--
if (I_OPC(5)) = '1' then -- interrupt
    Q_ALU_OP <= ALU_INTR;
    Q_AMOD <= AMOD_ddSP;
    Q_JADR <= "0000000000" & I_OPC(4 downto 0) & "0";
    Q_PC_OP <= PC_LD_I;
    Q_WE_F <= '1';
    Q_WE_M <= "11";
end if;
<pre class="filename">
src/opc_deco.vhd
</pre></pre>
<P>

<P><br>
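<P>The jump address formed for an interrupt opcode is simply the 5-bit vector number shifted left
by one bit. The following Python sketch illustrates this; <code>decode_interrupt</code> and its
dictionary keys are hypothetical names, and the model assumes that the opcode has already been
recognised as belonging to the NOP/INTERRUPT group (upper 10 bits zero).

```python
def decode_interrupt(opc, pc):
    """Return None for NOP, otherwise the jump and return addresses."""
    if ((opc >> 5) & 1) == 0:
        return None                          # plain NOP: nothing to do
    jadr = (opc & 0x1F) << 1                 # "0000000000" & OPC(4 downto 0) & "0"
    # ALU_INTR pushes the PC itself (not PC + 2), so that the overridden
    # instruction is executed after returning from the interrupt.
    return {"JADR": jadr, "return_address": pc & 0xFFFF}


print(decode_interrupt(0x0023, 0x0100))      # interrupt vector number 3
```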

<H3><A NAME="section_1_6_11">7.6.11 Instructions Not Implemented</A></H3>

<P>A handful of instructions were not implemented. The reason for not
implementing an instruction is one of the following:

<OL>
<LI>The instruction is only available in particular devices, typically due
to extended capabilities of these devices (<STRONG>EICALL</STRONG>, <STRONG>EIJMP</STRONG>, <STRONG>ELPM</STRONG>).
<LI>The instruction uses capabilities that are somewhat unusual in
general (<STRONG>BREAK</STRONG>, <STRONG>DES</STRONG>, <STRONG>SLEEP</STRONG>, <STRONG>WDR</STRONG>).
</OL>
<P>These instructions are normally not generated by C/C++ compilers, but
need to be generated by means of #<STRONG>asm</STRONG> directives. At this point the
reader should have learned enough to implement these instructions when needed.

<H2><A NAME="section_1_7">7.7 Index of all Instructions</A></H2>

<P>The following table lists all CPU instructions and a reference
to the chapter where they are (supposed to be) described.

<TABLE>
<TR><TD>ADC</TD><TD>7.6.3</TD><TD>8-bit Dyadic Instructions, Register/Register</TD></TR>
<TR><TD>ADD</TD><TD>7.6.3</TD><TD>8-bit Dyadic Instructions, Register/Register</TD></TR>
<TR><TD>ADIW</TD><TD>7.6.5</TD><TD>16-bit Dyadic Instructions</TD></TR>
<TR><TD>AND</TD><TD>7.6.3</TD><TD>8-bit Dyadic Instructions, Register/Register</TD></TR>
<TR><TD>ANDI</TD><TD>7.6.4</TD><TD>8-bit Dyadic Instructions, Register/Immediate</TD></TR>
<TR><TD>ASR</TD><TD>7.6.2</TD><TD>8-bit Monadic Instructions</TD></TR>
<TR><TD>BCLR</TD><TD>7.6.2</TD><TD>8-bit Monadic Instructions</TD></TR>
<TR><TD>BLD</TD><TD>7.6.2</TD><TD>8-bit Monadic Instructions</TD></TR>
<TR><TD>BRcc</TD><TD>7.6.10</TD><TD>Jump and Call Instructions</TD></TR>
<TR><TD>BREAK</TD><TD>7.6.11</TD><TD>Instructions Not Implemented</TD></TR>
<TR><TD>BSET</TD><TD>7.6.2</TD><TD>8-bit Monadic Instructions</TD></TR>
<TR><TD>BST</TD><TD>7.6.2</TD><TD>8-bit Monadic Instructions</TD></TR>
<TR><TD>CALL</TD><TD>7.6.10</TD><TD>Jump and Call Instructions</TD></TR>
<TR><TD>CBI</TD><TD>7.6.6</TD><TD>Bit Instructions</TD></TR>
<TR><TD>CBR</TD><TD>-</TD><TD>see ANDI</TD></TR>
<TR><TD>CL&lt;flag&gt;</TD><TD>-</TD><TD>see BCLR</TD></TR>
<TR><TD>CLR</TD><TD>-</TD><TD>see EOR</TD></TR>
<TR><TD>COM</TD><TD>7.6.2</TD><TD>8-bit Monadic Instructions</TD></TR>
<TR><TD>CP</TD><TD>7.6.3</TD><TD>8-bit Dyadic Instructions, Register/Register</TD></TR>
<TR><TD>CPC</TD><TD>7.6.3</TD><TD>8-bit Dyadic Instructions, Register/Register</TD></TR>
<TR><TD>CPI</TD><TD>7.6.4</TD><TD>8-bit Dyadic Instructions, Register/Immediate</TD></TR>
<TR><TD>CPSE</TD><TD>7.6.10</TD><TD>Jump and Call Instructions</TD></TR>
<TR><TD>DEC</TD><TD>7.6.2</TD><TD>8-bit Monadic Instructions</TD></TR>
<TR><TD>DES</TD><TD>7.6.11</TD><TD>Instructions Not Implemented</TD></TR>
<TR><TD>EICALL</TD><TD>7.6.11</TD><TD>Instructions Not Implemented</TD></TR>
<TR><TD>EIJMP</TD><TD>7.6.11</TD><TD>Instructions Not Implemented</TD></TR>
<TR><TD>ELPM</TD><TD>7.6.11</TD><TD>Instructions Not Implemented</TD></TR>
<TR><TD>EOR</TD><TD>7.6.3</TD><TD>8-bit Dyadic Instructions, Register/Register</TD></TR>
<TR><TD>FMUL[SU]</TD><TD>7.6.7</TD><TD>Multiplication Instructions</TD></TR>
<TR><TD>ICALL</TD><TD>7.6.10</TD><TD>Jump and Call Instructions</TD></TR>
<TR><TD>IN</TD><TD>7.6.9</TD><TD>Instructions Reading From Memory or I/O</TD></TR>
<TR><TD>INC</TD><TD>7.6.2</TD><TD>8-bit Monadic Instructions</TD></TR>
<TR><TD>IJMP</TD><TD>7.6.10</TD><TD>Jump and Call Instructions</TD></TR>
<TR><TD>JMP</TD><TD>7.6.10</TD><TD>Jump and Call Instructions</TD></TR>
<TR><TD>LDD</TD><TD>7.6.9</TD><TD>Instructions Reading From Memory or I/O</TD></TR>
<TR><TD>LDI</TD><TD>7.6.5</TD><TD>16-bit Dyadic Instructions</TD></TR>
<TR><TD>LDS</TD><TD>7.6.9</TD><TD>Instructions Reading From Memory or I/O</TD></TR>
<TR><TD>LSL</TD><TD>7.6.2</TD><TD>8-bit Monadic Instructions</TD></TR>
<TR><TD>LSR</TD><TD>7.6.2</TD><TD>8-bit Monadic Instructions</TD></TR>
<TR><TD>MOV</TD><TD>7.6.3</TD><TD>8-bit Dyadic Instructions, Register/Register</TD></TR>
<TR><TD>MOVW</TD><TD>7.6.5</TD><TD>16-bit Dyadic Instructions</TD></TR>
<TR><TD>MUL[SU]</TD><TD>7.6.7</TD><TD>Multiplication Instructions</TD></TR>
<TR><TD>NEG</TD><TD>7.6.2</TD><TD>8-bit Monadic Instructions</TD></TR>
<TR><TD>NOP</TD><TD>7.6.1</TD><TD>The NOP instruction</TD></TR>
<TR><TD>NOT</TD><TD>7.6.2</TD><TD>8-bit Monadic Instructions</TD></TR>
<TR><TD>OR</TD><TD>7.6.3</TD><TD>8-bit Dyadic Instructions, Register/Register</TD></TR>
<TR><TD>ORI</TD><TD>7.6.4</TD><TD>8-bit Dyadic Instructions, Register/Immediate</TD></TR>
<TR><TD>OUT</TD><TD>7.6.8</TD><TD>Instructions Writing To Memory or I/O</TD></TR>
<TR><TD>POP</TD><TD>7.6.9</TD><TD>Instructions Reading From Memory or I/O</TD></TR>
<TR><TD>PUSH</TD><TD>7.6.8</TD><TD>Instructions Writing To Memory or I/O</TD></TR>
<TR><TD>RCALL</TD><TD>7.6.10</TD><TD>Jump and Call Instructions</TD></TR>
<TR><TD>RET</TD><TD>7.6.10</TD><TD>Jump and Call Instructions</TD></TR>
<TR><TD>RETI</TD><TD>7.6.10</TD><TD>Jump and Call Instructions</TD></TR>
<TR><TD>RJMP</TD><TD>7.6.10</TD><TD>Jump and Call Instructions</TD></TR>
<TR><TD>ROL</TD><TD>7.6.2</TD><TD>8-bit Monadic Instructions</TD></TR>
<TR><TD>SBC</TD><TD>7.6.3</TD><TD>8-bit Dyadic Instructions, Register/Register</TD></TR>
<TR><TD>SBCI</TD><TD>7.6.4</TD><TD>8-bit Dyadic Instructions, Register/Immediate</TD></TR>
<TR><TD>SBI</TD><TD>7.6.6</TD><TD>Bit Instructions</TD></TR>
<TR><TD>SBIC</TD><TD>7.6.10</TD><TD>Jump and Call Instructions</TD></TR>
<TR><TD>SBIS</TD><TD>7.6.10</TD><TD>Jump and Call Instructions</TD></TR>
<TR><TD>SBIW</TD><TD>7.6.5</TD><TD>16-bit Dyadic Instructions</TD></TR>
<TR><TD>SBR</TD><TD>-</TD><TD>see ORI</TD></TR>
<TR><TD>SBRC</TD><TD>7.6.10</TD><TD>Jump and Call Instructions</TD></TR>
<TR><TD>SBRS</TD><TD>7.6.10</TD><TD>Jump and Call Instructions</TD></TR>
<TR><TD>SE&lt;flag&gt;</TD><TD>-</TD><TD>see BSET</TD></TR>
<TR><TD>SER</TD><TD>-</TD><TD>see LDI</TD></TR>
<TR><TD>SLEEP</TD><TD>7.6.11</TD><TD>Instructions Not Implemented</TD></TR>
<TR><TD>SPM</TD><TD>7.6.8</TD><TD>Instructions Writing To Memory or I/O</TD></TR>
<TR><TD>STD</TD><TD>7.6.8</TD><TD>Instructions Writing To Memory or I/O</TD></TR>
<TR><TD>STS</TD><TD>7.6.8</TD><TD>Instructions Writing To Memory or I/O</TD></TR>
<TR><TD>SUB</TD><TD>7.6.3</TD><TD>8-bit Dyadic Instructions, Register/Register</TD></TR>
<TR><TD>SUBI</TD><TD>7.6.4</TD><TD>8-bit Dyadic Instructions, Register/Immediate</TD></TR>
<TR><TD>SWAP</TD><TD>7.6.2</TD><TD>8-bit Monadic Instructions</TD></TR>
<TR><TD>WDR</TD><TD>7.6.11</TD><TD>Instructions Not Implemented</TD></TR>
</TABLE>
<P>This concludes the discussion of the CPU. In the next lesson we will
proceed with the input/output unit.

<P><hr><BR>
<table class="ttop"><th class="tpre"><a href="06_Data_Path.html">Previous Lesson</a></th><th class="ttop"><a href="toc.html">Table of Content</a></th><th class="tnxt"><a href="08_IO.html">Next Lesson</a></th></table>
</BODY>
</HTML>