<!-- Source: html/04_Cpu_Core.html, rev 2 (author jsauermann), cpu_lecture Subversion repository at OpenCores, https://opencores.org/ocsvn/cpu_lecture/cpu_lecture/trunk -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>html/Cpu_Core</TITLE>
<META NAME="generator" CONTENT="HTML::TextToHTML v2.46">
<LINK REL="stylesheet" TYPE="text/css" HREF="lecture.css">
</HEAD>
<BODY>
<P><table class="ttop"><th class="tpre"><a href="03_Pipelining.html">Previous Lesson</a></th><th class="ttop"><a href="toc.html">Table of Content</a></th><th class="tnxt"><a href="05_Opcode_Fetch.html">Next Lesson</a></th></table>
<hr>

<H1><A NAME="section_1">4 THE CPU CORE</A></H1>

<P>In this lesson we will discuss the core of the CPU. These days, the same
kind of CPU can come in different flavors that differ in the clock
frequency that they support, bus sizes, the size of internal caches
and memories, and the capabilities of the I/O ports they provide.
We call the common part of these different CPUs the <STRONG>CPU core</STRONG>.
The CPU core is primarily characterized by the instruction set that it
provides. One could also say that the CPU core is the implementation
of a given instruction set.

<P>The details of the instruction set will only be visible at the next lower
level of the design. At the current level different CPUs (with
different instruction sets) will still look the same because they
all use the same structure. Only some control signals will be different
for different CPUs.

<P>We will use the so-called <STRONG>Harvard architecture</STRONG> because it is a better
fit for FPGAs with internal memory modules. Harvard architecture means that
the program memory and the data memory of the CPU are separate. This
gives us more flexibility, and some instructions (for example <STRONG>CALL</STRONG>,
which involves storing the current program counter in
memory while changing the program counter and fetching the next
instruction) can carry out these steps in parallel.

<P>Different CPU cores differ in the instruction set that
they support. The types of CPU instructions (like arithmetic
instructions, move instructions, branch instructions, etc.) are
essentially the same for all CPUs. The differences are in the details
(like the encoding of the instructions, operand sizes, number of
addressable registers, and the like).

<P>Since all CPUs are rather similar apart from details, the same
structure can be used within the same base architecture (Harvard vs.
von Neumann), even for different instruction sets. This
is because the same cycle is repeated again and again for the
different instructions of a program. This cycle consists of 3
phases:

<UL>
  <LI>Opcode fetch
  <LI>Opcode decoding
  <LI>Execution
</UL>
<P><STRONG>Opcode fetch</STRONG> means that for a given value of the program counter
<STRONG>PC</STRONG>, the instruction (opcode) stored at location PC is read from the
program memory and that the PC is advanced to the next instruction.

<P><STRONG>Opcode decoding</STRONG> computes a number of control signals that will
be needed in the execution phase.

<P><STRONG>Execution</STRONG> then executes the opcode, which means that a small number
of registers or memory locations are read and/or written.

<P>In theory these 3 phases could be implemented in a combinational way
(a static program memory, an opcode decoder at the output of the program
memory, and an execution module at the output of the opcode decoder).
We will see later, however, that each phase has considerable complexity,
and we therefore use a 3-stage pipeline instead.

<P>In the following figure we see how a sequence of three opcodes ADD, MOV,
and JMP is executed in the pipeline.

<P><br>

<P><img src="cpu_core_1.png">

<P><br>
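
<P>The overlap of the three phases can be illustrated with a minimal VHDL
sketch. It is not taken from the lecture's sources: the entity name
<STRONG>pipe3_sketch</STRONG> and the signals <STRONG>L_FETCHED</STRONG> and <STRONG>L_DECODED</STRONG> are hypothetical,
and the sketch assumes one register between each pair of stages. It only
passes the opcode along, whereas a real decoder stage would compute control
signals instead.

<pre class="vhdl">

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical illustration of a 3-stage pipeline (not part of src/).
entity pipe3_sketch is
    port (  I_CLK   : in  std_logic;
            I_CLR   : in  std_logic;
            I_OPC   : in  std_logic_vector(15 downto 0);    -- opcode read from program memory
            Q_EXEC  : out std_logic_vector(15 downto 0));   -- opcode currently in the execution stage
end pipe3_sketch;

architecture Behavioral of pipe3_sketch is
    signal L_FETCHED : std_logic_vector(15 downto 0);   -- register between fetch and decode
    signal L_DECODED : std_logic_vector(15 downto 0);   -- register between decode and execute
begin
    process(I_CLK)
    begin
        if rising_edge(I_CLK) then
            if I_CLR = '1' then
                L_FETCHED &lt;= (others =&gt; '0');
                L_DECODED &lt;= (others =&gt; '0');
            else
                L_FETCHED &lt;= I_OPC;        -- fetch stage latches the next opcode
                L_DECODED &lt;= L_FETCHED;    -- decode stage takes over the previous opcode
            end if;
        end if;
    end process;

    Q_EXEC &lt;= L_DECODED;
end Behavioral;
</pre>
<P>In this sketch an opcode applied to <STRONG>I_OPC</STRONG> appears at <STRONG>Q_EXEC</STRONG> two rising
clock edges later, while the two following opcodes are already on their way
through the earlier stages.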

<P>From the discussion above we can already predict the big picture of
the CPU core. It consists of a pipeline with 3 stages: opcode fetch,
opcode decoder, and execution (which is called data path in the design
because the operations required by the execution more or less imply
the structure of the data paths in the execution stage):

<P><br>

<P><img src="cpu_core_2.png">

<P><br>

<P>The pipeline consists of the <STRONG>opc_fetch</STRONG> stage that drives the <STRONG>PC</STRONG>, <STRONG>OPC</STRONG>, and
<STRONG>T0</STRONG> signals to the opcode decoder stage.
The <STRONG>opc_deco</STRONG> stage decodes the <STRONG>OPC</STRONG> signal and generates a number of
control signals towards the execution stage. The execution stage then
executes the decoded instruction.

<P>The control signals towards the execution stage can be divided into 3 groups:

<OL>
  <LI>Select signals (<STRONG>ALU_OP</STRONG>, <STRONG>AMOD</STRONG>, <STRONG>BIT</STRONG>, <STRONG>DDDDD</STRONG>, <STRONG>IMM</STRONG>, <STRONG>OPC</STRONG>, <STRONG>PMS</STRONG>,
   <STRONG>RD_M</STRONG>, <STRONG>RRRRR</STRONG>, and <STRONG>RSEL</STRONG>). These signals control details (like register
   numbers) of the instruction being executed.
  <LI>Branch and timing signals (<STRONG>PC</STRONG>, <STRONG>PC_OP</STRONG>, <STRONG>WAIT</STRONG>, and, in the reverse
   direction, <STRONG>SKIP</STRONG>). These signals control changes in the normal execution
   flow.
  <LI>Write enable signals (<STRONG>WE_01</STRONG>, <STRONG>WE_D</STRONG>, <STRONG>WE_F</STRONG>, <STRONG>WE_M</STRONG>, and <STRONG>WE_XYZS</STRONG>).
   These signals define if and when registers and memory locations are
   updated.
</OL>
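<P>As a rough illustration of what a write enable signal does, the following
sketch shows a single 8-bit register that is updated only in clock cycles
in which its write enable is asserted. The entity <STRONG>we_reg_sketch</STRONG> and all of
its port and signal names are hypothetical and are not taken from the data
path of this design.

<pre class="vhdl">

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical illustration of a write enable (not part of src/).
entity we_reg_sketch is
    port (  I_CLK   : in  std_logic;
            I_WE    : in  std_logic;                        -- write enable from the decoder
            I_D     : in  std_logic_vector(7 downto 0);     -- value to be stored
            Q       : out std_logic_vector(7 downto 0));    -- current register value
end we_reg_sketch;

architecture Behavioral of we_reg_sketch is
    signal L_REG : std_logic_vector(7 downto 0) := (others =&gt; '0');
begin
    process(I_CLK)
    begin
        if rising_edge(I_CLK) then
            if I_WE = '1' then      -- without write enable the register keeps its value
                L_REG &lt;= I_D;
            end if;
        end if;
    end process;

    Q &lt;= L_REG;
end Behavioral;
</pre>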
<P>We come to the VHDL code for the CPU core.  The entity declaration
must match the instantiation in the top-level design. Therefore:

<P><br>

<pre class="vhdl">

entity cpu_core is
    port (  I_CLK       : in  std_logic;
            I_CLR       : in  std_logic;
            I_INTVEC    : in  std_logic_vector( 5 downto 0);
            I_DIN       : in  std_logic_vector( 7 downto 0);

            Q_OPC       : out std_logic_vector(15 downto 0);
            Q_PC        : out std_logic_vector(15 downto 0);
            Q_DOUT      : out std_logic_vector( 7 downto 0);
            Q_ADR_IO    : out std_logic_vector( 7 downto 0);
            Q_RD_IO     : out std_logic;
            Q_WE_IO     : out std_logic);
<pre class="filename">
src/cpu_core.vhd
</pre></pre>
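<P>To show what matching the instantiation means, here is a minimal sketch
of a wrapper that instantiates <STRONG>cpu_core</STRONG> with the ports declared above. The
wrapper <STRONG>top_sketch</STRONG> and its <STRONG>L_</STRONG> signals are hypothetical; the actual top-level
design of this lecture is not shown in this lesson and will differ. The
sketch assumes that <STRONG>cpu_core</STRONG> has been compiled into the library <STRONG>work</STRONG>.

<pre class="vhdl">

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper, for illustration only (not part of src/).
entity top_sketch is
    port (  I_CLK   : in  std_logic;
            I_CLR   : in  std_logic;
            I_DIN   : in  std_logic_vector(7 downto 0);
            Q_DOUT  : out std_logic_vector(7 downto 0));
end top_sketch;

architecture Behavioral of top_sketch is
    signal L_INTVEC         : std_logic_vector( 5 downto 0);
    signal L_OPC, L_PC      : std_logic_vector(15 downto 0);
    signal L_ADR_IO         : std_logic_vector( 7 downto 0);
    signal L_RD_IO, L_WE_IO : std_logic;
begin
    L_INTVEC &lt;= (others =&gt; '0');     -- assumption: no interrupt sources connected

    -- Every port name, direction, and width must agree with the entity.
    cpu : entity work.cpu_core
    port map(   I_CLK       =&gt; I_CLK,
                I_CLR       =&gt; I_CLR,
                I_INTVEC    =&gt; L_INTVEC,
                I_DIN       =&gt; I_DIN,

                Q_OPC       =&gt; L_OPC,
                Q_PC        =&gt; L_PC,
                Q_DOUT      =&gt; Q_DOUT,
                Q_ADR_IO    =&gt; L_ADR_IO,
                Q_RD_IO     =&gt; L_RD_IO,
                Q_WE_IO     =&gt; L_WE_IO);
end Behavioral;
</pre>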
<P>

<P><br>

<P>The declaration and instantiation of <STRONG>opc_fetch</STRONG>, <STRONG>opc_deco</STRONG>, and <STRONG>dpath</STRONG>
simply reflect what is shown in the previous figure.

<P>The multiplexer driving <STRONG>DIN</STRONG> selects between data from the I/O input and
data from the program memory. This is controlled by signal <STRONG>PMS</STRONG> (<STRONG>program
memory select</STRONG>); when <STRONG>PMS</STRONG> is '1' the program memory output is selected,
otherwise the I/O input:

<P><br>

<pre class="vhdl">

    L_DIN &lt;= F_PM_DOUT when (D_PMS = '1') else I_DIN(7 downto 0);
<pre class="filename">
src/cpu_core.vhd
</pre></pre>
<P>

<P><br>

<P>Bit 5 of the interrupt vector input <STRONG>INTVEC</STRONG> is <STRONG>and</STRONG>'ed with the global interrupt
enable bit in the status register (which is contained in the data path):

<P><br>

<pre class="vhdl">

    L_INTVEC_5 &lt;= I_INTVEC(5) and R_INT_ENA;
<pre class="filename">
src/cpu_core.vhd
</pre></pre>
<P>

<P><br>

<P>This concludes the discussion of the CPU core, and we will proceed with
the different stages of the pipeline. Rather than following the natural
order (opcode fetch, opcode decoder, execution), however, we will describe
the opcode decoder last. The reason is that the opcode decoder is a
consequence of the design of the execution stage. Once the execution stage
is understood, the opcode decoder will become obvious (though still complex).

<P><hr><BR>
<table class="ttop"><th class="tpre"><a href="03_Pipelining.html">Previous Lesson</a></th><th class="ttop"><a href="toc.html">Table of Content</a></th><th class="tnxt"><a href="05_Opcode_Fetch.html">Next Lesson</a></th></table>
</BODY>
</HTML>
