URL https://opencores.org/ocsvn/cpu_lecture/cpu_lecture/trunk


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>html/Cpu_Core</TITLE>
<META NAME="generator" CONTENT="HTML::TextToHTML v2.46">
<LINK REL="stylesheet" TYPE="text/css" HREF="lecture.css">
</HEAD>
<BODY>
<P><table class="ttop"><th class="tpre"><a href="03_Pipelining.html">Previous Lesson</a></th><th class="ttop"><a href="toc.html">Table of Content</a></th><th class="tnxt"><a href="05_Opcode_Fetch.html">Next Lesson</a></th></table>
<hr>
 
<H1><A NAME="section_1">4 THE CPU CORE</A></H1>
 
<P>In this lesson we will discuss the core of the CPU. These days, the same
kind of CPU can come in different flavors that differ in the clock
frequency that they support, bus sizes, the size of internal caches
and memories, and the capabilities of the I/O ports they provide.
We call the common part of these different CPUs the <STRONG>CPU core</STRONG>.
The CPU core is primarily characterized by the instruction set that it
provides. One could also say that the CPU core is the implementation
of a given instruction set.
 
<P>The details of the instruction set will only be visible at the next lower
level of the design. At the current level different CPUs (with
different instruction sets) will still look the same because they
all use the same structure. Only some control signals will be different
for different CPUs.
 
<P>We will use the so-called <STRONG>Harvard architecture</STRONG> because it is a
better fit for FPGAs with internal memory modules. Harvard architecture means that
the program memory and the data memory of the CPU are separate. This
gives us more flexibility, and the steps of some instructions (for example <STRONG>CALL</STRONG>,
which involves storing the current program counter in
memory while changing the program counter and fetching the next
instruction) can be executed in parallel.
 
<P>Different CPU cores differ in the instruction set that
they support. The types of CPU instructions (like arithmetic
instructions, move instructions, branch instructions, etc.) are
essentially the same for all CPUs. The differences are in the details
(like the encoding of the instructions, operand sizes, number of
addressable registers, and the like).
 
<P>Since all CPUs are rather similar apart from details, within
the same base architecture (Harvard vs. von Neumann), the same
structure can be used even for different instruction sets. This
is because the same cycle is repeated again and again for the
different instructions of a program. This cycle consists of 3
phases:
 
<UL>
  <LI>Opcode fetch
  <LI>Opcode decoding
  <LI>Execution
</UL>
<P><STRONG>Opcode fetch</STRONG> means that for a given value of the program counter
<STRONG>PC</STRONG>, the instruction (opcode) stored at location PC is read from the
program memory and that the PC is advanced to the next instruction.
 
<P><STRONG>Opcode decoding</STRONG> computes a number of control signals that will
be needed in the execution phase.
 
<P><STRONG>Execution</STRONG> then executes the opcode, which means that a small number
of registers or memory locations are read and/or written.
 
<P>In theory these 3 phases could be implemented in a combinational way
(a static program memory, an opcode decoder at the output of the program
memory, and an execution module at the output of the opcode decoder).
We will see later, however, that each phase has considerable complexity,
and we therefore use a 3-stage pipeline instead.
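 
<P>The 3-stage arrangement can be sketched in a few lines of VHDL. This is
only a minimal illustration and not code from the actual design: the entity
<STRONG>pipe3_sketch</STRONG> and all of its port and signal names are hypothetical, and
the decoding is reduced to passing the opcode on unchanged. The sketch merely
shows that a register between neighboring stages lets every stage work on a
different opcode during the same clock cycle.
 
<pre class="vhdl">
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 3-stage pipeline skeleton (not taken from src/).
entity pipe3_sketch is
    port (  I_CLK       : in  std_logic;
            I_CLR       : in  std_logic;
            I_OPC       : in  std_logic_vector(15 downto 0);   -- from program memory
            Q_EXEC      : out std_logic_vector(15 downto 0));  -- seen by execution
end pipe3_sketch;

architecture Behavioral of pipe3_sketch is

    signal L_FETCHED    : std_logic_vector(15 downto 0);  -- output of stage 1
    signal L_DECODED    : std_logic_vector(15 downto 0);  -- output of stage 2

begin

    process(I_CLK)
    begin
        if rising_edge(I_CLK) then
            if (I_CLR = '1') then
                L_FETCHED <= (others => '0');
                L_DECODED <= (others => '0');
            else
                L_FETCHED <= I_OPC;         -- stage 1: opcode fetch
                L_DECODED <= L_FETCHED;     -- stage 2: opcode decoding (placeholder)
            end if;
        end if;
    end process;

    -- stage 3: the execution stage consumes the decoded opcode.
    Q_EXEC <= L_DECODED;

end Behavioral;
</pre>
 
<P>If ADD, MOV, and JMP are applied to <STRONG>I_OPC</STRONG> in three consecutive clock
cycles, then in the third cycle ADD has reached <STRONG>Q_EXEC</STRONG>, MOV is held in
<STRONG>L_FETCHED</STRONG>, and JMP is still at the input.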
 
<P>In the following figure we see how a sequence of three opcodes ADD, MOV,
and JMP is executed in the pipeline.
 
<P><br>
 
<P><img src="cpu_core_1.png">
 
<P><br>
 
<P>From the discussion above we can already predict the big picture of
the CPU core. It consists of a pipeline with 3 stages: opcode fetch,
opcode decoder, and execution (which is called data path in the design
because the operations required by the execution more or less imply
the structure of the data paths in the execution stage):
 
<P><br>
 
<P><img src="cpu_core_2.png">
 
<P><br>
 
<P>The pipeline consists of the <STRONG>opc_fetch</STRONG> stage that drives the <STRONG>PC</STRONG>, <STRONG>OPC</STRONG>, and
<STRONG>T0</STRONG> signals to the opcode decoder stage.
The <STRONG>opc_deco</STRONG> stage decodes the <STRONG>OPC</STRONG> signal and generates a number of
control signals towards the execution stage. The execution stage then
executes the decoded instruction.
 
<P>The control signals towards the execution stage can be divided into 3 groups:
 
<OL>
  <LI>Select signals (<STRONG>ALU_OP</STRONG>, <STRONG>AMOD</STRONG>, <STRONG>BIT</STRONG>, <STRONG>DDDDD</STRONG>, <STRONG>IMM</STRONG>, <STRONG>OPC</STRONG>, <STRONG>PMS</STRONG>,
   <STRONG>RD_M</STRONG>, <STRONG>RRRRR</STRONG>, and <STRONG>RSEL</STRONG>). These signals control details (like register
   numbers) of the instruction being executed.
  <LI>Branch and timing signals (<STRONG>PC</STRONG>, <STRONG>PC_OP</STRONG>, <STRONG>WAIT</STRONG>, and, in the reverse
   direction, <STRONG>SKIP</STRONG>). These signals control changes in the normal execution
   flow.
  <LI>Write enable signals (<STRONG>WE_01</STRONG>, <STRONG>WE_D</STRONG>, <STRONG>WE_F</STRONG>, <STRONG>WE_M</STRONG>, and <STRONG>WE_XYZS</STRONG>).
   These signals define if and when registers and memory locations are
   updated.
</OL>
<P>We now come to the VHDL code for the CPU core. The entity declaration
must match the instantiation in the top-level design, so it reads:
 
<P><br>
 
<pre class="vhdl">
 
entity cpu_core is
    port (  I_CLK       : in  std_logic;
            I_CLR       : in  std_logic;
            I_INTVEC    : in  std_logic_vector( 5 downto 0);
            I_DIN       : in  std_logic_vector( 7 downto 0);

            Q_OPC       : out std_logic_vector(15 downto 0);
            Q_PC        : out std_logic_vector(15 downto 0);
            Q_DOUT      : out std_logic_vector( 7 downto 0);
            Q_ADR_IO    : out std_logic_vector( 7 downto 0);
            Q_RD_IO     : out std_logic;
            Q_WE_IO     : out std_logic);
end cpu_core;
<pre class="filename">
src/cpu_core.vhd
</pre></pre>
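<P>To illustrate what "matching the instantiation" means, here is a minimal
sketch of how a top-level design might instantiate <STRONG>cpu_core</STRONG>. Only the port
names and widths of <STRONG>cpu_core</STRONG> are taken from the declaration above. The
wrapper entity <STRONG>top_sketch</STRONG>, the signals <STRONG>C_INTVEC</STRONG>, <STRONG>C_OPC</STRONG>, and <STRONG>C_PC</STRONG>, and the
decision to tie the interrupt vector to zero are assumptions made for this
example and are not taken from the real top-level file.
 
<pre class="vhdl">
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper; only the cpu_core ports come from the lesson.
entity top_sketch is
    port (  I_CLK       : in  std_logic;
            I_CLR       : in  std_logic;
            I_DIN       : in  std_logic_vector( 7 downto 0);

            Q_DOUT      : out std_logic_vector( 7 downto 0);
            Q_ADR_IO    : out std_logic_vector( 7 downto 0);
            Q_RD_IO     : out std_logic;
            Q_WE_IO     : out std_logic);
end top_sketch;

architecture Behavioral of top_sketch is

    -- The component declaration repeats the ports of entity cpu_core.
    component cpu_core
        port (  I_CLK       : in  std_logic;
                I_CLR       : in  std_logic;
                I_INTVEC    : in  std_logic_vector( 5 downto 0);
                I_DIN       : in  std_logic_vector( 7 downto 0);

                Q_OPC       : out std_logic_vector(15 downto 0);
                Q_PC        : out std_logic_vector(15 downto 0);
                Q_DOUT      : out std_logic_vector( 7 downto 0);
                Q_ADR_IO    : out std_logic_vector( 7 downto 0);
                Q_RD_IO     : out std_logic;
                Q_WE_IO     : out std_logic);
    end component;

    signal C_INTVEC     : std_logic_vector( 5 downto 0);
    signal C_OPC        : std_logic_vector(15 downto 0);  -- unused in this sketch
    signal C_PC         : std_logic_vector(15 downto 0);  -- unused in this sketch

begin

    C_INTVEC <= (others => '0');    -- assumption: no interrupt source connected

    cpu : cpu_core
    port map(   I_CLK       => I_CLK,
                I_CLR       => I_CLR,
                I_INTVEC    => C_INTVEC,
                I_DIN       => I_DIN,

                Q_OPC       => C_OPC,
                Q_PC        => C_PC,
                Q_DOUT      => Q_DOUT,
                Q_ADR_IO    => Q_ADR_IO,
                Q_RD_IO     => Q_RD_IO,
                Q_WE_IO     => Q_WE_IO);

end Behavioral;
</pre>
 
<P>Every formal port on the left of <STRONG>=></STRONG> must exist in the entity declaration
with the same direction and width as the signal on the right. This is why a
change to the entity requires a corresponding change in the top-level design.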
<P>
 
<P><br>
 
<P>The declaration and instantiation of <STRONG>opc_fetch</STRONG>, <STRONG>opc_deco</STRONG>, and <STRONG>dpath</STRONG>
simply reflect what is shown in the previous figure.
 
<P>The multiplexer driving <STRONG>DIN</STRONG> selects between data from the I/O input and
data from the program memory. This is controlled by signal <STRONG>PMS</STRONG> (<STRONG>program
memory select</STRONG>):
 
<P><br>
 
<pre class="vhdl">
 
L_DIN <= F_PM_DOUT when (D_PMS = '1') else I_DIN(7 downto 0);
<pre class="filename">
src/cpu_core.vhd
</pre></pre>
<P>
 
<P><br>
 
<P>Bit 5 of the interrupt vector input <STRONG>INTVEC</STRONG> is <STRONG>and</STRONG>'ed with the global interrupt
enable bit in the status register (which is contained in the data path):
 
<P><br>
 
<pre class="vhdl">
 
L_INTVEC_5 <= I_INTVEC(5) and R_INT_ENA;
<pre class="filename">
src/cpu_core.vhd
</pre></pre>
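<P>The two concurrent assignments of this lesson refer to signals that are
declared elsewhere in <STRONG>cpu_core</STRONG>. To try them in isolation, they can be wrapped
in a small stand-alone entity such as the following sketch. The entity
<STRONG>glue_sketch</STRONG> and its port names are hypothetical. The 8-bit width of the
program memory data follows from its assignment to the 8-bit <STRONG>DIN</STRONG> signal,
while passing bits 4 to 0 of the interrupt vector through unchanged is an
assumption of this example.
 
<pre class="vhdl">
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical stand-alone wrapper around the two assignments.
entity glue_sketch is
    port (  I_DIN       : in  std_logic_vector( 7 downto 0);  -- data from I/O
            I_PM_DOUT   : in  std_logic_vector( 7 downto 0);  -- data from program memory
            I_PMS       : in  std_logic;                      -- program memory select
            I_INTVEC    : in  std_logic_vector( 5 downto 0);  -- interrupt vector
            I_INT_ENA   : in  std_logic;                      -- global interrupt enable

            Q_DIN       : out std_logic_vector( 7 downto 0);
            Q_INTVEC    : out std_logic_vector( 5 downto 0));
end glue_sketch;

architecture Behavioral of glue_sketch is
begin

    -- 2:1 multiplexer: program memory data when PMS is set, else I/O data.
    Q_DIN <= I_PM_DOUT when (I_PMS = '1') else I_DIN;

    -- Bit 5 is gated by the interrupt enable bit.
    Q_INTVEC(5) <= I_INTVEC(5) and I_INT_ENA;

    -- Assumption: the remaining bits are passed through unchanged.
    Q_INTVEC(4 downto 0) <= I_INTVEC(4 downto 0);

end Behavioral;
</pre>
 
<P>With <STRONG>I_INT_ENA</STRONG> at '0', bit 5 of <STRONG>Q_INTVEC</STRONG> is always '0', regardless of
the value of <STRONG>I_INTVEC</STRONG>.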
<P>
 
<P><br>
 
<P>This concludes the discussion of the CPU core and we will proceed with
the different stages of the pipeline. Rather than following the natural
order (opcode fetch, opcode decoder, execution), however, we will describe
the opcode decoder last. The reason is that the opcode decoder is a
consequence of the design of the execution stage. Once the execution stage
is understood, the opcode decoder will become obvious (though still complex).
 
<P><hr><BR>
<table class="ttop"><th class="tpre"><a href="03_Pipelining.html">Previous Lesson</a></th><th class="ttop"><a href="toc.html">Table of Content</a></th><th class="tnxt"><a href="05_Opcode_Fetch.html">Next Lesson</a></th></table>
</BODY>
</HTML>
