<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>html/Cpu_Core</TITLE>
<META NAME="generator" CONTENT="HTML::TextToHTML v2.46">
<LINK REL="stylesheet" TYPE="text/css" HREF="lecture.css">
</HEAD>
<BODY>
<P><table class="ttop"><th class="tpre"><a href="03_Pipelining.html">Previous Lesson</a></th><th class="ttop"><a href="toc.html">Table of Contents</a></th><th class="tnxt"><a href="05_Opcode_Fetch.html">Next Lesson</a></th></table>
<hr>
 
<H1><A NAME="section_1">4 THE CPU CORE</A></H1>
 
<P>In this lesson we will discuss the core of the CPU. These days, the same
kind of CPU can come in different flavors that differ in the clock
frequency that they support, their bus sizes, the size of their internal
caches and memories, and the capabilities of the I/O ports they provide.
We call the common part of these different CPUs the <STRONG>CPU core</STRONG>.
The CPU core is primarily characterized by the instruction set that it
provides. One could also say that the CPU core is the implementation
of a given instruction set.
 
<P>The details of the instruction set will only be visible at the next lower
level of the design. At the current level different CPUs (with
different instruction sets) will still look the same because they
all use the same structure. Only some control signals will be different
for different CPUs.
 
<P>We will use the so-called <STRONG>Harvard architecture</STRONG> because it is a better
fit for FPGAs with internal memory modules. Harvard architecture means that
the program memory and the data memory of the CPU are separate. This
gives us more flexibility, and the steps of some instructions can be
executed in parallel (for example <STRONG>CALL</STRONG>, which stores the current
program counter in memory while changing the program counter and fetching
the next instruction).
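 
<P>The separation of the two memories can be sketched as follows. This is
a minimal illustration and not part of the lecture sources: the entity name
<STRONG>harvard_mem</STRONG>, its port names, and the memory depth of 16 words are
assumptions. Only the word widths (16 bit for opcodes, 8 bit for data) are
taken from the ports of the CPU core shown later in this lesson.
 
<pre class="vhdl">
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- hypothetical example: two memories with separate address and data ports
entity harvard_mem is
    port (  I_CLK       : in  std_logic;

            -- program memory port (read only)
            I_PC        : in  std_logic_vector( 3 downto 0);
            Q_OPC       : out std_logic_vector(15 downto 0);

            -- data memory port (read and write)
            I_ADR       : in  std_logic_vector( 3 downto 0);
            I_WE        : in  std_logic;
            I_DIN       : in  std_logic_vector( 7 downto 0);
            Q_DOUT      : out std_logic_vector( 7 downto 0));
end harvard_mem;

architecture Behavioral of harvard_mem is

    type t_pmem is array (0 to 15) of std_logic_vector(15 downto 0);
    type t_dmem is array (0 to 15) of std_logic_vector( 7 downto 0);

    -- placeholder program (all zeros) and initially cleared data memory
    constant C_PMEM : t_pmem := (others => (others => '0'));
    signal   L_DMEM : t_dmem := (others => (others => '0'));

begin

    process(I_CLK)
    begin
        if rising_edge(I_CLK) then
            -- both memories are accessed in the same clock cycle
            Q_OPC  <= C_PMEM(to_integer(unsigned(I_PC)));
            Q_DOUT <= L_DMEM(to_integer(unsigned(I_ADR)));
            if I_WE = '1' then
                L_DMEM(to_integer(unsigned(I_ADR))) <= I_DIN;
            end if;
        end if;
    end process;

end Behavioral;
</pre>
<P>Because each memory has its own address and data lines, an opcode can be
fetched while a data byte is written, which is what an instruction like
<STRONG>CALL</STRONG> benefits from.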
 
<P>Different CPU cores differ in the instruction set that
they support. The types of CPU instructions (like arithmetic
instructions, move instructions, branch instructions, etc.) are
essentially the same for all CPUs. The differences are in the details,
like the encoding of the instructions, operand sizes, the number of
addressable registers, and the like.
 
<P>Since all CPUs with the same base architecture (Harvard vs. von
Neumann) are rather similar apart from such details, the same
structure can be used even for different instruction sets. This
is because the same cycle is repeated again and again for the
different instructions of a program. This cycle consists of 3
phases:
 
<UL>
  <LI>Opcode fetch
  <LI>Opcode decoding
  <LI>Execution
</UL>
<P><STRONG>Opcode fetch</STRONG> means that for a given value of the program counter
<STRONG>PC</STRONG>, the instruction (opcode) stored at location PC is read from the
program memory and that the PC is advanced to the next instruction.
 
<P><STRONG>Opcode decoding</STRONG> computes a number of control signals that will
be needed in the execution phase.
 
<P><STRONG>Execution</STRONG> then executes the opcode, which means that a small number
of registers or memory locations are read and/or written.
 
<P>In theory these 3 phases could be implemented in a combinational way
(a static program memory, an opcode decoder at the output of the program
memory, and an execution module at the output of the opcode decoder).
We will see later, however, that each phase has considerable complexity,
and we therefore use a 3-stage pipeline instead.
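 
<P>The idea of such a pipeline can be sketched with a toy example. It is
not the CPU developed in this lecture: the entity name <STRONG>pipe3</STRONG>, the
accumulator <STRONG>ACC</STRONG>, and the opcode encoding (bit 15 set means "add the
lower 8 bits to the accumulator") are made up for illustration. What the sketch
shows is that every stage works on the values that the previous stage
registered one clock cycle earlier.
 
<pre class="vhdl">
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- hypothetical toy pipeline, for illustration only
entity pipe3 is
    port (  I_CLK       : in  std_logic;
            I_CLR       : in  std_logic;
            I_OPC       : in  std_logic_vector(15 downto 0);  -- from program memory

            Q_PC        : out std_logic_vector(15 downto 0);
            Q_ACC       : out std_logic_vector( 7 downto 0));
end pipe3;

architecture Behavioral of pipe3 is

    signal L_PC     : unsigned(15 downto 0) := (others => '0');
    signal F_OPC    : std_logic_vector(15 downto 0) := (others => '0');  -- fetch -> decode
    signal D_WE     : std_logic := '0';                                  -- decode -> execute
    signal D_IMM    : unsigned( 7 downto 0) := (others => '0');          -- decode -> execute
    signal L_ACC    : unsigned( 7 downto 0) := (others => '0');

begin

    process(I_CLK)
    begin
        if rising_edge(I_CLK) then
            if I_CLR = '1' then
                L_PC  <= (others => '0');
                F_OPC <= (others => '0');
                D_WE  <= '0';
                D_IMM <= (others => '0');
                L_ACC <= (others => '0');
            else
                -- stage 1, opcode fetch: store the opcode, advance the PC
                F_OPC <= I_OPC;
                L_PC  <= L_PC + 1;

                -- stage 2, opcode decoding: compute the control signals
                D_WE  <= F_OPC(15);
                D_IMM <= unsigned(F_OPC(7 downto 0));

                -- stage 3, execution: update a register if enabled
                if D_WE = '1' then
                    L_ACC <= L_ACC + D_IMM;
                end if;
            end if;
        end if;
    end process;

    Q_PC  <= std_logic_vector(L_PC);
    Q_ACC <= std_logic_vector(L_ACC);

end Behavioral;
</pre>
<P>Since signal assignments in a clocked process take effect only after the
clock edge, the three stages operate on three different opcodes in the same
cycle.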
 
<P>In the following figure we see how a sequence of three opcodes ADD, MOV,
and JMP is executed in the pipeline.
 
<P><br>
 
<P><img src="cpu_core_1.png">
 
<P><br>
 
<P>From the discussion above we can already predict the big picture of
the CPU core. It consists of a pipeline with 3 stages: opcode fetch,
opcode decoder, and execution (which is called the data path in the design
because the operations required by the execution more or less imply
the structure of the data paths in the execution stage):
 
<P><br>
 
<P><img src="cpu_core_2.png">
 
<P><br>
 
<P>The pipeline starts with the <STRONG>opc_fetch</STRONG> stage, which drives the <STRONG>PC</STRONG>, <STRONG>OPC</STRONG>, and
<STRONG>T0</STRONG> signals to the opcode decoder stage.
The <STRONG>opc_deco</STRONG> stage decodes the <STRONG>OPC</STRONG> signal and generates a number of
control signals towards the execution stage. The execution stage then
executes the decoded instruction.
 
<P>The control signals towards the execution stage can be divided into 3 groups:
 
<OL>
  <LI>Select signals (<STRONG>ALU_OP</STRONG>, <STRONG>AMOD</STRONG>, <STRONG>BIT</STRONG>, <STRONG>DDDDD</STRONG>, <STRONG>IMM</STRONG>, <STRONG>OPC</STRONG>, <STRONG>PMS</STRONG>,
   <STRONG>RD_M</STRONG>, <STRONG>RRRRR</STRONG>, and <STRONG>RSEL</STRONG>). These signals control details (like register
   numbers) of the instruction being executed.
  <LI>Branch and timing signals (<STRONG>PC</STRONG>, <STRONG>PC_OP</STRONG>, <STRONG>WAIT</STRONG>, and, in the reverse
   direction, <STRONG>SKIP</STRONG>). These signals control changes in the normal execution
   flow.
  <LI>Write enable signals (<STRONG>WE_01</STRONG>, <STRONG>WE_D</STRONG>, <STRONG>WE_F</STRONG>, <STRONG>WE_M</STRONG>, and <STRONG>WE_XYZS</STRONG>).
   These signals define if and when registers and memory locations are
   updated.
</OL>
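<P>How a select signal and a write enable signal work together can be
sketched with a small register file. The sketch is hypothetical: the entity
name <STRONG>reg_sketch</STRONG>, the 5-bit width of <STRONG>DDDDD</STRONG> (guessed from its name),
and the single-bit <STRONG>WE_D</STRONG> are assumptions. The real signal widths are
defined in the design files.
 
<pre class="vhdl">
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- hypothetical example, widths are assumptions
entity reg_sketch is
    port (  I_CLK       : in  std_logic;
            I_DDDDD     : in  std_logic_vector(4 downto 0);  -- select signal
            I_WE_D      : in  std_logic;                     -- write enable signal
            I_DIN       : in  std_logic_vector(7 downto 0);

            Q_D         : out std_logic_vector(7 downto 0));
end reg_sketch;

architecture Behavioral of reg_sketch is

    type t_regs is array (0 to 31) of std_logic_vector(7 downto 0);
    signal L_REGS : t_regs := (others => (others => '0'));

begin

    -- the select signal defines WHICH register is accessed ...
    Q_D <= L_REGS(to_integer(unsigned(I_DDDDD)));

    process(I_CLK)
    begin
        if rising_edge(I_CLK) then
            -- ... and the write enable signal defines IF it is updated
            if I_WE_D = '1' then
                L_REGS(to_integer(unsigned(I_DDDDD))) <= I_DIN;
            end if;
        end if;
    end process;

end Behavioral;
</pre>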
<P>We now come to the VHDL code for the CPU core. The entity declaration
must match the instantiation in the top-level design. Therefore:
 
<P><br>
 
<pre class="vhdl">
 
entity cpu_core is
    port (  I_CLK       : in  std_logic;
            I_CLR       : in  std_logic;
            I_INTVEC    : in  std_logic_vector( 5 downto 0);
            I_DIN       : in  std_logic_vector( 7 downto 0);

            Q_OPC       : out std_logic_vector(15 downto 0);
            Q_PC        : out std_logic_vector(15 downto 0);
            Q_DOUT      : out std_logic_vector( 7 downto 0);
            Q_ADR_IO    : out std_logic_vector( 7 downto 0);
            Q_RD_IO     : out std_logic;
            Q_WE_IO     : out std_logic);
end cpu_core;
<pre class="filename">
src/cpu_core.vhd
</pre></pre>
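 
<P>To illustrate what "matching the instantiation" means, here is a sketch of
how a top-level design could instantiate <STRONG>cpu_core</STRONG>. Only the port list of
<STRONG>cpu_core</STRONG> is taken from the declaration above. The entity <STRONG>top_sketch</STRONG>
and the signals <STRONG>L_INTVEC</STRONG> and <STRONG>L_DIN</STRONG> are assumptions, and no interrupt
source or I/O device is connected.
 
<pre class="vhdl">
library ieee;
use ieee.std_logic_1164.all;

-- hypothetical top level, for illustration only
entity top_sketch is
    port (  I_CLK       : in  std_logic;
            I_CLR       : in  std_logic;

            Q_PC        : out std_logic_vector(15 downto 0));
end top_sketch;

architecture Behavioral of top_sketch is

    -- repeats the entity declaration of cpu_core port by port
    component cpu_core
        port (  I_CLK       : in  std_logic;
                I_CLR       : in  std_logic;
                I_INTVEC    : in  std_logic_vector( 5 downto 0);
                I_DIN       : in  std_logic_vector( 7 downto 0);

                Q_OPC       : out std_logic_vector(15 downto 0);
                Q_PC        : out std_logic_vector(15 downto 0);
                Q_DOUT      : out std_logic_vector( 7 downto 0);
                Q_ADR_IO    : out std_logic_vector( 7 downto 0);
                Q_RD_IO     : out std_logic;
                Q_WE_IO     : out std_logic);
    end component;

    -- assumed signals: no interrupt pending, I/O input reads as zero
    signal L_INTVEC : std_logic_vector( 5 downto 0) := (others => '0');
    signal L_DIN    : std_logic_vector( 7 downto 0) := (others => '0');

begin

    cpu : cpu_core
    port map(   I_CLK       => I_CLK,
                I_CLR       => I_CLR,
                I_INTVEC    => L_INTVEC,
                I_DIN       => L_DIN,

                Q_OPC       => open,
                Q_PC        => Q_PC,
                Q_DOUT      => open,
                Q_ADR_IO    => open,
                Q_RD_IO     => open,
                Q_WE_IO     => open);

end Behavioral;
</pre>
<P>If a port name, direction, or width in the component declaration differs
from the entity declaration, the design will fail to elaborate. This is why
the two must be kept in sync.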
<P>
 
<P><br>
 
<P>The declaration and instantiation of <STRONG>opc_fetch</STRONG>, <STRONG>opc_deco</STRONG>, and <STRONG>dpath</STRONG>
simply reflect what is shown in the previous figure.
 
<P>The multiplexer driving <STRONG>DIN</STRONG> selects between data from the I/O input and
data from the program memory. This is controlled by signal <STRONG>PMS</STRONG> (<STRONG>program
memory select</STRONG>):
 
<P><br>
 
<pre class="vhdl">
 
    L_DIN <= F_PM_DOUT when (D_PMS = '1') else I_DIN(7 downto 0);
<pre class="filename">
src/cpu_core.vhd
</pre></pre>
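 
<P>The behavior of this multiplexer can be checked in isolation. The
following sketch wraps the same conditional assignment into a stand-alone
entity and adds a small self-checking test bench. The names <STRONG>din_mux</STRONG> and
<STRONG>din_mux_tb</STRONG>, the port names, and the test values are assumptions and do
not occur in the lecture sources.
 
<pre class="vhdl">
library ieee;
use ieee.std_logic_1164.all;

-- hypothetical stand-alone version of the DIN multiplexer
entity din_mux is
    port (  I_PMS       : in  std_logic;
            I_PM_DOUT   : in  std_logic_vector(7 downto 0);
            I_DIN       : in  std_logic_vector(7 downto 0);

            Q_DIN       : out std_logic_vector(7 downto 0));
end din_mux;

architecture Behavioral of din_mux is
begin
    Q_DIN <= I_PM_DOUT when (I_PMS = '1') else I_DIN;
end Behavioral;

library ieee;
use ieee.std_logic_1164.all;

-- self-checking test bench for din_mux
entity din_mux_tb is
end din_mux_tb;

architecture Behavioral of din_mux_tb is

    signal L_PMS        : std_logic := '0';
    signal L_PM_DOUT    : std_logic_vector(7 downto 0) := x"A5";
    signal L_IO_DIN     : std_logic_vector(7 downto 0) := x"3C";
    signal L_DIN        : std_logic_vector(7 downto 0);

begin

    dut : entity work.din_mux
    port map(   I_PMS       => L_PMS,
                I_PM_DOUT   => L_PM_DOUT,
                I_DIN       => L_IO_DIN,
                Q_DIN       => L_DIN);

    process
    begin
        -- PMS = '0': data from the I/O input is selected
        wait for 10 ns;
        assert L_DIN = x"3C" report "expected I/O data" severity failure;

        -- PMS = '1': data from the program memory is selected
        L_PMS <= '1';
        wait for 10 ns;
        assert L_DIN = x"A5" report "expected program memory data" severity failure;

        wait;
    end process;

end Behavioral;
</pre>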
<P>
 
<P><br>
 
<P>Bit 5 of the interrupt vector input <STRONG>INTVEC</STRONG> is <STRONG>and</STRONG>'ed with the global interrupt
enable bit in the status register (which is contained in the data path):
 
<P><br>
 
<pre class="vhdl">
 
    L_INTVEC_5 <= I_INTVEC(5) and R_INT_ENA;
<pre class="filename">
src/cpu_core.vhd
</pre></pre>
<P>
 
<P><br>
 
<P>This concludes the discussion of the CPU core and we will proceed with
the different stages of the pipeline. Rather than following the natural
order (opcode fetch, opcode decoder, execution), however, we will describe
the opcode decoder last. The reason is that the opcode decoder is a
consequence of the design of the execution stage. Once the execution stage
is understood, the opcode decoder will become obvious (though still complex).
 
<P><hr><BR>
<table class="ttop"><th class="tpre"><a href="03_Pipelining.html">Previous Lesson</a></th><th class="ttop"><a href="toc.html">Table of Contents</a></th><th class="tnxt"><a href="05_Opcode_Fetch.html">Next Lesson</a></th></table>
</BODY>
</HTML>
 
