Line 3... |
Line 3... |
%% Filename: spec.tex
|
%% Filename: spec.tex
|
%%
|
%%
|
%% Project: Zip CPU -- a small, lightweight, RISC CPU soft core
|
%% Project: Zip CPU -- a small, lightweight, RISC CPU soft core
|
%%
|
%%
|
%% Purpose: This LaTeX file contains all of the documentation/description
|
%% Purpose: This LaTeX file contains all of the documentation/description
|
%% currently provided with this Zip CPU soft core. It supercedes
|
%% currently provided with this Zip CPU soft core. It supersedes
|
%% any information about the instruction set or CPUs found
|
%% any information about the instruction set or CPUs found
|
%% elsewhere. It's not nearly as interesting, though, as the PDF
|
%% elsewhere. It's not nearly as interesting, though, as the PDF
|
%% file it creates, so I'd recommend reading that before diving
|
%% file it creates, so I'd recommend reading that before diving
|
%% into this file. You should be able to find the PDF file in
|
%% into this file. You should be able to find the PDF file in
|
%% the SVN distribution together with this PDF file and a copy of
|
%% the SVN distribution together with this PDF file and a copy of
|
Line 46... |
Line 46... |
\documentclass{gqtekspec}
|
\documentclass{gqtekspec}
|
\project{Zip CPU}
|
\project{Zip CPU}
|
\title{Specification}
|
\title{Specification}
|
\author{Dan Gisselquist, Ph.D.}
|
\author{Dan Gisselquist, Ph.D.}
|
\email{dgisselq (at) opencores.org}
|
\email{dgisselq (at) opencores.org}
|
\revision{Rev.~0.2}
|
\revision{Rev.~0.3}
|
\begin{document}
|
\begin{document}
|
\pagestyle{gqtekspecplain}
|
\pagestyle{gqtekspecplain}
|
\titlepage
|
\titlepage
|
\begin{license}
|
\begin{license}
|
Copyright (C) \theyear\today, Gisselquist Technology, LLC
|
Copyright (C) \theyear\today, Gisselquist Technology, LLC
|
Line 68... |
Line 68... |
You should have received a copy of the GNU General Public License along
|
You should have received a copy of the GNU General Public License along
|
with this program. If not, see \hbox{<http://www.gnu.org/licenses/>} for a
|
with this program. If not, see \hbox{<http://www.gnu.org/licenses/>} for a
|
copy.
|
copy.
|
\end{license}
|
\end{license}
|
\begin{revisionhistory}
|
\begin{revisionhistory}
|
|
0.3 & 8/22/2015 & Gisselquist & First completed draft\\\hline
|
0.2 & 8/19/2015 & Gisselquist & Still Draft, more complete \\\hline
|
0.2 & 8/19/2015 & Gisselquist & Still Draft, more complete \\\hline
|
0.1 & 8/17/2015 & Gisselquist & Incomplete First Draft \\\hline
|
0.1 & 8/17/2015 & Gisselquist & Incomplete First Draft \\\hline
|
\end{revisionhistory}
|
\end{revisionhistory}
|
% Revision History
|
% Revision History
|
% Table of Contents, named Contents
|
% Table of Contents, named Contents
|
Line 89... |
Line 90... |
There's more to it, though. There's a lot that I would like to do with a
|
There's more to it, though. There's a lot that I would like to do with a
|
processor, and I want to be able to do it in a vendor independent fashion.
|
processor, and I want to be able to do it in a vendor independent fashion.
|
I would like to be able to generate Verilog code that can run equivalently
|
I would like to be able to generate Verilog code that can run equivalently
|
on both Xilinx and Altera chips, and that can be easily ported from one
|
on both Xilinx and Altera chips, and that can be easily ported from one
|
manufacturer's chipsets to another. Even more, before purchasing a chip or a
|
manufacturer's chipsets to another. Even more, before purchasing a chip or a
|
board, I would like to know that my chip works. I would like to build a test
|
board, I would like to know that my soft core works. I would like to build a test
|
bench to test components with, and Verilator is my chosen test bench. This
|
bench to test components with, and Verilator is my chosen test bench. This
|
forces me to use all Verilog, and it prevents me from using any proprietary
|
forces me to use all Verilog, and it prevents me from using any proprietary
|
cores. For this reason, Microblaze and Nios are out of the question.
|
cores. For this reason, Microblaze and Nios are out of the question.
|
|
|
Why not OpenRISC? That's a hard question. The OpenRISC team has done some
|
Why not OpenRISC? That's a hard question. The OpenRISC team has done some
|
Line 157... |
Line 158... |
|
|
Now, however, that I've worked on the Zip CPU for a while, it is not nearly
|
Now, however, that I've worked on the Zip CPU for a while, it is not nearly
|
as simple as I originally hoped. Worse, I've had to adjust to create
|
as simple as I originally hoped. Worse, I've had to adjust to create
|
capabilities that I was never expecting to need. These include:
|
capabilities that I was never expecting to need. These include:
|
\begin{itemize}
|
\begin{itemize}
|
\item {\bf Extenal Debug:} Once placed upon an FPGA, some external means is
|
\item {\bf External Debug:} Once placed upon an FPGA, some external means is
|
still necessary to debug this CPU. That means that there needs to be
|
still necessary to debug this CPU. That means that there needs to be
|
an external register that can control the CPU: reset it, halt it, step
|
an external register that can control the CPU: reset it, halt it, step
|
it, and tell whether it is running or not. My chosen interface
|
it, and tell whether it is running or not. My chosen interface
|
includes a second register similar to this control register. This
|
includes a second register similar to this control register. This
|
second register allows the external controller or debugger to examine
|
second register allows the external controller or debugger to examine
|
Line 239... |
Line 240... |
|
|
So I switched to a model of discrete execution: Once an instruction
|
So I switched to a model of discrete execution: Once an instruction
|
enters into either the ALU or memory unit, the instruction is
|
enters into either the ALU or memory unit, the instruction is
|
guaranteed to complete. If the logic recognizes a branch or a
|
guaranteed to complete. If the logic recognizes a branch or a
|
condition that would render the instruction entering into this stage
|
condition that would render the instruction entering into this stage
|
possibly inappropriate (i.e. a conditional branch preceeding a store
|
possibly inappropriate (i.e. a conditional branch preceding a store
|
instruction for example), then the pipeline stalls for one cycle
|
instruction for example), then the pipeline stalls for one cycle
|
until the conditional branch completes. Then, if it generates a new
|
until the conditional branch completes. Then, if it generates a new
|
PC address, the stages preceeding are all wiped clean.
|
PC address, the stages preceding are all wiped clean.
|
|
|
The discrete execution model allows such things as sleeping: if the
|
The discrete execution model allows such things as sleeping: if the
|
CPU is put to ``sleep,'' the ALU and memory stages stall and back up
|
CPU is put to ``sleep,'' the ALU and memory stages stall and back up
|
everything before them. Likewise, anything that has entered the ALU
|
everything before them. Likewise, anything that has entered the ALU
|
or memory stage when the CPU is placed to sleep continues to completion.
|
or memory stage when the CPU is placed to sleep continues to completion.
|
Line 319... |
Line 320... |
|
|
The status register is special, and bears further mention. The lower
|
The status register is special, and bears further mention. The lower
|
10 bits of the status register form a set of CPU state and condition codes.
|
10 bits of the status register form a set of CPU state and condition codes.
|
Writes to other bits of this register are preserved.
|
Writes to other bits of this register are preserved.
|
|
|
Of the eight condition codes, the bottom four are the current flags:
|
Of the condition codes, the bottom four bits are the current flags:
|
Zero (Z),
|
Zero (Z),
|
Carry (C),
|
Carry (C),
|
Negative (N),
|
Negative (N),
|
and Overflow (V).
|
and Overflow (V).
|
|
|
The next bit is a clock enable (0 to enable) or sleep bit (1 to put
|
The next bit is a clock enable (0 to enable) or sleep bit (1 to put
|
the CPU to sleep). Setting this bit will cause the CPU to
|
the CPU to sleep). Setting this bit will cause the CPU to
|
wait for an interrupt (if interrupts are enabled), or to
|
wait for an interrupt (if interrupts are enabled), or to
|
completely halt (if interrupts are disabled).
|
completely halt (if interrupts are disabled).
|
|
|
The sixth bit is a global interrupt enable bit (GIE). When this
|
The sixth bit is a global interrupt enable bit (GIE). When this
|
sixth bit is a `1' interrupts will be enabled, else disabled. When
|
sixth bit is a `1' interrupts will be enabled, else disabled. When
|
interrupts are disabled, the CPU will be in supervisor mode, otherwise
|
interrupts are disabled, the CPU will be in supervisor mode, otherwise
|
it is in user mode. Thus, to execute a context switch, one only
|
it is in user mode. Thus, to execute a context switch, one only
|
need enable or disable interrupts. (When an interrupt line goes
|
need enable or disable interrupts. (When an interrupt line goes
|
Line 385... |
Line 387... |
These status register bits are summarized in Tbl.~\ref{tbl:ccbits}.
|
These status register bits are summarized in Tbl.~\ref{tbl:ccbits}.
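
As a rough illustration of how these bits might be picked apart in software, here is a minimal Python sketch. Bits 4 through 9 follow the table; placing Z, C, N and V at bits 0 through 3 in that order is my assumption, and the names {\tt FPEN} and {\tt TRAP} are hypothetical labels chosen for this example only.

```python
# Bit positions 4..9 follow the status register table. The order of the
# four flags in bits 0..3 (Z, C, N, V) is an assumption for illustration,
# and the names FPEN and TRAP are hypothetical labels.
CC_BITS = {
    "Z": 0, "C": 1, "N": 2, "V": 3,
    "SLEEP": 4, "GIE": 5, "STEP": 6, "BREAKEN": 7, "FPEN": 8, "TRAP": 9,
}


def is_set(cc, name):
    """True if the named status bit is set in the register value cc."""
    return bool((cc >> CC_BITS[name]) & 1)


def decode_cc(cc):
    """List the names of all status bits set in the low 10 bits of cc."""
    return [name for name in CC_BITS if is_set(cc, name)]


def cpu_mode(cc):
    """Interrupts enabled means user mode, disabled means supervisor mode."""
    return "user" if is_set(cc, "GIE") else "supervisor"


def sleep_behavior(cc):
    """Describe what the sleep bit does for a given register value."""
    if not is_set(cc, "SLEEP"):
        return "running"
    # Asleep with interrupts enabled: wait for one. Otherwise: full halt.
    return "waiting for interrupt" if is_set(cc, "GIE") else "halted"
```

A call such as {\tt sleep\_behavior(cc)} separates the two meanings of the sleep bit, which depend only on whether GIE is also set.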
|
\begin{table}
|
\begin{table}
|
\begin{center}
|
\begin{center}
|
\begin{tabular}{l|l}
|
\begin{tabular}{l|l}
|
Bit & Meaning \\\hline
|
Bit & Meaning \\\hline
|
9 & Soft trap, set on a trap from user mode, cleared when returing to user mode\\\hline
|
9 & Soft trap, set on a trap from user mode, cleared when returning to user mode\\\hline
|
8 & (Reserved for) Floating point enable \\\hline
|
8 & (Reserved for) Floating point enable \\\hline
|
7 & Halt on break, to support an external debugger \\\hline
|
7 & Halt on break, to support an external debugger \\\hline
|
6 & Step, single step the CPU in user mode\\\hline
|
6 & Step, single step the CPU in user mode\\\hline
|
5 & GIE, or Global Interrupt Enable \\\hline
|
5 & GIE, or Global Interrupt Enable \\\hline
|
4 & Sleep \\\hline
|
4 & Sleep \\\hline
|
Line 437... |
Line 439... |
1'b1 & 4-bit Register & 16--bit Signed immediate offset \\\hline
|
1'b1 & 4-bit Register & 16--bit Signed immediate offset \\\hline
|
\end{tabular}
|
\end{tabular}
|
\caption{Bit allocation for Operand B}\label{tbl:opb}
|
\caption{Bit allocation for Operand B}\label{tbl:opb}
|
\end{center}\end{table}
|
\end{center}\end{table}
|
|
|
Sixteen and twenty bit immediates don't make sense for all instructions. For
|
Sixteen and twenty bit immediate values don't make sense for all instructions.
|
example, what is the point of a 20--bit immediate when executing a 16--bit
|
For example, what is the point of a 20--bit immediate when executing a 16--bit
|
multiply? Likewise, why have a 16--bit immediate when adding to a logical
|
multiply? Likewise, why have a 16--bit immediate when adding to a logical
|
or arithmetic shift? In these cases, the extra bits are reserved for future
|
or arithmetic shift? In these cases, the extra bits are reserved for future
|
instruction possibilities.
|
instruction possibilities.
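
The two forms of Operand B can be sketched in a few lines of Python. This is only an illustration under stated assumptions: I assume the selector is the top bit of the 21--bit field, that the register index sits in the four bits below it, and that a clear selector means the remaining 20 bits are a signed immediate. The function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def decode_operand_b(opb):
    """Split a 21-bit Operand B field into (register, immediate).

    Assumed layout (not confirmed by the specification text):
      bit 20 set   -> bits 19..16 register, bits 15..0 signed offset
      bit 20 clear -> bits 19..0 signed immediate, no register
    """
    opb &= 0x1FFFFF
    if (opb >> 20) & 1:
        return {"reg": (opb >> 16) & 0xF, "imm": sign_extend(opb, 16)}
    return {"reg": None, "imm": sign_extend(opb, 20)}
```

An instruction that only uses part of the immediate would then mask the decoded value further, leaving the remaining bits reserved.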
|
|
|
\section{Address Modes}
|
\section{Address Modes}
|
Line 645... |
Line 647... |
& \multicolumn{21}{l|}{Operand B}
|
& \multicolumn{21}{l|}{Operand B}
|
& Yes \\\hline
|
& Yes \\\hline
|
LSL/ASL & \multicolumn{4}{l|}{4'hd}
|
LSL/ASL & \multicolumn{4}{l|}{4'hd}
|
& \multicolumn{4}{l|}{R. Reg}
|
& \multicolumn{4}{l|}{R. Reg}
|
& \multicolumn{3}{l|}{Cond.}
|
& \multicolumn{3}{l|}{Cond.}
|
& \multicolumn{21}{l|}{Operand B, imm. trucated to 6 bits}
|
& \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}
|
& Yes \\\hline
|
& Yes \\\hline
|
ASR & \multicolumn{4}{l|}{4'he}
|
ASR & \multicolumn{4}{l|}{4'he}
|
& \multicolumn{4}{l|}{R. Reg}
|
& \multicolumn{4}{l|}{R. Reg}
|
& \multicolumn{3}{l|}{Cond.}
|
& \multicolumn{3}{l|}{Cond.}
|
& \multicolumn{21}{l|}{Operand B, imm. trucated to 6 bits}
|
& \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}
|
& Yes \\\hline
|
& Yes \\\hline
|
LSR & \multicolumn{4}{l|}{4'hf}
|
LSR & \multicolumn{4}{l|}{4'hf}
|
& \multicolumn{4}{l|}{R. Reg}
|
& \multicolumn{4}{l|}{R. Reg}
|
& \multicolumn{3}{l|}{Cond.}
|
& \multicolumn{3}{l|}{Cond.}
|
& \multicolumn{21}{l|}{Operand B, imm. trucated to 6 bits}
|
& \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}
|
& Yes \\\hline
|
& Yes \\\hline
|
\end{tabular}
|
\end{tabular}
|
\caption{Zip CPU Instruction Set}\label{tbl:zip-instructions}
|
\caption{Zip CPU Instruction Set}\label{tbl:zip-instructions}
|
\end{center}\end{table}
|
\end{center}\end{table}
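
The column widths in the table above (4, 4, 3 and 21 bits) add up to one 32--bit word. The Python sketch below splits a word into those fields, assuming they are packed most significant first in the order the columns appear; that packing order is my assumption for illustration. The shift opcodes are the ones listed in the table, and the helper names are hypothetical.

```python
# Opcodes taken from the instruction table: 4'hd, 4'he, 4'hf.
SHIFT_OPS = {0xD: "LSL", 0xE: "ASR", 0xF: "LSR"}


def split_instruction(word):
    """Split a 32-bit instruction word into its four fields.

    Assumes the fields are packed MSB first in table order:
    opcode [31:28], register [27:24], condition [23:21], Operand B [20:0].
    """
    word &= 0xFFFFFFFF
    return {
        "op": (word >> 28) & 0xF,
        "reg": (word >> 24) & 0xF,
        "cond": (word >> 21) & 0x7,
        "opb": word & 0x1FFFFF,
    }


def shift_immediate(word):
    """For a shift opcode, return the immediate truncated to 6 bits.

    Returns None for any other opcode. Only the immediate form of
    Operand B is modeled here.
    """
    fields = split_instruction(word)
    if fields["op"] not in SHIFT_OPS:
        return None
    return fields["opb"] & 0x3F
```

For example, a word built as {\tt (0xD << 28) | (3 << 24) | (5 << 21) | 0x41} splits into opcode {\tt 0xD}, register 3, condition 5, and an Operand B of {\tt 0x41}, of which only the low six bits are kept for the shift.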
|
|
|
Line 683... |
Line 685... |
Mapped & Actual & Notes \\\hline
|
Mapped & Actual & Notes \\\hline
|
\parbox[t]{1.4in}{ADD Ra,Rx\\ADDC Rb,Ry}
|
\parbox[t]{1.4in}{ADD Ra,Rx\\ADDC Rb,Ry}
|
& \parbox[t]{1.5in}{Add Ra,Rx\\ADD.C \$1,Ry\\Add Rb,Ry}
|
& \parbox[t]{1.5in}{Add Ra,Rx\\ADD.C \$1,Ry\\Add Rb,Ry}
|
& Add with carry \\\hline
|
& Add with carry \\\hline
|
BRA.Cond +/-\$Addr
|
BRA.Cond +/-\$Addr
|
& \hbox{Mov.cond \$Addr+PC,PC}
|
& \hbox{MOV.cond \$Addr+PC,PC}
|
& Branch or jump on condition. Works for 15--bit
|
& Branch or jump on condition. Works for 15--bit
|
signed address offsets.\\\hline
|
signed address offsets.\\\hline
|
BRA.Cond +/-\$Addr
|
BRA.Cond +/-\$Addr
|
& \parbox[t]{1.5in}{LDI \$Addr,Rx \\ ADD.cond Rx,PC}
|
& \parbox[t]{1.5in}{LDI \$Addr,Rx \\ ADD.cond Rx,PC}
|
& Branch/jump on condition. Works for
|
& Branch/jump on condition. Works for
|
Line 709... |
Line 711... |
& ROL \$16,Rx
|
& ROL \$16,Rx
|
& Exchanges the top and bottom 16--bit words of Rx \\\hline
|
& Exchanges the top and bottom 16--bit words of Rx \\\hline
|
HALT
|
HALT
|
& Or \$SLEEP,CC
|
& Or \$SLEEP,CC
|
& Executed while in interrupt mode. In user mode this is simply a
|
& Executed while in interrupt mode. In user mode this is simply a
|
wait until interrupt instructioon. \\\hline
|
wait until interrupt instruction. \\\hline
|
INT & LDI \$0,CC
|
INT & LDI \$0,CC
|
& Since we're using the CC register as a trap vector as well, this
|
& Since we're using the CC register as a trap vector as well, this
|
executes TRAP \#0. \\\hline
|
executes TRAP \#0. \\\hline
|
IRET
|
IRET
|
& OR \$GIE,CC
|
& OR \$GIE,CC
|
Line 774... |
Line 776... |
& \parbox[t]{1.5in}{LSL \$1,Ry \\
|
& \parbox[t]{1.5in}{LSL \$1,Ry \\
|
LSL \$1,Rx \\
|
LSL \$1,Rx \\
|
OR.C \$1,Ry}
|
OR.C \$1,Ry}
|
& Logical shift left with carry. Note that the
|
& Logical shift left with carry. Note that the
|
instruction order is now backwards, to keep the conditions valid.
|
instruction order is now backwards, to keep the conditions valid.
|
That is, LSL sets the carry flag, so if we did this the othe way
|
That is, LSL sets the carry flag, so if we did this the other way
|
with Rx before Ry, then the condition flag wouldn't have been right
|
with Rx before Ry, then the condition flag wouldn't have been right
|
for an OR correction at the end. \\\hline
|
for an OR correction at the end. \\\hline
|
\parbox[t]{1.5in}{LSR \$1,Rx \\ LSRC \$1,Ry}
|
\parbox[t]{1.5in}{LSR \$1,Rx \\ LSRC \$1,Ry}
|
& \parbox[t]{1.5in}{CLR Rz \\
|
& \parbox[t]{1.5in}{CLR Rz \\
|
LSR \$1,Ry \\
|
LSR \$1,Ry \\
|
Line 796... |
Line 798... |
POP Rx
|
POP Rx
|
& \parbox[t]{1.5in}{LOD \$-1(SP),Rx \\ ADD \$1,SP}
|
& \parbox[t]{1.5in}{LOD \$-1(SP),Rx \\ ADD \$1,SP}
|
& Note
|
& Note
|
that for interrupt purposes, one can never depend upon the value at
|
that for interrupt purposes, one can never depend upon the value at
|
(SP). Hence you read from it, then increment it, lest having
|
(SP). Hence you read from it, then increment it, lest having
|
incremented it firost something then comes along and writes to that
|
incremented it first something then comes along and writes to that
|
value before you can read the result. \\\hline
|
value before you can read the result. \\\hline
|
PUSH Rx
|
PUSH Rx
|
& \parbox[t]{1.5in}{SUB \$1,SPa \\
|
& \parbox[t]{1.5in}{SUB \$1,SP \\
|
STO Rx,\$1(SP)}
|
STO Rx,\$1(SP)}
|
& \\\hline
|
& \\\hline
|
RESET
|
RESET
|
& \parbox[t]{1in}{STO \$1,\$watchdog(R12)\\NOOP\\NOOP}
|
& \parbox[t]{1in}{STO \$1,\$watchdog(R12)\\NOOP\\NOOP}
|
& \parbox[t]{3in}{This depends upon the peripheral base address being
|
& \parbox[t]{3in}{This depends upon the peripheral base address being
|
Line 915... |
Line 917... |
|
|
The Zip CPU does not support out of order execution. Therefore, if the memory
|
The Zip CPU does not support out of order execution. Therefore, if the memory
|
unit stalls, every other instruction stalls. Memory stores, however, can take
|
unit stalls, every other instruction stalls. Memory stores, however, can take
|
place concurrently with ALU operations, although memory reads cannot.
|
place concurrently with ALU operations, although memory reads cannot.
|
|
|
|
\iffalse
|
|
|
\section{Pipeline Logic}
|
\section{Pipeline Logic}
|
How the CPU handles some instruction combinations can be telling when
|
How the CPU handles some instruction combinations can be telling when
|
determining what happens in the pipeline. The following lists some examples:
|
determining what happens in the pipeline. The following lists some examples:
|
\begin{itemize}
|
\begin{itemize}
|
\item {\bf Delayed Branching}
|
\item {\bf Delayed Branching}
|
|
|
I had originally hoped to implement delayed branching. However, what
|
I had originally hoped to implement delayed branching. My goal
|
happens in debug mode?
|
was that the compiler would handle any pipeline stall conditions so
|
That is, what happens when a debugger tries to single step an
|
that the pipeline logic could be simpler within the CPU. I ran into
|
instruction? While I can easily single step the computer in either
|
two problems with this.
|
user or supervisor mode from externally, this processor does not appear
|
|
able to step the CPU in user mode from within user mode--gosh, not even
|
The first problem has to do with debug mode. When the debugger
|
from within supervisor mode--such as if a process had a debugger
|
single steps an instruction, that instruction goes to completion.
|
attached. As the processor exists, I would have one result stepping
|
This means that if the instruction moves a value to the PC register,
|
the CPU from a debugger, and another stepping it externally.
|
the PC register would now contain that value, indicating that the
|
|
next instruction would be on the other side of the branch. There's
|
|
just no easy way around this: the entire CPU state must be captured
|
|
by the registers, to include the program counter. What value should
|
|
the program counter be equal to? The branch? Fine. The address
|
|
you are branching to? Fine. The address of the delay slot? Problem.
|
|
|
|
The second problem with delayed branching is the idea of suspending
|
|
processing for an interrupt. Which address should the CPU return
|
|
to upon completing the interrupt processing? The branch? Good. The
|
|
address after the branch? Also good. The address of the delay slot?
|
|
Not so good.
|
|
|
|
If you then add into this mess the idea that, if the CPU is running
|
|
from a really slow memory such as the flash, the delay slot may never
|
|
be filled before the branch is determined, then this makes even less
|
|
sense.
|
|
|
This is unacceptable, and so this CPU does not support delayed
|
For all of these reasons, this CPU does not support delayed branching.
|
branching.
|
|
|
|
\item {\bf Register Result:} {\tt MOV R0,R1; MOV R1,R2 }
|
\item {\bf Register Result:} {\tt MOV R0,R1; MOV R1,R2 }
|
|
|
What value does
|
What value does R2 get, the value of R1 before the first move or the
|
R2 get, the value of R1 before the first move or the value of R0?
|
value of R0? The Zip CPU has been optimized so that neither of these
|
Placing the value of R0 into R1 requires a pipeline stall, and possibly
|
instructions requires a pipeline stall--unless an immediate were to
|
two, as I have the pipeline designed.
|
be added to R1 in the second instruction.
|
|
|
The ZIP CPU architecture requires that R2 must equal R0 at the end of
|
The ZIP CPU architecture requires that R2 must equal R0 at the end of
|
this operation. Even better, such combinations do not (normally)
|
this operation. Even better, such combinations do not (normally)
|
stall the pipeline.
|
stall the pipeline.
|
|
|
\item {\bf Condition Codes Result:} {\tt CMP R0,R1;Mov.EQ \$x,PC}
|
\item {\bf Condition Codes Result:} {\tt CMP R0,R1;} {\tt MOV.EQ \$x,PC}
|
|
|
|
|
At issue is the same item as above, save that the CMP instruction
|
At issue is the same item as above, save that the CMP instruction
|
updates the flags that the MOV instruction depends
|
updates the flags that the MOV instruction depends upon.
|
upon.
|
|
|
|
The Zip CPU architecture requires that condition codes must be updated
|
The Zip CPU architecture requires that condition codes must be updated
|
and available immediately for the next instruction without stalling the
|
and available immediately for the next instruction without stalling the
|
pipeline.
|
pipeline.
|
|
|
Line 963... |
Line 980... |
At issue is the
|
At issue is the
|
fact that the logic supporting the CC register is more complicated than
|
fact that the logic supporting the CC register is more complicated than
|
the logic supporting any other register.
|
the logic supporting any other register.
|
|
|
The ZIP CPU will stall for a cycle on this instruction.
|
The ZIP CPU will stall for a cycle on this instruction.
|
|
\item {\bf Condition Codes Register Operand:} {\tt MOV R0,R1; MOV CC,R2}
|
|
|
\item {\bf Delayed Branching: } {\tt ADD \$x,PC; MOV R0,R1}
|
Unlike the previous case, this move prior to reading the {\tt CC}
|
|
register does not impact the {\tt CC} register. Therefore, this
|
At issues is whether or not the instruction following the jump will
|
does not stall the bus, whereas the previous one would.
|
take place before the jump. In other words, is the MOV to the PC
|
|
register handled differently from an ADD to the PC register?
|
|
|
|
In the Zip architecture, MOV'es and ADD's use the same logic
|
|
(simplifies the logic).
|
|
\end{itemize}
|
\end{itemize}
|
|
|
As I've studied this, I find several approaches to handling pipeline
|
As I've studied this, I find several approaches to handling pipeline
|
issues. These approaches (and their consequences) are listed below.
|
issues. These approaches (and their consequences) are listed below.
|
|
|
\begin{itemize}
|
\begin{itemize}
|
\item {\bf All All issued instructions complete, Stages stall individually}
|
\item {\bf All issued instructions complete, stages stall individually}
|
|
|
What about a slow pre-fetch?
|
What about a slow pre-fetch?
|
|
|
Nominally, this works well: any issued instruction
|
Nominally, this works well: any issued instruction
|
just runs to completion. If there are four issued instructions in the
|
just runs to completion. If there are four issued instructions in the
|
Line 993... |
Line 1006... |
since such reads require N clocks to complete. Thus
|
since such reads require N clocks to complete. Thus
|
there may be only one instruction in the pipeline if reading from flash,
|
there may be only one instruction in the pipeline if reading from flash,
|
or a full pipeline if reading from cache. Each of these approaches
|
or a full pipeline if reading from cache. Each of these approaches
|
would produce a different response.
|
would produce a different response.
|
|
|
\item {\bf Issued instructions may be canceled}
|
For this reason, the Zip CPU works off of a different basis: All
|
|
instructions that enter either the ALU or the memory unit will
|
|
complete. Stages still stall individually.
|
|
|
Stages stall individually
|
\item {\bf Issued instructions may be canceled}
|
|
|
First problem:
|
The problem here is that
|
Memory operations cannot be canceled, even reads may have side effects
|
memory operations cannot be canceled: even reads may have side effects
|
on peripherals that cannot be canceled later. Further, in the case of
|
on peripherals that cannot be canceled later. Further, in the case of
|
an interrupt, it's difficult to know what to cancel. What happens in
|
an interrupt, it's difficult to know what to cancel. What happens in
|
a \hbox{\tt MOV.C \$x,PC} followed by a \hbox{\tt MOV \$y,PC}
|
a \hbox{\tt MOV.C \$x,PC} followed by a \hbox{\tt MOV \$y,PC}
|
instruction? Which get
|
instruction? Which get canceled?
|
canceled?
|
|
|
|
Because it isn't clear what would need to be canceled,
|
Because it isn't clear what would need to be canceled, the Zip CPU
|
this instruction combination is not recommended.
|
will not permit this combination. A MOV to the PC register will be
|
|
followed by a stall, and possibly many stalls, so that the second
|
|
move to PC will never be executed.
|
|
|
\item {\bf All issued instructions complete.}
|
\item {\bf All issued instructions complete.}
|
|
|
All stages are filled, or the entire pipeline stalls.
|
In this example, we try having all issued instructions complete, but the
|
|
entire pipeline stalls if one stage is not filled. In this approach,
|
What about debug control? What about
|
though, we again struggle with the problems associated with
|
register writes taking an extra clock stage? MOV R0,R1; MOV R1,R2
|
delayed branching. Upon attempting to restart the processor, where
|
should place the value of R0 into R2. How do you restart the pipeline
|
do you restart it from?
|
after an interrupt? What address do you use? The last issued
|
|
instruction? But the branch delay slots may make that invalid!
|
|
|
|
Reading from the CPU debug port in this case yields inconsistent
|
|
results: the CPU will halt or step with instructions stuck in the
|
|
pipeline. Reading registers will give no indication of what is going
|
|
on in the pipeline, just the results of completed operations, not of
|
|
operations that have been started and not yet completed.
|
|
Perhaps we should just report the state of the CPU based upon what
|
|
instructions (PC values) have successfully completed? Thus the
|
|
debug instruction is the one that will write registers on the next
|
|
clock.
|
|
|
|
Suggestion: Suppose we load extra information in the two
|
|
CC register(s) for debugging intermediate pipeline stages?
|
|
|
|
The next problem, though, is how to deal with the read operand
|
|
pipeline stage needing the result from the register pipeline.
|
|
|
|
\item {\bf Memory instructions must complete}
|
\item {\bf Memory instructions must complete}
|
|
|
All instructions that enter into the memory module {\em must}
|
All instructions that enter into the memory module {\em must}
|
complete. Issued instructions from the prefetch, decode, or operand
|
complete. Issued instructions from the prefetch, decode, or operand
|
Line 1054... |
Line 1052... |
go high, memory operations and ALU operations will stall until the
|
go high, memory operations and ALU operations will stall until the
|
result is known. When the flag does go high, anything in the prefetch,
|
result is known. When the flag does go high, anything in the prefetch,
|
decode, and read-op stage will be invalidated.
|
decode, and read-op stage will be invalidated.
|
|
|
\end{itemize}
|
\end{itemize}
|
|
\fi
|
|
|
\section{Pipeline Stalls}
|
\section{Pipeline Stalls}
|
The processing pipeline can and will stall for a variety of reasons. Some of
|
The processing pipeline can and will stall for a variety of reasons. Some of
|
these are obvious, some less so. These reasons are listed below:
|
these are obvious, some less so. These reasons are listed below:
|
\begin{itemize}
|
\begin{itemize}
|
Line 1099... |
Line 1098... |
Since branches take place in the writeback stage, the Zip CPU will stall the
|
Since branches take place in the writeback stage, the Zip CPU will stall the
|
pipeline for one clock anytime there may be a possible jump. This prevents
|
pipeline for one clock anytime there may be a possible jump. This prevents
|
an instruction from executing a memory access after the jump but before the
|
an instruction from executing a memory access after the jump but before the
|
jump is recognized.
|
jump is recognized.
|
|
|
|
This stall cannot be mitigated through scheduling.
|
|
|
\item When reading from the CC register after setting the flags
|
\item When reading from the CC register after setting the flags
|
\begin{enumerate}
|
\begin{enumerate}
|
\item\ {\tt ALUOP RA,RB}
|
\item\ {\tt ALUOP RA,RB}
|
\item\ {\em (stall)}
|
\item\ {\em (stall)}
|
\item\ {\tt TST sys.ccv,CC}
|
\item\ {\tt TST sys.ccv,CC}
|
Line 1114... |
Line 1115... |
determined. Trying to then place these into the input for one of the operands
|
determined. Trying to then place these into the input for one of the operands
|
created a time delay loop that would no longer execute in a single 100~MHz
|
created a time delay loop that would no longer execute in a single 100~MHz
|
clock cycle. (The time delay of the multiply within the ALU wasn't helping
|
clock cycle. (The time delay of the multiply within the ALU wasn't helping
|
either \ldots).
|
either \ldots).
|
|
|
|
This stall may be eliminated via proper scheduling, by placing an instruction
|
|
that does not set flags in between the ALU operation and the instruction
|
|
that references the CC register. For example, {\tt MOV \$addr+PC,uPC}
|
|
followed by an {\tt RTU} ({\tt OR \$GIE,CC}) instruction will not incur
|
|
this stall, whereas an {\tt OR \$BREAKEN,CC} followed by an {\tt OR \$STEP,CC}
|
|
will incur the stall.
|
|
|
\item When waiting for a memory read operation to complete
|
\item When waiting for a memory read operation to complete
|
\begin{enumerate}
|
\begin{enumerate}
|
\item\ {\tt LOD address,RA}
|
\item\ {\tt LOD address,RA}
|
\item\ {\em (multiple stalls, bus dependent, 7 clocks best)}
|
\item\ {\em (multiple stalls, bus dependent, 7 clocks best)}
|
\item\ {\tt OPCODE I+RA,RB}
|
\item\ {\tt OPCODE I+RA,RB}
|
\end{enumerate}
|
\end{enumerate}
|
|
|
Remember, the ZIP CPU does not support out of order execution. Therefore,
|
Remember, the ZIP CPU does not support out of order execution. Therefore,
|
anytime the memory unit becomes busy both the memory unit and the ALU must
|
anytime the memory unit becomes busy both the memory unit and the ALU must
|
stall until the memory unit is cleared. This is especially true of a load
|
stall until the memory unit is cleared. This is especially true of a load
|
instruction, which will write its operand back to the register file. Store
|
instruction, which must still write its operand back to the register file.
|
instructions are different, since they can be busy with no impact on later
|
Store instructions are different, since they can be busy with no impact on
|
ALU write back operations. Hence, only loads stall the pipeline.
|
later ALU write back operations. Hence, only loads stall the pipeline.
|
|
|
This also assumes that the memory being accessed is a single cycle memory.
|
This also assumes that the memory being accessed is a single cycle memory.
|
Slower memories, such as the Quad SPI flash, will take longer--perhaps even
|
Slower memories, such as the Quad SPI flash, will take longer--perhaps even
|
as long as fourty clocks. During this time the CPU and the external bus
|
as long as forty clocks. During this time the CPU and the external bus
|
will be busy, and unable to do anything else.
|
will be busy, and unable to do anything else.
|
|
|
\item Memory operation followed by a memory operation
|
\item Memory operation followed by a memory operation
|
\begin{enumerate}
|
\begin{enumerate}
|
\item\ {\tt STO address,RA}
|
\item\ {\tt STO address,RA}
|
Line 1288... |
Line 1296... |
the ZipSystem now for some time, I have yet to find a need or use for the manual
|
the ZipSystem now for some time, I have yet to find a need or use for the manual
|
cache. I will likely replace this peripheral with a proper DMA controller.
|
cache. I will likely replace this peripheral with a proper DMA controller.
|
|
|
\chapter{Operation}\label{chap:ops}
|
\chapter{Operation}\label{chap:ops}
|
|
|
|
The Zip CPU, and even the Zip System, is not a System on a Chip (SoC). It
|
|
needs to be connected to its operational environment in order to be used.
|
|
Specifically, some per system adjustments need to be made:
|
|
\begin{enumerate}
|
|
\item The Zip System depends upon an external 32-bit Wishbone bus. This
|
|
must exist, and must be connected to the Zip CPU for it to work.
|
|
\item The Zip System needs to be told of its {\tt RESET\_ADDRESS}. This is
|
|
the program counter of the first instruction following a reset.
|
|
\item If you want the Zip System to start up on its own, you will need to
|
|
set the {\tt START\_HALTED} parameter to zero. Otherwise, if you
|
|
wish to manually start the CPU, that is if upon reset you want the
|
|
CPU to start in its halted, reset state, then set this parameter to
|
|
one.
|
|
\item The third parameter to set is the number of interrupts you will be
|
|
providing from external to the CPU. This can be anything from one
|
|
to nine, but it cannot be zero. (Wire this line to a 1'b0 if you
|
|
do not wish to support any external interrupts.)
|
|
\item Finally, you need to place into some wishbone accessible address, whether
|
|
RAM or (more likely) ROM, the initial instructions for the CPU.
|
|
\end{enumerate}
|
|
If you have enabled your CPU to start automatically, then upon power up the
|
|
CPU will immediately start executing your instructions.
|
|
|
|
This is, however, not how I have used the Zip CPU. I have instead used the
|
|
ZIP CPU in a more controlled environment. For me, the CPU starts in a
|
|
halted state, and waits to be told to start. Further, the RESET address is a
|
|
location in RAM. After bringing up the board I am using, and further the
|
|
bus that is on it, the RAM memory is then loaded externally with the program
|
|
I wish the Zip System to run. Once the RAM is loaded, I release the CPU.
|
|
The CPU then runs until its halt condition, at which point its task is
|
|
complete.
|
|
|
|
Eventually, I intend to place an operating system onto the ZipSystem; I'm
|
|
just not there yet.
|
|
|
|
|
\chapter{Registers}\label{chap:regs}
|
\chapter{Registers}\label{chap:regs}
|
|
|
The ZipSystem registers fall into two categories, ZipSystem internal registers
|
The ZipSystem registers fall into two categories, ZipSystem internal registers
|
accessed via the ZipCPU shown in Tbl.~\ref{tbl:zpregs},
|
accessed via the ZipCPU shown in Tbl.~\ref{tbl:zpregs},
|
\begin{table}[htbp]
|
\begin{table}[htbp]
|
Line 1314... |
Line 1358... |
UICNT & \scalebox{0.8}{\tt 0xc000000f} & 32 & R/W & User Instruction Counter\\\hline
|
UICNT & \scalebox{0.8}{\tt 0xc000000f} & 32 & R/W & User Instruction Counter\\\hline
|
% Cache & \scalebox{0.8}{\tt 0xc0100000} & & & Base address of the Cache memory\\\hline
|
% Cache & \scalebox{0.8}{\tt 0xc0100000} & & & Base address of the Cache memory\\\hline
|
\end{reglist}
|
\end{reglist}
|
\caption{Zip System Internal/Peripheral Registers}\label{tbl:zpregs}
|
\caption{Zip System Internal/Peripheral Registers}\label{tbl:zpregs}
|
\end{center}\end{table}
|
\end{center}\end{table}
|
and the two debug registers showin in Tbl.~\ref{tbl:dbgregs}.
|
and the two debug registers shown in Tbl.~\ref{tbl:dbgregs}.
|
\begin{table}[htbp]
|
\begin{table}[htbp]
|
\begin{center}\begin{reglist}
|
\begin{center}\begin{reglist}
|
ZIPCTRL & 0 & 32 & R/W & Debug Control Register \\\hline
|
ZIPCTRL & 0 & 32 & R/W & Debug Control Register \\\hline
|
ZIPDATA & 1 & 32 & R/W & Debug Data Register \\\hline
|
ZIPDATA & 1 & 32 & R/W & Debug Data Register \\\hline
|
\end{reglist}
|
\end{reglist}
|
\caption{Zip System Debug Registers}\label{tbl:dbgregs}
|
\caption{Zip System Debug Registers}\label{tbl:dbgregs}
|
\end{center}\end{table}
|
\end{center}\end{table}
|
|
|
|
\section{Peripheral Registers}
|
|
The peripheral registers, listed in Tbl.~\ref{tbl:zpregs}, are shown in the
|
|
CPU's address space. These may be accessed by the CPU at these addresses,
|
|
and when so accessed will respond as described in Chapt.~\ref{chap:periph}.
|
|
These registers will be discussed briefly again here.
|
|
|
|
The Zip CPU Interrupt controller has four different types of bits, as shown in
|
|
Tbl.~\ref{tbl:picbits}.
|
|
\begin{table}\begin{center}
|
|
\begin{bitlist}
|
|
31 & R/W & Master Interrupt Enable\\\hline
|
|
30\ldots 16 & R/W & Interrupt Enables, write '1' to change\\\hline
|
|
15 & R & Current Master Interrupt State\\\hline
|
|
15\ldots 0 & R/W & Input Interrupt states, write '1' to clear\\\hline
|
|
\end{bitlist}
|
|
\caption{Interrupt Controller Register Bits}\label{tbl:picbits}
|
|
\end{center}\end{table}
|
|
The high order bit, or bit--31, is the master interrupt enable bit. When this
|
|
bit is set, then any time an interrupt occurs the CPU will be interrupted and
|
|
will switch to supervisor mode, etc.
|
|
|
|
Bits 30~\ldots 16 are interrupt enable bits. Should the interrupt line go
|
|
high while enabled, an interrupt will be generated. To set an interrupt enable
|
|
bit, one needs to write the master interrupt enable while writing a `1' to this
|
|
bit. To clear, one need only write a `0' to the master interrupt enable,
|
|
while leaving this line high.
|
|
|
|
Bits 15\ldots 0 are the current state of the interrupt vector. Interrupt lines
|
|
trip when they go high, and remain tripped until they are acknowledged. If
|
|
the interrupt goes high for longer than one pulse, it may be high when a clear
|
|
is requested. If so, the interrupt will not clear. The line must go low
|
|
again before the status bit can be cleared.
|
|
|
|
As an example, consider the following scenario where the Zip CPU supports four
|
|
interrupts, 3\ldots0.
|
|
\begin{enumerate}
|
|
\item The Supervisor will first, while in the interrupts disabled mode,
|
|
write a {\tt 32'h800f000f} to the controller. The supervisor may then
|
|
switch to the user state with interrupts enabled.
|
|
\item When an interrupt occurs, the supervisor will switch to the interrupt
|
|
state. It will then cycle through the interrupt bits to learn which
|
|
interrupt handler to call.
|
|
\item If the interrupt handler expects more interrupts, it will clear its
|
|
current interrupt when it is done handling the interrupt in question.
|
|
To do this, it will write a '1' to the low order interrupt mask,
|
|
such as writing a {\tt 32'h80000001}.
|
|
\item If the interrupt handler does not expect any more interrupts, it will
|
|
instead clear the interrupt from the controller by writing a
|
|
{\tt 32'h00010001} to the controller.
|
|
\item Once all interrupts have been handled, the supervisor will write a
|
|
{\tt 32'h80000000} to the interrupt register to re-enable interrupt
|
|
generation.
|
|
\item The supervisor should also check the user trap bit, and possible soft
|
|
interrupt bits here, but this action has nothing to do with the
|
|
interrupt control register.
|
|
\item The supervisor will then leave interrupt mode, possibly adjusting
|
|
whichever task is running, by executing a return from interrupt
|
|
command.
|
|
\end{enumerate}
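
Reading the four control words of this example against the bit fields of
Tbl.~\ref{tbl:picbits} gives the breakdown below. This is only a worked
reading of the values already quoted in the list, and it assumes that the
enable bit for interrupt $n$ sits at bit $16+n$ while its status bit sits at
bit $n$.
\[
\begin{array}{lcl}
\mbox{\tt 32'h800f000f} & = & 2^{31} + (\mbox{\tt 4'hf}\times 2^{16}) + \mbox{\tt 4'hf} \\
\mbox{\tt 32'h80000001} & = & 2^{31} + 2^{0} \\
\mbox{\tt 32'h00010001} & = & 2^{16} + 2^{0} \\
\mbox{\tt 32'h80000000} & = & 2^{31}
\end{array}
\]
Under that assumption, the first word sets the master enable, enables
interrupts 3\ldots0, and clears any stale status on those four lines. The
second keeps the master enable set and acknowledges interrupt~0. The third
leaves the master enable bit clear, so the `1' in bit~16 disables interrupt~0
while the `1' in bit~0 acknowledges it. The fourth touches only the master
enable.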
|
|
|
|
Leaving the interrupt controller, we show the timer register bit definitions
|
|
in Tbl.~\ref{tbl:tmrbits}.
|
|
\begin{table}\begin{center}
|
|
\begin{bitlist}
|
|
31 & R/W & Auto-Reload\\\hline
|
|
30\ldots 0 & R/W & Current timer value\\\hline
|
|
\end{bitlist}
|
|
\caption{Timer Register Bits}\label{tbl:tmrbits}
|
|
\end{center}\end{table}
|
|
As you may recall, the timer just counts down to zero and then trips an
|
|
interrupt. Writing to the current timer value sets that value, and reading
|
|
from it returns that value. Writing to the current timer value while also
|
|
setting the auto--reload bit will send the timer into an auto--reload mode.
|
|
In this mode, upon setting its interrupt bit for one cycle, the timer will
|
|
also reset itself back to the value of the timer that was written to it when
|
|
the auto--reload option was written to it. To clear and stop the timer,
|
|
just simply write a `32'h00' to this register.
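
As a worked example of the layout in Tbl.~\ref{tbl:tmrbits}, suppose the
system clock runs at 100~MHz and a periodic interrupt roughly once per
millisecond is wanted. The interval of 100,000 clocks is an illustrative
choice, not a value taken from the design, and whether the period comes out
as exactly $N$ or $N+1$ clocks is not stated here. The word to write combines
the auto--reload bit with that count:
\[
2^{31} + 100000 = \mbox{\tt 32'h80000000} + \mbox{\tt 32'h000186a0}
	= \mbox{\tt 32'h800186a0}
\]
Writing {\tt 32'h000186a0} instead would start a single countdown from the
same value with no reload.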
|
|
|
|
The Jiffies register is somewhat similar in that the register always changes.
|
|
In this case, the register counts up, whereas the timer always counted down.
|
|
Reads from this register, as shown in Tbl.~\ref{tbl:jiffybits},
|
|
\begin{table}\begin{center}
|
|
\begin{bitlist}
|
|
31\ldots 0 & R & Current jiffy value\\\hline
|
|
31\ldots 0 & W & Value/time of next interrupt\\\hline
|
|
\end{bitlist}
|
|
\caption{Jiffies Register Bits}\label{tbl:jiffybits}
|
|
\end{center}\end{table}
|
|
always return the time value contained in the register. Writes greater than
|
|
the current Jiffy value, that is where the new value minus the old value is
|
|
greater than zero while ignoring truncation, will set a new Jiffy interrupt
|
|
time. At that time, the Jiffy vector will clear, and another interrupt time
|
|
may either be written to it, or it will just continue counting without
|
|
activating any more interrupts.
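
One way to read the ``greater than zero while ignoring truncation'' test,
offered here as an interpretation rather than a statement of the
implementation, is as a signed comparison of the 32--bit difference. With $J$
the current Jiffy value and $W$ the value written, the write would set a new
interrupt time when
\[
0 < (W - J) \bmod 2^{32} < 2^{31}.
\]
For instance, with $J = \mbox{\tt 32'hfffffff0}$ and
$W = \mbox{\tt 32'h00000010}$,
\[
(W - J) \bmod 2^{32} = \mbox{\tt 32'h00000020} = 32,
\]
so on this reading the write is accepted even though $W$ is numerically
smaller than $J$, because the counter will wrap around and reach $W$ in
32~ticks.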
|
|
|
|
The Zip CPU also supports several counter peripherals, mostly in the way of
|
|
process accounting. These peripherals each have a single register associated with
|
|
them, shown in Tbl.~\ref{tbl:ctrbits}.
|
|
\begin{table}\begin{center}
|
|
\begin{bitlist}
|
|
31\ldots 0 & R/W & Current counter value\\\hline
|
|
\end{bitlist}
|
|
\caption{Counter Register Bits}\label{tbl:ctrbits}
|
|
\end{center}\end{table}
|
|
Writes to this register set the new counter value. Reads read the current
|
|
counter value.
|
|
|
|
The current design operation of these counters is that of performance counting.
|
|
Two sets of four registers are available for keeping track of performance.
|
|
The first is a task counter. This just counts clock ticks. The second
|
|
counter is a prefetch stall counter, then a master stall counter. These
|
|
allow the CPU to be evaluated as to how efficient it is. The fourth and
|
|
final counter is an instruction counter, which counts how many instructions the
|
|
CPU has issued.
|
|
|
|
It is envisioned that these counters will be used as follows: First, every time
|
|
a master counter rolls over, the supervisor (Operating System) will record
|
|
the fact. Second, whenever activating a user task, the Operating System will
|
|
set the four user counters to zero. When the user task has completed, the
|
|
Operating System will read the counters back, to determine how much of the
|
|
CPU the process had consumed.
|
|
|
|
\section{Debug Port Registers}
|
|
Accessing the Zip System via the debug port isn't as straightforward as
|
|
accessing the system via the wishbone bus. The debug port itself has been
|
|
reduced to two addresses, as outlined earlier in Tbl.~\ref{tbl:dbgregs}.
|
|
Access to the Zip System begins with the Debug Control register, shown in
|
|
Tbl.~\ref{tbl:dbgctrl}.
|
|
\begin{table}\begin{center}
|
|
\begin{bitlist}
|
|
31\ldots 14 & R & Reserved\\\hline
|
|
13 & R & CPU GIE setting\\\hline
|
|
12 & R & CPU is sleeping\\\hline
|
|
11 & W & Command clear PF cache\\\hline
|
|
10 & R/W & Command HALT, Set to '1' to halt the CPU\\\hline
|
|
9 & R & Stall Status, '1' if CPU is busy\\\hline
|
|
8 & R/W & Step Command, set to '1' to step the CPU\\\hline
|
|
7 & R & Interrupt Request \\\hline
|
|
6 & R/W & Command RESET \\\hline
|
|
5\ldots 0 & R/W & Debug Register Address \\\hline
|
|
\end{bitlist}
|
|
\caption{Debug Control Register Bits}\label{tbl:dbgctrl}
|
|
\end{center}\end{table}
|
|
|
\chapter{Wishbone Datasheet}\label{chap:wishbone}
|
The first step in debugging access is to determine whether or not the CPU
|
|
is halted, and to halt it if not. To do this, first write a '1' to the
|
|
Command HALT bit. This will halt the CPU and place it into debug mode.
|
|
Once the CPU is halted, the stall status bit will drop to zero. Thus,
|
|
if bit 10 is high and bit 9 low, the debug port is open to examine the
|
|
internal state of the CPU.
|
|
|
|
At this point, the external debugger may examine internal state information
|
|
from within the CPU. To do this, first write again to the command register
|
|
a value (with command halt still high) containing the address of an internal
|
|
register of interest in the bottom 6~bits. Internal registers that may be
|
|
accessed this way are listed in Tbl.~\ref{tbl:dbgaddrs}.
|
|
\begin{table}\begin{center}
|
|
\begin{reglist}
|
|
sR0 & 0 & 32 & R/W & Supervisor Register R0 \\\hline
|
|
sR1 & 1 & 32 & R/W & Supervisor Register R1 \\\hline
|
|
sSP & 13 & 32 & R/W & Supervisor Stack Pointer\\\hline
|
|
sCC & 14 & 32 & R/W & Supervisor Condition Code Register \\\hline
|
|
sPC & 15 & 32 & R/W & Supervisor Program Counter\\\hline
|
|
uR0 & 16 & 32 & R/W & User Register R0 \\\hline
|
|
uR1 & 17 & 32 & R/W & User Register R1 \\\hline
|
|
uSP & 29 & 32 & R/W & User Stack Pointer\\\hline
|
|
uCC & 30 & 32 & R/W & User Condition Code Register \\\hline
|
|
uPC & 31 & 32 & R/W & User Program Counter\\\hline
|
|
PIC & 32 & 32 & R/W & Primary Interrupt Controller \\\hline
|
|
WDT & 33 & 32 & R/W & Watchdog Timer\\\hline
|
|
CCHE & 34 & 32 & R/W & Manual Cache Controller\\\hline
|
|
CTRIC & 35 & 32 & R/W & Secondary Interrupt Controller\\\hline
|
|
TMRA & 36 & 32 & R/W & Timer A\\\hline
|
|
TMRB & 37 & 32 & R/W & Timer B\\\hline
|
|
TMRC & 38 & 32 & R/W & Timer C\\\hline
|
|
JIFF & 39 & 32 & R/W & Jiffies peripheral\\\hline
|
|
MTASK & 40 & 32 & R/W & Master task clock counter\\\hline
|
|
MMSTL & 41 & 32 & R/W & Master memory stall counter\\\hline
|
|
MPSTL & 42 & 32 & R/W & Master Pre-Fetch Stall counter\\\hline
|
|
MICNT & 43 & 32 & R/W & Master instruction counter\\\hline
|
|
UTASK & 44 & 32 & R/W & User task clock counter\\\hline
|
|
UMSTL & 45 & 32 & R/W & User memory stall counter\\\hline
|
|
UPSTL & 46 & 32 & R/W & User Pre-Fetch Stall counter\\\hline
|
|
UICNT & 47 & 32 & R/W & User instruction counter\\\hline
|
|
\end{reglist}
|
|
\caption{Debug Register Addresses}\label{tbl:dbgaddrs}
|
|
\end{center}\end{table}
|
|
Primarily, these ``registers'' include access to the entire CPU register
|
|
set, as well as the 16~internal peripherals. To read one of these registers
|
|
once the address is set, simply issue a read from the data port. To write
|
|
one of these registers or peripheral ports, simply write to the data port
|
|
after setting the proper address.
|
|
|
|
In this manner, all of the CPU's internal state may be read and adjusted.
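
As a worked illustration, the command word that selects a register while
keeping the CPU halted is the Command HALT bit (bit~10 of
Tbl.~\ref{tbl:dbgctrl}) plus the register's address from
Tbl.~\ref{tbl:dbgaddrs} in the low six bits. The three registers below are
arbitrary picks from that table, and all other control bits are assumed to be
left at zero:
\[
\begin{array}{lclcl}
\mbox{sPC} & : & 2^{10} + 15 & = & \mbox{\tt 32'h40f} \\
\mbox{uPC} & : & 2^{10} + 31 & = & \mbox{\tt 32'h41f} \\
\mbox{UICNT} & : & 2^{10} + 47 & = & \mbox{\tt 32'h42f}
\end{array}
\]
After writing one of these words to the control register, a read from or
write to the data register accesses the selected register.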
|
|
|
|
As an example of how to use this, consider what would happen in the case
|
|
of an external break point. If and when the CPU hits a break point that
|
|
causes it to halt, the Command HALT bit will activate on its own, the CPU
|
|
will then raise an external interrupt line and wait for a debugger to examine
|
|
its state. After examining the state, the debugger will need to remove
|
|
the breakpoint by writing a different instruction into memory and by writing
|
|
to the command register while holding the clear cache, command halt, and
|
|
step CPU bits high (32'hd00). The debugger may then replace the breakpoint
|
|
now that the CPU has gone beyond it, and clear the cache again (32'h500).
|
|
|
|
To leave this debug mode, simply write a `32'h0' value to the command register.
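
For reference, the two command words quoted in the breakpoint example
decompose against Tbl.~\ref{tbl:dbgctrl} as follows. This is simply the
arithmetic of the bit positions in that table:
\[
\begin{array}{lcl}
\mbox{\tt 32'hd00} & = & 2^{11} + 2^{10} + 2^{8} \\
\mbox{\tt 32'h500} & = & 2^{10} + 2^{8}
\end{array}
\]
The first sets the clear PF cache, Command HALT, and Step bits together. The
second sets only Command HALT and Step, so bit~11 is not part of that value.
Writing {\tt 32'h0} clears every command bit, Command HALT included, which is
what releases the CPU.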
|
|
|
|
\chapter{Wishbone Datasheets}\label{chap:wishbone}
|
The Zip System supports two wishbone ports, a slave debug port and a master
|
The Zip System supports two wishbone ports, a slave debug port and a master
|
port for the system itself. These are shown in Tbl.~\ref{tbl:wishbone-slave}
|
port for the system itself. These are shown in Tbl.~\ref{tbl:wishbone-slave}
|
\begin{table}[htbp]
|
\begin{table}[htbp]
|
\begin{center}
|
\begin{center}
|
\begin{wishboneds}
|
\begin{wishboneds}
|
Line 1408... |
Line 1658... |
had struggled with various timing violations to keep it at 100~MHz. So, for
|
had struggled with various timing violations to keep it at 100~MHz. So, for
|
now, I will only state that it can run at 100~MHz.
|
now, I will only state that it can run at 100~MHz.
|
|
|
|
|
\chapter{I/O Ports}\label{chap:ioports}
|
\chapter{I/O Ports}\label{chap:ioports}
|
|
The I/O ports to the Zip CPU may be grouped into three categories. The first
|
|
is that of the master wishbone used by the CPU, then the slave wishbone used
|
|
to command the CPU via a debugger, and then the rest. The first two of these
|
|
were already discussed in the wishbone chapter. They are listed here
|
|
for completeness in Tbl.~\ref{tbl:iowb-master}
|
|
\begin{table}
|
|
\begin{center}\begin{portlist}
|
|
{\tt o\_wb\_cyc} & 1 & Output & Indicates an active Wishbone cycle\\\hline
|
|
{\tt o\_wb\_stb} & 1 & Output & WB Strobe signal\\\hline
|
|
{\tt o\_wb\_we} & 1 & Output & Write enable\\\hline
|
|
{\tt o\_wb\_addr} & 32 & Output & Bus address \\\hline
|
|
{\tt o\_wb\_data} & 32 & Output & Data on WB write\\\hline
|
|
{\tt i\_wb\_ack} & 1 & Input & Slave has completed a R/W cycle\\\hline
|
|
{\tt i\_wb\_stall} & 1 & Input & WB bus slave not ready\\\hline
|
|
{\tt i\_wb\_data} & 32 & Input & Incoming bus data\\\hline
|
|
\end{portlist}\caption{CPU Master Wishbone I/O Ports}\label{tbl:iowb-master}\end{center}\end{table}
|
|
and~\ref{tbl:iowb-slave} respectively.
|
|
\begin{table}
|
|
\begin{center}\begin{portlist}
|
|
{\tt i\_wb\_cyc} & 1 & Input & Indicates an active Wishbone cycle\\\hline
|
|
{\tt i\_wb\_stb} & 1 & Input & WB Strobe signal\\\hline
|
|
{\tt i\_wb\_we} & 1 & Input & Write enable\\\hline
|
|
{\tt i\_wb\_addr} & 1 & Input & Bus address, command or data port \\\hline
|
|
{\tt i\_wb\_data} & 32 & Input & Data on WB write\\\hline
|
|
{\tt o\_wb\_ack} & 1 & Output & Slave has completed a R/W cycle\\\hline
|
|
{\tt o\_wb\_stall} & 1 & Output & WB bus slave not ready\\\hline
|
|
{\tt o\_wb\_data} & 32 & Output & Outgoing bus data\\\hline
|
|
\end{portlist}\caption{CPU Debug Wishbone I/O Ports}\label{tbl:iowb-slave}\end{center}\end{table}
|
|
|
|
There are only four other lines to the CPU: the external clock, external
|
|
reset, incoming external interrupt line(s), and the outgoing debug interrupt
|
|
line. These are shown in Tbl.~\ref{tbl:ioports}.
|
|
\begin{table}
|
|
\begin{center}\begin{portlist}
|
|
{\tt i\_clk} & 1 & Input & The master CPU clock \\\hline
|
|
{\tt i\_rst} & 1 & Input & Active high reset line \\\hline
|
|
{\tt i\_ext\_int} & 1\ldots 6 & Input & Incoming external interrupts \\\hline
|
|
{\tt o\_ext\_int} & 1 & Output & CPU Halted interrupt \\\hline
|
|
\end{portlist}\caption{I/O Ports}\label{tbl:ioports}\end{center}\end{table}
|
|
The clock line was discussed briefly in Chapt.~\ref{chap:clocks}. We
|
|
typically run it at 100~MHz. The reset line is an active high reset. When
|
|
asserted, the CPU will start running again from its reset address in
|
|
memory. Further, depending upon how the CPU is configured and specifically on
|
|
the {\tt START\_HALTED} parameter, it may or may not start running
|
|
automatically. The {\tt i\_ext\_int} line is for an external interrupt. This
|
|
line may be as wide as 6~external interrupts, depending upon the setting of
|
|
the {\tt EXTERNAL\_INTERRUPTS} parameter. As currently configured, the ZipSystem
|
|
only supports one such interrupt line by default. For us, this line is the
|
|
output of another interrupt controller, but that's a board specific setup
|
|
detail. Finally, the Zip System produces one external interrupt whenever
|
|
the CPU halts to wait for the debugger.
|
|
|
% Appendices
|
% Appendices
|
% Index
|
% Index
|
\end{document}
|
\end{document}
|
|
|