URL https://opencores.org/ocsvn/zipcpu/zipcpu/trunk

# Subversion Repositories zipcpu

## [/] [zipcpu/] [trunk/] [doc/] [src/] [spec.tex] - Diff between revs 24 and 32

Rev 24 Rev 32
Line 101... Line 101...
envious of what they've accomplished. I would like to port binutils to the
envious of what they've accomplished. I would like to port binutils to the
Zip CPU, as I would like to port GCC and GDB. They are way ahead of me. The
Zip CPU, as I would like to port GCC and GDB. They are way ahead of me. The
OpenRISC processor, however, is complex and hefty at about 4,500 LUTs. It has
OpenRISC processor, however, is complex and hefty at about 4,500 LUTs. It has
a lot of features of modern CPUs within it that ... well, let's just say it's
a lot of features of modern CPUs within it that ... well, let's just say it's
not the little guy on the block. The Zip CPU is lighter weight, costing only
not the little guy on the block. The Zip CPU is lighter weight, costing only
about 2,000 LUTs with no peripherals, and 3,000 LUTs with some very basic
about 2,300 LUTs with no peripherals, and 3,200 LUTs with some very basic
peripherals.
peripherals.
 
 
My final reason is that I'm building the Zip CPU as a learning experience. The
My final reason is that I'm building the Zip CPU as a learning experience. The
Zip CPU has allowed me to learn a lot about how CPUs work on a very micro
Zip CPU has allowed me to learn a lot about how CPUs work on a very micro
level. For the first time, I am beginning to understand many of the Computer
level. For the first time, I am beginning to understand many of the Computer
Line 330... Line 330...
The next bit is a clock enable (0 to enable) or sleep bit (1 to put
The next bit is a clock enable (0 to enable) or sleep bit (1 to put
        the CPU to sleep).  Setting this bit will cause the CPU to
        the CPU to sleep).  Setting this bit will cause the CPU to
        wait for an interrupt (if interrupts are enabled), or to
        wait for an interrupt (if interrupts are enabled), or to
        completely halt (if interrupts are disabled).
        completely halt (if interrupts are disabled).
The sixth bit is a global interrupt enable bit (GIE).  When this
The sixth bit is a global interrupt enable bit (GIE).  When this
        sixth bit is a '1' interrupts will be enabled, else disabled.  When
        sixth bit is a `1' interrupts will be enabled, else disabled.  When
        interrupts are disabled, the CPU will be in supervisor mode, otherwise
        interrupts are disabled, the CPU will be in supervisor mode, otherwise
        it is in user mode.  Thus, to execute a context switch, one only
        it is in user mode.  Thus, to execute a context switch, one only
        need enable or disable interrupts.  (When an interrupt line goes
        need enable or disable interrupts.  (When an interrupt line goes
        high, interrupts will automatically be disabled, as the CPU goes
        high, interrupts will automatically be disabled, as the CPU goes
        and deals with its context switch.)
        and deals with its context switch.)  Special logic has been added to
 
        keep the user mode from setting the sleep register and clearing the
 
        GIE register at the same time, with clearing the GIE register taking
 
        precedence.
 
 
The seventh bit is a step bit.  This bit can be
The seventh bit is a step bit.  This bit can be
        set from supervisor mode only.  After setting this bit, should
        set from supervisor mode only.  After setting this bit, should
        the supervisor mode process switch to user mode, it would then
        the supervisor mode process switch to user mode, it would then
        accomplish one instruction in user mode before returning to supervisor
        accomplish one instruction in user mode before returning to supervisor
Line 357... Line 360...
(break enabled), or whether the break instruction will simply send the
(break enabled), or whether the break instruction will simply send the
CPU into interrupt mode.  Encountering a break in supervisor mode will
CPU into interrupt mode.  Encountering a break in supervisor mode will
halt the CPU independent of the break enable bit.  This bit can only be set
halt the CPU independent of the break enable bit.  This bit can only be set
within supervisor mode.
within supervisor mode.
 
 
 
% Should break enable be a supervisor mode bit, while the break enable bit
 
% in user mode is a break has taken place bit?
 
%
 
 
This functionality was added to enable an external debugger to
This functionality was added to enable an external debugger to
        set and manage breakpoints.
        set and manage breakpoints.
 
 
The ninth bit is reserved for a floating point enable bit.  When set, the
The ninth bit is reserved for a floating point enable bit.  When set, the
arithmetic for the next instruction will be sent to a floating point unit.
arithmetic for the next instruction will be sent to a floating point unit.
Line 414... Line 421...
\caption{Conditions for conditional operand execution}\label{tbl:conditions}
\caption{Conditions for conditional operand execution}\label{tbl:conditions}
\end{center}
\end{center}
\end{table}
\end{table}
There is no condition code for less than or equal, not C or not V.  Sorry,
There is no condition code for less than or equal, not C or not V.  Sorry,
I ran out of space in 3--bits.  Using these conditions will take an extra
I ran out of space in 3--bits.  Using these conditions will take an extra
instruction.  (Ex: \hbox{\tt TST \$4,CC;} \hbox{\tt STO.NZ R0,(R1)})
instruction and a pipeline stall. (Ex: \hbox{\em (Stall)}; \hbox{\tt TST \$4,CC;} \hbox{\tt STO.NZ R0,(R1)})
 
 
\section{Operand B}
\section{Operand B}
Many instruction forms have a 21-bit source ``Operand B'' associated with them.
Many instruction forms have a 21-bit source ``Operand B'' associated with them.
This Operand B is either equal to a register plus a signed immediate offset,
This Operand B is either equal to a register plus a signed immediate offset,
or an immediate offset by itself.  This value is encoded as shown in
or an immediate offset by itself.  This value is encoded as shown in
Line 443... Line 450...
immediate address.  Addresses are therefore encoded in the same fashion as
immediate address.  Addresses are therefore encoded in the same fashion as
Operand B's, shown above.
Operand B's, shown above.
 
 
A lot of long hard thought was put into whether to allow pre/post increment
A lot of long hard thought was put into whether to allow pre/post increment
and decrement addressing modes.  Finding no way to use these operators without
and decrement addressing modes.  Finding no way to use these operators without
taking two or more clocks per instruction, these addressing modes have been
taking two or more clocks per instruction,\footnote{The two clocks figure
 
comes from the design of the register set, allowing only one write per clock.
 
That write is either from the memory unit or the ALU, but never both.} these
 
addressing modes have been
removed from the realm of possibilities.  This means that the Zip CPU has no
removed from the realm of possibilities.  This means that the Zip CPU has no
native way of executing push, pop, return, or jump to subroutine operations.
native way of executing push, pop, return, or jump to subroutine operations.
Each of these instructions can be emulated with a set of instructions from the
Each of these instructions can be emulated with a set of instructions from the
existing set.
existing set.
 
 
Line 482... Line 492...
rule that the register cannot be the PC or CC registers.  The PC register
rule that the register cannot be the PC or CC registers.  The PC register
field has been stolen to create a multiply by immediate instruction.  The
field has been stolen to create a multiply by immediate instruction.  The
CC register field is reserved.
CC register field is reserved.
 
 
\section{Floating Point}
\section{Floating Point}
The ZIP CPU does not support floating point operations today.  However, the
The ZIP CPU does not support floating point operations.  However, the
instruction set reserves a capability for a floating point operation.  To
instruction set reserves two possibilities for future floating point
execute such an operation, simply set the floating point bit in the CC
operations.
register and the following instruction will interpret its registers as
 
a floating point instruction.  Not all instructions, however, have floating
The first floating point operation hole in the instruction set involves
point equivalents.  Further, the immediate fields do not apply in floating
setting the floating point bit in the CC register.  The next instruction
point mode, and must be set to zero.  Not all instructions make sense as
will simply interpret its operands as floating point values.
floating point operations.  Therefore, only the CMP, SUB, ADD, and MPY
Not all instructions, however, have floating point equivalents.  Further, the
instructions may be issued as floating point instructions.  Other instructions
immediate fields do not apply in floating point mode, and must be set to
allow the examining of the floating point bit in the CC register.  In all
zero.  Not all instructions make sense as floating point operations.
cases, the floating point bit is cleared one instruction after it is set.
Therefore, only the CMP, SUB, ADD, and MPY instructions may be issued as
 
floating point instructions.  Other instructions allow the examining of the
 
floating point bit in the CC register.  In all cases, the floating point bit
 
is cleared one instruction after it is set.
 
 
 
The other possibility for floating point operations involves exploiting the
 
hole in the instruction set that the NOOP and BREAK instructions reside within.
 
These two instructions use 24--bits of address space.  A simple adjustment
 
to this space could create instructions with 4--bit register addresses for
 
each register, a 3--bit field for conditional execution, and a 2--bit field
 
for which operation.  In this fashion, such a floating point capability would
 
only fill 13--bits of the 24--bit field, still leaving lots of room for
 
expansion.
 
 
 
In both cases, the Zip CPU would support 32--bit single precision floats
 
only.
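
The second possibility above can be illustrated with a short sketch. The text fixes only the field widths (two 4--bit registers, a 3--bit condition, and a 2--bit operation); the field order, the function names, and the choice of Python are assumptions made purely for illustration, not part of the Zip CPU design.

```python
# Hypothetical packing of the 13 floating point bits described above.
# Field widths come from the text; the bit positions are an assumption.

def pack_fp_fields(op, cond, ra, rb):
    """Pack a 2-bit op, 3-bit condition and two 4-bit registers into 13 bits."""
    if not 0 <= op < 4:
        raise ValueError("op must fit in 2 bits")
    if not 0 <= cond < 8:
        raise ValueError("cond must fit in 3 bits")
    if not (0 <= ra < 16 and 0 <= rb < 16):
        raise ValueError("registers must fit in 4 bits")
    return (op << 11) | (cond << 8) | (ra << 4) | rb


def unpack_fp_fields(word):
    """Inverse of pack_fp_fields: returns (op, cond, ra, rb)."""
    return ((word >> 11) & 0x3, (word >> 8) & 0x7, (word >> 4) & 0xF, word & 0xF)
```

Every packed value stays below $2^{13}$, which leaves 11 of the 24 bits in the hole untouched.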
 
 
The architecture does not support a floating point not-implemented interrupt.
The current architecture does not support a floating point not-implemented
Any soft floating point emulation must be done deliberately.
interrupt.  Any soft floating point emulation must be done deliberately.
 
 
\section{Native Instructions}
\section{Native Instructions}
The instruction set for the Zip CPU is summarized in
The instruction set for the Zip CPU is summarized in
Tbl.~\ref{tbl:zip-instructions}.
Tbl.~\ref{tbl:zip-instructions}.
\begin{table}\begin{center}
\begin{table}\begin{center}
Line 592... Line 617...
STO & \multicolumn{4}{l|}{4'h7}
STO & \multicolumn{4}{l|}{4'h7}
                & \multicolumn{4}{l|}{D. Reg}
                & \multicolumn{4}{l|}{D. Reg}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{21}{l|}{Operand B address}
                & \multicolumn{21}{l|}{Operand B address}
                & \\\hline
                & \\\hline
{\em Rsrd} & \multicolumn{4}{l|}{4'h8}
 
        &       \multicolumn{4}{l|}{R. Reg}
 
        &       \multicolumn{3}{l|}{Cond.}
 
        & 1'b0
 
        &       \multicolumn{20}{l|}{Reserved}
 
        & Yes \\\hline
 
SUB & \multicolumn{4}{l|}{4'h8}
SUB & \multicolumn{4}{l|}{4'h8}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{3}{l|}{Cond.}
        & 1'b1
        &       \multicolumn{21}{l|}{Operand B}
        &       \multicolumn{4}{l|}{Reg}
 
        &       \multicolumn{16}{l|}{16'bit signed offset}
 
        & Yes \\\hline
        & Yes \\\hline
AND & \multicolumn{4}{l|}{4'h9}
AND & \multicolumn{4}{l|}{4'h9}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{21}{l|}{Operand B}
        &       \multicolumn{21}{l|}{Operand B}
Line 646... Line 663...
\caption{Zip CPU Instruction Set}\label{tbl:zip-instructions}
\caption{Zip CPU Instruction Set}\label{tbl:zip-instructions}
\end{center}\end{table}
\end{center}\end{table}
 
 
As you can see, there's lots of room for instruction set expansion.  The
As you can see, there's lots of room for instruction set expansion.  The
NOOP and BREAK instructions are the only instructions within one particular
NOOP and BREAK instructions are the only instructions within one particular
24--bit hole.  Likewise, the subtract leaves half of its space open, since a
24--bit hole.  This space is reserved for future enhancements.  For example,
subtract immediate is the same as an add with a negated immediate.  These
floating point operations, consisting of a 3-bit floating point operation,
spaces are reserved for future enhancements.
two 4-bit registers, no immediate offset, and a 3-bit condition would fit
 
nicely into 14--bits of this address space--making it so that the floating
 
point bit in the CC register need not be used.
 
 
\section{Derived Instructions}
\section{Derived Instructions}
The ZIP CPU supports many other common instructions, but not all of them
The ZIP CPU supports many other common instructions, but not all of them
are single cycle instructions.  The derived instruction tables,
are single cycle instructions.  The derived instruction tables,
Tbls.~\ref{tbl:derived-1}, \ref{tbl:derived-2}, and~\ref{tbl:derived-3},
Tbls.~\ref{tbl:derived-1}, \ref{tbl:derived-2}, and~\ref{tbl:derived-3},
Line 860... Line 879...
\caption{Derived Instructions, continued}\label{tbl:derived-3}
\caption{Derived Instructions, continued}\label{tbl:derived-3}
\end{center}\end{table}
\end{center}\end{table}
\iffalse
\iffalse
\fi
\fi
\section{Pipeline Stages}
\section{Pipeline Stages}
 
As mentioned in the introduction, and highlighted in Fig.~\ref{fig:cpu},
 
the Zip CPU supports a five stage pipeline.
\begin{enumerate}
\begin{enumerate}
\item {\bf Prefetch}: Read instruction from memory (cache if possible).  This
\item {\bf Prefetch}: Read instruction from memory (cache if possible).  This
        stage is actually pipelined itself, and so it will stall if the PC
        stage is actually pipelined itself, and so it will stall if the PC
        ever changes.  Stalls are also created here if the instruction isn't
        ever changes.  Stalls are also created here if the instruction isn't
        in the prefetch cache.
        in the prefetch cache.
\item {\bf Decode}: Decode instruction into op code, register(s) to read, and
\item {\bf Decode}: Decode instruction into op code, register(s) to read, and
        immediate offset.
        immediate offset.  This stage also determines whether the flags will
 
        be set or whether the result will be written back.
\item {\bf Read Operands}: Read registers and apply any immediate values to
\item {\bf Read Operands}: Read registers and apply any immediate values to
        them.  There is no means of detecting or flagging arithmetic overflow
        them.  There is no means of detecting or flagging arithmetic overflow
        or carry when adding the immediate to the operand.  This stage will
        or carry when adding the immediate to the operand.  This stage will
        stall if any source operand is pending.
        stall if any source operand is pending.
        A proper optimizing compiler, therefore, will schedule an instruction
 
        between the instruction that produces the result and the instruction
 
        that uses it.
 
\item Split into two tracks: An {\bf ALU} which will accomplish a simple
\item Split into two tracks: An {\bf ALU} which will accomplish a simple
        instruction, and the {\bf MemOps} stage which accomplishes memory
        instruction, and the {\bf MemOps} stage which accomplishes memory
        read/write.
        read/write.
        \begin{itemize}
        \begin{itemize}
        \item Loads stall instructions that access the register until it is
        \item Loads stall instructions that access the register until it is
                written to the register set.
                written to the register set.
        \item Condition codes are available upon completion
        \item Condition codes are available upon completion
        \item Issuing an instruction to the memory while the memory is busy will
        \item Issuing an instruction to the memory while the memory is busy will
                stall the bus.  If the bus deadlocks, only a reset will
                stall the entire pipeline.  If the bus deadlocks, only a reset
                release the CPU.  (Watchdog timer, anyone?)
                will release the CPU.  (Watchdog timer, anyone?)
        \item The Zip CPU currently has no means of reading and acting on any
        \item The Zip CPU currently has no means of reading and acting on any
        error conditions on the bus.
        error conditions on the bus.
        \end{itemize}
        \end{itemize}
\item {\bf Write-Back}: Conditionally write back the result to register set,
\item {\bf Write-Back}: Conditionally write back the result to the register
        applying the condition.  This routine is bi-re-entrant: either the
        set, applying the condition.  This routine is bi-re-entrant: either the
        memory or the simple instruction may request a register write.
        memory or the simple instruction may request a register write.
\end{enumerate}
\end{enumerate}
 
 
The Zip CPU does not support out of order execution.  Therefore, if the memory
The Zip CPU does not support out of order execution.  Therefore, if the memory
unit stalls, every other instruction stalls.  Memory stores, however, can take
unit stalls, every other instruction stalls.  Memory stores, however, can take
place concurrently with ALU operations, although memory writes cannot.
place concurrently with ALU operations, although memory reads cannot.
 
 
\section{Pipeline Logic}
\section{Pipeline Logic}
How the CPU handles some instruction combinations can be telling when
How the CPU handles some instruction combinations can be telling when
determining what happens in the pipeline.  The following lists some examples:
determining what happens in the pipeline.  The following lists some examples:
\begin{itemize}
\begin{itemize}
Line 923... Line 942...
        R2 get, the value of R1 before the first move or the value of R0?
        R2 get, the value of R1 before the first move or the value of R0?
        Placing the value of R0 into R1 requires a pipeline stall, and possibly
        Placing the value of R0 into R1 requires a pipeline stall, and possibly
        two, as I have the pipeline designed.
        two, as I have the pipeline designed.
 
 
        The ZIP CPU architecture requires that R2 must equal R0 at the end of
        The ZIP CPU architecture requires that R2 must equal R0 at the end of
        this operation.  This may stall the pipeline 1-2 cycles.
        this operation.  Even better, such combinations do not (normally)
 
        stall the pipeline.
 
 
\item {\bf Condition Codes Result:} {\tt CMP R0,R1;Mov.EQ \$x,PC}
\item {\bf Condition Codes Result:} {\tt CMP R0,R1;Mov.EQ \$x,PC}
 
 
 
 
        At issue is the same item as above, save that the CMP instruction
        At issue is the same item as above, save that the CMP instruction
Line 942... Line 962...
 
 
        At issue is the
        At issue is the
        fact that the logic supporting the CC register is more complicated than
        fact that the logic supporting the CC register is more complicated than
        the logic supporting any other register.
        the logic supporting any other register.
 
 
        The ZIP CPU will stall 1--2 cycles on this instruction, until the
        The ZIP CPU will stall for a cycle on this instruction.
        CC register is valid.
 
 
 
\item {\bf Delayed Branching: } {\tt ADD \$x,PC; MOV R0,R1}
\item {\bf Delayed Branching: } {\tt ADD \$x,PC; MOV R0,R1}
 
 
        At issue is whether or not the instruction following the jump will
        At issue is whether or not the instruction following the jump will
        take place before the jump.  In other words, is the MOV to the PC
        take place before the jump.  In other words, is the MOV to the PC
Line 991... Line 1010...
        Because it isn't clear what would need to be canceled,
        Because it isn't clear what would need to be canceled,
        this instruction combination is not recommended.
        this instruction combination is not recommended.
 
 
\item {\bf All issued instructions complete.}
\item {\bf All issued instructions complete.}
 
 
        All stages are filled, or the entire pipeline
        All stages are filled, or the entire pipeline stalls.
        stalls.
 
 
 
        What about debug control?  What about
        What about debug control?  What about
        register writes taking an extra clock stage?  MOV R0,R1; MOV R1,R2
        register writes taking an extra clock stage?  MOV R0,R1; MOV R1,R2
        should place the value of R0 into R2.  How do you restart the pipeline
        should place the value of R0 into R2.  How do you restart the pipeline
        after an interrupt?  What address do you use?  The last issued
        after an interrupt?  What address do you use?  The last issued
Line 1014... Line 1032...
 
 
        Suggestion: Suppose we load extra information in the two
        Suggestion: Suppose we load extra information in the two
        CC register(s) for debugging intermediate pipeline stages?
        CC register(s) for debugging intermediate pipeline stages?
 
 
        The next problem, though, is how to deal with the read operand
        The next problem, though, is how to deal with the read operand
        pipeline stage needing the result from the register pipeline.a
        pipeline stage needing the result from the register pipeline.
 
 
\item {\bf Memory instructions must complete}
\item {\bf Memory instructions must complete}
 
 
        All instructions that enter into the memory module *must*
        All instructions that enter into the memory module {\em must}
        complete.  Issued instructions from the prefetch, decode, or operand
        complete.  Issued instructions from the prefetch, decode, or operand
        read stages may or may not complete.  Jumps into code must be valid,
        read stages may or may not complete.  Jumps into code must be valid,
        so that interrupt returns may be valid.  All instructions entering the
        so that interrupt returns may be valid.  All instructions entering the
        ALU complete.
        ALU complete.
 
 
Line 1037... Line 1055...
        result is known.  When the flag does go high, anything in the prefetch,
        result is known.  When the flag does go high, anything in the prefetch,
        decode, and read-op stage will be invalidated.
        decode, and read-op stage will be invalidated.
 
 
\end{itemize}
\end{itemize}
 
 
 
\section{Pipeline Stalls}
 
The processing pipeline can and will stall for a variety of reasons.  Some of
 
these are obvious, some less so.  These reasons are listed below:
 
\begin{itemize}
 
\item When the prefetch cache is exhausted
 
 
 
This should be obvious.  If the prefetch cache doesn't have the instruction
 
in memory, the entire pipeline must stall until enough of the prefetch cache
 
is loaded to support the next instruction.
 
 
 
\item While waiting for the pipeline to load following any taken branch, jump,
 
        return from interrupt or switch to interrupt context (6 clocks)
 
 
 
If the PC suddenly changes, the pipeline is subsequently cleared and needs to
 
be reloaded.  Given that there are five stages to the pipeline, that accounts
 
for five of the six delay clocks.  The last clock is lost in the prefetch
 
stage which needs at least one clock with a valid PC before it can produce
 
a new output.  Hence, six clocks will always be lost anytime the pipeline needs
 
to be cleared.
 
 
 
\item When reading from a prior register while also adding an immediate offset
 
\begin{enumerate}
 
\item\ {\tt OPCODE ?,RA}
 
\item\ {\em (stall)}
 
\item\ {\tt OPCODE I+RA,RB}
 
\end{enumerate}
 
 
 
Since the addition of the immediate to the register within OpB decoding gets applied
 
during the read operand stage so that it can be nicely settled before the ALU,
 
any instruction that will write back an operand must be separated from the
 
opcode that will read and apply an immediate offset by one instruction.  The
 
good news is that this stall can easily be mitigated by proper scheduling.
 
 
 
\item When writing to the CC or PC Register
 
\begin{enumerate}
 
\item\ {\tt OPCODE RA,PC} {\em Ex: a branch opcode}
 
\item\ {\em (stall, even if jump not taken)}
 
\item\ {\tt OPCODE RA,RB}
 
\end{enumerate}
 
Since branches take place in the writeback stage, the Zip CPU will stall the
 
pipeline for one clock anytime there may be a possible jump.  This prevents
 
an instruction from executing a memory access after the jump but before the
 
jump is recognized.
 
 
 
\item When reading from the CC register after setting the flags
 
\begin{enumerate}
 
\item\ {\tt ALUOP RA,RB}
 
\item\ {\em (stall)}
 
\item\ {\tt TST sys.ccv,CC}
 
\item\ {\tt BZ somewhere}
 
\end{enumerate}
 
 
 
The reason for this stall is simply performance.  Many of the flags are
 
determined via combinatorial logic after the writeback instruction is
 
determined.  Trying to then place these into the input for one of the operands
 
created a time delay loop that would no longer execute in a single 100~MHz
 
clock cycle.  (The time delay of the multiply within the ALU wasn't helping
 
either \ldots).
 
 
 
\item When waiting for a memory read operation to complete
 
\begin{enumerate}
 
\item\ {\tt LOD address,RA}
 
\item\ {\em (multiple stalls, bus dependent, 7 clocks best)}
 
\item\ {\tt OPCODE I+RA,RB}
 
\end{enumerate}
 
 
 
Remember, the ZIP CPU does not support out of order execution.  Therefore,
 
anytime the memory unit becomes busy both the memory unit and the ALU must
 
stall until the memory unit is cleared.  This is especially true of a load
 
instruction, which will write its operand back to the register file.  Store
 
instructions are different, since they can be busy with no impact on later
 
ALU write back operations.  Hence, only loads stall the pipeline.
 
 
 
This also assumes that the memory being accessed is a single cycle memory.
 
Slower memories, such as the Quad SPI flash, will take longer--perhaps even
 
as long as forty clocks.   During this time the CPU and the external bus
 
will be busy, and unable to do anything else.
 
 
 
\item Memory operation followed by a memory operation
 
\begin{enumerate}
 
\item\ {\tt STO address,RA}
 
\item\ {\em (multiple stalls, bus dependent, 7 clocks best)}
 
\item\ {\tt LOD address,RB}
 
\item\ {\em (multiple stalls, bus dependent, 7 clocks best)}
 
\end{enumerate}
 
 
 
In this case, the LOD instruction cannot start until the STO is finished.
 
With proper scheduling, it is possible to do something in the ALU while the
 
STO is busy, but otherwise this pipeline will stall waiting for it to complete.
 
 
 
Note that even though the Wishbone bus can support pipelined accesses at
 
one access per clock, only the prefetch stage can take advantage of this.
 
Load and Store instructions are stuck at one wishbone cycle per instruction.
 
\end{itemize}
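
The read-after-write stall listed above (an instruction that adds an immediate to a register written by the instruction just before it) can be checked mechanically. What follows is a minimal sketch that models only that one stall cause. The {\tt Insn} record and {\tt count\_operand\_stalls} are hypothetical names introduced here, and the model ignores branches, CC reads, and memory stalls entirely.

```python
from typing import NamedTuple, Optional


class Insn(NamedTuple):
    """Hypothetical, simplified view of one instruction."""
    dest: Optional[str]    # register written back, or None
    src: Optional[str]     # Operand B register, or None
    has_imm: bool = False  # True if an immediate offset is added to src


def count_operand_stalls(program):
    """Count stalls caused by reading I+RA right after RA is written."""
    stalls = 0
    prev_dest = None
    for insn in program:
        # The immediate is applied in the read operand stage, so the
        # producing instruction must be at least one slot further back.
        if insn.has_imm and insn.src is not None and insn.src == prev_dest:
            stalls += 1
        prev_dest = insn.dest
    return stalls
```

Placing any unrelated instruction between the producer and the consumer brings the count for that pair back to zero, which is the scheduling fix the text recommends.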
 
 
 
 
\chapter{Peripherals}\label{chap:periph}
\chapter{Peripherals}\label{chap:periph}
 
 
While the previous chapter describes a CPU in isolation, the Zip System
While the previous chapter describes a CPU in isolation, the Zip System
Line 1120... Line 1232...
 
 
The watchdog timer is no different from any of the other timers, save for one
The watchdog timer is no different from any of the other timers, save for one
critical difference: the interrupt line from the watchdog
critical difference: the interrupt line from the watchdog
timer is tied to the reset line of the CPU.  Hence writing a `1' to the
timer is tied to the reset line of the CPU.  Hence writing a `1' to the
watchdog timer will always reset the CPU.
watchdog timer will always reset the CPU.
To stop the Watchdog timer, write a '0' to it.  To start it,
To stop the Watchdog timer, write a `0' to it.  To start it,
write any other number to it---as with the other timers.
write any other number to it---as with the other timers.
 
 
While the watchdog timer supports interval mode, it doesn't make as much sense
While the watchdog timer supports interval mode, it doesn't make as much sense
as it did with the other timers.
as it did with the other timers.
 
 
Line 1153... Line 1265...
 
 
The purpose of this register is to support alarm times within a CPU.  To
The purpose of this register is to support alarm times within a CPU.  To
set an alarm for a particular process $N$ clocks in advance, read the current
set an alarm for a particular process $N$ clocks in advance, read the current
Jiffies value, add $N$, and write it back to the Jiffies register.  The
Jiffies value, add $N$, and write it back to the Jiffies register.  The
O/S must also keep track of values written to the Jiffies register.  Thus,
O/S must also keep track of values written to the Jiffies register.  Thus,
when an `alarm' trips, it should be remoed from the list of alarms, the list
when an `alarm' trips, it should be removed from the list of alarms, the list
should be sorted, and the next alarm in terms of Jiffies should be written
should be sorted, and the next alarm in terms of Jiffies should be written
to the register.
to the register.
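
The bookkeeping described above might look something like the following sketch. It is a software-only toy model: the {\tt JiffiesAlarms} class, its attribute names, and the modulo-$2^{32}$ ordering used to cope with counter wraparound are all assumptions made for illustration, and nothing here touches real hardware.

```python
MASK32 = 0xFFFFFFFF  # the Jiffies register is 32 bits wide


class JiffiesAlarms:
    """Toy model of O/S-side alarm tracking for the Jiffies register."""

    def __init__(self):
        self.now = 0          # simulated current Jiffies value
        self.register = None  # last alarm value written to the register
        self.pending = []     # outstanding alarm times, soonest first

    def _program_next(self):
        # Order by distance from now, modulo 2^32, so wraparound is handled.
        self.pending.sort(key=lambda t: (t - self.now) & MASK32)
        self.register = self.pending[0] if self.pending else None

    def set_alarm(self, n):
        """Request an alarm n clocks from now; returns the alarm time."""
        when = (self.now + n) & MASK32
        self.pending.append(when)
        self._program_next()
        return when

    def trip(self):
        """Handle a tripped alarm: drop it and program the next one."""
        fired = self.pending.pop(0)
        self.now = fired
        self._program_next()
        return fired
```

For example, with the counter at 100, requesting alarms 50 and then 10 clocks ahead leaves 110 in the register; once that alarm trips, 150 is written in its place.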
 
 
\section{Manual Cache}
\section{Manual Cache}
 
 
The manual cache is an experimental setting that may not remain with the Zip
The manual cache is an experimental setting that may not remain with the Zip
CPU for very long.  It is designed to facilitate running from FLASH or ROM
CPU for very long.  It is designed to facilitate running from FLASH or ROM
memory, although the pipe cache really makes this need obsolete.  The manual
memory, although the pipeline prefetch cache really makes this need obsolete.

The manual
cache works by copying data from a wishbone address (range) into the cache
cache works by copying data from a wishbone address (range) into the cache
register, and then by making that memory available as memory to the Zip System.
register, and then by making that memory available as memory to the Zip System.
It is a {\em manual cache} because the processor must first specify what
It is a {\em manual cache} because the processor must first specify what
memory to copy, and then once copied the processor can only access the cache
memory to copy, and then once copied the processor can only access the cache
memory by the cache memory location.  There is no transparency.  It is perhaps
memory by the cache memory location.  There is no transparency.  It is perhaps
Line 1181... Line 1294...
 
 
The ZipSystem registers fall into two categories, ZipSystem internal registers
The ZipSystem registers fall into two categories, ZipSystem internal registers
accessed via the ZipCPU shown in Tbl.~\ref{tbl:zpregs},
accessed via the ZipCPU shown in Tbl.~\ref{tbl:zpregs},
\begin{table}[htbp]
\begin{table}[htbp]
\begin{center}\begin{reglist}
\begin{center}\begin{reglist}
PIC   & {\tt 0xc0000000} & 32 & R/W & Primary Interrupt Controller \\\hline
PIC   & \scalebox{0.8}{\tt 0xc0000000} & 32 & R/W & Primary Interrupt Controller \\\hline
WDT   & {\tt 0xc0000001} & 32 & R/W & Watchdog Timer \\\hline
WDT   & \scalebox{0.8}{\tt 0xc0000001} & 32 & R/W & Watchdog Timer \\\hline
CCHE  & {\tt 0xc0000002} & 32 & R/W & Manual Cache Controller \\\hline
CCHE  & \scalebox{0.8}{\tt 0xc0000002} & 32 & R/W & Manual Cache Controller \\\hline
CTRIC & {\tt 0xc0000003} & 32 & R/W & Secondary Interrupt Controller \\\hline
CTRIC & \scalebox{0.8}{\tt 0xc0000003} & 32 & R/W & Secondary Interrupt Controller \\\hline
TMRA  & {\tt 0xc0000004} & 32 & R/W & Timer A\\\hline
TMRA  & \scalebox{0.8}{\tt 0xc0000004} & 32 & R/W & Timer A\\\hline
TMRB  & {\tt 0xc0000005} & 32 & R/W & Timer B\\\hline
TMRB  & \scalebox{0.8}{\tt 0xc0000005} & 32 & R/W & Timer B\\\hline
TMRC  & {\tt 0xc0000006} & 32 & R/W & Timer C\\\hline
TMRC  & \scalebox{0.8}{\tt 0xc0000006} & 32 & R/W & Timer C\\\hline
JIFF  & {\tt 0xc0000007} & 32 & R/W & Jiffies \\\hline
JIFF  & \scalebox{0.8}{\tt 0xc0000007} & 32 & R/W & Jiffies \\\hline
MTASK  & {\tt 0xc0000008} & 32 & R/W & Master Task Clock Counter \\\hline
MTASK  & \scalebox{0.8}{\tt 0xc0000008} & 32 & R/W & Master Task Clock Counter \\\hline
MMSTL  & {\tt 0xc0000008} & 32 & R/W & Master Stall Counter \\\hline
MMSTL  & \scalebox{0.8}{\tt 0xc0000009} & 32 & R/W & Master Stall Counter \\\hline
MPSTL  & {\tt 0xc0000008} & 32 & R/W & Master Pre--Fetch Stall Counter \\\hline
MPSTL  & \scalebox{0.8}{\tt 0xc000000a} & 32 & R/W & Master Pre--Fetch Stall Counter \\\hline
MICNT  & {\tt 0xc0000008} & 32 & R/W & Master Instruction Counter\\\hline
MICNT  & \scalebox{0.8}{\tt 0xc000000b} & 32 & R/W & Master Instruction Counter\\\hline
UTASK  & {\tt 0xc0000008} & 32 & R/W & User Task Clock Counter \\\hline
UTASK  & \scalebox{0.8}{\tt 0xc000000c} & 32 & R/W & User Task Clock Counter \\\hline
UMSTL  & {\tt 0xc0000008} & 32 & R/W & User Stall Counter \\\hline
UMSTL  & \scalebox{0.8}{\tt 0xc000000d} & 32 & R/W & User Stall Counter \\\hline
UPSTL  & {\tt 0xc0000008} & 32 & R/W & User Pre--Fetch Stall Counter \\\hline
UPSTL  & \scalebox{0.8}{\tt 0xc000000e} & 32 & R/W & User Pre--Fetch Stall Counter \\\hline
UICNT  & {\tt 0xc0000008} & 32 & R/W & User Instruction Counter\\\hline
UICNT  & \scalebox{0.8}{\tt 0xc000000f} & 32 & R/W & User Instruction Counter\\\hline
Cache  & {\tt 0xc0100000} & & & Base address of the Cache memory\\\hline
% Cache  & \scalebox{0.8}{\tt 0xc0100000} & & & Base address of the Cache memory\\\hline
\end{reglist}
\end{reglist}
\caption{Zip System Internal/Peripheral Registers}\label{tbl:zpregs}
\caption{Zip System Internal/Peripheral Registers}\label{tbl:zpregs}
\end{center}\end{table}
\end{center}\end{table}
and the two debug registers shown in Tbl.~\ref{tbl:dbgregs}.
and the two debug registers shown in Tbl.~\ref{tbl:dbgregs}.
\begin{table}[htbp]
\begin{table}[htbp]
Line 1212... Line 1325...
\caption{Zip System Debug Registers}\label{tbl:dbgregs}
\caption{Zip System Debug Registers}\label{tbl:dbgregs}
\end{center}\end{table}
\end{center}\end{table}
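
The internal register addresses of Tbl.~\ref{tbl:zpregs} can be written down as a small C sketch.  This is only an illustration of the layout in the newer revision of the table: the names {\tt ZIPSYS\_REG\_BASE}, {\tt zipsys\_reg} and {\tt zipsys\_reg\_addr} are hypothetical and are not taken from the Zip CPU sources.  It assumes, as the table shows, that consecutive registers sit at consecutive bus addresses.

```c
#include <stdint.h>

/* Base address of the ZipSystem internal registers, as listed in the table. */
#define ZIPSYS_REG_BASE 0xc0000000u

/* Register offsets in table order.  The names are hypothetical: they mirror
 * the table mnemonics and are not taken from the Zip CPU sources. */
enum zipsys_reg {
    ZIPSYS_PIC = 0,  /* Primary Interrupt Controller */
    ZIPSYS_WDT,      /* Watchdog Timer */
    ZIPSYS_CCHE,     /* Manual Cache Controller */
    ZIPSYS_CTRIC,    /* Secondary Interrupt Controller */
    ZIPSYS_TMRA,     /* Timer A */
    ZIPSYS_TMRB,     /* Timer B */
    ZIPSYS_TMRC,     /* Timer C */
    ZIPSYS_JIFF,     /* Jiffies */
    ZIPSYS_MTASK,    /* Master Task Clock Counter */
    ZIPSYS_MMSTL,    /* Master Stall Counter */
    ZIPSYS_MPSTL,    /* Master Pre-Fetch Stall Counter */
    ZIPSYS_MICNT,    /* Master Instruction Counter */
    ZIPSYS_UTASK,    /* User Task Clock Counter */
    ZIPSYS_UMSTL,    /* User Stall Counter */
    ZIPSYS_UPSTL,    /* User Pre-Fetch Stall Counter */
    ZIPSYS_UICNT     /* User Instruction Counter */
};

/* Bus address of a register: one address per 32-bit register, as in the table. */
static inline uint32_t zipsys_reg_addr(enum zipsys_reg reg)
{
    return ZIPSYS_REG_BASE + (uint32_t)reg;
}
```

Deriving each address from the base plus a table index makes it harder to repeat one address across several rows, which is the error the older column of the table shows for the counters after {\tt MTASK}.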
 
 
 
 
\chapter{Wishbone Datasheet}\label{chap:wishbone}
\chapter{Wishbone Datasheet}\label{chap:wishbone}
The Zip System supports two wishbone accesses, a slave debug port and a master
The Zip System supports two wishbone ports, a slave debug port and a master
port for the system itself.  These are shown in Tbl.~\ref{tbl:wishbone-slave}
port for the system itself.  These are shown in Tbl.~\ref{tbl:wishbone-slave}
\begin{table}[htbp]
\begin{table}[htbp]
\begin{center}
\begin{center}
\begin{wishboneds}
\begin{wishboneds}
Revision level of wishbone & WB B4 spec \\\hline
Revision level of wishbone & WB B4 spec \\\hline
Line 1279... Line 1392...
it choose to access a value not on the bus, or a peripheral that is not
it choose to access a value not on the bus, or a peripheral that is not
yet properly configured.
yet properly configured.
 
 
\chapter{Clocks}\label{chap:clocks}
\chapter{Clocks}\label{chap:clocks}
 
 
This core is based upon the Basys--3 design.  The Basys--3 development board
This core is based upon the Basys--3 development board sold by Digilent.
contains one external 100~MHz clock, which is sufficient to run the Zip CPU
The Basys--3 development board contains one external 100~MHz clock, which is
core.
sufficient to run the Zip CPU core.
\begin{table}[htbp]
\begin{table}[htbp]
\begin{center}
\begin{center}
\begin{clocklist}
\begin{clocklist}
i\_clk & External & 100~MHz & 100~MHz & System clock.\\\hline
i\_clk & External & 100~MHz & 100~MHz & System clock.\\\hline
\end{clocklist}
\end{clocklist}