URL https://opencores.org/ocsvn/zipcpu/zipcpu/trunk

# Subversion Repositories zipcpu

## [/] [zipcpu/] [trunk/] [doc/] [src/] [spec.tex] - Diff between revs 37 and 39

Rev 37 Rev 39
Line 46... Line 46...
\documentclass{gqtekspec}
\documentclass{gqtekspec}
\project{Zip CPU}
\project{Zip CPU}
\title{Specification}
\title{Specification}
\author{Dan Gisselquist, Ph.D.}
\author{Dan Gisselquist, Ph.D.}
\email{dgisselq (at) opencores.org}
\email{dgisselq (at) opencores.org}
\revision{Rev.~0.4}
\revision{Rev.~0.5}
\definecolor{webred}{rgb}{0.2,0,0}
\definecolor{webred}{rgb}{0.2,0,0}
\definecolor{webgreen}{rgb}{0,0.2,0}
\definecolor{webgreen}{rgb}{0,0.2,0}
\usepackage[dvips,ps2pdf,colorlinks=true,
\usepackage[dvips,ps2pdf,colorlinks=true,
        anchorcolor=black,pagecolor=webgreen,pdfpagelabels,hypertexnames,
        anchorcolor=black,pagecolor=webgreen,pdfpagelabels,hypertexnames,
        pdfauthor={Dan Gisselquist},
        pdfauthor={Dan Gisselquist},
Line 74... Line 74...
You should have received a copy of the GNU General Public License along
You should have received a copy of the GNU General Public License along
with this program.  If not, see \hbox{<http://www.gnu.org/licenses/>} for a
with this program.  If not, see \hbox{<http://www.gnu.org/licenses/>} for a
copy.
copy.
\end{license}
\end{license}
\begin{revisionhistory}
\begin{revisionhistory}

0.5 & 9/29/2015 & Gisselquist & Added pipelined memory access discussion.\\\hline
0.4 & 9/19/2015 & Gisselquist & Added DMA controller, improved stall information, and self--assessment info.\\\hline
0.4 & 9/19/2015 & Gisselquist & Added DMA controller, improved stall information, and self--assessment info.\\\hline
0.3 & 8/22/2015 & Gisselquist & First completed draft\\\hline
0.3 & 8/22/2015 & Gisselquist & First completed draft\\\hline
0.2 & 8/19/2015 & Gisselquist & Still Draft, more complete \\\hline
0.2 & 8/19/2015 & Gisselquist & Still Draft, more complete \\\hline
0.1 & 8/17/2015 & Gisselquist & Incomplete First Draft \\\hline
0.1 & 8/17/2015 & Gisselquist & Incomplete First Draft \\\hline
\end{revisionhistory}
\end{revisionhistory}
Line 409... Line 410...
The tenth bit is a trap bit.  It is set whenever the user requests a soft
The tenth bit is a trap bit.  It is set whenever the user requests a soft
interrupt, and cleared on any return to userspace command.  This allows the
interrupt, and cleared on any return to userspace command.  This allows the
supervisor, in supervisor mode, to determine whether it got to supervisor
supervisor, in supervisor mode, to determine whether it got to supervisor
mode from a trap or from an external interrupt or both.
mode from a trap or from an external interrupt or both.
 
 

These status register bits are summarized in Tbl.~\ref{tbl:ccbits}.

\begin{table}

\begin{center}

\begin{tabular}{l|l}

Bit & Meaning \\\hline

9 & Soft trap, set on a trap from user mode, cleared when returning to user mode\\\hline

8 & (Reserved for) Floating point enable \\\hline

7 & Halt on break, to support an external debugger \\\hline

6 & Step, single step the CPU in user mode\\\hline

5 & GIE, or Global Interrupt Enable \\\hline

4 & Sleep \\\hline

3 & V, or overflow bit.\\\hline

2 & N, or negative bit.\\\hline

1 & C, or carry bit.\\\hline

0 & Z, or zero bit. \\\hline

\end{tabular}

\caption{Condition Code / Status Register Bits}\label{tbl:ccbits}

\end{center}\end{table}
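The bit positions in the status register table can be sketched as a small lookup, which may help when reading a raw CC value in a debugger. This is only an illustrative sketch: the bit numbers come from the table, while the flag names, `decode_cc` and `cc_mask` are hypothetical helpers and not identifiers from the Zip CPU sources.

```python
# Bit positions copied from the Condition Code / Status Register table.
# The dictionary keys are illustrative names, not official identifiers.
CC_BITS = {
    "Z": 0,             # zero
    "C": 1,             # carry
    "N": 2,             # negative
    "V": 3,             # overflow
    "SLEEP": 4,
    "GIE": 5,           # global interrupt enable
    "STEP": 6,          # single step in user mode
    "BREAK_HALT": 7,    # halt on break, for an external debugger
    "FPU_RESERVED": 8,  # reserved for floating point enable
    "TRAP": 9,          # soft trap (the "tenth bit")
}


def cc_mask(*names):
    """Build a mask with the named flags set."""
    mask = 0
    for name in names:
        mask |= 1 << CC_BITS[name]
    return mask


def decode_cc(value):
    """Return the alphabetically sorted names of all flags set in value."""
    return sorted(name for name, bit in CC_BITS.items() if (value >> bit) & 1)
```

For instance, `cc_mask("GIE", "SLEEP")` gives the value that the WAIT derived instruction ORs into CC in Rev 39.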

 
\section{Conditional Instructions}
\section{Conditional Instructions}
Most, although not quite all, instructions may be conditionally executed.  From
Most, although not quite all, instructions may be conditionally executed.  From
the four condition code flags, eight conditions are defined.  These are shown
the four condition code flags, eight conditions are defined.  These are shown
in Tbl.~\ref{tbl:conditions}.
in Tbl.~\ref{tbl:conditions}.
\begin{table}
\begin{table}
Line 544... Line 564...
\begin{tabular}{|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|c|}\hline
\begin{tabular}{|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|c|}\hline
\rowcolor[gray]{0.85}
\rowcolor[gray]{0.85}
Op Code & \multicolumn{8}{c|}{31\ldots24} & \multicolumn{8}{c|}{23\ldots 16}
Op Code & \multicolumn{8}{c|}{31\ldots24} & \multicolumn{8}{c|}{23\ldots 16}
        & \multicolumn{8}{c|}{15\ldots 8} & \multicolumn{8}{c|}{7\ldots 0}
        & \multicolumn{8}{c|}{15\ldots 8} & \multicolumn{8}{c|}{7\ldots 0}
        & Sets CC? \\\hline\hline
        & Sets CC? \\\hline\hline
CMP(Sub) & \multicolumn{4}{l|}{4'h0}
{\tt CMP(Sub)} & \multicolumn{4}{l|}{4'h0}
                & \multicolumn{4}{l|}{D. Reg}
                & \multicolumn{4}{l|}{D. Reg}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{21}{l|}{Operand B}
                & \multicolumn{21}{l|}{Operand B}
                & Yes \\\hline
                & Yes \\\hline
TST(And) & \multicolumn{4}{l|}{4'h1}
{\tt TST(And)} & \multicolumn{4}{l|}{4'h1}
                & \multicolumn{4}{l|}{D. Reg}
                & \multicolumn{4}{l|}{D. Reg}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{21}{l|}{Operand B}
                & \multicolumn{21}{l|}{Operand B}
        & Yes \\\hline
        & Yes \\\hline
MOV & \multicolumn{4}{l|}{4'h2}
{\tt MOV} & \multicolumn{4}{l|}{4'h2}
                & \multicolumn{4}{l|}{D. Reg}
                & \multicolumn{4}{l|}{D. Reg}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{3}{l|}{Cond.}
                & A-Usr
                & A-Usr
                & \multicolumn{4}{l|}{B-Reg}
                & \multicolumn{4}{l|}{B-Reg}
                & B-Usr
                & B-Usr
                & \multicolumn{15}{l|}{15'bit signed offset}
                & \multicolumn{15}{l|}{15'bit signed offset}
                & \\\hline
                & \\\hline
LODI & \multicolumn{4}{l|}{4'h3}
{\tt LODI} & \multicolumn{4}{l|}{4'h3}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{24}{l|}{24'bit Signed Immediate}
                & \multicolumn{24}{l|}{24'bit Signed Immediate}
                & \\\hline
                & \\\hline
NOOP & \multicolumn{4}{l|}{4'h4}
{\tt NOOP} & \multicolumn{4}{l|}{4'h4}
                & \multicolumn{4}{l|}{4'he}
                & \multicolumn{4}{l|}{4'he}
                & \multicolumn{24}{l|}{24'h00}
                & \multicolumn{24}{l|}{24'h00}
                & \\\hline
                & \\\hline
BREAK & \multicolumn{4}{l|}{4'h4}
{\tt BREAK} & \multicolumn{4}{l|}{4'h4}
                & \multicolumn{4}{l|}{4'he}
                & \multicolumn{4}{l|}{4'he}
                & \multicolumn{24}{l|}{24'h01}
                & \multicolumn{24}{l|}{24'h01}
                & \\\hline
                & \\\hline
{\em Reserved} & \multicolumn{4}{l|}{4'h4}
{\em Reserved} & \multicolumn{4}{l|}{4'h4}
                & \multicolumn{4}{l|}{4'he}
                & \multicolumn{4}{l|}{4'he}
                & \multicolumn{24}{l|}{24'bits, but not 0 or 1.}
                & \multicolumn{24}{l|}{24'bits, but not 0 or 1.}
                & \\\hline
                & \\\hline
LODIHI & \multicolumn{4}{l|}{4'h4}
{\tt LODIHI }& \multicolumn{4}{l|}{4'h4}
                & \multicolumn{4}{l|}{4'hf}
                & \multicolumn{4}{l|}{4'hf}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{3}{l|}{Cond.}
                & 1'b1
                & 1'b1
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{16}{l|}{16-bit Immediate}
                & \multicolumn{16}{l|}{16-bit Immediate}
                & \\\hline
                & \\\hline
LODILO & \multicolumn{4}{l|}{4'h4}
{\tt LODILO} & \multicolumn{4}{l|}{4'h4}
                & \multicolumn{4}{l|}{4'hf}
                & \multicolumn{4}{l|}{4'hf}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{3}{l|}{Cond.}
                & 1'b0
                & 1'b0
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{16}{l|}{16-bit Immediate}
                & \multicolumn{16}{l|}{16-bit Immediate}
                & \\\hline
                & \\\hline
16-b MPYU & \multicolumn{4}{l|}{4'h4}
16-b {\tt MPYU} & \multicolumn{4}{l|}{4'h4}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{3}{l|}{Cond.}
                & 1'b0 & \multicolumn{4}{l|}{Reg}
                & 1'b0 & \multicolumn{4}{l|}{Reg}
                & \multicolumn{16}{l|}{16-bit Offset}
                & \multicolumn{16}{l|}{16-bit Offset}
                & Yes \\\hline
                & Yes \\\hline
16-b MPYU(I) & \multicolumn{4}{l|}{4'h4}
16-b {\tt MPYU}(I) & \multicolumn{4}{l|}{4'h4}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{3}{l|}{Cond.}
                & 1'b0 & \multicolumn{4}{l|}{4'hf}
                & 1'b0 & \multicolumn{4}{l|}{4'hf}
                & \multicolumn{16}{l|}{16-bit Offset}
                & \multicolumn{16}{l|}{16-bit Offset}
                & Yes \\\hline
                & Yes \\\hline
16-b MPYS & \multicolumn{4}{l|}{4'h4}
16-b {\tt MPYS} & \multicolumn{4}{l|}{4'h4}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{3}{l|}{Cond.}
                & 1'b1 & \multicolumn{4}{l|}{Reg}
                & 1'b1 & \multicolumn{4}{l|}{Reg}
                & \multicolumn{16}{l|}{16-bit Offset}
                & \multicolumn{16}{l|}{16-bit Offset}
                & Yes \\\hline
                & Yes \\\hline
16-b MPYS(I) & \multicolumn{4}{l|}{4'h4}
16-b {\tt MPYS}(I) & \multicolumn{4}{l|}{4'h4}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{3}{l|}{Cond.}
                & 1'b1 & \multicolumn{4}{l|}{4'hf}
                & 1'b1 & \multicolumn{4}{l|}{4'hf}
                & \multicolumn{16}{l|}{16-bit Offset}
                & \multicolumn{16}{l|}{16-bit Offset}
                & Yes \\\hline
                & Yes \\\hline
ROL & \multicolumn{4}{l|}{4'h5}
{\tt ROL} & \multicolumn{4}{l|}{4'h5}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{21}{l|}{Operand B, truncated to low order 5 bits}
                & \multicolumn{21}{l|}{Operand B, truncated to low order 5 bits}
                & \\\hline
                & \\\hline
LOD & \multicolumn{4}{l|}{4'h6}
{\tt LOD} & \multicolumn{4}{l|}{4'h6}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{4}{l|}{R. Reg}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{21}{l|}{Operand B address}
                & \multicolumn{21}{l|}{Operand B address}
                & \\\hline
                & \\\hline
STO & \multicolumn{4}{l|}{4'h7}
{\tt STO} & \multicolumn{4}{l|}{4'h7}
                & \multicolumn{4}{l|}{D. Reg}
                & \multicolumn{4}{l|}{D. Reg}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{3}{l|}{Cond.}
                & \multicolumn{21}{l|}{Operand B address}
                & \multicolumn{21}{l|}{Operand B address}
                & \\\hline
                & \\\hline
SUB & \multicolumn{4}{l|}{4'h8}
{\tt SUB} & \multicolumn{4}{l|}{4'h8}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{21}{l|}{Operand B}
        &       \multicolumn{21}{l|}{Operand B}
        & Yes \\\hline
        & Yes \\\hline
AND & \multicolumn{4}{l|}{4'h9}
{\tt AND} & \multicolumn{4}{l|}{4'h9}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{21}{l|}{Operand B}
        &       \multicolumn{21}{l|}{Operand B}
        & Yes \\\hline
        & Yes \\\hline
ADD & \multicolumn{4}{l|}{4'ha}
{\tt ADD} & \multicolumn{4}{l|}{4'ha}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{21}{l|}{Operand B}
        &       \multicolumn{21}{l|}{Operand B}
        & Yes \\\hline
        & Yes \\\hline
OR & \multicolumn{4}{l|}{4'hb}
{\tt OR} & \multicolumn{4}{l|}{4'hb}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{21}{l|}{Operand B}
        &       \multicolumn{21}{l|}{Operand B}
        & Yes \\\hline
        & Yes \\\hline
XOR & \multicolumn{4}{l|}{4'hc}
{\tt XOR} & \multicolumn{4}{l|}{4'hc}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{21}{l|}{Operand B}
        &       \multicolumn{21}{l|}{Operand B}
        & Yes \\\hline
        & Yes \\\hline
LSL/ASL & \multicolumn{4}{l|}{4'hd}
{\tt LSL/ASL} & \multicolumn{4}{l|}{4'hd}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}
        &       \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}
        & Yes \\\hline
        & Yes \\\hline
ASR & \multicolumn{4}{l|}{4'he}
{\tt ASR} & \multicolumn{4}{l|}{4'he}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}
        &       \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}
        & Yes \\\hline
        & Yes \\\hline
LSR & \multicolumn{4}{l|}{4'hf}
{\tt LSR} & \multicolumn{4}{l|}{4'hf}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{4}{l|}{R. Reg}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{3}{l|}{Cond.}
        &       \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}
        &       \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}
        & Yes \\\hline
        & Yes \\\hline
\end{tabular}
\end{tabular}
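The field layout in the instruction table can be checked with a short encoder sketch. It assumes the columns pack from bit 31 downward exactly as the header row reads, and it covers only LODI and LODIHI/LODILO. The function names are hypothetical, and the 3-bit condition field is passed through as a raw number because its encoding is defined in another table.

```python
def encode_lodi(reg, imm):
    """LODI: opcode 4'h3 | 4-bit register | 24-bit signed immediate."""
    if not 0 <= reg <= 0xF:
        raise ValueError("register must fit in 4 bits")
    if not -(1 << 23) <= imm < (1 << 23):
        raise ValueError("immediate must fit in 24 signed bits")
    return (0x3 << 28) | (reg << 24) | (imm & 0xFFFFFF)


def encode_ldihilo(high, cond, reg, imm16):
    """LODIHI (high=True) or LODILO (high=False).

    Layout: 4'h4 | 4'hf | 3-bit cond | 1-bit hi/lo select | 4-bit reg | 16-bit imm.
    """
    if not 0 <= cond <= 0x7:
        raise ValueError("condition must fit in 3 bits")
    if not 0 <= reg <= 0xF:
        raise ValueError("register must fit in 4 bits")
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("immediate must fit in 16 bits")
    select = 1 if high else 0
    return ((0x4 << 28) | (0xF << 24) | (cond << 21) | (select << 20)
            | (reg << 16) | imm16)
```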
Line 690... Line 710...
the Zip CPU.  Many of these instructions will have assembly equivalents,
the Zip CPU.  Many of these instructions will have assembly equivalents,
such as the branch instructions, to facilitate working with the CPU.
such as the branch instructions, to facilitate working with the CPU.
\begin{table}\begin{center}
\begin{table}\begin{center}
\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline
\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline
Mapped & Actual  & Notes \\\hline
Mapped & Actual  & Notes \\\hline
ABS Rx
{\tt ABS Rx}
        & \parbox[t]{1.5in}{TST -1,Rx\\NEG.LT Rx}
        & \parbox[t]{1.5in}{\tt TST -1,Rx\\NEG.LT Rx}
        & Absolute value, depends upon derived NEG.\\\hline
        & Absolute value, depends upon derived NEG.\\\hline
\parbox[t]{1.4in}{ADD Ra,Rx\\ADDC Rb,Ry}
\parbox[t]{1.4in}{\tt ADD Ra,Rx\\ADDC Rb,Ry}
        & \parbox[t]{1.5in}{Add Ra,Rx\\ADD.C \$1,Ry\\Add Rb,Ry}
        & \parbox[t]{1.5in}{\tt Add Ra,Rx\\ADD.C \$1,Ry\\Add Rb,Ry}
        & Add with carry \\\hline
        & Add with carry \\\hline
BRA.Cond +/-\$Addr
{\tt BRA.Cond +/-\$Addr}
        & \hbox{MOV.cond \$Addr+PC,PC}
        & \hbox{\tt MOV.cond \$Addr+PC,PC}
        & Branch or jump on condition.  Works for 15--bit
        & Branch or jump on condition.  Works for 15--bit
                signed address offsets.\\\hline
                signed address offsets.\\\hline
BRA.Cond +/-\$Addr
{\tt BRA.Cond +/-\$Addr}
        & \parbox[t]{1.5in}{LDI \$Addr,Rx \\ ADD.cond Rx,PC}
        & \parbox[t]{1.5in}{\tt LDI \$Addr,Rx \\ ADD.cond Rx,PC}
        & Branch/jump on condition.  Works for
        & Branch/jump on condition.  Works for
        23 bit address offsets, but costs a register, an extra instruction,
        23 bit address offsets, but costs a register, an extra instruction,
        and sets the flags. \\\hline
        and sets the flags. \\\hline
BNC PC+\$Addr
{\tt BNC PC+\$Addr}
        & \parbox[t]{1.5in}{Test \$Carry,CC \\ MOV.Z PC+\$Addr,PC}
        & \parbox[t]{1.5in}{\tt Test \$Carry,CC \\ MOV.Z PC+\$Addr,PC}
        & Example of a branch on an unsupported
        & Example of a branch on an unsupported
                condition, in this case a branch on not carry \\\hline
                condition, in this case a branch on not carry \\\hline
BUSY & MOV \$-1(PC),PC & Execute an infinite loop \\\hline
{\tt BUSY } & {\tt MOV \$-1(PC),PC} & Execute an infinite loop \\\hline
CLRF.NZ Rx
{\tt CLRF.NZ Rx }
        & XOR.NZ Rx,Rx
        & {\tt XOR.NZ Rx,Rx}
        & Clear Rx, and flags, if the Z-bit is not set \\\hline
        & Clear Rx, and flags, if the Z-bit is not set \\\hline
CLR Rx
{\tt CLR Rx }
        & LDI \$0,Rx
        & {\tt LDI \$0,Rx}
        & Clears Rx, leaves flags untouched.  This instruction cannot be
        & Clears Rx, leaves flags untouched.  This instruction cannot be
                conditional. \\\hline
                conditional. \\\hline
EXCH.W Rx
{\tt EXCH.W Rx }
        & ROL \$16,Rx
        & {\tt ROL \$16,Rx}
        & Exchanges the top and bottom 16'bit words of Rx \\\hline
        & Exchanges the top and bottom 16'bit words of Rx \\\hline
HALT
{\tt HALT }
        & Or \$SLEEP,CC
        & {\tt Or \$SLEEP,CC}
        & Executed while in interrupt mode.  In user mode this is simply a
        & This only works when issued in interrupt/supervisor mode.  In user
        wait until interrupt instruction. \\\hline
        mode this is simply a wait until interrupt instruction. \\\hline
INT & LDI \$0,CC
{\tt INT } & {\tt LDI \$0,CC} &  \\\hline
        & Since we're using the CC register as a trap vector as well, this
{\tt IRET}
        executes TRAP \#0. \\\hline
        & {\tt OR \$GIE,CC}
IRET
        & Also known as an RTU instruction (Return to Userspace) \\\hline
        & OR \$GIE,CC
{\tt JMP R6+\$Addr}
        & Also an RTU instruction (Return to Userspace) \\\hline
        & {\tt MOV \$Addr(R6),PC}
JMP R6+\$Addr    & MOV \$Addr(R6),PC

        & \\\hline
        & \\\hline
JSR PC+\$Addr
{\tt JSR PC+\$Addr}
        & \parbox[t]{1.5in}{SUB \$1,SP \\\
        & \parbox[t]{1.5in}{\tt SUB \$1,SP \\\
        MOV \$3+PC,R0 \\
        MOV \$3+PC,R0 \\
        STO R0,1(SP) \\
        STO R0,1(SP) \\
        MOV \$Addr+PC,PC \\
        MOV \$Addr+PC,PC \\
        ADD \$1,SP}
        ADD \$1,SP}
        & Jump to Subroutine. Note the required cleanup instruction after
        & Jump to Subroutine. Note the required cleanup instruction after
        returning.  This could easily be turned into a three instruction
        returning.  This could easily be turned into a three instruction
        operand, removing the preliminary stack instruction before and
        operand, removing the preliminary stack instruction before and
        the cleanup after, by adjusting how any stack frame was built for
        the cleanup after, by adjusting how any stack frame was built for
        this routine to include space at the top of the stack for the PC.
        this routine to include space at the top of the stack for the PC.

        Note also that jumping to a subroutine costs a copy register, {\tt R0}

        in this case.
        \\\hline
        \\\hline
JSR PC+\$Addr
{\tt JSR PC+\$Addr  }
        & \parbox[t]{1.5in}{MOV \$3+PC,R12 \\ MOV \$addr+PC,PC}
        & \parbox[t]{1.5in}{\tt MOV \$3+PC,R12 \\ MOV \$addr+PC,PC}
        &This is the high speed
        &This is the high speed
        version of a subroutine call, necessitating a register to hold the
        version of a subroutine call, necessitating a register to hold the
        last PC address.  In its favor, this method doesn't suffer the
        last PC address.  In its favor, this method doesn't suffer the
        mandatory memory access of the other approach. \\\hline
        mandatory memory access of the other approach. \\\hline
LDI.l \$val,Rx
{\tt LDI.l \$val,Rx }
        & \parbox[t]{1.5in}{LDIHI (\$val$>>$16)\&0x0ffff, Rx \\
        & \parbox[t]{1.8in}{\tt LDIHI (\$val$>>$16)\&0x0ffff, Rx \\
                        LDILO (\$val \& 0x0ffff)}
                        LDILO (\$val\&0x0ffff),Rx}
        & Sadly, there's not enough instruction
        & Sadly, there's not enough instruction
                space to load a complete immediate value into any register.
                space to load a complete immediate value into any register.
                Therefore, fully loading any register takes two cycles.
                Therefore, fully loading any register takes two cycles.
                The LDIHI (load immediate high) and LDILO (load immediate low)
                The LDIHI (load immediate high) and LDILO (load immediate low)
                instructions have been created to facilitate this. \\\hline
                instructions have been created to facilitate this. \\\hline
Line 765... Line 785...
\caption{Derived Instructions}\label{tbl:derived-1}
\caption{Derived Instructions}\label{tbl:derived-1}
\end{center}\end{table}
\end{center}\end{table}
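The LDI.l row above splits a 32-bit constant into the two 16-bit immediates consumed by LDIHI and LDILO. A minimal sketch of that arithmetic follows; the helper names are hypothetical and the code models only the operand computation, not the assembler itself.

```python
def split_ldi(val):
    """Return (hi16, lo16): the immediates for LDIHI and LDILO respectively."""
    val &= 0xFFFFFFFF                 # treat negative inputs as 32-bit two's complement
    hi16 = (val >> 16) & 0xFFFF       # operand of LDIHI
    lo16 = val & 0xFFFF               # operand of LDILO
    return hi16, lo16


def join_ldi(hi16, lo16):
    """Reassemble the 32-bit register value produced by the two loads."""
    return ((hi16 & 0xFFFF) << 16) | (lo16 & 0xFFFF)
```

Splitting and rejoining any 32-bit value should return the original, which is a quick way to sanity check an assembler's handling of this derived instruction.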
\begin{table}\begin{center}
\begin{table}\begin{center}
\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline
\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline
Mapped & Actual  & Notes \\\hline
Mapped & Actual  & Notes \\\hline
LOD.b \$addr,Rx
{\tt LOD.b \$addr,Rx}
        & \parbox[t]{1.5in}{%
        & \parbox[t]{1.5in}{\tt %
        LDI     \$addr,Ra \\
        LDI \$addr,Ra \\
        LDI     \$addr,Rb \\
        LDI \$addr,Rb \\
        LSR     \$2,Ra \\
        LSR \$2,Ra \\
        AND     \$3,Rb \\
        AND \$3,Rb \\
        LOD     (Ra),Rx \\
        LOD     (Ra),Rx \\
Line 786... Line 806...
        all other addresses in this document are 32-bit wordlength addresses.
        all other addresses in this document are 32-bit wordlength addresses.
        For this reason,
        For this reason,
        we needed to drop the bottom two bits.  This also limits the address
        we needed to drop the bottom two bits.  This also limits the address
        space of character accesses using this method from 16 MB down to 4MB.}
        space of character accesses using this method from 16 MB down to 4MB.}
                \\\hline
                \\\hline
\parbox[t]{1.5in}{LSL \$1,Rx\\ LSLC \$1,Ry}
\parbox[t]{1.5in}{\tt LSL \$1,Rx\\ LSLC \$1,Ry}
        & \parbox[t]{1.5in}{LSL \$1,Ry \\
        & \parbox[t]{1.5in}{\tt LSL \$1,Ry \\
        LSL \$1,Rx \\
        LSL \$1,Rx \\
        OR.C \$1,Ry}
        OR.C \$1,Ry}
        & Logical shift left with carry.  Note that the
        & Logical shift left with carry.  Note that the
        instruction order is now backwards, to keep the conditions valid.
        instruction order is now backwards, to keep the conditions valid.
        That is, LSL sets the carry flag, so if we did this the other way
        That is, LSL sets the carry flag, so if we did this the other way
        with Rx before Ry, then the condition flag wouldn't have been right
        with Rx before Ry, then the condition flag wouldn't have been right
        for an OR correction at the end. \\\hline
        for an OR correction at the end. \\\hline
\parbox[t]{1.5in}{LSR \$1,Rx \\ LSRC \$1,Ry}
\parbox[t]{1.5in}{\tt LSR \$1,Rx \\ LSRC \$1,Ry}
        & \parbox[t]{1.5in}{CLR Rz \\
        & \parbox[t]{1.5in}{\tt CLR Rz \\
        LSR \$1,Ry \\
        LSR \$1,Ry \\
        LDIHI.C \$8000h,Rz \\
        LDIHI.C \$8000h,Rz \\
        LSR \$1,Rx \\
        LSR \$1,Rx \\
        OR Rz,Rx}
        OR Rz,Rx}
        & Logical shift right with carry \\\hline
        & Logical shift right with carry \\\hline
NEG Rx & \parbox[t]{1.5in}{XOR \$-1,Rx \\ ADD \$1,Rx} & \\\hline
{\tt NEG Rx} & \parbox[t]{1.5in}{\tt XOR \$-1,Rx \\ ADD \$1,Rx} & \\\hline
NEG.C Rx & \parbox[t]{1.5in}{MOV.C \$-1+Rx,Rx\\XOR.C \$-1,Rx} & \\\hline
{\tt NEG.C Rx} & \parbox[t]{1.5in}{\tt MOV.C \$-1+Rx,Rx\\XOR.C \$-1,Rx} & \\\hline
NOOP & NOOP & While there are many
{\tt NOOP} & {\tt NOOP} & While there are many
        operations that do nothing, such as MOV Rx,Rx, or OR \$0,Rx, these
        operations that do nothing, such as MOV Rx,Rx, or OR \$0,Rx, these
        operations have consequences in that they might stall the bus if
        operations have consequences in that they might stall the bus if
        Rx isn't ready yet.  For this reason, we have a dedicated NOOP
        Rx isn't ready yet.  For this reason, we have a dedicated NOOP
        instruction. \\\hline
        instruction. \\\hline
NOT Rx & XOR \$-1,Rx & \\\hline
{\tt NOT Rx } & {\tt XOR \$-1,Rx } & \\\hline
POP Rx
{\tt POP Rx }
        & \parbox[t]{1.5in}{LOD \$1(SP),Rx \\ ADD \$1,SP}
        & \parbox[t]{1.5in}{\tt LOD \$1(SP),Rx \\ ADD \$1,SP}
        & Note
        & Note
        that for interrupt purposes, one can never depend upon the value at
        that for interrupt purposes, one can never depend upon the value at
        (SP).  Hence you read from it, then increment it, lest having
        (SP).  Hence you read from it, then increment it, lest having
        incremented it first something then comes along and writes to that
        incremented it first something then comes along and writes to that
        value before you can read the result. \\\hline
        value before you can read the result. \\\hline
\end{tabular}
\end{tabular}
\caption{Derived Instructions, continued}\label{tbl:derived-2}
\caption{Derived Instructions, continued}\label{tbl:derived-2}
\end{center}\end{table}
\end{center}\end{table}
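The LSL/LSLC row explains why the high word is shifted before the low word. The sketch below models that three-instruction sequence on a 64-bit value held in two 32-bit registers. It assumes that LSL leaves the bit shifted out of bit 31 in the carry flag; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def lsl1(reg):
    """Model LSL $1 on a 32-bit register: return (result, carry_out)."""
    return (reg << 1) & MASK32, (reg >> 31) & 1


def lsl64_by_one(rx, ry):
    """Shift the 64-bit pair (ry:rx) left by one, following the table's order."""
    ry, _ = lsl1(ry)        # LSL $1,Ry  (high word first)
    rx, carry = lsl1(rx)    # LSL $1,Rx  (sets carry from the low word)
    if carry:               # OR.C $1,Ry (executes only if carry is set)
        ry |= 1
    return rx, ry
```

Shifting Rx last means the carry seen by the conditional OR is the one produced by the low word, which is the point of the reversed instruction order.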
\begin{table}\begin{center}
\begin{table}\begin{center}
\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline
\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline
PUSH Rx
{\tt PUSH Rx}
        & \parbox[t]{1.5in}{SUB \$1,SP \\
        & \parbox[t]{1.5in}{SUB \$1,SP \\
        STO Rx,\$1(SP)}
        STO Rx,\$1(SP)}
        & \\\hline
        & Note that for pipelined operation, it helps to coalesce all the
PUSH Rx-Ry
        {\tt SUB}'s into one command, and place the {\tt STO}'s right
        & \parbox[t]{1.5in}{SUB \$n,SP \\
        after each other.\\\hline
        {\tt PUSH Rx-Ry}
        & \parbox[t]{1.5in}{\tt SUB \$n,SP \\
        STO Rx,\$n(SP)
        STO Rx,\$n(SP)
        \ldots \\
        \ldots \\
        STO Ry,\$1(SP)}
        STO Ry,\$1(SP)}
        & Multiple pushes at once only need the single subtract from the
        & Multiple pushes at once only need the single subtract from the
        stack pointer.  This derived instruction is analogous to a similar one
        stack pointer.  This derived instruction is analogous to a similar one
        on the Motorola 68k architecture, although the Zip Assembler
        on the Motorola 68k architecture, although the Zip Assembler
        does not support this instruction (yet).\\\hline
        does not support this instruction (yet).  This instruction
RESET
        also supports pipelined memory access.\\\hline
        & \parbox[t]{1in}{STO \$1,\$watchdog(R12)\\NOOP\\NOOP}
{\tt RESET}
        & \parbox[t]{3in}{This depends upon the peripheral base address being
        & \parbox[t]{1in}{\tt STO \$1,\$watchdog(R12)\\NOOP\\NOOP}

        & This depends upon the peripheral base address being
        in R12.
        in R12.
 
 
        Another opportunity might be to jump to the reset address from within
        Another opportunity might be to jump to the reset address from within
        supervisor mode.}\\\hline
        supervisor mode.\\\hline
RET & \parbox[t]{1.5in}{LOD \$1(SP),PC}
{\tt RET} & \parbox[t]{1.5in}{\tt LOD \$1(SP),PC}
        & Note that this depends upon the calling context to clean up the
        & Note that this depends upon the calling context to clean up the
        stack, as outlined for the JSR instruction.  \\\hline
        stack, as outlined for the JSR instruction.  \\\hline
RET & MOV R12,PC
{\tt RET} & {\tt MOV R12,PC}
        & This is the high(er) speed version, that doesn't touch the stack.
        & This is the high(er) speed version, that doesn't touch the stack.
        As such, it doesn't suffer a stall on memory read/write to the stack.
        As such, it doesn't suffer a stall on memory read/write to the stack.
        \\\hline
        \\\hline
STEP Rr,Rt
{\tt STEP Rr,Rt}
        & \parbox[t]{1.5in}{LSR \$1,Rr \\ XOR.C Rt,Rr}
        & \parbox[t]{1.5in}{\tt LSR \$1,Rr \\ XOR.C Rt,Rr}
        & Step a Galois implementation of a Linear Feedback Shift Register, Rr,
        & Step a Galois implementation of a Linear Feedback Shift Register, Rr,
                using taps Rt \\\hline
                using taps Rt \\\hline
STO.b Rx,\$addr
{\tt STO.b Rx,\$addr}
        & \parbox[t]{1.5in}{%
        & \parbox[t]{1.5in}{\tt %
        LDI \$addr,Ra \\
        LDI \$addr,Ra \\
        LDI \$addr,Rb \\
        LDI \$addr,Rb \\
        LSR \$2,Ra \\
        LSR \$2,Ra \\
        AND \$3,Rb \\
        AND \$3,Rb \\
        SUB \$32,Rb \\
        SUB \$32,Rb \\
        LOD (Ra),Ry \\
        LOD (Ra),Ry \\
        AND \$0ffh,Rx \\
        AND \$0ffh,Rx \\
        AND \$-0ffh,Ry \\
        AND \~\$0ffh,Ry \\
        ROL Rb,Rx \\
        ROL Rb,Rx \\
        OR Rx,Ry \\
        OR Rx,Ry \\
        STO Ry,(Ra) }
        STO Ry,(Ra) }
        & \parbox[t]{3in}{This CPU and its bus are {\em not} optimized
        & \parbox[t]{3in}{This CPU and its bus are {\em not} optimized
        for byte-wise operations.
        for byte-wise operations.
Line 875... Line 898...
        byte-wise address, whereas in all of our other examples it is a
        byte-wise address, whereas in all of our other examples it is a
        32-bit word address. This also limits the address space
        32-bit word address. This also limits the address space
        of character accesses from 16 MB down to 4MB.
        of character accesses from 16 MB down to 4MB.
        Further, this instruction implies a byte ordering,
        Further, this instruction implies a byte ordering,
        such as big or little endian.} \\\hline
        such as big or little endian.} \\\hline
SWAP Rx,Ry
{\tt SWAP Rx,Ry }
        & \parbox[t]{1.5in}{
        & \parbox[t]{1.5in}{\tt
        XOR Ry,Rx \\
        XOR Ry,Rx \\
        XOR Rx,Ry \\
        XOR Rx,Ry \\
        XOR Ry,Rx}
        XOR Ry,Rx}
        & While no extra registers are needed, this example
        & While no extra registers are needed, this example
        does take 3-clocks. \\\hline
        does take 3-clocks. \\\hline
TRAP \#X
{\tt TRAP \#X}
        & \parbox[t]{1.5in}{LDI \$x,R0 \\ AND ~\$GIE,CC }
        & \parbox[t]{1.5in}{\tt LDI \$x,R0 \\ AND \~\$GIE,CC }
        & This works because whenever a user lowers the \$GIE flag, it sets
        & This works because whenever a user lowers the \$GIE flag, it sets
        a TRAP bit within the CC register.  Therefore, upon entering the
        a TRAP bit within the CC register.  Therefore, upon entering the
        supervisor state, the CPU only need check this bit to know that it
        supervisor state, the CPU only need check this bit to know that it
        got there via a TRAP.  The trap could be made conditional by making
        got there via a TRAP.  The trap could be made conditional by making
        the LDI and the AND conditional.  In that case, the assembler would
        the LDI and the AND conditional.  In that case, the assembler would
Line 896... Line 919...
\end{tabular}
\end{tabular}
\caption{Derived Instructions, continued}\label{tbl:derived-3}
\caption{Derived Instructions, continued}\label{tbl:derived-3}
\end{center}\end{table}
\end{center}\end{table}
\begin{table}\begin{center}
\begin{table}\begin{center}
\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline
\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline
TST Rx
{\tt TST Rx}
        & TST \$-1,Rx
        & {\tt TST \$-1,Rx}
        & Set the condition codes based upon Rx.  Could also do a CMP \$0,Rx,
        & Set the condition codes based upon Rx. Could also do a CMP \$0,Rx,
        ADD \$0,Rx, SUB \$0,Rx, etc, AND \$-1,Rx, etc. The TST and CMP
        ADD \$0,Rx, SUB \$0,Rx, etc, AND \$-1,Rx, etc.  The TST and CMP
        approaches won't stall future pipeline stages looking for the value
        approaches won't stall future pipeline stages looking for the value
        of Rx. \\\hline
        of Rx. \\\hline
WAIT
{\tt WAIT}
        & Or \$SLEEP,CC
        & {\tt Or \$GIE | \$SLEEP,CC}
        & Wait 'til interrupt.  In an interrupts disabled context, this
        & Wait until the next interrupt, then jump to supervisor/interrupt
        becomes a HALT instruction.
        mode.
\end{tabular}
\end{tabular}
\caption{Derived Instructions, continued}\label{tbl:derived-4}
\caption{Derived Instructions, continued}\label{tbl:derived-4}
\end{center}\end{table}
\end{center}\end{table}
\section{Pipeline Stages}
\section{Pipeline Stages}
As mentioned in the introduction, and highlighted in Fig.~\ref{fig:cpu},
As mentioned in the introduction, and highlighted in Fig.~\ref{fig:cpu},
Line 1071... Line 1094...
In this case, the LOD instruction cannot start until the STO is finished.
In this case, the LOD instruction cannot start until the STO is finished.
With proper scheduling, it is possible to do something in the ALU while the
With proper scheduling, it is possible to do something in the ALU while the
memory unit is busy with the STO instruction, but otherwise this pipeline will
memory unit is busy with the STO instruction, but otherwise this pipeline will
stall waiting for it to complete.
stall waiting for it to complete.
 
 
Note that even though the Wishbone bus can support pipelined accesses at
The Zip CPU does have the capability of supporting pipelined memory access,
one access per clock, only the prefetch stage can take advantage of this.
but only under the following conditions: all accesses within the pipeline
Load and Store instructions are stuck at one wishbone cycle per instruction.
must be either all reads or all writes, all must use the same register for their

address, and there can be no stalls or other instructions between pipelined

memory access instructions.  Further, the offset to memory must be increasing

by one address each instruction.  These conditions work well for saving

registers to, or restoring them from, the stack.
 
 
\item When waiting for a conditional memory read operation to complete
\item When waiting for a conditional memory read operation to complete
\begin{enumerate}
\begin{enumerate}
\item\ {\tt LOD.Z address,RA}
\item\ {\tt LOD.Z address,RA}
\item\ {\em (multiple stalls, bus dependent, 7 clocks best)}
\item\ {\em (multiple stalls, bus dependent, 7 clocks best)}
Line 1233... Line 1260...
restart its transfer by writing the contents of its internal buffer and then
restart its transfer by writing the contents of its internal buffer and then
re-entering its read cycle again.
re-entering its read cycle again.
 
 
When coupled with a peripheral, the DMA controller can be configured to start
When coupled with a peripheral, the DMA controller can be configured to start
a memory copy on an interrupt line going high.  Further, the controller can be
a memory copy on an interrupt line going high.  Further, the controller can be
configured to issue reads from (or two) the same address instead of incrementing
configured to issue reads from (or to) the same address instead of incrementing
the address at each clock.  The DMA completes once the total number of items
the address at each clock.  The DMA completes once the total number of items
specified (not the transfer length) have been transferred.
specified (not the transfer length) have been transferred.
 
 
In each case, once the transfer is complete and the DMA unit returns to
In each case, once the transfer is complete and the DMA unit returns to
idle, the DMA will issue an interrupt.
idle, the DMA will issue an interrupt.
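
As a rough illustration of the fixed-address option described above, the Python sketch below assembles a DMA control word from the bit assignments listed in Tbl.~\ref{tbl:dmacbits} (bit 29 holds the source address, bit 28 holds the destination address, bits 27 through 16 carry the key 12'hfed) and decodes the two status bits (31, active; 30, Wishbone error). It is not part of the Zip CPU sources: the names {\tt dma\_control} and {\tt dma\_status} are hypothetical, and the low sixteen bits of the register are left at zero because this sketch does not model them.

```python
DMA_KEY = 0xFED          # 12-bit key, written to bits 27..16
HOLD_SOURCE = 1 << 29    # '1' keeps the source address fixed
HOLD_DEST = 1 << 28      # '1' keeps the destination address fixed
ACTIVE = 1 << 31         # read-only: DMA active
BUS_ERROR = 1 << 30      # read-only: Wishbone error, transaction aborted


def dma_control(hold_source=False, hold_dest=False):
    """Build the upper half of a control word (hypothetical helper).

    Bits 15..0 are deliberately left at zero; they are not modeled here.
    """
    word = DMA_KEY << 16
    if hold_source:
        word |= HOLD_SOURCE
    if hold_dest:
        word |= HOLD_DEST
    return word


def dma_status(word):
    """Decode the two status bits from a value read back from the register."""
    return {
        "active": bool(word & ACTIVE),
        "error": bool(word & BUS_ERROR),
    }
```

For a copy that reads repeatedly from a single peripheral address, {\tt dma\_control(hold\_source=True)} evaluates to {\tt 0x2fed0000}.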
Line 1400... Line 1427...
        onto the user stack and then copying the resulting stack address
        onto the user stack and then copying the resulting stack address
        into the task's task structure, as shown in Tbl.~\ref{tbl:context-out}.
        into the task's task structure, as shown in Tbl.~\ref{tbl:context-out}.
\begin{table}\begin{center}
\begin{table}\begin{center}
\begin{tabular}{ll}
\begin{tabular}{ll}
{\tt swap\_out:} \\
{\tt swap\_out:} \\
&        {\tt MOV -15(uSP),R1} \\
&        {\tt MOV -15(uSP),R5} \\
&        {\tt STO R1,stack(R12)} \\
&        {\tt STO R5,stack(R12)} \\
&        {\tt MOV uPC,R0} \\
&        {\tt MOV uR0,R0} \\
&        {\tt STO R0,15(R1)} \\
&        {\tt MOV uR1,R1} \\
&        {\tt MOV uCC,R0} \\
&        {\tt MOV uR2,R2} \\
&        {\tt STO R0,14(R1)} \\
&        {\tt MOV uR3,R3} \\

&        {\tt MOV uR4,R4} \\

&        {\tt STO R0,1(R5)} {\em ; Exploit memory pipelining: }\\

&        {\tt STO R1,2(R5)} {\em ; All instructions write to stack }\\

&        {\tt STO R2,3(R5)} {\em ; All offsets increment by one }\\

&        {\tt STO R3,4(R5)} {\em ; Longest pipeline is 5 cycles.}\\

&        {\tt STO R4,5(R5)} \\

        & \ldots {\em ; Need to repeat for all user registers} \\

\iffalse

&        {\tt MOV uR5,R0} \\

&        {\tt MOV uR6,R1} \\

&        {\tt MOV uR7,R2} \\

&        {\tt MOV uR8,R3} \\

&        {\tt MOV uR9,R4} \\

&        {\tt STO R0,6(R5) }\\

&        {\tt STO R1,7(R5) }\\

&        {\tt STO R2,8(R5) }\\

&        {\tt STO R3,9(R5) }\\

&        {\tt STO R4,10(R5)} \\

\fi

&        {\tt MOV uR10,R0} \\

&        {\tt MOV uR11,R1} \\

&        {\tt MOV uR12,R2} \\

&        {\tt MOV uCC,R3} \\

&        {\tt MOV uPC,R4} \\

&        {\tt STO R0,11(R5)}\\

&        {\tt STO R1,12(R5)}\\

&        {\tt STO R2,13(R5)}\\

&        {\tt STO R3,14(R5)}\\

&        {\tt STO R4,15(R5)} \\
&       {\em ; We can skip storing the stack, uSP, since it'll be stored}\\
&       {\em ; We can skip storing the stack, uSP, since it'll be stored}\\
&       {\em ; elsewhere (in the task structure) }\\
&       {\em ; elsewhere (in the task structure) }\\
&        {\tt MOV uR13,R0} \\

&        {\tt STO R0,13(R1)} \\

        & \ldots {\em ; Need to repeat for all user registers} \\

&        {\tt MOV uR0,R0} \\

&        {\tt STO R0,1(R1)} \\

\end{tabular}
\end{tabular}
\caption{Example Storing User Task Context}\label{tbl:context-out}
\caption{Example Storing User Task Context}\label{tbl:context-out}
\end{center}\end{table}
\end{center}\end{table}
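
The grouped stores in Tbl.~\ref{tbl:context-out} are laid out to meet the pipelined memory access conditions: every access in a run has the same direction, uses the same base register, has no other instruction in between, and steps the offset up by one. The Python sketch below is one way to check an instruction listing against those conditions. It is not part of the Zip CPU tool chain, the names {\tt parse} and {\tt pipelined\_runs} are hypothetical, and it makes two simplifying assumptions: only numeric offsets are recognized (anything else, including a symbolic offset such as {\tt stack(R12)}, ends a run), and no upper limit on run length is modeled.

```python
import re
from typing import List, NamedTuple, Optional


class MemOp(NamedTuple):
    op: str       # "LOD" or "STO"
    offset: int   # numeric offset from the base register
    base: str     # base (address) register name


# STO Rx,off(Rb)  and  LOD off(Rb),Rx, with a numeric offset only.
_STO = re.compile(r"^STO\s+\w+\s*,\s*(-?\d+)\((\w+)\)$")
_LOD = re.compile(r"^LOD\s+(-?\d+)\((\w+)\)\s*,\s*\w+$")


def parse(line: str) -> Optional[MemOp]:
    """Return a MemOp for a recognized load or store, else None."""
    line = line.strip()
    m = _STO.match(line)
    if m:
        return MemOp("STO", int(m.group(1)), m.group(2))
    m = _LOD.match(line)
    if m:
        return MemOp("LOD", int(m.group(1)), m.group(2))
    return None


def pipelined_runs(program: List[str]) -> List[int]:
    """Lengths of the maximal runs of memory operations that satisfy
    the same-direction, same-base, offset-plus-one conditions."""
    runs: List[int] = []
    prev: Optional[MemOp] = None
    for line in program:
        cur = parse(line)
        if cur is None:
            prev = None          # any other instruction breaks the run
            continue
        if (prev is not None and cur.op == prev.op
                and cur.base == prev.base
                and cur.offset == prev.offset + 1):
            runs[-1] += 1        # extends the current run
        else:
            runs.append(1)       # starts a new run
        prev = cur
    return runs
```

Under these assumptions, five back-to-back stores to offsets 1 through 5 of R5 form a single run, whereas alternating {\tt MOV} and {\tt STO} instructions yield only runs of length one.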
For the sake of discussion, we assume the supervisor maintains a
For the sake of discussion, we assume the supervisor maintains a
pointer to the current task's structure in supervisor register
pointer to the current task's structure in supervisor register
Line 1507... Line 1558...
        back off of the stack to run this task.  An example of this is
        back off of the stack to run this task.  An example of this is
        shown in Tbl.~\ref{tbl:context-in},
        shown in Tbl.~\ref{tbl:context-in},
\begin{table}\begin{center}
\begin{table}\begin{center}
\begin{tabular}{ll}
\begin{tabular}{ll}
{\tt swap\_in:} \\
{\tt swap\_in:} \\
&       {\tt LOD stack(R12),R1} \\
&       {\tt LOD stack(R12),R5} \\
&       {\tt MOV 15(R1),uSP} \\
&       {\tt MOV 15(R1),uSP} \\
&       {\tt LOD 15(R1),R0} \\
        & {\em ; Be sure to exploit the memory pipelining capability} \\
&       {\tt MOV R0,uPC} \\
&       {\tt LOD 1(R5),R0} \\
&       {\tt LOD 14(R1),R0} \\
&       {\tt LOD 2(R5),R1} \\
&       {\tt MOV R0,uCC} \\
&       {\tt LOD 3(R5),R2} \\
&       {\tt LOD 13(R1),R0} \\
&       {\tt LOD 4(R5),R3} \\
&       {\tt MOV R0,uR12} \\
&       {\tt LOD 5(R5),R4} \\
        & \ldots {\em ; Need to repeat for all user registers} \\

&       {\tt LOD 1(R1),R0} \\

&       {\tt MOV R0,uR0} \\
&       {\tt MOV R0,uR0} \\

&       {\tt MOV R1,uR1} \\

&       {\tt MOV R2,uR2} \\

&       {\tt MOV R3,uR3} \\

&       {\tt MOV R4,uR4} \\

        & \ldots {\em ; Need to repeat for all user registers} \\

&       {\tt LOD 11(R5),R0} \\

&       {\tt LOD 12(R5),R1} \\

&       {\tt LOD 13(R5),R2} \\

&       {\tt LOD 14(R5),R3} \\

&       {\tt LOD 15(R5),R4} \\

&       {\tt MOV R0,uR10} \\

&       {\tt MOV R1,uR11} \\

&       {\tt MOV R2,uR12} \\

&       {\tt MOV R3,uCC} \\

&       {\tt MOV R4,uPC} \\

 
&       {\tt BRA return\_to\_user} \\
&       {\tt BRA return\_to\_user} \\
\end{tabular}
\end{tabular}
\caption{Example Restoring User Task Context}\label{tbl:context-in}
\caption{Example Restoring User Task Context}\label{tbl:context-in}
\end{center}\end{table}
\end{center}\end{table}
        assuming as before that the task
        assuming as before that the task
Line 1714... Line 1779...
 
 
The bit allocation of the control register is shown in Tbl.~\ref{tbl:dmacbits}.
The bit allocation of the control register is shown in Tbl.~\ref{tbl:dmacbits}.
\begin{table}\begin{center}
\begin{table}\begin{center}
\begin{bitlist}
\begin{bitlist}
31 & R & DMA Active\\\hline
31 & R & DMA Active\\\hline
30 & R & Wishbone error, transaction aborted (cleared on any write)\\\hline
30 & R & Wishbone error, transaction aborted.  This bit is cleared the next time

        this register is written to.\\\hline
29 & R/W & Set to '1' to prevent the controller from incrementing the source address, '0' for normal memory copy. \\\hline
29 & R/W & Set to '1' to prevent the controller from incrementing the source address, '0' for normal memory copy. \\\hline
28 & R/W & Set to '0' to prevent the controller from incrementing the
28 & R/W & Set to '1' to prevent the controller from incrementing the
        destination address, '0' for normal memory copy. \\\hline
        destination address, '0' for normal memory copy. \\\hline
27 \ldots 16 & W & The DMA Key.  Write 12'hfed to these bits to
27 \ldots 16 & W & The DMA Key.  Write 12'hfed to these bits to
        activate any DMA transfer.  \\\hline
        activate any DMA transfer.  \\\hline
27 & R & Always reads '0', to force the deliberate writing of the key. \\\hline
27 & R & Always reads '0', to force the deliberate writing of the key. \\\hline
26 \ldots 16 & R & Indicates the number of items in the transfer buffer that
26 \ldots 16 & R & Indicates the number of items in the transfer buffer that
Line 1793... Line 1859...
uSP & 29 & 32 & R/W & User Stack Pointer\\\hline
uSP & 29 & 32 & R/W & User Stack Pointer\\\hline
uCC & 30 & 32 & R/W & User Condition Code Register \\\hline
uCC & 30 & 32 & R/W & User Condition Code Register \\\hline
uPC & 31 & 32 & R/W & User Program Counter\\\hline
uPC & 31 & 32 & R/W & User Program Counter\\\hline
PIC & 32 & 32 & R/W & Primary Interrupt Controller \\\hline
PIC & 32 & 32 & R/W & Primary Interrupt Controller \\\hline
WDT & 33 & 32 & R/W & Watchdog Timer\\\hline
WDT & 33 & 32 & R/W & Watchdog Timer\\\hline
CCHE & 34 & 32 & R/W & Manual Cache Controller\\\hline

CTRIC & 35 & 32 & R/W & Secondary Interrupt Controller\\\hline
CTRIC & 35 & 32 & R/W & Secondary Interrupt Controller\\\hline
TMRA & 36 & 32 & R/W & Timer A\\\hline
TMRA & 36 & 32 & R/W & Timer A\\\hline
TMRB & 37 & 32 & R/W & Timer B\\\hline
TMRB & 37 & 32 & R/W & Timer B\\\hline
TMRC & 38 & 32 & R/W & Timer C\\\hline
TMRC & 38 & 32 & R/W & Timer C\\\hline
JIFF & 39 & 32 & R/W & Jiffies peripheral\\\hline
JIFF & 39 & 32 & R/W & Jiffies peripheral\\\hline
Line 1807... Line 1872...
MICNT & 43 & 32 & R/W & Master instruction counter\\\hline
MICNT & 43 & 32 & R/W & Master instruction counter\\\hline
UTASK & 44 & 32 & R/W & User task clock counter\\\hline
UTASK & 44 & 32 & R/W & User task clock counter\\\hline
UMSTL & 45 & 32 & R/W & User memory stall counter\\\hline
UMSTL & 45 & 32 & R/W & User memory stall counter\\\hline
UPSTL & 46 & 32 & R/W & User Pre-Fetch Stall counter\\\hline
UPSTL & 46 & 32 & R/W & User Pre-Fetch Stall counter\\\hline
UICNT & 47 & 32 & R/W & User instruction counter\\\hline
UICNT & 47 & 32 & R/W & User instruction counter\\\hline

DMACMD & 48 & 32 & R/W & DMA command and status register\\\hline

DMALEN & 49 & 32 & R/W & DMA transfer length\\\hline

DMARD & 50 & 32 & R/W & DMA read address\\\hline

DMAWR & 51 & 32 & R/W & DMA write address\\\hline
\end{reglist}
\end{reglist}
\caption{Debug Register Addresses}\label{tbl:dbgaddrs}
\caption{Debug Register Addresses}\label{tbl:dbgaddrs}
\end{center}\end{table}
\end{center}\end{table}
Primarily, these ``registers'' include access to the entire CPU register
Primarily, these ``registers'' include access to the entire CPU register
set, as well as the internal peripherals.  To read one of these registers
set, as well as the internal peripherals.  To read one of these registers
Line 2113... Line 2182...
        (yet) support a compiler. The standard C library is an even longer
        (yet) support a compiler. The standard C library is an even longer
        shot. My dream of having binutils and gcc support has not been
        shot. My dream of having binutils and gcc support has not been
        realized and at this rate may not be realized. (I've been intimidated
        realized and at this rate may not be realized. (I've been intimidated
        by the challenge every time I've looked through those codes.)
        by the challenge every time I've looked through those codes.)
 
 

\iffalse
\item While the Wishbone Bus (B4) supports a pipelined mode with single cycle
\item While the Wishbone Bus (B4) supports a pipelined mode with single cycle
        execution, the Zip CPU is unable to exploit this parallelism. Instead,
        execution, the Zip CPU is unable to exploit this parallelism. Instead,
        apart from the DMA and the pipelined prefetch, all loads and stores
        apart from the DMA and the pipelined prefetch, all loads and stores
        are single wishbone bus operations requiring a minimum of 3 clocks.
        are single wishbone bus operations requiring a minimum of 3 clocks.
        (In practice, this has turned into 7-clocks.)
        (In practice, this has turned into 7-clocks.)

        % Addressed, 20150929
 
 
\iffalse

\item There is no control over whether or not an instruction sets the
\item There is no control over whether or not an instruction sets the
        condition codes--certain instructions always set the condition codes,
        condition codes--certain instructions always set the condition codes,
        other instructions never set them. This effectively limits conditional
        other instructions never set them. This effectively limits conditional
        instructions to a single instruction only (with two or more
        instructions to a single instruction only (with two or more
        instructions as an exception), as the first instruction that sets
        instructions as an exception), as the first instruction that sets
Line 2171... Line 2241...
        the process accounting registers are anything but light weight, why
        the process accounting registers are anything but light weight, why
        keep them?  Why not instead make some compile flags that just turn them
        keep them?  Why not instead make some compile flags that just turn them
        off, keeping the CPU lightweight?  The same holds for the prefetch
        off, keeping the CPU lightweight?  The same holds for the prefetch
        cache.
        cache.
 
 

\item The `{\tt .V}' condition was never used in any code other than my test
 
        code.  Suggest changing it to a `{\tt .LE}' condition, which seems

        to be more useful.

 

\item {\bf Consider a more traditional Instruction Cache.}  The current

        pipelined instruction cache just reads a window of memory into

        its cache.  If the CPU leaves that window, the entire cache is

        invalidated.  A more traditional cache, however, might allow

        common subroutines to stay within the cache without invalidating the

        entire cache structure.

 
\iffalse
\iffalse
\item {\bf Adjust the Zip CPU so that conditional instructions do not set
\item {\bf Adjust the Zip CPU so that conditional instructions do not set
        flags}, although they may explicitly set condition codes if writing
        flags}, although they may explicitly set condition codes if writing
        to the CC register.
        to the CC register.
 
 
        This is a simple change to the core, and may show up in new releases.
        This is a simple change to the core, and may show up in new releases.
        % Fixed, 20150918
        % Fixed, 20150918
\fi

 

\item The `{\tt .V}' condition was never used in any code other than my test
 
        code.  Suggest changing it to a `{\tt .LE}' condition, which seems

        to be more useful.

 
 
\iffalse

\item Add in an {\bf unpredictable branch delay slot}, so that on any branch
\item Add in an {\bf unpredictable branch delay slot}, so that on any branch
        the delay slot may or may not be executed before the branch.
        the delay slot may or may not be executed before the branch.
        Instructions that do not depend upon the branch, and that should be
        Instructions that do not depend upon the branch, and that should be
        executed were the branch not taken, could be placed into the delay
        executed were the branch not taken, could be placed into the delay
        slot. Thus, if the branch isn't taken, we wouldn't suffer the stall,
        slot. Thus, if the branch isn't taken, we wouldn't suffer the stall,
Line 2224... Line 2299...
        for one cycle before starting again, these extra cycles add up.
        for one cycle before starting again, these extra cycles add up.
        It should be possible to tell the prefetch stage to give up the bus
        It should be possible to tell the prefetch stage to give up the bus
        as soon as the decoder knows the instruction will need the bus.
        as soon as the decoder knows the instruction will need the bus.
        Indeed, if done in the decode stage, this might drop the seven cycle
        Indeed, if done in the decode stage, this might drop the seven cycle
        access down by two cycles.
        access down by two cycles.
 

        % FIXED: 20150918
        % FIXED: 20150918
\fi

 
 
\item {\bf Consider a more traditional Instruction Cache.}  The current

        pipelined instruction cache just reads a window of memory into

        its cache.  If the CPU leaves that window, the entire cache is

        invalidated.  A more traditional cache, however, might allow

        common subroutines to stay within the cache without invalidating the

        entire cache structure.

 

\iffalse

\item {\bf Very Long Instruction Word (VLIW).}  Now, to speed up operation, I
\item {\bf Very Long Instruction Word (VLIW).}  Now, to speed up operation, I
        propose that the Zip CPU instruction set be modified towards a Very
        propose that the Zip CPU instruction set be modified towards a Very
        Long Instruction Word (VLIW) implementation. In this implementation,
        Long Instruction Word (VLIW) implementation. In this implementation,
        an instruction word may contain either one or two separate
        an instruction word may contain either one or two separate
        instructions. The first instruction would take up the high order bits,
        instructions. The first instruction would take up the high order bits,