Rev 37	Rev 39
Line 46...	Line 46...
`\documentclass{gqtekspec}`	`\documentclass{gqtekspec}`
`\project{Zip CPU}`	`\project{Zip CPU}`
`\title{Specification}`	`\title{Specification}`
`\author{Dan Gisselquist, Ph.D.}`	`\author{Dan Gisselquist, Ph.D.}`
`\email{dgisselq (at) opencores.org}`	`\email{dgisselq (at) opencores.org}`
`\revision{Rev.~0.4}`	`\revision{Rev.~0.5}`
`\definecolor{webred}{rgb}{0.2,0,0}`	`\definecolor{webred}{rgb}{0.2,0,0}`
`\definecolor{webgreen}{rgb}{0,0.2,0}`	`\definecolor{webgreen}{rgb}{0,0.2,0}`
`\usepackage[dvips,ps2pdf,colorlinks=true,`	`\usepackage[dvips,ps2pdf,colorlinks=true,`
`anchorcolor=black,pagecolor=webgreen,pdfpagelabels,hypertexnames,`	`anchorcolor=black,pagecolor=webgreen,pdfpagelabels,hypertexnames,`
`pdfauthor={Dan Gisselquist},`	`pdfauthor={Dan Gisselquist},`
Line 74...	Line 74...
`You should have received a copy of the GNU General Public License along`	`You should have received a copy of the GNU General Public License along`
`with this program. If not, see \hbox{<http://www.gnu.org/licenses/>} for a`	`with this program. If not, see \hbox{<http://www.gnu.org/licenses/>} for a`
`copy.`	`copy.`
`\end{license}`	`\end{license}`
`\begin{revisionhistory}`	`\begin{revisionhistory}`
	`0.5 & 9/29/2015 & Gisselquist & Added pipelined memory access discussion.\\\hline`
`0.4 & 9/19/2015 & Gisselquist & Added DMA controller, improved stall information, and self--assessment info.\\\hline`	`0.4 & 9/19/2015 & Gisselquist & Added DMA controller, improved stall information, and self--assessment info.\\\hline`
`0.3 & 8/22/2015 & Gisselquist & First completed draft\\\hline`	`0.3 & 8/22/2015 & Gisselquist & First completed draft\\\hline`
`0.2 & 8/19/2015 & Gisselquist & Still Draft, more complete \\\hline`	`0.2 & 8/19/2015 & Gisselquist & Still Draft, more complete \\\hline`
`0.1 & 8/17/2015 & Gisselquist & Incomplete First Draft \\\hline`	`0.1 & 8/17/2015 & Gisselquist & Incomplete First Draft \\\hline`
`\end{revisionhistory}`	`\end{revisionhistory}`
Line 409...	Line 410...
`The tenth bit is a trap bit. It is set whenever the user requests a soft`	`The tenth bit is a trap bit. It is set whenever the user requests a soft`
`interrupt, and cleared on any return to userspace command. This allows the`	`interrupt, and cleared on any return to userspace command. This allows the`
`supervisor, in supervisor mode, to determine whether it got to supervisor`	`supervisor, in supervisor mode, to determine whether it got to supervisor`
`mode from a trap or from an external interrupt or both.`	`mode from a trap or from an external interrupt or both.`

	`These status register bits are summarized in Tbl.~\ref{tbl:ccbits}.`
	`\begin{table}`
	`\begin{center}`
	`\begin{tabular}{l|l}`
	`Bit & Meaning \\\hline`
	`9 & Soft trap, set on a trap from user mode, cleared when returning to user mode\\\hline`
	`8 & (Reserved for) Floating point enable \\\hline`
	`7 & Halt on break, to support an external debugger \\\hline`
	`6 & Step, single step the CPU in user mode\\\hline`
	`5 & GIE, or Global Interrupt Enable \\\hline`
	`4 & Sleep \\\hline`
	`3 & V, or overflow bit.\\\hline`
	`2 & N, or negative bit.\\\hline`
	`1 & C, or carry bit.\\\hline`
	`0 & Z, or zero bit. \\\hline`
	`\end{tabular}`
	`\caption{Condition Code / Status Register Bits}\label{tbl:ccbits}`
	`\end{center}\end{table}`
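The bit assignments in the status register table added in this revision can be sketched as a small lookup. This is only an illustration: the bit positions come from the table, while the Python names (`CC_BITS`, `cc_mask`, `decode_cc`, and labels such as `BREAK_HALT` and `FPU_EN`) are hypothetical and not part of the specification.

```python
# Bit positions copied from the Condition Code / Status Register table.
# The dictionary keys are illustrative labels, not official mnemonics.
CC_BITS = {
    "Z": 0,           # zero
    "C": 1,           # carry
    "N": 2,           # negative
    "V": 3,           # overflow
    "SLEEP": 4,
    "GIE": 5,         # global interrupt enable
    "STEP": 6,        # single step in user mode
    "BREAK_HALT": 7,  # halt on break, for an external debugger
    "FPU_EN": 8,      # reserved for floating point enable
    "TRAP": 9,        # soft trap
}


def cc_mask(*names):
    """Build a mask with the named status bits set."""
    mask = 0
    for name in names:
        mask |= 1 << CC_BITS[name]
    return mask


def decode_cc(value):
    """List the names of the status bits set in value, lowest bit first."""
    return [name for name, bit in CC_BITS.items() if value & (1 << bit)]
```

For instance, `cc_mask("GIE", "SLEEP")` builds the kind of immediate that the derived `WAIT` instruction ORs into `CC`.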

`\section{Conditional Instructions}`	`\section{Conditional Instructions}`
`Most, although not quite all, instructions may be conditionally executed. From`	`Most, although not quite all, instructions may be conditionally executed. From`
`the four condition code flags, eight conditions are defined. These are shown`	`the four condition code flags, eight conditions are defined. These are shown`
`in Tbl.~\ref{tbl:conditions}.`	`in Tbl.~\ref{tbl:conditions}.`
`\begin{table}`	`\begin{table}`
Line 544...	Line 564...
`\begin{tabular}{|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|c|}\hline`	`\begin{tabular}{|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|c|}\hline`
`\rowcolor[gray]{0.85}`	`\rowcolor[gray]{0.85}`
`Op Code & \multicolumn{8}{c|}{31\ldots24} & \multicolumn{8}{c|}{23\ldots 16}`	`Op Code & \multicolumn{8}{c|}{31\ldots24} & \multicolumn{8}{c|}{23\ldots 16}`
`& \multicolumn{8}{c|}{15\ldots 8} & \multicolumn{8}{c|}{7\ldots 0}`	`& \multicolumn{8}{c|}{15\ldots 8} & \multicolumn{8}{c|}{7\ldots 0}`
`& Sets CC? \\\hline\hline`	`& Sets CC? \\\hline\hline`
`CMP(Sub) & \multicolumn{4}{l|}{4'h0}`	`{\tt CMP(Sub)} & \multicolumn{4}{l|}{4'h0}`
`& \multicolumn{4}{l|}{D. Reg}`	`& \multicolumn{4}{l|}{D. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& \multicolumn{21}{l|}{Operand B}`	`& \multicolumn{21}{l|}{Operand B}`
`& Yes \\\hline`	`& Yes \\\hline`
`TST(And) & \multicolumn{4}{l|}{4'h1}`	`{\tt TST(And)} & \multicolumn{4}{l|}{4'h1}`
`& \multicolumn{4}{l|}{D. Reg}`	`& \multicolumn{4}{l|}{D. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& \multicolumn{21}{l|}{Operand B}`	`& \multicolumn{21}{l|}{Operand B}`
`& Yes \\\hline`	`& Yes \\\hline`
`MOV & \multicolumn{4}{l|}{4'h2}`	`{\tt MOV} & \multicolumn{4}{l|}{4'h2}`
`& \multicolumn{4}{l|}{D. Reg}`	`& \multicolumn{4}{l|}{D. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& A-Usr`	`& A-Usr`
`& \multicolumn{4}{l|}{B-Reg}`	`& \multicolumn{4}{l|}{B-Reg}`
`& B-Usr`	`& B-Usr`
`& \multicolumn{15}{l|}{15'bit signed offset}`	`& \multicolumn{15}{l|}{15'bit signed offset}`
`& \\\hline`	`& \\\hline`
`LODI & \multicolumn{4}{l|}{4'h3}`	`{\tt LODI} & \multicolumn{4}{l|}{4'h3}`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{24}{l|}{24'bit Signed Immediate}`	`& \multicolumn{24}{l|}{24'bit Signed Immediate}`
`& \\\hline`	`& \\\hline`
`NOOP & \multicolumn{4}{l|}{4'h4}`	`{\tt NOOP} & \multicolumn{4}{l|}{4'h4}`
`& \multicolumn{4}{l|}{4'he}`	`& \multicolumn{4}{l|}{4'he}`
`& \multicolumn{24}{l|}{24'h00}`	`& \multicolumn{24}{l|}{24'h00}`
`& \\\hline`	`& \\\hline`
`BREAK & \multicolumn{4}{l|}{4'h4}`	`{\tt BREAK} & \multicolumn{4}{l|}{4'h4}`
`& \multicolumn{4}{l|}{4'he}`	`& \multicolumn{4}{l|}{4'he}`
`& \multicolumn{24}{l|}{24'h01}`	`& \multicolumn{24}{l|}{24'h01}`
`& \\\hline`	`& \\\hline`
`{\em Reserved} & \multicolumn{4}{l|}{4'h4}`	`{\em Reserved} & \multicolumn{4}{l|}{4'h4}`
`& \multicolumn{4}{l|}{4'he}`	`& \multicolumn{4}{l|}{4'he}`
`& \multicolumn{24}{l|}{24'bits, but not 0 or 1.}`	`& \multicolumn{24}{l|}{24'bits, but not 0 or 1.}`
`& \\\hline`	`& \\\hline`
`LODIHI & \multicolumn{4}{l|}{4'h4}`	`{\tt LODIHI }& \multicolumn{4}{l|}{4'h4}`
`& \multicolumn{4}{l|}{4'hf}`	`& \multicolumn{4}{l|}{4'hf}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& 1'b1`	`& 1'b1`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{16}{l|}{16-bit Immediate}`	`& \multicolumn{16}{l|}{16-bit Immediate}`
`& \\\hline`	`& \\\hline`
`LODILO & \multicolumn{4}{l|}{4'h4}`	`{\tt LODILO} & \multicolumn{4}{l|}{4'h4}`
`& \multicolumn{4}{l|}{4'hf}`	`& \multicolumn{4}{l|}{4'hf}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& 1'b0`	`& 1'b0`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{16}{l|}{16-bit Immediate}`	`& \multicolumn{16}{l|}{16-bit Immediate}`
`& \\\hline`	`& \\\hline`
`16-b MPYU & \multicolumn{4}{l|}{4'h4}`	`16-b {\tt MPYU} & \multicolumn{4}{l|}{4'h4}`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& 1'b0 & \multicolumn{4}{l|}{Reg}`	`& 1'b0 & \multicolumn{4}{l|}{Reg}`
`& \multicolumn{16}{l|}{16-bit Offset}`	`& \multicolumn{16}{l|}{16-bit Offset}`
`& Yes \\\hline`	`& Yes \\\hline`
`16-b MPYU(I) & \multicolumn{4}{l|}{4'h4}`	`16-b {\tt MPYU}(I) & \multicolumn{4}{l|}{4'h4}`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& 1'b0 & \multicolumn{4}{l|}{4'hf}`	`& 1'b0 & \multicolumn{4}{l|}{4'hf}`
`& \multicolumn{16}{l|}{16-bit Offset}`	`& \multicolumn{16}{l|}{16-bit Offset}`
`& Yes \\\hline`	`& Yes \\\hline`
`16-b MPYS & \multicolumn{4}{l|}{4'h4}`	`16-b {\tt MPYS} & \multicolumn{4}{l|}{4'h4}`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& 1'b1 & \multicolumn{4}{l|}{Reg}`	`& 1'b1 & \multicolumn{4}{l|}{Reg}`
`& \multicolumn{16}{l|}{16-bit Offset}`	`& \multicolumn{16}{l|}{16-bit Offset}`
`& Yes \\\hline`	`& Yes \\\hline`
`16-b MPYS(I) & \multicolumn{4}{l|}{4'h4}`	`16-b {\tt MPYS}(I) & \multicolumn{4}{l|}{4'h4}`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& 1'b1 & \multicolumn{4}{l|}{4'hf}`	`& 1'b1 & \multicolumn{4}{l|}{4'hf}`
`& \multicolumn{16}{l|}{16-bit Offset}`	`& \multicolumn{16}{l|}{16-bit Offset}`
`& Yes \\\hline`	`& Yes \\\hline`
`ROL & \multicolumn{4}{l|}{4'h5}`	`{\tt ROL} & \multicolumn{4}{l|}{4'h5}`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& \multicolumn{21}{l|}{Operand B, truncated to low order 5 bits}`	`& \multicolumn{21}{l|}{Operand B, truncated to low order 5 bits}`
`& \\\hline`	`& \\\hline`
`LOD & \multicolumn{4}{l|}{4'h6}`	`{\tt LOD} & \multicolumn{4}{l|}{4'h6}`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& \multicolumn{21}{l|}{Operand B address}`	`& \multicolumn{21}{l|}{Operand B address}`
`& \\\hline`	`& \\\hline`
`STO & \multicolumn{4}{l|}{4'h7}`	`{\tt STO} & \multicolumn{4}{l|}{4'h7}`
`& \multicolumn{4}{l|}{D. Reg}`	`& \multicolumn{4}{l|}{D. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& \multicolumn{21}{l|}{Operand B address}`	`& \multicolumn{21}{l|}{Operand B address}`
`& \\\hline`	`& \\\hline`
`SUB & \multicolumn{4}{l|}{4'h8}`	`{\tt SUB} & \multicolumn{4}{l|}{4'h8}`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& \multicolumn{21}{l|}{Operand B}`	`& \multicolumn{21}{l|}{Operand B}`
`& Yes \\\hline`	`& Yes \\\hline`
`AND & \multicolumn{4}{l|}{4'h9}`	`{\tt AND} & \multicolumn{4}{l|}{4'h9}`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& \multicolumn{21}{l|}{Operand B}`	`& \multicolumn{21}{l|}{Operand B}`
`& Yes \\\hline`	`& Yes \\\hline`
`ADD & \multicolumn{4}{l|}{4'ha}`	`{\tt ADD} & \multicolumn{4}{l|}{4'ha}`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& \multicolumn{21}{l|}{Operand B}`	`& \multicolumn{21}{l|}{Operand B}`
`& Yes \\\hline`	`& Yes \\\hline`
`OR & \multicolumn{4}{l|}{4'hb}`	`{\tt OR} & \multicolumn{4}{l|}{4'hb}`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& \multicolumn{21}{l|}{Operand B}`	`& \multicolumn{21}{l|}{Operand B}`
`& Yes \\\hline`	`& Yes \\\hline`
`XOR & \multicolumn{4}{l|}{4'hc}`	`{\tt XOR} & \multicolumn{4}{l|}{4'hc}`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& \multicolumn{21}{l|}{Operand B}`	`& \multicolumn{21}{l|}{Operand B}`
`& Yes \\\hline`	`& Yes \\\hline`
`LSL/ASL & \multicolumn{4}{l|}{4'hd}`	`{\tt LSL/ASL} & \multicolumn{4}{l|}{4'hd}`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}`	`& \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}`
`& Yes \\\hline`	`& Yes \\\hline`
`ASR & \multicolumn{4}{l|}{4'he}`	`{\tt ASR} & \multicolumn{4}{l|}{4'he}`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}`	`& \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}`
`& Yes \\\hline`	`& Yes \\\hline`
`LSR & \multicolumn{4}{l|}{4'hf}`	`{\tt LSR} & \multicolumn{4}{l|}{4'hf}`
`& \multicolumn{4}{l|}{R. Reg}`	`& \multicolumn{4}{l|}{R. Reg}`
`& \multicolumn{3}{l|}{Cond.}`	`& \multicolumn{3}{l|}{Cond.}`
`& \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}`	`& \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}`
`& Yes \\\hline`	`& Yes \\\hline`
`\end{tabular}`	`\end{tabular}`
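The field layout in the opcode table above can be illustrated by packing two of its rows into 32-bit words. The sketch below assumes the columns run from bit 31 on the left to bit 0 on the right, as the header row indicates. The function names are hypothetical, and the numeric meaning of the 3-bit `Cond.` field is defined in the conditions table, which is not reproduced in this excerpt, so `cond=0` is only a placeholder.

```python
def encode_lodi(reg, imm):
    """Pack a LODI row: 4'h3, 4-bit register, 24-bit signed immediate."""
    if not 0 <= reg <= 0xF:
        raise ValueError("register field is 4 bits wide")
    if not -(1 << 23) <= imm < (1 << 23):
        raise ValueError("immediate must fit in 24 signed bits")
    return (0x3 << 28) | (reg << 24) | (imm & 0xFFFFFF)


def encode_lodihi_lo(reg, imm16, high, cond=0):
    """Pack a LODIHI (high=True) or LODILO (high=False) row.

    Layout per the table: 4'h4, 4'hf, Cond. (3 bits), 1'b1 or 1'b0,
    4-bit register, 16-bit immediate. The cond value is a placeholder
    assumption; consult the conditions table for real encodings.
    """
    if not 0 <= reg <= 0xF:
        raise ValueError("register field is 4 bits wide")
    if not 0 <= cond <= 0x7:
        raise ValueError("condition field is 3 bits wide")
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("immediate must fit in 16 bits")
    return ((0x4 << 28) | (0xF << 24) | (cond << 21)
            | (int(bool(high)) << 20) | (reg << 16) | imm16)
```

The same shift-and-OR pattern extends to the other rows, since every row in the table sums to 32 bits.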
Line 690...	Line 710...
`the Zip CPU. Many of these instructions will have assembly equivalents,`	`the Zip CPU. Many of these instructions will have assembly equivalents,`
`such as the branch instructions, to facilitate working with the CPU.`	`such as the branch instructions, to facilitate working with the CPU.`
`\begin{table}\begin{center}`	`\begin{table}\begin{center}`
`\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline`	`\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline`
`Mapped & Actual & Notes \\\hline`	`Mapped & Actual & Notes \\\hline`
`ABS Rx`	`{\tt ABS Rx}`
`& \parbox[t]{1.5in}{TST -1,Rx\\NEG.LT Rx}`	`& \parbox[t]{1.5in}{\tt TST -1,Rx\\NEG.LT Rx}`
`& Absolute value, depends upon derived NEG.\\\hline`	`& Absolute value, depends upon derived NEG.\\\hline`
`\parbox[t]{1.4in}{ADD Ra,Rx\\ADDC Rb,Ry}`	`\parbox[t]{1.4in}{\tt ADD Ra,Rx\\ADDC Rb,Ry}`
`& \parbox[t]{1.5in}{Add Ra,Rx\\ADD.C \$1,Ry\\Add Rb,Ry}`	`& \parbox[t]{1.5in}{\tt Add Ra,Rx\\ADD.C \$1,Ry\\Add Rb,Ry}`
`& Add with carry \\\hline`	`& Add with carry \\\hline`
`BRA.Cond +/-\$Addr`	`{\tt BRA.Cond +/-\$Addr}`
`& \hbox{MOV.cond \$Addr+PC,PC}`	`& \hbox{\tt MOV.cond \$Addr+PC,PC}`
`& Branch or jump on condition. Works for 15--bit`	`& Branch or jump on condition. Works for 15--bit`
`signed address offsets.\\\hline`	`signed address offsets.\\\hline`
`BRA.Cond +/-\$Addr`	`{\tt BRA.Cond +/-\$Addr}`
`& \parbox[t]{1.5in}{LDI \$Addr,Rx \\ ADD.cond Rx,PC}`	`& \parbox[t]{1.5in}{\tt LDI \$Addr,Rx \\ ADD.cond Rx,PC}`
`& Branch/jump on condition. Works for`	`& Branch/jump on condition. Works for`
`23 bit address offsets, but costs a register, an extra instruction,`	`23 bit address offsets, but costs a register, an extra instruction,`
`and sets the flags. \\\hline`	`and sets the flags. \\\hline`
`BNC PC+\$Addr`	`{\tt BNC PC+\$Addr}`
`& \parbox[t]{1.5in}{Test \$Carry,CC \\ MOV.Z PC+\$Addr,PC}`	`& \parbox[t]{1.5in}{\tt Test \$Carry,CC \\ MOV.Z PC+\$Addr,PC}`
`& Example of a branch on an unsupported`	`& Example of a branch on an unsupported`
`condition, in this case a branch on not carry \\\hline`	`condition, in this case a branch on not carry \\\hline`
`BUSY & MOV \$-1(PC),PC & Execute an infinite loop \\\hline`	`{\tt BUSY } & {\tt MOV \$-1(PC),PC} & Execute an infinite loop \\\hline`
`CLRF.NZ Rx`	`{\tt CLRF.NZ Rx }`
`& XOR.NZ Rx,Rx`	`& {\tt XOR.NZ Rx,Rx}`
`& Clear Rx, and flags, if the Z-bit is not set \\\hline`	`& Clear Rx, and flags, if the Z-bit is not set \\\hline`
`CLR Rx`	`{\tt CLR Rx }`
`& LDI \$0,Rx`	`& {\tt LDI \$0,Rx}`
`& Clears Rx, leaves flags untouched. This instruction cannot be`	`& Clears Rx, leaves flags untouched. This instruction cannot be`
`conditional. \\\hline`	`conditional. \\\hline`
`EXCH.W Rx`	`{\tt EXCH.W Rx }`
`& ROL \$16,Rx`	`& {\tt ROL \$16,Rx}`
`& Exchanges the top and bottom 16'bit words of Rx \\\hline`	`& Exchanges the top and bottom 16'bit words of Rx \\\hline`
`HALT`	`{\tt HALT }`
`& Or \$SLEEP,CC`	`& {\tt Or \$SLEEP,CC}`
`& Executed while in interrupt mode. In user mode this is simply a`	`& This only works when issued in interrupt/supervisor mode. In user`
`wait until interrupt instruction. \\\hline`	`mode this is simply a wait until interrupt instruction. \\\hline`
`INT & LDI \$0,CC`	`{\tt INT } & {\tt LDI \$0,CC} & \\\hline`
`& Since we're using the CC register as a trap vector as well, this`	`{\tt IRET}`
`executes TRAP \#0. \\\hline`	`& {\tt OR \$GIE,CC}`
`IRET`	`& Also known as an RTU instruction (Return to Userspace) \\\hline`
`& OR \$GIE,CC`	`{\tt JMP R6+\$Addr}`
`& Also an RTU instruction (Return to Userspace) \\\hline`	`& {\tt MOV \$Addr(R6),PC}`
`JMP R6+\$Addr`
`& MOV \$Addr(R6),PC`
`& \\\hline`	`& \\\hline`
`JSR PC+\$Addr`	`{\tt JSR PC+\$Addr}`
`& \parbox[t]{1.5in}{SUB \$1,SP \\\`	`& \parbox[t]{1.5in}{\tt SUB \$1,SP \\\`
`MOV \$3+PC,R0 \\`	`MOV \$3+PC,R0 \\`
`STO R0,1(SP) \\`	`STO R0,1(SP) \\`
`MOV \$Addr+PC,PC \\`	`MOV \$Addr+PC,PC \\`
`ADD \$1,SP}`	`ADD \$1,SP}`
`& Jump to Subroutine. Note the required cleanup instruction after`	`& Jump to Subroutine. Note the required cleanup instruction after`
`returning. This could easily be turned into a three instruction`	`returning. This could easily be turned into a three instruction`
`operand, removing the preliminary stack instruction before and`	`operand, removing the preliminary stack instruction before and`
`the cleanup after, by adjusting how any stack frame was built for`	`the cleanup after, by adjusting how any stack frame was built for`
`this routine to include space at the top of the stack for the PC.`	`this routine to include space at the top of the stack for the PC.`
	`Note also that jumping to a subroutine costs a copy register, {\tt R0}`
	`in this case.`
`\\\hline`	`\\\hline`
`JSR PC+\$Addr`	`{\tt JSR PC+\$Addr }`
`& \parbox[t]{1.5in}{MOV \$3+PC,R12 \\ MOV \$addr+PC,PC}`	`& \parbox[t]{1.5in}{\tt MOV \$3+PC,R12 \\ MOV \$addr+PC,PC}`
`&This is the high speed`	`&This is the high speed`
`version of a subroutine call, necessitating a register to hold the`	`version of a subroutine call, necessitating a register to hold the`
`last PC address. In its favor, this method doesn't suffer the`	`last PC address. In its favor, this method doesn't suffer the`
`mandatory memory access of the other approach. \\\hline`	`mandatory memory access of the other approach. \\\hline`
`LDI.l \$val,Rx`	`{\tt LDI.l \$val,Rx }`
`& \parbox[t]{1.5in}{LDIHI (\$val$>>$16)\&0x0ffff, Rx \\`	`& \parbox[t]{1.8in}{\tt LDIHI (\$val$>>$16)\&0x0ffff, Rx \\`
`LDILO (\$val \& 0x0ffff)}`	`LDILO (\$val\&0x0ffff),Rx}`
`& Sadly, there's not enough instruction`	`& Sadly, there's not enough instruction`
`space to load a complete immediate value into any register.`	`space to load a complete immediate value into any register.`
`Therefore, fully loading any register takes two cycles.`	`Therefore, fully loading any register takes two cycles.`
`The LDIHI (load immediate high) and LDILO (load immediate low)`	`The LDIHI (load immediate high) and LDILO (load immediate low)`
`instructions have been created to facilitate this. \\\hline`	`instructions have been created to facilitate this. \\\hline`
Line 765...	Line 785...
`\caption{Derived Instructions}\label{tbl:derived-1}`	`\caption{Derived Instructions}\label{tbl:derived-1}`
`\end{center}\end{table}`	`\end{center}\end{table}`
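The `LDI.l` row of the derived instruction table splits a 32-bit constant into two 16-bit immediates. A minimal model of that split follows. It assumes that `LDIHI` replaces only the upper half of the register and `LDILO` only the lower half, which the instruction names suggest but this excerpt does not state outright. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def split_ldi_long(val):
    """Return the (LDIHI, LDILO) immediates for a 32-bit value."""
    val &= MASK32
    return (val >> 16) & 0xFFFF, val & 0xFFFF


def ldihi(reg, imm16):
    # Assumed semantics: replace the upper 16 bits, keep the lower 16.
    return ((imm16 & 0xFFFF) << 16) | (reg & 0xFFFF)


def ldilo(reg, imm16):
    # Assumed semantics: replace the lower 16 bits, keep the upper 16.
    return (reg & 0xFFFF0000) | (imm16 & 0xFFFF)


def ldi_long(val, reg=0):
    """Model the two-instruction sequence LDIHI then LDILO."""
    hi, lo = split_ldi_long(val)
    return ldilo(ldihi(reg, hi), lo)
```

Under these assumptions the previous register contents do not matter, because the two halves together overwrite all 32 bits.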
`\begin{table}\begin{center}`	`\begin{table}\begin{center}`
`\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline`	`\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline`
`Mapped & Actual & Notes \\\hline`	`Mapped & Actual & Notes \\\hline`
`LOD.b \$addr,Rx`	`{\tt LOD.b \$addr,Rx}`
`& \parbox[t]{1.5in}{%`	`& \parbox[t]{1.5in}{\tt %`
`LDI \$addr,Ra \\`	`LDI \$addr,Ra \\`
`LDI \$addr,Rb \\`	`LDI \$addr,Rb \\`
`LSR \$2,Ra \\`	`LSR \$2,Ra \\`
`AND \$3,Rb \\`	`AND \$3,Rb \\`
`LOD (Ra),Rx \\`	`LOD (Ra),Rx \\`
Line 786...	Line 806...
`all other addresses in this document are 32-bit wordlength addresses.`	`all other addresses in this document are 32-bit wordlength addresses.`
`For this reason,`	`For this reason,`
`we needed to drop the bottom two bits. This also limits the address`	`we needed to drop the bottom two bits. This also limits the address`
`space of character accesses using this method from 16 MB down to 4MB.}`	`space of character accesses using this method from 16 MB down to 4MB.}`
`\\\hline`	`\\\hline`
`\parbox[t]{1.5in}{LSL \$1,Rx\\ LSLC \$1,Ry}`	`\parbox[t]{1.5in}{\tt LSL \$1,Rx\\ LSLC \$1,Ry}`
`& \parbox[t]{1.5in}{LSL \$1,Ry \\`	`& \parbox[t]{1.5in}{\tt LSL \$1,Ry \\`
`LSL \$1,Rx \\`	`LSL \$1,Rx \\`
`OR.C \$1,Ry}`	`OR.C \$1,Ry}`
`& Logical shift left with carry. Note that the`	`& Logical shift left with carry. Note that the`
`instruction order is now backwards, to keep the conditions valid.`	`instruction order is now backwards, to keep the conditions valid.`
`That is, LSL sets the carry flag, so if we did this the other way`	`That is, LSL sets the carry flag, so if we did this the other way`
`with Rx before Ry, then the condition flag wouldn't have been right`	`with Rx before Ry, then the condition flag wouldn't have been right`
`for an OR correction at the end. \\\hline`	`for an OR correction at the end. \\\hline`
`\parbox[t]{1.5in}{LSR \$1,Rx \\ LSRC \$1,Ry}`	`\parbox[t]{1.5in}{\tt LSR \$1,Rx \\ LSRC \$1,Ry}`
`& \parbox[t]{1.5in}{CLR Rz \\`	`& \parbox[t]{1.5in}{\tt CLR Rz \\`
`LSR \$1,Ry \\`	`LSR \$1,Ry \\`
`LDIHI.C \$8000h,Rz \\`	`LDIHI.C \$8000h,Rz \\`
`LSR \$1,Rx \\`	`LSR \$1,Rx \\`
`OR Rz,Rx}`	`OR Rz,Rx}`
`& Logical shift right with carry \\\hline`	`& Logical shift right with carry \\\hline`
`NEG Rx & \parbox[t]{1.5in}{XOR \$-1,Rx \\ ADD \$1,Rx} & \\\hline`	`{\tt NEG Rx} & \parbox[t]{1.5in}{\tt XOR \$-1,Rx \\ ADD \$1,Rx} & \\\hline`
`NEG.C Rx & \parbox[t]{1.5in}{MOV.C \$-1+Rx,Rx\\XOR.C \$-1,Rx} & \\\hline`	`{\tt NEG.C Rx} & \parbox[t]{1.5in}{\tt MOV.C \$-1+Rx,Rx\\XOR.C \$-1,Rx} & \\\hline`
`NOOP & NOOP & While there are many`	`{\tt NOOP} & {\tt NOOP} & While there are many`
`operations that do nothing, such as MOV Rx,Rx, or OR \$0,Rx, these`	`operations that do nothing, such as MOV Rx,Rx, or OR \$0,Rx, these`
`operations have consequences in that they might stall the bus if`	`operations have consequences in that they might stall the bus if`
`Rx isn't ready yet. For this reason, we have a dedicated NOOP`	`Rx isn't ready yet. For this reason, we have a dedicated NOOP`
`instruction. \\\hline`	`instruction. \\\hline`
`NOT Rx & XOR \$-1,Rx & \\\hline`	`{\tt NOT Rx } & {\tt XOR \$-1,Rx } & \\\hline`
`POP Rx`	`{\tt POP Rx }`
`& \parbox[t]{1.5in}{LOD \$1(SP),Rx \\ ADD \$1,SP}`	`& \parbox[t]{1.5in}{\tt LOD \$1(SP),Rx \\ ADD \$1,SP}`
`& Note`	`& Note`
`that for interrupt purposes, one can never depend upon the value at`	`that for interrupt purposes, one can never depend upon the value at`
`(SP). Hence you read from it, then increment it, lest having`	`(SP). Hence you read from it, then increment it, lest having`
`incremented it first something then comes along and writes to that`	`incremented it first something then comes along and writes to that`
`value before you can read the result. \\\hline`	`value before you can read the result. \\\hline`
`\end{tabular}`	`\end{tabular}`
`\caption{Derived Instructions, continued}\label{tbl:derived-2}`	`\caption{Derived Instructions, continued}\label{tbl:derived-2}`
`\end{center}\end{table}`	`\end{center}\end{table}`
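Two of the derived sequences in this table are easy to check arithmetically. The sketch below models `NEG` and the two-register shift-left-with-carry on 32-bit values. It assumes that `LSL` leaves the bit shifted out of the register in the carry flag, which matches the note in the table but is not spelled out in this excerpt. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def neg(rx):
    """Model NEG Rx as XOR $-1,Rx followed by ADD $1,Rx."""
    rx = (rx ^ MASK32) & MASK32  # XOR $-1,Rx : bitwise complement
    return (rx + 1) & MASK32     # ADD $1,Rx  : two's complement


def lsl_with_carry(rx, ry):
    """Shift the 64-bit pair (Ry:Rx) left by one bit.

    Follows the table's order: the high word moves first, so that
    the carry from the low word is still valid for the final OR.C.
    """
    ry = (ry << 1) & MASK32   # LSL $1,Ry
    carry = (rx >> 31) & 1    # bit shifted out by LSL $1,Rx (assumed to set C)
    rx = (rx << 1) & MASK32   # LSL $1,Rx
    if carry:
        ry |= 1               # OR.C $1,Ry
    return rx, ry
```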
`\begin{table}\begin{center}`	`\begin{table}\begin{center}`
`\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline`	`\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline`
`PUSH Rx`	`{\tt PUSH Rx}`
`& \parbox[t]{1.5in}{SUB \$1,SP \\`	`& \parbox[t]{1.5in}{SUB \$1,SP \\`
`STO Rx,\$1(SP)}`	`STO Rx,\$1(SP)}`
`& \\\hline`	`& Note that for pipelined operation, it helps to coalesce all the`
`PUSH Rx-Ry`	`{\tt SUB}'s into one command, and place the {\tt STO}'s right`
`& \parbox[t]{1.5in}{SUB \$n,SP \\`	`after each other.\\\hline`
	`{\tt PUSH Rx-Ry}`
	`& \parbox[t]{1.5in}{\tt SUB \$n,SP \\`
`STO Rx,\$n(SP)`	`STO Rx,\$n(SP)`
`\ldots \\`	`\ldots \\`
`STO Ry,\$1(SP)}`	`STO Ry,\$1(SP)}`
`& Multiple pushes at once only need the single subtract from the`	`& Multiple pushes at once only need the single subtract from the`
`stack pointer. This derived instruction is analogous to a similar one`	`stack pointer. This derived instruction is analogous to a similar one`
`on the Motorola 68k architecture, although the Zip Assembler`	`on the Motorola 68k architecture, although the Zip Assembler`
`does not support this instruction (yet).\\\hline`	`does not support this instruction (yet). This instruction`
`RESET`	`also supports pipelined memory access.\\\hline`
`& \parbox[t]{1in}{STO \$1,\$watchdog(R12)\\NOOP\\NOOP}`	`{\tt RESET}`
`& \parbox[t]{3in}{This depends upon the peripheral base address being`	`& \parbox[t]{1in}{\tt STO \$1,\$watchdog(R12)\\NOOP\\NOOP}`
	`& This depends upon the peripheral base address being`
`in R12.`	`in R12.`

`Another opportunity might be to jump to the reset address from within`	`Another opportunity might be to jump to the reset address from within`
`supervisor mode.}\\\hline`	`supervisor mode.\\\hline`
`RET & \parbox[t]{1.5in}{LOD \$1(SP),PC}`	`{\tt RET} & \parbox[t]{1.5in}{\tt LOD \$1(SP),PC}`
`& Note that this depends upon the calling context to clean up the`	`& Note that this depends upon the calling context to clean up the`
`stack, as outlined for the JSR instruction. \\\hline`	`stack, as outlined for the JSR instruction. \\\hline`
`RET & MOV R12,PC`	`{\tt RET} & {\tt MOV R12,PC}`
`& This is the high(er) speed version, that doesn't touch the stack.`	`& This is the high(er) speed version, that doesn't touch the stack.`
`As such, it doesn't suffer a stall on memory read/write to the stack.`	`As such, it doesn't suffer a stall on memory read/write to the stack.`
`\\\hline`	`\\\hline`
`STEP Rr,Rt`	`{\tt STEP Rr,Rt}`
`& \parbox[t]{1.5in}{LSR \$1,Rr \\ XOR.C Rt,Rr}`	`& \parbox[t]{1.5in}{\tt LSR \$1,Rr \\ XOR.C Rt,Rr}`
`& Step a Galois implementation of a Linear Feedback Shift Register, Rr,`	`& Step a Galois implementation of a Linear Feedback Shift Register, Rr,`
`using taps Rt \\\hline`	`using taps Rt \\\hline`
`STO.b Rx,\$addr`	`{\tt STO.b Rx,\$addr}`
`& \parbox[t]{1.5in}{%`	`& \parbox[t]{1.5in}{\tt %`
`LDI \$addr,Ra \\`	`LDI \$addr,Ra \\`
`LDI \$addr,Rb \\`	`LDI \$addr,Rb \\`
`LSR \$2,Ra \\`	`LSR \$2,Ra \\`
`AND \$3,Rb \\`	`AND \$3,Rb \\`
`SUB \$32,Rb \\`	`SUB \$32,Rb \\`
`LOD (Ra),Ry \\`	`LOD (Ra),Ry \\`
`AND \$0ffh,Rx \\`	`AND \$0ffh,Rx \\`
`AND \$-0ffh,Ry \\`	`AND \~\$0ffh,Ry \\`
`ROL Rb,Rx \\`	`ROL Rb,Rx \\`
`OR Rx,Ry \\`	`OR Rx,Ry \\`
`STO Ry,(Ra) }`	`STO Ry,(Ra) }`
`& \parbox[t]{3in}{This CPU and its bus are {\em not} optimized`	`& \parbox[t]{3in}{This CPU and its bus are {\em not} optimized`
`for byte-wise operations.`	`for byte-wise operations.`
Line 875...	Line 898...
`byte-wise address, whereas in all of our other examples it is a`	`byte-wise address, whereas in all of our other examples it is a`
`32-bit word address. This also limits the address space`	`32-bit word address. This also limits the address space`
`of character accesses from 16 MB down to 4MB.`	`of character accesses from 16 MB down to 4MB.`
`Further, this instruction implies a byte ordering,`	`Further, this instruction implies a byte ordering,`
`such as big or little endian.} \\\hline`	`such as big or little endian.} \\\hline`
`SWAP Rx,Ry`	`{\tt SWAP Rx,Ry }`
`& \parbox[t]{1.5in}{`	`& \parbox[t]{1.5in}{\tt`
`XOR Ry,Rx \\`	`XOR Ry,Rx \\`
`XOR Rx,Ry \\`	`XOR Rx,Ry \\`
`XOR Ry,Rx}`	`XOR Ry,Rx}`
`& While no extra registers are needed, this example`	`& While no extra registers are needed, this example`
`does take 3-clocks. \\\hline`	`does take 3-clocks. \\\hline`
`TRAP \#X`	`{\tt TRAP \#X}`
`& \parbox[t]{1.5in}{LDI \$x,R0 \\ AND ~\$GIE,CC }`	`& \parbox[t]{1.5in}{\tt LDI \$x,R0 \\ AND \~\$GIE,CC }`
`& This works because whenever a user lowers the \$GIE flag, it sets`	`& This works because whenever a user lowers the \$GIE flag, it sets`
`a TRAP bit within the CC register. Therefore, upon entering the`	`a TRAP bit within the CC register. Therefore, upon entering the`
`supervisor state, the CPU only need check this bit to know that it`	`supervisor state, the CPU only need check this bit to know that it`
`got there via a TRAP. The trap could be made conditional by making`	`got there via a TRAP. The trap could be made conditional by making`
`the LDI and the AND conditional. In that case, the assembler would`	`the LDI and the AND conditional. In that case, the assembler would`
Line 896...	Line 919...
`\end{tabular}`	`\end{tabular}`
`\caption{Derived Instructions, continued}\label{tbl:derived-3}`	`\caption{Derived Instructions, continued}\label{tbl:derived-3}`
`\end{center}\end{table}`	`\end{center}\end{table}`
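The `STEP` and `SWAP` rows in this table can be modeled in a few lines. This sketch assumes that `LSR` places the bit shifted out into the carry flag and that the second operand of each instruction is the destination, as in the other rows of the table. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def lfsr_step(rr, rt):
    """One step of a Galois LFSR with state rr and taps rt."""
    carry = rr & 1           # bit shifted out by LSR $1,Rr (assumed to set C)
    rr = (rr & MASK32) >> 1  # LSR $1,Rr
    if carry:
        rr ^= rt & MASK32    # XOR.C Rt,Rr
    return rr


def xor_swap(rx, ry):
    """Exchange two registers without a temporary, as in SWAP Rx,Ry."""
    rx ^= ry  # XOR Ry,Rx
    ry ^= rx  # XOR Rx,Ry
    rx ^= ry  # XOR Ry,Rx
    return rx, ry
```

The tap value passed as `rt` is up to the caller; no particular polynomial is implied by the specification text shown here.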
`\begin{table}\begin{center}`	`\begin{table}\begin{center}`
`\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline`	`\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline`
`TST Rx`	`{\tt TST Rx}`
`& TST \$-1,Rx`	`& {\tt TST \$-1,Rx}`
`& Set the condition codes based upon Rx. Could also do a CMP \$0,Rx,`	`& Set the condition codes based upon Rx. Could also do a CMP \$0,Rx,`
`ADD \$0,Rx, SUB \$0,Rx, etc, AND \$-1,Rx, etc. The TST and CMP`	`ADD \$0,Rx, SUB \$0,Rx, etc, AND \$-1,Rx, etc. The TST and CMP`
`approaches won't stall future pipeline stages looking for the value`	`approaches won't stall future pipeline stages looking for the value`
`of Rx. \\\hline`	`of Rx. \\\hline`
`WAIT`	`{\tt WAIT}`
`& Or \$SLEEP,CC`	`& {\tt Or \$GIE | \$SLEEP,CC}`
`& Wait 'til interrupt. In an interrupts disabled context, this`	`& Wait until the next interrupt, then jump to supervisor/interrupt`
`becomes a HALT instruction.`	`mode.`
`\end{tabular}`	`\end{tabular}`
`\caption{Derived Instructions, continued}\label{tbl:derived-4}`	`\caption{Derived Instructions, continued}\label{tbl:derived-4}`
`\end{center}\end{table}`	`\end{center}\end{table}`
`\section{Pipeline Stages}`	`\section{Pipeline Stages}`
`As mentioned in the introduction, and highlighted in Fig.~\ref{fig:cpu},`	`As mentioned in the introduction, and highlighted in Fig.~\ref{fig:cpu},`
Line 1071...	Line 1094...
`In this case, the LOD instruction cannot start until the STO is finished.`	`In this case, the LOD instruction cannot start until the STO is finished.`
`With proper scheduling, it is possible to do something in the ALU while the`	`With proper scheduling, it is possible to do something in the ALU while the`
`memory unit is busy with the STO instruction, but otherwise this pipeline will`	`memory unit is busy with the STO instruction, but otherwise this pipeline will`
`stall waiting for it to complete.`	`stall waiting for it to complete.`

`Note that even though the Wishbone bus can support pipelined accesses at`	`The Zip CPU does have the capability of supporting pipelined memory access,`
`one access per clock, only the prefetch stage can take advantage of this.`	`but only under the following conditions: all accesses within the pipeline`
`Load and Store instructions are stuck at one wishbone cycle per instruction.`	`must all be reads or all be writes, all must use the same register for their`
	`address, and there can be no stalls or other instructions between pipelined`
	`memory access instructions. Further, the offset to memory must be increasing`
	`by one address each instruction. These conditions work well for saving or`
	`storing registers to the stack.`

`\item When waiting for a conditional memory read operation to complete`	`\item When waiting for a conditional memory read operation to complete`
`\begin{enumerate}`	`\begin{enumerate}`
`\item\ {\tt LOD.Z address,RA}`	`\item\ {\tt LOD.Z address,RA}`
`\item\ {\em (multiple stalls, bus dependent, 7 clocks best)}`	`\item\ {\em (multiple stalls, bus dependent, 7 clocks best)}`
Line 1233...	Line 1260...
`restart its transfer by writing the contents of its internal buffer and then`	`restart its transfer by writing the contents of its internal buffer and then`
`re-entering its read cycle again.`	`re-entering its read cycle again.`

`When coupled with a peripheral, the DMA controller can be configured to start`	`When coupled with a peripheral, the DMA controller can be configured to start`
`a memory copy on an interrupt line going high. Further, the controller can be`	`a memory copy on an interrupt line going high. Further, the controller can be`
`configured to issue reads from (or two) the same address instead of incrementing`	`configured to issue reads from (or to) the same address instead of incrementing`
`the address at each clock. The DMA completes once the total number of items`	`the address at each clock. The DMA completes once the total number of items`
`specified (not the transfer length) have been transferred.`	`specified (not the transfer length) have been transferred.`

`In each case, once the transfer is complete and the DMA unit returns to`	`In each case, once the transfer is complete and the DMA unit returns to`
`idle, the DMA will issue an interrupt.`	`idle, the DMA will issue an interrupt.`
Line 1400...	Line 1427...
`onto the user stack and then copying the resulting stack address`	`onto the user stack and then copying the resulting stack address`
`into the task's task structure, as shown in Tbl.~\ref{tbl:context-out}.`	`into the task's task structure, as shown in Tbl.~\ref{tbl:context-out}.`
`\begin{table}\begin{center}`	`\begin{table}\begin{center}`
`\begin{tabular}{ll}`	`\begin{tabular}{ll}`
`{\tt swap\_out:} \\`	`{\tt swap\_out:} \\`
`& {\tt MOV -15(uSP),R1} \\`	`& {\tt MOV -15(uSP),R5} \\`
`& {\tt STO R1,stack(R12)} \\`	`& {\tt STO R5,stack(R12)} \\`
`& {\tt MOV uPC,R0} \\`	`& {\tt MOV uR0,R0} \\`
`& {\tt STO R0,15(R1)} \\`	`& {\tt MOV uR1,R1} \\`
`& {\tt MOV uCC,R0} \\`	`& {\tt MOV uR2,R2} \\`
`& {\tt STO R0,14(R1)} \\`	`& {\tt MOV uR3,R3} \\`
	`& {\tt MOV uR4,R4} \\`
	`& {\tt STO R0,1(R5)} {\em ; Exploit memory pipelining: }\\`
	`& {\tt STO R1,2(R5)} {\em ; All instructions write to stack }\\`
	`& {\tt STO R2,3(R5)} {\em ; All offsets increment by one }\\`
	`& {\tt STO R3,4(R5)} {\em ; Longest pipeline is 5 cycles.}\\`
	`& {\tt STO R4,5(R5)} \\`
	`& \ldots {\em ; Need to repeat for all user registers} \\`
	`\iffalse`
	`& {\tt MOV uR5,R0} \\`
	`& {\tt MOV uR6,R1} \\`
	`& {\tt MOV uR7,R2} \\`
	`& {\tt MOV uR8,R3} \\`
	`& {\tt MOV uR9,R4} \\`
	`& {\tt STO R0,6(R5) }\\`
	`& {\tt STO R1,7(R5) }\\`
	`& {\tt STO R2,8(R5) }\\`
	`& {\tt STO R3,9(R5) }\\`
	`& {\tt STO R4,10(R5)} \\`
	`\fi`
	`& {\tt MOV uR10,R0} \\`
	`& {\tt MOV uR11,R1} \\`
	`& {\tt MOV uR12,R2} \\`
	`& {\tt MOV uCC,R3} \\`
	`& {\tt MOV uPC,R4} \\`
	`& {\tt STO R0,11(R5)}\\`
	`& {\tt STO R1,12(R5)}\\`
	`& {\tt STO R2,13(R5)}\\`
	`& {\tt STO R3,14(R5)}\\`
	`& {\tt STO R4,15(R5)} \\`
`& {\em ; We can skip storing the stack, uSP, since it'll be stored}\\`	`& {\em ; We can skip storing the stack, uSP, since it'll be stored}\\`
`& {\em ; elsewhere (in the task structure) }\\`	`& {\em ; elsewhere (in the task structure) }\\`
`& {\tt MOV uR13,R0} \\`
`& {\tt STO R0,13(R1)} \\`
`& \ldots {\em ; Need to repeat for all user registers} \\`
`& {\tt MOV uR0,R0} \\`
`& {\tt STO R0,1(R1)} \\`
`\end{tabular}`	`\end{tabular}`
`\caption{Example Storing User Task Context}\label{tbl:context-out}`	`\caption{Example Storing User Task Context}\label{tbl:context-out}`
`\end{center}\end{table}`	`\end{center}\end{table}`
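The store sequence in Tbl.~\ref{tbl:context-out} is arranged to satisfy the pipelined memory access conditions: every access is a write, every access uses the same address register, and the offsets increase by one. As a rough illustration only, the Python sketch below checks a run of memory instructions against those conditions. The names {\tt MemOp} and {\tt can\_pipeline} are hypothetical and are not part of any Zip CPU tool. The sketch also assumes that the instructions passed in are adjacent, with no stalls or other instructions between them.

```python
from typing import NamedTuple, Sequence


class MemOp(NamedTuple):
    """One memory instruction (illustrative model, not a Zip CPU data type)."""
    kind: str    # "LOD" for a read, "STO" for a write
    base: str    # register supplying the address, e.g. "R5"
    offset: int  # word offset added to the base register


def can_pipeline(ops: Sequence[MemOp]) -> bool:
    """Check a run of adjacent memory ops against the stated conditions."""
    if not ops:
        return False
    first = ops[0]
    for prev, cur in zip(ops, ops[1:]):
        if cur.kind != first.kind:          # all reads or all writes
            return False
        if cur.base != first.base:          # same address register throughout
            return False
        if cur.offset != prev.offset + 1:   # offsets increase by exactly one
            return False
    return True
```

Under these assumptions, the five stores {\tt STO R0,1(R5)} through {\tt STO R4,5(R5)} pass the check, while a run that mixes in a {\tt LOD}, changes the address register, or skips an offset does not.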
`For the sake of discussion, we assume the supervisor maintains a`	`For the sake of discussion, we assume the supervisor maintains a`
`pointer to the current task's structure in supervisor register`	`pointer to the current task's structure in supervisor register`
Line 1507...	Line 1558...
`back off of the stack to run this task. An example of this is`	`back off of the stack to run this task. An example of this is`
`shown in Tbl.~\ref{tbl:context-in},`	`shown in Tbl.~\ref{tbl:context-in},`
`\begin{table}\begin{center}`	`\begin{table}\begin{center}`
`\begin{tabular}{ll}`	`\begin{tabular}{ll}`
`{\tt swap\_in:} \\`	`{\tt swap\_in:} \\`
`& {\tt LOD stack(R12),R1} \\`	`& {\tt LOD stack(R12),R5} \\`

Line 46...

\documentclass{gqtekspec}

\documentclass{gqtekspec}

\project{Zip CPU}

\project{Zip CPU}

\title{Specification}

\title{Specification}

\author{Dan Gisselquist, Ph.D.}

\author{Dan Gisselquist, Ph.D.}

\email{dgisselq (at) opencores.org}

\email{dgisselq (at) opencores.org}

\revision{Rev.~0.4}

\revision{Rev.~0.5}

\definecolor{webred}{rgb}{0.2,0,0}

\definecolor{webred}{rgb}{0.2,0,0}

\definecolor{webgreen}{rgb}{0,0.2,0}

\definecolor{webgreen}{rgb}{0,0.2,0}

\usepackage[dvips,ps2pdf,colorlinks=true,

\usepackage[dvips,ps2pdf,colorlinks=true,

        anchorcolor=black,pagecolor=webgreen,pdfpagelabels,hypertexnames,

        anchorcolor=black,pagecolor=webgreen,pdfpagelabels,hypertexnames,

        pdfauthor={Dan Gisselquist},

        pdfauthor={Dan Gisselquist},

Line 74...

You should have received a copy of the GNU General Public License along

You should have received a copy of the GNU General Public License along

with this program.  If not, see \hbox{<http://www.gnu.org/licenses/>} for a

with this program.  If not, see \hbox{<http://www.gnu.org/licenses/>} for a

copy.

copy.

\end{license}

\end{license}

\begin{revisionhistory}

\begin{revisionhistory}

0.5 & 9/29/2015 & Gisselquist & Added pipelined memory access discussion.\\\hline

0.4 & 9/19/2015 & Gisselquist & Added DMA controller, improved stall information, and self--assessment info.\\\hline

0.4 & 9/19/2015 & Gisselquist & Added DMA controller, improved stall information, and self--assessment info.\\\hline

0.3 & 8/22/2015 & Gisselquist & First completed draft\\\hline

0.3 & 8/22/2015 & Gisselquist & First completed draft\\\hline

0.2 & 8/19/2015 & Gisselquist & Still Draft, more complete \\\hline

0.2 & 8/19/2015 & Gisselquist & Still Draft, more complete \\\hline

0.1 & 8/17/2015 & Gisselquist & Incomplete First Draft \\\hline

0.1 & 8/17/2015 & Gisselquist & Incomplete First Draft \\\hline

\end{revisionhistory}

\end{revisionhistory}

Line 409...

Line 410...

The tenth bit is a trap bit.  It is set whenever the user requests a soft

The tenth bit is a trap bit.  It is set whenever the user requests a soft

interrupt, and cleared on any return to userspace command.  This allows the

interrupt, and cleared on any return to userspace command.  This allows the

supervisor, in supervisor mode, to determine whether it got to supervisor

supervisor, in supervisor mode, to determine whether it got to supervisor

mode from a trap or from an external interrupt or both.

mode from a trap or from an external interrupt or both.

These status register bits are summarized in Tbl.~\ref{tbl:ccbits}.

\begin{table}

\begin{center}

\begin{tabular}{l|l}

Bit & Meaning \\\hline

9 & Soft trap, set on a trap from user mode, cleared when returning to user mode\\\hline

8 & (Reserved for) Floating point enable \\\hline

7 & Halt on break, to support an external debugger \\\hline

6 & Step, single step the CPU in user mode\\\hline

5 & GIE, or Global Interrupt Enable \\\hline

4 & Sleep \\\hline

3 & V, or overflow bit.\\\hline

2 & N, or negative bit.\\\hline

1 & C, or carry bit.\\\hline

0 & Z, or zero bit. \\\hline

\end{tabular}

\caption{Condition Code / Status Register Bits}\label{tbl:ccbits}

\end{center}\end{table}
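As an informal aid to reading Tbl.~\ref{tbl:ccbits}, the following Python sketch lists the names of the bits set in a status register value. The short names in {\tt CC\_BITS} and the function {\tt decode\_cc} are labels chosen for this example only; they are not defined by the CPU or its assembler.

```python
from typing import List

# Bit positions follow the status register table; the names are illustrative.
CC_BITS = {
    0: "Z",      # zero
    1: "C",      # carry
    2: "N",      # negative
    3: "V",      # overflow
    4: "SLEEP",
    5: "GIE",    # global interrupt enable
    6: "STEP",   # single step in user mode
    7: "BREAK",  # halt on break, for an external debugger
    8: "FPEN",   # reserved for floating point enable
    9: "TRAP",   # soft trap
}


def decode_cc(value: int) -> List[str]:
    """Return the names of the status bits set in value, lowest bit first."""
    return [name for bit, name in sorted(CC_BITS.items()) if value & (1 << bit)]
```

For instance, a value with bits 0, 5 and 9 set decodes to the zero flag, GIE and the trap bit, which is the pattern a supervisor would inspect to tell a trap from an external interrupt.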

\section{Conditional Instructions}

\section{Conditional Instructions}

Most, although not quite all, instructions may be conditionally executed.  From

Most, although not quite all, instructions may be conditionally executed.  From

the four condition code flags, eight conditions are defined.  These are shown

the four condition code flags, eight conditions are defined.  These are shown

in Tbl.~\ref{tbl:conditions}.

in Tbl.~\ref{tbl:conditions}.

\begin{table}

\begin{table}

Line 544...

Line 564...

\begin{tabular}{|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|c|}\hline

\begin{tabular}{|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|c|}\hline

\rowcolor[gray]{0.85}

\rowcolor[gray]{0.85}

Op Code & \multicolumn{8}{c|}{31\ldots24} & \multicolumn{8}{c|}{23\ldots 16}

Op Code & \multicolumn{8}{c|}{31\ldots24} & \multicolumn{8}{c|}{23\ldots 16}

        & \multicolumn{8}{c|}{15\ldots 8} & \multicolumn{8}{c|}{7\ldots 0}

        & \multicolumn{8}{c|}{15\ldots 8} & \multicolumn{8}{c|}{7\ldots 0}

        & Sets CC? \\\hline\hline

        & Sets CC? \\\hline\hline

CMP(Sub) & \multicolumn{4}{l|}{4'h0}

{\tt CMP(Sub)} & \multicolumn{4}{l|}{4'h0}

                & \multicolumn{4}{l|}{D. Reg}

                & \multicolumn{4}{l|}{D. Reg}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{21}{l|}{Operand B}

                & \multicolumn{21}{l|}{Operand B}

                & Yes \\\hline

                & Yes \\\hline

TST(And) & \multicolumn{4}{l|}{4'h1}

{\tt TST(And)} & \multicolumn{4}{l|}{4'h1}

                & \multicolumn{4}{l|}{D. Reg}

                & \multicolumn{4}{l|}{D. Reg}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{21}{l|}{Operand B}

                & \multicolumn{21}{l|}{Operand B}

        & Yes \\\hline

        & Yes \\\hline

MOV & \multicolumn{4}{l|}{4'h2}

{\tt MOV} & \multicolumn{4}{l|}{4'h2}

                & \multicolumn{4}{l|}{D. Reg}

                & \multicolumn{4}{l|}{D. Reg}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{3}{l|}{Cond.}

                & A-Usr

                & A-Usr

                & \multicolumn{4}{l|}{B-Reg}

                & \multicolumn{4}{l|}{B-Reg}

                & B-Usr

                & B-Usr

                & \multicolumn{15}{l|}{15'bit signed offset}

                & \multicolumn{15}{l|}{15'bit signed offset}

                & \\\hline

                & \\\hline

LODI & \multicolumn{4}{l|}{4'h3}

{\tt LODI} & \multicolumn{4}{l|}{4'h3}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{24}{l|}{24'bit Signed Immediate}

                & \multicolumn{24}{l|}{24'bit Signed Immediate}

                & \\\hline

                & \\\hline

NOOP & \multicolumn{4}{l|}{4'h4}

{\tt NOOP} & \multicolumn{4}{l|}{4'h4}

                & \multicolumn{4}{l|}{4'he}

                & \multicolumn{4}{l|}{4'he}

                & \multicolumn{24}{l|}{24'h00}

                & \multicolumn{24}{l|}{24'h00}

                & \\\hline

                & \\\hline

BREAK & \multicolumn{4}{l|}{4'h4}

{\tt BREAK} & \multicolumn{4}{l|}{4'h4}

                & \multicolumn{4}{l|}{4'he}

                & \multicolumn{4}{l|}{4'he}

                & \multicolumn{24}{l|}{24'h01}

                & \multicolumn{24}{l|}{24'h01}

                & \\\hline

                & \\\hline

{\em Reserved} & \multicolumn{4}{l|}{4'h4}

{\em Reserved} & \multicolumn{4}{l|}{4'h4}

                & \multicolumn{4}{l|}{4'he}

                & \multicolumn{4}{l|}{4'he}

                & \multicolumn{24}{l|}{24'bits, but not 0 or 1.}

                & \multicolumn{24}{l|}{24'bits, but not 0 or 1.}

                & \\\hline

                & \\\hline

LODIHI & \multicolumn{4}{l|}{4'h4}

{\tt LODIHI }& \multicolumn{4}{l|}{4'h4}

                & \multicolumn{4}{l|}{4'hf}

                & \multicolumn{4}{l|}{4'hf}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{3}{l|}{Cond.}

                & 1'b1

                & 1'b1

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{16}{l|}{16-bit Immediate}

                & \multicolumn{16}{l|}{16-bit Immediate}

                & \\\hline

                & \\\hline

LODILO & \multicolumn{4}{l|}{4'h4}

{\tt LODILO} & \multicolumn{4}{l|}{4'h4}

                & \multicolumn{4}{l|}{4'hf}

                & \multicolumn{4}{l|}{4'hf}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{3}{l|}{Cond.}

                & 1'b0

                & 1'b0

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{16}{l|}{16-bit Immediate}

                & \multicolumn{16}{l|}{16-bit Immediate}

                & \\\hline

                & \\\hline

16-b MPYU & \multicolumn{4}{l|}{4'h4}

16-b {\tt MPYU} & \multicolumn{4}{l|}{4'h4}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{3}{l|}{Cond.}

                & 1'b0 & \multicolumn{4}{l|}{Reg}

                & 1'b0 & \multicolumn{4}{l|}{Reg}

                & \multicolumn{16}{l|}{16-bit Offset}

                & \multicolumn{16}{l|}{16-bit Offset}

                & Yes \\\hline

                & Yes \\\hline

16-b MPYU(I) & \multicolumn{4}{l|}{4'h4}

16-b {\tt MPYU}(I) & \multicolumn{4}{l|}{4'h4}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{3}{l|}{Cond.}

                & 1'b0 & \multicolumn{4}{l|}{4'hf}

                & 1'b0 & \multicolumn{4}{l|}{4'hf}

                & \multicolumn{16}{l|}{16-bit Offset}

                & \multicolumn{16}{l|}{16-bit Offset}

                & Yes \\\hline

                & Yes \\\hline

16-b MPYS & \multicolumn{4}{l|}{4'h4}

16-b {\tt MPYS} & \multicolumn{4}{l|}{4'h4}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{3}{l|}{Cond.}

                & 1'b1 & \multicolumn{4}{l|}{Reg}

                & 1'b1 & \multicolumn{4}{l|}{Reg}

                & \multicolumn{16}{l|}{16-bit Offset}

                & \multicolumn{16}{l|}{16-bit Offset}

                & Yes \\\hline

                & Yes \\\hline

16-b MPYS(I) & \multicolumn{4}{l|}{4'h4}

16-b {\tt MPYS}(I) & \multicolumn{4}{l|}{4'h4}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{3}{l|}{Cond.}

                & 1'b1 & \multicolumn{4}{l|}{4'hf}

                & 1'b1 & \multicolumn{4}{l|}{4'hf}

                & \multicolumn{16}{l|}{16-bit Offset}

                & \multicolumn{16}{l|}{16-bit Offset}

                & Yes \\\hline

                & Yes \\\hline

ROL & \multicolumn{4}{l|}{4'h5}

{\tt ROL} & \multicolumn{4}{l|}{4'h5}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{21}{l|}{Operand B, truncated to low order 5 bits}

                & \multicolumn{21}{l|}{Operand B, truncated to low order 5 bits}

                & \\\hline

                & \\\hline

LOD & \multicolumn{4}{l|}{4'h6}

{\tt LOD} & \multicolumn{4}{l|}{4'h6}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{4}{l|}{R. Reg}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{21}{l|}{Operand B address}

                & \multicolumn{21}{l|}{Operand B address}

                & \\\hline

                & \\\hline

STO & \multicolumn{4}{l|}{4'h7}

{\tt STO} & \multicolumn{4}{l|}{4'h7}

                & \multicolumn{4}{l|}{D. Reg}

                & \multicolumn{4}{l|}{D. Reg}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{3}{l|}{Cond.}

                & \multicolumn{21}{l|}{Operand B address}

                & \multicolumn{21}{l|}{Operand B address}

                & \\\hline

                & \\\hline

SUB & \multicolumn{4}{l|}{4'h8}

{\tt SUB} & \multicolumn{4}{l|}{4'h8}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{21}{l|}{Operand B}

        &       \multicolumn{21}{l|}{Operand B}

        & Yes \\\hline

        & Yes \\\hline

AND & \multicolumn{4}{l|}{4'h9}

{\tt AND} & \multicolumn{4}{l|}{4'h9}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{21}{l|}{Operand B}

        &       \multicolumn{21}{l|}{Operand B}

        & Yes \\\hline

        & Yes \\\hline

ADD & \multicolumn{4}{l|}{4'ha}

{\tt ADD} & \multicolumn{4}{l|}{4'ha}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{21}{l|}{Operand B}

        &       \multicolumn{21}{l|}{Operand B}

        & Yes \\\hline

        & Yes \\\hline

OR & \multicolumn{4}{l|}{4'hb}

{\tt OR} & \multicolumn{4}{l|}{4'hb}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{21}{l|}{Operand B}

        &       \multicolumn{21}{l|}{Operand B}

        & Yes \\\hline

        & Yes \\\hline

XOR & \multicolumn{4}{l|}{4'hc}

{\tt XOR} & \multicolumn{4}{l|}{4'hc}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{21}{l|}{Operand B}

        &       \multicolumn{21}{l|}{Operand B}

        & Yes \\\hline

        & Yes \\\hline

LSL/ASL & \multicolumn{4}{l|}{4'hd}

{\tt LSL/ASL} & \multicolumn{4}{l|}{4'hd}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}

        &       \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}

        & Yes \\\hline

        & Yes \\\hline

ASR & \multicolumn{4}{l|}{4'he}

{\tt ASR} & \multicolumn{4}{l|}{4'he}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}

        &       \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}

        & Yes \\\hline

        & Yes \\\hline

LSR & \multicolumn{4}{l|}{4'hf}

{\tt LSR} & \multicolumn{4}{l|}{4'hf}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{4}{l|}{R. Reg}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{3}{l|}{Cond.}

        &       \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}

        &       \multicolumn{21}{l|}{Operand B, imm. truncated to 6 bits}

        & Yes \\\hline

        & Yes \\\hline

\end{tabular}

\end{tabular}

Line 690...

Line 710...

the Zip CPU.  Many of these instructions will have assembly equivalents,

the Zip CPU.  Many of these instructions will have assembly equivalents,

such as the branch instructions, to facilitate working with the CPU.

such as the branch instructions, to facilitate working with the CPU.

\begin{table}\begin{center}

\begin{table}\begin{center}

\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline

\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline

Mapped & Actual  & Notes \\\hline

Mapped & Actual  & Notes \\\hline

ABS Rx

{\tt ABS Rx}

        & \parbox[t]{1.5in}{TST -1,Rx\\NEG.LT Rx}

        & \parbox[t]{1.5in}{\tt TST -1,Rx\\NEG.LT Rx}

        & Absolute value, depends upon derived NEG.\\\hline

        & Absolute value, depends upon derived NEG.\\\hline

\parbox[t]{1.4in}{ADD Ra,Rx\\ADDC Rb,Ry}

\parbox[t]{1.4in}{\tt ADD Ra,Rx\\ADDC Rb,Ry}

        & \parbox[t]{1.5in}{Add Ra,Rx\\ADD.C \$1,Ry\\Add Rb,Ry}

        & \parbox[t]{1.5in}{\tt Add Ra,Rx\\ADD.C \$1,Ry\\Add Rb,Ry}

        & Add with carry \\\hline

        & Add with carry \\\hline

BRA.Cond +/-\$Addr

{\tt BRA.Cond +/-\$Addr}

        & \hbox{MOV.cond \$Addr+PC,PC}

        & \hbox{\tt MOV.cond \$Addr+PC,PC}

        & Branch or jump on condition.  Works for 15--bit

        & Branch or jump on condition.  Works for 15--bit

                signed address offsets.\\\hline

                signed address offsets.\\\hline

BRA.Cond +/-\$Addr

{\tt BRA.Cond +/-\$Addr}

        & \parbox[t]{1.5in}{LDI \$Addr,Rx \\ ADD.cond Rx,PC}

        & \parbox[t]{1.5in}{\tt LDI \$Addr,Rx \\ ADD.cond Rx,PC}

        & Branch/jump on condition.  Works for

        & Branch/jump on condition.  Works for

        23 bit address offsets, but costs a register, an extra instruction,

        23 bit address offsets, but costs a register, an extra instruction,

        and sets the flags. \\\hline

        and sets the flags. \\\hline

BNC PC+\$Addr

{\tt BNC PC+\$Addr}

        & \parbox[t]{1.5in}{Test \$Carry,CC \\ MOV.Z PC+\$Addr,PC}

        & \parbox[t]{1.5in}{\tt Test \$Carry,CC \\ MOV.Z PC+\$Addr,PC}

        & Example of a branch on an unsupported

        & Example of a branch on an unsupported

                condition, in this case a branch on not carry \\\hline

                condition, in this case a branch on not carry \\\hline

BUSY & MOV \$-1(PC),PC & Execute an infinite loop \\\hline

{\tt BUSY } & {\tt MOV \$-1(PC),PC} & Execute an infinite loop \\\hline

CLRF.NZ Rx

{\tt CLRF.NZ Rx }

        & XOR.NZ Rx,Rx

        & {\tt XOR.NZ Rx,Rx}

        & Clear Rx, and flags, if the Z-bit is not set \\\hline

        & Clear Rx, and flags, if the Z-bit is not set \\\hline

CLR Rx

{\tt CLR Rx }

        & LDI \$0,Rx

        & {\tt LDI \$0,Rx}

        & Clears Rx, leaves flags untouched.  This instruction cannot be

        & Clears Rx, leaves flags untouched.  This instruction cannot be

                conditional. \\\hline

                conditional. \\\hline

EXCH.W Rx

{\tt EXCH.W Rx }

        & ROL \$16,Rx

        & {\tt ROL \$16,Rx}

        & Exchanges the top and bottom 16'bit words of Rx \\\hline

        & Exchanges the top and bottom 16'bit words of Rx \\\hline

HALT

{\tt HALT }

        & Or \$SLEEP,CC

        & {\tt Or \$SLEEP,CC}

        & Executed while in interrupt mode.  In user mode this is simply a

        & This only works when issued in interrupt/supervisor mode.  In user

        wait until interrupt instruction. \\\hline

        mode this is simply a wait until interrupt instruction. \\\hline

INT & LDI \$0,CC

{\tt INT } & {\tt LDI \$0,CC} &  \\\hline

        & Since we're using the CC register as a trap vector as well, this

{\tt IRET}

        executes TRAP \#0. \\\hline

        & {\tt OR \$GIE,CC}

IRET

        & Also known as an RTU instruction (Return to Userspace) \\\hline

        & OR \$GIE,CC

{\tt JMP R6+\$Addr}

        & Also an RTU instruction (Return to Userspace) \\\hline

        & {\tt MOV \$Addr(R6),PC}

JMP R6+\$Addr

        & MOV \$Addr(R6),PC

        & \\\hline

        & \\\hline

JSR PC+\$Addr

{\tt JSR PC+\$Addr}

        & \parbox[t]{1.5in}{SUB \$1,SP \\

        & \parbox[t]{1.5in}{\tt SUB \$1,SP \\

        MOV \$3+PC,R0 \\

        MOV \$3+PC,R0 \\

        STO R0,1(SP) \\

        STO R0,1(SP) \\

        MOV \$Addr+PC,PC \\

        MOV \$Addr+PC,PC \\

        ADD \$1,SP}

        ADD \$1,SP}

        & Jump to Subroutine. Note the required cleanup instruction after

        & Jump to Subroutine. Note the required cleanup instruction after

        returning.  This could easily be turned into a three instruction

        returning.  This could easily be turned into a three instruction

        operation, removing the preliminary stack instruction before and

        operation, removing the preliminary stack instruction before and

        the cleanup after, by adjusting how any stack frame was built for

        the cleanup after, by adjusting how any stack frame was built for

        this routine to include space at the top of the stack for the PC.

        this routine to include space at the top of the stack for the PC.

        Note also that jumping to a subroutine costs a copy register, {\tt R0}

        in this case.

        \\\hline

        \\\hline

JSR PC+\$Addr

{\tt JSR PC+\$Addr  }

        & \parbox[t]{1.5in}{MOV \$3+PC,R12 \\ MOV \$addr+PC,PC}

        & \parbox[t]{1.5in}{\tt MOV \$3+PC,R12 \\ MOV \$addr+PC,PC}

        &This is the high speed

        &This is the high speed

        version of a subroutine call, necessitating a register to hold the

        version of a subroutine call, necessitating a register to hold the

        last PC address.  In its favor, this method doesn't suffer the

        last PC address.  In its favor, this method doesn't suffer the

        mandatory memory access of the other approach. \\\hline

        mandatory memory access of the other approach. \\\hline

LDI.l \$val,Rx

{\tt LDI.l \$val,Rx }

        & \parbox[t]{1.5in}{LDIHI (\$val$>>$16)\&0x0ffff, Rx \\

        & \parbox[t]{1.8in}{\tt LDIHI (\$val$>>$16)\&0x0ffff, Rx \\

                        LDILO (\$val \& 0x0ffff)}

                        LDILO (\$val\&0x0ffff),Rx}

        & Sadly, there's not enough instruction

        & Sadly, there's not enough instruction

                space to load a complete immediate value into any register.

                space to load a complete immediate value into any register.

                Therefore, fully loading any register takes two cycles.

                Therefore, fully loading any register takes two cycles.

                The LDIHI (load immediate high) and LDILO (load immediate low)

                The LDIHI (load immediate high) and LDILO (load immediate low)

                instructions have been created to facilitate this. \\\hline

                instructions have been created to facilitate this. \\\hline

Line 765...

Line 785...

\caption{Derived Instructions}\label{tbl:derived-1}

\caption{Derived Instructions}\label{tbl:derived-1}

\end{center}\end{table}

\end{center}\end{table}
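The {\tt LDI.l} entry above splits a 32-bit value into two 16-bit halves, one for {\tt LDIHI} and one for {\tt LDILO}. The Python sketch below works through that arithmetic. The helper names {\tt split\_immediate} and {\tt ldi\_l} are hypothetical, and the text it emits is only meant to resemble the operand order shown in the table, not to match the assembler's exact syntax.

```python
from typing import List, Tuple


def split_immediate(val: int) -> Tuple[int, int]:
    """Split a 32-bit value into its (high, low) 16-bit halves."""
    val &= 0xFFFFFFFF           # treat negative inputs as 32-bit two's complement
    hi = (val >> 16) & 0xFFFF   # operand for LDIHI
    lo = val & 0xFFFF           # operand for LDILO
    return hi, lo


def ldi_l(val: int, reg: str) -> List[str]:
    """Render the two-instruction expansion as illustrative assembly text."""
    hi, lo = split_immediate(val)
    return [f"LDIHI 0x{hi:04x},{reg}", f"LDILO 0x{lo:04x},{reg}"]
```

Loading {\tt 0x12345678} into a register would therefore take an {\tt LDIHI} with {\tt 0x1234} followed by an {\tt LDILO} with {\tt 0x5678}.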

\begin{table}\begin{center}

\begin{table}\begin{center}

\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline

\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline

Mapped & Actual  & Notes \\\hline

Mapped & Actual  & Notes \\\hline

LOD.b \$addr,Rx

{\tt LOD.b \$addr,Rx}

        & \parbox[t]{1.5in}{%

        & \parbox[t]{1.5in}{\tt %

        LDI     \$addr,Ra \\

        LDI     \$addr,Ra \\

        LDI     \$addr,Rb \\

        LDI     \$addr,Rb \\

        LSR     \$2,Ra \\

        LSR     \$2,Ra \\

        AND     \$3,Rb \\

        AND     \$3,Rb \\

        LOD     (Ra),Rx \\

        LOD     (Ra),Rx \\

Line 786...

Line 806...

        all other addresses in this document are 32-bit wordlength addresses.

        all other addresses in this document are 32-bit wordlength addresses.

        For this reason,

        For this reason,

        we needed to drop the bottom two bits.  This also limits the address

        we needed to drop the bottom two bits.  This also limits the address

        space of character accesses using this method from 16 MB down to 4MB.}

        space of character accesses using this method from 16 MB down to 4MB.}

                \\\hline

                \\\hline

\parbox[t]{1.5in}{LSL \$1,Rx\\ LSLC \$1,Ry}

\parbox[t]{1.5in}{\tt LSL \$1,Rx\\ LSLC \$1,Ry}

        & \parbox[t]{1.5in}{LSL \$1,Ry \\

        & \parbox[t]{1.5in}{\tt LSL \$1,Ry \\

        LSL \$1,Rx \\

        LSL \$1,Rx \\

        OR.C \$1,Ry}

        OR.C \$1,Ry}

        & Logical shift left with carry.  Note that the

        & Logical shift left with carry.  Note that the

        instruction order is now backwards, to keep the conditions valid.

        instruction order is now backwards, to keep the conditions valid.

        That is, LSL sets the carry flag, so if we did this the other way

        That is, LSL sets the carry flag, so if we did this the other way

        with Rx before Ry, then the condition flag wouldn't have been right

        with Rx before Ry, then the condition flag wouldn't have been right

        for an OR correction at the end. \\\hline

        for an OR correction at the end. \\\hline

\parbox[t]{1.5in}{LSR \$1,Rx \\ LSRC \$1,Ry}

\parbox[t]{1.5in}{\tt LSR \$1,Rx \\ LSRC \$1,Ry}

        & \parbox[t]{1.5in}{CLR Rz \\

        & \parbox[t]{1.5in}{\tt CLR Rz \\

        LSR \$1,Ry \\

        LSR \$1,Ry \\

        LDIHI.C \$8000h,Rz \\

        LDIHI.C \$8000h,Rz \\

        LSR \$1,Rx \\

        LSR \$1,Rx \\

        OR Rz,Rx}

        OR Rz,Rx}

        & Logical shift right with carry \\\hline

        & Logical shift right with carry \\\hline

NEG Rx & \parbox[t]{1.5in}{XOR \$-1,Rx \\ ADD \$1,Rx} & \\\hline

{\tt NEG Rx} & \parbox[t]{1.5in}{\tt XOR \$-1,Rx \\ ADD \$1,Rx} & \\\hline

NEG.C Rx & \parbox[t]{1.5in}{MOV.C \$-1+Rx,Rx\\XOR.C \$-1,Rx} & \\\hline

{\tt NEG.C Rx} & \parbox[t]{1.5in}{\tt MOV.C \$-1+Rx,Rx\\XOR.C \$-1,Rx} & \\\hline

NOOP & NOOP & While there are many

{\tt NOOP} & {\tt NOOP} & While there are many

        operations that do nothing, such as MOV Rx,Rx, or OR \$0,Rx, these

        operations that do nothing, such as MOV Rx,Rx, or OR \$0,Rx, these

        operations have consequences in that they might stall the bus if

        operations have consequences in that they might stall the bus if

        Rx isn't ready yet.  For this reason, we have a dedicated NOOP

        Rx isn't ready yet.  For this reason, we have a dedicated NOOP

        instruction. \\\hline

        instruction. \\\hline

NOT Rx & XOR \$-1,Rx & \\\hline

{\tt NOT Rx } & {\tt XOR \$-1,Rx } & \\\hline

POP Rx

{\tt POP Rx }

        & \parbox[t]{1.5in}{LOD \$1(SP),Rx \\ ADD \$1,SP}

        & \parbox[t]{1.5in}{\tt LOD \$1(SP),Rx \\ ADD \$1,SP}

        & Note

        & Note

        that for interrupt purposes, one can never depend upon the value at

        that for interrupt purposes, one can never depend upon the value at

        (SP).  Hence you read from it, then increment it, lest having

        (SP).  Hence you read from it, then increment it, lest having

        incremented it first something then comes along and writes to that

        incremented it first something then comes along and writes to that

        value before you can read the result. \\\hline

        value before you can read the result. \\\hline

\end{tabular}

\end{tabular}

\caption{Derived Instructions, continued}\label{tbl:derived-2}

\caption{Derived Instructions, continued}\label{tbl:derived-2}

\end{center}\end{table}

\end{center}\end{table}

\begin{table}\begin{center}

\begin{table}\begin{center}

\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline

\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline

PUSH Rx

{\tt PUSH Rx}

        & \parbox[t]{1.5in}{SUB \$1,SP \\

        & \parbox[t]{1.5in}{SUB \$1,SP \\

        STO Rx,\$1(SP)}

        STO Rx,\$1(SP)}

        & \\\hline

        & Note that for pipelined operation, it helps to coalesce all the

PUSH Rx-Ry

        {\tt SUB}'s into one command, and place the {\tt STO}'s right

        & \parbox[t]{1.5in}{SUB \$n,SP \\

        after each other.\\\hline

{\tt PUSH Rx-Ry}

        & \parbox[t]{1.5in}{\tt SUB \$n,SP \\

        STO Rx,\$n(SP) \\

        STO Rx,\$n(SP) \\

        \ldots \\

        \ldots \\

        STO Ry,\$1(SP)}

        STO Ry,\$1(SP)}

        & Multiple pushes at once only need the single subtract from the

        & Multiple pushes at once only need the single subtract from the

        stack pointer.  This derived instruction is analogous to a similar one

        stack pointer.  This derived instruction is analogous to a similar one

        on the Motorola 68k architecture, although the Zip Assembler

        on the Motorola 68k architecture, although the Zip Assembler

        does not support this instruction (yet).\\\hline

        does not support this instruction (yet).  This instruction

RESET

        also supports pipelined memory access.\\\hline

        & \parbox[t]{1in}{STO \$1,\$watchdog(R12)\\NOOP\\NOOP}

{\tt RESET}

        & \parbox[t]{3in}{This depends upon the peripheral base address being

        & \parbox[t]{1in}{\tt STO \$1,\$watchdog(R12)\\NOOP\\NOOP}

        & This depends upon the peripheral base address being

        in R12.

        in R12.

        Another opportunity might be to jump to the reset address from within

        Another opportunity might be to jump to the reset address from within

        supervisor mode.}\\\hline

        supervisor mode.\\\hline

RET & \parbox[t]{1.5in}{LOD \$1(SP),PC}

{\tt RET} & \parbox[t]{1.5in}{\tt LOD \$1(SP),PC}

        & Note that this depends upon the calling context to clean up the

        & Note that this depends upon the calling context to clean up the

        stack, as outlined for the JSR instruction.  \\\hline

        stack, as outlined for the JSR instruction.  \\\hline

RET & MOV R12,PC

{\tt RET} & {\tt MOV R12,PC}

        & This is the high(er) speed version, that doesn't touch the stack.

        & This is the high(er) speed version, that doesn't touch the stack.

        As such, it doesn't suffer a stall on memory read/write to the stack.

        As such, it doesn't suffer a stall on memory read/write to the stack.

        \\\hline

        \\\hline

STEP Rr,Rt

{\tt STEP Rr,Rt}

        & \parbox[t]{1.5in}{LSR \$1,Rr \\ XOR.C Rt,Rr}

        & \parbox[t]{1.5in}{\tt LSR \$1,Rr \\ XOR.C Rt,Rr}

        & Step a Galois implementation of a Linear Feedback Shift Register, Rr,

        & Step a Galois implementation of a Linear Feedback Shift Register, Rr,

                using taps Rt \\\hline

                using taps Rt \\\hline

STO.b Rx,\$addr

{\tt STO.b Rx,\$addr}

        & \parbox[t]{1.5in}{%

        & \parbox[t]{1.5in}{\tt %

        LDI \$addr,Ra \\

        LDI \$addr,Ra \\

        LDI \$addr,Rb \\

        LDI \$addr,Rb \\

        LSR \$2,Ra \\

        LSR \$2,Ra \\

        AND \$3,Rb \\

        AND \$3,Rb \\

        SUB \$32,Rb \\

        SUB \$32,Rb \\

        LOD (Ra),Ry \\

        LOD (Ra),Ry \\

        AND \$0ffh,Rx \\

        AND \$0ffh,Rx \\

        AND \$-0ffh,Ry \\

        AND \~\$0ffh,Ry \\

        ROL Rb,Rx \\

        ROL Rb,Rx \\

        OR Rx,Ry \\

        OR Rx,Ry \\

        STO Ry,(Ra) }

        STO Ry,(Ra) }

        & \parbox[t]{3in}{This CPU and its bus are {\em not} optimized

        & \parbox[t]{3in}{This CPU and its bus are {\em not} optimized

        for byte-wise operations.

        for byte-wise operations.

Line 875...

Line 898...

        byte-wise address, whereas in all of our other examples it is a

        byte-wise address, whereas in all of our other examples it is a

        32-bit word address. This also limits the address space

        32-bit word address. This also limits the address space

        of character accesses from 16 MB down to 4 MB.

        of character accesses from 16 MB down to 4 MB.

        Further, this instruction implies a byte ordering,

        Further, this instruction implies a byte ordering,

        such as big or little endian.} \\\hline

        such as big or little endian.} \\\hline

SWAP Rx,Ry

{\tt SWAP Rx,Ry }

        & \parbox[t]{1.5in}{

        & \parbox[t]{1.5in}{\tt

        XOR Ry,Rx \\

        XOR Ry,Rx \\

        XOR Rx,Ry \\

        XOR Rx,Ry \\

        XOR Ry,Rx}

        XOR Ry,Rx}

        & While no extra registers are needed, this example

        & While no extra registers are needed, this example

        does take three clocks. \\\hline

        does take three clocks. \\\hline

TRAP \#X

{\tt TRAP \#X}

        & \parbox[t]{1.5in}{LDI \$x,R0 \\ AND ~\$GIE,CC }

        & \parbox[t]{1.5in}{\tt LDI \$x,R0 \\ AND \~\$GIE,CC }

        & This works because whenever a user lowers the \$GIE flag, it sets

        & This works because whenever a user lowers the \$GIE flag, it sets

        a TRAP bit within the CC register.  Therefore, upon entering the

        a TRAP bit within the CC register.  Therefore, upon entering the

        supervisor state, the CPU only need check this bit to know that it

        supervisor state, the CPU only need check this bit to know that it

        got there via a TRAP.  The trap could be made conditional by making

        got there via a TRAP.  The trap could be made conditional by making

        the LDI and the AND conditional.  In that case, the assembler would

        the LDI and the AND conditional.  In that case, the assembler would

Line 896...

Line 919...

\end{tabular}

\end{tabular}

\caption{Derived Instructions, continued}\label{tbl:derived-3}

\caption{Derived Instructions, continued}\label{tbl:derived-3}

\end{center}\end{table}

\end{center}\end{table}
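As an illustration of the {\tt STEP Rr,Rt} entry above, the following is a minimal Python sketch of one Galois LFSR step.  It is a behavioral model only, not the CPU's implementation.  It assumes that {\tt LSR \$1,Rr} leaves the bit shifted out of {\tt Rr} in the carry flag, and that the {\tt .C} condition executes the {\tt XOR} only when carry is set.  The function name {\tt lfsr\_step} and the example state and tap values are hypothetical, chosen only for the illustration.

```python
MASK32 = 0xFFFFFFFF  # registers are modeled as 32-bit values


def lfsr_step(state: int, taps: int) -> int:
    """One Galois LFSR step, modeled on: LSR $1,Rr ; XOR.C Rt,Rr."""
    carry = state & 1              # assumed: LSR moves the low bit into carry
    state = (state >> 1) & MASK32  # LSR $1,Rr
    if carry:                      # assumed: .C means "execute if carry set"
        state ^= taps & MASK32     # XOR.C Rt,Rr
    return state


if __name__ == "__main__":
    # Example values only: a 16-bit register with tap mask 0xB400.
    print(hex(lfsr_step(0xACE1, 0xB400)))  # → 0xe270
```

When the low bit of the state is zero, the step reduces to a plain right shift; the tap word is folded in only on the steps where a one is shifted out.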

\begin{table}\begin{center}

\begin{table}\begin{center}

\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline

\begin{tabular}{p{1.4in}p{1.5in}p{3in}}\\\hline

TST Rx

{\tt TST Rx}

        & TST \$-1,Rx

        & {\tt TST \$-1,Rx}

        & Set the condition codes based upon Rx.  Could also do a CMP \$0,Rx,

        & Set the condition codes based upon Rx.  Could also do a CMP \$0,Rx,

        ADD \$0,Rx, SUB \$0,Rx, AND \$-1,Rx, etc.  The TST and CMP

        ADD \$0,Rx, SUB \$0,Rx, AND \$-1,Rx, etc.  The TST and CMP

        approaches won't stall future pipeline stages looking for the value

        approaches won't stall future pipeline stages looking for the value

        of Rx. \\\hline

        of Rx. \\\hline

WAIT

{\tt WAIT}

        & OR \$SLEEP,CC

        & {\tt OR \$GIE | \$SLEEP,CC}

        & Wait 'til interrupt.  In an interrupts disabled context, this

        & Wait until the next interrupt, then jump to supervisor/interrupt

        becomes a HALT instruction.

        mode.

\end{tabular}

\end{tabular}

\caption{Derived Instructions, continued}\label{tbl:derived-4}

\caption{Derived Instructions, continued}\label{tbl:derived-4}

\end{center}\end{table}

\end{center}\end{table}

\section{Pipeline Stages}

\section{Pipeline Stages}

As mentioned in the introduction, and highlighted in Fig.~\ref{fig:cpu},

As mentioned in the introduction, and highlighted in Fig.~\ref{fig:cpu},

Line 1071...

Line 1094...

In this case, the LOD instruction cannot start until the STO is finished.

In this case, the LOD instruction cannot start until the STO is finished.

With proper scheduling, it is possible to do something in the ALU while the

With proper scheduling, it is possible to do something in the ALU while the

memory unit is busy with the STO instruction, but otherwise this pipeline will

memory unit is busy with the STO instruction, but otherwise this pipeline will

stall waiting for it to complete.

stall waiting for it to complete.

Note that even though the Wishbone bus can support pipelined accesses at

The Zip CPU does have the capability of supporting pipelined memory access,

one access per clock, only the prefetch stage can take advantage of this.

but only under the following conditions: all accesses within the pipeline

Load and Store instructions are stuck at one wishbone cycle per instruction.

must all be reads or all be writes, all must use the same register for their

address, and there can be no stalls or other instructions between pipelined

memory access instructions.  Further, the offset to memory must be increasing

by one address each instruction.  These conditions work well for saving

registers to, or restoring registers from, the stack.

\item When waiting for a conditional memory read operation to complete

\item When waiting for a conditional memory read operation to complete

\begin{enumerate}

\begin{enumerate}

\item\ {\tt LOD.Z address,RA}

\item\ {\tt LOD.Z address,RA}

\item\ {\em (multiple stalls, bus dependent, 7 clocks best)}

\item\ {\em (multiple stalls, bus dependent, 7 clocks best)}

Line 1233...

Line 1260...

restart its transfer by writing the contents of its internal buffer and then

restart its transfer by writing the contents of its internal buffer and then

re-entering its read cycle again.

re-entering its read cycle again.

When coupled with a peripheral, the DMA controller can be configured to start

When coupled with a peripheral, the DMA controller can be configured to start

a memory copy on an interrupt line going high.  Further, the controller can be

a memory copy on an interrupt line going high.  Further, the controller can be

configured to issue reads from (or two) the same address instead of incrementing

configured to issue reads from (or to) the same address instead of incrementing

the address at each clock.  The DMA completes once the total number of items

the address at each clock.  The DMA completes once the total number of items

specified (not the transfer length) have been transferred.

specified (not the transfer length) have been transferred.

In each case, once the transfer is complete and the DMA unit returns to

In each case, once the transfer is complete and the DMA unit returns to

idle, the DMA will issue an interrupt.

idle, the DMA will issue an interrupt.

Line 1400...

Line 1427...

        onto the user stack and then copying the resulting stack address

        onto the user stack and then copying the resulting stack address

        into the task's task structure, as shown in Tbl.~\ref{tbl:context-out}.

        into the task's task structure, as shown in Tbl.~\ref{tbl:context-out}.

\begin{table}\begin{center}

\begin{table}\begin{center}

\begin{tabular}{ll}

\begin{tabular}{ll}

{\tt swap\_out:} \\

{\tt swap\_out:} \\

&        {\tt MOV -15(uSP),R1} \\

&        {\tt MOV -15(uSP),R5} \\

&        {\tt STO R1,stack(R12)} \\

&        {\tt STO R5,stack(R12)} \\

&        {\tt MOV uPC,R0} \\

&        {\tt MOV uR0,R0} \\

&        {\tt STO R0,15(R1)} \\

&        {\tt MOV uR1,R1} \\

&        {\tt MOV uCC,R0} \\

&        {\tt MOV uR2,R2} \\

&        {\tt STO R0,14(R1)} \\

&        {\tt MOV uR3,R3} \\

&        {\tt MOV uR4,R4} \\

&        {\tt STO R0,1(R5)} {\em ; Exploit memory pipelining: }\\

&        {\tt STO R1,2(R5)} {\em ; All instructions write to stack }\\

&        {\tt STO R2,3(R5)} {\em ; All offsets increment by one }\\

&        {\tt STO R3,4(R5)} {\em ; Longest pipeline is 5 cycles.}\\

&        {\tt STO R4,5(R5)} \\

        & \ldots {\em ; Need to repeat for all user registers} \\

\iffalse

&        {\tt MOV uR5,R0} \\

&        {\tt MOV uR6,R1} \\

&        {\tt MOV uR7,R2} \\

&        {\tt MOV uR8,R3} \\

&        {\tt MOV uR9,R4} \\

&        {\tt STO R0,6(R5) }\\

&        {\tt STO R1,7(R5) }\\

&        {\tt STO R2,8(R5) }\\

&        {\tt STO R3,9(R5) }\\

&        {\tt STO R4,10(R5)} \\

\fi

&        {\tt MOV uR10,R0} \\

&        {\tt MOV uR11,R1} \\

&        {\tt MOV uR12,R2} \\

&        {\tt MOV uCC,R3} \\

&        {\tt MOV uPC,R4} \\

&        {\tt STO R0,11(R5)}\\

&        {\tt STO R1,12(R5)}\\

&        {\tt STO R2,13(R5)}\\

&        {\tt STO R3,14(R5)}\\

&        {\tt STO R4,15(R5)} \\

&       {\em ; We can skip storing the stack, uSP, since it'll be stored}\\

&       {\em ; We can skip storing the stack, uSP, since it'll be stored}\\

&       {\em ; elsewhere (in the task structure) }\\

&       {\em ; elsewhere (in the task structure) }\\

&        {\tt MOV uR13,R0} \\

&        {\tt STO R0,13(R1)} \\

        & \ldots {\em ; Need to repeat for all user registers} \\

&        {\tt MOV uR0,R0} \\

&        {\tt STO R0,1(R1)} \\

\end{tabular}

\end{tabular}

\caption{Example Storing User Task Context}\label{tbl:context-out}

\caption{Example Storing User Task Context}\label{tbl:context-out}

\end{center}\end{table}

\end{center}\end{table}
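The grouping of stores in the table above follows the conditions for pipelined memory access: every access in a run must be of the same kind, use the same address register, follow the previous access with no other instruction in between, and use an offset one greater than the previous one.  The Python sketch below checks an instruction sequence against those conditions.  It is a hypothetical model for illustration only: the names {\tt MemOp} and {\tt pipelined\_runs} are not part of the Zip CPU tool chain, and the model ignores bus stalls and any limit on the length of a run.

```python
from typing import List, NamedTuple


class MemOp(NamedTuple):
    kind: str    # "LOD" or "STO"; anything else is a non-memory instruction
    base: str    # address register, e.g. "R5"
    offset: int  # immediate offset from the address register


def pipelined_runs(ops: List[MemOp]) -> List[int]:
    """Return the length of each run of back-to-back memory accesses.

    An access extends the current run only if it has the same kind and
    address register as the previous instruction and its offset is exactly
    one greater.  Any non-memory instruction ends the current run.
    """
    runs: List[int] = []
    prev = None  # previous instruction, if it was a memory access
    for op in ops:
        is_mem = op.kind in ("LOD", "STO")
        if (is_mem and prev is not None
                and op.kind == prev.kind
                and op.base == prev.base
                and op.offset == prev.offset + 1):
            runs[-1] += 1    # continues the current run
        elif is_mem:
            runs.append(1)   # starts a new run of length one
        prev = op if is_mem else None
    return runs


if __name__ == "__main__":
    stores = [MemOp("STO", "R5", n) for n in range(1, 6)]
    print(pipelined_runs(stores))  # → [5]
```

Under this model, five stores at offsets 1 through 5 from the same register form a single run, whereas a sequence that alternates {\tt MOV} and {\tt STO} yields only runs of length one.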

For the sake of discussion, we assume the supervisor maintains a

For the sake of discussion, we assume the supervisor maintains a

pointer to the current task's structure in supervisor register

pointer to the current task's structure in supervisor register

Line 1507...

Line 1558...

        back off of the stack to run this task.  An example of this is

        back off of the stack to run this task.  An example of this is

        shown in Tbl.~\ref{tbl:context-in},

        shown in Tbl.~\ref{tbl:context-in},

\begin{table}\begin{center}

\begin{table}\begin{center}

\begin{tabular}{ll}

\begin{tabular}{ll}

{\tt swap\_in:} \\

{\tt swap\_in:} \\

&       {\tt LOD stack(R12),R1} \\

&       {\tt LOD stack(R12),R5} \\

&       {\tt MOV 15(R1),uSP} \\

&       {\tt MOV 15(R1),uSP} \\

&       {\tt LOD 15(R1),R0} \\

        & {\em ; Be sure to exploit the memory pipelining capability} \\

&       {\tt MOV R0,uPC} \\

&       {\tt LOD 1(R5),R0} \\

&       {\tt LOD 14(R1),R0} \\

&       {\tt LOD 2(R5),R1} \\

&       {\tt MOV R0,uCC} \\

&       {\tt LOD 3(R5),R2} \\

&       {\tt LOD 13(R1),R0} \\

&       {\tt LOD 4(R5),R3} \\

&       {\tt MOV R0,uR12} \\

&       {\tt LOD 5(R5),R4} \\

        & \ldots {\em ; Need to repeat for all user registers} \\

&       {\tt LOD 1(R1),R0} \\

&       {\tt MOV R0,uR0} \\

&       {\tt MOV R0,uR0} \\

&       {\tt MOV R1,uR1} \\

&       {\tt MOV R2,uR2} \\

&       {\tt MOV R3,uR3} \\

&       {\tt MOV R4,uR4} \\

        & \ldots {\em ; Need to repeat for all user registers} \\

&       {\tt LOD 11(R5),R0} \\

&       {\tt LOD 12(R5),R1} \\

&       {\tt LOD 13(R5),R2} \\

&       {\tt LOD 14(R5),R3} \\

&       {\tt LOD 15(R5),R4} \\

&       {\tt MOV R0,uR10} \\

&       {\tt MOV R1,uR11} \\

&       {\tt MOV R2,uR12} \\

&       {\tt MOV R3,uCC} \\

&       {\tt MOV R4,uPC} \\

&       {\tt BRA return\_to\_user} \\

&       {\tt BRA return\_to\_user} \\

\end{tabular}

\end{tabular}

\caption{Example Restoring User Task Context}\label{tbl:context-in}

\caption{Example Restoring User Task Context}\label{tbl:context-in}

\end{center}\end{table}

\end{center}\end{table}

        assuming as before that the task

        assuming as before that the task

Line 1714...

Line 1779...

The bit allocation of the control register is shown in Tbl.~\ref{tbl:dmacbits}.

The bit allocation of the control register is shown in Tbl.~\ref{tbl:dmacbits}.

\begin{table}\begin{center}

\begin{table}\begin{center}

\begin{bitlist}

\begin{bitlist}

31 & R & DMA Active\\\hline

31 & R & DMA Active\\\hline

30 & R & Wishbone error, transaction aborted (cleared on any write)\\\hline

30 & R & Wishbone error, transaction aborted.  This bit is cleared the next time

        this register is written to.\\\hline

29 & R/W & Set to '1' to prevent the controller from incrementing the source address, '0' for normal memory copy. \\\hline

29 & R/W & Set to '1' to prevent the controller from incrementing the source address, '0' for normal memory copy. \\\hline

28 & R/W & Set to '0' to prevent the controller from incrementing the

28 & R/W & Set to '1' to prevent the controller from incrementing the

        destination address, '0' for normal memory copy. \\\hline

        destination address, '0' for normal memory copy. \\\hline

27 \ldots 16 & W & The DMA Key.  Write 12'hfed to these bits to start

27 \ldots 16 & W & The DMA Key.  Write 12'hfed to these bits to start

        any DMA transfer.  \\\hline

        any DMA transfer.  \\\hline

27 & R & Always reads '0', to force the deliberate writing of the key. \\\hline

27 & R & Always reads '0', to force the deliberate writing of the key. \\\hline

26 \ldots 16 & R & Indicates the number of items in the transfer buffer that

26 \ldots 16 & R & Indicates the number of items in the transfer buffer that

Line 1793...

Line 1859...

uSP & 29 & 32 & R/W & User Stack Pointer\\\hline

uSP & 29 & 32 & R/W & User Stack Pointer\\\hline

uCC & 30 & 32 & R/W & User Condition Code Register \\\hline

uCC & 30 & 32 & R/W & User Condition Code Register \\\hline

uPC & 31 & 32 & R/W & User Program Counter\\\hline

uPC & 31 & 32 & R/W & User Program Counter\\\hline

PIC & 32 & 32 & R/W & Primary Interrupt Controller \\\hline

PIC & 32 & 32 & R/W & Primary Interrupt Controller \\\hline

WDT & 33 & 32 & R/W & Watchdog Timer\\\hline

WDT & 33 & 32 & R/W & Watchdog Timer\\\hline

CCHE & 34 & 32 & R/W & Manual Cache Controller\\\hline

CTRIC & 35 & 32 & R/W & Secondary Interrupt Controller\\\hline

CTRIC & 35 & 32 & R/W & Secondary Interrupt Controller\\\hline

TMRA & 36 & 32 & R/W & Timer A\\\hline

TMRA & 36 & 32 & R/W & Timer A\\\hline

TMRB & 37 & 32 & R/W & Timer B\\\hline

TMRB & 37 & 32 & R/W & Timer B\\\hline

TMRC & 38 & 32 & R/W & Timer C\\\hline

TMRC & 38 & 32 & R/W & Timer C\\\hline

JIFF & 39 & 32 & R/W & Jiffies peripheral\\\hline

JIFF & 39 & 32 & R/W & Jiffies peripheral\\\hline

Line 1807...

Line 1872...

MICNT & 43 & 32 & R/W & Master instruction counter\\\hline

MICNT & 43 & 32 & R/W & Master instruction counter\\\hline

UTASK & 44 & 32 & R/W & User task clock counter\\\hline

UTASK & 44 & 32 & R/W & User task clock counter\\\hline

UMSTL & 45 & 32 & R/W & User memory stall counter\\\hline

UMSTL & 45 & 32 & R/W & User memory stall counter\\\hline

UPSTL & 46 & 32 & R/W & User Pre-Fetch Stall counter\\\hline

UPSTL & 46 & 32 & R/W & User Pre-Fetch Stall counter\\\hline

UICNT & 47 & 32 & R/W & User instruction counter\\\hline

UICNT & 47 & 32 & R/W & User instruction counter\\\hline

DMACMD & 48 & 32 & R/W & DMA command and status register\\\hline

DMALEN & 49 & 32 & R/W & DMA transfer length\\\hline

DMARD & 50 & 32 & R/W & DMA read address\\\hline

DMAWR & 51 & 32 & R/W & DMA write address\\\hline

\end{reglist}

\end{reglist}

\caption{Debug Register Addresses}\label{tbl:dbgaddrs}

\caption{Debug Register Addresses}\label{tbl:dbgaddrs}

\end{center}\end{table}

\end{center}\end{table}

Primarily, these ``registers'' include access to the entire CPU register

Primarily, these ``registers'' include access to the entire CPU register

set, as well as the internal peripherals.  To read one of these registers

set, as well as the internal peripherals.  To read one of these registers

Line 2113...

Line 2182...

        (yet) support a compiler. The standard C library is an even longer

        (yet) support a compiler. The standard C library is an even longer

        shot. My dream of having binutils and gcc support has not been

        shot. My dream of having binutils and gcc support has not been

        realized and at this rate may not be realized. (I've been intimidated

        realized and at this rate may not be realized. (I've been intimidated

        by the challenge every time I've looked through those codes.)

        by the challenge every time I've looked through those codes.)

\iffalse

\item While the Wishbone Bus (B4) supports a pipelined mode with single cycle

\item While the Wishbone Bus (B4) supports a pipelined mode with single cycle

        execution, the Zip CPU is unable to exploit this parallelism. Instead,

        execution, the Zip CPU is unable to exploit this parallelism. Instead,

        apart from the DMA and the pipelined prefetch, all loads and stores

        apart from the DMA and the pipelined prefetch, all loads and stores

        are single wishbone bus operations requiring a minimum of 3 clocks.

        are single wishbone bus operations requiring a minimum of 3 clocks.

        (In practice, this has turned into 7-clocks.)

        (In practice, this has turned into 7-clocks.)

        % Addressed, 20150929

\iffalse

\item There is no control over whether or not an instruction sets the

\item There is no control over whether or not an instruction sets the

        condition codes--certain instructions always set the condition codes,

        condition codes--certain instructions always set the condition codes,

        other instructions never set them. This effectively limits conditional

        other instructions never set them. This effectively limits conditional

        instructions to a single instruction only (with two or more

        instructions to a single instruction only (with two or more

        instructions as an exception), as the first instruction that sets

        instructions as an exception), as the first instruction that sets

Line 2171...

Line 2241...

        the process accounting registers are anything but light weight, why

        the process accounting registers are anything but light weight, why

        keep them?  Why not instead make some compile flags that just turn them

        keep them?  Why not instead make some compile flags that just turn them

        off, keeping the CPU lightweight?  The same holds for the prefetch

        off, keeping the CPU lightweight?  The same holds for the prefetch

        cache.

        cache.

\item The `{\tt .V}' condition was never used in any code other than my test

        code.  Suggest changing it to a `{\tt .LE}' condition, which seems

        to be more useful.

\item {\bf Consider a more traditional Instruction Cache.}  The current

        pipelined instruction cache just reads a window of memory into

        its cache.  If the CPU leaves that window, the entire cache is

        invalidated.  A more traditional cache, however, might allow

        common subroutines to stay within the cache without invalidating the

        entire cache structure.

\iffalse

\iffalse

\item {\bf Adjust the Zip CPU so that conditional instructions do not set

\item {\bf Adjust the Zip CPU so that conditional instructions do not set

        flags}, although they may explicitly set condition codes if writing

        flags}, although they may explicitly set condition codes if writing

        to the CC register.

        to the CC register.

        This is a simple change to the core, and may show up in new releases.

        This is a simple change to the core, and may show up in new releases.

        % Fixed, 20150918

        % Fixed, 20150918

\fi

\item The `{\tt .V}' condition was never used in any code other than my test

        code.  Suggest changing it to a `{\tt .LE}' condition, which seems

        to be more useful.

\iffalse

\item Add in an {\bf unpredictable branch delay slot}, so that on any branch

\item Add in an {\bf unpredictable branch delay slot}, so that on any branch

        the delay slot may or may not be executed before the branch.

        the delay slot may or may not be executed before the branch.

        Instructions that do not depend upon the branch, and that should be

        Instructions that do not depend upon the branch, and that should be

        executed were the branch not taken, could be placed into the delay

        executed were the branch not taken, could be placed into the delay

        slot. Thus, if the branch isn't taken, we wouldn't suffer the stall,

        slot. Thus, if the branch isn't taken, we wouldn't suffer the stall,

Line 2224...

Line 2299...

        for one cycle before starting again, these extra cycles add up.

        for one cycle before starting again, these extra cycles add up.

        It should be possible to tell the prefetch stage to give up the bus

        It should be possible to tell the prefetch stage to give up the bus

        as soon as the decoder knows the instruction will need the bus.

        as soon as the decoder knows the instruction will need the bus.

        Indeed, if done in the decode stage, this might drop the seven cycle

        Indeed, if done in the decode stage, this might drop the seven cycle

        access down by two cycles.

        access down by two cycles.

        % FIXED: 20150918

        % FIXED: 20150918

\fi

\item {\bf Consider a more traditional Instruction Cache.}  The current

        pipelined instruction cache just reads a window of memory into

        its cache.  If the CPU leaves that window, the entire cache is

        invalidated.  A more traditional cache, however, might allow

        common subroutines to stay within the cache without invalidating the

        entire cache structure.

\iffalse

\item {\bf Very Long Instruction Word (VLIW).}  Now, to speed up operation, I

\item {\bf Very Long Instruction Word (VLIW).}  Now, to speed up operation, I

        propose that the Zip CPU instruction set be modified towards a Very

        propose that the Zip CPU instruction set be modified towards a Very

        Long Instruction Word (VLIW) implementation. In this implementation,

        Long Instruction Word (VLIW) implementation. In this implementation,

        an instruction word may contain either one or two separate

        an instruction word may contain either one or two separate

        instructions. The first instruction would take up the high order bits,

        instructions. The first instruction would take up the high order bits,


[/] [zipcpu/] [trunk/] [doc/] [src/] [spec.tex] - Diff between revs 37 and 39