URL https://opencores.org/ocsvn/zipcpu/zipcpu/trunk

# Subversion Repositories zipcpu

## [/] [zipcpu/] [trunk/] [doc/] [src/] [spec.tex] - Diff between revs 92 and 139

Rev 92 Rev 139
Line 41... Line 41...
%% License:     GPL, v3, as defined and found on www.gnu.org,
%% License:     GPL, v3, as defined and found on www.gnu.org,
%%              http://www.gnu.org/licenses/gpl.html
%%              http://www.gnu.org/licenses/gpl.html
%%
%%
%%
%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%

%

%

% From TI about DSPs vs FPGAs:

%       www.ti.com/general/docs/video/foldersGallery.tsp?bkg=gray

%       &gpn=35145&familyid=1622&keyMatch=DSP Breaktime Episode Three

%       &tisearch=Search-EN-Everything&DCMP=leadership

%       &HQS=ep-pro-dsp-leadership-problog-150518-v-en

%

%       FPGA's are annoyingly faster, cheaper, and not quite as power hungry

%       as they used to be.

%

%       Why would you choose DSPs over FPGAs?  If you care about size,

%       if you care about power, or happen to have a complicated algorithm

%       that just isn't simply doing the same thing over and over

%

%       For complex algorithms that change over time.  Each have their strengths

%       sometimes you can use both.

%

%       "No assembly required" -- TI tools all C programming, very GUI based

%       environment, very little optimization by hand ...

%

%

% The FPGA's achilles heel: Reconfigurability.  It is very difficult, although

% I'm sure major vendors will tell you not impossible, to reconfigure an FPGA

% based upon the need to process time-sensitive data.  If you need one of two

% algorithms, both which will fit on the FPGA individually but not together,

% switching between them on the fly is next to impossible, whereas switching

% algorithm within a CPU is not difficult at all.  For example, imagine

% receiving a packet and needing to apply one of two data algorithms on the

% packet before sending it back out, and needing to do so fast.  If both

% algorithms don't fit in memory, where does the packet go when you need to

% swap one algorithm out for the other?  And what is the cost of that "context"

% swap?

%

%
\documentclass{gqtekspec}
\documentclass{gqtekspec}
\usepackage{import}
\usepackage{import}
\usepackage{bytefield}
\usepackage{bytefield}  % Install via apt-get install texlive-science
% \graphicspath{{../gfx}}
% \graphicspath{{../gfx}}
\project{Zip CPU}
\project{Zip CPU}
\title{Specification}
\title{Specification}
\author{Dan Gisselquist, Ph.D.}
\author{Dan Gisselquist, Ph.D.}
\email{dgisselq (at) opencores.org}
\email{dgisselq (at) opencores.org}
\revision{Rev.~0.8}
\revision{Rev.~0.9}
\definecolor{webred}{rgb}{0.5,0,0}
\definecolor{webred}{rgb}{0.5,0,0}
\definecolor{webgreen}{rgb}{0,0.4,0}
\definecolor{webgreen}{rgb}{0,0.4,0}
\usepackage[dvips,ps2pdf,colorlinks=true,
\usepackage[dvips,ps2pdf,colorlinks=true,
        anchorcolor=black,pdfpagelabels,hypertexnames,
        anchorcolor=black,pdfpagelabels,hypertexnames,
        pdfauthor={Dan Gisselquist},
        pdfauthor={Dan Gisselquist},
Line 82... Line 118...
You should have received a copy of the GNU General Public License along
You should have received a copy of the GNU General Public License along
with this program.  If not, see \hbox{<http://www.gnu.org/licenses/>} for a
with this program.  If not, see \hbox{<http://www.gnu.org/licenses/>} for a
copy.
copy.
\end{license}
\end{license}
\begin{revisionhistory}
\begin{revisionhistory}

0.9 & 4/20/2016 & Gisselquist & Modified ISA: LDIHI replaced with MPY, MPYU and MPYS replaced with MPYUHI, and MPYSHI respectively.  LOCK instruction now

permits an intermediate ALU operation. \\\hline
0.8 & 1/28/2016 & Gisselquist & Reduced complexity early branching \\\hline
0.8 & 1/28/2016 & Gisselquist & Reduced complexity early branching \\\hline
0.7 & 12/22/2015 & Gisselquist & New Instruction Set Architecture \\\hline
0.7 & 12/22/2015 & Gisselquist & New Instruction Set Architecture \\\hline
0.6 & 11/17/2015 & Gisselquist & Added graphics to illustrate pipeline discussion.\\\hline
0.6 & 11/17/2015 & Gisselquist & Added graphics to illustrate pipeline discussion.\\\hline
0.5 & 9/29/2015 & Gisselquist & Added pipelined memory access discussion.\\\hline
0.5 & 9/29/2015 & Gisselquist & Added pipelined memory access discussion.\\\hline
0.4 & 9/19/2015 & Gisselquist & Added DMA controller, improved stall information, and self--assessment info.\\\hline
0.4 & 9/19/2015 & Gisselquist & Added DMA controller, improved stall information, and self--assessment info.\\\hline
Line 650... Line 688...
\end{bytefield}
\end{bytefield}
\caption{Zip Instruction Set Format}\label{fig:iset-format}
\caption{Zip Instruction Set Format}\label{fig:iset-format}
\end{center}\end{figure}
\end{center}\end{figure}
The basic format is that some operation, defined by the OpCode, is applied
The basic format is that some operation, defined by the OpCode, is applied
if a condition, Cnd, is true in order to produce a result which is placed in
if a condition, Cnd, is true in order to produce a result which is placed in
the destination register, or DR.  The Load 23--bit signed immediate instruction
the destination register, or DR.  The load 23--bit signed immediate instruction
is different in that it requires no conditions, and uses only a 4-bit opcode.
(LDI) is different in that it accepts no conditions, and uses only a 4-bit

opcode.
 
 
This is actually a second version of instruction set definition, given certain
This is actually a second version of instruction set definition, given certain
lessons learned.  For example, the original instruction set had the following
lessons learned.  For example, the original instruction set had the following
problems:
problems:
\begin{enumerate}
\begin{enumerate}
Line 665... Line 704...
        require extra logic to use.
        require extra logic to use.
\item The carveouts for instructions such as NOOP and LDIHI/LDILO required
\item The carveouts for instructions such as NOOP and LDIHI/LDILO required
        extra logic to process.
        extra logic to process.
\item The instruction set wasn't very compact.  One bus operation was required
\item The instruction set wasn't very compact.  One bus operation was required
        for every instruction.
        for every instruction.

\item While the CPU supported multiplies, they were only 16x16 bit multiplies.
\end{enumerate}
\end{enumerate}
This second version was designed with two criteria.  The first was that the
This second version was designed with two criteria.  The first was that the
new instruction set needed to be compatible, at the assembly language level,
new instruction set needed to be compatible, at the assembly language level,
with the previous instruction set.  Thus, it must be able to support all of
with the previous instruction set.  Thus, it must be able to support all of
the previous mnemonics and more.  This was achieved with the sole exception
the previous mnemonics and more.  This was achieved with the sole exception
Line 690... Line 730...
to interrupt mode in between the two instructions.  Likewise a new job given
to interrupt mode in between the two instructions.  Likewise a new job given
to the assembler is that of automatically packing as many instructions as
to the assembler is that of automatically packing as many instructions as
possible into the VLIW format.  Where necessary to place both VLIW instructions
possible into the VLIW format.  Where necessary to place both VLIW instructions
on the same line, they will be separated by a vertical bar.
on the same line, they will be separated by a vertical bar.
 
 

One belated change to the instruction set violates some of the above

principles.  This latter instruction set change replaced the {\tt LDIHI}

instruction with a 32--bit multiply instruction {\tt MPY}, and then exchanged

the two 16--bit multiply instructions {\tt MPYU} and {\tt MPYS} for

{\tt MPYUHI} and {\tt MPYSHI} respectively.  This creates a 32--bit

multiply capability, while removing the 16--bit multiply that wasn't very

useful. Further, the {\tt LDIHI} instruction was being used primarily by the

assembler and linker to create a 32--bit load immediate pair of instructions.

This instruction pair, {\tt LDIHI} followed by {\tt LDILO}, was

replaced with an equivalent pair, {\tt BREV} followed by {\tt LDILO},

save that linking has been made more complicated in the process.
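
The {\tt BREV}/{\tt LDILO} pairing can be illustrated with a small model.  This is only a sketch under stated assumptions: the helper names are hypothetical, {\tt BREV} is taken to reverse all 32 bits of its operand, and {\tt LDILO} is assumed to replace the low 16 bits while preserving the high 16.  Under those assumptions the upper half of the constant must be bit--reversed before it is placed in the {\tt BREV} immediate, which is presumably why linking became more involved.

```python
MASK32 = 0xFFFFFFFF

def brev32(x):
    """Reverse the bit order of a 32-bit word (assumed model of BREV)."""
    x &= MASK32
    return int(f"{x:032b}"[::-1], 2)

def ldilo(reg, imm16):
    """Assumed model of LDILO: replace the low 16 bits, keep the high 16."""
    return (reg & 0xFFFF0000) | (imm16 & 0xFFFF)

def load_imm32(value):
    """Emulate a 32-bit load immediate as BREV followed by LDILO."""
    value &= MASK32
    # Pre-reverse the high half so that it lands in the low 16 bits of the
    # BREV immediate (the step an assembler or linker would have to do).
    brev_imm = brev32(value & 0xFFFF0000)
    reg = brev32(brev_imm)      # BREV brev_imm,Rx  -> high half restored
    return ldilo(reg, value)    # LDILO low16,Rx    -> low half filled in
```

For example, {\tt load\_imm32(0xDEADBEEF)} reconstructs the full constant from the two modeled instructions.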

 
\section{Instruction OpCodes}
\section{Instruction OpCodes}
With a 5--bit opcode field, there are 32--possible instructions as shown in
With a 5--bit opcode field, there are 32--possible instructions as shown in
Tbl.~\ref{tbl:iset-opcodes}.
Tbl.~\ref{tbl:iset-opcodes}.
\begin{table}\begin{center}
\begin{table}\begin{center}
\begin{tabular}{|l|l|l|c|} \hline \rowcolor[gray]{0.85}
\begin{tabular}{|l|l|l|c|} \hline \rowcolor[gray]{0.85}
Line 704... Line 756...
5'h03 & OR  & Bitwise Or & Y \\\cline{1-3}
5'h03 & OR  & Bitwise Or & Y \\\cline{1-3}
5'h04 & XOR & Bitwise Exclusive Or &   \\\cline{1-3}
5'h04 & XOR & Bitwise Exclusive Or &   \\\cline{1-3}
5'h05 & LSR & Logical Shift Right &   \\\cline{1-3}
5'h05 & LSR & Logical Shift Right &   \\\cline{1-3}
5'h06 & LSL & Logical Shift Left &   \\\cline{1-3}
5'h06 & LSL & Logical Shift Left &   \\\cline{1-3}
5'h07 & ASR & Arithmetic Shift Right &   \\\hline
5'h07 & ASR & Arithmetic Shift Right &   \\\hline
5'h08 & LDIHI & Load Immediate High & N \\\cline{1-3}
5'h08 & MPY & 32x32 bit multiply & Y \\\hline
5'h09 & LDILO & Load Immediate Low &  \\\hline
5'h09 & LDILO & Load Immediate Low & N\\\hline
5'h0a & MPYU & Unsigned 16--bit Multiply &  \\\cline{1-3}
5'h0a & MPYUHI & Upper 32 of 64 bits from an unsigned 32x32 multiply &  \\\cline{1-3}
5'h0b & MPYS & Signed 16--bit Multiply & Y \\\cline{1-3}
5'h0b & MPYSHI & Upper 32 of 64 bits from a signed 32x32 multiply & Y \\\cline{1-3}
5'h0c & BREV & Bit Reverse &  \\\cline{1-3}
5'h0c & BREV & Bit Reverse &  \\\cline{1-3}
5'h0d & POPC& Population Count &  \\\cline{1-3}
5'h0d & POPC& Population Count &  \\\cline{1-3}
5'h0e & ROL & Rotate left &   \\\hline
5'h0e & ROL & Rotate left &   \\\hline
5'h0f & MOV & Move register & N \\\hline
5'h0f & MOV & Move register & N \\\hline
5'h10 & CMP & Compare & Y \\\cline{1-3}
5'h10 & CMP & Compare & Y \\\cline{1-3}
Line 727... Line 779...
5'h1b & FPDIV & Floating point divide &   \\\cline{1-3}
5'h1b & FPDIV & Floating point divide &   \\\cline{1-3}
5'h1c & FPCVT & Convert integer to floating point &   \\\cline{1-3}
5'h1c & FPCVT & Convert integer to floating point &   \\\cline{1-3}
5'h1d & FPINT & Convert to integer &   \\\hline
5'h1d & FPINT & Convert to integer &   \\\hline
5'h1e & & {\em Reserved for future use} &\\\hline
5'h1e & & {\em Reserved for future use} &\\\hline
5'h1f & & {\em Reserved for future use} &\\\hline
5'h1f & & {\em Reserved for future use} &\\\hline

5'h18 & & NOOP (A-register = PC)&\\\cline{1-3}

5'h19 & & BREAK (A-register = PC)& N\\\cline{1-3}

5'h1a & & LOCK (A-register = PC)&\\\hline
\end{tabular}
\end{tabular}
\caption{Zip CPU OpCodes}\label{tbl:iset-opcodes}
\caption{Zip CPU OpCodes}\label{tbl:iset-opcodes}
\end{center}\end{table}
\end{center}\end{table}
%
%
Of these opcodes, the {\tt BREV} and {\tt POPC} are experimental, and may be
Of these opcodes, the {\tt BREV} and {\tt POPC} are experimental, and may be
Line 751... Line 806...
3'h1 & {\tt .LT} & Less than ('N' set) \\
3'h1 & {\tt .LT} & Less than ('N' set) \\
3'h2 & {\tt .Z} & Only execute when 'Z' is set \\
3'h2 & {\tt .Z} & Only execute when 'Z' is set \\
3'h3 & {\tt .NZ} & Only execute when 'Z' is not set \\
3'h3 & {\tt .NZ} & Only execute when 'Z' is not set \\
3'h4 & {\tt .GT} & Greater than ('N' not set, 'Z' not set) \\
3'h4 & {\tt .GT} & Greater than ('N' not set, 'Z' not set) \\
3'h5 & {\tt .GE} & Greater than or equal ('N' not set, 'Z' irrelevant) \\
3'h5 & {\tt .GE} & Greater than or equal ('N' not set, 'Z' irrelevant) \\
3'h6 & {\tt .C} & Carry set\\
3'h6 & {\tt .C} & Carry set (Also known as less-than unsigned) \\
3'h7 & {\tt .V} & Overflow set\\
3'h7 & {\tt .V} & Overflow set\\
\end{tabular}
\end{tabular}
\caption{Conditions for conditional operand execution}\label{tbl:conditions}
\caption{Conditions for conditional operand execution}\label{tbl:conditions}
\end{center}\end{table}
\end{center}\end{table}
There is no condition code for less than or equal, not C or not V---there
There is no condition code for less than or equal, not C or not V---there
just wasn't enough space in 3--bits.  Conditioning on a non--supported
just wasn't enough space in 3--bits.  Conditioning on a non--supported
condition is still possible, but it will take an extra instruction and a
condition is still possible, but it will take an extra instruction and a
pipeline stall.  (Ex: \hbox{\em (Stall)}; \hbox{\tt TST \$4,CC;} \hbox{\tt
pipeline stall. (Ex: \hbox{\em (Stall)}; \hbox{\tt TST \$4,CC;} \hbox{\tt
STO.NZ R0,(R1)}) As an alternative, it is often possible to reverse the
STO.NZ R0,(R1)}) As an alternative, it is often possible to reverse the
condition, and thus recovering those extra two clocks.  Thus instead of
condition, and thus recovering those extra two clocks.  Thus instead of
\hbox{\tt CMP Rx,Ry;} \hbox{\tt BNV label} you can issue a
\hbox{\tt CMP Rx,Ry;} \hbox{\tt BNC label} you can issue a
\hbox{\tt CMP Ry,Rx;} \hbox{\tt BV label}.
\hbox{\tt CMP 1+Ry,Rx;} \hbox{\tt BC label}.
 
 
Conditionally executed instructions will not further adjust the
Conditionally executed instructions will not further adjust the
condition codes, with the exception of \hbox{\tt CMP} and \hbox{\tt TST}
condition codes, with the exception of \hbox{\tt CMP} and \hbox{\tt TST}
instructions.   Conditional \hbox{\tt CMP} or \hbox{\tt TST} instructions
instructions.   Conditional \hbox{\tt CMP} or \hbox{\tt TST} instructions
will adjust conditions whenever they are executed.  In this way,
will adjust conditions whenever they are executed.  In this way,
Line 801... Line 856...
\caption{VLIW Conditions}\label{tbl:vliw-conditions}
\caption{VLIW Conditions}\label{tbl:vliw-conditions}
\end{center}\end{table}
\end{center}\end{table}
Further, the first bit is given a special meaning.  If the first bit is set,
Further, the first bit is given a special meaning.  If the first bit is set,
the conditions apply to the second half of the instruction, otherwise the
the conditions apply to the second half of the instruction, otherwise the
conditions will only apply to the first half of a conditional instruction.
conditions will only apply to the first half of a conditional instruction.

Of course, the other conditions are still available by mingling the

non--VLIW instructions with VLIW instructions.
 
 
\section{Operand B}
\section{Operand B}
Many instruction forms have a 19-bit source ``Operand B'' associated with them.
Many instruction forms have a 19-bit source ``Operand B'' associated with them.
This ``Operand B'' is shown in Fig.~\ref{fig:iset-format} as part of the
This ``Operand B'' is shown in Fig.~\ref{fig:iset-format} as part of the
standard instructions.  This Operand B is either equal to a register plus a
standard instructions.  This Operand B is either equal to a register plus a
Line 848... Line 905...
removed from the realm of possibilities.  This means that the Zip CPU has no
removed from the realm of possibilities.  This means that the Zip CPU has no
native way of executing push, pop, return, or jump to subroutine operations.
native way of executing push, pop, return, or jump to subroutine operations.
Each of these instructions can be emulated with a set of instructions from the
Each of these instructions can be emulated with a set of instructions from the
existing set.
existing set.
 
 

\section{Modifying Conditions}

A quick look at the list of conditions supported by the Zip CPU and listed

in Tbl.~\ref{tbl:conditions} reveals that the Zip CPU does not have a full set

of conditions.  In particular, only one explicit unsigned condition is

supported.  Therefore, Tbl.~\ref{tbl:creating-conditions}

\begin{table}\begin{center}

\begin{tabular}{|l|l|l|}\hline

Original & Modified & Name \\\hline\hline

\parbox[t]{1.5in}{\tt CMP Rx,Ry\\BLE label} % If Ry <= Rx -> Ry < Rx+1

        & \parbox[t]{1.5in}{\tt CMP 1+Rx,Ry\\BLT label}

        & Less-than or equal (signed, {\tt Z} or {\tt N} set)\\[4mm]\hline

\parbox[t]{1.5in}{\tt CMP Rx,Ry\\BLEU label}

        & \parbox[t]{1.5in}{\tt CMP 1+Rx,Ry\\BC label}

        & Less-than or equal unsigned \\[4mm]\hline

\parbox[t]{1.5in}{\tt CMP Rx,Ry\\BGTU label}    % if (Ry > Rx) -> Rx < Ry

        & \parbox[t]{1.5in}{\tt CMP Ry,Rx\\BC label}

        & Greater-than unsigned \\[4mm]\hline

\parbox[t]{1.5in}{\tt CMP Rx,Ry\\BGEU label}    % if (Ry >= Rx) -> Rx <= Ry -> Rx < Ry+1

        & \parbox[t]{1.5in}{\tt CMP 1+Ry,Rx\\BC label}

        & Greater-than or equal unsigned \\[4mm]\hline

\parbox[t]{1.5in}{\tt CMP A+Rx,Ry\\BGEU label} % if (Ry >= A+Rx)-> A+Rx <= Ry -> Rx < Ry+1-A

        & \parbox[t]{1.5in}{\tt CMP (1-A)+Ry,Rx\\BC label}

        & Greater-than or equal unsigned (with offset)\\[4mm]\hline

\parbox[t]{1.5in}{\tt CMP A,Ry\\BGEU label} % if (Ry >= A) -> A <= Ry -> A-1 < Ry

        & \parbox[t]{1.5in}{\tt LDI (A-1),Rx\\CMP Ry,Rx\\BC label}

        & Greater-than or equal comparison with a constant\\[4mm]\hline

\end{tabular}

\caption{Modifying conditions}\label{tbl:creating-conditions}

\end{center}\end{table}

shows examples of how these unsupported conditions can be created

simply by adjusting the compare instruction, for no extra cost in clocks.

Of course, if the compare originally had an immediate within it, that immediate

would need to be loaded into a register in order to do some of these compares.

This case is shown as the last case above.
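
The rewrites in Tbl.~\ref{tbl:creating-conditions} can be checked with a short model.  The sketch below is not taken from the CPU sources: it assumes that {\tt CMP B,A} computes {\tt A-B} on 32 bits and that the carry flag is set exactly when {\tt A} is less than {\tt B} as unsigned values, and all function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def cmp_carry(b, a):
    """Model `CMP b,a` (computes a - b); assumed: C set when a < b unsigned."""
    return (a & MASK32) < (b & MASK32)

# Each pair: the unsupported condition, then its rewritten form using BC.
def bleu_original(rx, ry):      # CMP Rx,Ry ; BLEU label
    return (ry & MASK32) <= (rx & MASK32)

def bleu_modified(rx, ry):      # CMP 1+Rx,Ry ; BC label
    return cmp_carry((rx + 1) & MASK32, ry)

def bgtu_original(rx, ry):      # CMP Rx,Ry ; BGTU label
    return (ry & MASK32) > (rx & MASK32)

def bgtu_modified(rx, ry):      # CMP Ry,Rx ; BC label
    return cmp_carry(ry, rx)

def bgeu_original(rx, ry):      # CMP Rx,Ry ; BGEU label
    return (ry & MASK32) >= (rx & MASK32)

def bgeu_modified(rx, ry):      # CMP 1+Ry,Rx ; BC label
    return cmp_carry((ry + 1) & MASK32, rx)
```

In this model the {\tt 1+R} operand wraps to zero when the register holds {\tt 0xFFFFFFFF}, so the rewritten {\tt BLEU} and {\tt BGEU} forms differ from the originals at that single value.  Whether the hardware behaves the same way is not stated here, so treat that edge case as something to verify.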

 
\section{Move Operands}
\section{Move Operands}
The previous set of operands would be perfect and complete, save only that
The previous set of operands would be perfect and complete, save only that
the CPU needs access to non--supervisory registers while in supervisory mode.
the CPU needs access to non--supervisory registers while in supervisory mode.
Therefore, the MOV instruction is special and offers access to these registers
Therefore, the MOV instruction is special and offers access to these registers
\ldots when in supervisory mode.  To keep the compiler simple, the extra bits
\ldots when in supervisory mode.  To keep the compiler simple, the extra bits
Line 873... Line 965...
Anything with the user bit set will be treated as a user register and displayed
Anything with the user bit set will be treated as a user register and displayed
special.  Since the CPU quietly ignores the supervisor bits while in user mode,
special.  Since the CPU quietly ignores the supervisor bits while in user mode,
anything marked as a user register will always be specific.
anything marked as a user register will always be specific.
 
 
\section{Multiply Operations}
\section{Multiply Operations}
The Zip CPU supports two Multiply operations, a 16x16 bit signed multiply
 
({\tt MPYS}) and a 16x16 bit unsigned multiply ({\tt MPYU}).  A 32--bit
The ZipCPU originally only supported 16x16 multiply operations.  GCC, however,
multiply, should it be desired, needs to be created via software from this
wanted 32x32-bit operations and building these from 16x16-bit multiplies
16x16 bit multiply.
is painful.  Therefore, the ZipCPU was modified to support 32x32-bit multiplies.

 

In particular, the ZipCPU supports three separate 32x32-bit multiply

instructions: {\tt MPY}, {\tt MPYUHI}, and {\tt MPYSHI}.  The first of these

produces the low 32-bits of a 32x32-bit multiply result.  The other two

produce the upper 32-bits.  Of these two, {\tt MPYUHI} produces the upper 32-bits

assuming the multiply was unsigned, whereas {\tt MPYSHI} assumes it was signed.

The multiply instructions execute independently of one another, although

the compiler may use them together.

 

In an effort to maintain single clock pipeline timing, all three of these

multiplies have been slowed down in logic.  Thus, depending upon the setting

of {\tt OPT\_MULTIPLY} within {\tt cpudefs.v}, the multiply instructions

will either 1)~cause an ILLEGAL instruction error, 2)~take one additional clock,

or 3)~take two additional clocks.
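
The results of the three multiply instructions can be modeled as follows.  This is a minimal sketch of the arithmetic only, with hypothetical function names; it says nothing about timing or about how the hardware computes the product.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mpy(a, b):
    """Low 32 bits of the product (identical for signed and unsigned)."""
    return ((a & MASK32) * (b & MASK32)) & MASK32

def mpyuhi(a, b):
    """Upper 32 bits of the 64-bit unsigned product."""
    return (((a & MASK32) * (b & MASK32)) >> 32) & MASK32

def mpyshi(a, b):
    """Upper 32 bits of the 64-bit signed product."""
    return ((to_signed32(a) * to_signed32(b)) >> 32) & MASK32
```

A full 64--bit product is then the pair formed by {\tt mpyuhi} (or {\tt mpyshi}) and {\tt mpy}, which is how a compiler might combine the otherwise independent instructions.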

 
 
 
\section{Divide Unit}
\section{Divide Unit}
The Zip CPU also has a divide unit which can be built alongside the ALU.
The Zip CPU also has a divide unit which can be built alongside the ALU.
This divide unit provides the Zip CPU with its first two instructions that
This divide unit provides the Zip CPU with another two instructions that
cannot be executed in a single cycle: {\tt DIVS}, or signed divide, and
cannot be executed in a single cycle: {\tt DIVS}, or signed divide, and
{\tt DIVU}, the unsigned divide.  These are both 32--bit divide instructions,
{\tt DIVU}, the unsigned divide.  These are both 32--bit divide instructions,
dividing one 32--bit number by another.  In this case, the Operand B field,
dividing one 32--bit number by another.  In this case, the Operand B field,
whether it be register or register plus immediate, constitutes the denominator,
whether it be register or register plus immediate, constitutes the denominator,
whereas the numerator is given by the other register.
whereas the numerator is given by the other register.
 
 
The Divide is also a multi--clock instruction.  While the divide is running,
The Divide is also a multi--clock instruction.  While the divide is running,
the ALU, memory unit, and floating point unit (if installed) will be idle.
the ALU, any memory loads, and the floating point unit (if installed) will be
Once the divide completes, other units may continue.
idle.  Once the divide completes, other units may continue.
 
 
Of course, divides can have errors: division by zero.  In the case of division
Of course, divides can have errors: division by zero.  In the case of division
by zero, an exception will be caused that will send the CPU either from
by zero, an exception will be caused that will send the CPU either from
user mode to supervisor mode, or halt the CPU if it is already in supervisor
user mode to supervisor mode, or halt the CPU if it is already in supervisor
mode.
mode.
 
 
\section{NOOP, BREAK, and Bus Lock Instruction}
\section{NOOP, BREAK, and Bus Lock Instruction}
Three instructions are not listed in the opcode list in
Three instructions within the opcode list in Tbl.~\ref{tbl:iset-opcodes}, are
Tbl.~\ref{tbl:iset-opcodes}, yet fit in the NOOP type instruction format of
somewhat special.  These are the {\tt NOOP}, {\tt Break}, and bus {\tt LOCK}
Fig.~\ref{fig:iset-format}.  These are the {\tt NOOP}, {\tt Break}, and
instructions.  These are encoded according to
bus {\tt LOCK} instructions.  These are encoded according to

Fig.~\ref{fig:iset-noop}, and have the following meanings:
Fig.~\ref{fig:iset-noop}, and have the following meanings:
\begin{figure}\begin{center}
\begin{figure}\begin{center}
\begin{bytefield}[endianness=big]{32}
\begin{bytefield}[endianness=big]{32}
\bitheader{0-31}\\
\bitheader{0-31}\\
\begin{leftwordgroup}{NOOP}
\begin{leftwordgroup}{NOOP}
\bitbox{1}{0}\bitbox{3}{3'h7}\bitbox{1}{}
\bitbox{1}{0}\bitbox{3}{3'h7}\bitbox{1}{}
        \bitbox{2}{11}\bitbox{3}{001}\bitbox{22}{Ignored} \\
        \bitbox{2}{11}\bitbox{3}{000}\bitbox{22}{Ignored} \\
\bitbox{1}{1}\bitbox{3}{3'h7}\bitbox{1}{}
\bitbox{1}{1}\bitbox{3}{3'h7}\bitbox{1}{}
        \bitbox{2}{11}\bitbox{3}{001}\bitbox{22}{---} \\
        \bitbox{2}{11}\bitbox{3}{000}\bitbox{22}{---} \\
\bitbox{1}{1}\bitbox{9}{---}\bitbox{3}{---}\bitbox{5}{---}
\bitbox{1}{1}\bitbox{9}{---}\bitbox{3}{---}\bitbox{5}{---}
        \bitbox{3}{3'h7}\bitbox{1}{}\bitbox{2}{11}\bitbox{3}{001}
        \bitbox{3}{3'h7}\bitbox{1}{}\bitbox{2}{11}\bitbox{3}{001}
        \bitbox{5}{Ignored}
        \bitbox{5}{Ignored}
                \end{leftwordgroup} \\
                \end{leftwordgroup} \\
\begin{leftwordgroup}{BREAK}
\begin{leftwordgroup}{BREAK}
\bitbox{1}{0}\bitbox{3}{3'h7}
\bitbox{1}{0}\bitbox{3}{3'h7}
                \bitbox{1}{}\bitbox{2}{11}\bitbox{3}{010}\bitbox{22}{Ignored}
                \bitbox{1}{}\bitbox{2}{11}\bitbox{3}{001}\bitbox{22}{Ignored}
                \end{leftwordgroup} \\
                \end{leftwordgroup} \\
\begin{leftwordgroup}{LOCK}
\begin{leftwordgroup}{LOCK}
\bitbox{1}{0}\bitbox{3}{3'h7}
\bitbox{1}{0}\bitbox{3}{3'h7}
                \bitbox{1}{}\bitbox{2}{11}\bitbox{3}{100}\bitbox{22}{Ignored}
                \bitbox{1}{}\bitbox{2}{11}\bitbox{3}{010}\bitbox{22}{Ignored}
                \end{leftwordgroup} \\
                \end{leftwordgroup} \\
\end{bytefield}
\end{bytefield}
\caption{NOOP/Break/LOCK Instruction Format}\label{fig:iset-noop}
\caption{NOOP/Break/LOCK Instruction Format}\label{fig:iset-noop}
\end{center}\end{figure}
\end{center}\end{figure}
 
 
Line 937... Line 1043...
The {\tt BREAK} instruction is useful for creating a debug instruction that
The {\tt BREAK} instruction is useful for creating a debug instruction that
will halt the CPU without executing.  If in user mode, depending upon the
will halt the CPU without executing.  If in user mode, depending upon the
setting of the break enable bit, it will either switch to supervisor mode or
setting of the break enable bit, it will either switch to supervisor mode or
halt the CPU--depending upon where the user wishes to do his debugging.
halt the CPU--depending upon where the user wishes to do his debugging.
 
 
Finally, the {\tt LOCK} instruction was added in order to make a test and
Finally, the {\tt LOCK} instruction was added in order to provide for
set multi--CPU operation possible.  Following a LOCK instruction, the next
atomic operations.  The {\tt LOCK} instruction only works in pipeline mode.
two instructions, if they are memory LOD/STO instructions, will execute without
It works by stalling the ALU pipeline stack until all prior stages are
dropping the wishbone {\tt CYC} line between the instructions.   Thus a
filled, and then it guarantees that once a bus cycle is started, the
{\tt LOCK} followed by {\tt LOD (Rx),Ry} and a {\tt STO Rz,(Rx)}, where Rz
wishbone {\tt CYC} line will remain asserted until the LOCK is deasserted.
is initially set, can be used to set an address while guaranteeing that Ry
This allows the execution of one instruction that was waiting in the load
was the value before setting the address to Rz.   This is a useful instruction
operands pipeline stage, and one instruction that was waiting in the
while trying to achieve concurrency among multiple CPU's.
instruction decode stage.  Further, if the instruction waiting in the decode

stage was a VLIW instruction, then it may be possible to execute a third

instruction.

 

This was originally written to implement an atomic test and set instruction,

such as a {\tt LOCK} followed by {\tt LOD (Rx),Ry} and a {\tt STO Rz,(Rx)},

where Rz is initially set.

 

Other sequences that use a VLIW instruction to combine a single ALU instruction

with a store, such as an atomic increment ({\tt LOCK}, {\tt LOD (Rx),Ry},

{\tt ADD 1,Ry}, {\tt STO Ry,(Rx)}), should be possible as well.  Many of these

combinations remain to be tested.
 
 
\section{Floating Point}
\section{Floating Point}
Although the Zip CPU does not (yet) have a floating point unit, the current
Although the Zip CPU does not (yet) have a floating point unit, the current
instruction set offers eight opcodes for floating point operations, and treats
instruction set offers eight opcodes for floating point operations, and treats
floating point exceptions like divide by zero errors.  Once this unit is built
floating point exceptions like divide by zero errors.  Once this unit is built
and integrated together with the rest of the CPU, the Zip CPU will support
and integrated together with the rest of the CPU, the Zip CPU will support
32--bit floating point instructions natively.  Any 64--bit floating point
32--bit floating point instructions natively.  Any 64--bit floating point
instructions will still need to be emulated in software.
instructions will still need to be emulated in software.
 
 

Until that time, or even afterwards if the floating point unit is not installed,

floating point instructions will trigger an illegal instruction exception,

which may be trapped and then implemented in software.

 
\section{Derived Instructions}
\section{Derived Instructions}
The Zip CPU supports many other common instructions, but not all of them
The Zip CPU supports many other common instructions, but not all of them
are single cycle instructions.  The derived instruction tables,
are single cycle instructions.  The derived instruction tables,
Tbls.~\ref{tbl:derived-1}, \ref{tbl:derived-2}, \ref{tbl:derived-3}
Tbls.~\ref{tbl:derived-1}, \ref{tbl:derived-2}, \ref{tbl:derived-3}
and~\ref{tbl:derived-4},
and~\ref{tbl:derived-4},
Line 1123... Line 1244...
        \\\hline
        \\\hline
{\tt STEP Rr,Rt}
{\tt STEP Rr,Rt}
        & \parbox[t]{1.5in}{\tt LSR \$1,Rr \\ XOR.C Rt,Rr}
        & \parbox[t]{1.5in}{\tt LSR \$1,Rr \\ XOR.C Rt,Rr}
        & Step a Galois implementation of a Linear Feedback Shift Register, Rr,
        & Step a Galois implementation of a Linear Feedback Shift Register, Rr,
                using taps Rt \\\hline
                using taps Rt \\\hline

%

%

{\tt SEX.b Rx }

        & \parbox[t]{1.5in}{\tt LSL 24,Rx \\ ASR 24,Rx}

        & Sign extend a byte into a full word.\\\hline

{\tt SEX.h Rx }

        & \parbox[t]{1.5in}{\tt LSL 16,Rx \\ ASR 16,Rx}

        & Sign extend a half word into a full word.\\\hline

%
{\tt STO.b Rx,\$addr}
{\tt STO.b Rx,\$addr}
        & \parbox[t]{1.5in}{\tt %
        & \parbox[t]{1.5in}{\tt %
        LDI \$addr,Ra \\
        LDI \$addr,Ra \\
        LDI \$addr,Rb \\
        LDI \$addr,Rb \\
        LSR \$2,Ra \\
        LSR \$2,Ra \\