URL
https://opencores.org/ocsvn/marca/marca/trunk
Subversion Repositories marca
Compare Revisions
- This comparison shows the changes necessary to convert path
/marca/tags/INITIAL/doc
- from Rev 3 to Rev 8
Rev 3 → Rev 8
/marca.dia
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
marca.dia
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: implementation.tex
===================================================================
--- implementation.tex (nonexistent)
+++ implementation.tex (revision 8)
@@ -0,0 +1,247 @@
+\documentclass[10pt, twoside, a4paper]{article}
+\usepackage{graphicx}
+\usepackage{listings}
+
+\title{marca - McAdam's RISC Computer Architecture\\Implementation Details}
+\author{Wolfgang Puffitsch}
+
+\begin{document}
+
+ \maketitle
+
+ \section{General}
+
+ \begin{itemize}
+ \item 16 16-bit registers
+ \item 16KB instruction ROM (8192 instructions)
+ \item 8KB data RAM
+ \item 256 byte data ROM
+ \item 75 instructions
+ \item 16 interrupt vectors
+ \end{itemize}
+
+ \section{Internals}
+
+ The processor features a 4-stage pipeline:
+ \begin{itemize}
+ \item instruction fetch
+ \item instruction decode
+ \item execution/memory access
+ \item write back
+ \end{itemize}
+ This scheme is similar to the one used in the MIPS architecture,
+ except that the execution and memory access stages are merged. Because
+ our architecture does not support indexed addressing, a memory access
+ does not need the ALU's result and can work in parallel with the ALU,
+ which has the advantage of reducing the possible hazards.
+
+ Figure \ref{fig:marca} shows a rough scheme of the internals of the
+ processor.
+ \begin{figure}[ht!]
+ \centering
+ \includegraphics[width=.95\textwidth]{marca}
+ \caption{Internal scheme}
+ \label{fig:marca}
+ \end{figure}
+
+ \subsection{Branches}
+ Branches are not predicted; if executed, they stall the
+ pipeline, leading to a total execution time of 4 cycles. The fetch
+ stage is not stalled; the decode stage, however, is stalled for two
+ cycles to compensate for that.
+
+ \subsection{Instruction fetch}
+ This stage is not spectacular: it simply reads an instruction from
+ the instruction ROM, and extracts the bits for the source and
+ destination registers.
+
+ \subsection{Instruction decode}
+ This stage translates the bit-patterns of the opcodes to the signals
+ used internally for the operations. It also holds the register file
+ and handles access to it. Immediate values are also constructed here.
+
+ \subsection{Execution / Memory access}
+ The execution stage is the heart and soul of the processor: it holds
+ the ALU, the memory/IO unit and a unit for interrupt handling.
+
+ \subsubsection{ALU}
+ The ALU does all arithmetic and logic computations and takes
+ care of the processor's flags (which are organized as shown in table
+ \ref{tab:flags}).
+
+ \begin{table}[ht!]
+ \centering
+ \begin{tabular}{|p{.75em}|p{.75em}|p{.75em}|p{.75em}
+ |p{.75em}|p{.75em}|p{.75em}|p{.75em}
+ |p{.75em}|p{.75em}|p{.75em}|p{.75em}
+ |p{.75em}|p{.75em}|p{.75em}|p{.75em}|}
+ \multicolumn{16}{c}{Bit 15 \hfill Bit 0} \\
+ \hline
+ & & & & & & & & & & P & I & N & V & C & Z \\
+ \hline
+ \end{tabular}
+ \caption{The flag register}
+ \label{tab:flags}
+ \end{table}
+
+ Operations which need more than one cycle to execute (multiplication,
+ division and modulo) block the rest of the processor until they are
+ finished.
+
+ \subsubsection{Memory/IO unit}
+ The memory/IO unit takes care of the ordinary data memory, the data
+ ROM (which is mapped to the addresses right above the RAM) and the
+ communication to peripheral modules. Peripheral modules are located
+ within the memory/IO unit and mapped to the highest addresses.
+
+ The memories (including the instruction ROM) are Altera-specific; we
+ decided not to use generic memories, because \textsl{Quartus} can update the
+ contents of its proprietary ROMs without synthesizing the whole
+ design. Because all memories are single-ported (and thus fairly
+ simple) it should be easy to replace them with memories specific to
+ other vendors.
+
+ We also decided against the use of external memories; larger FPGAs
+ can accommodate all addressable memory on-chip, so the implementation
+ overhead would not have paid off.
+
+ Accesses which take more than one cycle (stores to peripheral
+ modules and all load operations) block the rest of the processor
+ until they are finished.
+
+ \paragraph{Peripheral modules}
+ The peripheral modules use a slightly modified version of the SimpCon
+ interface. The SimpCon-specific signals are grouped into
+ records, and the words which can be read/written are limited to 16
+ bits. Such a module may only be accessed with \texttt{load}
+ and \texttt{store} instructions which point to aligned addresses.
+
+ \paragraph{UART}
+ The built-in UART is derived from the sc\_uart by Martin
+ Sch\"oberl. Apart from adapting the SimpCon interface, we added an
+ interrupt line and two bits for enabling/masking receive (bit 3 in the
+ status register) and transmit (bit 2) interrupts. In the current version
+ address 0xFFF8 (-8) corresponds to the UART's status register and
+ address 0xFFFA (-6) to the wr\_data/rd\_data register.
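+
+ A minimal sketch of how these two registers can be addressed, following
+ the conventions of the example programs (the register numbers are chosen
+ arbitrarily, and the final \texttt{load} would normally sit inside an
+ interrupt service routine):
+ \begin{lstlisting}
+         ldib  r0, -8        ; 0xFFF8: UART status register
+         ldib  r1, -6        ; 0xFFFA: wr_data/rd_data register
+         ldib  r7, (1 << 3)  ; bit 3: receive interrupts
+         store r7, r0        ; enable receive interrupts
+         load  r7, r1        ; read received data
+ \end{lstlisting}
+ The negative immediates work because \texttt{ldib} accepts values from
+ -128 to 127, so the highest addresses can be loaded with a single
+ instruction.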
+
+ \subsubsection{Interrupt unit}
+ The interrupt unit takes care of the interrupt vectors and, of
+ course, the triggering of interrupts. Interrupts are executed only
+ if the global interrupt flag is set, none of the other units is busy
+ and the instruction in the execution stage is valid (it takes 3
+ cycles after jumps, branches etc. until a new valid instruction is
+ in that stage).
+
+ Instructions which cannot be decoded as well as the ``error''
+ instruction trigger interrupt 0; the ALU can trigger interrupt 1
+ (division by zero), the memory unit can trigger interrupt 2 (invalid
+ memory access). In contrast to all other interrupts, these three
+ interrupts do not repeat the instruction which is executed when they
+ occur.
+
+ \subsection{Write back}
+ The write back stage passes on the result of the execution stage to
+ all other stages.
+
+ \section{Assembler}
+ The assembler \textsl{spar} (SPear Assembler Recycled) uses a syntax
+ similar to that of the usual Unix-style assemblers. It accepts the pseudo-ops
+ \texttt{.file}, \texttt{.text}, \texttt{.data}, \texttt{.bss},
+ \texttt{.align}, \texttt{.comm}, \texttt{.lcomm}, \texttt{.org} and
+ \texttt{.skip} with the usual meanings. The mnemonic \texttt{data}
+ initializes a byte to some constant value. In contrast to the
+ instruction set architecture specification, \texttt{mod} and
+ \texttt{umod} accept three operands (if a move is needed, it is
+ silently inserted).
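+
+ As a minimal sketch of this substitution (the exact expansion chosen
+ by \textsl{spar} is an assumption here, derived from the semantics of
+ \texttt{mov} and \texttt{umod} in the instruction set specification),
+ a three-operand \texttt{umod} could be rewritten as follows:
+ \begin{lstlisting}
+ ; source line as written in factorial.s
+         umod r4, r7, r10    ; r4 mod r7 -> r10
+ ; assumed two-instruction expansion
+         mov  r10, r4        ; r4 -> r10
+         umod r10, r7        ; r10 mod r7 -> r10
+ \end{lstlisting}
+ How the case of identical second and third operands is handled is not
+ covered by this sketch.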
+
+ The assembler produces three files: one file for the instruction
+ ROM, one file for the even bytes of the data ROM and one file for
+ the odd bytes of the data ROM. The splitting of the data is
+ necessary, because the data memories internally are split into two
+ 8-bit memories in order to support unaligned memory accesses without
+ delays.
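+
+ As a small illustration of this split (a sketch which assumes that the
+ data section starts at an even address and that ``even'' and ``odd''
+ refer to byte addresses), the two bytes defined at the top of
+ \texttt{uart\_reverse.s} would end up in different files:
+ \begin{lstlisting}
+ .data
+         data 0x0A   ; address 0 (even): even-byte file
+         data 0x0D   ; address 1 (odd): odd-byte file
+ \end{lstlisting}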
+
+ Three output formats are supported: .mif (Memory Initialization
+ Format), .hex (Intel Hex Format) and a binary format designed for
+ download via UART.
+
+ \section{Resource usage and speed}
+
+ The processor was synthesized with \textsl{Quartus II} for the
+ \textsl{Cyclone EP1C12Q240C8} FPGA with 12060 logic cells and 29952
+ bytes of on-chip memory available.
+
+ The processor needs $\sim$3550 logic cells or 29\% when
+ compiled for maximum clock frequency, which is $\sim$60 MHz. When
+ optimizing for area, it needs $\sim$2600 logic cells or 22\% at
+ $\sim$25 MHz.
+
+ The processor uses 24832 bytes or 83\% of on-chip memory.
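+
+ As a plausibility check of these figures (assuming that the memory
+ figure is the sum of the instruction ROM, data RAM and data ROM sizes
+ listed in the first section, with 16KB and 8KB read as 16384 and 8192
+ bytes), the numbers can be reproduced as
+ \begin{eqnarray*}
+ 16384 + 8192 + 256 & = & 24832 \\
+ 24832 / 29952 & \approx & 0.829 \\
+ 3550 / 12060 & \approx & 0.294 \\
+ 2600 / 12060 & \approx & 0.216
+ \end{eqnarray*}
+ which agrees with the rounded percentages of 83\%, 29\% and 22\% given above.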
+
+ \section{Example}
+
+ \subsection{Reversing a line}
+
+ Listing \ref{lst:uart} shows how to interface with the UART via
+ interrupts. The program reads in a line from the UART and then writes
+ it back reversed. Lines 1 to 4 show how to instantiate memory
+ (the two bytes defined form the DOS-style end-of-line). Lines
+ 7 to 25 initialize the registers and register the interrupt
+ vector; line 28 builds a barrier against the rest of the code.
+
+ Lines 32 to 76 form the interrupt service routine. It first
+ checks if it is operating in read or in write mode. When reading, it
+ reads from the UART and stores the result. A mode switch occurs when
+ a newline character is encountered. In write mode the contents of
+ the buffer are written to the UART, and the routine switches back to
+ read mode when finished.
+
+ In figure \ref{fig:sim} the results of the simulation are presented.
+
+ \lstset{basicstyle=\footnotesize,numbers=left,numberstyle=\tiny}
+ \lstset{caption=Example for the UART and interrupts}
+ \lstset{label=lst:uart}
+ \lstinputlisting{uart_reverse.s}
+
+ \begin{figure}[ht!]
+ \centering
+ \includegraphics[width=.95\textwidth]{uart_sim}
+ \caption{Simulation results}
+ \label{fig:sim}
+ \end{figure}
+
+ \subsection{Computing factorials}
+
+ The example in listing \ref{lst:fact} computes the factorials of 1 \ldots 9
+ and writes the results to the PC via UART. Note that the last result
+ transmitted will be wrong, because $9! = 362880$ is truncated to 16 bits.
+
+ \lstset{basicstyle=\footnotesize,numbers=left,numberstyle=\tiny}
+ \lstset{caption=Computing factorials}
+ \lstset{label=lst:fact}
+ \lstinputlisting{factorial.s}
+
+
+ \section{Versions Of This Document}
+
+ 2006-12-14: Draft version \textbf{0.1}
+
+ \noindent
+ 2006-12-29: Draft version \textbf{0.2}
+ \begin{itemize}
+ \item A few refinements.
+ \end{itemize}
+
+ \noindent
+ 2007-01-22: Draft version \textbf{0.3}
+ \begin{itemize}
+ \item Added another example.
+ \end{itemize}
+
+ \noindent
+ 2007-02-02: Draft version \textbf{0.4}
+ \begin{itemize}
+ \item Updated resource usage and speed section.
+ \end{itemize}
+
+\end{document}
Index: factorial.s
===================================================================
--- factorial.s (nonexistent)
+++ factorial.s (revision 8)
@@ -0,0 +1,134 @@
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;;; factorial
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;;; compute factorials of 1 to 9 and write results to
+;;; the PC via UART
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+.data
+
+;;; the numbers to be written are placed here
+iobuf:
+ data 0x0A
+ data 0x0D
+ data 0
+ data 0
+ data 0
+ data 0
+ data 0
+ data 0
+
+;;; stack for recursive calls of factorial()
+stack:
+
+.text
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;;; main()
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+ ldib r15, 1 ; number to start
+ ldib r5, 10 ; number to stop
+
+ ldil r1, lo(stack) ; setup for factorial()
+ ldih r1, hi(stack)
+ ldil r2, lo(factorial)
+ ldih r2, hi(factorial)
+
+ ldib r6, 0x30 ; setup for convert()
+ ldib r7, 10
+ ldil r8, lo(iobuf)
+ ldih r8, hi(iobuf)
+ ldil r9, lo(convert)
+ ldih r9, hi(convert)
+
+ ldib r12, -8 ; enable write interrupts
+ ldib r11, (1 << 2)
+ store r11, r12
+
+ ldil r12, lo(isr) ; register isr() to be called upon
+ ldih r12, hi(isr) ; interrupt #3
+ stvec r12, 3
+
+ ldib r12, -6 ; address where to write data
+ ; to the UART
+
+loop:
+ mov r0, r15 ; r0 is the argument
+ call r2, r3 ; call factorial()
+ call r9, r3 ; call convert()
+
+wait: getfl r13
+ btest r13, 4 ; interrupts still enabled?
+ brnz wait
+
+ addi r15, 1 ; loop
+ cmp r15, r5
+ brnz loop
+
+exit: br exit ; stop here after all
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;;; converting content of r4 to a string
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+convert:
+ addi r8, 2
+convert_loop:
+ umod r4, r7, r10 ; the conversion
+ add r10, r6, r10
+ storel r10, r8
+ addi r8, 1
+
+ udiv r4, r7, r4 ; next digit
+
+ cmpi r4, 0
+ brnz convert_loop
+
+ sei ; trigger write
+ jmp r3
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;;; write out content of iobuf
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+isr:
+ cmpi r8, iobuf ; reached end?
+ brz written
+
+ addi r8, -1 ; write data to UART
+ loadb r10, r8
+ store r10, r12
+
+ reti
+
+written:
+ getshfl r10
+ bclr r10, 4 ; clear interrupt flag
+ setshfl r10
+ reti
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;;; recursively compute factorial
+;;; argument: r0
+;;; return value: r4
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+factorial:
+ cmpi r0, 1 ; reached end?
+ brule fact_leaf
+
+ store r0, r1 ; push argument and return
+ addi r1, 2 ; address onto stack
+ store r3, r1
+ addi r1, 2
+
+ addi r0, -1 ; call factorial(r0-1)
+ call r2, r3
+
+ addi r1, -2 ; pop argument and return
+ load r3, r1 ; address from stack
+ addi r1, -2
+ load r0, r1
+
+ mul r0, r4, r4 ; return r0*factorial(r0-1)
+ jmp r3
+
+fact_leaf: ; factorial(1) = 1
+ ldib r4, 1
+ jmp r3
Index: marca.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: marca.png
===================================================================
--- marca.png (nonexistent)
+++ marca.png (revision 8)
marca.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: uart_sim.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: uart_sim.png
===================================================================
--- uart_sim.png (nonexistent)
+++ uart_sim.png (revision 8)
uart_sim.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: uart_reverse.s
===================================================================
--- uart_reverse.s (nonexistent)
+++ uart_reverse.s (revision 8)
@@ -0,0 +1,79 @@
+.data
+ data 0x0A
+ data 0x0D
+buffer:
+
+.text
+;;; initialization
+ ldib r0, -8 ; config/status
+ ldib r1, -6 ; data
+
+ ldil r2, lo(buffer) ; buffer address
+ ldih r2, hi(buffer) ; buffer address
+
+ ldib r3, 0x0A ; newline character
+ ldib r4, 0x0D ; carriage return
+
+ ldib r5, 0 ; mode
+
+ ldib r7, isr ; register isr
+ stvec r7, 3
+
+ ldib r7, (1 << 3) ; enable receive interrupts
+ store r7, r0
+
+ sei ; enable interrupts
+
+;;; loop forever
+loop: br loop
+
+
+;;; ISR
+isr:
+ cmpi r5, 0 ; check mode
+ brnz write_mode
+
+;;; reading
+read_mode:
+ load r7, r1 ; read data
+
+ cmp r7, r3 ; change mode upon newline
+ brnz read_CR
+
+ ldib r7, (1 << 2) ; do the change
+ store r7, r0
+ ldib r5, 1
+ reti
+
+read_CR:
+ cmp r7, r4 ; ignore carriage return
+ brnz read_cont
+ reti
+
+read_cont:
+ storel r7, r2 ; store data
+ addi r2, 1
+ reti
+
+;;; writing
+write_mode:
+ addi r2, -1
+
+ cmpi r2, -1 ; change mode if there is no more data
+ brnz write_cont
+
+ ldil r2, lo(buffer) ; correct pointer to buffer
+ ldih r2, hi(buffer)
+
+ ldib r7, (1 << 3) ; do the change
+ store r7, r0
+ ldib r5, 0
+ reti
+
+write_cont:
+ loadl r7, r2 ; write data
+ store r7, r1
+ reti
+
+
+
Index: marca.eps
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: marca.eps
===================================================================
--- marca.eps (nonexistent)
+++ marca.eps (revision 8)
marca.eps
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: uart_sim.eps
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: uart_sim.eps
===================================================================
--- uart_sim.eps (nonexistent)
+++ uart_sim.eps (revision 8)
uart_sim.eps
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: isa.tex
===================================================================
--- isa.tex (nonexistent)
+++ isa.tex (revision 8)
@@ -0,0 +1,245 @@
+\documentclass[10pt, twoside, a4paper]{article}
+\usepackage{longtable}
+
+\newcommand{\shl}{\ensuremath{<\!\!<}}
+\newcommand{\shr}{\ensuremath{>\!\!>\!\!>}}
+\newcommand{\sar}{\ensuremath{>\!\!>}}
+\newcommand{\at}{\ensuremath{\!\!:\!\!}}
+
+\title{marca - McAdam's RISC Computer Architecture\\Instruction Set Architecture}
+\author{Kenan Bilic, Roland Kammerer, Wolfgang Puffitsch}
+
+\begin{document}
+
+ \maketitle
+
+ \section{General}
+
+ \begin{itemize}
+ \item 16 16-bit registers, r0 \ldots r15
+ \item any register as return address
+ \item flags: Z, C, V, N, I, P
+ \begin{itemize}
+ \item Z: all bits of the last result are zero
+ \item C: ``17$^{th}$ bit'' of the last result
+ \item N: 16$^{th}$ bit of the last result
+ \item V: overflow, after sub/cmp it is $r1 \at 15 \oplus r2 \at 15
+ \oplus N \oplus C$, the latter two according to the result,
+ other operations accordingly
+ \item I: allow interrupts
+ \item P: parity of the last result
+ \end{itemize}
+ Flags are written where meaningful: P and Z are computed whenever
+ a register is written, arithmetic operations may change C, N and
+ V, interrupts clear I upon entry.
+ \item flags are stored and restored upon interrupt entry and exit
+ to/from ``shflags'' (shadow flags)
+ \item separate registers for interrupt vectors - read and written
+ through ``ldvec'' / ``stvec''
+ \item Some parts come from the Alpha architecture. The handling of
+ branches is inspired by the Intel x86.
+ \item External hardware modules shall be mapped to the highest
+ memory locations.
+ \end{itemize}
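+
+ As a worked example of the overflow rule given above (a sketch, assuming
+ that C after a subtraction is bit 16 of the 17-bit difference of the
+ zero-extended operands), consider ``sub'' with $r1 = \texttt{0x8000}$
+ and $r2 = \texttt{0x0001}$. The result is \texttt{0x7fff}, so $N = 0$
+ and $C = 0$, and
+ \[
+ V = r1 \at 15 \oplus r2 \at 15 \oplus N \oplus C
+ = 1 \oplus 0 \oplus 0 \oplus 0 = 1,
+ \]
+ which signals that $-32768 - 1$ cannot be represented in 16 bits.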
+
+ The processor uses a Harvard architecture; although it has not
+ prevailed in mainstream architectures, it is still used in embedded
+ processors such as the Atmel AVR. The separation of code and
+ data memory is not flexible enough for mainstream systems, but with
+ small embedded processors the program code tends to be fixed
+ anyway. A Harvard architecture enables the processor to make use of
+ more memory (which is an issue when the address space is limited to
+ 64k), and the program code can be read from a ROM directly. A
+ transient failure thus cannot destroy the program by overwriting its
+ code section.
+
+ \clearpage
+
+ \section{Instruction Set}
+
+ {\small
+ \begin{longtable}{llp{.62\textwidth}}
+ Instruction & Opcode & Semantics \\
+ add r1, r2, r3 & \texttt{0000} & $r1 + r2 \rightarrow r3$ \\
+ sub r1, r2, r3 & \texttt{0001} & $r1 - r2 \rightarrow r3$ \\
+ addc r1, r2, r3 & \texttt{0010} & $r1 + r2 + C \rightarrow r3$ \\
+ subc r1, r2, r3 & \texttt{0011} & $r1 - r2 - C \rightarrow r3$ \\
+ and r1, r2, r3 & \texttt{0100} & $r1 \wedge r2 \rightarrow r3$ \\
+ or r1, r2, r3 & \texttt{0101} & $r1 \vee r2 \rightarrow r3$ \\
+ xor r1, r2, r3 & \texttt{0110} & $r1 \oplus r2 \rightarrow r3$ \\
+ mul r1, r2, r3 & \texttt{0111} & $r1 * r2 \rightarrow r3$ \\
+ div r1, r2, r3 & \texttt{1000} & $r1 \div r2 \rightarrow r3$ \\
+ udiv r1, r2, r3 & \texttt{1001} & $r1 \div r2 \rightarrow r3, \textnormal{unsigned} $ \\
+ ldil r1, n8 & \texttt{1010} & $(r1 \wedge \texttt{0xff00}) \vee n8 \rightarrow r1, -128 \leq n8 \leq 255 $ \\
+ ldih r1, n8 & \texttt{1011} & $(r1 \wedge \texttt{0x00ff}) \vee (n8 \shl 8) \rightarrow r1, -128 \leq n8 \leq 255 $ \\
+ ldib r1, n8 & \texttt{1100} & $n8 \rightarrow r1, -128 \leq n8 \leq 127$ \\
+ \hline
+ mov r1, r2 & \texttt{11010000} & $r2 \rightarrow r1$ \\
+ mod r1, r2 & \texttt{11010001} & $r1\ \textnormal{mod}\ r2 \rightarrow r1$ \\
+ umod r1, r2 & \texttt{11010010} & $r1\ \textnormal{mod}\ r2 \rightarrow r1, \textnormal{unsigned} $ \\
+ not r1, r2 & \texttt{11010011} & $\lnot r2 \rightarrow r1$ \\
+ neg r1, r2 & \texttt{11010100} & $-r1 \rightarrow r2$ \\
+ cmp r1, r2 & \texttt{11010101} & $r1 - r2, \textnormal{sets flags}$ \\
+ addi r1, n4 & \texttt{11010110} & $r1 + n4 \rightarrow r1, -8 \leq n4 \leq 7$ \\
+ cmpi r1, n4 & \texttt{11010111} & $r1 - n4, \textnormal{sets flags}, -8 \leq n4 \leq 7$ \\
+ shl r1, r2 & \texttt{11011000} & $r1 \shl r2 \rightarrow r1$ \\
+ shr r1, r2 & \texttt{11011001} & $r1 \shr r2 \rightarrow r1$ \\
+ sar r1, r2 & \texttt{11011010} & $r1 \sar r2 \rightarrow r1$ \\
+ rolc r1, r2 & \texttt{11011011} & $(r1 \shl r2) \vee (C \shl (r2-1)) \vee (r1 \shr (16-r2-1))$ \\
+ rorc r1, r2 & \texttt{11011100} & $(r1 \shr r2) \vee (C \shl (16-r2)) \vee (r1 \shl (16-r2-1))$ \\
+ bset r1, n4 & \texttt{11011101} & $r1 \vee (1 \shl n4) \rightarrow r1, 0 \leq n4 \leq 15$ \\
+ bclr r1, n4 & \texttt{11011110} & $r1 \wedge \lnot (1 \shl n4) \rightarrow r1, 0 \leq n4 \leq 15$ \\
+ btest r1, n4 & \texttt{11011111} & $(r1 \shr n4) \wedge 1 \rightarrow Z, 0 \leq n4 \leq 15$ \\
+ \hline
+ load r1, r2 & \texttt{11100000} & $[r2] \at [r2+1] \rightarrow r1$ \\
+ store r1, r2 & \texttt{11100001} & $r1 \rightarrow [r2] \at [r2+1]$ \\
+ loadl r1, r2 & \texttt{11100010} & $(r1 \wedge \texttt{0xff00}) \vee [r2] \rightarrow r1$ \\
+ loadh r1, r2 & \texttt{11100011} & $(r1 \wedge \texttt{0x00ff}) \vee ([r2] \shl 8) \rightarrow r1$ \\
+ loadb r1, r2 & \texttt{11100100} & $[r2] \rightarrow r1, \textnormal{signed}$ \\
+ storel r1, r2 & \texttt{11100101} & $(r1 \wedge \texttt{0x00ff}) \rightarrow [r2]$ \\
+ storeh r1, r2 & \texttt{11100110} & $(r1 \shr 8) \rightarrow [r2]$ \\
+ call r1, r2 & \texttt{11101000} & $r1 \rightarrow pc, pc \rightarrow r2$ \\
+ \hline
+ br n8 & \texttt{11110000} & $pc + n8 \rightarrow pc, -128 \leq n8 \leq 127$ \\
+ brz n8 & \texttt{11110001} & $Z = 1 \Rightarrow pc + n8 \rightarrow pc, -128 \leq n8 \leq 127$ \\
+ brnz n8 & \texttt{11110010} & $Z = 0 \Rightarrow pc + n8 \rightarrow pc, -128 \leq n8 \leq 127$ \\
+ brle n8 & \texttt{11110011} & $(Z = 1) \vee (N \not = V) \Rightarrow pc + n8 \rightarrow pc, -128 \leq n8 \leq 127$ \\
+ brlt n8 & \texttt{11110100} & $(Z = 0) \wedge (N \not = V) \Rightarrow pc + n8 \rightarrow pc, -128 \leq n8 \leq 127$ \\
+ brge n8 & \texttt{11110101} & $(Z = 1) \vee (N = V) \Rightarrow pc + n8 \rightarrow pc, -128 \leq n8 \leq 127$ \\
+ brgt n8 & \texttt{11110110} & $(Z = 0) \wedge (N = V) \Rightarrow pc + n8 \rightarrow pc, -128 \leq n8 \leq 127$ \\
+ brule n8 & \texttt{11110111} & $(Z = 1) \vee (C = 1) \Rightarrow pc + n8 \rightarrow pc, -128 \leq n8 \leq 127$ \\
+ brult n8 & \texttt{11111000} & $(Z = 0) \wedge (C = 1) \Rightarrow pc + n8 \rightarrow pc, -128 \leq n8 \leq 127$ \\
+ bruge n8 & \texttt{11111001} & $(Z = 1) \vee (C = 0) \Rightarrow pc + n8 \rightarrow pc, -128 \leq n8 \leq 127$ \\
+ brugt n8 & \texttt{11111010} & $(Z = 0) \wedge (C = 0) \Rightarrow pc + n8 \rightarrow pc, -128 \leq n8 \leq 127$ \\
+ sext r1, r2 & \texttt{11111011} & $(r1 \shl 8) \sar 8 \rightarrow r2$ \\
+ ldvec r1, n4 & \texttt{11111100} & $\textnormal{interrupt vector}\ n4 \rightarrow r1, 0 \leq n4 \leq 15$ \\
+ stvec r1, n4 & \texttt{11111101} & $r1 \rightarrow \textnormal{interrupt vector}\ n4, 0 \leq n4 \leq 15$ \\
+ \hline
+ jmp r1 & \texttt{111111100000} & $r1 \rightarrow pc$ \\
+ jmpz r1 & \texttt{111111100001} & $Z = 1 \Rightarrow r1 \rightarrow pc$ \\
+ jmpnz r1 & \texttt{111111100010} & $Z = 0 \Rightarrow r1 \rightarrow pc$ \\
+ jmple r1 & \texttt{111111100011} & $(Z = 1) \vee (N \not = V) \Rightarrow r1 \rightarrow pc$ \\
+ jmplt r1 & \texttt{111111100100} & $(Z = 0) \wedge (N \not = V) \Rightarrow r1 \rightarrow pc$ \\
+ jmpge r1 & \texttt{111111100101} & $(Z = 1) \vee (N = V) \Rightarrow r1 \rightarrow pc$ \\
+ jmpgt r1 & \texttt{111111100110} & $(Z = 0) \wedge (N = V) \Rightarrow r1 \rightarrow pc$ \\
+ jmpule r1 & \texttt{111111100111} & $(Z = 1) \vee (C = 1) \Rightarrow r1 \rightarrow pc$ \\
+ jmpult r1 & \texttt{111111101000} & $(Z = 0) \wedge (C = 1) \Rightarrow r1 \rightarrow pc$ \\
+ jmpuge r1 & \texttt{111111101001} & $(Z = 1) \vee (C = 0) \Rightarrow r1 \rightarrow pc$ \\
+ jmpugt r1 & \texttt{111111101010} & $(Z = 0) \wedge (C = 0) \Rightarrow r1 \rightarrow pc$ \\
+ intr n4 & \texttt{111111101011} & $\textnormal{interrupt vector}\ n4 \rightarrow pc, pc \rightarrow ira, flags \rightarrow shflags, 0 \leq n4 \leq 15$ \\
+ getira r1 & \texttt{111111101100} & $ira \rightarrow r1$ \\
+ setira r1 & \texttt{111111101101} & $r1 \rightarrow ira$ \\
+ getfl r1 & \texttt{111111101110} & $flags \rightarrow r1$ \\
+ setfl r1 & \texttt{111111101111} & $r1 \rightarrow flags$ \\
+ getshfl r1 & \texttt{111111110000} & $shflags \rightarrow r1$ \\
+ setshfl r1 & \texttt{111111110001} & $r1 \rightarrow shflags$ \\
+ \hline
+ reti & \texttt{1111111111110000} & $ira \rightarrow pc, shflags \rightarrow flags$ \\
+ nop & \texttt{1111111111110001} & $\textnormal{do nothing}$ \\
+ sei & \texttt{1111111111110010} & $1 \rightarrow I$ \\
+ cli & \texttt{1111111111110011} & $0 \rightarrow I$ \\
+ error & \texttt{1111111111111111} & $\textnormal{invalid operation}$ \\
+ \end{longtable}}
+
+ \subsection{NOTES}
+ \begin{itemize}
+ \item Apart from the standard operators, the following notation is
+ used in the table above:
+ \begin{itemize}
+ \item \shl, \shr, \sar are shifting operators, with semantics as in Java
+ \item $[x]$ means accessing memory location $x$, 8 bits wide
+ \item $x \at y$ means concatenating $x$ and $y$, in the sense of
+ forming a 16-bit value from two 8-bit values
+ \end{itemize}
+ \item Modulo does not follow the pattern of ``div'' and ``udiv'',
+ because there was not enough room for two more 3-operand
+ operations. The assembler accepts the mnemonic with 3 registers as
+ operands and substitutes it with the corresponding ``mov'' and ``mod''
+ instructions.
+ \end{itemize}
+
+ \clearpage
+
+ \subsection{Instruction formats}
+
+ The following formats for instructions are to be used:
+
+ \begin{center}
+ \begin{tabular}{|p{1in}|p{1in}|p{1in}|p{1in}|}
+ \hline
+ Bits 15 \ldots 12 & Bits 11 \ldots 8 & Bits 7 \ldots 4 & Bits 3 \ldots 0 \\
+ \hline
+ Opcode & r3 & r2 & r1 \\
+ \hline
+ Opcode & \multicolumn{2}{|l|}{n8} & r1 \\
+ \hline
+ \multicolumn{2}{|l|}{Opcode} & r2 & r1 \\
+ \hline
+ \multicolumn{2}{|l|}{Opcode} & n4 & r1 \\
+ \hline
+ \multicolumn{2}{|l|}{Opcode} & \multicolumn{2}{|l|}{n8} \\
+ \hline
+ \multicolumn{3}{|l|}{Opcode} & r1 \\
+ \hline
+ \multicolumn{3}{|l|}{Opcode} & n4 \\
+ \hline
+ \multicolumn{4}{|l|}{Opcode} \\
+ \hline
+ \end{tabular}
+ \end{center}
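+
+ As an illustration of the first two formats (a sketch; it assumes that
+ the operand names in the format table correspond positionally to the
+ operand names in the instruction table, and the register numbers and
+ the immediate are chosen arbitrarily), two instructions would be
+ encoded as follows:
+ \begin{verbatim}
+ add  r1, r2, r3  ->  0000 0011 0010 0001  =  0x0321
+ ldib r12, -8     ->  1100 1111 1000 1100  =  0xCF8C
+ \end{verbatim}
+ In the second line the immediate $-8$ occupies bits 11 \ldots 4 as the
+ two's complement value \texttt{0xf8}, and the target register r12
+ occupies bits 3 \ldots 0.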
+
+ \section{Versions Of This Document}
+
+ 2006-10-04: Draft version \textbf{0.1}
+
+ \noindent
+ 2006-10-05: Draft version \textbf{0.2}
+ \begin{itemize}
+ \item rearrangement of some ops
+ \end{itemize}
+
+ \noindent
+ 2006-10-11: Draft version \textbf{0.3}
+ \begin{itemize}
+ \item replaced ``ror''/``rol'' with ``mod''/``umod''
+ \item refined considerations of direction flag
+ \item proposal for priorities of implementation
+ \end{itemize}
+
+ \noindent
+ 2006-10-28: Draft version \textbf{0.4}
+ \begin{itemize}
+ \item settled on signed loads
+ \item settled on shifts by registers
+ \item dropped ``push''/``pop''; the secondary result would cause a
+ considerable overhead
+ \item specified pipelining
+ \end{itemize}
+
+ \noindent
+ 2006-10-30: Draft version \textbf{0.5}
+ \begin{itemize}
+ \item added shflags to ease interrupt (and stack) handling
+ \item a few refinements
+ \end{itemize}
+
+ \noindent
+ 2006-12-02: Draft version \textbf{0.6}
+ \begin{itemize}
+ \item the first register is the target with ``mov'' and ``not'' now.
+ \item now the second register is always the address when accessing memory
+ \item reversed order with immediate loads
+ \item ``ldvec'' and ``stvec'' use the same order now
+ \item fixed instruction format for immediate loads
+ \end{itemize}
+
+ \noindent
+ 2006-12-14: Draft version \textbf{0.7}
+ \begin{itemize}
+ \item dropped ``ldpgm'' in favor of a ROM which is mapped to the
+ ordinary memory space
+ \item moved section about pipelining to the implementation document
+ \item removed note about interrupts; they are implemented already
+ \end{itemize}
+
+\end{document}
Index: Makefile
===================================================================
--- Makefile (nonexistent)
+++ Makefile (revision 8)
@@ -0,0 +1,19 @@
+all: isa.ps.gz isa.pdf implementation.ps.gz implementation.pdf
+
+EXTERN_DATA = factorial.s uart_reverse.s marca.eps marca.png uart_sim.eps uart_sim.png
+
+%.dvi: %.tex $(EXTERN_DATA)
+ latex $<
+ latex $<
+
+%.ps: %.dvi
+ dvips $< -o $@
+
+%.ps.gz: %.ps
+ gzip -c < $< > $@
+
+%.pdf: %.tex %.dvi
+ pdflatex $<
+
+clean:
+ rm -f *.ps *.pdf *.dvi *.aux *.log
\ No newline at end of file