% The Potato Processor - User's Manual
% (c) Kristian Klomsten Skordal 2015 <skordal@opencores.org>
% Report bugs and issues on <http://opencores.org/project,potato,bugtracker>

\documentclass[12pt,a4paper]{report}

\usepackage[utf8]{inputenc}
\usepackage[pdftitle={The Potato Processor: Technical Reference Manual},
pdfauthor={Kristian Klomsten Skordal}]{hyperref}
\usepackage{placeins}
\usepackage{titlesec}
\usepackage[english]{babel}

\newcommand{\register}[1]{\textsc{#1}}

\titleformat{\chapter}[block]{\normalfont\huge\bfseries}{\thechapter}{20pt}{\Huge}\titlespacing*{\chapter}{0pt}{50pt}{40pt}

\begin{document}
\begin{titlepage}
\begin{center}
\vspace*{3cm}
{\large The Potato Processor}\\[0.5cm]
{\Huge\bf Technical Reference Manual}\\[6cm]

\textsc{Kristian Klomsten Skordal}\\\href{mailto:skordal@opencores.org}{skordal@opencores.org}\\[3em]

\vfill
{Project page:\\\url{http://opencores.org/project,potato}}\\[0.2em]
{Report bugs and issues on\\\url{http://opencores.org/project,potato,bugtracker}}\\[1.2em]
{\small Updated \today}
\end{center}
\end{titlepage}

\tableofcontents
\chapter{Introduction}

The Potato processor is an implementation of the 32-bit integer subset of the RISC-V
instruction set v2.0. It is designed around a standard 5-stage pipeline. All instructions
execute in one cycle, except for load and store instructions, which take longer when the
processor has to wait for external memory.

The processor has been tested on an Artix-7 (xc7a100tcsg324-1) FPGA from Xilinx, on the
Nexys 4 board from Digilent. More details about the test design can be found in
chapter~\ref{cha:quickstart}.
\section{Features}
The current features of the Potato processor include:

\begin{itemize}
\item Implements the complete 32-bit integer subset of the RISC-V ISA v2.0.
\item Implements the \textsc{csr*} instructions from the RISC-V supervisor extensions v1.0.
\item Supports using the \register{fromhost}/\register{tohost} registers for communicating
with a host environment, such as a simulator, or with peripherals.
\item Supports exception handling, with 8 individually maskable IRQ inputs.
\item Includes a Wishbone B4-compatible interface.
\end{itemize}
\section{Planned features}
The following features are planned for future versions of the Potato processor:

\begin{itemize}
\item Caches.
\item Branch prediction.
\item Hardware multiplication and division support (the RISC-V M extension).
\item Compressed instruction support (the RISC-V C extension).
\item Supervisor mode support.
\end{itemize}
\chapter{Quick Start Guide}
\label{cha:quickstart}

This chapter contains instructions on getting started with the demo/example design that is
included in the Potato source distribution. The example design targets the Nexys 4 board
available from Digilent\footnote{See \url{http://www.digilentinc.com/Products/Detail.cfm?Prod=NEXYS4}}.
\section{Setting up the Vivado Project}

Start by creating a new project in Vivado. Import all source files from the \texttt{src/} directory,
which contains the sources required for using the processor. Then import all source files from
the \texttt{example/} directory, which contains the top-level setup for the example SoC design,
and from the \texttt{soc/} directory, which contains various peripherals for the processor.
\section{Adding IP Modules}

The example design requires two additional IP modules. These are not included in the source
distribution and must be added separately.
\subsection{Clock Generator}

Add a clock generator using the Clocking Wizard. Name the component ``\texttt{clock\_generator}''
and make sure that the checkboxes for ``frequency synthesis'' and ``safe clock startup'' are
selected.

Add two output clocks with frequencies of 50~MHz and 10~MHz. Rename the corresponding ports
to ``\texttt{system\_clk}'' and ``\texttt{timer\_clk}''. Rename the input clock signal to
``\texttt{clk}''.

The added module should appear in the hierarchy under the top-level module as ``\texttt{clkgen}''.
\subsection{Instruction memory}

Use the Block Memory Generator to add a block RAM for storing the test application.
Choose ``Single-port ROM'' as memory type and name the module ``\texttt{instruction\_rom}''.
Set the port A width to 32 bits and the depth to 2048 words. Initialize the block RAM with
your application or use one of the provided benchmarks, such as the SHA256 benchmark,
which, when built, produces a \texttt{.coe} file that can be used for this purpose.

Note that in order to build a benchmark application, you have to install a RISC-V toolchain.
See section~\ref{sec:toolchain} for instructions on how to accomplish this.
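
As an illustration of the \texttt{.coe} format accepted by the Block Memory Generator, the
following Python sketch renders a list of 32-bit instruction words as a hexadecimal
initialization file. The names \texttt{words\_to\_coe} and \texttt{ROM\_DEPTH\_WORDS} are
hypothetical and not part of the Potato source distribution, whose benchmark build already
produces such a file; the sketch is only meant to show what the file contains. The two example
words are the RISC-V encodings of a no-op and of a jump-to-self.

```python
ROM_DEPTH_WORDS = 2048  # depth configured for instruction_rom in the text above


def words_to_coe(words, depth=ROM_DEPTH_WORDS):
    """Render 32-bit words as the text of a Xilinx .coe initialization file.

    Hypothetical helper for illustration; it only checks that the program
    fits in the ROM and that every word is a valid 32-bit value.
    """
    words = list(words)
    if not words:
        raise ValueError("at least one word is required")
    if len(words) > depth:
        raise ValueError("program does not fit in the instruction ROM")
    for word in words:
        if not 0 <= word <= 0xFFFFFFFF:
            raise ValueError(f"not a 32-bit value: {word!r}")

    # One 8-digit hexadecimal word per line, comma separated, ';' terminated.
    vector = ",\n".join(f"{word:08x}" for word in words)
    return (
        "memory_initialization_radix=16;\n"
        "memory_initialization_vector=\n"
        f"{vector};\n"
    )


if __name__ == "__main__":
    # addi x0, x0, 0 (nop) followed by jal x0, 0 (loop forever)
    print(words_to_coe([0x00000013, 0x0000006F]), end="")
```

Words beyond the end of the program are left out of the file, so the memory generator's
default fill value applies to the rest of the ROM.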
\section{Running an Example Application}

Assuming you initialized the instruction memory with the SHA256 benchmark, synthesize and
implement the design, generate a bitfile and load it into the FPGA. Using a serial port
application, such as \texttt{minicom}, watch as the number of hashes per second is
printed to the screen and rejoice because it works!
\chapter{Instantiating}

The Potato processor can be used either with or without a Wishbone interface. Using the Wishbone
interface allows the processor to communicate with other Wishbone-compatible peripherals. However,
if no such peripherals are to be used, the processor can, for instance, be connected directly to
block RAMs for full performance without needing to use caches.
\section{Customizing the Processor Core}
The processor can be customized using generics. The following list details the parameters
that can be changed:

\begin{description}
\item[\texttt{PROCESSOR\_ID}:] Any 32-bit value used as the processor ID. This value can
be read back from the hardware thread ID register, \register{hartid}.
\item[\texttt{RESET\_ADDRESS}:] Any 32-bit value used as the address of the first instruction
fetched by the processor after it has been reset.
\end{description}
\section{Instantiating in a Wishbone System}
\label{sec:instantiating-wishbone}

In order to integrate the Potato processor into a Wishbone-based system, the module \texttt{pp\_potato}
is used. It provides signals for the Wishbone master interface, prefixed with \texttt{wb\_}, and
inputs for interrupts and the HTIF interface.

The specifics of the Wishbone interface are listed in table~\ref{tab:wishbone}. To see an example
of the processor used in a Wishbone system, see the example design under the \texttt{example/}
directory.

\begin{table}
\centering
\begin{tabular}{|l|l|}
\hline
Wishbone revision & B4 \\
Interface type & Master \\
Address port width & 32 bits \\
Data port width & 32 bits \\
Data port granularity & 8 bits \\
Maximum operand size & 32 bits \\
Endianness & Little endian \\
Sequence of data transfer & Not specified \\
\hline
\end{tabular}
\caption{Wishbone Interface Specifics}
\label{tab:wishbone}
\end{table}

\FloatBarrier
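
To make the combination of a 32-bit data port, 8-bit granularity and little-endian byte
ordering in table~\ref{tab:wishbone} more concrete, the following Python sketch computes
which byte lanes a naturally aligned access would use. It follows the general Wishbone
convention that select bit $i$ covers data bits $8i+7$ down to $8i$; that the Potato core
drives its select lines in exactly this way is an assumption, and the function name
\texttt{byte\_select} is hypothetical.

```python
def byte_select(address, size):
    """Return (select_mask, bit_shift) for an aligned access of `size` bytes.

    Models a little-endian 32-bit data port with 8-bit granularity:
    the byte at address offset k within a word travels on byte lane k.
    Hypothetical helper; not taken from the Potato sources.
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    offset = address & 0x3          # byte offset within the 32-bit word
    if offset % size != 0:
        raise ValueError("access is not naturally aligned")
    select_mask = ((1 << size) - 1) << offset   # one bit per active byte lane
    bit_shift = 8 * offset                      # position of the data on the bus
    return select_mask, bit_shift


if __name__ == "__main__":
    for addr, size in [(0x1000, 4), (0x1002, 2), (0x1001, 1)]:
        mask, shift = byte_select(addr, size)
        print(f"addr={addr:#06x} size={size} sel={mask:04b} shift={shift}")
```

For example, a byte store to address \texttt{0x1001} would use lane 1 only, with the data
placed on bits 15 down to 8 of the data port.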
\section{Instantiating in a Standalone System}
\label{sec:instantiating-standalone}

The processor can also be used without connecting it to the Wishbone bus. An example
of this can be seen in the processor testbench, \texttt{tb\_processor.vhd}.
\section{Verifying}

The processor provides an automatic testing environment for verifying that the processor
correctly executes the instructions of the ISA. The tests have been extracted from the
official test suite available at \url{https://github.com/riscv/riscv-tests} and cover
most of the available instructions.

Two testbenches are used to execute the test programs: \texttt{tb\_processor.vhd}, in
which the processor is directly connected to block-RAM-like memories so the processor
never stalls to wait for memory operations to finish (see section~\ref{sec:instantiating-standalone}
for more details about this kind of setup), and \texttt{tb\_soc.vhd}, which models a
simple system-on-chip with the processor connected to memories through the
Wishbone interface (see section~\ref{sec:instantiating-wishbone} for more information
about this kind of setup).

To run the test suites, run \texttt{make run-tests} or \texttt{make run-soc-tests}.

Make sure that \texttt{xelab} and \texttt{xsim} are in your \texttt{PATH}; otherwise the
tests will fail.
\chapter{Programming}

The processor implements the RISC-V instruction set, and can be programmed with tools
such as GCC.
\section{Building a RISC-V Toolchain}
\label{sec:toolchain}

An ``official'' toolchain is provided by the RISC-V project. In order to install it, clone
the ``riscv-tools'' Git repository from \url{https://github.com/riscv/riscv-tools} and follow
the instructions in its README file.
\section{Control and Status Registers}

The supported control and status registers are shown in table~\ref{tab:csr_list}. The registers
can be manipulated using the \textsc{csr*} family of instructions, listed in
section~\ref{sec:csr_instrs}.

\begin{table}
\centering
\begin{tabular}{|l|l|l|}
\hline
\textbf{Name} & \textbf{ID} & \textbf{Description} \\
\hline
\register{hartid} & 0x50b & Hardware thread ID \\
\register{evec} & 0x508 & Exception vector address \\
\register{epc} & 0x502 & Return address for exceptions \\
\register{cause} & 0x509 & Exception cause \\
\register{sup0} & 0x500 & Support register 0, for operating system use \\
\register{sup1} & 0x501 & Support register 1, for operating system use \\
\register{badvaddr} & 0x503 & Bad address, used for invalid address exceptions \\
\register{status} & 0x50a & Processor status and control register \\
\register{tohost} & 0x51e & Register for sending data to a host system \\
\register{fromhost} & 0x51f & Register where data received from a host system is stored \\
\register{cycle} & 0xc00 & Cycle counter, low 32 bits \\
\register{cycleh} & 0xc80 & Cycle counter, high 32 bits \\
\register{time} & 0xc01 & Timer tick counter, low 32 bits \\
\register{timeh} & 0xc81 & Timer tick counter, high 32 bits \\
\register{instret} & 0xc02 & Retired instruction counter, low 32 bits \\
\register{instreth} & 0xc82 & Retired instruction counter, high 32 bits \\
\hline
\end{tabular}
\caption{List of Control and Status Registers}
\label{tab:csr_list}
\end{table}
\FloatBarrier
\chapter{Instruction Set}

The Potato processor is designed to support the full 32-bit integer subset of
the RISC-V instruction set, version 2.0. The ISA documentation is available
from \url{http://riscv.org}.
\section{Status and Control Register Instructions}
\label{sec:csr_instrs}

In addition to the base ISA, some additional instructions have been imported
from the RISC-V supervisor specification\footnote{The processor is in
the process of being upgraded to the new specification.} version 1.0.

\begin{table}[htb]
\centering
\begin{tabular}{|l|l|}
\hline
\textbf{Mnemonic} & \textbf{Description} \\
\hline
\texttt{scall} & System call \\
\texttt{sbreak} & Breakpoint instruction \\
\texttt{sret} & Exception return \\
\hline
\texttt{csrrw rd, rs1, CSR} & Writes rs1 into CSR, places old value in rd \\
\texttt{csrrs rd, rs1, CSR} & Ors rs1 with CSR, places old value in rd \\
\texttt{csrrc rd, rs1, CSR} & Ands the inverse of rs1 with CSR, places old value in rd \\
\texttt{csrrwi rd, imm, CSR} & Writes imm into CSR, places old value in rd \\
\texttt{csrrsi rd, imm, CSR} & Ors imm with CSR, places old value in rd \\
\texttt{csrrci rd, imm, CSR} & Ands the inverse of imm with CSR, places old value in rd \\
\hline
\end{tabular}
\caption{List of CSR Instructions}
\end{table}
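
The read-modify-write behaviour of the CSR instructions described above can be summarised
with a small Python model. This is a sketch of the semantics as stated in the table, not of
the hardware implementation; the function name \texttt{csr\_op} is hypothetical. The
immediate forms (\texttt{csrrwi}, \texttt{csrrsi}, \texttt{csrrci}) perform the same
operations with \texttt{imm} in place of the value of \texttt{rs1}.

```python
MASK32 = 0xFFFFFFFF


def csr_op(op, csr_value, operand):
    """Return (new_csr_value, rd_value) for one CSR instruction.

    `operand` is the value of rs1 (or imm for the immediate forms).
    In every case rd receives the value the CSR held before the write.
    """
    operand &= MASK32
    if op == "csrrw":       # write: CSR <- operand
        new_value = operand
    elif op == "csrrs":     # set bits: CSR <- CSR | operand
        new_value = csr_value | operand
    elif op == "csrrc":     # clear bits: CSR <- CSR & ~operand
        new_value = csr_value & ~operand & MASK32
    else:
        raise ValueError(f"unknown CSR operation: {op}")
    return new_value, csr_value


if __name__ == "__main__":
    print(csr_op("csrrw", 0x12, 0x34))
    print(csr_op("csrrs", 0b1010, 0b0101))
    print(csr_op("csrrc", 0b1111, 0b0101))
```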
\appendix

\chapter{Peripherals}

The source distribution of the processor contains several peripheral modules that can be
used in system-on-chip designs using the Potato processor (or other processors).

This chapter briefly describes each of the modules.
\section{GPIO}

The GPIO module provides a simple GPIO interface for up to 32 general-purpose pins.
Each pin can be separately configured to work as either an input or an output pin.

Registers are provided to set the direction of each pin. Additional registers
provide the ability to read or write the values of the pins.
\section{Timer}

The timer module provides a timer that raises an interrupt at a specified
interval.
\section{UART}

The UART module provides a fixed-baudrate serial port interface. It features
separate FIFOs for buffering input and output data, and interrupts for when
the module is ready to send or has received data.
\section{Memory}

The memory module is a simple block RAM wrapper with support for
byte-writes.

\end{document}