URL
https://opencores.org/ocsvn/ion/ion/trunk
Subversion Repositories ion
Compare Revisions
- This comparison shows the changes necessary to convert path
/ion/trunk/doc
- from Rev 210 to Rev 221
Rev 210 → Rev 221
/src/tex/tools.tex
2,6 → 2,16
\chapter{Tools} |
\label{tools} |
|
Directory '/tools' of the project includes a few tools -- small C or Python |
programs purpose-built for this project. |
|
What follows is a brief description of each of the tools. This document |
won't go into the implementation or usage details. The tools themselves have |
brief usage instructions and for any further details the user must read |
the source code. |
|
|
|
\section{MIPS Software Simulator} |
\label{sw_simulator} |
|
27,10 → 37,12
\end{itemize} |
|
Each code sample includes a DOS batch file named 'swsim.bat' that runs the |
simulator in batch mode.\\ |
simulator in batch mode. Note that the BAT file invokes a Windows binary |
which is included in the SVN repository and should be immediately usable |
after checkout.\\ |
|
The program includes usage help (a short description of the command line |
parameters). The source code (very simple and straighforward) is includef in |
parameters). The source code (very simple and straightforward) is included in |
the project. The BAT files provide a usage example. And anyone who is |
interested and finds trouble can always contact me. |
|
43,9 → 55,56
The hardcoded log file name is "sw\_sim\_log.txt" and it is generated in the |
same directory from which the simulator is run.\\ |
|
\section{Configuration Package Builder Script build\_pkg.py} |
\label{build_pkg} |
|
This tool is used to build a simulation and synthesis configuration |
package. |
|
The generated package contains configuration constants used by the |
simulation test bench \emph{'mips\_tb.vhdl'} and by the hardware demo |
\emph{'c2sb\_demo.vhdl'}. |
|
It also includes memory initialization constants containing object code, |
used to initialize simulated and inferred memories, both in simulation |
and in synthesis. |
|
In the code samples, this script is used to generate two separate packages |
for simulation and synthesis. Please refer to the makefiles for detailed |
usage examples. |
|
|
\section{Conversion Script bin2hdl.py} |
\label{python_script} |
|
|
\begin{figure*}[ht] |
\begin{center} |
{\small |
\framebox[7in]{ |
\begin{minipage}[t]{6.0in} |
|
NOTE: This script was used in previous versions of the project -- it came |
in handy to initialize byte-sliced memories when the caches were under |
development. |
|
It has been abandoned because it was far too complicated and no longer |
necessary. The VHDL |
templates it refers to and the script itself have been moved from the /src |
directory to their own subdirectory in /tools. |
|
It is being retained in case it becomes useful again but it is no longer |
used. |
|
\end{minipage} |
} |
} |
\end{center} |
\label{lb} |
\end{figure*} |
|
|
|
This Python script reads one or more binary files and 'inserts' them in a |
vhdl template. It makes the |
conversion from binary to vhdl strings and slices the data in byte columns, |
/src/tex/hw_demo.tex
10,7 → 10,8
makefiles -- assuming you have a mips toolchain.\\ |
|
'Pre-generated' in this context means that all the vhdl files necessary for |
building the demo are already included with the project, and the only |
building the demo are already included with the project, including the |
configuration package that contains the program's object code, and the only |
tool needed is the synthesis tool. |
|
The pregenerated demo is included just for convenience, so that you can |
42,6 → 43,7
\item 'Next' your way out of the new project wizard. |
\item Add to the project all the vhdl files in /vhdl and /vhdl/demo, |
except mips\_cache\_stub.vhdl and sdram\_controller.vhdl. |
\item Add to the project all the vhdl files in /vhdl/SoC. |
\item Select file c2sb\_demo.vhdl as top. |
\item Import pin constraints file (assignments-\textgreater import assignments). |
\item Create a clock constraint for signal clk (51 MHz or some other |
51,7 → 53,7
\item Double-click on nCEO value column and select "use as regular I/O". |
IMPORTANT: otherwise the synthesis will fail; we need to use an FPGA |
pin that happens to be dual-purpose (programming and regular). |
\item Select 'balanced' optimization. |
\item Select 'speed' optimization. |
\item Save the project and synthesize. |
\item Make sure the clock constraint is met (timing analyzer report). |
There is a random element to the synthesis process, as you know, |
59,8 → 61,7
\item Program the FPGA from Quartus-2 |
\item If you have a terminal hooked to the serial port (19200/8/N/1) you |
should see a welcome message after depressing the reset button. |
(by default this is pusbutton 2). |
|
(by default this is pushbutton 2). |
\end{enumerate} |
|
In the present version, the synthesis will produce a lot of warnings. The |
88,20 → 89,20
this: |
|
\begin{itemize} |
\item An FPGA capable enough (the demo uses internal memory for code) |
\item At least 4KB of 16-bit wide external, asynchronous, old-fashioned SRAM |
\item A reset pin (possibly a pushbutton) |
\item A clock input (uart modules assume 50MHz, see below) |
\item RXD and TXD UART pins, plus a connector, header or whatever |
\item An FPGA capable enough (the demo uses internal memory for code). |
\item At least 4KB of 16-bit wide external, asynchronous, old-fashioned SRAM. |
\item A reset pin (possibly a pushbutton). |
\item A clock input (uart modules assume 50MHz, see below). |
\item RXD and TXD UART pins, plus a connector, header or whatever. |
\end{itemize} |
|
The only modules that care at all about clock rate are the UART |
modules. They are hardwired to 19200 bauds when clocked at 50MHz, so if you |
The only module that cares at all about clock rate is the UART embedded into |
the SoC module. It's hardwired to 19200 baud when clocked at 50MHz, so if you |
use a different frequency you must edit the generics in the demo entity |
accordingly.\\ |
Be aware that these uart modules have been used a lot in other projects but |
have not been tested with a wide range of clock rates; they should work but |
you have been warned.\\ |
accordingly -- the demo generics are passed all the way down to whatever |
module needs them.\\ |
The UART has hardly been tested at clock rates other than 50MHz and has not |
passed any independent test bench; try the core first at 50 MHz.\\ |
|
Though there is no reset control logic, the reset input is synchronized |
internally, so you can use a raw pushbutton -- you may trigger multiple |
110,7 → 111,8
|
Assuming you take care of all of the above, the easiest way I see to port |
the demo is just editing the top module ports ('/vhdl/demo/c2sb\_demo.vhdl') |
to match your board setup.\\ |
to match your board setup. The only tricky part is the interface to FLASH |
and SDRAM.\\ |
|
All the code in this project is vendor agnostic (or should be, I have only |
tried it on Quartus and ISE). Specifically, it does not instantiate memory |
148,4 → 150,4
great as a confidence builder.\\ |
|
Besides, running Adventure on a computer built by myself is something |
I just wanted to do :)\\ |
I've always wanted to do :)\\ |
/src/tex/cache.tex
19,7 → 19,7
alternative, simplified scheme.\\ |
|
The standard R3000 cache control flags in the SR are not used, either. Instead, |
two flags from the SR have been commandeered for cache control.\\ |
two flags from the SR have been repurposed for cache control.\\ |
|
\subsection{Cache control flags} |
\label{cache_control_flags} |
67,8 → 67,8
\needspace{10\baselineskip} |
\begin{verbatim} |
|
___________ <-- These address bits are NOT in the tag |
/ \ |
_________ <-- These address bits are NOT in the tag |
/ \ |
31 .. 27| 26 .. 21 |20 .. 12|11 .. 4|3:2| |
+---------+-----------+-----------------+---------------+---+---+ |
| 5 | | 9 | 8 | 2 | | |
81,10 → 81,11
Since bits 26 downto 21 are not included in the tag, there will be a |
'mirror' effect in the cache. We have effectively split the memory space |
into 32 separate blocks of 1MB which is obviously not enough but will do |
for the initial tests. |
for the initial versions of the core. |
|
In subsequent versions of the cache, the tag size needs to be enlarged AND |
some of the top bits might be omitted when they're not needed to implement |
the default memory map (namely bit 30 which is always '0'). |
the default MIPS memory map (namely bit 30 which is always '0'). |
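The 'mirror' effect described above can be checked with a few lines of Python. This is only a sketch: the field boundaries are read off the address diagram, and treating the tag as the concatenation of bits 31..27 and 20..12 is my assumption, not something taken from the VHDL source. The function name is hypothetical.

```python
# Field boundaries come from the address diagram; building the tag from
# bits 31..27 plus bits 20..12 is an assumption of this sketch.

def split_address(addr):
    """Split a 32-bit address into (tag, line_index, word_index)."""
    top = (addr >> 27) & 0x1F        # bits 31..27 (5 bits, part of the tag)
    mid = (addr >> 12) & 0x1FF       # bits 20..12 (9 bits, part of the tag)
    line_index = (addr >> 4) & 0xFF  # bits 11..4 (8 bits)
    word_index = (addr >> 2) & 0x3   # bits 3..2 (2 bits)
    tag = (top << 9) | mid           # bits 26..21 never enter the tag
    return tag, line_index, word_index


a = 0x80001230
b = a | (1 << 21)  # differs from 'a' only in bit 21, which is not in the tag
print(split_address(a) == split_address(b))  # prints True: the two addresses alias
```

Any two addresses that differ only in bits 26 downto 21 produce the same tag and line index, so the cache can't tell them apart.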
|
|
\section{Memory Controller} |
122,10 → 123,10
For each address, the memory map logic will supply the following information: |
|
\begin{enumerate} |
\item What kind of memory it is |
\item How many wait states to use |
\item Whether it is writeable or not (ignored in current version) |
\item Whether it is cacheable or not (ignored in current version) |
\item What kind of memory it is. |
\item How many wait states to use. |
\item Whether it is writeable or not (ignored in current version). |
\item Whether it is cacheable or not (ignored in current version). |
\end{enumerate} |
|
In the present implementation the memory map can't be modified at run time.\\ |
209,7 → 210,7
|
cache/ps ?| (1) | (2) | ... | (2) |?? |
|
refill_ctr ?| 0 | ... < 3 |?? |
refill_ctr ?| 0 | ... | 3 |?? |
|
chip_addr ?| 210h | 211h | ... | 217h |-- |
|
229,7 → 230,6
in this chronogram it takes the following values: |
|
\begin{enumerate} |
\item idle |
\item data\_refill\_sram\_0 |
\item data\_refill\_sram\_1 |
\end{enumerate} |
244,10 → 244,15
\subsubsection{SRAM interface read cycle timing -- 8-bit interface} |
\label{sram_read_cycle_8b} |
|
TODO: 8-bit refill procedure to be done. |
The refill from an 8-bit static memory is essentially the same as depicted |
above, except we need to read 4 bytes (over the LSB lines of the static memory |
data bus) instead of 2 16-bit halfwords. The operation takes correspondingly |
longer to perform and uses an extra address line but is otherwise identical. |
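The difference between both refill procedures boils down to how many memory reads are needed per 32-bit word. The Python sketch below models just that; the most-significant-chunk-first ordering is an assumption made for illustration (it matches a big-endian layout) and the function names are hypothetical.

```python
def accesses_per_word(bus_width_bits):
    """Number of static memory reads needed to build one 32-bit word."""
    if bus_width_bits not in (8, 16):
        raise ValueError("only 8-bit and 16-bit buses are described here")
    return 32 // bus_width_bits


def assemble_word(chunks, bus_width_bits):
    """Pack the values read from memory into a 32-bit word.

    Assumption: the first chunk read is the most significant one.
    """
    if len(chunks) != accesses_per_word(bus_width_bits):
        raise ValueError("wrong number of memory reads for this bus width")
    mask = (1 << bus_width_bits) - 1
    word = 0
    for chunk in chunks:
        word = (word << bus_width_bits) | (chunk & mask)
    return word


print(accesses_per_word(16), accesses_per_word(8))  # prints: 2 4
```

Four reads instead of two is why the 8-bit refill takes correspondingly longer for the same amount of data.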
|
TODO: 8-bit refill chronogram to be done. |
|
\subsubsection{SRAM interface write cycle timing} |
|
\subsubsection{16-bit SRAM interface write cycle timing} |
\label{sram_write_cycle} |
|
The path of the state machine that deals with SRAM writethroughs is linear so |
261,7 → 266,7
A general memory write will be 32-bit wide and thus it will take two 16-bit |
memory accesses to complete. Unaligned, halfword or byte wide CPU writes might |
in some cases be optimized to take only a single 16-bit memory access. This |
module does no such optimization. |
module does no such optimization yet. |
For simplicity, all writethroughs take two 16-bit access cycles, even if one |
of them has both we\_n signals deasserted.\\ |
|
274,7 → 279,7
|
In this example, the SRAM is being accessed with 1 WS: WE\_N is asserted for |
two cycles. |
Note how a lot of cycles are lost in order to guarantee compliance with the |
Note how a lot of cycles are used in order to guarantee compliance with the |
setup and hold times of the SRAM against the we, address and data lines. |
|
\needspace{15\baselineskip} |
302,7 → 307,6
in this chronogram it takes the following values: |
|
\begin{enumerate} |
\item idle |
\item data\_writethrough\_sram\_0a |
\item data\_writethrough\_sram\_0b |
\item data\_writethrough\_sram\_0c |
316,6 → 320,9
\section{Known Problems} |
\label{cache_problems} |
|
The cache implementation is still provisional and has a number of |
acknowledged problems: |
|
\begin{enumerate} |
\item All parameters hardcoded -- generics are almost ignored. |
\item SRAM read state machine does not guarantee internal FPGA $T_{hold}$. |
324,5 → 331,6
in the parent module) are far smaller than the SRAM response times, but |
it would be better to insert an extra cycle after the wait states in |
the sram read state machine. |
\item Cache logic mixed with memory controller logic. |
\end{enumerate} |
|
/src/tex/simulation.tex
11,7 → 11,7
the cpu state to a text log file.\\ |
|
This log file can then be compared to a log file generated by a software |
simulator for the same code sample (see section 5.1). The software |
simulator for the same code sample (see section \ref{sw_simulator}). The software |
simulator is the 'golden model' against which the cpu is tested, so any |
difference between both log files means trouble.\\ |
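The comparison itself is a plain line-by-line text comparison. A minimal Python sketch follows, assuming both logs are text files with one event per line; the sample lines are placeholders and do not reflect the real log format, and the function name is hypothetical.

```python
from itertools import zip_longest


def first_mismatch(golden_lines, hw_lines):
    """Return (line_number, golden, hw) for the first difference, or None."""
    pairs = zip_longest(golden_lines, hw_lines)
    for number, (golden, hw) in enumerate(pairs, start=1):
        # A None on either side means one log is shorter than the other.
        if golden is None or hw is None or golden.rstrip() != hw.rstrip():
            return number, golden, hw
    return None


# Placeholder lines, NOT the real log format.
golden = ["line A", "line B", "line C"]
hw = ["line A", "line X", "line C"]
print(first_mismatch(golden, hw))  # prints (2, 'line B', 'line X')
```

To compare real logs you can pass two open file objects instead of lists, since iterating over a text file yields its lines.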
|
22,38 → 22,31
In addition to the main log file, there is a console log file to which all |
data written to the UART is logged (see section~\ref{uart_logging}).\\ |
|
There are a few simulation test bench templates in the /src directory, which |
are used by all the code samples.\\ |
The only ones actually used are '/src/code\_rom\_template.vhdl' and |
'/src/sim\_params\_template.vhdl'. The others |
are remnants of previous versions that will be removed ASAP.\\ |
|
The template in file '/src/code\_rom\_template.vhdl' is filled with object |
code meant to be run from internal FPGA BRAM. This is how we load bootstrap |
code into our FPGA. The resulting file is '/vhdl/demo/code\_rom\_pkg.vhdl' |
and is used by both the simulation test bench and the synthesizable MCU.\\ |
The simulation test bench can be found in file '/vhdl/tb/mips\_tb.vhdl'. |
This test bench is meant to be used with all the code samples. |
|
The template in file '/src/sim\_params\_template.vhdl' is filled with |
simulation parameters (such as the simulation length, etc.) and the resulting |
file is written as '/vhdl/tb/sim\_params\_pkg.vhdl'. This file is only used |
by the simulation test bench. |
Each of the code samples configures the simulation test bench with certain |
parameters (such as simulation length or memory sizes) and of course each |
sample has a different object code to be run. The way to pass these |
parameters to the simulation is through a simulation package, in file |
'/vhdl/tb/sim\_params\_pkg.vhdl'. |
|
All of this template filling is done by a python script (/src/bin2hdl.py) |
which is invoked from the makefiles and explained in section xxx.\\ |
This file is generated from a template whenever you 'make' each code sample |
(see section~\ref{samples}). The package is built using one of the |
provided tools, 'build\_pkg', explained in section~\ref{build_pkg}. |
|
|
Note that all code samples share the same vhdl files: you need to run the |
makefile target 'sim' for the sample you want to simulate; that will |
overwrite the two files mentioned above. So there's no vhdl file that is |
overwrite the package file mentioned above. So there's no vhdl file that is |
specific to a particular code sample.\\ |
|
The actual test bench entity is at '/vhdl/tb/mips\_tb.vhdl' and is shared |
by all the code samples.\\ |
|
|
While the test benches and sample code are good enough to catch MOST errors |
in the full system (i.e. cache included) they don't help with diagnosis; |
once you know there's an error, and the approximate address where it's |
triggered (approximate because of the cache) you have to dig into the |
simulation waveforms to find it. It's easier than it seems.\\ |
simulation waveforms to find it.\\ |
|
\section{Running the Simulation} |
\label{running_the_simulation} |
66,11 → 59,11
The test bench files mentioned in the previous section are automatically |
generated for each of the sample programs. This is automatically done by the |
sample code makefile, |
assuming you have a MIPS cross-toolchain in your computer (see section~\ref{code_samples}).\\ |
assuming you have a MIPS cross-toolchain in your computer (see section~\ref{samples}).\\ |
|
For convenience, a pre-generated mips\_tb.vhdl is included so you can launch |
a simulation without having to install toolchains, etc. The code is that |
of the 'hello world' sample.\\ |
For convenience, a pre-generated file 'sim\_params\_pkg.vhdl' is included |
so you can launch a simulation without having to install toolchains, etc. |
The code is that of the 'hello world' sample.\\ |
|
I guess that if you are interested in this sort of stuff then you probably |
know more about Modelsim than I do. Yet, here's a step-by-step guide to |
79,7 → 72,7
\begin{enumerate} |
\item Run 'make hello\_sim' from directory '/src/hello'. |
This will compile the program sources, build the necessary binary object |
files and then create the two package files mentioned above.\\ |
files and then create the package file mentioned above.\\ |
Read the makefile and comments in the python script '/src/bin2hdl.py' |
for details.\\ |
|
138,8 → 131,6
|
Events are logged with the address of the instruction that triggered |
the change. This holds true even for load instructions.\\ |
Early versions of the project logged the address of the |
preceding instruction -- it was confusing and I have fixed it.\\ |
|
The simulation log file is stored by default in modelsim's working directory |
(see above). I don't provide any automated script to do the comparison, you |
/src/tex/usage.tex
9,135 → 9,102
\begin{enumerate} |
\item The CPU (mips\_cpu.vhdl). |
\item The cache+memory controller (mips\_cache.vhdl). |
\item An 'MCU' entity which combines CPU+Cache (mips\_mpu.vhdl). |
\item A 'SoC' entity which combines CPU+Cache (mips\_soc.vhdl). |
\end{enumerate} |
|
The entity you should use in your projects is the MCU module. The project |
The entity you should use in your projects is the SoC module. The project |
includes a 'hardware demo' built around this module (see section |
~\ref{pregenerated_demo}) which can be used as an usage example.\\ |
~\ref{pregenerated_demo}) which is meant as a usage example.\\ |
|
The main modules are briefly described in the following subsections. |
|
\section{Bootstrap Code} |
\label{bootstrap_code} |
|
\section{MCU Module} |
\label{mcu_module} |
Though the core is meant to run mostly from off-chip memory, the current version |
of the SoC module includes a small ROM implemented as FPGA BRAM and called |
'bootstrap BRAM'. In the current version of the core, this BRAM can be loaded |
with arbitrary code and its size can be configured by using generics, but it |
can't be removed from the SoC. Even though the memory map can be modified to |
boot from external FLASH and not use a BRAM at all, a BRAM will still be |
inferred within the SoC -- subsequent versions will fix this. |
|
The MCU module main purpose is to encapsulate the somewhat complex |
interconnection between the CPU and the Cache module. |
As can be seen in table~\ref{tab_soc_memory_map}, the internal BRAM is mirrored |
all over a wide area starting at \texttt{0xb8000000}. In practice, this means |
the BRAM will be mapped at the CPU reset address (\texttt{0xbfc00000}) and thus |
the bootstrap code should be placed there. |
Unless the bootstrap BRAM is very small, it will span over the interrupt vector |
address too (\texttt{0xbfc00180}). |
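The arithmetic behind the last two statements can be sketched in Python. The three addresses are the ones given above and 1024 words is the documented default BRAM size; modelling the mirroring as a simple modulo over the BRAM size is my assumption, and the function name is hypothetical.

```python
BRAM_AREA_BASE = 0xB8000000  # start of the mirrored area
RESET_VECTOR = 0xBFC00000    # CPU reset address
IRQ_VECTOR = 0xBFC00180      # interrupt vector address


def bram_word_index(addr, bram_size_words=1024):
    """Word index within the BRAM for a CPU address in the mirrored area.

    Assumption: the mirror repeats every BRAM-size bytes (plain modulo).
    """
    size_bytes = bram_size_words * 4
    return ((addr - BRAM_AREA_BASE) % size_bytes) // 4


print(bram_word_index(RESET_VECTOR))  # prints 0
print(bram_word_index(IRQ_VECTOR))    # prints 96
```

Under that assumption the interrupt vector lands on word 96 (byte offset 0x180), so any BRAM larger than 96 words holds both the reset code and the interrupt entry point.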
|
If some project demands that some piece of hardware be directly connected to the |
CPU, bypassing the cache, this is where it should be -- an MMU comes to mind. |
For example, the 'Adventure' demo included with the project uses bootstrap |
code included in file \texttt{/src/common/bootstrap.s}. This bootstrap code |
is fairly incomplete (interrupt response code is mostly a stub) yet it's enough |
to boot most applications. |
Note that the C startup code, which deals with things like initializing the |
static variables on the data segment, etc. is not part of this bootstrap code. |
It can be found in file \texttt{/src/common/c\_startup.s}. |
|
Any peripherals deemed common enough that they will be present in all projects |
might be placed in the MCU module too -- after all, the MCU name has been chosen |
to imply that 'bundling together' of a CPU and a bunch of peripherals. |
So, in short, the code loaded onto the startup BRAM should include the most |
basic system initialization (cache initialization at least) and the entry point |
for the interrupt response code; plus a jump to the main program entry address. |
|
In the current version of the MCU module, there is only a peripheral included in |
it -- a hardwired UART module. There is no penalty for placing peripherals |
ouside the MCU module, so there is no incentive to place them inside, thus |
making the interface more complex. This is an implementation option of yours.\\ |
Anyone trying to build some application on this core is advised to use the code |
samples as starting points, especially the makefiles. |
|
\subsection{MCU Ports} |
\label{mcu_ports} |
|
\begin{figure}[h] |
\makebox[\textwidth]{\framebox[9cm]{\rule{0pt}{9cm} |
\includegraphics[width=8cm]{img/mpu_symbol.png}}} |
\caption{MPU module interface\label{mpu_symbol}} |
\end{figure} |
\subsection{Loading Bootstrap Code on the SoC Module} |
\label{loading_bootstrap_code} |
|
\begin{table}[h] |
\caption{MCU module interface ports} |
\begin{tabularx}{\textwidth}{ lll|X } |
\toprule |
Name & Type & Width & Description \\ |
\midrule |
clk & in & 1 & Clock input, active rising edge. \\ |
reset & in & 1 & Synchronous global reset. \\ |
\midrule |
sram\_address & out & 16 & Memory word address (bit 0 absent). \\ |
sram\_data\_wr & out & 16 & Memory write data. Only valid when one of the \\ |
& & & memory byte write enable outputs is active.\\ |
sram\_data\_rd & in & 16 & Memory read data. Latched when xxx. \\ |
sram\_byte\_we\_n & out & 2 & Memory byte write enable, active low. \\ |
& & & (0) enables the low byte (7 downto 0) \\ |
& & & (1) enables the high byte (15 downto 8). \\ |
\midrule |
io\_rd\_addr & out & 30 & I/O port read address (bits 1..0 absent). \\ |
& & & Only valid when io\_rd\_vma is high. \\ |
io\_wr\_addr & out & 30 & I/O port write address (bits 1..0 absent). \\ |
io\_wr\_data & out & 32 & I/O write data. Only valid when one of the \\ |
& & & i/o byte write enable outputs is active.\\ |
io\_rd\_data & in & 32 & I/O read data. Latched when xxx. \\ |
io\_byte\_we & out & 4 & I/O byte write enable, active high. \\ |
& & & (0) enables the low byte (7 downto 0) \\ |
& & & (3) enables the high byte (31 downto 24). \\ |
io\_rd\_vma & out & 1 & Active high on i/o read cycles. \\ |
\midrule |
uart\_rxd & in & 1 & RxD input to internal UART. \\ |
uart\_txd & out & 1 & TxD output from internal UART. \\ |
\midrule |
interrupt & in & 8 & Interrupt request inputs, active high. \\ |
\bottomrule |
\end{tabularx} |
\end{table} |
Once the code that is to be loaded on the bootstrap BRAM has been built, you |
need to load it onto the bootstrap BRAM within the FPGA. |
|
As you can see in figure~\ref{mpu_symbol} (symbol generated by Xilinx ISE), |
the MCU has the following interfaces: |
As you probably already know, there are several possible ways to deal with this |
and most of them involve using \emph{'Memory Initialization Files'} of |
some sort. This project is different. |
|
\begin{enumerate} |
\item Interface to external static asynchronous memory (SRAM, FLASH...). |
\item Interface to on-chip peripherals. |
\item Interrupt inputs. |
\end{enumerate} |
So far, this project does not include any support for using MIF |
files of any kind. Instead, the bootstrap BRAM is inferred and initialized |
using regular VHDL constructs and a constant passed to the SoC module as a |
generic. |
|
These interfaces will be explained in the following subsections. The top module |
for the demo supplied with the project (c2sb\_demo.vhdl) will be used for |
illustration. |
This scheme has a big drawback: every time the object code within the FPGA |
changes, the whole synthesis needs to be re-run. This drawback is manageable |
as long as the core is not used in any big project or if the bootstrap code |
does not change often. |
|
\emph{NOTE}: This section needs a lot of elaboration -- ideally this should be |
equivalent to |
a datasheet in thoroughness and detail. This work, like many other parts of this |
project, will have to wait. |
On the other hand, I see some big advantages in using regular BRAM inference in |
this stage of the project: |
|
\subsection{MCU interface to static memory} |
\label{mcu_if_sram} |
\begin{enumerate} |
\item The whole scheme is totally vendor agnostic. |
\item Object code embedded in VHDL constants can very easily be used in both simulation and synthesis. |
\end{enumerate} |
|
The interface to external memory in the MCU module is essentially that of the |
internal cache/memory controller. Its timing is described in section |
~\ref{cache_state_machine}.\\ |
So, whatever object code is to be used to initialize the SoC bootstrap BRAM has |
to be passed to the SoC module instance as a generic constant (see section |
~\ref{soc_generics}). The constant must be of type \texttt{t\_obj\_code}, which |
is defined in package \emph{mips\_pkg}. |
|
The MCU inputs are meant to be connected straight to the FPGA i/o pins. The only |
trick is the bidirectional memory data bus: as you can see, the MCU data buses |
are unidirectional and thus you will need to provide an interconnection |
external to this module. This interconnection shall include the requisite |
3-state buffers: |
|
\begin{verbatim} |
sram_databus <= sram_data_wr when sram_byte_we_n/="11" else (others => 'Z'); |
\end{verbatim} |
\subsection{Building the Bootstrap Initialization Constant} |
\label{boot_code_conversion} |
|
The top level module can be used as a fully tested example of how to use this |
interface to connect to a common SRAM chip (ISSI IS61LV25616). |
|
In reviewing the top module source, note that I had to adapt the dual |
byte-write-enable outputs to the SRAM |
configuration of a single write-enable plus dual byte-enable inputs. |
|
Note too that the static memory bus is used to access both the 16-bit wide SRAM |
and an 8-bit wide FLASH. These chips are connected to separate buses on the |
target board, so the top module needs to conflate both buses before connecting |
them to the MPU. This is why a multiplexor is used in the mpu\_sram\_data\_rd |
bus. A real-world board would probably have the SRAM and the FLASH connected |
to the same bus, simplifying the interface logic. |
|
The project includes a python script (\texttt{/tools/build\_pkg/build\_pkg.py}) |
whose purpose is to build a VHDL \texttt{t\_obj\_code} constant out of a |
\emph{binary} object code file. |
|
This script will read one or more big-endian, binary object files and will |
produce a VHDL package file that will contain initialization constants for |
the bootstrap BRAM and for some other memories that are only used in the |
simulation test bench. |
The package can optionally also include some simulation and synthesis |
configuration constants -- such as the size of the bootstrap BRAM. |
|
\subsection{MCU interface to peripherals} |
\label{mcu_if_io} |
|
TODO Documentation to be done |
|
\subsection{MCU interrupt inputs} |
\label{mcu_irqs} |
|
TODO Documentation to be done |
The makefiles included in the code samples invoke this script twice: once |
to generate a package called \emph{sim\_params\_pkg} and used in the |
simulation test bench; and once to build a package called |
\emph{bootstrap\_code\_pkg} used for synthesis. |
|
Please refer to the makefiles for usage examples, and read the script source |
for more detailed usage instructions. |
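To give an idea of what such a conversion involves, here is a minimal Python sketch that turns big-endian binary data into a VHDL constant. It is NOT the code of build\_pkg.py: the function names are hypothetical, and the shape of the generated aggregate (one 32-bit hex literal per word) is a guess, since the real layout of \texttt{t\_obj\_code} is whatever package mips\_pkg defines.

```python
import struct


def words_from_big_endian(data):
    """Split raw big-endian object code into 32-bit words (tail zero-padded)."""
    padded = data + b"\x00" * (-len(data) % 4)
    return [word for (word,) in struct.iter_unpack(">I", padded)]


def vhdl_constant(name, words):
    """Render the words as a VHDL constant (assumed, illustrative layout)."""
    if not words:
        raise ValueError("no object code to convert")
    # Named association keeps the aggregate legal even for a single word.
    body = ",\n".join(f'    {i} => X"{w:08X}"' for i, w in enumerate(words))
    return (f"constant {name} : t_obj_code(0 to {len(words) - 1}) := (\n"
            f"{body}\n);")


words = words_from_big_endian(b"\x3c\x1c\x00\x01\x27")
print(vhdl_constant("obj_code", words))
```

The real script does much more (several input files, configuration constants, package header and footer), so use this only as a reading aid for the script source.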
|
|
/src/tex/soc.tex
0,0 → 1,397
|
\chapter{SoC Module} |
\label{soc_module} |
|
The main purpose of the SoC module is to encapsulate the somewhat complex |
interconnection between the CPU and the Cache/Memory Controller module. |
|
If some project demands that some piece of hardware be directly connected to the |
CPU, bypassing the cache, this is where it should be -- an MMU comes to mind. |
|
Any peripherals deemed common enough that they will be present in all projects |
might be placed in the SoC module too. |
|
In the current version of the SoC module, there is only one peripheral included |
in it -- a hardwired UART module. There is no penalty for placing peripherals |
outside the SoC module, so there is no incentive to place them inside. This is |
an implementation option of yours.\\ |
|
Bear in mind that, in its current state, the SoC module is little more than a |
vehicle for building demos around the ION CPU. It is not meant as a real-world |
SoC, though it might be developed into one eventually. |
|
\section{SoC Generics} |
\label{soc_generics} |
|
The SoC needs to be configured upon instantiation by setting the following |
generics: |
|
\begin{table}[h] |
\caption{SoC module generics\label{tab_soc_generics}} |
\begin{tabularx}{\textwidth}{ lll|X } |
\toprule |
Name & Type & Default value & Description \\ |
\midrule |
\texttt{BOOT\_BRAM\_SIZE} & integer & 1024 & Bootstrap BRAM size in 32-bit words. \\ |
\texttt{OBJ\_CODE} & t\_obj\_code & (void code) & Bootstrap BRAM contents. \\ |
\midrule |
\texttt{CLOCK\_FREQ} & integer & 50e6 & Main clock rate. \\ |
\texttt{BAUD\_RATE} & integer & 19200 & UART baud rate. \\ |
\midrule |
\texttt{SRAM\_ADDR\_SIZE} & integer & 17 & Size of SRAM address bus. \\ |
\bottomrule |
\end{tabularx} |
\end{table} |
|
The current version of the SoC is not very strict in the enforcement of limits |
for the generics. You are advised to use only 'reasonable' values. This will |
be fixed, eventually. |
|
Generic \texttt{CLOCK\_FREQ} is only needed in order to compute the default |
baud period for the internal UART (from the value of generic \texttt{BAUD\_RATE}). |
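As a quick check of what to expect when you change either generic, the baud period in clock cycles can be estimated in Python. This is a sketch only: rounding to the nearest integer is my assumption, the real UART may truncate instead, and the function names are hypothetical.

```python
def baud_period_cycles(clock_freq_hz, baud_rate):
    """Clock cycles per UART bit, rounded to the nearest integer (assumed)."""
    return round(clock_freq_hz / baud_rate)


def baud_error(clock_freq_hz, baud_rate):
    """Relative error of the baud rate actually obtained with that period."""
    actual = clock_freq_hz / baud_period_cycles(clock_freq_hz, baud_rate)
    return abs(actual - baud_rate) / baud_rate


print(baud_period_cycles(50_000_000, 19200))  # prints 2604
```

With the default values the period is 2604 cycles and the resulting baud rate error is negligible (well below 0.1%).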
|
|
Generic \texttt{BOOT\_BRAM\_SIZE} will determine the size of the internal |
bootstrap BRAM. This generic \emph{can't be zero}; in the current version of |
the SoC, the BRAM can't be disabled or omitted. |
|
Note that if the size of the bootstrap BRAM is not enough to hold the whole |
bootstrap code provided in generic \texttt{OBJ\_CODE}, the code \emph{will |
be silently truncated!} Usually this will result in an early crash. |
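Since the truncation is silent, it may be worth checking the sizes before launching a synthesis. A minimal Python sketch of such a check follows, assuming the object code length is counted in 32-bit words like \texttt{BOOT\_BRAM\_SIZE}; the function name is hypothetical and nothing like it is part of the project tools.

```python
def spare_bram_words(obj_code_words, boot_bram_size_words):
    """Return the number of unused BRAM words, or raise if the code won't fit."""
    if boot_bram_size_words <= 0:
        raise ValueError("BOOT_BRAM_SIZE can't be zero or negative")
    excess = len(obj_code_words) - boot_bram_size_words
    if excess > 0:
        raise ValueError(
            f"bootstrap code is {excess} words too long and would be truncated")
    return -excess


print(spare_bram_words([0] * 1000, 1024))  # prints 24
```

A check like this could be run from a makefile right after the object code is built, so that an oversized bootstrap stops the build instead of crashing on the board.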
|
Generic \texttt{OBJ\_CODE} is used at synthesis time to initialize the bootstrap |
BRAM. This generic is meant to contain bootstrap code, as seen in section |
~\ref{bootstrap_code}. It can be omitted, in which case the bootstrap BRAM |
will be initialized to all zeros. |
|
|
\section{SoC Ports} |
\label{soc_ports} |
|
\begin{figure}[h] |
\makebox[\textwidth]{\framebox[9cm]{\rule{0pt}{9cm} |
\includegraphics[width=8cm]{img/soc_symbol.png}}} |
\caption{SoC module interface\label{soc_symbol}} |
\end{figure} |
|
\begin{table}[h] |
\caption{SoC module interface ports} |
\begin{tabularx}{\textwidth}{ lll|X } |
\toprule |
Name & Type & Width & Description \\ |
\midrule |
clk & in & 1 & Clock input, active rising edge. \\ |
reset & in & 1 & Synchronous global reset. \\ |
\midrule |
sram\_address & out & 16 & Memory word address (bit 0 absent). \\ |
sram\_data\_wr & out & 16 & Memory write data. Only valid when one of the \\ |
& & & memory byte write enable outputs is active.\\ |
sram\_data\_rd & in & 16 & Memory read data. Latched when xxx. \\ |
sram\_byte\_we\_n & out & 2 & Memory byte write enable, active low. \\ |
& & & (0) enables the low byte (7 downto 0) \\ |
& & & (1) enables the high byte (15 downto 8). \\ |
\midrule |
io\_rd\_addr & out & 30 & I/O port read address (bits 1..0 absent). \\ |
& & & Only valid when io\_rd\_vma is high. \\ |
io\_wr\_addr & out & 30 & I/O port write address (bits 1..0 absent). \\ |
io\_wr\_data & out & 32 & I/O write data. Only valid when one of the \\ |
& & & i/o byte write enable outputs is active.\\ |
io\_rd\_data & in & 32 & I/O read data. Latched when xxx. \\ |
io\_byte\_we & out & 4 & I/O byte write enable, active high. \\ |
& & & (0) enables the low byte (7 downto 0) \\ |
& & & (3) enables the high byte (31 downto 24). \\ |
io\_rd\_vma & out & 1 & Active high on i/o read cycles. \\ |
\midrule |
uart\_rxd & in & 1 & RxD input to internal UART. \\ |
uart\_txd & out & 1 & TxD output from internal UART. \\ |
\midrule |
interrupt & in & 8 & Interrupt request inputs, active high. \\ |
\bottomrule |
\end{tabularx} |
\end{table} |
|
As you can see in figure~\ref{soc_symbol} (symbol generated by Xilinx ISE), |
the SoC has the following interfaces: |
|
\begin{enumerate} |
\item Interface to external static asynchronous memory (SRAM, FLASH...). |
\item Interface to on-chip peripherals. |
\item Interrupt inputs. |
\item Debug port. |
\end{enumerate} |
|
These interfaces will be explained in the following subsections. The top module |
for the demo supplied with the project (c2sb\_demo.vhdl) will be used for |
illustration. |
|
\emph{NOTE}: This section needs a lot of elaboration -- ideally this should be |
equivalent to a datasheet in thoroughness and detail. This work, like many |
other parts of this project, will have to wait. |
|
\subsection{SoC interface to static memory} |
\label{soc_if_sram} |
|
The interface to external memory in the SoC module is essentially that of the |
internal cache/memory controller. Its timing is described in |
section~\ref{cache_state_machine}.\\ |
|
The SoC memory signals are meant to be connected straight to the FPGA i/o pins. The only |
trick is the bidirectional memory data bus: as you can see, the SoC data buses |
are unidirectional and thus you will need to provide an interconnection |
external to this module. This interconnection shall include the requisite |
3-state buffers: |
|
\begin{verbatim} |
sram_databus <= sram_data_wr when sram_byte_we_n/="11" else (others => 'Z'); |
\end{verbatim} |
|
The top level \emph{c2sb\_demo} module can be used as a fully tested example of |
how to use this interface to connect to a common 16-bit-wide SRAM chip |
(ISSI IS61LV25616). |
|
In reviewing the top module source, note that I had to adapt the dual |
byte-write-enable outputs to the SRAM |
configuration of a single write-enable plus dual byte-enable inputs. |
|
Note too that the static memory bus of the SoC module is used to access both the 16-bit wide SRAM |
and an 8-bit wide FLASH. These chips are connected to separate buses on the |
target board, so the top c2sb\_demo module needs to conflate both buses before connecting |
them to the SoC. This is why a multiplexor is used in the \texttt{mpu\_sram\_data\_rd} |
bus. A real-world board would probably have the SRAM and the FLASH connected |
to the same bus, simplifying the interface logic. |
|
|
\subsection{SoC interface to peripherals} |
\label{soc_if_io} |
|
Every CPU access to an area designated as I/O (see section~\ref{soc_memory_map}, memory map) |
will trigger a read/write cycle on this interface. |
|
I/O ports are synchronous, byte-accessible registers meant to be implemented |
within the FPGA. I/O ports do not support wait states. |
|
The I/O interface has separate input and output buses. |
|
In an output cycle, one or more lines of signal \texttt{io\_byte\_we} will be |
asserted for one clock cycle. Signals \texttt{io\_wr\_addr} and \texttt{io\_wr\_data} will |
be valid as long as \texttt{io\_byte\_we} is asserted. |
|
In an input cycle, \texttt{io\_rd\_vma} will be asserted for one cycle and the input |
data should be present at \texttt{io\_rd\_data} at the end of the following clock |
cycle. The full read operation extends over two clock cycles. |
|
\subsection{SoC interrupt inputs} |
\label{soc_irqs} |
|
The present version of the CPU does not have support for hardware interrupts |
and therefore these signals are not used yet and are unconnected. |
Hardware interrupts will be implemented in some future version as |
time permits. |
|
\subsection{SoC debug port} |
\label{soc_debug_port} |
|
The debug port is a VHDL record (\texttt{t\_debug\_info}, defined in |
package \emph{mips\_pkg}), which holds some internal CPU status flags that |
can be useful while debugging the core. It is not meant to be useful for |
a real application. |
|
Currently the record holds only two flags: |
|
\begin{itemize} |
\item \texttt{cache\_enabled}, asserted when the cache is enabled. |
\item \texttt{unmapped\_access}, asserted when some access to an unmapped |
address is made. |
\end{itemize} |
|
The current version of the demo connects these signals to some on-board |
LEDs. |
|
|
\section{SoC Memory Map} |
\label{soc_memory_map} |
|
The \emph{memory map} determines the type of memory that is connected to |
each of a number of predefined address ranges (see |
section~\ref{memory_map_definition}). |
It is defined in package \emph{mips\_pkg} and it is implemented in the |
\emph{mips\_cache} module. |
|
\begin{table}[h] |
\caption{SoC module memory map\label{tab_soc_memory_map}} |
\begin{tabularx}{\textwidth}{ lll|X } |
\toprule |
Address range & Type & Wait States & Intended usage \\ |
\midrule |
\texttt{0xb8000000-0xbfffffff} & BRAM & 0 & SoC internal boot BRAM. \\ |
\midrule |
\texttt{0x00000000-0x07ffffff} & SRAM-16 & 2 & Off-chip SRAM. \\ |
\texttt{0x80000000-0x87ffffff} & SRAM-16 & 2 & Off-chip SRAM. \\ |
\texttt{0x20000000-0x27ffffff} & I/O & 0 & On-chip I/O registers. \\ |
\texttt{0xb0000000-0xb7ffffff} & SRAM-8 & 7 & Off-chip SRAM or FLASH. \\ |
\bottomrule |
\end{tabularx} |
\end{table} |
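
The table can be read as a plain address decoder. The C sketch below only
illustrates that decoding; it is not the VHDL found in \emph{mips\_pkg} or
\emph{mips\_cache}, and the names \texttt{mem\_type\_t},
\texttt{decode\_region} and \texttt{wait\_states} are made up for this example.
It relies on the fact that every range in the table is 128~MB wide and
128~MB aligned, so the top 5 address bits select the region.

```c
#include <stdint.h>

/* Region types, named after the "Type" column of the memory map table.
   MT_UNMAPPED stands for any address not listed in the table. */
typedef enum { MT_UNMAPPED, MT_BRAM, MT_SRAM_16, MT_IO, MT_SRAM_8 } mem_type_t;

/* Classify an address using its top 5 bits (128 MB granularity). */
static mem_type_t decode_region(uint32_t addr)
{
    switch (addr >> 27) {
    case 0x00000000u >> 27: return MT_SRAM_16;  /* 0x00000000-0x07ffffff */
    case 0x80000000u >> 27: return MT_SRAM_16;  /* 0x80000000-0x87ffffff */
    case 0x20000000u >> 27: return MT_IO;       /* 0x20000000-0x27ffffff */
    case 0xb0000000u >> 27: return MT_SRAM_8;   /* 0xb0000000-0xb7ffffff */
    case 0xb8000000u >> 27: return MT_BRAM;     /* 0xb8000000-0xbfffffff */
    default:                return MT_UNMAPPED;
    }
}

/* Wait states per region type, copied from the table. */
static int wait_states(mem_type_t t)
{
    switch (t) {
    case MT_SRAM_16: return 2;
    case MT_SRAM_8:  return 7;
    default:         return 0;  /* BRAM and I/O; unmapped has no entry */
    }
}
```

For instance, the MIPS reset vector \texttt{0xbfc00000} falls in the boot BRAM
range, and the UART registers at \texttt{0x2000000x} fall in the I/O range.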
|
|
\section{SoC UART} |
\label{soc_uart} |
|
The current revision of the SoC includes a single peripheral, a hardwired |
8-bit UART (file \emph{uart.vhdl}). |
|
This UART is an 8-bit module built for some other unrelated project of mine |
and commandeered to serve on this SoC. Therefore, it has some features |
(like its 8-bit interface) which are sub-optimal for this application and/or |
are not used. |
|
The UART is 'hardwired' because some of its operational parameters are |
hardcoded and can't be changed even at synthesis time. Namely: |
|
\begin{itemize} |
\item Stop bits: 1. |
\item Parity: None. |
\item Bits per character: 8. |
\end{itemize} |
|
All other parameters can at least be configured at synthesis time, and |
under some conditions can be configured at run time too. The interested |
user must read the module source for a better explanation of these |
features. This document will only deal with the UART module as it is |
instantiated in the SoC. |
|
|
These are the UART control registers: |
|
\begin{table}[h] |
\caption{UART control registers\label{uart_control_regs}} |
\begin{tabularx}{\textwidth}{ ll|X } |
\toprule |
Byte Address & Word Address & Register \\ |
\midrule |
\texttt{0x20000003} & \texttt{0x20000000} & Tx/Rx Buffer \\ |
\texttt{0x20000007} & \texttt{0x20000004} & Status. \\ |
\texttt{0x2000000b} & \texttt{0x20000008} & Baud rate period, low byte. \\ |
\texttt{0x2000000f} & \texttt{0x2000000c} & Baud rate period, high byte. \\ |
\bottomrule |
\end{tabularx} |
\end{table} |
|
|
All of these registers are mapped to the byte address given in the table, |
that is, they are mapped on the \emph{low} byte of the 32-bit word they |
belong to -- you don't have to worry about this unless you use a word |
pointer to access these registers. |
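
The relation between the two address columns can be sketched in C. This is
an illustration only: the macro and function names are hypothetical, and the
offset of 3 assumes a big-endian bus, which is what the byte addresses in
table~\ref{uart_control_regs} imply.

```c
#include <stdint.h>

/* Word addresses, taken from the UART control register table. */
#define UART_WORD_BUFFER   0x20000000u
#define UART_WORD_STATUS   0x20000004u
#define UART_WORD_BAUD_LO  0x20000008u
#define UART_WORD_BAUD_HI  0x2000000cu

/* The registers sit on the low byte of each word; with big-endian byte
   ordering (assumed) the low byte is at offset 3 within the word. */
static inline uint32_t uart_byte_addr(uint32_t word_addr)
{
    return word_addr + 3u;
}

/* Byte pointer to a register, for use on the target only. */
static inline volatile uint8_t *uart_reg8(uint32_t word_addr)
{
    return (volatile uint8_t *)(uintptr_t)uart_byte_addr(word_addr);
}

/* When a whole word is read instead, the register is in bits 7..0. */
static inline uint8_t uart_field_from_word(uint32_t word)
{
    return (uint8_t)(word & 0xffu);
}
```

On the target, \texttt{*uart\_reg8(UART\_WORD\_STATUS)} would read the status
register through a byte pointer, which sidesteps the word-pointer caveat.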
|
|
\subsection{UART Usage} |
\label{soc_uart_usage} |
|
Until hardware interrupts are implemented, you have to rely on polling to |
use the UART. |
|
When you want to transmit, you wait until flag TxRdy is '1' and then write |
to the Tx Buffer. That will clear TxRdy until the transmission is done. |
Writing to the Tx Buffer will NOT clear flag TxIrq. |
|
Writing to the Tx Buffer while TxRdy is '0' will have no effect. |
|
When you want to read received data, you wait until RxRdy is '1' and then |
read the Rx Buffer. Reading the Rx Buffer will clear flag RxRdy until a new |
byte arrives. |
Reading the Rx Buffer while RxRdy is '0' will return undefined data. |
Reading the Rx Buffer will NOT clear flag RxIrq. |
|
Of course, once hardware interrupts are implemented, you will use them |
instead of polling, but these are the basic mechanics. Same as any old UART, |
really. |
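
The polling sequence can be sketched in C as follows. This is a minimal
sketch and not the code found in libsoc: the type \texttt{uart\_t} and the
functions \texttt{uart\_putc} and \texttt{uart\_getc} are hypothetical names.
The bit positions of TxRdy (bit 0) and RxRdy (bit 1) are those shown in
section~\ref{soc_status_reg}. The register pointers are passed in so that the
same code can be tried on a host against plain variables.

```c
#include <stdint.h>

#define UART_TXRDY 0x01u  /* status bit 0: transmitter idle     */
#define UART_RXRDY 0x02u  /* status bit 1: received byte waiting */

/* Hypothetical register handle. On the SoC, 'buffer' and 'status' would
   point to byte addresses 0x20000003 and 0x20000007 respectively. */
typedef struct {
    volatile uint8_t *buffer;
    volatile uint8_t *status;
} uart_t;

/* Wait until TxRdy is '1', then write the Tx Buffer. */
static void uart_putc(const uart_t *u, uint8_t c)
{
    while ((*u->status & UART_TXRDY) == 0) {
        /* busy wait: a write while TxRdy is '0' would have no effect */
    }
    *u->buffer = c;
}

/* Wait until RxRdy is '1', then read the Rx Buffer. */
static uint8_t uart_getc(const uart_t *u)
{
    while ((*u->status & UART_RXRDY) == 0) {
        /* busy wait: a read while RxRdy is '0' returns undefined data */
    }
    return *u->buffer;
}
```

Neither function touches TxIrq or RxIrq, in line with the behaviour described
in this section.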
|
|
|
\subsection{UART Status Register} |
\label{soc_status_reg} |
|
These are the flags present in the status register: |
|
|
\needspace{7\baselineskip} |
\begin{verbatim} |
UART Status Register |
|
7 6 5 4 3 2 1 0 |
+-------+-------+-------+-------+-------+-------+-------+-------+ |
| 0 | 0 | RxIrq | TxIrq | 0 | 0 | RxRdy | TxRdy | |
+-------+-------+-------+-------+-------+-------+-------+-------+ |
h h W1C W1C h h r r |
\end{verbatim} |
|
|
Bits marked 'h' are hardwired and can't be modified. |
|
Bits marked 'r' are read only; they are set and cleared by the UART core. |
|
Bits marked W1C ('Write 1 Clear') are set by the UART core when an interrupt |
has been triggered and must be cleared by the software by writing a '1'. |
|
\begin{itemize} |
\item Status bit TxRdy is high when there isn't any transmission in progress. |
It is cleared when data is written to the transmission buffer and is |
raised at the same time the transmission interrupt is triggered. |
\item Status bit RxRdy is raised at the same time the receive interrupt is |
triggered and is cleared when the data register is read. |
\item Status bit TxIrq is raised when the transmission interrupt is triggered |
and is cleared when a 1 is written to it. |
\item Status bit RxIrq is raised when the reception interrupt is triggered |
and is cleared when a 1 is written to it. |
\end{itemize} |
|
When writing to the status/control registers, only flags TxIrq and RxIrq are |
affected, and only when writing a '1' as explained above. All other flags |
are read-only. |
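
The 'Write 1 Clear' behaviour can be modelled in a few lines of C. This is a
software model written for illustration, under the assumption that the
register behaves exactly as described in this section; the function names are
hypothetical and the UART source remains the reference.

```c
#include <stdint.h>

#define UART_TXIRQ    0x10u  /* status bit 4, W1C */
#define UART_RXIRQ    0x20u  /* status bit 5, W1C */
#define UART_IRQ_MASK (UART_TXIRQ | UART_RXIRQ)

/* Model of a write to the status register: only the W1C bits written
   as '1' are cleared; every other bit keeps its value. */
static uint8_t uart_status_after_write(uint8_t status, uint8_t written)
{
    return (uint8_t)(status & ~(written & UART_IRQ_MASK));
}

/* Value software would write back to acknowledge whatever interrupt
   flags are pending in a status value it has just read. */
static uint8_t uart_ack_value(uint8_t status)
{
    return (uint8_t)(status & UART_IRQ_MASK);
}
```

Writing back only the pending flags avoids clearing an interrupt flag that
was raised between the read and the write.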
|
|
\subsection{UART Baud Rate Registers} |
\label{soc_uart_baud_regs} |
|
When the UART module generic 'HARDWIRED' is set to 'false', these registers |
can be written to in order to configure the baud rate -- see the source for |
details. |
|
When the UART module generic 'HARDWIRED' is set to 'true', these registers |
are frozen and can't be modified. This is how the module is instantiated in |
the current version of the SoC. |
|
In either case, these are write-only registers: reading them will return |
the contents of the status register (simplified multiplexor). |
|
The baud rate is configured by loading these registers with the baud period |
in clock cycles. |
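
As a worked example, the period for a given clock and baud rate can be
computed as sketched below. The helper names are hypothetical, the clock
frequency is whatever the target board uses, and whether the core expects
the plain period or some adjusted value (for example the period minus one)
is not stated here and must be checked in the UART source.

```c
#include <stdint.h>

/* Baud period in clock cycles, rounded to the nearest integer.
   The result must fit in the 16 bits offered by the two registers. */
static uint16_t baud_period(uint32_t clk_hz, uint32_t baud)
{
    return (uint16_t)((clk_hz + baud / 2u) / baud);
}

/* Split the period into the values for the low and high byte registers. */
static uint8_t baud_period_low(uint16_t period)
{
    return (uint8_t)(period & 0xffu);
}

static uint8_t baud_period_high(uint16_t period)
{
    return (uint8_t)(period >> 8);
}
```

With an assumed 50~MHz clock and 19200 baud the period is 2604 cycles, that
is \texttt{0x2c} for the low byte register and \texttt{0x0a} for the high
byte register.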
|
|
\subsection{UART Interrupt} |
\label{soc_uart_interrupts} |
|
The UART can trigger an interrupt (i.e. assert its interrupt output for one |
clock cycle) whenever a character is received or transmitted. The UART |
source explains in detail exactly when these interrupts are triggered. |
|
The interrupt status is kept in two flags on the status register that can |
be used for interrupt polling. Note there's no way to tell what kind of |
interrupt we got other than looking at those flags. |
|
Since the current CPU revision does not support hardware interrupts, this |
feature is still unused and the interrupt line is unconnected. |
Again, details can be found in the UART module source. |
|
|
|
|
|
|
|
|
|
|
|
|
|
/src/tex/cpu.tex
10,8 → 10,8
\caption{CPU module interface\label{cpu_symbol}} |
\end{figure} |
|
the CPU module is not meant to be used directly. Instead, the MCU module |
described in section ~\ref{mcu_module} should be used.\\ |
the CPU module is not meant to be used directly. Instead, the SoC module |
described in chapter ~\ref{soc_module} should be used.\\ |
|
The following sections will describe the CPU structure and interfaces.\\ |
|
46,7 → 46,7
Note that the basic cpu module (mips\_cpu) is meant to be connected to |
internal, synchronous BRAMs only (i.e. the cache BRAMs). Some of its |
outputs are not registered because they needn't be. The parent module |
(called 'mips\_mcu') has its outputs registered to limit $t_{co}$ to |
(called 'mips\_soc') has its outputs registered to limit $t_{co}$ to |
acceptable values.\\ |
|
|
171,9 → 171,9
stages as long as it is active. It is meant to be used by the cache at cache |
refills.\\ |
|
In short, the cache/memory controller stops the cpu for all data/code |
The cache/memory controller stops the cpu for all data/code |
misses for as long as it takes to do a line refill. The current cache |
implementation does refills in order (i.e. not 'target address first'). |
implementation does refills in reverse order (i.e. not 'target address first'). |
|
Note that external memory wait states are a separate issue. They too are |
handled in the cache/memory controller. See section~\ref{cache} on the memory |
254,7 → 254,7
\end{verbatim}\\ |
|
In the source code, all registers and signals in stage |
\textless i\textgreater are prefixed by |
\textless i\textgreater are prefixed by |
"p\textless i\textgreater\_", as in p0\_*, p1\_* and p2\_*. |
A stage includes a set of registers and |
all the logic that feeds from those registers (actually, all the logic |
271,9 → 271,12
ports belong logically to stage 1 and the write port to stage 2.\\ |
|
IMPORTANT: though the register bank read port is synchronous, its data can |
be used in stage 1 because it is read early (the read port is loaded at the |
be used in stage 1 because it is read early (the read address port is loaded at the |
same time as the instruction opcode). That is, a small part of the |
instruction decoding is done on stage FETCH-1. Bearing in mind that the code |
instruction decoding is done on stage FETCH-1, by feeding the source |
register index field straight from the code bus to the register bank BRAM. |
|
Bearing in mind that the code cache |
ram is meant to be the exact same type of block as the register bank (or |
faster if the register bank is implemented with distributed RAM), and we |
will cram the whole ALU delay plus the reg bank delay in stage 1, it does |
454,7 → 457,7
saved to EPF is not the victim instruction's but the preceding jump |
instruction's as explained in \cite[p.~64]{see_mips_run}.\\ |
|
Plasma used to save in epc the address of the instruction after break or |
Plasma CPU used to save in epc the address of the instruction after break or |
syscall, and used a nonstandard vector address (0x03c). This core will go
the standard R3000 way instead.\\ |
|
/src/tex/intro.tex
1,11 → 1,11
\clearpage |
This file contains usage instructions and notes about the Ion CPU core project. |
The core structure is briefly explained in sections 1 to 4. The rest of this |
The core structure is briefly explained in sections 1 to 5. The rest of this |
doc describes other aspects of the project: code samples, utility scripts, |
etc.\\ |
|
This document is not yet a full reference on the Ion core. Instead, it should be |
taken as a companion and commentary to the source code.\\ |
This document is not yet a full reference on the Ion core or a data sheet. |
Instead, it should be taken as a companion and commentary to the source code.\\ |
|
This document assumes you know in some depth the MIPS-I architecture. Terms and |
concepts from \cite['See MIPS Run']{see_mips_run} and |
36,9 → 36,9
\item All unimplemented opcodes trigger the proper traps. |
\item Includes minimalistic memory handler with interfaces for external |
SRAM (or FLASH) on 8- and 16-bit data bus. |
\item Size and speed compares favorably to other free MIPS cores. |
\item Size and speed comparable to other free MIPS cores. |
    \item Fully synchronous (rising clock edge only). No latches.
\item Source HDL is vendor independent (Though it has only been tested on |
\item Source HDL is fully vendor independent (Only tested on |
Xilinx and Altera synthesis tools). |
\end{enumerate} |
\end{framed} |
71,7 → 71,7
\item Hardware interrupts not implemented. |
\item Memory handler does not support dynamic RAM. |
\item Caches are not configurable or parametrizable. |
\item Documentation is disastrously inadequate. |
\item Real documentation (specs doc \& data sheet) missing. |
\end{enumerate} |
\end{framed} |
|
/src/tex/sw_samples.tex
20,8 → 20,7
|
Target 'demo' will build a synthesizable demo; it will compile the sample |
sources and place the resulting object code in file |
'/vhdl/demo/code\_rom\_pkg.vhdl' (note that the 'sim' target has to do this |
too).\\ |
'/vhdl/SoC/bootstrap\_code\_pkg.vhdl'.\\ |
|
The build process will produce two or more binary files ('*.code' and |
'*.data', or '*.bin') that can be run on the software simulator, plus a |
31,6 → 30,7
software simulator with the proper parameters. As an example, these are the |
contents of the 'swsim.bat' file for the 'hello' demo: |
|
\needspace{8\baselineskip} |
\begin{verbatim} |
@rem Run software simulator in hands-off mode |
..\..\tools\slite\slite\bin\Debug\slite.exe ^ |
51,6 → 51,7
templates. |
Assuming you have Python 2.5 or later in your machine, call the script with: |
|
\needspace{1\baselineskip} |
\begin{verbatim} |
python bin2hdl.py --help |
\end{verbatim}\\ |
57,3 → 58,165
|
to get a short description (see section~\ref{python_script}). |
|
\section{Support Code} |
\label{support_code} |
|
\subsection{Bootstrap Code} |
\label{asm_bootstrap_code} |
|
File \emph{'/src/common/bootstrap.s'} contains the reset code and the stub |
interrupt handler. |
|
This code is meant to be placed at the reset vector address; in the present |
revision of the project, the bootstrap code would be placed in the |
bootstrap BRAM of the SoC module (see section~\ref{bootstrap_code}). It
can be placed in external memory if the SoC memory map and the makefiles are |
altered accordingly. |
|
The reset code initializes both I-Cache and D-Cache and jumps to symbol |
\texttt{'entry'} with the CPU in kernel mode and interrupts disabled. |
|
|
The interrupt code is somewhat more elaborate but nevertheless is still |
a stub. The interrupt code will find out the cause of the interrupt and will |
proceed. |
|
Table~\ref{tab_irq_handling} shows interrupt sources recognized in the
current version, and how the interrupt code deals with them. |
|
\begin{table}[h] |
\caption{Interrupt handling\label{tab_irq_handling}} |
\begin{tabularx}{\textwidth}{ l|X } |
\toprule |
Cause & Processing \\ |
\midrule |
SYSCALL instruction & Return immediately (STUB). \\ |
BREAK instruction & Return immediately (STUB). \\ |
Invalid opcode & Return immediately (STUB). \\ |
Unimplemented MIPS-32 opcode & Emulate opcode. \\ |
\bottomrule |
\end{tabularx} |
\end{table} |
|
Please note that the interrupt code does not even \emph{know} there is |
such a thing as a hardware interrupt -- hardware interrupts are not |
implemented yet in the CPU. |
|
As can be seen, most of the interrupt processing is missing, replaced by |
stubs. Eventually, those stubs will be replaced with calls to C functions |
that will be defined in the supporting libraries. I have yet to work out |
exactly how to do it without reinventing the wheel. |
|
The only interrupt processing that is performed (even if only partially) is |
the emulation of \emph{some} MIPS-32 opcodes. |
|
\subsection{MIPS-32 Opcode Emulation} |
\label{mips32_opcode_emulation} |
|
I have found out that most MIPS toolchains target the MIPS-32 architecture |
and can only be made to work with the MIPS-I architecture with some |
difficulty. This applies especially to the C support libraries, where
MIPS-32 opcodes are used occasionally.
Other reasons, such as the availability of MIPS-32 ports of
uClinux, make at least partial compatibility with MIPS-32 very desirable.
|
On the other hand, extending the core to implement the full MIPS-32 specs (as |
opposed to the far simpler MIPS-I) might not be possible without running |
into a patent minefield -- I may be wrong in this. |
|
Therefore, I have chosen to just emulate whatever subset of the |
MIPS-32 ISA is enough to overcome the above obstacles. I have run some |
experiments with uClinux and have found out that only a few MIPS-32 opcodes |
need to be emulated -- see table~\ref{tab_emulated_opcodes}.
|
\begin{table}[h] |
\caption{Emulated MIPS-32 opcodes\label{tab_emulated_opcodes}} |
\begin{tabularx}{\textwidth}{ l|X } |
\toprule |
Opcode & Emulation \\ |
\midrule |
EXT & Fully emulated. \\ |
INS & Fully emulated. \\ |
CLO & Fully emulated. \\ |
CLZ & Fully emulated. \\ |
\midrule |
MUL (3-reg version) & To Be Done. \\ |
LWL & To Be Done. \\ |
LWR & To Be Done. \\ |
SWL & To Be Done. \\ |
SWR & To Be Done. \\ |
\bottomrule |
\end{tabularx} |
\end{table} |
|
Opcode emulation is implemented in file \emph{/src/common/opcode\_emu.s}. |
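
For reference, the results that the four fully emulated opcodes must produce
can be written down in C. This is a sketch of the instruction semantics as
defined by the MIPS-32 architecture, not a transcription of
\emph{opcode\_emu.s}; the function names are hypothetical. Field values are
assumed to be legal (\texttt{pos} in 0..31, \texttt{size} in 1..32,
\texttt{pos + size} not above 32).

```c
#include <stdint.h>

/* Mask with the low 'size' bits set (size in 1..32). */
static uint32_t field_mask(unsigned size)
{
    return (size >= 32u) ? 0xffffffffu : ((1u << size) - 1u);
}

/* EXT: extract 'size' bits of rs starting at bit 'pos', zero-extended. */
static uint32_t mips32_ext(uint32_t rs, unsigned pos, unsigned size)
{
    return (rs >> pos) & field_mask(size);
}

/* INS: insert the low 'size' bits of rs into rt at bit 'pos'. */
static uint32_t mips32_ins(uint32_t rt, uint32_t rs, unsigned pos, unsigned size)
{
    uint32_t mask = field_mask(size);
    return (rt & ~(mask << pos)) | ((rs & mask) << pos);
}

/* CLZ: count leading zeros; the result is 32 when rs is zero. */
static unsigned mips32_clz(uint32_t rs)
{
    unsigned n = 0;
    while (n < 32u && (rs & (0x80000000u >> n)) == 0) {
        n++;
    }
    return n;
}

/* CLO: count leading ones; the result is 32 when rs is all ones. */
static unsigned mips32_clo(uint32_t rs)
{
    return mips32_clz(~rs);
}
```

Functions like these can serve as a host-side reference when checking the
results of the \emph{'opcodes'} code sample.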
|
The emulation code has been tested (code sample \emph{'opcodes'}) but is |
still unfinished: not only are there opcodes to be emulated, as can |
be seen in table~\ref{tab_emulated_opcodes}, but the emulation does NOT
work when the MIPS-32 opcode is in a delay slot; or more precisely, the
jump instruction of the delay slot is not emulated as it should be -- this is
just work to be done, nothing especially difficult.
|
|
If the trapped MIPS-32 opcode is not one of the emulated opcodes, it is |
ignored exactly as if it were a NOP. A flag is raised in a special, reserved
area of the BSS segment -- see the source for details. Eventually the trap |
handling will be complete and unimplemented MIPS-32 opcodes will be dealt |
with as undefined opcodes. |
|
Note that opcode emulation can be disabled in the makefiles, by defining |
symbol \texttt{NO\_EMU\_MIPS32} when assembling \emph{bootstrap.s}. |
|
|
\subsection{C Startup Code} |
\label{c_startup_code} |
|
File \emph{'/src/common/c\_startup.s'} contains the C startup code used |
in all C code samples. |
|
This startup code does what's usual in these cases, namely: |
|
\begin{enumerate} |
\item Initialize the stack at the end of the bss area. |
\item Clear the bss area. |
\item Move the data section from FLASH to RAM (if applicable). |
\item Call main(). |
\item Freeze in endless loop after main() returns, if it does. |
\end{enumerate} |
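
Steps 2 and 3 of the list can be sketched in C. The real startup code is
written in assembly (\emph{c\_startup.s}) and takes its addresses from linker
symbols; here the boundaries are passed as parameters instead, and the
function names are hypothetical, so that the sketch can be tried on a host.

```c
#include <stdint.h>

/* Step 2: clear the bss area, word by word.
   'bss_end' points one word past the last word of the area. */
static void clear_bss(uint32_t *bss_start, uint32_t *bss_end)
{
    for (uint32_t *p = bss_start; p < bss_end; p++) {
        *p = 0;
    }
}

/* Step 3: copy the initialized data section from its load address
   (e.g. FLASH) to its run address in RAM. */
static void copy_data(const uint32_t *rom_src,
                      uint32_t *ram_start, uint32_t *ram_end)
{
    while (ram_start < ram_end) {
        *ram_start++ = *rom_src++;
    }
}
```

In a real startup sequence the stack pointer has to be set up first, since C
code like this already needs a valid stack.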
|
|
See the makefile of any of the C code samples for examples on how to |
assemble, configure and link the startup code. |
|
|
\subsection{C Support Library Replacement} |
\label{libsoc} |
|
The core will eventually be tested with one of the regular, free libc |
replacement libraries available. It isn't yet because building those |
libraries for a MIPS-I target (i.e. with no MIPS-32 stuff mixed in) has |
turned out to be much more difficult than I had anticipated. |
|
In order to be able to use the C toolchain and provide some code samples in |
C, I have built a minimalistic libc replacement called 'libsoc'. The |
source code can be found in \emph{/src/common/libsoc/src}. |
|
This mini-libc provides only those C support functions I have found |
necessary so far; it will be extended with new functionality as I add more |
code samples, until I can finally use some real C library. |
|
Apart from a number of utility functions that the C runtime needs, libsoc |
provides the following: |
|
\begin{itemize} |
\item Floating point support (SW, float and double). |
\item \texttt{putchar} and \texttt{getchar} using the SoC UART. |
\item Replacement for the built-in \texttt{printf} provided by gcc. |
\end{itemize} |
|
That's it. Yet, even this little is enough to run the Adventure demo... |
|
All the source code has been lifted from the original Plasma supporting code |
or has been scavenged from the net -- see credits in the file headers. |
|
/src/ionstyle.sty
59,3 → 59,6
\usepackage[utf8]{inputenc} |
\usepackage{booktabs} |
\usepackage{tabularx} % Flexible tables |
\hypersetup{% |
pdfborder = {0 0 0} |
} |
/src/ion_project.tex
25,6 → 25,7
\input{./tex/intro.tex} |
%\input{./tex/quickstart.tex} |
\input{./tex/usage.tex} |
\input{./tex/soc.tex} |
\input{./tex/cpu.tex} |
\input{./tex/cache.tex} |
\input{./tex/simulation.tex} |