URL https://opencores.org/ocsvn/ion/ion/trunk

Subversion Repositories ion

Compare Revisions

This comparison shows the changes necessary to convert path
```
/ion/trunk/doc
```
from Rev 210 to Rev 221

Rev 210 → Rev 221

/src/tex/tools.tex

2,6 → 2,16

 \chapter{Tools}
 \label{tools}
+    Directory '/tools' of the project includes a few tools -- small C or Python
+    programs purpose-built for this project.
+    What follows is a brief description of each of the tools. This document
+    won't go into the implementation or usage details. The tools themselves have
+    brief usage instructions and for any further details the user must read
+    the source code.
 \section{MIPS Software Simulator}
 \label{sw_simulator}

27,10 → 37,12

     \end{itemize}
     Each code sample includes a DOS batch file named 'swsim.bat' that runs the
-    simulator in batch mode.\\
+    simulator in batch mode. Note that the BAT file invokes a Windows binary
+    which is included in the SVN repository and should be immediately usable
+    after checkout.\\
     The program includes usage help (a short description of the command line
-    parameters). The source code (very simple and straighforward) is includef in
+    parameters). The source code (very simple and straightforward) is included in
     the project. The BAT files provide an usage example. And anyone who is
     interested and finds trouble can always contact me.

43,9 → 55,56

     The hardcoded log file name is "sw\_sim\_log.txt" and it is generated in the
     same directory from which the simulator is run.\\
+\section{Configuration Package Builder Script build\_pkg.py}
+\label{build_pkg}
+    This tool is used to build a simulation and synthesis configuration
+    package.
+    The generated package contains configuration constants used by the
+    simulation test bench \emph{'mips\_tb.vhdl'} and by the hardware demo
+    \emph{'c2sb\_demo.vhdl'}.
+    It also includes memory initialization constants containing object code,
+    used to initialize simulated and inferred memories, both in simulation
+    and in synthesis.
+    In the code samples, this script is used to generate two separate packages
+    for simulation and synthesis. Please refer to the makefiles for detailed
+    usage examples.
 \section{Conversion Script bin2hdl.py}
 \label{python_script}
+    \begin{figure*}[ht]
+    \begin{center}
+    {\small
+    \framebox[7in]{
+    \begin{minipage}[t]{6.0in}
+    NOTE: This script was used in previous versions of the project -- it came
+    in handy to initialize byte-sliced memories when the caches were under
+    development.
+    It has been abandoned because it was far too complicated and no longer
+    necessary. The VHDL
+    templates it refers to and the script itself have been moved from the /src
+    directory to their own subdirectory in /tools.
+    It is being retained in case it becomes useful again but it is no longer
+    used.
+    \end{minipage}
+    }
+    }
+    \end{center}
+    \label{lb}
+    \end{figure*}
     This Python script reads one or more binary files and 'inserts' them in a
     vhdl template. It makes the
     conversion from binary to vhdl strings and slices the data in byte columns,

/src/tex/hw_demo.tex

10,7 → 10,8

     makefiles -- assuming you have a mips toolchain.\\
     'Pre-generated' in this context means that all the vhdl files necessary for
-    building the demo are already included with the project, and the only
+    building the demo are already included with the project, including the
+    configuration package that contains the program's object code, and the only
     tool needed is the synthesis tool.
     The pregenerated demo is included just for convenience, so that you can

42,6 → 43,7

\item 'Next' your way out of the new project wizard.

\item Add to the project all the vhdl files in /vhdl and /vhdl/demo,

except mips\_cache\_stub.vhdl and sdram\_controller.vhdl.

\item Add to the project all the vhdl files in /vhdl/SoC.

\item Select file c2sb\_demo.vhdl as top.

\item Import pin constraints file (assignments-\textgreater import assignments).

\item Create a clock constraint for signal clk (51 MHz or some other

51,7 → 53,7

         \item Double-click on nCEO value column and select "use as regular I/O".
             IMPORTANT: otherwise the synthesis will fail; we need to use a FPGA
             pin that happens to be dual-purpose (programming and regular).
-        \item Select 'balanced' optimization.
+        \item Select 'speed' optimization.
         \item Save the project and synthesize.
         \item Make sure the clock constraint is met (timing analyzer report).
             There is a random element to the synthesis process, as you know,

59,8 → 61,7

         \item Program the FPGA from Quartus-2
         \item If you have a terminal hooked to the serial port (19200/8/N/1) you
             should see a welcome message after depressing the reset button.
-            (by default this is pusbutton 2).
+            (by default this is pushbutton 2).
     \end{enumerate}
     In the present version, the synthesis will produce a lot of warnings. The

88,20 → 89,20

     this:
     \begin{itemize}
-    \item An FPGA capable enough (the demo uses internal memory for code)
-    \item At least 4KB of 16-bit wide external, asynchronous, old-fashioned SRAM
-    \item A reset pin (possibly a pushbutton)
-    \item A clock input (uart modules assume 50MHz, see below)
-    \item RXD and TXD UART pins, plus a connector, header or whatever
+    \item An FPGA capable enough (the demo uses internal memory for code).
+    \item At least 4KB of 16-bit wide external, asynchronous, old-fashioned SRAM.
+    \item A reset pin (possibly a pushbutton).
+    \item A clock input (uart modules assume 50MHz, see below).
+    \item RXD and TXD UART pins, plus a connector, header or whatever.
     \end{itemize}
-    The only modules that care at all about clock rate are the UART
-    modules. They are hardwired to 19200 bauds when clocked at 50MHz, so if you
+    The only module that cares at all about clock rate is the UART embedded into
+    the SoC module. It's hardwired to 19200 baud when clocked at 50MHz, so if you
     use a different frequency you must edit the generics in the demo entity
-    accordingly.\\
-    Be aware that these uart modules have been used a lot in other projects but
-    have not been tested with a wide range of clock rates; they should work but
-    you have been warned.\\
+    accordingly -- the demo generics are passed all the way down to whatever
+    module needs them.\\
+    The UART has hardly been tested at clock rates other than 50MHz and has not
+    passed any independent test bench; try the core first at 50 MHz.\\
     Though there is no reset control logic, the reset input is synchronized
     internally, so you can use a raw pushbutton -- you may trigger multiple
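
The hunk above says the UART is set up for 19200 baud at 50MHz and that the demo generics must be edited for any other clock. As a rough sanity check before changing them, the resulting bit period can be estimated with the sketch below. It is not taken from the project sources: the function names are hypothetical and round-to-nearest division is an assumption about how the baud period is derived.

```python
def baud_divisor(clock_hz, baud_rate):
    """Clock cycles per UART bit, rounded to nearest (assumed rounding)."""
    if clock_hz <= 0 or baud_rate <= 0:
        raise ValueError("clock_hz and baud_rate must be positive")
    return (clock_hz + baud_rate // 2) // baud_rate


def baud_error_pct(clock_hz, baud_rate):
    """Relative error (in percent) of the baud rate actually produced."""
    actual = clock_hz / baud_divisor(clock_hz, baud_rate)
    return 100.0 * (actual - baud_rate) / baud_rate


if __name__ == "__main__":
    # Hypothetical clock rates; 50 MHz is the documented default.
    for clk in (50_000_000, 51_000_000, 25_000_000):
        print(clk, baud_divisor(clk, 19200), round(baud_error_pct(clk, 19200), 4))
```

An error well under one percent is normally harmless for an 8/N/1 link; the hunk's advice to try the core at 50 MHz first still applies.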

110,7 → 111,8

     Assuming you take care of all of the above, the easiest way I see to port
     the demo is just editing the top module ports ('/vhdl/demo/c2sb\_demo.vhdl')
-    to match your board setup.\\
+    to match your board setup. The only tricky part is the interface to FLASH
+    and SDRAM.\\
     All the code in this project is vendor agnostic (or should be, I have only
     tried it on Quartus and ISE). Specifically, it does not instantiate memory

148,4 → 150,4

great as a confidence builder.\\

Besides, running Adventure on a computer built by myself is something

-I just wanted to do :)\\

+I've always wanted to do :)\\

/src/tex/cache.tex

19,7 → 19,7

     alternative, simplified scheme.\\
     The standard R3000 cache control flags in the SR are not used, either. Instead,
-    two flags from the SR have been commandeered for cache control.\\
+    two flags from the SR have been repurposed for cache control.\\
 \subsection{Cache control flags}
 \label{cache_control_flags}

67,8 → 67,8

 \needspace{10\baselineskip}
 \begin{verbatim}
-             ___________ <-- These address bits are NOT in the tag
-            /           \
+                _________ <-- These address bits are NOT in the tag
+               /         \
 ..   27| 26 .. 21  |20 ..          12|11  ..        4|3:2|
     +---------+-----------+-----------------+---------------+---+---+
     | 5       |           | 9               | 8             | 2 |   |

81,10 → 81,11

     Since bits 26 downto 21 are not included in the tag, there will be a
     'mirror' effect in the cache. We have effectively split the memory space
     into 32 separate blocks of 1MB which is obviously not enough but will do
-    for the initial tests.
+    for the initial versions of the core.
     In subsequent versions of the cache, the tag size needs to be enlarged AND
     some of the top bits might be omitted when they're not needed to implement
-    the default memory map (namely bit 30 which is always '0').
+    the default MIPS memory map (namely bit 30 which is always '0').
 \section{Memory Controller}
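
The 'mirror' effect described in the hunk above can be illustrated with a short sketch. It only encodes what the text states (address bits 26 downto 21 are not part of the tag); the 16-byte line size is read off the diagram's `3:2` word field and is an assumption, and all names are hypothetical.

```python
# Bits 26..21 are not stored in the tag, per the cache documentation.
IGNORED_MASK = ((1 << 6) - 1) << 21
LINE_OFFSET_BITS = 4  # assumed: 4 words of 4 bytes per cache line


def compared_bits(addr):
    """Address bits that distinguish one cached line from another."""
    return ((addr & 0xFFFFFFFF) & ~IGNORED_MASK) >> LINE_OFFSET_BITS


def aliases(addr_a, addr_b):
    """True if two different lines are indistinguishable to the cache."""
    different_line = (addr_a >> LINE_OFFSET_BITS) != (addr_b >> LINE_OFFSET_BITS)
    return different_line and compared_bits(addr_a) == compared_bits(addr_b)


if __name__ == "__main__":
    print(aliases(0x80000100, 0x80200100))  # differ only in bit 21
    print(aliases(0x80000100, 0x80001100))  # differ in bit 12
```

Two addresses that differ only in the ignored bits map to the same line and tag, so one can be returned in place of the other; this is why the text calls the current tag size insufficient.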

122,10 → 123,10

     For each address, the memory map logic will supply the following information:
 \begin{enumerate}
-    \item What kind of memory it is
-    \item How many wait states to use
-    \item Whether it is writeable or not (ignored in current version)
-    \item Whether it is cacheable or not (ignored in current version)
+    \item What kind of memory it is.
+    \item How many wait states to use.
+    \item Whether it is writeable or not (ignored in current version).
+    \item Whether it is cacheable or not (ignored in current version).
 \end{enumerate}
     In the present implementation the memory map can't be modified at run time.\\

209,7 → 210,7

 cache/ps    ?| (1)             | (2)             | ... | (2)             |??
-refill_ctr  ?| 0                                 | ... <  3              |??
+refill_ctr  ?| 0                                 | ... |  3              |??
 chip_addr   ?|  210h           |  211h           | ... |  217h           |--

229,7 → 230,6

in this chronogram it takes the following values:

\begin{enumerate}

\item idle

\item data\_refill\_sram\_0

\item data\_refill\_sram\_1

\end{enumerate}

244,10 → 244,15

 \subsubsection{SRAM interface read cycle timing -- 8-bit interface}
 \label{sram_read_cycle_8b}
-TODO: 8-bit refill procedure to be done.
+The refill from an 8-bit static memory is essentially the same as depicted
+above, except we need to read 4 bytes (over the LSB lines of the static memory
+data bus) instead of 2 16-bit halfwords. The operation takes correspondingly
+longer to perform and uses an extra address line but is otherwise identical.
+TODO: 8-bit refill chronogram to be done.
-\subsubsection{SRAM interface write cycle timing}
+\subsubsection{16-bit SRAM interface write cycle timing}
 \label{sram_write_cycle}
 The path of the state machine that deals with SRAM writethroughs is linear so
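
The 8-bit refill described in the hunk above differs from the 16-bit one only in how many memory accesses make up each 32-bit word. The sketch below shows that relationship. Most-significant-first ordering is an assumption (chosen because the tools work on big-endian object files), and the function names are hypothetical.

```python
def accesses_per_word(bus_width_bits):
    """Number of memory accesses needed to fetch one 32-bit word."""
    if bus_width_bits not in (8, 16):
        raise ValueError("only 8-bit and 16-bit buses are considered here")
    return 32 // bus_width_bits


def word_from_halfwords(halfwords):
    """Assemble a 32-bit word from two 16-bit reads (high halfword first)."""
    hi, lo = halfwords
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def word_from_bytes(data):
    """Assemble a 32-bit word from four 8-bit reads (high byte first)."""
    word = 0
    for byte in data:
        word = (word << 8) | (byte & 0xFF)
    return word & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(word_from_halfwords([0x1234, 0x5678])))
    print(hex(word_from_bytes([0x12, 0x34, 0x56, 0x78])))
```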

261,7 → 266,7

 A general memory write will be 32-bit wide and thus it will take two 16-bit
 memory accesses to complete. Unaligned, halfword or byte wide CPU writes might
 in some cases be optimized to take only a single 16-bit memory access. This
-module does no such optimization.
+module does no such optimization yet.
 For simplicity, all writethroughs take two 16-bit access cycles, even if one
 of them has both we\_n signals deasserted.\\
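
The statement that every writethrough takes two 16-bit accesses, even when one of them has both we\_n signals deasserted, can be sketched as follows. This is an illustration only: the 4-bit active-high CPU byte-enable mask (bit 0 for bits 7..0, bit 3 for bits 31..24) and the high-halfword-first access order are assumptions, not taken from the VHDL.

```python
def split_writethrough(byte_we):
    """Map a 4-bit active-high byte-enable mask to two 2-bit active-low
    we_n values, one per 16-bit access (high halfword first, assumed)."""
    if not 0 <= byte_we <= 0b1111:
        raise ValueError("byte_we must be a 4-bit mask")
    hi_enables = (byte_we >> 2) & 0b11
    lo_enables = byte_we & 0b11
    # Invert to active-low; 0b11 means no byte is written in that access.
    return [(~hi_enables) & 0b11, (~lo_enables) & 0b11]


if __name__ == "__main__":
    print(split_writethrough(0b1111))  # full word: both accesses write
    print(split_writethrough(0b0001))  # byte store: first access is idle
```

A byte store still produces two accesses here, the first with we\_n at `0b11`; that idle access is the one the text says could be optimized away in a later version.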

274,7 → 279,7

 In this example, the SRAM is being accessed with 1 WS: WE\_N is asserted for
 two cycles.
-Note how a lot of cycles are lost in order to guarantee compliance with the
+Note how a lot of cycles are used in order to guarantee compliance with the
 setup and hold times of the SRAM against the we, address and data lines.
 \needspace{15\baselineskip}

302,7 → 307,6

in this chronogram it takes the following values:

\begin{enumerate}

\item idle

\item data\_writethrough\_sram\_0a

\item data\_writethrough\_sram\_0b

\item data\_writethrough\_sram\_0c

316,6 → 320,9

 \section{Known Problems}
 \label{cache_problems}
+    The cache implementation is still provisional and has a number of
+    acknowledged problems:
 \begin{enumerate}
 \item All parameters hardcoded -- generics are almost ignored.
 \item SRAM read state machine does not guarantee internal FPGA $T_{hold}$.

324,5 → 331,6

in the parent module) are far smaller than the SRAM response times, but

it would be better to insert an extra cycle after the wait states in

the sram read state machine.

\item Cache logic mixed with memory controller logic.

\end{enumerate}

/src/tex/simulation.tex

11,7 → 11,7

     the cpu state to a text log file.\\
     This log file can then be compared to a log file generated by a software
-    simulator for the same code sample (see section 5.1). The software
+    simulator for the same code sample (see section \ref{sw_simulator}). The software
     simulator is the 'golden model' against which the cpu is tested, so any
     difference between both log files means trouble.\\

22,38 → 22,31

     In addition to the main log file, there is a console log file to which all
     data written to the UART is logged (see section~\ref{uart_logging}).\\
-    There are a few simulation test bench templates in the /src directory, which
-    are used by all the code samples.\\
-    The only ones actually used are '/src/code\_rom\_template.vhdl' and
-    '/src/sim\_params\_template.vhdl'. The others
-    are remnants of previous versions that will be removed ASAP.\\
-    The template in file '/src/code\_rom\_template.vhdl' is filled with object
-    code meant to be run from internal FPGA BRAM. This is how we load bootstrap
-    code into our FPGA. The resulting file is '/vhdl/demo/code\_rom\_pkg.vhdl'
-    and is used by both the simulation test bench and the synthesizable MCU.\\
+    The simulation test bench can be found in file '/vhdl/tb/mips\_tb.vhdl'.
+    This test bench is meant to be used with all the code samples.
-    The template in file '/src/sim\_params\_template.vhdl' is filled with
-    simulation parameters (such as the simulation length, etc.) and the resulting
-    file is written as '/vhdl/tb/sim\_params\_pkg.vhdl'. This file is only used
-    by the simulation test bench.
+    Each of the code samples configures the simulation test bench with certain
+    parameters (such as simulation length or memory sizes) and of course each
+    sample has a different object code to be run. The way to pass these
+    parameters to the simulation is through a simulation package, in file
+    '/vhdl/tb/sim\_params\_pkg.vhdl'.
-    All of this template filling is done by a python script (/src/bin2hdl.py)
-    which is invoked from the makefiles and explained in section xxx.\\
+    This file is generated from a template whenever you 'make' each code sample
+    (see section~\ref{samples}). The package is built using one of the
+    provided tools, 'build\_pkg', explained in section~\ref{build_pkg}.
     Note that all code samples share the same vhdl files: you need to run the
     makefile target 'sim' for the sample you want to simulate; that will
-    overwrite the two files mentioned above. So there's no vhdl file that is
+    overwrite the package file mentioned above. So there's no vhdl file that is
     specific to a particular code sample.\\
-    The actual test bench entity is at '/vhdl/tb/mips\_tb.vhdl' and is shared
-    by all the code samples.\\
     While the test benches and sample code are good enough to catch MOST errors
     in the full system (i.e. cache included) they don't help with diagnostic;
     once you know there's an error, and the approximate address where it's
     triggered (approximate because of the cache) you have to dig into the
-    simulation waveforms to find it. It's easier than it seems.\\
+    simulation waveforms to find it.\\
 \section{Running the Simulation}
 \label{running_the_simulation}

66,11 → 59,11

     The test bench files mentioned in the previous section are automatically
     generated for each of the sample programs. This is automatically done by the
     sample code makefile,
-    assuming you have a MIPS cross-toolchain in your computer (see section~\ref{code_samples}).\\
+    assuming you have a MIPS cross-toolchain in your computer (see section~\ref{samples}).\\
-    For convenience, a pre-generated mips\_tb.vhdl is included so you can launch
-    a simulation without having to install toolchains, etc. The code is that
-    of the 'hello world' sample.\\
+    For convenience, a pre-generated file 'sim\_params\_pkg.vhdl' is included
+    so you can launch a simulation without having to install toolchains, etc.
+    The code is that of the 'hello world' sample.\\
     I guess that if you are interested in this sort of stuff then you probably
     know more about Modelsim than I do. Yet, here's a step-by-step guide to

79,7 → 72,7

 \begin{enumerate}
     \item Run 'make hello\_sim' from directory '/src/hello'.
         This will compile the program sources, build the necessary binary object
-        files and then create the two package files mentioned above.\\
+        files and then create the package file mentioned above.\\
         Read the makefile and comments in the python script '/src/bin2hdl.py'
         for details.\\

138,8 → 131,6

     Events are logged with the address of the instruction that triggered
     the change. This holds true even for load instructions.\\
-    Early versions of the project logged the address of the
-    preceding instruction -- it was confusing and I have fixed it.\\
     The simulation log file is stored by default in modelsim's working directory
     (see above). I don't provide any automated script to do the comparison, you

/src/tex/usage.tex

9,135 → 9,102

 \begin{enumerate}
     \item The CPU (mips\_cpu.vhdl).
     \item The cache+memory controller (mips\_cache.vhdl).
-    \item An 'MCU' entity which combines CPU+Cache (mips\_mpu.vhdl).
+    \item A 'SoC' entity which combines CPU+Cache (mips\_soc.vhdl).
 \end{enumerate}
-The entity you should use in your projects is the MCU module. The project
+The entity you should use in your projects is the SoC module. The project
 includes a 'hardware demo' built around this module (see section
-~\ref{pregenerated_demo}) which can be used as an usage example.\\
+~\ref{pregenerated_demo}) which is meant as a usage example.\\
-The main modules are briefly described in the following subsections.
+\section{Bootstrap Code}
+\label{bootstrap_code}
-\section{MCU Module}
-\label{mcu_module}
+Though the core is meant to run mostly from off-chip memory, the current version
+of the SoC module includes a small ROM implemented as FPGA BRAM and called
+'bootstrap BRAM'. In the current version of the core, this BRAM can be loaded
+with arbitrary code and its size can be configured by using generics, but it
+can't be removed from the SoC. Even though the memory map can be modified to
+boot from external FLASH and not use a BRAM at all, a BRAM will still be
+inferred within the SoC -- subsequent versions will fix this.
-The MCU module main purpose is to encapsulate the somewhat complex
-interconnection between the CPU and the Cache module.
+As can be seen in table~\ref{tab_soc_memory_map}, the internal BRAM is mirrored
+all over a wide area starting at \texttt{0xb8000000}. In practice, this means
+the BRAM will be mapped at the CPU reset address (\texttt{0xbfc00000}) and thus
+the bootstrap code should be placed there.
+Unless the bootstrap BRAM is very small, it will span over the interrupt vector
+address too (\texttt{0xbfc00180}).
-If some project demands that some piece of hardware be directly connected to the
-CPU, bypassing the cache, this is where it should be -- an MMU comes to mind.
+For example, the 'Adventure' demo included with the project uses bootstrap
+code included in file \texttt{/src/common/bootstrap.s}. This bootstrap code
+is fairly incomplete (interrupt response code is mostly a stub) yet it's enough
+to boot most applications.
+Note that the C startup code, which deals with things like initializing the
+static variables on the data segment, etc. is not part of this bootstrap code.
+It can be found in file \texttt{/src/common/c\_startup.s}.
-Any peripherals deemed common enough that they will be present in all projects
-might be placed in the MCU module too -- after all, the MCU name has been chosen
-to imply that 'bundling together' of a CPU and a bunch of peripherals.
+So, in short, the code loaded onto the startup BRAM should include the most
+basic system initialization (cache initialization at least) and the entry point
+for the interrupt response code; plus a jump to the main program entry address.
-In the current version of the MCU module, there is only a peripheral included in
-it -- a hardwired UART module. There is no penalty for placing peripherals
-ouside the MCU module, so there is no incentive to place them inside, thus
-making the interface more complex. This is an implementation option of yours.\\
+Anyone trying to build some application on this core is advised to use the code
+samples as starting points, especially the makefiles.
-\subsection{MCU Ports}
-\label{mcu_ports}
-\begin{figure}[h]
-\makebox[\textwidth]{\framebox[9cm]{\rule{0pt}{9cm}
-\includegraphics[width=8cm]{img/mpu_symbol.png}}}
-\caption{MPU module interface\label{mpu_symbol}}
-\end{figure}
+\subsection{Loading Bootstrap Code on the SoC Module}
+\label{loading_bootstrap_code}
-\begin{table}[h]
-\caption{MCU module interface ports}
-\begin{tabularx}{\textwidth}{ lll|X }
-\toprule
-Name & Type & Width & Description \\
-\midrule
-clk                 & in    & 1  & Clock input, active rising edge. \\
-reset               & in    & 1  & Synchronous global reset. \\
-\midrule
-sram\_address       & out   & 16 & Memory word address (bit 0 absent). \\
-sram\_data\_wr      & out   & 16 & Memory write data. Only valid when one of the \\
-                    &       &    & memory byte write enable outputs is active.\\
-sram\_data\_rd      & in    & 16 & Memory read data. Latched when xxx. \\
-sram\_byte\_we\_n   & out   & 2  & Memory byte write enable, active low.  \\
-                    &       &    & (0) enables the low byte (7 downto 0) \\
-                    &       &    & (1) enables the high byte (15 downto 8). \\
-\midrule
-io\_rd\_addr        & out   & 30 & I/O port read address (bits 1..0 absent). \\
-                    &       &    & Only valid when io\_rd\_vma is high. \\
-io\_wr\_addr        & out   & 30 & I/O port write address (bits 1..0 absent). \\
-io\_wr\_data        & out   & 32 & I/O write data.  Only valid when one of the \\
-                    &       &    & i/o byte write enable outputs is active.\\
-io\_rd\_data        & in    & 32 & I/O read data. Latched when xxx. \\
-io\_byte\_we        & out   & 4  & I/O byte write enable, active high. \\
-                    &       &    & (0) enables the low byte (7 downto 0) \\
-                    &       &    & (3) enables the high byte (31 downto 24). \\
-io\_rd\_vma         & out   & 1  & Active high on i/o read cycles. \\
-\midrule
-uart\_rxd           & in    & 1  & RxD input to internal UART. \\
-uart\_txd           & out   & 1  & TxD output from internal UART. \\
-\midrule
-interrupt           & in    & 8  & Interrupt request inputs, active high. \\
-\bottomrule
-\end{tabularx}
-\end{table}
+Once the code that is to be loaded on the bootstrap BRAM has been built, you
+need to load it onto the bootstrap BRAM within the FPGA.
-As you can see in figure~\ref{mpu_symbol} (symbol generated by Xilinx ISE),
-the MCU has the following interfaces:
+As you probably already know, there are several possible ways to deal with this
+and most of them involve using \emph{'Memory Initialization Files'} of
+some sort. This project is different.
-\begin{enumerate}
-    \item Interface to external static asynchronous memory (SRAM, FLASH...).
-    \item Interface to on-chip peripherals.
-    \item Interrupt inputs.
-\end{enumerate}
+So far, this project does not include any support for using MIF
+files of any kind. Instead, the bootstrap BRAM is inferred and initialized
+using regular VHDL constructs and a constant passed to the SoC module as a
+generic.
-These interfaces will be explained in the following subsections. The top module
-for the demo supplied with the project (c2sb\_demo.vhdl) will be used for
-illustration.
+This scheme has a big drawback: every time the object code within the FPGA
+changes, the whole synthesis needs to be re-run. This drawback is manageable
+as long as the core is not used in any big project or if the bootstrap code
+does not change often.
-\emph{NOTE}: This section needs a lot of elaboration -- ideally this should be
-equivalent to
-a datasheet in thoroughness and detail. This work, like many other parts of this
-project, will have to wait.
+On the other hand, I see some big advantages in using regular BRAM inference at
+this stage of the project:
-\subsection{MCU interface to static memory}
-\label{mcu_if_sram}
+\begin{enumerate}
+\item The whole scheme is totally vendor agnostic.
+\item Object code embedded on VHDL constants can very easily be used in both simulation and synthesis.
+\end{enumerate}
-The interface to external memory in the MCU module is essentially that of the
-internal cache/memory controller. Its timing is described in section
-~\ref{cache_state_machine}.\\
+So, whatever object code is to be used to initialize the SoC bootstrap BRAM has
+to be passed to the SoC module instance as a generic constant (see section
+~\ref{soc_generics}). The constant must be of type \texttt{t\_obj\_code}, which
+is defined in package \emph{mips\_pkg}.
-The MCU inputs are meant to be connected straight to the FPGA i/o pins. The only
-trick is the bidirectional memory data bus: as you can see, the MCU data buses
-are unidirectional and thus you will need to provide an interconnection
-external to this module. This interconnection shall include the requisite
--state buffers:
-\begin{verbatim}
-sram_databus <= sram_data_wr when sram_byte_we_n/="11" else (others => 'Z');
-\end{verbatim}
+\subsection{Building the Bootstrap Initialization Constant}
+\label{boot_code_conversion}
-The top level module can be used as a fully tested example of how to use this
-interface to connect to a common SRAM chip (ISSI IS61LV25616).
-In reviewing the top module source, note that I had to adapt the dual
-byte-write-enable outputs to the SRAM
-configuration of a single write-enable plus dual byte-enable inputs.
-Note too that the static memory bus is used to access both the 16-bit wide SRAM
-and an 8-bit wide FLASH. These chips are connected to separate buses on the
-target board, so the top module needs to conflate both buses before connecting
-them to the MPU. This is why a multiplexor is used in the mpu\_sram\_data\_rd
-bus. A real-world board would probably have the SRAM and the FLASH connected
-to the same bus, simplifying the interface logic.
+    The project includes a Python script (\texttt{/tools/build\_pkg/build\_pkg.py})
+    whose purpose is to build a VHDL \texttt{t\_obj\_code} constant out of a
+    \emph{binary} object code file.
+    This script will read one or more big-endian, binary object files and will
+    produce a VHDL package file that will contain initialization constants for
+    the bootstrap BRAM and for some other memories that are only used in the
+    simulation test bench.
+    The package can optionally also include some simulation and synthesis
+    configuration constants -- such as the size of the bootstrap BRAM.
-\subsection{MCU interface to peripherals}
-\label{mcu_if_io}
-    TODO Documentation to be done
-\subsection{MCU interrupt inputs}
-\label{mcu_irqs}
-    TODO Documentation to be done
+    The makefiles included in the code samples invoke this script twice: once
+    to generate a package called \emph{sim\_params\_pkg} and used in the
+    simulation test bench; and once to build a package called
+    \emph{bootstrap\_code\_pkg} used for synthesis.
+    Please refer to the makefiles for usage examples, and read the script source
+    for more detailed usage instructions.
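
The conversion described in the hunk above (big-endian binary object code turned into VHDL initialization constants) can be pictured with the minimal sketch below. It is not the project's build\_pkg.py and does not reproduce its options or output format: the function names, the zero padding of a trailing partial word and the `X"..."` literal layout are all assumptions made for illustration.

```python
import struct


def words_from_binary(data):
    """Split big-endian object code into 32-bit words.
    A trailing partial word is zero-padded (assumed behaviour)."""
    padded = data + b"\x00" * (-len(data) % 4)
    return [word for (word,) in struct.iter_unpack(">I", padded)]


def vhdl_word_literals(words, per_line=4):
    """Format words as VHDL hex literals, a few per line (assumed layout)."""
    literals = ['X"{:08X}"'.format(w) for w in words]
    lines = [", ".join(literals[i:i + per_line])
             for i in range(0, len(literals), per_line)]
    return ",\n".join(lines)


if __name__ == "__main__":
    # Hypothetical 5-byte object code fragment.
    code = bytes([0x3C, 0x1C, 0x00, 0x01, 0xAA])
    print(vhdl_word_literals(words_from_binary(code)))
```

The real script also emits the surrounding package declaration and the optional configuration constants; refer to the makefiles and the script source for the actual usage.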

/src/tex/soc.tex

0,0 → 1,397

+\chapter{SoC Module}
+\label{soc_module}
+The main purpose of the SoC module is to encapsulate the somewhat complex
+interconnection between the CPU and the Cache/Memory Controller module.
+If some project demands that some piece of hardware be directly connected to the
+CPU, bypassing the cache, this is where it should be -- an MMU comes to mind.
+Any peripherals deemed common enough that they will be present in all projects
+might be placed in the SoC module too.
+In the current version of the SoC module, there is only one peripheral included
+in it -- a hardwired UART module. There is no penalty for placing peripherals
+outside the SoC module, so there is no incentive to place them inside. This is
+an implementation option of yours.\\
+Bear in mind that, in its current state, the SoC module is little more than a
+vehicle for building demos around the ION CPU. It is not meant as a real-world
+SoC, though it might be developed into one eventually.
+\section{SoC Generics}
+\label{soc_generics}
+    The SoC needs to be configured upon instantiation by setting the following
+    generics:
+\begin{table}[h]
+\caption{SoC module generics\label{tab_soc_generics}}
+\begin{tabularx}{\textwidth}{ lll|X }
+\toprule
+Name & Type & Default value & Description \\
+\midrule
+\texttt{BOOT\_BRAM\_SIZE}      & integer    & 1024  & Bootstrap BRAM size in 32-bit words. \\
+\texttt{OBJ\_CODE}     & t\_obj\_code & (void code) & Bootstrap BRAM contents. \\
+\midrule
+\texttt{CLOCK\_FREQ}   & integer    & 50e6  & Main clock rate. \\
+\texttt{BAUD\_RATE}    & integer    & 19200  & UART baud rate. \\
+\midrule
+\texttt{SRAM\_ADDR\_SIZE} & integer & 17 & Size of SRAM address bus. \\
+\bottomrule
+\end{tabularx}
+\end{table}
+The current version of the SoC is not very strict in the enforcement of limits
+for the generics. You are advised to use only 'reasonable' values. This will
+be fixed, eventually.
+Generic \texttt{CLOCK\_FREQ} is only needed in order to compute the default
+baud period for the internal UART (from the value of generic \texttt{BAUD\_RATE}).
+Generic \texttt{BOOT\_BRAM\_SIZE} will determine the size of the internal
+bootstrap BRAM. This generic \emph{can't be zero}; in the current version of
+the SoC, the BRAM can't be disabled or omitted.
+Note that if the size of the bootstrap BRAM is not enough to hold the whole
+bootstrap code provided in generic \texttt{OBJ\_CODE}, the code \emph{will
+be silently truncated}. Usually this will result in an early crash.
+Generic \texttt{OBJ\_CODE} is used at synthesis time to initialize the bootstrap
+BRAM. This generic is meant to contain bootstrap code (see section
+~\ref{bootstrap_code}). It can be omitted, in which case the bootstrap BRAM
+will be initialized to all zeros.
+\section{SoC Ports}
+\label{soc_ports}
+\begin{figure}[h]
+\makebox[\textwidth]{\framebox[9cm]{\rule{0pt}{9cm}
+\includegraphics[width=8cm]{img/soc_symbol.png}}}
+\caption{SoC module interface\label{soc_symbol}}
+\end{figure}
+\begin{table}[h]
+\caption{SoC module interface ports}
+\begin{tabularx}{\textwidth}{ lll|X }
+\toprule
+Name & Type & Width & Description \\
+\midrule
+clk                 & in    & 1  & Clock input, active rising edge. \\
+reset               & in    & 1  & Synchronous global reset. \\
+\midrule
+sram\_address       & out   & 16 & Memory word address (bit 0 absent). \\
+sram\_data\_wr      & out   & 16 & Memory write data. Only valid when one of the \\
+                    &       &    & memory byte write enable outputs is active.\\
+sram\_data\_rd      & in    & 16 & Memory read data. Latched when xxx. \\
+sram\_byte\_we\_n   & out   & 2  & Memory byte write enable, active low.  \\
+                    &       &    & (0) enables the low byte (7 downto 0) \\
+                    &       &    & (1) enables the high byte (15 downto 8). \\
+\midrule
+io\_rd\_addr        & out   & 30 & I/O port read address (bits 1..0 absent). \\
+                    &       &    & Only valid when io\_rd\_vma is high. \\
+io\_wr\_addr        & out   & 30 & I/O port write address (bits 1..0 absent). \\
+io\_wr\_data        & out   & 32 & I/O write data.  Only valid when one of the \\
+                    &       &    & i/o byte write enable outputs is active.\\
+io\_rd\_data        & in    & 32 & I/O read data. Latched when xxx. \\
+io\_byte\_we        & out   & 4  & I/O byte write enable, active high. \\
+                    &       &    & (0) enables the low byte (7 downto 0) \\
+                    &       &    & (3) enables the high byte (31 downto 24). \\
+io\_rd\_vma         & out   & 1  & Active high on i/o read cycles. \\
+\midrule
+uart\_rxd           & in    & 1  & RxD input to internal UART. \\
+uart\_txd           & out   & 1  & TxD output from internal UART. \\
+\midrule
+interrupt           & in    & 8  & Interrupt request inputs, active high. \\
+\bottomrule
+\end{tabularx}
+\end{table}
+As you can see in figure~\ref{soc_symbol} (symbol generated by Xilinx ISE),
+the SoC has the following interfaces:
+\begin{enumerate}
+    \item Interface to external static asynchronous memory (SRAM, FLASH...).
+    \item Interface to on-chip peripherals.
+    \item Interrupt inputs.
+    \item Debug port.
+\end{enumerate}
+These interfaces will be explained in the following subsections. The top module
+for the demo supplied with the project (c2sb\_demo.vhdl) will be used for
+illustration.
+\emph{NOTE}: This section needs a lot of elaboration -- ideally this should be
+equivalent to a datasheet in thoroughness and detail. This work, like many
+other parts of this project, will have to wait.
+\subsection{SoC interface to static memory}
+\label{soc_if_sram}
+The interface to external memory in the SoC module is essentially that of the
+internal cache/memory controller. Its timing is described in section
+~\ref{cache_state_machine}.\\
+The SoC inputs are meant to be connected straight to the FPGA i/o pins. The only
+trick is the bidirectional memory data bus: as you can see, the SoC data buses
+are unidirectional and thus you will need to provide an interconnection
+external to this module. This interconnection shall include the requisite
+3-state buffers:
+\begin{verbatim}
+sram_databus <= sram_data_wr when sram_byte_we_n/="11" else (others => 'Z');
+\end{verbatim}
+The top level \emph{c2sb\_demo} module can be used as a fully tested example of
+how to use this interface to connect to a common 16-bit-wide SRAM chip
+(ISSI IS61LV25616).
+In reviewing the top module source, note that I had to adapt the dual
+byte-write-enable outputs to the SRAM
+configuration of a single write-enable plus dual byte-enable inputs.
+Note too that the static memory bus of the SoC module is used to access both the 16-bit wide SRAM
+and an 8-bit wide FLASH. These chips are connected to separate buses on the
+target board, so the top c2sb\_demo module needs to conflate both buses before connecting
+them to the SoC. This is why a multiplexor is used in the \texttt{mpu\_sram\_data\_rd}
+bus. A real-world board would probably have the SRAM and the FLASH connected
+to the same bus, simplifying the interface logic.
+\subsection{SoC interface to peripherals}
+\label{soc_if_io}
+    Every CPU access to an area designated as I/O (see ~\ref{soc_memory_map}, memory map)
+    will trigger a read/write cycle on this interface.
+    I/O ports are synchronous, byte-accessible registers meant to be implemented
+    within the FPGA. I/O ports do not support wait states.
+    The I/O interface has separate input and output buses.
+    In an output cycle, one or more lines of signal \texttt{io\_byte\_we} will be
+    asserted for one clock cycle. Signals \texttt{io\_wr\_addr} and \texttt{io\_wr\_data} will
+    be valid as long as \texttt{io\_byte\_we} is asserted.
+    In an input cycle, \texttt{io\_rd\_vma} will be asserted for one cycle and the input
+    data should be present at \texttt{io\_rd\_data} at the end of the following clock
+    cycle. The full read operation extends over two clock cycles.
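To make the byte lane convention concrete, here is a small C sketch of how a byte store could map onto \texttt{io\_wr\_addr} and \texttt{io\_byte\_we}. The type and function names are hypothetical. The big-endian lane assignment is an assumption, inferred from the port table (bit 0 of \texttt{io\_byte\_we} enables bits 7 downto 0) and from the UART registers sitting at byte offset 3 of their word.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t word_addr;  /* 30-bit word address, as on io_wr_addr */
    uint8_t  byte_we;    /* 4-bit byte write enable mask, as on io_byte_we */
} io_write_t;

/* Model of a single-byte store to an I/O address.
   Assumes big-endian lanes: byte offset 3 is the low byte (7 downto 0). */
io_write_t io_byte_write(uint32_t byte_addr)
{
    io_write_t w;
    w.word_addr = byte_addr >> 2;                       /* bits 1..0 absent */
    w.byte_we   = (uint8_t)(1u << (3u - (byte_addr & 3u)));
    assert((w.byte_we & 0x0Fu) == w.byte_we);           /* exactly one of 4 lanes */
    return w;
}
```

For example, a byte store to \texttt{0x20000003} would assert only bit 0 of \texttt{io\_byte\_we}.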
+\subsection{SoC interrupt inputs}
+\label{soc_irqs}
+    The present version of the CPU does not have support for hardware interrupts
+    and therefore these signals are not used yet and are unconnected.
+    Hardware interrupts will be implemented in some future version as
+    time permits.
+\subsection{SoC debug port}
+\label{soc_debug_port}
+    The debug port is a VHDL record (\texttt{t\_debug\_info}, defined in
+    package \emph{mips\_pkg}), which holds some internal CPU status flags that
+    can be useful while debugging the core. It is not meant to be useful for
+    a real application.
+    Currently the record holds only two flags:
+    \begin{itemize}
+    \item \texttt{cache\_enabled}, asserted when the cache is enabled.
+    \item \texttt{unmapped\_access}, asserted when some access to an unmapped
+    address is made.
+    \end{itemize}
+    The current version of the demo connects these signals to some on-board
+    LEDs.
+\section{SoC Memory Map}
+\label{soc_memory_map}
+    The \emph{memory map} determines the type of memory that is connected to
+    each of a number of predefined address ranges (see
+    section~\ref{memory_map_definition}).
+    It is defined in package \emph{mips\_pkg} and it is implemented in the
+    \emph{mips\_cache} module.
+\begin{table}[h]
+\caption{SoC module memory map\label{tab_soc_memory_map}}
+\begin{tabularx}{\textwidth}{ lll|X }
+\toprule
+Address range & Type & Wait States & Intended usage \\
+\midrule
+\texttt{0xb8000000-0xbfffffff}   & BRAM    & 0  & SoC internal boot BRAM. \\
+\midrule
+\texttt{0x00000000-0x07ffffff}   & SRAM-16 & 2  & Off-chip SRAM. \\
+\texttt{0x80000000-0x87ffffff}   & SRAM-16 & 2  & Off-chip SRAM. \\
+\texttt{0x20000000-0x27ffffff}   & I/O     & 0  & On-chip I/O registers. \\
+\texttt{0xb0000000-0xb7ffffff}   & SRAM-8  & 7  & Off-chip SRAM or FLASH. \\
+\bottomrule
+\end{tabularx}
+\end{table}
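The table above can be expressed as a simple lookup in C, which may be handy in test benches or in a software simulator. This is a sketch built only from the table contents; the names are hypothetical and the real decoding in \emph{mips\_pkg} may well look at the top address bits only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { MEM_BRAM, MEM_SRAM16, MEM_IO, MEM_SRAM8 } mem_type_t;

typedef struct {
    uint32_t   first, last;   /* inclusive address range */
    mem_type_t type;
    int        wait_states;
} region_t;

/* Rows copied from the SoC memory map table. */
static const region_t memory_map[] = {
    { 0xb8000000u, 0xbfffffffu, MEM_BRAM,   0 },
    { 0x00000000u, 0x07ffffffu, MEM_SRAM16, 2 },
    { 0x80000000u, 0x87ffffffu, MEM_SRAM16, 2 },
    { 0x20000000u, 0x27ffffffu, MEM_IO,     0 },
    { 0xb0000000u, 0xb7ffffffu, MEM_SRAM8,  7 },
};

/* Returns the region containing addr, or NULL if the address is unmapped. */
const region_t *lookup_region(uint32_t addr)
{
    for (size_t i = 0; i < sizeof memory_map / sizeof memory_map[0]; i++) {
        assert(memory_map[i].first <= memory_map[i].last);
        if (addr >= memory_map[i].first && addr <= memory_map[i].last) {
            return &memory_map[i];
        }
    }
    return NULL;
}
```

A \texttt{NULL} result corresponds to the case flagged by \texttt{unmapped\_access} in the debug port.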
+\section{SoC UART}
+\label{soc_uart}
+    The current revision of the SoC includes a single peripheral, a hardwired
+    8-bit UART (file \emph{uart.vhdl}).
+    This UART is an 8-bit module built for some other unrelated project of mine
+    and commandeered to serve on this SoC. Therefore, it has some features
+    (like its 8-bit interface) which are sub-optimal for this application and/or
+    are not used.
+    The UART is 'hardwired' because some of its operational parameters are
+    hardcoded and can't be changed even at synthesis time. Namely:
+    \begin{itemize}
+    \item Stop bits: 1.
+    \item Parity: None.
+    \item Bits per character: 8.
+    \end{itemize}
+    All other parameters can at least be configured at synthesis time, and
+    under some conditions can be configured at run time too. The interested
+    user must read the module source for a better explanation of these
+    features. This document will only deal with the UART module as it is
+    instantiated in the SoC.
+    These are the UART control registers:
+    \begin{table}[h]
+    \caption{UART control registers\label{uart_control_regs}}
+    \begin{tabularx}{\textwidth}{ ll|X }
+    \toprule
+    Byte Address & Word Address & Register \\
+    \midrule
+    \texttt{0x20000003}   & \texttt{0x20000000} & Tx/Rx Buffer \\
+    \texttt{0x20000007}   & \texttt{0x20000004} & Status. \\
+    \texttt{0x2000000b}   & \texttt{0x20000008} & Baud rate period, low byte. \\
+    \texttt{0x2000000f}   & \texttt{0x2000000c} & Baud rate period, high byte. \\
+    \bottomrule
+    \end{tabularx}
+    \end{table}
+    All of these registers are mapped to the byte address given in the table,
+    that is, they are mapped on the \emph{low} byte of the 32-bit word they
+    belong to -- you don't have to worry about this unless you use a word
+    pointer to access these registers.
+\subsection{UART Usage}
+\label{soc_uart_usage}
+    Until hardware interrupts are implemented, you have to rely on polling to
+    use the UART.
+    When you want to transmit, you wait until flag TxRdy is '1' and then write
+    to the Tx Buffer. That will clear TxRdy until the transmission is done.
+    Writing to the Tx Buffer will NOT clear flag TxIrq.
+    Writing to the Tx Buffer while TxRdy is '0' will have no effect.
+    When you want to read received data, you wait until RxRdy is '1' and then
+    read the Rx Buffer. Reading the Rx Buffer will clear flag RxRdy until a new
+    byte arrives.
+    Reading the Rx Buffer while RxRdy is '0' will return undefined data.
+    Reading the Rx Buffer will NOT clear flag RxIrq.
+    Of course, once hardware interrupts are implemented, you will use them
+    instead of polling, but this is the basic mechanics. Same as any old UART,
+    really.
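The polling procedure described above might look like this in C. It is a minimal sketch, not the code in libsoc: the \texttt{uart\_t} type and function names are hypothetical, and the register pointers are passed in so that the same code can run against plain variables on a host. The flag positions (TxRdy in bit 0, RxRdy in bit 1) are taken from the status register description.

```c
#include <assert.h>
#include <stdint.h>

#define UART_TXRDY 0x01u   /* status bit 0: no transmission in progress */
#define UART_RXRDY 0x02u   /* status bit 1: received byte available */

typedef struct {
    volatile uint8_t *buffer;  /* Tx/Rx buffer (byte address 0x20000003 on the SoC) */
    volatile uint8_t *status;  /* status register (byte address 0x20000007 on the SoC) */
} uart_t;

/* Wait until TxRdy is '1', then write the character to the Tx Buffer. */
void uart_putchar(const uart_t *u, uint8_t c)
{
    assert(u && u->buffer && u->status);
    while ((*u->status & UART_TXRDY) == 0) { /* busy wait */ }
    *u->buffer = c;
}

/* Wait until RxRdy is '1', then read the Rx Buffer. */
uint8_t uart_getchar(const uart_t *u)
{
    assert(u && u->buffer && u->status);
    while ((*u->status & UART_RXRDY) == 0) { /* busy wait */ }
    return *u->buffer;
}
```

On the target, the structure would be filled with the two byte addresses from the register table, cast to \texttt{volatile uint8\_t *}.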
+\subsection{UART Status Register}
+\label{soc_status_reg}
+    These are the flags present in the status register:
+\needspace{7\baselineskip}
+\begin{verbatim}
+  UART Status Register
+6       5       4       3       2       1       0
+  +-------+-------+-------+-------+-------+-------+-------+-------+
+  |   0   |   0   | RxIrq | TxIrq |   0   |   0   | RxRdy | TxRdy |
+  +-------+-------+-------+-------+-------+-------+-------+-------+
+      h       h      W1C     W1C      h       h       r       r
+\end{verbatim}
+    Bits marked 'h' are hardwired and can't be modified.
+    Bits marked 'r' are read only; they are set and cleared by the UART core.
+    Bits marked W1C ('Write 1 Clear') are set by the UART core when an interrupt
+    has been triggered and must be cleared by the software by writing a '1'.
+    \begin{itemize}
+    \item Status bit TxRdy is high when there isn't any transmission in progress.
+            It is cleared when data is written to the transmission buffer and is
+            raised at the same time the transmission interrupt is triggered.
+    \item Status bit RxRdy is raised at the same time the receive interrupt is
+            triggered and is cleared when the data register is read.
+    \item Status bit TxIrq is raised when the transmission interrupt is triggered
+            and is cleared when a 1 is written to it.
+    \item Status bit RxIrq is raised when the reception interrupt is triggered
+            and is cleared when a 1 is written to it.
+    \end{itemize}
+    When writing to the status/control registers, only flags TxIrq and RxIrq are
+    affected, and only when writing a '1' as explained above. All other flags
+    are read-only.
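The write-1-to-clear behaviour can be summarized with a small C model. The mask values follow the register diagram with bit 7 leftmost (TxIrq in bit 4, RxIrq in bit 5); the function name is hypothetical and the model ignores any flag changes the UART core itself makes in the same cycle.

```c
#include <assert.h>
#include <stdint.h>

#define UART_TXRDY    0x01u
#define UART_RXRDY    0x02u
#define UART_TXIRQ    0x10u
#define UART_RXIRQ    0x20u
#define UART_W1C_MASK (UART_TXIRQ | UART_RXIRQ)

/* Value of the status register after software writes 'written' to it.
   Only the W1C flags can change, and only where a '1' is written. */
uint8_t uart_status_after_write(uint8_t status, uint8_t written)
{
    uint8_t next = (uint8_t)(status & ~(written & UART_W1C_MASK));
    /* Read-only and hardwired bits must be untouched. */
    assert((next & ~UART_W1C_MASK) == (status & ~UART_W1C_MASK));
    return next;
}
```

So acknowledging a transmit interrupt amounts to writing \texttt{UART\_TXIRQ} to the status register; writing zeros has no effect.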
+\subsection{UART Baud Rate Registers}
+\label{soc_uart_baud_regs}
+    When the UART module generic 'HARDWIRED' is set to 'false', these registers
+    can be written to in order to configure the baud rate -- see the source for
+    details.
+    When the UART module generic 'HARDWIRED' is set to 'true', these registers
+    are frozen and can't be modified. This is how the module is instantiated in
+    the current version of the SoC.
+    In either case, these are write-only registers: reading them will return
+    the contents of the status register (simplified multiplexor).
+    The baud rate is configured by loading these registers with the baud period
+    in clock cycles.
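As a sketch of how the period would be split across the two registers, assuming they simply hold the low and high bytes of a 16-bit period with no offset (the UART source is the authority on this), one could write the following. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t low;   /* value for the 'Baud rate period, low byte' register */
    uint8_t high;  /* value for the 'Baud rate period, high byte' register */
} baud_regs_t;

/* Split a baud period (in clock cycles) into the two register bytes. */
baud_regs_t baud_period_bytes(uint32_t period)
{
    assert(period <= 0xFFFFu);  /* two 8-bit registers hold 16 bits */
    baud_regs_t r = { (uint8_t)(period & 0xFFu), (uint8_t)(period >> 8) };
    return r;
}
```

A period of 2604 cycles (50 MHz clock, 19200 baud) splits into \texttt{0x2c} (low) and \texttt{0x0a} (high).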
+\subsection{UART Interrupt}
+\label{soc_uart_interrupts}
+    The UART can trigger an interrupt (i.e. assert its interrupt output for one
+    clock cycle) whenever a character is received or transmitted. The UART
+    source explains in detail exactly when these interrupts are triggered.
+    The interrupt status is kept in two flags on the status register that can
+    be used for interrupt polling. Note there's no way to tell what kind of
+    interrupt we got other than looking at those flags.
+    Since the current CPU revision does not support hardware interrupts, this
+    feature is still unused and the interrupt line is unconnected.
+    Again, details can be found in the UART module source.

/src/tex/cpu.tex

10,8 → 10,8

     \caption{CPU module interface\label{cpu_symbol}}
     \end{figure}
-    the CPU module is not meant to be used directly. Instead, the MCU module
-    described in section ~\ref{mcu_module} should be used.\\
+    The CPU module is not meant to be used directly. Instead, the SoC module
+    described in chapter ~\ref{soc_module} should be used.\\
     The following sections will describe the CPU structure and interfaces.\\

46,7 → 46,7

     Note that the basic cpu module (mips\_cpu) is meant to be connected to
     internal, synchronous BRAMs only (i.e. the cache BRAMs). Some of its
     outputs are not registered because they needn't be. The parent module
-    (called 'mips\_mcu') has its outputs registered to limit $t_{co}$ to
+    (called 'mips\_soc') has its outputs registered to limit $t_{co}$ to
     acceptable values.\\

171,9 → 171,9

     stages as long as it is active. It is meant to be used by the cache at cache
     refills.\\
-    In short, the cache/memory controller stops the cpu for all data/code
+    The cache/memory controller stops the cpu for all data/code
     misses for as long as it takes to do a line refill. The current cache
-    implementation does refills in order (i.e. not 'target address first').
+    implementation does refills in reverse order (i.e. not 'target address first').
     Note that external memory wait states are a separate issue. They too are
     handled in the cache/memory controller. See section~\ref{cache} on the memory

254,7 → 254,7

 \end{verbatim}\\
     In the source code, all registers and signals in stage
-    \textless i\textgreater are prefixed by
+    \textless i\textgreater  are prefixed by
     "p\textless i\textgreater\_", as in p0\_*, p1\_* and p2\_*.
     A stage includes a set of registers and
     all the logic that feeds from those registers (actually, all the logic

271,9 → 271,12

     ports belong logically to stage 1 and the write port to stage 2.\\
     IMPORTANT: though the register bank read port is synchronous, its data can
-    be used in stage 1 because it is read early (the read port is loaded at the
+    be used in stage 1 because it is read early (the read address port is loaded at the
     same time as the instruction opcode). That is, a small part of the
-    instruction decoding is done on stage FETCH-1. Bearing in mind that the code
+    instruction decoding is done on stage FETCH-1, by feeding the source
+    register index field straight from the code bus to the register bank BRAM.
+    Bearing in mind that the code cache
     ram is meant to be the exact same type of block as the register bank (or
     faster if the register bank is implemented with distributed RAM), and we
     will cram the whole ALU delay plus the reg bank delay in stage 1, it does

454,7 → 457,7

     saved to EPF is not the victim instruction's but the preceding jump
     instruction's as explained in \cite[p.~64]{see_mips_run}.\\
-    Plasma used to save in epc the address of the instruction after break or
+    Plasma CPU used to save in epc the address of the instruction after break or
     syscall, and used an unstandard vector address (0x03c). This core will go
     the standard R3000 way instead.\\

/src/tex/intro.tex

1,11 → 1,11

 \clearpage
 This file contains usage instructions and notes about the Ion CPU core project.
-The core structure is briefly explained in sections 1 to 4. The rest of this
+The core structure is briefly explained in sections 1 to 5. The rest of this
 doc describes other aspects of the project: code samples, utility scripts,
 etc.\\
-This document is not yet a full reference on the Ion core. Instead, it should be
-taken as a companion and commentary to the source code.\\
+This document is not yet a full reference on the Ion core or a data sheet.
+Instead, it should be taken as a companion and commentary to the source code.\\
 This document assumes you know in some depth the MIPS-I architecture. Terms and
 concepts from \cite['See MIPS Run']{see_mips_run} and

36,9 → 36,9

     \item All unimplemented opcodes trigger the proper traps.
     \item Includes minimalistic memory handler with interfaces for external
           SRAM (or FLASH) on 8- and 16-bit data bus.
-    \item Size and speed compares favorably to other free MIPS cores.
+    \item Size and speed comparable to other free MIPS cores.
     \item Fully sinchronous (rising clock edge only). No latches.
-    \item Source HDL is vendor independent (Though it has only been tested on
+    \item Source HDL is fully vendor independent (Only tested on
           Xilinx and Altera synthesis tools).
 \end{enumerate}
 \end{framed}

71,7 → 71,7

     \item Hardware interrupts not implemented.
     \item Memory handler does not support dynamic RAM.
     \item Caches are not configurable or parametrizable.
-    \item Documentation is disastrously inadequate.
+    \item Real documentation (specs doc \& data sheet) missing.
 \end{enumerate}
 \end{framed}

/src/tex/sw_samples.tex

20,8 → 20,7

     Target 'demo' will build a synthesizable demo; it will compile the sample
     sources and place the resulting object code in file
-    '/vhdl/demo/code\_rom\_pkg.vhdl' (note that the 'sim' target has to do this
-    too).\\
+    '/vhdl/SoC/bootstrap\_code\_pkg.vhdl'.\\
     The build process will produce two or more binary files ('*.code' and
     '*.data', or '*.bin') that can be run on the software simulator, plus a

31,6 → 30,7

software simulator with the proper parameters. As an example, these are the

contents of the 'swsim.bat' file for the 'hello' demo:

\needspace{8\baselineskip}

\begin{verbatim}

@rem Run software simulator in hands-off mode

..\..\tools\slite\slite\bin\Debug\slite.exe ^

51,6 → 51,7

templates.

Assuming you have Python 2.5 or later in your machine, call the script with:

\needspace{1\baselineskip}

\begin{verbatim}

python bin2hdl.py --help

\end{verbatim}\\

57,3 → 58,165

     to get a short description (see section~\ref{python_script}).
+\section{Support Code}
+\label{support_code}
+\subsection{Bootstrap Code}
+\label{asm_bootstrap_code}
+    File \emph{'/src/common/bootstrap.s'} contains the reset code and the stub
+    interrupt handler.
+    This code is meant to be placed at the reset vector address; in the present
+    revision of the project, the bootstrap code would be placed in the
+    bootstrap BRAM of the SoC module (see section ~\ref{bootstrap_code}). It
+    can be placed in external memory if the SoC memory map and the makefiles are
+    altered accordingly.
+    The reset code initializes both I-Cache and D-Cache and jumps to symbol
+    \texttt{'entry'} with the CPU in kernel mode and interrupts disabled.
+    The interrupt code is somewhat more elaborate but nevertheless is still
+    a stub. The interrupt code will find out the cause of the interrupt and will
+    proceed.
+    Table ~\ref{tab_irq_handling} shows interrupt sources recognized in the
+    current version, and how the interrupt code deals with them.
+    \begin{table}[h]
+    \caption{Interrupt handling\label{tab_irq_handling}}
+    \begin{tabularx}{\textwidth}{ l|X }
+    \toprule
+    Cause & Processing \\
+    \midrule
+    SYSCALL instruction             & Return immediately (STUB). \\
+    BREAK instruction               & Return immediately (STUB). \\
+    Invalid opcode                  & Return immediately (STUB). \\
+    Unimplemented MIPS-32 opcode    & Emulate opcode. \\
+    \bottomrule
+    \end{tabularx}
+    \end{table}
+    Please note that the interrupt code does not even \emph{know} there is
+    such a thing as a hardware interrupt -- hardware interrupts are not
+    implemented yet in the CPU.
+    As can be seen, most of the interrupt processing is missing, replaced by
+    stubs. Eventually, those stubs will be replaced with calls to C functions
+    that will be defined in the supporting libraries. I have yet to work out
+    exactly how to do it without reinventing the wheel.
+    The only interrupt processing that is performed (even if only partially) is
+    the emulation of \emph{some} MIPS-32 opcodes.
+\subsection{MIPS-32 Opcode Emulation}
+\label{mips32_opcode_emulation}
+    I have found out that most MIPS toolchains target the MIPS-32 architecture
+    and can only be made to work with the MIPS-I architecture with some
+    difficulty. This applies especially to the C support libraries, where
+    MIPS-32 opcodes are used occasionally.
+    Other reasons, such as the availability of MIPS-32 ports of
+    uClinux, make at least partial compatibility with MIPS-32 very desirable.
+    On the other hand, extending the core to implement the full MIPS-32 specs (as
+    opposed to the far simpler MIPS-I) might not be possible without running
+    into a patent minefield -- I may be wrong in this.
+    Therefore, I have chosen to just emulate whatever subset of the
+    MIPS-32 ISA is enough to overcome the above obstacles. I have run some
+    experiments with uClinux and have found out that only a few MIPS-32 opcodes
+    need to be emulated -- see table ~\ref{tab_emulated_opcodes}.
+    \begin{table}[h]
+    \caption{Emulated MIPS-32 opcodes\label{tab_emulated_opcodes}}
+    \begin{tabularx}{\textwidth}{ l|X }
+    \toprule
+    Opcode & Emulation \\
+    \midrule
+    EXT                 & Fully emulated. \\
+    INS                 & Fully emulated. \\
+    CLO                 & Fully emulated. \\
+    CLZ                 & Fully emulated. \\
+    \midrule
+    MUL (3-reg version) & To Be Done. \\
+    LWL                 & To Be Done. \\
+    LWR                 & To Be Done. \\
+    SWL                 & To Be Done. \\
+    SWR                 & To Be Done. \\
+    \bottomrule
+    \end{tabularx}
+    \end{table}
+    Opcode emulation is implemented in file \emph{/src/common/opcode\_emu.s}.
+    The emulation code has been tested (code sample \emph{'opcodes'}) but is
+    still unfinished: not only are there opcodes to be emulated, as can
+    be seen in table ~\ref{tab_emulated_opcodes}, but the emulation does NOT
+    work when the MIPS-32 opcode is in a delay slot; or more precisely, the
+    jump instruction of the delay slot is not emulated as it should be -- this is
+    just work to be done, nothing especially difficult.
+    If the trapped MIPS-32 opcode is not one of the emulated opcodes, it is
+    ignored exactly as if it was a NOP. A flag is raised in a special, reserved
+    area of the BSS segment -- see the source for details. Eventually the trap
+    handling will be complete and unimplemented MIPS-32 opcodes will be dealt
+    with as undefined opcodes.
+    Note that opcode emulation can be disabled in the makefiles, by defining
+    symbol \texttt{NO\_EMU\_MIPS32} when assembling \emph{bootstrap.s}.
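For reference, the four opcodes listed as fully emulated have the following semantics, written here in C. This is not a transcription of \emph{opcode\_emu.s} (which is assembly and works on the trapped instruction word); the function names are hypothetical and the code only states what result each opcode should produce according to the MIPS-32 instruction definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low 'size' bits set, 1 <= size <= 32. */
static uint32_t field_mask(unsigned size)
{
    assert(size >= 1 && size <= 32);
    return (size == 32) ? 0xFFFFFFFFu : ((1u << size) - 1u);
}

/* CLZ: count leading zeros (32 when rs is zero). */
uint32_t emu_clz(uint32_t rs)
{
    uint32_t n = 0;
    while (n < 32u && (rs & (0x80000000u >> n)) == 0) {
        n++;
    }
    return n;
}

/* CLO: count leading ones (32 when rs is all ones). */
uint32_t emu_clo(uint32_t rs)
{
    return emu_clz(~rs);
}

/* EXT: extract 'size' bits of rs starting at bit 'pos', zero-extended. */
uint32_t emu_ext(uint32_t rs, unsigned pos, unsigned size)
{
    assert(pos < 32 && pos + size <= 32);
    return (rs >> pos) & field_mask(size);
}

/* INS: insert the low 'size' bits of rs into rt at bit 'pos'. */
uint32_t emu_ins(uint32_t rt, uint32_t rs, unsigned pos, unsigned size)
{
    assert(pos < 32 && pos + size <= 32);
    uint32_t m = field_mask(size) << pos;
    return (rt & ~m) | ((rs << pos) & m);
}
```

Functions like these can serve as a reference model when checking the assembly emulation against the software simulator.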
+\subsection{C Startup Code}
+\label{c_startup_code}
+    File \emph{'/src/common/c\_startup.s'} contains the C startup code used
+    in all C code samples.
+    This startup code does what's usual in these cases, namely:
+    \begin{enumerate}
+    \item Initialize the stack at the end of the bss area.
+    \item Clear the bss area.
+    \item Move the data section from FLASH to RAM (if applicable).
+    \item Call main().
+    \item Freeze in endless loop after main() returns, if it does.
+    \end{enumerate}
+    See the makefile of any of the C code samples for examples of how to
+    assemble, configure and link the startup code.
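Steps 2 and 3 of the list above can be sketched in portable C as shown below. The real startup code is assembly and takes its addresses from linker symbols; here they are passed in a hypothetical \texttt{image\_layout\_t} structure so the logic can be exercised on a host. Setting the stack pointer, calling main() and the final endless loop are left out because they cannot be expressed in portable C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of the memory image; on the target these
   values would come from linker symbols. */
typedef struct {
    uint8_t       *bss_start;   /* start of the bss area in RAM */
    size_t         bss_size;
    const uint8_t *data_load;   /* where the data section is stored (e.g. FLASH) */
    uint8_t       *data_start;  /* where the data section must live in RAM */
    size_t         data_size;
} image_layout_t;

/* Clear bss, then copy the data section to RAM if it is stored elsewhere. */
void prepare_c_runtime(const image_layout_t *img)
{
    assert(img != NULL);
    if (img->bss_size > 0) {
        memset(img->bss_start, 0, img->bss_size);
    }
    if (img->data_size > 0 && img->data_load != img->data_start) {
        memcpy(img->data_start, img->data_load, img->data_size);
    }
}
```

After this, the startup code would call main() and loop forever if it returns.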
+\subsection{C Support Library Replacement}
+\label{libsoc}
+    The core will eventually be tested with one of the regular, free libc
+    replacement libraries available. It isn't yet because building those
+    libraries for a MIPS-I target (i.e. with no MIPS-32 stuff mixed in) has
+    turned out to be much more difficult than I had anticipated.
+    In order to be able to use the C toolchain and provide some code samples in
+    C, I have built a minimalistic libc replacement called 'libsoc'. The
+    source code can be found in \emph{/src/common/libsoc/src}.
+    This mini-libc provides only those C support functions I have found
+    necessary so far; it will be extended with new functionality as I add more
+    code samples, until I can finally use some real C library.
+    Apart from a number of utility functions that the C runtime needs, libsoc
+    provides the following:
+    \begin{itemize}
+    \item Floating point support (SW, float and double).
+    \item \texttt{putchar} and \texttt{getchar} using the SoC UART.
+    \item Replacement for the built-in \texttt{printf} provided by gcc.
+    \end{itemize}
+    That's it. Yet, even this little is enough to run the Adventure demo...
+    All the source code has been lifted from the original Plasma supporting code
+    or has been scavenged from the net -- see credits in the file headers.

/src/ionstyle.sty

59,3 → 59,6

\usepackage[utf8]{inputenc}

\usepackage{booktabs}

\usepackage{tabularx} % Flexible tables

\hypersetup{%

pdfborder = {0 0 0}

}

/src/ion_project.tex

25,6 → 25,7

\input{./tex/intro.tex}

%\input{./tex/quickstart.tex}

\input{./tex/usage.tex}

\input{./tex/soc.tex}

\input{./tex/cpu.tex}

\input{./tex/cache.tex}

\input{./tex/simulation.tex}