OpenCores
URL https://opencores.org/ocsvn/neorv32/neorv32/trunk

Subversion Repositories neorv32

[/] [neorv32/] [trunk/] [docs/] [userguide/] [content.adoc] - Diff between revs 62 and 63


Rev 62 Rev 63
Line 1... Line 1...
Let's Get It Started!
Let's Get It Started!
 
 
To make your NEORV32 project run, follow the guides from the upcoming sections. Follow these guides
This user guide uses the NEORV32 project _as is_ from the official `neorv32` repository.
step by step and in the presented order.
To make your first NEORV32 project run, follow the guides from the upcoming sections. It is recommended to
 
follow these guides step by step and eventually in the presented order.
 
 
 
[TIP]
 
This guide uses the minimalistic and platform/toolchain agnostic SoC test setups from
 
`rtl/test_setups` for illustration. You can use one of the provided test setups for
 
your first FPGA tests. Alternatively, have a look at the `setups` folder,
 
which provides more sophisticated example setups for various FPGAs/FPGA boards and toolchains.
 
 
 
 
:sectnums:
:sectnums:
== Software Toolchain Setup
== Software Toolchain Setup
 
 
To compile (and debug) executables for the NEORV32 a RISC-V toolchain is required.
To compile (and debug) executables for the NEORV32 a RISC-V toolchain is required.
There are two possibilities to get this:
There are two possibilities to get this:
 
 
1. Download and _build_ the official RISC-V GNU toolchain yourself
1. Download and _build_ the official RISC-V GNU toolchain yourself.
2. Download and install a prebuilt version of the toolchain; this might also be done via the package manager / app store of your OS
2. Download and install a prebuilt version of the toolchain; this might also be done via the package manager / app store of your OS
 
 
[TIP]
[NOTE]
The default toolchain prefix for this project is **`riscv32-unknown-elf-`**. Of course you can use any other RISC-V
The default toolchain prefix (`RISCV_PREFIX` variable) for this project is **`riscv32-unknown-elf-`**. Of course you can use any other RISC-V
toolchain (like `riscv64-unknown-elf-`) that is capable of emitting code for a `rv32` architecture. Just change the _RISCV_PREFIX_ variable in the application
toolchain (like `riscv64-unknown-elf-`) that is capable of emitting code for a `rv32` architecture. Just change `RISCV_PREFIX`
makefile(s) according to your needs or define this variable when invoking the makefile.
according to your needs.
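
As a small illustration of selecting the prefix, the following shell sketch checks whether a compiler for a given prefix is on the `PATH` and prints the matching `make` call instead of running it. The helper names `toolchain_available` and `print_make_cmd` are made up for this example and are not part of the project; `clean_all` and `all` are the makefile targets used elsewhere in this guide.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the NEORV32 repository).

# Succeeds if "<prefix>gcc" can be found on the PATH.
toolchain_available() {
  command -v "${1}gcc" >/dev/null 2>&1
}

# Prints (does not run) a make call that overrides RISCV_PREFIX.
print_make_cmd() {
  echo "make RISCV_PREFIX=${1} clean_all all"
}

RISCV_PREFIX="${RISCV_PREFIX:-riscv32-unknown-elf-}"  # project default prefix
if toolchain_available "$RISCV_PREFIX"; then
  print_make_cmd "$RISCV_PREFIX"
else
  echo "no ${RISCV_PREFIX}gcc found on PATH"
fi
```

Printing the command first makes it easy to check which prefix will be used before starting a build.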
 
 
[IMPORTANT]
 
Keep in mind that – for instance – a rv32imc toolchain only provides library code compiled with
 
compressed (_C_) and `mul`/`div` instructions (_M_)! Hence, this code cannot be executed (without
 
emulation) on an architecture without these extensions!
 
 
 
 
 
:sectnums:
:sectnums:
=== Building the Toolchain from Scratch
=== Building the Toolchain from Scratch
 
 
Line 37... Line 40...
----
----
riscv-gnu-toolchain$ ./configure --prefix=/opt/riscv --with-arch=rv32i --with-abi=ilp32
riscv-gnu-toolchain$ ./configure --prefix=/opt/riscv --with-arch=rv32i --with-abi=ilp32
riscv-gnu-toolchain$ make
riscv-gnu-toolchain$ make
----
----
 
 
 
[IMPORTANT]
 
Keep in mind that – for instance – a toolchain built with `--with-arch=rv32imc` only provides library code compiled with
 
compressed (`C`) and `mul`/`div` instructions (`M`)! Hence, this code cannot be executed (without
 
emulation) on an architecture without these extensions!
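
The rule above can be sketched as a simple subset check: every extension letter the toolchain libraries were built with must also be implemented by the CPU. The function name `isa_libs_compatible` is hypothetical, and the sketch only handles single-letter extensions directly following `rv32` (no multi-letter extensions such as `Zfinx`).

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if every single-letter extension in the
# toolchain ISA string ($1) is also present in the CPU ISA string ($2).
isa_libs_compatible() {
  toolchain_ext="${1#rv32}"
  cpu_ext="${2#rv32}"
  while [ -n "$toolchain_ext" ]; do
    # Extract the first remaining extension letter.
    letter="${toolchain_ext%"${toolchain_ext#?}"}"
    case "$cpu_ext" in
      *"$letter"*) ;;      # CPU implements this extension
      *) return 1 ;;       # library code would use a missing extension
    esac
    toolchain_ext="${toolchain_ext#?}"
  done
  return 0
}

if isa_libs_compatible rv32imc rv32i; then
  echo "library code can run on this CPU"
else
  echo "library code uses extensions the CPU lacks"
fi
```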
 
 
 
 
:sectnums:
:sectnums:
=== Downloading and Installing a Prebuilt Toolchain
=== Downloading and Installing a Prebuilt Toolchain
 
 
Alternatively, you can download a prebuilt toolchain.
Alternatively, you can download a prebuilt toolchain.
Line 101... Line 109...
 
 
// ####################################################################################################################
// ####################################################################################################################
:sectnums:
:sectnums:
== General Hardware Setup
== General Hardware Setup
 
 
This guide will set up a NEORV32 project for FPGA implementation (or simulation only) _from scratch_
This guide shows the basics of setting up a NEORV32 project for FPGA implementation (or simulation only)
 
_from scratch_. It uses a _simplified_ test "SoC" setup of the processor to keep things simple at the beginning.
 
This simple setup is intended for evaluation or as "hello world" project to check out the NEORV32
 
on _your_ FPGA board.
 
 
[TIP]
[TIP]
If you want to use a complete pre-defined setup to start with, check out the
If you want to use a more sophisticated pre-defined setup to start with, check out the
project's `setups` folder (https://github.com/stnolting/neorv32/tree/master/setups),
`setups` folder, which provides example setups for various FPGAs, boards and toolchains.
which provides (script-based) demo setups for various FPGA boards and toolchains.
 
 
The NEORV32 project features two minimalistic pre-configured test setups in
 
https://github.com/stnolting/neorv32/blob/master/rtl/test_setups[`rtl/test_setups`].
 
Both test setups only implement very basic processor and CPU features.
 
The main difference between the two setups is the processor boot concept - so how to get a software executable
 
_into_ the processor:
 
 
 
* **`rtl/test_setups/neorv32_testsetup_approm.vhd`**: this setup does not require a connection via UART. The
 
software executable is "installed" into the bitstream to initialize a read-only memory. Use this setup
 
if your FPGA board does _not_ provide a UART interface.
 
* **`rtl/test_setups/neorv32_testsetup_bootloader.vhd`**: this setup uses the UART and the default NEORV32
 
bootloader to upload new software executables. Use this setup if your board _does_ provide a UART interface.
 
 
 
.NEORV32 "hello world" test setup (`rtl/test_setups/neorv32_testsetup_bootloader.vhd`)
 
image::neorv32_test_setup.png[align=center]
 
 
This tutorial uses a _simplified_ test setup of the processor
.External Clock Source
to keep things simple at the beginning as this setup is intended as
[NOTE]
evaluation or "hello world" project to check out the NEORV32.
These test setups are intended to be directly used as **design top entity**. Of course you can also instantiate them
 
into another design unit. If your FPGA board only provides _very fast_ external clock sources (like on the FOMU board)
 
you might need to add clock management components (PLLs, DCMs, MMCMs, ...) to the test setup or to the according top entity
 
if you instantiate one of the test setups.
 
 
[start=1]
[start=1]
. Create a new project with your FPGA EDA tool of choice.
. Create a new project with your FPGA EDA tool of choice.
. Add all VHDL files from the project's `rtl/core` folder to your project. Make sure to _reference_ the
. Add all VHDL files from the project's `rtl/core` folder to your project.
files only – do not copy them.
 
. Make sure to add all the rtl files to a new library called `neorv32`. If your FPGA tool does not
. Make sure to add all the rtl files to a new library called `neorv32`. If your FPGA tool does not
provide a field to enter the library name, check out the "properties" menu of the added rtl files.
provide a field to enter the library name, check out the "properties" menu of the added rtl files.
. The `rtl/core/neorv32_top.vhd` VHDL file is the top entity of the NEORV32 processor. If you
. The `rtl/core/neorv32_top.vhd` VHDL file is the top entity of the NEORV32 processor, which can be
already have a design, instantiate this unit into your design and proceed.
instantiated into the "real" project. However, in this tutorial we will use one of the pre-defined
 
test setups from `rtl/test_setups` (see above).
 
 
[IMPORTANT]
[IMPORTANT]
Make sure to include the `neorv32` package into your design when instantiating the processor: add
Make sure to include the `neorv32` package into your design when instantiating the processor: add
`library neorv32;` and `use neorv32.neorv32_package.all;` to your design unit.
`library neorv32;` and `use neorv32.neorv32_package.all;` to your design unit.
 
 
[start=5]
[start=5]
. If you do not have a design yet and just want to check out the NEORV32 – no problem! This guide
. Add the pre-defined test setup of choice to the project, too, and select it as _top entity_.
uses a simplified top entity, that encapsulates the actual processor top entity: add the
. The entity of both test setups
`rtl/templates/processor/neorv32_ProcessorTop_Test.vhd` VHDL file to your project, too, and
provide a minimal set of configuration generics, that might have to be adapted to match your FPGA and board:
select it as _top entity_.
 
. This test setup provides a minimal test hardware setup:
 
 
 
.NEORV32 "hello world" test setup
 
image::neorv32_test_setup.png[align=center]
 
 
 
[start=7]
.Test setup entity - configuration generics
. It only implements some very basic processor and CPU features. Also, only the
 
minimum number of signals is propagated to the outer world.
 
. However, a minimal setup-specific configuration of the NEORV32 processor is required to make it run
 
on your FPGA board of choice. Only the absolutely required modifications will be made while
 
keeping the default configuration for the remaining configuration options:
 
 
 
.Cut-out of `neorv32_ProcessorTop_Test.vhd` showing the processor instance and its configuration
 
[source,vhdl]
[source,vhdl]
----
----
neorv32_top_inst: neorv32_top
  generic (
generic map (
    -- adapt these for your setup --
  -- General --
    CLOCK_FREQUENCY   : natural := 100000000; <1>
  CLOCK_FREQUENCY   => 100000000, -- in Hz # <1>
    MEM_INT_IMEM_SIZE : natural := 16*1024;   <2>
  INT_BOOTLOADER_EN => true,
    MEM_INT_DMEM_SIZE : natural := 8*1024     <3>
  ...
  );
  -- Internal instruction memory --
 
  MEM_INT_IMEM_EN   => true,
 
  MEM_INT_IMEM_SIZE => 16*1024, # <2>
 
  -- Internal data memory --
 
  MEM_INT_DMEM_EN   => true,
 
  MEM_INT_DMEM_SIZE => 8*1024, # <3>
 
  ...
 
----
----
<1> Clock frequency of `clk_i` signal in Hertz
<1> Clock frequency of `clk_i` signal in Hertz
<2> Default size of internal instruction memory: 16kB
<2> Default size of internal instruction memory: 16kB
<3> Default size of internal data memory: 8kB
<3> Default size of internal data memory: 8kB
 
 
[start=9]
[start=7]
. There is one generic that has to be set according to your FPGA board setup: the actual clock frequency
. If you feel like it – or if your FPGA does not provide sufficient resources – you can modify the
of the top's clock input signal (`clk_i`). Use the _CLOCK_FREQUENCY_ generic to specify your clock source's
_memory sizes_ (`MEM_INT_IMEM_SIZE` and `MEM_INT_DMEM_SIZE` – marked with notes "2" and "3"). But as mentioned
frequency in Hertz (Hz) (note "1").
above, let's keep things simple at first and use the standard configuration for now.
. If you feel like it – or if your FPGA does not provide many resources – you can modify the
. There is one generic that _has to be set according to your FPGA board_ setup: the actual clock frequency
**memory sizes** (_MEM_INT_IMEM_SIZE_ and _MEM_INT_DMEM_SIZE_ – marked with notes "2" and "3") or even
of the top's clock input signal (`clk_i`). Use the `CLOCK_FREQUENCY` generic to specify your clock source's
exclude certain ISA extensions and peripheral modules from implementation - but as mentioned above, let's keep things
frequency in Hertz (Hz).
simple at first and use the standard configuration for now.
 
 
 
[NOTE]
[NOTE]
If you have changed the default memory configuration (_MEM_INT_IMEM_SIZE_ and _MEM_INT_DMEM_SIZE_ generics)
If you have changed the default memory configuration (`MEM_INT_IMEM_SIZE` and `MEM_INT_DMEM_SIZE` generics)
keep those new sizes in mind – these values are required for setting
keep those new sizes in mind – these values are required for setting
up the software framework in the next section <<_general_software_framework_setup>>.
up the software framework in the next section <<_general_software_framework_setup>>.
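
For reference, the default memory sizes from the listing above work out as follows. This is only a sketch of the arithmetic; the helper name `kib_to_bytes` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print a size given in KiB as bytes (decimal and hex).
kib_to_bytes() {
  bytes=$(( $1 * 1024 ))
  printf '%d bytes (0x%08X)\n' "$bytes" "$bytes"
}

kib_to_bytes 16   # MEM_INT_IMEM_SIZE default => 16384 bytes (0x00004000)
kib_to_bytes 8    # MEM_INT_DMEM_SIZE default => 8192 bytes (0x00002000)
```

If you change the generics, run the helper with your own sizes and note the resulting byte values for the software setup.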
 
 
[start=11]
[start=9]
. Depending on your FPGA tool of choice, it is time to assign the signals of the test setup top entity to
. Depending on your FPGA tool of choice, it is time to assign the signals of the test setup top entity to
the according pins of your FPGA board. All the signals can be found in the entity declaration:
the according pins of your FPGA board. All the signals can be found in the entity declaration of the
 
corresponding test setup:
 
 
.Entity signals of `neorv32_test_setup.vhd`
.Entity signals of `neorv32_testsetup_approm.vhd`
[source,vhdl]
[source,vhdl]
----
----
entity neorv32_test_setup is
 
  port (
  port (
    -- Global control --
    -- Global control --
    clk_i       : in std_ulogic := '0'; -- global clock, rising edge
    clk_i       : in  std_ulogic; -- global clock, rising edge
    rstn_i      : in std_ulogic := '0'; -- global reset, low-active, async
    rstn_i      : in  std_ulogic; -- global reset, low-active, async
 
    -- GPIO --
 
    gpio_o      : out std_ulogic_vector(7 downto 0) -- parallel output
 
  );
 
----
 
 
 
.Entity signals of `neorv32_testsetup_bootloader.vhd`
 
[source,vhdl]
 
----
 
  port (
 
    -- Global control --
 
    clk_i       : in  std_ulogic; -- global clock, rising edge
 
    rstn_i      : in  std_ulogic; -- global reset, low-active, async
    -- GPIO --
    -- GPIO --
    gpio_o      : out std_ulogic_vector(7 downto 0); -- parallel output
    gpio_o      : out std_ulogic_vector(7 downto 0); -- parallel output
    -- UART0 --
    -- UART0 --
    uart0_txd_o : out std_ulogic; -- UART0 send data
    uart0_txd_o : out std_ulogic; -- UART0 send data
    uart0_rxd_i : in std_ulogic := '0' -- UART0 receive data
    uart0_rxd_i : in  std_ulogic  -- UART0 receive data
);
);
end neorv32_test_setup;
 
----
----
 
 
[start=12]
.Signal Polarity
 
[NOTE]
 
If your FPGA board has inverse polarity for certain input/output you can add `not` gates. Example: The reset signal
 
`rstn_i` is low-active by default; the LEDs connected to `gpio_o` high-active by default.
 
You can do this in your board top if you instantiate the test setup,
 
or _inside_ the test setup if this is your top entity (low-active LEDs example: `gpio_o <= NOT con_gpio_o(7 downto 0);`).
 
 
 
[start=10]
. Attach the clock input `clk_i` to your clock source and connect the reset line `rstn_i` to a button of
. Attach the clock input `clk_i` to your clock source and connect the reset line `rstn_i` to a button of
your FPGA board. Check whether it is low-active or high-active – the reset signal of the processor is
your FPGA board. Check whether it is low-active or high-active – the reset signal of the processor is
**low-active**, so maybe you need to invert the input signal.
**low-active**, so maybe you need to invert the input signal.
. If possible, connect at least bit `0` of the GPIO output port `gpio_o` to a high-active LED (invert
. If possible, connect _at least_ bit `0` of the GPIO output port `gpio_o` to an LED (see "Signal Polarity" note above).
the signal when your LEDs are low-active). This LED will be used as status LED for the setup.
. Finally, if you are using the UART-based test setup (`neorv32_testsetup_bootloader.vhd`)
. Finally, if your FPGA board provides a serial host interface (USB-to-serial converter),
connect the UART communication signals `uart0_txd_o` and `uart0_rxd_i` to the host interface (e.g. USB-UART converter).
connect the UART communication signals `uart0_txd_o` and `uart0_rxd_i`.
 
. Perform the project HDL compilation (synthesis, mapping, bitstream generation).
. Perform the project HDL compilation (synthesis, mapping, bitstream generation).
. Program the generated bitstream into your FPGA and press the button connected to the reset signal.
. Program the generated bitstream into your FPGA and press the button connected to the reset signal.
. Done! The assigned status LED should be flashing now for some seconds before permanently lighting up.
. Done! The LED at `gpio_o(0)` should be flashing now.
 
 
 
[TIP]
 
After the GCC toolchain for compiling RISC-V source code is ready (chapter <<_general_software_framework_setup>>),
 
you can advance to one of these chapters to learn how to get a software executable into your processor setup:
 
* If you are using the `neorv32_testsetup_approm.vhd` setup: See section <<_installing_an_executable_directly_into_memory>>.
 
* If you are using the `neorv32_testsetup_bootloader.vhd` setup: See section <<_uploading_and_starting_of_a_binary_executable_image_via_uart>>.
 
 
 
 
 
 
 
 
// ####################################################################################################################
// ####################################################################################################################
Line 600... Line 631...
 
 
 
 
 
 
// ####################################################################################################################
// ####################################################################################################################
:sectnums:
:sectnums:
 
== Application-Specific Processor Configuration
 
 
 
Due to the processor's configuration options, which are mainly defined via the top entity VHDL generics, the SoC
 
can be tailored to the application-specific requirements. Note that this chapter does not focus on optional
 
_SoC features_ like IO/peripheral modules. It rather gives ideas on how to optimize for _overall goals_
 
like performance and area.
 
 
 
[NOTE]
 
Please keep in mind that optimizing the design in one direction (like performance) will also affect other potential
 
optimization goals (like area and energy).
 
 
 
=== Optimize for Performance
 
 
 
The following points show some concepts to optimize the processor for performance regardless of the costs
 
(i.e. increasing area and energy requirements):
 
 
 
* Enable all performance-related RISC-V CPU extensions that implement dedicated hardware accelerators instead
 
of emulating operations entirely in software:  `M`, `C`, `Zfinx`
 
* Enable mapping of complex CPU operations to dedicated hardware: `FAST_MUL_EN => true` to use DSP slices for
 
multiplications, `FAST_SHIFT_EN => true` to use a fast barrel shifter for shift operations.
 
* Implement the instruction cache: `ICACHE_EN => true`
 
* Use as much _internal_ memory as possible to reduce memory access latency: `MEM_INT_IMEM_EN => true` and
 
`MEM_INT_DMEM_EN => true`, maximize `MEM_INT_IMEM_SIZE` and `MEM_INT_DMEM_SIZE`
 
* Increase the CPU's instruction prefetch buffer size: `CPU_IPB_ENTRIES`
 
* _To be continued..._
 
 
 
 
 
=== Optimize for Size
 
 
 
The NEORV32 is a size-optimized processor system that is intended to fit into tiny niches within large SoC
 
designs or to be used as a customized microcontroller in really tiny / low-power FPGAs (like Lattice iCE40).
 
Here are some ideas on how to make the processor even smaller while maintaining its _general purpose system_
 
concept and maximum RISC-V compatibility.
 
 
 
**SoC**
 
 
 
* This is obvious, but exclude all unused optional IO/peripheral modules from synthesis via the processor
 
configuration generics.
 
* If an IO module provides an option to configure the number of "channels", constrain this number to the
 
actually required value (e.g. the PWM module `IO_PWM_NUM_CH` or the external interrupt controller `XIRQ_NUM_CH`).
 
* Reduce the FIFO sizes of implemented modules (e.g. `SLINK_TX_FIFO`).
 
* Disable the instruction cache (`ICACHE_EN => false`) if the design only uses processor-internal IMEM
 
and DMEM memories.
 
* _To be continued..._
 
 
 
**CPU**
 
 
 
* Use the _embedded_ RISC-V CPU architecture extension (`CPU_EXTENSION_RISCV_E`) to reduce block RAM utilization.
 
* The compressed instructions extension (`CPU_EXTENSION_RISCV_C`) requires additional logic for the decoder but
 
also reduces program code size by approximately 30%.
 
* If not explicitly used/required, constrain the CPU's counter sizes: `CPU_CNT_WIDTH` for `[m]instret[h]`
 
(number of instructions) and `[m]cycle[h]` (number of cycles) counters. You can even remove these counters
 
by setting `CPU_CNT_WIDTH => 0` if they are not used at all (note, this is not RISC-V compliant).
 
* Reduce the CPU's prefetch buffer size (`CPU_IPB_ENTRIES`).
 
* Map CPU shift operations to a small and iterative shifter unit (`FAST_SHIFT_EN => false`).
 
* If you have unused DSP blocks available, you can map multiplication operations to those slices instead of
 
using LUTs to implement the multiplier (`FAST_MUL_EN => true`).
 
* If there is no need to execute division in hardware, use the `Zmmul` extension instead of the full-scale
 
`M` extension.
 
* Disable CPU extensions that are not explicitly used (`A`, `U`, `Zfinx`).
 
* _To be continued..._
 
 
 
=== Optimize for Clock Speed
 
 
 
The NEORV32 Processor and CPU are designed to provide minimal logic between register stages to keep the
 
critical path as short as possible. When enabling additional extensions or modules the impact on the existing
 
logic is also kept at a minimum to prevent timing degradation. If there is a major impact on existing
 
logic (example: many physical memory protection address configuration registers) the VHDL code automatically
 
adds additional register stages to maintain critical path length. Obviously, this increases operation latency.
 
 
 
In order to optimize for a minimal critical path (= maximum clock speed) the following points should be considered:
 
 
 
* Complex CPU extensions (in terms of hardware requirements) should be avoided (examples: floating-point unit, physical memory protection).
 
* Large carry chains (>32-bit) should be avoided (constrain CPU counter sizes: e.g. `CPU_CNT_WIDTH => 32` and `HPM_NUM_CNTS => 32`).
 
* If the target FPGA provides sufficient DSP resources, CPU multiplication operations can be mapped to DSP slices (`FAST_MUL_EN => true`)
 
reducing LUT usage and critical path impact while also increasing overall performance.
 
* Use the synchronous (registered) RX path configuration of the external memory interface (`MEM_EXT_ASYNC_RX => false`).
 
* _To be continued..._
 
 
 
[NOTE]
 
The short and fixed-length critical path makes it possible to integrate the core into existing clock domains.
 
So no clock domain-crossing and no sub-clock generation is required. However, for very high clock
 
frequencies (this is technology / platform dependent) clock domain crossing becomes crucial for chip-internal
 
connections.
 
 
 
 
 
=== Optimize for Energy
 
 
 
There are no _dedicated_ configuration options to optimize the processor for energy (minimal consumption;
 
energy/instruction ratio) yet. However, a reduced processor area (<<_optimize_for_size>>) will also reduce
 
static energy consumption.
 
 
 
To optimize your setup for low-power applications, you can make use of the CPU sleep mode (`wfi` instruction).
 
Put the CPU to sleep mode whenever possible. Disable all processor modules that are not actually used (exclude them
 
from synthesis if they will _never_ be used; disable the module via its control register if the module is not
 
_currently_ used). When in sleep mode, you can keep a timer module running (MTIME or the watchdog) to wake up
 
the CPU again. Since the wake up is triggered by _any_ interrupt, the external interrupt controller can also
 
be used to wake up the CPU again. By this, all timers (and all other modules) can be deactivated as well.
 
 
 
.Processor-internal clock generator shutdown
 
[TIP]
 
If _no_ IO/peripheral module is currently enabled, the processor's internal clock generator circuit will be
 
shut down reducing switching activity and thus, dynamic energy consumption.
 
 
 
 
 
 
 
 
 
// ####################################################################################################################
 
:sectnums:
== Customizing the Internal Bootloader
== Customizing the Internal Bootloader
 
 
The NEORV32 bootloader provides several options to configure and customize it for a certain application setup.
The NEORV32 bootloader provides several options to configure and customize it for a certain application setup.
This configuration is done by passing _defines_ when compiling the bootloader. Of course you can also
This configuration is done by passing _defines_ when compiling the bootloader. Of course you can also
modify the bootloader source code to provide a setup that perfectly fits your needs.
modify the bootloader source code to provide a setup that perfectly fits your needs.
Line 630... Line 770...
4+^| Boot configuration
4+^| Boot configuration
| `AUTO_BOOT_SPI_EN`  | `0` | `0`, `1` | Set `1` to enable immediate boot from external SPI flash
| `AUTO_BOOT_SPI_EN`  | `0` | `0`, `1` | Set `1` to enable immediate boot from external SPI flash
| `AUTO_BOOT_OCD_EN`  | `0` | `0`, `1` | Set `1` to enable boot via on-chip debugger (OCD)
| `AUTO_BOOT_OCD_EN`  | `0` | `0`, `1` | Set `1` to enable boot via on-chip debugger (OCD)
| `AUTO_BOOT_TIMEOUT` | `8` | _any_ | Time in seconds after the auto-boot sequence starts (if there is no UART input by user); set to 0 to disable auto-boot sequence
| `AUTO_BOOT_TIMEOUT` | `8` | _any_ | Time in seconds after the auto-boot sequence starts (if there is no UART input by user); set to 0 to disable auto-boot sequence
4+^| SPI configuration
4+^| SPI configuration
 
| `SPI_EN`                | `1` | `0`, `1` | Set `1` to enable the usage of the SPI module (including load/store executables from/to SPI flash options)
| `SPI_FLASH_CS`          | `0` | `0` ... `7` | SPI chip select output (`spi_csn_o`) for selecting flash
| `SPI_FLASH_CS`          | `0` | `0` ... `7` | SPI chip select output (`spi_csn_o`) for selecting flash
| `SPI_FLASH_SECTOR_SIZE` | `65536` | _any_ | SPI flash sector size in bytes
| `SPI_FLASH_SECTOR_SIZE` | `65536` | _any_ | SPI flash sector size in bytes
| `SPI_FLASH_CLK_PRSC`    | `CLK_PRSC_8`  | `CLK_PRSC_2` `CLK_PRSC_4` `CLK_PRSC_8` `CLK_PRSC_64` `CLK_PRSC_128` `CLK_PRSC_1024` `CLK_PRSC_2048` `CLK_PRSC_4096` | SPI clock pre-scaler (dividing main processor clock)
| `SPI_FLASH_CLK_PRSC`    | `CLK_PRSC_8`  | `CLK_PRSC_2` `CLK_PRSC_4` `CLK_PRSC_8` `CLK_PRSC_64` `CLK_PRSC_128` `CLK_PRSC_1024` `CLK_PRSC_2048` `CLK_PRSC_4096` | SPI clock pre-scaler (dividing main processor clock)
| `SPI_BOOT_BASE_ADDR`    | `0x08000000` | _any_ 32-bit value | Defines the _base_ address of the executable in external flash
| `SPI_BOOT_BASE_ADDR`    | `0x08000000` | _any_ 32-bit value | Defines the _base_ address of the executable in external flash
|=======================
|=======================
Line 818... Line 959...
 
 
// ####################################################################################################################
// ####################################################################################################################
:sectnums:
:sectnums:
== Simulating the Processor
== Simulating the Processor
 
 
.WORK IN PROGRESS
 
[WARNING]
 
This Section Is Under Construction! +
 
 +
 
FIXME!
 
 
 
:sectnums:
:sectnums:
=== Testbench
=== Testbench
 
 
The NEORV32 project features a simple default testbench (`sim/neorv32_tb.simple.vhd`) that can be used to simulate
The NEORV32 project features a simple, plain-VHDL (no third-party libraries) default testbench (`sim/neorv32_tb.simple.vhd`)
and test the processor setup. This testbench features a 100MHz clock and enables all optional peripheral and
that can be used to simulate and test the processor setup. This testbench features a 100MHz clock and enables all optional
CPU extensions except for the `E` extension and the TRNG IO module (that CANNOT be simulated due to its
peripheral and CPU extensions except for the `E` extension and the TRNG IO module (that CANNOT be simulated due to its
combinatorial (looped) oscillator architecture).
combinatorial (looped) architecture).
 
 
The simulation setup is configured via the "User Configuration" section located right at the beginning of
The simulation setup is configured via the "User Configuration" section located right at the beginning of
the testbench's architecture. Each configuration constant provides comments to explain the functionality.
the testbench's architecture. Each configuration constant provides comments to explain the functionality.
 
 
Besides the actual NEORV32 Processor, the testbench also simulates "external" components that are connected
Besides the actual NEORV32 Processor, the testbench also simulates "external" components that are connected
Line 858... Line 993...
| `0x80000000` | `dmem_size_c` | `r/w/e,  a, 8/16/32` | external DMEM
| `0x80000000` | `dmem_size_c` | `r/w/e,  a, 8/16/32` | external DMEM
| `0xf0000000` |      64 bytes | `r/w/e, !a, 8/16/32` | external "IO" memory, atomic accesses will fail
| `0xf0000000` |      64 bytes | `r/w/e, !a, 8/16/32` | external "IO" memory, atomic accesses will fail
| `0xff000000` |       4 bytes | `-/w/-,  a,  -/-/32` | memory-mapped register to trigger "machine external", "machine software" and "SoC Fast Interrupt" interrupts
| `0xff000000` |       4 bytes | `-/w/-,  a,  -/-/32` | memory-mapped register to trigger "machine external", "machine software" and "SoC Fast Interrupt" interrupts
|=======================
|=======================
 
 
The simulated NEORV32 does not use the bootloader and directly boots the current application image (from
 
the `rtl/core/neorv32_application_image.vhd` image file). Make sure to use the `all` target of the
 
makefile to install your application as VHDL image after compilation:
 
 
 
[source, bash]
 
----
 
sw/example/blink_led$ make clean_all all
 
----
 
 
 
.Simulation-Optimized CPU/Processors Modules
 
[NOTE]
[NOTE]
The `sim/rtl_modules` folder provides simulation-optimized versions of certain CPU/processor modules.
The simulated NEORV32 does not use the bootloader and _directly boots_ the current application image (from
These alternatives can be used to replace the default CPU/processor HDL files to allow faster/easier/more
the `rtl/core/neorv32_application_image.vhd` image file).
efficient simulation. **These files are not intended for synthesis!**
 
 
 
**Simulation Console Output**
 
 
 
 
.UART output during simulation
 
[NOTE]
Data written to the NEORV32 UART0 / UART1 transmitter is sent to a virtual UART receiver implemented
Data written to the NEORV32 UART0 / UART1 transmitter is sent to a virtual UART receiver implemented
as part of the testbench. Received chars are sent to the simulator console and are also stored to a log file
as part of the testbench. Received chars are sent to the simulator console and are also stored to a log file
(`neorv32.testbench_uart0.out` for UART0, `neorv32.testbench_uart1.out` for UART1) inside the simulator home folder.
(`neorv32.testbench_uart0.out` for UART0, `neorv32.testbench_uart1.out` for UART1) inside the simulation's home folder.
 
**Please note that printing via the native UART receiver takes a lot of time.** For faster simulation console output
 
see section <<_faster_simulation_console_output>>.
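
Since the UART output is also written to a log file, a simulation run can be checked from a script. The following is a minimal sketch, assuming it is run from the folder that holds the log file; the helper name `uart_log_contains` is hypothetical, and the search string is the banner printed by the `blink_led` example.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the log file ($1) exists and contains
# the given text ($2) as a fixed string.
uart_log_contains() {
  [ -f "$1" ] && grep -qF -- "$2" "$1"
}

LOG="neorv32.testbench_uart0.out"   # UART0 log file written by the testbench
if uart_log_contains "$LOG" "Blinking LED demo program"; then
  echo "expected UART0 output found"
else
  echo "expected UART0 output not found (or no log file in this folder)"
fi
```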
 
 
 
 
:sectnums:
:sectnums:
=== Faster Simulation Console Output
=== Faster Simulation Console Output
 
 
Line 907... Line 1033...
[source, bash]
[source, bash]
----
----
sw/example/blink_led$ make USER_FLAGS+=-DUART0_SIM_MODE clean_all all
sw/example/blink_led$ make USER_FLAGS+=-DUART0_SIM_MODE clean_all all
----
----
 
 
The provided define will change the default UART0/UART1 setup function in order to set the simulation mode flag in the according UART's control register.
The provided define will change the default UART0/UART1 setup function in order to set the simulation
 
mode flag in the according UART's control register.
 
 
[NOTE]
The UART simulation output (to file and to screen) is emitted one complete line at a time. A line is
completed by a line feed (newline, ASCII `\n` = 10).
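
The line-wise behavior can be illustrated with a small shell analogy. This only mimics the behavior described above and is not how the testbench is implemented. Input is passed on line by line, and text without a trailing line feed is not emitted. The function name `emit_complete_lines` is hypothetical.

```shell
#!/bin/sh
# Analogy only (not the testbench implementation): forward complete lines.
# `read` returns non-zero for a trailing fragment that lacks a line feed,
# so that fragment is never printed.
emit_complete_lines() {
  while IFS= read -r line; do
    printf '%s\n' "$line"
  done
}

# Two complete lines followed by an unfinished one.
printf 'first line\nsecond line\nno newline yet' | emit_complete_lines
```

In practice this means an application should terminate its last message with `\n`, otherwise that message may not show up in the simulation output.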
 
 
----
neorv32/sim$ sh ghdl_sim.sh --stop-time=20ms
----
 
 
 
 
 
:sectnums:
 
=== In-Console Application Simulation
 
 
 
To directly compile and run a program in the console (using the default testbench and GHDL
as simulator) you can use the `sim` makefile target. Make sure to use the UART simulation mode
(`USER_FLAGS+=-DUART0_SIM_MODE` and/or `USER_FLAGS+=-DUART1_SIM_MODE`) to get
faster / direct-to-console UART output.
 
 
 
[source, bash]
----
sw/example/blink_led$ make USER_FLAGS+=-DUART0_SIM_MODE clean_all sim
[...]
Blinking LED demo program
----
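
If an application prints via both UARTs, both simulation mode defines are needed. The following sketch assembles the corresponding `make` call from the flags named above and prints it instead of running it. The function name `sim_cmd` is hypothetical. Passing both defines in a single quoted `USER_FLAGS+=` assignment is an assumption about how the makefile forwards this variable.

```shell
#!/bin/sh
# Hypothetical helper: build the make command line for the `sim` target
# with simulation mode enabled for the given UARTs (e.g. UART0 UART1).
sim_cmd() {
  flags=""
  for uart in "$@"; do
    # Append one -D<UART>_SIM_MODE define per argument, space-separated.
    flags="${flags:+$flags }-D${uart}_SIM_MODE"
  done
  printf 'make USER_FLAGS+="%s" clean_all sim\n' "$flags"
}

sim_cmd UART0 UART1
# prints: make USER_FLAGS+="-DUART0_SIM_MODE -DUART1_SIM_MODE" clean_all sim
```

Printing the command first makes it easy to check the flags before starting a potentially long simulation run.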
 
 
 
 
 
:sectnums:
 
=== Hello World!
 
 
 
For a quick test of the NEORV32, make sure that https://github.com/ghdl/ghdl[GHDL] and a
https://github.com/stnolting/riscv-gcc-prebuilt[RISC-V gcc toolchain] are installed. Then navigate to the project's
`sw/example/hello_world` folder and run `make USER_FLAGS+=-DUART0_SIM_MODE MARCH=-march=rv32imac clean_all sim`:
 
 
 
[TIP]
The simulator will output some _sanity check_ notes (and warnings or even errors if something is ill-configured)
right at the beginning of the simulation to give a brief overview of the actual NEORV32 SoC and CPU configurations.
 
 
 
[source, bash]
----
stnolting@Einstein:/mnt/n/Projects/neorv32/sw/example/hello_world$ make USER_FLAGS+=-DUART0_SIM_MODE MARCH=-march=rv32imac clean_all sim
../../../sw/lib/source/neorv32_uart.c: In function 'neorv32_uart0_setup':
../../../sw/lib/source/neorv32_uart.c:301:4: warning: #warning UART0_SIM_MODE (primary UART) enabled! Sending all UART0.TX data to text.io simulation output instead of real UART0 transmitter. Use this for simulations only! [-Wcpp]
  301 |   #warning UART0_SIM_MODE (primary UART) enabled! Sending all UART0.TX data to text.io simulation output instead of real UART0 transmitter. Use this for simulations only!
      |    ^~~~~~~
Memory utilization:
   text    data     bss     dec     hex filename
   4612       0     120    4732    127c main.elf
Compiling ../../../sw/image_gen/image_gen
Installing application image to ../../../rtl/core/neorv32_application_image.vhd
Simulating neorv32_application_image.vhd...
Tip: Compile application with USER_FLAGS+=-DUART[0/1]_SIM_MODE to auto-enable UART[0/1]'s simulation mode (redirect UART output to simulator console).
Using simulation runtime args: --stop-time=10ms
../rtl/core/neorv32_top.vhd:347:3:@0ms:(assertion note): NEORV32 PROCESSOR IO Configuration: GPIO MTIME UART0 UART1 SPI TWI PWM WDT CFS SLINK NEOLED XIRQ
../rtl/core/neorv32_top.vhd:370:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Boot configuration: Direct boot from memory (processor-internal IMEM).
../rtl/core/neorv32_top.vhd:394:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Implementing on-chip debugger (OCD).
../rtl/core/neorv32_cpu.vhd:169:3:@0ms:(assertion note): NEORV32 CPU ISA Configuration (MARCH): RV32IMACU_Zbb_Zicsr_Zifencei_Zfinx_Debug
../rtl/core/neorv32_cpu.vhd:189:3:@0ms:(assertion note): NEORV32 CPU CONFIG NOTE: Implementing NO dedicated hardware reset for uncritical registers (default, might reduce area). Set package constant  = TRUE to configure a DEFINED reset value for all CPU registers.
../rtl/core/neorv32_imem.vhd:107:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Implementing processor-internal IMEM as ROM (16384 bytes), pre-initialized with application (4612 bytes).
../rtl/core/neorv32_dmem.vhd:89:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Implementing processor-internal DMEM (RAM, 8192 bytes).
../rtl/core/neorv32_wishbone.vhd:136:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: External Bus Interface - Implementing STANDARD Wishbone protocol.
../rtl/core/neorv32_wishbone.vhd:140:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: External Bus Interface - Implementing auto-timeout (255 cycles).
../rtl/core/neorv32_wishbone.vhd:144:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: External Bus Interface - Implementing LITTLE-endian byte order.
../rtl/core/neorv32_wishbone.vhd:148:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: External Bus Interface - Implementing registered RX path.
../rtl/core/neorv32_slink.vhd:161:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Implementing 8 RX and 8 TX stream links.

                                                                                       ##
                                                                                       ##         ##   ##   ##
 ##     ##   #########   ########    ########   ##      ##   ########    ########      ##       ################
####    ##  ##          ##      ##  ##      ##  ##      ##  ##      ##  ##      ##     ##     ####            ####
## ##   ##  ##          ##      ##  ##      ##  ##      ##          ##         ##      ##       ##   ######   ##
##  ##  ##  #########   ##      ##  #########   ##      ##      #####        ##        ##     ####   ######   ####
##   ## ##  ##          ##      ##  ##    ##     ##    ##           ##     ##          ##       ##   ######   ##
##    ####  ##          ##      ##  ##     ##     ##  ##    ##      ##   ##            ##     ####            ####
##     ##    #########   ########   ##      ##      ##       ########   ##########     ##       ################
                                                                                       ##         ##   ##   ##
                                                                                       ##
Hello world! :)
----
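
Before running the command above, it can help to verify that the required tools are on the `PATH`. The following is a minimal sketch, assuming that GHDL is installed as `ghdl` and that the toolchain uses the project's default prefix `riscv32-unknown-elf-`. The function name `check_tools` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given tools can be found on PATH.
# Returns 0 if all tools are available, 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Assumed tool names; adjust the compiler name if a different prefix is used.
check_tools ghdl riscv32-unknown-elf-gcc make || echo "install the missing tools first"
```

If a different toolchain prefix is configured, the compiler name passed to `check_tools` has to be changed accordingly.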
 
 
 
 
 
:sectnums:
 
=== Advanced Simulation using VUNIT
 
 
 
.WORK IN PROGRESS
[WARNING]
This Section Is Under Construction! +
 +
FIXME!
 
 
 
The NEORV32 provides a more sophisticated simulation setup using https://vunit.github.io/[VUNIT].
The corresponding VUNIT-based testbench is `sim/neorv32_tb.vhd`.
 
 
 
**WORK-IN-PROGRESS**
 
 
 
 
 
 
 
 
// ####################################################################################################################
:sectnums:
== Building the Documentation
