URL
https://opencores.org/ocsvn/neorv32/neorv32/trunk
Subversion Repositories neorv32
Compare Revisions
- This comparison shows the changes necessary to convert path
/neorv32/trunk/docs/userguide
- from Rev 63 to Rev 64
Rev 63 → Rev 64
/content.adoc
145,8 → 145,29
[start=1] |
. Create a new project with your FPGA EDA tool of choice. |
. Add all VHDL files from the project's `rtl/core` folder to your project. |
|
.Internal Memories |
[IMPORTANT] |
For a _general_ first setup (technology-independent) use the _default_ memory architectures for the internal memories |
(IMEM and DMEM). These are located in `rtl/core/mem`, so **make sure to add the files from `rtl/core/mem` to your project, too**. + |
+ |
If synthesis cannot efficiently map those default memory descriptions to the available memory resources, you can later replace the |
default memory architectures with optimized platform-specific memory architectures. **Example:** The `setups/radiant/UPduino_v3` |
example setup uses optimized memory primitives. Hence, it does not include the default memory architectures from |
`rtl/core/mem` as these are replaced by device-specific implementations. However, it still has to include the entity |
definitions from `rtl/core`. |
|
[start=3] |
. Make sure to add all the rtl files to a new library called `neorv32`. If your FPGA tool does not |
provide a field to enter the library name, check out the "properties" menu of the added rtl files. |
|
.Compile order |
[NOTE] |
Some tools (like Lattice Radiant) might require a _manual compile order_ of the VHDL source files to identify the dependencies. |
The package file `neorv32_package.vhd` should be analyzed first, followed by the memory image files (`neorv32_application_image.vhd` |
and `neorv32_bootloader_image.vhd`) and the entity-only files (`neorv32_*mem.entity.vhd`). |
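
The ordering rule from the note above can be sketched as a small shell helper. This is only an illustration under assumptions: the function name `order_sources` is hypothetical and not part of the project, the file list is expected on standard input (one path per line), and all remaining files are simply emitted last in their original order. A real tool may need further dependency ordering among those remaining files.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the NEORV32 project): reorder a list of
# VHDL source files so that the files named in the note come first.
order_sources() {
  local files
  files=$(cat)   # read the complete file list from stdin

  # 1. the package file
  grep -E '(^|/)neorv32_package\.vhd$' <<< "$files" || true
  # 2. the memory image files
  grep -E '(^|/)neorv32_(application|bootloader)_image\.vhd$' <<< "$files" || true
  # 3. the entity-only memory files
  grep -E '(^|/)neorv32_[^/]*mem\.entity\.vhd$' <<< "$files" || true
  # 4. everything else, in the order given
  grep -Ev '(^|/)neorv32_(package|(application|bootloader)_image|[^/]*mem\.entity)\.vhd$' <<< "$files" || true
}
```

A possible use is `ls rtl/core/*.vhd | order_sources`, feeding the result into the tool's manual compile-order dialog or script.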
|
[start=4] |
. The `rtl/core/neorv32_top.vhd` VHDL file is the top entity of the NEORV32 processor, which can be |
instantiated into the "real" project. However, in this tutorial we will use one of the pre-defined |
test setups from `rtl/test_setups` (see above). |
742,6 → 763,71
<<< |
// #################################################################################################################### |
:sectnums: |
== Adding Custom Hardware Modules |
|
In resemblance to the RISC-V ISA, the NEORV32 processor was designed to ease customization and _extensibility_. |
The processor provides several predefined options to add application-specific custom hardware modules and accelerators. |
|
|
=== Standard (_External_) Interfaces |
|
The processor already provides a set of standard interfaces that are intended to connect _chip-external_ devices. |
However, these interfaces can also be used chip-internally. The most suitable interfaces are |
https://stnolting.github.io/neorv32/#_general_purpose_input_and_output_port_gpio[GPIO], |
https://stnolting.github.io/neorv32/#_primary_universal_asynchronous_receiver_and_transmitter_uart0[UART], |
https://stnolting.github.io/neorv32/#_serial_peripheral_interface_controller_spi[SPI] and |
https://stnolting.github.io/neorv32/#_two_wire_serial_interface_controller_twi[TWI]. |
|
The SPI and (especially) the GPIO interfaces might be the most straightforward approaches since they |
have a minimal protocol overhead. Device-specific interrupt capabilities can be added using the |
https://stnolting.github.io/neorv32/#_external_interrupt_controller_xirq[External Interrupt Controller (XIRQ)]. |
Despite their simplicity, these interfaces provide only very limited bandwidth and require more sophisticated |
software handling ("bit-banging" for the GPIO). |
|
|
=== External Bus Interface |
|
The https://stnolting.github.io/neorv32/#_processor_external_memory_interface_wishbone_axi4_lite[External Bus Interface] |
provides the classic approach to connect to custom IP. By default, the bus interface implements the widely adopted |
Wishbone interface standard. However, this project also includes wrappers to bridge to other protocol standards like ARM's |
AXI4-Lite or Intel's Avalon. By using a full-featured bus protocol, complex SoC structures can be implemented (including |
several modules and even multi-core architectures). Many FPGA EDA tools provide graphical editors to build and customize |
whole SoC architectures and even include pre-defined IP libraries. |
|
.Example AXI SoC using Xilinx Vivado |
image::neorv32_axi_soc.png[] |
|
The bus interface uses a memory-mapped approach. All data transfers are handled by simple load/store operations since the |
external bus interface is mapped into the processor's https://stnolting.github.io/neorv32/#_address_space[address space]. |
This allows very simple yet high-bandwidth communication. |
|
|
=== Stream Link Interface |
|
The NEORV32 https://stnolting.github.io/neorv32/#_stream_link_interface_slink[Stream Link Interface] provides |
point-to-point, unidirectional and parallel data channels that can be used to transfer streaming data. In |
contrast to the external bus interface, the streaming data does not provide any kind of "direction" control, |
so it can be seen as "constant address bursts". The stream link interface provides less protocol overhead |
and less latency than the bus interface. Furthermore, FIFOs can be configured for each direction (RX/TX) to |
allow more CPU-independent operation. |
|
|
=== Custom Functions Subsystem |
|
The https://stnolting.github.io/neorv32/#_custom_functions_subsystem_cfs[NEORV32 Custom Functions Subsystem] |
is an "empty" template for a processor-internal module. It provides 32 memory-mapped 32-bit interface |
registers that can be used to communicate with any arbitrary custom design logic. The intention of this |
subsystem is to provide a simple base, where the user can concentrate on implementing the actual design logic |
rather than taking care of the communication between the CPU/software and the design logic. The interface |
registers are already allocated within the processor's address space and are supported by the software framework |
via low-level hardware access mechanisms. Additionally, the CFS provides a direct pre-defined interrupt channel to |
the CPU, which is also supported by the _NEORV32 runtime environment_. |
|
|
|
<<< |
// #################################################################################################################### |
:sectnums: |
== Customizing the Internal Bootloader |
|
The NEORV32 bootloader provides several options to configure and customize it for a certain application setup. |
781,7 → 867,7
|
Each configuration parameter is implemented as C-language `define` that can be manually overridden (_redefined_) when |
invoking the bootloader's makefile. The corresponding parameter and its new value have to be _appended_
(using `+=`) to the makefile's `USER_FLAGS` variable. Make sure to use the `-D` prefix here. |
(using `+=`) to the makefile `USER_FLAGS` variable. Make sure to use the `-D` prefix here. |
|
For example, to configure a UART Baud rate of 57600 and redirect the status LED to output pin 20
use the following command (_in_ the bootloader's source folder `sw/bootloader`): |
935,7 → 1021,8
== Packaging the Processor as IP block for Xilinx Vivado Block Designer |
|
[start=1] |
. Import all the core files from `rtl/core` and assign them to a _new_ design library `neorv32`. |
. Import all the core files from `rtl/core` (including default internal memory architectures from `rtl/core/mem`) |
and assign them to a _new_ design library `neorv32`. |
. Instantiate the `rtl/wrappers/neorv32_top_axi4lite.vhd` module. |
. Then either directly use that module in a new block-design ("Create Block Design", right-click -> "Add Module", |
that's easier for a first try) or package it ("Tools", "Create and Package new IP") for the use in other projects.
961,13 → 1048,22
:sectnums: |
== Simulating the Processor |
|
The NEORV32 project includes a core CPU, built-in peripherals in the Processor Subsystem, and additional peripherals in |
the templates and examples. |
Therefore, there is a wide range of possible testing and verification strategies. |
|
On the one hand, a simple smoke testbench allows ensuring that functionality is correct from a software point of view. |
That is used for running the RISC-V architecture tests, in order to guarantee compliance with the ISA specification(s). |
|
On the other hand, http://vunit.github.io/[VUnit] and http://vunit.github.io/verification_components/user_guide.html[Verification Components] are used for verifying the functionality of the various peripherals from a hardware point of view. |
|
:sectnums: |
=== Testbench |
|
The NEORV32 project features a simple, plain-VHDL (no third-party libraries) default testbench (`sim/neorv32_tb.simple.vhd`) |
that can be used to simulate and test the processor setup. This testbench features a 100MHz clock and enables all optional |
peripheral and CPU extensions except for the `E` extension and the TRNG IO module (that CANNOT be simulated due to its |
combinatorial (looped) architecture). |
A plain-VHDL (no third-party libraries) testbench (`sim/simple/neorv32_tb.simple.vhd`) can be used for simulating and |
testing the processor. |
This testbench features a 100MHz clock and enables all optional peripheral and CPU extensions except for the `E` |
extension and the TRNG IO module (that CANNOT be simulated due to its combinatorial (looped) architecture). |
|
The simulation setup is configured via the "User Configuration" section located right at the beginning of |
the testbench's architecture. Each configuration constant provides comments to explain the functionality. |
981,9 → 1077,21
* a memory-mapped register to trigger the processor's interrupt signals
|
The following table shows the base addresses of these four components and their default configuration and |
properties (attributes: `r` = read, `w` = write, `e` = execute, `a` = atomic accesses possible, `8` = byte-accessible, `16` = |
half-word-accessible, `32` = word-accessible). |
properties: |
|
[NOTE] |
==== |
Attributes: |
|
* `r` = read |
* `w` = write |
* `e` = execute |
* `a` = atomic accesses possible |
* `8` = byte-accessible |
* `16` = half-word-accessible |
* `32` = word-accessible |
==== |
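
As a reading aid for the attribute notation, the following sketch expands an attribute string from the table into words. It is a minimal illustration only: the function name `describe_attrs` is hypothetical, and it assumes the `r/w/e, a, 8/16/32` layout used in the table, with `-` marking an attribute that is not available.

```bash
#!/usr/bin/env bash
# Hypothetical helper: expand an attribute string such as "-/w/-, a, -/-/32".
describe_attrs() {
  local -a toks=() out=()
  local tok
  # Split the string at '/', ',' and spaces.
  IFS='/, ' read -r -a toks <<< "$1"
  for tok in "${toks[@]}"; do
    case $tok in
      r)  out+=("read") ;;
      w)  out+=("write") ;;
      e)  out+=("execute") ;;
      a)  out+=("atomic") ;;
      8)  out+=("byte") ;;
      16) out+=("half-word") ;;
      32) out+=("word") ;;
      *)  ;;  # "-" (or anything unknown) contributes nothing
    esac
  done
  echo "${out[*]}"
}

describe_attrs "-/w/-, a, -/-/32"   # prints: write atomic word
```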
|
.Testbench: processor-external memories |
[cols="^4,>3,^5,<11"] |
[options="header",grid="rows"] |
995,12 → 1103,12
| `0xff000000` | 4 bytes | `-/w/-, a, -/-/32` | memory-mapped register to trigger "machine external", "machine software" and "SoC Fast Interrupt" interrupts |
|======================= |
|
[NOTE] |
[IMPORTANT] |
The simulated NEORV32 does not use the bootloader and _directly boots_ the current application image (from |
the `rtl/core/neorv32_application_image.vhd` image file). |
|
.UART output during simulation |
[NOTE] |
[IMPORTANT] |
Data written to the NEORV32 UART0 / UART1 transmitter is sent to a virtual UART receiver implemented
as part of the testbench. Received chars are sent to the simulator console and are also stored to a log file
(`neorv32.testbench_uart0.out` for UART0, `neorv32.testbench_uart1.out` for UART1) inside the simulation's home folder. |
1013,22 → 1121,18
|
When printing data via the UART the communication speed will always be based on the configured BAUD |
rate. For a simulation this might take some time. To have faster output you can enable the **simulation mode** |
or UART0/UART1 (see section https://stnolting.github.io/neorv32/#_primary_universal_asynchronous_receiver_and_transmitter_uart0[Documentation: Primary Universal Asynchronous Receiver and Transmitter (UART0)]). |
for UART0/UART1 (see section https://stnolting.github.io/neorv32/#_primary_universal_asynchronous_receiver_and_transmitter_uart0[Documentation: Primary Universal Asynchronous Receiver and Transmitter (UART0)]). |
|
ASCII data send to UART0 will be immediately printed to the simulator console. Additionally, the |
ASCII data is logged in a file (`neorv32.uart0.sim_mode.text.out`) in the simulator home folder. All |
written 32-bit data is also dumped as 8-char hexadecimal value into a file |
(`neorv32.uart0.sim_mode.data.out`) also in the simulator home folder. |
ASCII data sent to UART0|UART1 will be immediately printed to the simulator console and logged to files in the simulator |
execution directory: |
|
ASCII data send to UART1 will be immediately printed to the simulator console. Additionally, the |
ASCII data is logged in a file (`neorv32.uart1.sim_mode.text.out`) in the simulator home folder. All |
written 32-bit data is also dumped as 8-char hexadecimal value into a file |
(`neorv32.uart1.sim_mode.data.out`) also in the simulator home folder. |
* `neorv32.uart?.sim_mode.text.out`: ASCII data. |
* `neorv32.uart?.sim_mode.data.out`: all written 32-bit data dumped as 8-char hexadecimal values.
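
The hexadecimal dump can be post-processed with ordinary shell tools. The sketch below is an illustration under an assumption: it expects exactly one 8-char hexadecimal word per line, which the text above does not state explicitly. The function name `decode_sim_words` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print each 32-bit word of a *.data.out dump as an
# unsigned decimal value.
# ASSUMPTION: the file contains one 8-char hexadecimal word per line.
decode_sim_words() {
  local word
  while IFS= read -r word || [ -n "$word" ]; do
    word=${word//$'\r'/}            # tolerate CRLF line endings
    [ -z "$word" ] && continue      # skip empty lines
    printf '0x%s = %u\n' "$word" "0x$word"
  done < "$1"
}

# Example call (the file must exist in the simulator execution directory):
#   decode_sim_words neorv32.uart0.sim_mode.data.out
```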
|
You can "automatically" enable the simulation mode of UART0/UART1 when compiling an application. In this case the |
"real" UART0/UART1 transmitter unit is permanently disabled. To enable the simulation mode just compile |
and install your application and add _UART0_SIM_MODE_ for UART0 and/or _UART1_SIM_MODE_ for UART1 to |
the compiler's _USER_FLAGS_ variable (do not forget the `-D` suffix flag): |
You can "automatically" enable the simulation mode of UART0/UART1 when compiling an application. |
In this case, the "real" UART0/UART1 transmitter unit is permanently disabled. |
To enable the simulation mode just compile and install your application and add _UART?_SIM_MODE_ to the compiler's |
_USER_FLAGS_ variable (do not forget the `-D` prefix):
|
[source, bash] |
---- |
1044,20 → 1148,20
|
|
:sectnums: |
=== Simulation using GHDL |
=== Simulation using a shell script (with GHDL) |
|
To simulate the processor using _GHDL_ navigate to the `sim` folder and run the provided shell script. |
To simulate the processor using _GHDL_ navigate to the `sim/simple/` folder and run the provided shell script. |
Any arguments that are provided while executing this script are passed to GHDL. |
For example the simulation time can be set to 20ms using `--stop-time=20ms` as argument. |
|
[source, bash] |
---- |
neorv32/sim$ sh ghdl_sim.sh --stop-time=20ms |
neorv32/sim/simple$ sh ghdl_sim.sh --stop-time=20ms |
---- |
|
|
:sectnums: |
=== In-Console Application Simulation |
=== Simulation using Application Makefiles (In-Console with GHDL) |
|
To directly compile and run a program in the console (using the default testbench and GHDL |
as simulator) you can use the `sim` makefile target. Make sure to use the UART simulation mode |
1073,11 → 1177,11
|
|
:sectnums: |
=== Hello World! |
==== Hello World! |
|
To do a quick test of the NEORV32 make sure to have [GHDL](https://github.com/ghdl/ghdl) and a |
[RISC-V gcc toolchain](https://github.com/stnolting/riscv-gcc-prebuilt) installed, navigate to the project's |
`sw/example/hello_world` folder and run `make USER_FLAGS+=-DUART0_SIM_MODE MARCH=-march=rv32imac clean_all sim`: |
To do a quick test of the NEORV32 make sure to have https://github.com/ghdl/ghdl[GHDL] and a |
https://github.com/stnolting/riscv-gcc-prebuilt[RISC-V gcc toolchain] installed.
Navigate to the project's `sw/example/hello_world` folder and run `make USER_FLAGS+=-DUART0_SIM_MODE MARCH=-march=rv32imac clean_all sim`: |
|
[TIP] |
The simulator will output some _sanity check_ notes (and warnings or even errors if something is ill-configured) |
1088,17 → 1192,17
stnolting@Einstein:/mnt/n/Projects/neorv32/sw/example/hello_world$ make USER_FLAGS+=-DUART0_SIM_MODE MARCH=-march=rv32imac clean_all sim |
../../../sw/lib/source/neorv32_uart.c: In function 'neorv32_uart0_setup': |
../../../sw/lib/source/neorv32_uart.c:301:4: warning: #warning UART0_SIM_MODE (primary UART) enabled! Sending all UART0.TX data to text.io simulation output instead of real UART0 transmitter. Use this for simulations only! [-Wcpp] |
301 | #warning UART0_SIM_MODE (primary UART) enabled! Sending all UART0.TX data to text.io simulation output instead of real UART0 transmitter. Use this for simulations only! |
301 | #warning UART0_SIM_MODE (primary UART) enabled! Sending all UART0.TX data to text.io simulation output instead of real UART0 transmitter. Use this for simulations only! <1> |
| ^~~~~~~ |
Memory utilization: |
text data bss dec hex filename |
4612 0 120 4732 127c main.elf |
4612 0 120 4732 127c main.elf <2> |
Compiling ../../../sw/image_gen/image_gen |
Installing application image to ../../../rtl/core/neorv32_application_image.vhd |
Installing application image to ../../../rtl/core/neorv32_application_image.vhd <3> |
Simulating neorv32_application_image.vhd... |
Tip: Compile application with USER_FLAGS+=-DUART[0/1]_SIM_MODE to auto-enable UART[0/1]'s simulation mode (redirect UART output to simulator console). |
Using simulation runtime args: --stop-time=10ms |
../rtl/core/neorv32_top.vhd:347:3:@0ms:(assertion note): NEORV32 PROCESSOR IO Configuration: GPIO MTIME UART0 UART1 SPI TWI PWM WDT CFS SLINK NEOLED XIRQ |
Tip: Compile application with USER_FLAGS+=-DUART[0/1]_SIM_MODE to auto-enable UART[0/1]'s simulation mode (redirect UART output to simulator console). <4> |
Using simulation runtime args: --stop-time=10ms <5> |
../rtl/core/neorv32_top.vhd:347:3:@0ms:(assertion note): NEORV32 PROCESSOR IO Configuration: GPIO MTIME UART0 UART1 SPI TWI PWM WDT CFS SLINK NEOLED XIRQ <6> |
../rtl/core/neorv32_top.vhd:370:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Boot configuration: Direct boot from memory (processor-internal IMEM). |
../rtl/core/neorv32_top.vhd:394:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Implementing on-chip debugger (OCD). |
../rtl/core/neorv32_cpu.vhd:169:3:@0ms:(assertion note): NEORV32 CPU ISA Configuration (MARCH): RV32IMACU_Zbb_Zicsr_Zifencei_Zfinx_Debug |
1110,7 → 1214,7
../rtl/core/neorv32_wishbone.vhd:144:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: External Bus Interface - Implementing LITTLE-endian byte order. |
../rtl/core/neorv32_wishbone.vhd:148:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: External Bus Interface - Implementing registered RX path. |
../rtl/core/neorv32_slink.vhd:161:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Implementing 8 RX and 8 TX stream links. |
|
<7> |
## |
## ## ## ## |
## ## ######### ######## ######## ## ## ######## ######## ## ################ |
1124,24 → 1228,51
## |
Hello world! :) |
---- |
<1> Notification that "simulation mode" of UART0 is enabled (by the `USER_FLAGS+=-DUART0_SIM_MODE` makefile flag). All UART0 output is sent to the simulator console.
<2> Final executable size (`text`) and _static_ data memory requirements (`data`, `bss`). |
<3> The application code is _installed_ as pre-initialized IMEM. This is the default approach for simulation. |
<4> A note regarding UART "simulation mode", but we have already enabled that. |
<5> List of (default) arguments that were sent to the simulator. Here: maximum simulation time (10ms).
<6> "Sanity checks" from the core's VHDL files. These reports give some brief information about the SoC/CPU configuration (-> generics). If there are problems with the current configuration, an ERROR will appear. |
<7> Execution of the actual program starts. |
|
|
:sectnums: |
=== Advanced Simulation using VUNIT |
=== Advanced Simulation using VUnit |
|
.WORK IN PROGRESS |
[WARNING] |
This Section Is Under Construction! + |
+ |
FIXME! |
https://vunit.github.io/[VUnit] is an open source unit testing framework for VHDL/SystemVerilog. |
It allows continuous and automated testing of HDL code by complementing traditional testing methodologies. |
The motto of VUnit is _"testing early and often"_ through automation. |
|
The NEORV32 provides a more sophisticated simulation setup using https://vunit.github.io/[VUNIT]. |
The according VUNIT-based testbench is `sim/neorv32_tb.vhd`. |
VUnit is composed of a http://vunit.github.io/py/ui.html[Python interface] and multiple optional
http://vunit.github.io/vhdl_libraries.html[VHDL libraries]. |
The Python interface allows declaring sources and simulation options, and it handles the compilation, execution and |
gathering of the results regardless of the simulator used. |
That allows having a single `run.py` script to be used with GHDL, ModelSim/QuestaSim, Riviera PRO, etc. |
On the other hand, the VUnit's VHDL libraries provide utilities for assertions, logging, having virtual queues, handling CSV files, etc. |
The http://vunit.github.io/verification_components/user_guide.html[Verification Component Library] uses those features |
for abstracting away bit-toggling when verifying standard interfaces such as Wishbone, AXI, Avalon, UARTs, etc. |
|
**WORK-IN-PROGRESS** |
Testbench sources in `sim` (such as `sim/neorv32_tb.vhd` and `sim/uart_rx*.vhd`) use VUnit's VHDL libraries for testing |
NEORV32 and peripherals. |
The entrypoint for executing the tests is `sim/run.py`. |
|
[source, bash] |
---- |
# ./sim/run.py -l |
neorv32.neorv32_tb.all |
Listed 1 tests |
|
# ./sim/run.py -v |
Compiling into neorv32: rtl/core/neorv32_uart.vhd passed |
Compiling into neorv32: rtl/core/neorv32_twi.vhd passed |
Compiling into neorv32: rtl/core/neorv32_trng.vhd passed |
... |
---- |
|
See http://vunit.github.io/user_guide.html[VUnit: User Guide] and http://vunit.github.io/cli.html[VUnit: Command Line Interface] for further info about VUnit's features. |
|
|
<<< |
// #################################################################################################################### |
:sectnums: |
1220,7 → 1351,12
processor setup _with_ internal instruction memory (IMEM) make sure it is implemented as RAM |
(_INT_BOOTLOADER_EN_ generic = true). |
|
[IMPORTANT] |
The on-chip debugger is only implemented if the _ON_CHIP_DEBUGGER_EN_ generic is set _true_. Furthermore, it requires |
the `Zicsr` and `Zifencei` CPU extensions to be implemented (top generics _CPU_EXTENSION_RISCV_Zicsr_
and _CPU_EXTENSION_RISCV_Zifencei_ = true). |
|
|
:sectnums: |
=== Hardware Requirements |
|
1300,9 → 1436,15
.Compile the test application |
[source, bash] |
-------------------------- |
.../neorv32/sw/example/blink_led$ make MARCH=-march=rv32i clean_all all |
.../neorv32/sw/example/blink_led$ make MARCH=-march=rv32i USER_FLAGS+=-g clean_all all |
-------------------------- |
|
.Adding debug symbols to the executable |
[NOTE] |
`USER_FLAGS+=-g` passes the `-g` flag to the compiler so it adds debug information/symbols |
to the generated ELF file. This is optional but will provide more sophisticated information for debugging |
(like source file line numbers). |
|
This will generate an ELF file `main.elf` that contains all the symbols required for debugging. |
Furthermore, an assembly listing file `main.asm` is generated that we will use to define breakpoints. |
|
1331,9 → 1473,9
(gdb) |
-------------------------- |
|
Now connect to OpenOCD using the default port 3333 on your local machine. |
Set the ELF file we want to debug to the recently generated `main.elf` from the `blink_led` example. |
Finally, upload the program to the processor. |
Now connect to OpenOCD using the default port 3333 on your machine. |
We will use the previously generated ELF file `main.elf` from the `blink_led` example. |
Finally, upload the program to the processor and start debugging. |
|
[NOTE] |
The executable that is uploaded to the processor is **not** the default NEORV32 executable (`neorv32_exe.bin`) that |
1343,7 → 1485,7
.Running GDB |
[source, bash] |
-------------------------- |
(gdb) target remote localhost:3333 <1> |
(gdb) target extended-remote localhost:3333 <1> |
Remote debugging using localhost:3333 |
warning: No executable has been specified and target does not support |
determining executable automatically. Try using the "file" command. |
1405,6 → 1547,13
Breakpoint 1 at 0x690 |
-------------------------- |
|
.How do breakpoints work? |
[TIP] |
The NEORV32 on-chip debugger does not provide any hardware breakpoints (RISC-V "trigger modules") that compare an address like the PC |
with a predefined value. Instead, gdb will modify the actual executable in IMEM: the actual instruction at the address |
of the specified breakpoint is replaced by an `ebreak` / `c.ebreak` instruction. Whenever execution reaches this instruction, debug mode is
re-entered and the debugger restores the original instruction at this address to maintain original program behavior. |
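
The replace-and-restore idea can be pictured with a toy model. The following sketch is not how gdb or OpenOCD are implemented: it merely models IMEM as a shell array (address to 32-bit instruction word in hexadecimal), and the function names are hypothetical. The only hardware fact used is the encoding of the RISC-V `ebreak` instruction (`0x00100073`); the instruction word at `0x690` is an arbitrary placeholder, not taken from `main.asm`.

```bash
#!/usr/bin/env bash
# Toy model of a software breakpoint (illustration only).
imem=()    # address -> instruction word (hex), stands in for IMEM
saved=()   # address -> original instruction word while a breakpoint is set
EBREAK=00100073   # encoding of the RISC-V `ebreak` instruction

set_breakpoint() {      # $1 = address
  saved[$1]=${imem[$1]} # remember the original instruction
  imem[$1]=$EBREAK      # patch in the break instruction
}

clear_breakpoint() {    # $1 = address
  imem[$1]=${saved[$1]} # restore the original instruction
  unset "saved[$1]"
}

imem[0x690]=00150513    # placeholder instruction word
set_breakpoint 0x690
echo "with breakpoint: ${imem[0x690]}"
clear_breakpoint 0x690
echo "restored:        ${imem[0x690]}"
```

Because the instruction memory itself is patched, this approach only works if IMEM is writable, which matches the requirement that IMEM is implemented as RAM.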
|
Now execute `c` (= continue). The CPU will resume operation until it hits the break-point. |
This way we can "step" from increment to increment.
|