URL
https://opencores.org/ocsvn/neorv32/neorv32/trunk
Subversion Repositories neorv32
Compare Revisions
- This comparison shows the changes necessary to convert path
/neorv32/trunk/docs
- from Rev 62 to Rev 63
Rev 62 → Rev 63
/datasheet/cpu.adoc
13,6 → 13,7
** `E` - embedded CPU version (reduced register file size) |
** `M` - integer multiplication and division hardware |
** `U` - less-privileged _user_ mode |
** `Zbb` - basic bit-manipulation operations |
** `Zfinx` - single-precision floating-point unit |
** `Zicsr` - control and status register access (privileged architecture) |
** `Zifencei` - instruction stream synchronization |
342,10 → 343,15
Volume II: Privileged Architecture_, which are available in the project's `docs/references` folder. |
|
[TIP] |
The CPU can discover available ISA extensions via the <<_misa>> and <<_mzext>> CSRs or by executing an instruction |
and checking for an _illegal instruction exception_. |
The CPU can discover available ISA extensions via the <<_misa>> CSR and the |
_SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register |
or by executing an instruction and checking for an _illegal instruction exception_. |
|
[NOTE] |
Executing an instruction from an extension that is not implemented or not enabled (for example via the according |
top entity generic) will raise an _illegal instruction_ exception. |
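
The `misa`-based discovery described above can be sketched in C as follows. This is a minimal illustration, not part of the NEORV32 software framework: the function name `misa_has_extension` is hypothetical, and the only assumption is the standard RISC-V `misa` layout in which bit 0 flags extension `A` and bit 25 flags extension `Z`. The CSR value is passed in as a plain integer; on the target it would have to be read with a CSR access instruction, which is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if the single-letter ISA extension
 * `ext` ('A'..'Z') is flagged in the given misa value.
 * Assumes the standard misa layout: bit 0 = 'A', ..., bit 25 = 'Z'. */
bool misa_has_extension(uint32_t misa, char ext)
{
    assert(ext >= 'A' && ext <= 'Z');
    return ((misa >> (ext - 'A')) & 1u) != 0u;
}
```

For example, an illustrative (not measured) value of `0x40901104` has the bits for `C`, `I`, `M`, `U` and `X` set, so `misa_has_extension(0x40901104u, 'M')` is true while `misa_has_extension(0x40901104u, 'E')` is false. Sub-extensions such as `Zicsr` have no `misa` bit and have to be looked up in the _SYSINFO_ module instead.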
|
|
==== **`A`** - Atomic Memory Access |
|
Atomic memory access instructions (for implementing semaphores and mutexes) are available when the |
387,7 → 393,8
requirements. This extension is enabled when the `CPU_EXTENSION_RISCV_E` configuration generic is _true_. Accesses to registers beyond |
`x15` will raise an _illegal instruction exception_. |
|
Due to the reduced register file an alternate ABI (**`ilp32e`**) is required for the toolchain. |
[IMPORTANT] |
Due to the reduced register file size an alternate toolchain ABI (**`ilp32e`**) is required. |
|
|
==== **`I`** - Base Integer ISA |
439,10 → 446,10
|
* multiplication: `mul`, `mulh`, `mulhsu`, `mulhu` |
|
If `Zmmul` is enabled, executing any division instruction from the `M` ISA (`div`, `divu`, `rem`, `remu`) |
will raise an illegal instruction exception. |
If `Zmmul` is enabled, executing any division instruction from the `M` ISA extension (`div`, `divu`, `rem`, `remu`) |
will raise an _illegal instruction exception_. |
|
Note that `M` and `Zmmul` extensions _cannot_ be enabled in parallel. |
Note that `M` and `Zmmul` extensions _cannot_ be enabled at the same time. |
|
[TIP] |
If your RISC-V GCC toolchain does not (yet) support the `Zmmul` ISA extension, it can be "emulated" |
452,7 → 459,7
|
==== **`U`** - Less-Privileged User Mode |
|
Adds the less-privileged _user mode_ when the `CPU_EXTENSION_RISCV_U` configuration generic is _true_. For |
Adds the less-privileged _user mode_ if the `CPU_EXTENSION_RISCV_U` configuration generic is _true_. For |
instance, user-level code cannot access machine-mode CSRs. Furthermore, access to the address space (like |
peripheral/IO devices) can be limited via the physical memory protection (_PMP_) unit for code running in user mode. |
|
461,25 → 468,19
|
The NEORV32-specific extensions are always enabled and are indicated by the set `X` bit in the `misa` CSR. |
|
[NOTE] |
The CPU provides 16 _fast interrupt_ interrupts (`FIRQ)`, which are controlled via custom bits in the `mie` |
The most important points of the NEORV32-specific extensions are: |
* The CPU provides 16 _fast interrupts_ (`FIRQ`), which are controlled via custom bits in the `mie` |
and `mip` CSRs. This extension is mapped to bits that are available for custom use (according to the |
RISC-V specs). Also, custom trap codes for `mcause` are implemented. |
* The CPU provides a single _non-maskable_ interrupt (`NMI`) that also provides a custom trap code for `mcause`. |
* All undefined/unimplemented/malformed/illegal instructions raise an illegal instruction exception (see <<_full_virtualization>>). |
|
[NOTE] |
The CPU provides a single _non-maskable_ interrupt (`NMI)` that also provides a custom trap code for `mcause`. |
|
[NOTE] |
A custom CSR `mzext` is available that can be used to check for implemented `Z*` CPU extensions |
(for example `Zifencei`). This CSR is mapped to the official "custom CSR address region". |
==== **`Zfinx`** Single-Precision Floating-Point Operations |
|
[NOTE] |
All undefined/unimplemented/malformed/illegal instructions do raise an illegal instruction exception |
(see <<_full_virtualization>>). |
[WARNING] |
The NEORV32 `Zfinx` extension is specification-compliant and operational but still _experimental_. |
|
|
==== **`Zfinx`** Single-Precision Floating-Point Operations |
|
The `Zfinx` floating-point extension is an alternative to the `F` floating-point extension that uses the |
integer register file `x` to store and operate on floating-point data (hence, `F-in-x`). Since no dedicated floating-point `f` |
register file exists, the `Zfinx` extension requires fewer hardware resources and features faster context changes. |
516,9 → 517,36
intrinsic library is provided to utilize the provided `Zfinx` floating-point extension from C-language |
code (see `sw/example/floating_point_test`). |
|
|
==== **`Zbb`** Basic Bit-Manipulation Operations |
|
[WARNING] |
The NEORV32 `Zbb` extension is specification-compliant and operational but still _experimental_. |
|
The `Zbb` extension implements the _basic_ subset of the RISC-V bit-manipulation extension `B`. |
The official RISC-V specifications can be found here: https://github.com/riscv/riscv-bitmanip |
|
The `Zbb` extension is implemented when the `CPU_EXTENSION_RISCV_Zbb` configuration |
generic is _true_. In this case the following instructions are available: |
|
* `andn`, `orn`, `xnor` |
* `clz`, `ctz`, `cpop` |
* `max`, `maxu`, `min`, `minu` |
* `sext.b`, `sext.h`, `zext.h` |
* `rol`, `ror`, `rori` |
* `orc.b`, `rev8` |
|
[TIP] |
By default, the bit-manipulation unit uses an _iterative_ approach to compute shift-related operations |
like `clz` and `rol`. To increase performance (at the cost of additional hardware resources) the |
<<_fast_shift_en>> generic can be enabled to implement full-parallel logic (like barrel shifters) for all |
shift-related `Zbb` instructions. |
|
[IMPORTANT] |
Note that any FPU instruction including all FPU-related CSR accesses will raise an illegal instruction exception |
if the FPU is not enabled via the <<_mstatus>> CSR (`FS` bits). |
The `Zbb` extension is frozen but not officially ratified yet. There is no |
software support for this extension in the upstream GCC RISC-V port yet. However, an |
intrinsic library is provided to utilize the provided `Zbb` extension from C-language |
code (see `sw/example/bitmanip_test`). |
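
To make the semantics of some of the listed `Zbb` instructions concrete, here is a small set of plain-C reference models. This is a sketch only and is not the intrinsic library from `sw/example/bitmanip_test`. All function names (`zbb_*`) are hypothetical, and the behavior follows the RISC-V bit-manipulation specification as understood here for RV32 (for instance, `clz` and `ctz` of zero return 32, and rotate amounts use only the lower 5 bits).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical RV32 reference models for selected Zbb instructions. */

uint32_t zbb_andn(uint32_t a, uint32_t b) { return a & ~b; }

uint32_t zbb_clz(uint32_t x) /* count leading zeros, 32 for x == 0 */
{
    uint32_t n = 0;
    for (uint32_t m = 0x80000000u; m != 0u && (x & m) == 0u; m >>= 1) { n++; }
    return n;
}

uint32_t zbb_ctz(uint32_t x) /* count trailing zeros, 32 for x == 0 */
{
    uint32_t n = 0;
    for (uint32_t m = 1u; m != 0u && (x & m) == 0u; m <<= 1) { n++; }
    return n;
}

uint32_t zbb_cpop(uint32_t x) /* population count */
{
    uint32_t n = 0;
    while (x != 0u) { x &= x - 1u; n++; } /* clear lowest set bit */
    return n;
}

uint32_t zbb_rol(uint32_t x, uint32_t sa) /* rotate left by sa[4:0] */
{
    sa &= 31u;
    return (x << sa) | (x >> ((32u - sa) & 31u));
}

uint32_t zbb_ror(uint32_t x, uint32_t sa) /* rotate right by sa[4:0] */
{
    sa &= 31u;
    return (x >> sa) | (x << ((32u - sa) & 31u));
}

uint32_t zbb_orc_b(uint32_t x) /* each non-zero byte becomes 0xFF */
{
    uint32_t r = 0;
    for (unsigned i = 0; i < 4; i++) {
        if (((x >> (8u * i)) & 0xFFu) != 0u) { r |= 0xFFu << (8u * i); }
    }
    return r;
}

uint32_t zbb_rev8(uint32_t x) /* reverse byte order */
{
    return (x << 24) | ((x & 0xFF00u) << 8) | ((x >> 8) & 0xFF00u) | (x >> 24);
}

/* Small self-check with hand-verified values. */
void zbb_self_check(void)
{
    assert(zbb_clz(1u) == 31u);
    assert(zbb_rev8(0x11223344u) == 0x44332211u);
}
```

Models like these can serve as a golden reference when comparing results from hardware or simulation, independent of whether the toolchain already supports the extension.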
|
|
==== **`Zicsr`** Control and Status Register Access / Privileged Architecture |
706,6 → 734,10
| Floating-point - misc | `Zfinx` | `fsgnj.s` `fsgnjn.s` `fsgnjx.s` `fclass.s` | 12 |
| Floating-point - conversion | `Zfinx` | `fcvt.w.s` `fcvt.wu.s` | 47 |
| Floating-point - conversion | `Zfinx` | `fcvt.s.w` `fcvt.s.wu` | 48 |
| Basic bit-manip - logic | `Zbb` | `andn` `orn` `xnor` | 3 |
| Basic bit-manip - shift | `Zbb` | `clz` `ctz` `cpop` `rol` `ror` `rori` | 4+SA, FAST_SHIFT: 4 |
| Basic bit-manip - arith | `Zbb` | `max` `maxu` `min` `minu` | 3 |
| Basic bit-manip - misc | `Zbb` | `sext.b` `sext.h` `zext.h` `orc.b` `rev8` | 3 |
|======================= |
|
[NOTE] |
/datasheet/cpu_csr.adoc
107,8 → 107,6
| 0xf13 | <<_mimpid>> | _CSR_MIMPID_ | r/- | Machine implementation ID / version | |
| 0xf14 | <<_mhartid>> | _CSR_MHARTID_ | r/- | Machine thread ID | |
| 0xf15 | <<_mconfigptr>> | _CSR_MCONFIGPTR_ | r/- | Machine configuration pointer register | |
6+^| **<<_neorv32_specific_custom_csrs>>** |
| 0xfc0 | <<_mzext>> | _CSR_MZEXT_ | r/- | Available `Z*` CPU extensions | |
|======================= |
|
|
188,9 → 186,6
| Bit | Name [C] | R/W | Function |
| 31 | _CSR_MSTATUS_SD_ | r/- | Read-only bit that is set if the FS field is not all-zero (state _OFF_) |
| 21 | _CSR_MSTATUS_TW_ | r/w | Timeout wait: raise illegal instruction exception if `WFI` instruction is executed outside of M-mode when set |
| 14:13 | _CSR_MSTATUS_FS_H_ : _CSR_MSTATUS_FS_L_ | r/w | Floating-point extension state; `00` = _OFF_, `11` = _DIRTY_; writing any other value will |
always set _DIRTY_; if `FS` is _off_ all FPU instructions and FPU CSR access will raise an illegal instruction exception; these status bits are hardwired |
to zero if no FPU is present (_CPU_MZEXT_ZFINX_ = false) |
| 12:11 | _CSR_MSTATUS_MPP_H_ : _CSR_MSTATUS_MPP_L_ | r/w | Previous machine privilege level, 11 = machine (M) level, 00 = user (U) level |
| 7 | _CSR_MSTATUS_MPIE_ | r/w | Previous machine global interrupt enable flag state |
| 3 | _CSR_MSTATUS_MIE_ | r/w | Machine global interrupt enable flag |
233,7 → 228,8
|======================= |
|
[TIP] |
Information regarding the implemented RISC-V `Z*` _sub-extensions_ (like `Zicsr` or `Zfinx`) can be found in the <<_mzext>> CSR. |
Information regarding the implemented RISC-V `Z*` _sub-extensions_ (like `Zicsr` or `Zfinx`) can be found |
in the _SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register. |
|
|
:sectnums!: |
512,16 → 508,16
[IMPORTANT] |
If _CPU_CNT_WIDTH_ is less than 64 (the default value) and greater than or equal to 32, the according |
MSBs of `[m]cycleh` and `[m]instreth` are read-only and always read as zero. This configuration |
will also set the _ZXSCNT_ flag in the <<_mzext>> CSR. + |
will also set the _SYSINFO_CPU_ZXSCNT_ flag in the _SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register. + |
+ |
If _CPU_CNT_WIDTH_ is less than 32 and greater than 0, the `[m]cycleh` and `[m]instreth` do not |
exist and any access will raise an illegal instruction exception. Furthermore, the according MSBs of |
`[m]cycle` and `[m]instret` are read-only and always read as zero. This configuration will also |
set the _ZXSCNT_ flag in the <<_mzext>> CSR. + |
set the _SYSINFO_CPU_ZXSCNT_ flag in the _SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register. + |
+ |
If _CPU_CNT_WIDTH_ is 0, <<_cycleh>> and <<_instreth>> / <<_mcycleh>> and <<_minstreth>> do not |
exist and any access will raise an illegal instruction exception. This configuration will also set the |
_ZXNOCNT_ flag in the <<_mzext>> CSR. |
_SYSINFO_CPU_ZXNOCNT_ flag in the _SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register. |
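
The three _CPU_CNT_WIDTH_ cases above can be summarized in a short C model. This is a sketch under stated assumptions: the type `cnt_layout_t` and the function `cnt_layout` are hypothetical names, and a width of 0 is treated as "no counter CSRs at all", which is an interpretation of the "no CPU counters" flag rather than something this section states explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the counter CSR layout for a given width. */
typedef struct {
    bool     low_word_exists;   /* [m]cycle / [m]instret accessible       */
    bool     high_word_exists;  /* [m]cycleh / [m]instreth accessible     */
    uint64_t implemented_mask;  /* counter bits that can read as non-zero */
} cnt_layout_t;

cnt_layout_t cnt_layout(unsigned cpu_cnt_width)
{
    assert(cpu_cnt_width <= 64u);
    cnt_layout_t l;
    l.low_word_exists  = cpu_cnt_width > 0u;   /* assumption: width 0 = no counters   */
    l.high_word_exists = cpu_cnt_width >= 32u; /* high words trap for widths below 32 */
    /* Avoid the undefined 64-bit shift by 64 for the full-width case. */
    l.implemented_mask = (cpu_cnt_width == 64u)
                       ? UINT64_MAX
                       : ((UINT64_C(1) << cpu_cnt_width) - 1u);
    return l;
}
```

With a width of 40, for example, both CSR words exist and only the lowest 8 bits of the high word can be non-zero. With a width of 16, accesses to the high words raise an illegal instruction exception and the upper 16 bits of the low word always read as zero.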
|
|
:sectnums!: |
782,39 → 778,3
Software can traverse this data structure to discover information about the harts, the platform, and their configuration. |
**NOTE: Not assigned yet.** |
|====== |
|
|
|
<<< |
// #################################################################################################################### |
:sectnums: |
==== NEORV32-Specific Custom CSRs |
|
|
:sectnums!: |
===== **`mzext`** |
|
[cols="4,27,>7"] |
[frame="topbot",grid="none"] |
|====== |
| 0xfc0 | **Available Z* extensions** | `mzext` |
3+| Reset value: _0x00000000_ |
3+| The `mzext` CSR is a custom read-only CSR that shows the implemented Z* extensions. The following bits |
are implemented (all remaining bits are always zero). The entire CSR is read-only. |
|====== |
|
.Machine counter-inhibit register |
[cols="^1,<3,^1,<5"] |
[options="header",grid="rows"] |
|======================= |
| Bit | Name [C] | R/W | Event |
| 0 | _CPU_MZEXT_ZICSR_ | r/- | `Zicsr` extensions available (enabled via <<_cpu_extension_riscv_zicsr>> generic) |
| 1 | _CPU_MZEXT_ZIFENCEI_ | r/- | `Zifencei` extensions available (enabled via <<_cpu_extension_riscv_zifencei>> generic) |
| 2 | _CPU_MZEXT_ZMMUL_ | r/- | `Zmmul` extensions available (enabled via <<_cpu_extension_riscv_zmmul>> generic) |
| 5 | _CPU_MZEXT_ZFINX_ | r/- | `Zfinx` extensions available (enabled via <<_cpu_extension_riscv_zfinx>> generic) |
| 6 | _CPU_MZEXT_ZXSCNT_ | r/- | custom extension: "Small CPU counters": `cycle[h]` & `instret[h]` CSRs have less than 64-bit when set (when <<_cpu_cnt_width>> generic is less than 64) |
| 7 | _CPU_MZEXT_ZXNOCNT_ | r/- | custom extension: "NO CPU counters": `cycle[h]` & `instret[h]` CSRs are not available at all when set (when <<_cpu_cnt_width>> generic is 0) |
| 8 | _CSR_MZEXT_PMP_ | r/- | PMP (physical memory protection) extension available (<<_pmp_num_regions>> generic > 0) |
| 9 | _CSR_MZEXT_HPM_ | r/- | HPM (hardware performance monitors) extension available (<<_hpm_num_cnts>> generic > 0) |
| 10 | _CSR_MZEXT_DEBUGMODE_ | r/- | RISC-V "CPU debug mode" extension available (enabled via <<_cpu_top_entity_generics,_CPU_EXTENSION_RISCV_DEBUG_>> generic) |
|======================= |
/datasheet/overview.adoc
144,41 → 144,41
=== Project Folder Structure |
|
................................... |
neorv32 - Project home folder |
neorv32 - Project home folder |
│ |
├docs - Project documentation |
│├datasheet - .adoc sources for NEORV32 data sheet |
│├doxygen_build - Software framework documentation (generated by doxygen) |
│├figures - Figures and logos |
│├icons - Misc. symbols |
│├references - Data sheets and RISC-V specs. |
│└src_adoc - AsciiDoc sources for this document |
├docs - Project documentation |
│├datasheet - .adoc sources for NEORV32 data sheet |
│├doxygen_build - Software framework documentation (generated by doxygen) |
│├figures - Figures and logos |
│├icons - Misc. symbols |
│├references - Data sheets and RISC-V specs. |
│└src_adoc - AsciiDoc sources for this document |
│ |
├rtl - VHDL sources |
│├core - Core sources of the CPU & SoC |
│└templates - Alternate/additional top entities & wrappers |
│ ├processor - Processor SoC wrappers |
│ └system - System wrappers for advanced connectivity |
├rtl - VHDL sources |
│├core - Core sources of the CPU & SoC |
│├processor_templates - Pre-configured SoC wrappers |
│├system_integration - System wrappers for advanced connectivity |
│└test_setups - Minimal test setup "SoCs" used in the User Guide |
│ |
├setups - Example setups for various FPGAs, boards and toolchains |
├setups - Example setups for various FPGAs, boards and toolchains |
│└... |
│ |
├sim - Simulation files (see User Guide) |
├sim - Simulation files (see User Guide) |
│ |
└sw - Software framework |
├bootloader - Sources and scripts for the NEORV32 internal bootloader |
├common - Linker script and crt0.S start-up code |
├example - Various example programs |
└sw - Software framework |
├bootloader - Sources and scripts for the NEORV32 internal bootloader |
├common - Linker script and crt0.S start-up code |
├example - Various example programs |
│└... |
├isa-test |
│├riscv-arch-test - RISC-V spec. compatibility test framework (submodule) |
│└port-neorv32 - Port files for the official RISC-V architecture tests |
├ocd_firmware - source code for on-chip debugger's "park loop" |
├openocd - OpenOCD on-chip debugger configuration files |
├image_gen - Helper program to generate NEORV32 executables |
 └lib - Processor core library |
 ├include - Header files (*.h) |
 └source - Source files (*.c) |
│├riscv-arch-test - RISC-V spec. compatibility test framework (submodule) |
│└port-neorv32 - Port files for the official RISC-V architecture tests |
├ocd_firmware - source code for on-chip debugger's "park loop" |
├openocd - OpenOCD on-chip debugger configuration files |
├image_gen - Helper program to generate NEORV32 executables |
 └lib - Processor core library |
 ├include - Header files (*.h) |
 └source - Source files (*.c) |
................................... |
|
|
203,6 → 203,7
│ |
├neorv32_cpu.vhd - NEORV32 CPU top entity |
│├neorv32_cpu_alu.vhd - Arithmetic/logic unit |
││├neorv32_cpu_cp_bitmanip.vhd - Bit-manipulation co-processor (B ext.) |
││├neorv32_cpu_cp_fpu.vhd - Floating-point co-processor (Zfinx ext.) |
││├neorv32_cpu_cp_muldiv.vhd - Mul/Div co-processor (M extension) |
││└neorv32_cpu_cp_shifter.vhd - Bit-shift co-processor |
275,7 → 276,8
[TIP] |
The CPU provides further options to reduce the area footprint (for example by constraining the CPU-internal |
counter sizes) or to increase performance (for example by using a barrel-shifter; at cost of extra hardware). |
See section <<_processor_top_entity_generics>> for more information. |
See section <<_processor_top_entity_generics>> for more information. Also, take a look at the User Guide section |
https://stnolting.github.io/neorv32/ug/#_application_specific_processor_configuration[Application-Specific Processor Configuration]. |
|
|
:sectnums: |
335,6 → 337,10
This benchmark focuses on testing the capabilities of the CPU core itself rather than the performance of the whole |
system. The according sources can be found in the `sw/example/coremark` folder. |
|
.Dhrystone |
[TIP] |
A _simple_ port of the Dhrystone benchmark is also available in `sw/example/dhrystone`. |
|
The resulting CoreMark score is defined as CoreMark iterations per second. |
The execution time is determined via the RISC-V `[m]cycle[h]` CSRs. The relative CoreMark score is |
defined as CoreMark score divided by the CPU's clock frequency in MHz. |
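
The two definitions above translate directly into arithmetic. The following C sketch shows the calculation; the function names are hypothetical, and the cycle count is assumed to be the difference of two `[m]cycle[h]` readings taken around the benchmark run.

```c
#include <assert.h>
#include <stdint.h>

/* CoreMark score = iterations per second.
 * `cycles` is the elapsed cycle count, `clock_hz` the CPU clock in Hz. */
double coremark_score(uint32_t iterations, uint64_t cycles, uint32_t clock_hz)
{
    assert(cycles > 0u && clock_hz > 0u);
    double seconds = (double)cycles / (double)clock_hz;
    return (double)iterations / seconds;
}

/* Relative score = CoreMark score divided by the clock frequency in MHz. */
double coremark_per_mhz(double score, uint32_t clock_hz)
{
    assert(clock_hz > 0u);
    return score / ((double)clock_hz / 1.0e6);
}
```

As a purely illustrative calculation (these are not measured results): 2000 iterations in 2,000,000,000 cycles at 100 MHz take 20 seconds, which gives a score of 100 CoreMarks and a relative score of 1.0 CoreMarks/MHz.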
/datasheet/soc.adoc
131,18 → 131,20
[TIP] |
The NEORV32 generics allow you to configure the system according to your needs. The generics are |
used to control implementation of certain CPU extensions and peripheral modules and even allow you to |
optimize the system for certain design goals like minimal area or maximum performance. |
optimize the system for certain design goals like minimal area or maximum performance. + |
**More information can be found in the User Guide section |
https://stnolting.github.io/neorv32/ug/#_application_specific_processor_configuration[Application-Specific Processor Configuration]**. |
|
[TIP] |
Privileged software can determine the actual CPU and processor configuration via the `misa` and |
`mzext` (see <<_machine_trap_setup>> and <<_neorv32_specific_custom_csrs>>) CSRs and via the memory-mapped _SYSINFO_ module (see <<_system_configuration_information_memory_sysinfo>>), |
respectively. |
Privileged software can determine the actual CPU and processor configuration via the `misa` CSR and the |
_SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register. |
|
[TIP] |
If optional modules (like CPU extensions or peripheral devices) are *not enabled* the according circuitry **will not be synthesized at all**. |
Hence, the disabled modules do not increase area and power requirements and do not impact the timing. |
[NOTE] |
If optional modules (like CPU extensions or peripheral devices) are *not enabled* the according circuitry |
**will not be synthesized at all**. Hence, the disabled modules do not increase area and power requirements |
and do not impact the timing. |
|
[TIP] |
[NOTE] |
Not all configuration combinations are valid. The processor RTL code provides sanity checks to inform the user |
during synthesis/simulation if an invalid combination has been detected. |
|
172,7 → 174,8
[frame="all",grid="none"] |
|====== |
| **CLOCK_FREQUENCY** | _natural_ | _none_ |
3+| The clock frequency of the processor's `clk_i` input port in Hertz (Hz). |
3+| The clock frequency of the processor's `clk_i` input port in Hertz (Hz). This value can be retrieved by software |
from the <<_system_configuration_information_memory_sysinfo, SYSINFO>> module. |
|====== |
|
|
190,17 → 193,6
|
|
:sectnums!: |
===== _USER_CODE_ |
|
[cols="4,4,2"] |
[frame="all",grid="none"] |
|====== |
| **USER_CODE** | _std_ulogic_vector(31 downto 0)_ | x"00000000" |
3+| Custom user code that can be read by software via the _SYSINFO_ module. |
|====== |
|
|
:sectnums!: |
===== _HW_THREAD_ID_ |
|
[cols="4,4,2"] |
207,7 → 199,8
[frame="all",grid="none"] |
|====== |
| **HW_THREAD_ID** | _natural_ | 0 |
3+| The hart ID of the CPU. Can be read via the `mhartid` CSR. Hart IDs must be unique within a system. |
3+| The hart ID of the CPU. Software can retrieve this value from the `mhartid` CSR. |
Note that hart IDs must be unique within a system. |
|====== |
|
|
218,7 → 211,8
[frame="all",grid="none"] |
|====== |
| **ON_CHIP_DEBUGGER_EN** | _boolean_ | false |
3+| Implement on-chip debugger (OCD). See chapter <<_on_chip_debugger_ocd>>. |
3+| Implement the on-chip debugger (OCD) and the CPU debug mode. |
See chapter <<_on_chip_debugger_ocd>> for more information. |
|====== |
|
|
226,7 → 220,10
:sectnums: |
==== RISC-V CPU Extensions |
|
See section <<_instruction_sets_and_extensions>> for more information. |
[TIP] |
See section <<_instruction_sets_and_extensions>> for more information. The configuration of the RISC-V _main_ ISA extensions |
(like `M`) can be determined via the <<_misa>> CSR. The configuration of ISA _sub-extensions_ (like `Zicsr`) and _extension options_ |
can be determined via memory-mapped registers of the <<_system_configuration_information_memory_sysinfo>> module. |
|
|
:sectnums!: |
248,8 → 245,8
[frame="all",grid="none"] |
|====== |
| **CPU_EXTENSION_RISCV_C** | _boolean_ | false |
3+| Implement compressed instructions (16-bit) when _true_. |
See section <<_c_compressed_instructions>>. |
3+| Implement compressed instructions (16-bit) when _true_. Compressed instructions can reduce program code |
size by approx. 30%. See section <<_c_compressed_instructions>>. |
|====== |
|
|
260,8 → 257,9
[frame="all",grid="none"] |
|====== |
| **CPU_EXTENSION_RISCV_E** | _boolean_ | false |
3+| Implement the embedded CPU extension (only implement the first 16 data registers) when _true_. |
See section <<_e_embedded_cpu>>. |
3+| Implement the embedded CPU extension (only implement the first 16 data registers) when _true_. This reduces embedded memory |
requirements for the register file. See section <<_e_embedded_cpu>> for more information. Note that this RISC-V extension |
requires a different application binary interface (ABI). |
|====== |
|
|
272,8 → 270,11
[frame="all",grid="none"] |
|====== |
| **CPU_EXTENSION_RISCV_M** | _boolean_ | false |
3+| Implement integer multiplication and division instructions when _true_. |
See section <<_m_integer_multiplication_and_division>>. |
3+| Implement hardware accelerators for integer multiplication and division instructions when _true_. |
If this extension is not enabled, multiplication and division operations (_not_ instructions) will be computed entirely in software. |
If only a hardware multiplier is required, use the <<_cpu_extension_riscv_zmmul>> extension. Multiplication can also be mapped |
to DSP slices via the <<_fast_mul_en>> generic. |
See section <<_m_integer_multiplication_and_division>> for more information. |
|====== |
|
|
285,11 → 286,23
|====== |
| **CPU_EXTENSION_RISCV_U** | _boolean_ | false |
3+| Implement less-privileged user mode when _true_. |
See section <<_u_less_privileged_user_mode>>. |
See section <<_u_less_privileged_user_mode>> for more information. |
|====== |
|
|
:sectnums!: |
===== _CPU_EXTENSION_RISCV_Zbb_ |
|
[cols="4,4,2"] |
[frame="all",grid="none"] |
|====== |
| **CPU_EXTENSION_RISCV_Zbb** | _boolean_ | false |
3+| Implement the `Zbb` _basic_ bit-manipulation sub-extension when _true_. |
See section <<_zbb_basic_bit_manipulation_operations>> for more information. |
|====== |
|
|
:sectnums!: |
===== _CPU_EXTENSION_RISCV_Zfinx_ |
|
[cols="4,4,2"] |
297,7 → 310,7
|====== |
| **CPU_EXTENSION_RISCV_Zfinx** | _boolean_ | false |
3+| Implement the 32-bit single-precision floating-point extension (using integer registers) when _true_. |
See section <<_zfinx_single_precision_floating_point_operations>>. |
See section <<_zfinx_single_precision_floating_point_operations>> for more information. |
|====== |
|
|
311,7 → 324,7
3+| Implement the control and status register (CSR) access instructions when true. Note: When this option is |
disabled, the complete privileged architecture / trap system will be excluded from synthesis. Hence, no interrupts, no exceptions and |
no machine information will be available. |
See section <<_zicsr_control_and_status_register_access_privileged_architecture>>. |
See section <<_zicsr_control_and_status_register_access_privileged_architecture>> for more information. |
|====== |
|
|
323,8 → 336,8
|====== |
| **CPU_EXTENSION_RISCV_Zifencei** | _boolean_ | false |
3+| Implement the instruction fetch synchronization instruction `fence.i`. For example, this option is required |
for self-modifying code (and/or for i-cache flushes). |
See section <<_zifencei_instruction_stream_synchronization>>. |
for self-modifying code (and/or for instruction cache and CPU prefetch buffer flushes). |
See section <<_zifencei_instruction_stream_synchronization>> for more information. |
|====== |
|
|
335,8 → 348,8
[frame="all",grid="none"] |
|====== |
| **CPU_EXTENSION_RISCV_Zmmul** | _boolean_ | false |
3+| Implement integer multiplication-only instructions when _true_. This is a sub-extensions of the `M` extension. |
See section <<_zmmul_integer_multiplication>>. |
3+| Implement integer multiplication-only instructions when _true_. This is a sub-extension of the `M` extension, which |
cannot be used together with the `M` extension. See section <<_zmmul_integer_multiplication>> for more information. |
|====== |
|
|
354,9 → 367,11
[frame="all",grid="none"] |
|====== |
| **FAST_MUL_EN** | _boolean_ | false |
3+| When this generic is enabled, the multiplier of the `M` extension is realized using DSPs blocks instead of an |
iterative bit-serial approach. This generic is only relevant when the multiplier and divider CPU extension is |
enabled (<<_cpu_extension_riscv_m>> is _true_). |
3+| When this generic is enabled, the multiplier of the `M` extension is implemented using DSP blocks instead of an |
iterative bit-serial approach. Performance will be increased and LUT utilization will be reduced at the cost of DSP slice |
utilization. This generic is only relevant when a hardware multiplier CPU extension is |
enabled (<<_cpu_extension_riscv_m>> or <<_cpu_extension_riscv_zmmul>> is _true_). **Note that the multipliers of the |
<<_zfinx_single_precision_floating_point_operations>> extension are always mapped to DSP blocks (if available).** |
|====== |
|
|
367,9 → 382,11
[frame="all",grid="none"] |
|====== |
| **FAST_SHIFT_EN** | _boolean_ | false |
3+| When this generic is set _true_ the shifter unit of the CPU's ALU is implemented as fast barrel shifter (requiring |
more hardware resources). If it is set _false_ the CPU uses a serial shifter that only performs a single bit shift per cycle |
(small but slow). |
3+| If this generic is set _true_, the shifter unit of the CPU's ALU is implemented as a fast barrel shifter (requiring |
more hardware resources but completing within two clock cycles). If it is set _false_, the CPU uses a serial shifter |
that only performs a single bit shift per cycle (requiring fewer hardware resources but up to 32 clock |
cycles to complete, depending on the shift amount). **Note that this option also implements barrel shifters for _all_ |
shift-related operations of the <<_zbb_basic_bit_manipulation_operations>> extension.** |
|====== |
|
|
381,9 → 398,8
|====== |
| **CPU_CNT_WIDTH** | _natural_ | 64 |
3+| This generic configures the total size of the CPU's `cycle` and `instret` CSRs (low word + high word). |
The maximum value is 64, the minimum value is 0. See |
section <<_machine_counters_and_timers>> for more information. Note: configurations with <<_cpu_cnt_width>> |
less than 64 bits do not comply to the RISC-V specs. |
The maximum value is 64, the minimum value is 0. See section <<_machine_counters_and_timers>> for more information. |
Note: configurations with <<_cpu_cnt_width>> less than 64 bits do not comply with the RISC-V specs. |
|====== |
|
|
396,8 → 412,7
| **CPU_IPB_ENTRIES** | _natural_ | 2 |
3+| This generic configures the number of entries in the CPU's instruction prefetch buffer (a FIFO). |
The value has to be a power of two and has to be greater than zero. |
Long linear sequences of code can benefit from an increased IPB size. For setups that use the instruction |
cache (<<_icache_en>>) this generic should be set to 1. |
Long linear sequences of code can benefit from an increased IPB size. |
|====== |
|
|
416,8 → 431,8
|====== |
| **PMP_NUM_REGIONS** | _natural_ | 0 |
3+| Total number of implemented protection regions (0..64). If this generic is zero, no physical memory |
protection logic will be implemented at all. Setting <<_pmp_num_regions>>_ > 0 will set the _CSR_MZEXT_PMP_ flag |
in the <<_mzext>> CSR. |
protection logic will be implemented at all. Setting <<_pmp_num_regions>> > 0 will set the _SYSINFO_CPU_PMP_ flag |
in the _SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register. |
|====== |
|
|
446,9 → 461,9
[frame="all",grid="none"] |
|====== |
| **HPM_NUM_CNTS** | _natural_ | 0 |
3+| Total number of implemented hardware performance monitor counters (0..29). If this generics is zero no |
hardware performance monitor logic will be implemented at all. Setting <<_hpm_num_cnts>> > 0 will set the _CSR_MZEXT_HPM_ flag |
in the <<_mzext>> CSR. |
3+| Total number of implemented hardware performance monitor counters (0..29). If this generic is zero, no |
hardware performance monitor logic will be implemented at all. Setting <<_hpm_num_cnts>> > 0 will set the _SYSINFO_CPU_HPM_ flag |
in the _SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register. |
|====== |
|
|
459,8 → 474,8
[frame="all",grid="none"] |
|====== |
| **HPM_CNT_WIDTH** | _natural_ | 40 |
3+| This generic defines the total LSB-aligned size of each HPM counter (size(`[m]hpmcounter*h`) + |
size(`[m]hpmcounter*`)). The maximum value is 64, the minimal is 0. If the size is less than 64-bit, the |
3+| This generic defines the total LSB-aligned size of each HPM counter (`size([m]hpmcounter*h)` + |
`size([m]hpmcounter*)`). The maximum value is 64, the minimum is 0. If the size is less than 64 bits, the |
unused MSB-aligned counter bits are hardwired to zero. |
|====== |
|
490,7 → 505,7
[frame="all",grid="none"] |
|====== |
| **MEM_INT_IMEM_SIZE** | _natural_ | 16*1024 |
3+| Size in bytes of the processor internal instruction memory (IMEM). Has no effect when _MEM_INT_IMEM_EN_ is _false_. |
3+| Size in bytes of the processor internal instruction memory (IMEM). Has no effect when <<_mem_int_imem_en>> is _false_. |
|====== |
|
|
519,7 → 534,7
[frame="all",grid="none"] |
|====== |
| **MEM_INT_DMEM_SIZE** | _natural_ | 8*1024 |
3+| Size in bytes of the processor-internal data memory (DMEM). Has no effect when _MEM_INT_DMEM_EN_ is _false_. |
3+| Size in bytes of the processor-internal data memory (DMEM). Has no effect when <<_mem_int_dmem_en>> is _false_. |
|====== |
|
|
537,7 → 552,8
[frame="all",grid="none"] |
|====== |
| **ICACHE_EN** | _boolean_ | false |
3+| Implement processor internal instruction cache when _true_. |
3+| Implement processor internal instruction cache when _true_. Note: if the setup only uses processor-internal data |
and instruction memories, there is no point in implementing the i-cache. |
|====== |
|
|
549,7 → 565,7
|====== |
| **ICACHE_NUM_BLOCKS** | _natural_ | 4 |
3+| Number of blocks (cache "pages" or "lines") in the instruction cache. Has to be a power of two. Has no |
effect when _ICACHE_DMEM_EN_ is false. |
effect when <<_icache_en>> is _false_. |
|====== |
|
|
561,7 → 577,7
|====== |
| **ICACHE_BLOCK_SIZE** | _natural_ | 64 |
3+| Size in bytes of each block in the instruction cache. Has to be a power of two. Has no effect when |
_ICACHE_EN_ is _false_. |
<<_icache_en>> is _false_. |
|====== |
|
|
573,7 → 589,7
|====== |
| **ICACHE_ASSOCIATIVITY** | _natural_ | 1 |
3+| Associativity (= number of sets) of the instruction cache. Has to be a power of two. Allowed configurations: |
`1` = 1 set, direct mapped; `2` = 2-way set-associative. Has no effect when _ICACHE_EN_ is _false_. |
`1` = 1 set, direct mapped; `2` = 2-way set-associative. Has no effect when <<_icache_en>> is _false_. |
|====== |
|
|
602,7 → 618,8
[frame="all",grid="none"] |
|====== |
| **MEM_EXT_TIMEOUT** | _natural_ | 255 |
3+| Clock cycles after which a pending external bus access will auto-terminate and raise a bus fault exception. Set to 0 to disable auto-timeout. |
3+| Clock cycles after which a pending external bus access will auto-terminate and raise a bus fault exception. |
If set to zero, there will be no auto-timeout and no bus fault exception (might permanently stall system!). |
|====== |
|
|
613,7 → 630,8
[frame="all",grid="none"] |
|====== |
| **MEM_EXT_PIPE_MODE** | _boolean_ | false |
3+| Use _standard_ ("classic") Wishbone protocol for external bus when _false_; use _pipelined_ Wishbone protocol when _true_. |
3+| Use _standard_ ("classic") Wishbone protocol for external bus when _false_. |
Use _pipelined_ Wishbone protocol when _true_. |
|====== |
|
|
624,7 → 642,7
[frame="all",grid="none"] |
|====== |
| **MEM_EXT_BIG_ENDIAN** | _boolean_ | false |
3+| Use BIG endian interface for external bus when _true_; use little endian interface when _false_. |
3+| Use BIG endian interface for external bus when _true_. Use little endian interface when _false_. |
|====== |
|
|
637,7 → 655,7
| **MEM_EXT_ASYNC_RX** | _boolean_ | false |
3+| By default, _MEM_EXT_ASYNC_RX_ = _false_ implements a registered read-back path (RX) for incoming data in the bus interface |
in order to shorten the critical path. By setting _MEM_EXT_ASYNC_RX_ = _true_ an _asynchronous_ ("direct") read-back path is |
implemented reducing access latency by one cycle. |
implemented, reducing access latency by one cycle but potentially lengthening the critical path. |
|====== |
|
|
718,7 → 736,7
|====== |
| **XIRQ_TRIGGER_TYPE** | _std_ulogic_vector(31 downto 0)_ | 0xFFFFFFFF |
3+| Interrupt trigger type configuration (one bit for each IRQ channel): `0` = level-triggered, `1` = edge-triggered. |
_XIRQ_TRIGGER_POLARITY_ generic is used to specify the actual level (high/low) or edge (falling/rising). |
<<_xirq_trigger_polarity>> generic is used to specify the actual level (high/low) or edge (falling/rising). |
|====== |
|
|
730,7 → 748,7
|====== |
| **XIRQ_TRIGGER_POLARITY** | _std_ulogic_vector(31 downto 0)_ | 0xFFFFFFFF |
3+| Interrupt trigger polarity configuration (one bit for each IRQ channel): `0` = low-level/falling-edge, |
'1' = high-level/rising-edge. _XIRQ_TRIGGER_TYPE_ generic is used to specify the actual type (level or edge). |
`1` = high-level/rising-edge. The <<_xirq_trigger_type>> generic is used to specify the actual type (level or edge). |
|====== |
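The two XIRQ generics above are read together: bit _n_ of the trigger type selects level or edge, and bit _n_ of the trigger polarity selects low/falling or high/rising. The following is a minimal host-side sketch of that combination. The enum and the function name `xirq_channel_mode` are hypothetical and are not part of the NEORV32 software framework.

```c
#include <assert.h>
#include <stdint.h>

/* Resulting trigger modes; the encoding is (type << 1) | polarity. */
typedef enum {
  XIRQ_LOW_LEVEL    = 0, /* type = 0, polarity = 0 */
  XIRQ_HIGH_LEVEL   = 1, /* type = 0, polarity = 1 */
  XIRQ_FALLING_EDGE = 2, /* type = 1, polarity = 0 */
  XIRQ_RISING_EDGE  = 3  /* type = 1, polarity = 1 */
} xirq_mode_t;

/* Hypothetical helper: derive the trigger mode of one IRQ channel (0..31)
 * from the values given to XIRQ_TRIGGER_TYPE and XIRQ_TRIGGER_POLARITY. */
xirq_mode_t xirq_channel_mode(uint32_t trigger_type, uint32_t trigger_polarity,
                              unsigned channel) {
  assert(channel < 32);
  uint32_t type = (trigger_type >> channel) & 1u;
  uint32_t pol  = (trigger_polarity >> channel) & 1u;
  return (xirq_mode_t)((type << 1) | pol);
}
```

With the default values (`0xFFFFFFFF` for both generics) every channel resolves to `XIRQ_RISING_EDGE`.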
|
|
/datasheet/soc_sysinfo.adoc
26,41 → 26,82
[options="header",grid="all"] |
|======================= |
| Address | Name [C] | Function |
| `0xffffffe0` | _SYSINFO_CLK_ | clock speed in Hz (via top's _CLOCK_FREQUENCY_ generic) |
| `0xffffffe4` | _SYSINFO_USER_CODE_ | custom user code, assigned via top's _USER_CODE_ generic |
| `0xffffffe8` | _SYSINFO_FEATURES_ | specific hardware configuration (see next table) |
| `0xffffffec` | _SYSINFO_CACHE_ | cache configuration information (see next table) |
| `0xfffffff0` | _SYSINFO_ISPACE_BASE_ | instruction address space base (defined via `ispace_base_c` constant in the `neorv32_package.vhd` file) |
| `0xfffffff4` | _SYSINFO_IMEM_SIZE_ | internal IMEM size in bytes (defined via top's _MEM_INT_IMEM_SIZE_ generic) |
| `0xfffffff8` | _SYSINFO_DSPACE_BASE_ | data address space base (defined via `sdspace_base_c` constant in the `neorv32_package.vhd` file) |
| `0xfffffffc` | _SYSINFO_DMEM_SIZE_ | internal DMEM size in bytes (defined via top's _MEM_INT_DMEM_SIZE_ generic) |
| `0xffffffe0` | _SYSINFO_CLK_ | clock speed in Hz (via top's <<_clock_frequency>> generic) |
| `0xffffffe4` | _SYSINFO_CPU_ | specific CPU configuration (see <<_sysinfo_cpu_configuration>>) |
| `0xffffffe8` | _SYSINFO_FEATURES_ | specific SoC configuration (see <<_sysinfo_soc_configuration>>) |
| `0xffffffec` | _SYSINFO_CACHE_ | cache configuration information (see <<_sysinfo_cache_configuration>>) |
| `0xfffffff0` | _SYSINFO_ISPACE_BASE_ | instruction address space base (via package's `ispace_base_c` constant) |
| `0xfffffff4` | _SYSINFO_IMEM_SIZE_ | internal IMEM size in bytes (via top's <<_mem_int_imem_size>> generic) |
| `0xfffffff8` | _SYSINFO_DSPACE_BASE_ | data address space base (via package's `sdspace_base_c` constant) |
| `0xfffffffc` | _SYSINFO_DMEM_SIZE_ | internal DMEM size in bytes (via top's <<_mem_int_dmem_size>> generic) |
|======================= |
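The register map above consists of eight consecutive 32-bit words starting at `0xffffffe0`. As a rough sketch, the layout can be mirrored by a C struct. The type name `sysinfo_regs_t`, its field names and the helper `sysinfo_reg_addr` are assumptions made for illustration; the actual framework may define these differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Address of SYSINFO_CLK, taken from the table above. */
#define SYSINFO_BASE_ADDR 0xFFFFFFE0u

/* Hypothetical mirror of the SYSINFO register map (one word per register). */
typedef struct {
  uint32_t clk;         /* 0xffffffe0: clock speed in Hz */
  uint32_t cpu;         /* 0xffffffe4: CPU configuration */
  uint32_t features;    /* 0xffffffe8: SoC configuration */
  uint32_t cache;       /* 0xffffffec: cache configuration */
  uint32_t ispace_base; /* 0xfffffff0: instruction address space base */
  uint32_t imem_size;   /* 0xfffffff4: internal IMEM size in bytes */
  uint32_t dspace_base; /* 0xfffffff8: data address space base */
  uint32_t dmem_size;   /* 0xfffffffc: internal DMEM size in bytes */
} sysinfo_regs_t;

/* Translate a struct offset into the absolute register address. */
uint32_t sysinfo_reg_addr(size_t offset) {
  return SYSINFO_BASE_ADDR + (uint32_t)offset;
}
```

On the target, such a struct would typically be accessed through a `const volatile` pointer to the base address; on a host machine only the offset arithmetic is meaningful.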
|
|
===== SYSINFO - CPU Configuration |
|
._SYSINFO_CPU_ bits |
[cols="^1,<10,<11"] |
[options="header",grid="all"] |
|======================= |
| Bit | Name [C] | Function |
| `0` | _SYSINFO_CPU_ZICSR_ | `Zicsr` extension (`I` sub-extension) available when set (via top's <<_cpu_extension_riscv_zicsr>> generic) |
| `1` | _SYSINFO_CPU_ZIFENCEI_ | `Zifencei` extension (`I` sub-extension) available when set (via top's <<_cpu_extension_riscv_zifencei>> generic) |
| `2` | _SYSINFO_CPU_ZMMUL_ | `Zmmul` extension (`M` sub-extension) available when set (via top's <<_cpu_extension_riscv_zmmul>> generic) |
| `3` | _SYSINFO_CPU_ZBB_ | `Zbb` extension (`B` sub-extension) available when set (via top's <<_cpu_extension_riscv_zbb>> generic) |
| `5` | _SYSINFO_CPU_ZFINX_ | `Zfinx` extension (`F` sub-/alternative-extension) available when set (via top's <<_cpu_extension_riscv_zfinx>> generic) |
| `6` | _SYSINFO_CPU_ZXSCNT_ | Custom extension - _Small_ CPU counters: `[m]cycle` & `[m]instret` CSRs are less than 64 bits wide when set (via top's <<_cpu_cnt_width>> generic) |
| `7` | _SYSINFO_CPU_ZXNOCNT_ | Custom extension - _NO_ CPU counters: `[m]cycle` & `[m]instret` CSRs are NOT available at all when set (via top's <<_cpu_cnt_width>> generic) |
| `8` | _SYSINFO_CPU_PMP_ | `PMP` (physical memory protection) extension available when set (via top's <<_>> generic) |
| `9` | _SYSINFO_CPU_HPM_ | `HPM` (hardware performance monitors) extension available when set (via top's <<_>> generic) |
| `10` | _SYSINFO_CPU_DEBUGMODE_ | RISC-V CPU `debug_mode` available when set (via top's <<_>> generic) |
| `30` | _SYSINFO_CPU_FASTMUL_ | fast multiplication available when set (via top's <<_fast_mul_en>> generic) |
| `31` | _SYSINFO_CPU_FASTSHIFT_ | fast shifts available when set (via top's <<_fast_shift_en>> generic) |
|======================= |
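Software can test individual bits of the _SYSINFO_CPU_ word to find out which sub-extensions are present. The sketch below works on a plain `uint32_t` value so it can be tried on a host. The enum and function names are hypothetical, and giving the "no counters" flag priority over the "small counters" flag is an assumption, not something the table states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the SYSINFO_CPU table; the names are hypothetical. */
typedef enum {
  CPU_BIT_ZICSR    = 0,
  CPU_BIT_ZIFENCEI = 1,
  CPU_BIT_ZMMUL    = 2,
  CPU_BIT_ZBB      = 3,
  CPU_BIT_ZFINX    = 5,
  CPU_BIT_ZXSCNT   = 6,
  CPU_BIT_ZXNOCNT  = 7
} cpu_bit_t;

typedef enum { CNT_FULL, CNT_SMALL, CNT_NONE } cpu_counters_t;

/* True if the given flag is set in a SYSINFO_CPU value. */
bool sysinfo_cpu_has(uint32_t sysinfo_cpu, cpu_bit_t bit) {
  return ((sysinfo_cpu >> (unsigned)bit) & 1u) != 0u;
}

/* Classify the [m]cycle / [m]instret counter configuration.
 * Assumption: ZXNOCNT takes priority if both flags are set. */
cpu_counters_t sysinfo_cpu_counters(uint32_t sysinfo_cpu) {
  if (sysinfo_cpu_has(sysinfo_cpu, CPU_BIT_ZXNOCNT)) return CNT_NONE;
  if (sysinfo_cpu_has(sysinfo_cpu, CPU_BIT_ZXSCNT))  return CNT_SMALL;
  return CNT_FULL;
}
```

For example, the value `0x2B` has bits 0, 1, 3 and 5 set, which corresponds to `Zicsr`, `Zifencei`, `Zbb` and `Zfinx` with full-width counters.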
|
|
===== SYSINFO - SoC Configuration |
|
._SYSINFO_FEATURES_ bits |
[cols="^1,<10,<11"] |
[options="header",grid="all"] |
|======================= |
| Bit | Name [C] | Function |
| `0` | _SYSINFO_FEATURES_BOOTLOADER_ | set if the processor-internal bootloader is implemented (via top's _INT_BOOTLOADER_EN_ generic) |
| `1` | _SYSINFO_FEATURES_MEM_EXT_ | set if the external Wishbone bus interface is implemented (via top's _MEM_EXT_EN_ generic) |
| `2` | _SYSINFO_FEATURES_MEM_INT_IMEM_ | set if the processor-internal DMEM implemented (via top's _MEM_INT_DMEM_EN_ generic) |
| `3` | _SYSINFO_FEATURES_MEM_INT_DMEM_ | set if the processor-internal IMEM is implemented (via top's _MEM_INT_IMEM_EN_ generic) |
| `4` | _SYSINFO_FEATURES_MEM_EXT_ENDIAN_ | set if external bus interface uses BIG-endian byte-order (via top's _MEM_EXT_BIG_ENDIAN_ generic) |
| `5` | _SYSINFO_FEATURES_ICACHE_ | set if processor-internal instruction cache is implemented (via _ICACHE_EN_ generic) |
| `14` | _SYSINFO_FEATURES_HW_RESET_ | set if on-chip debugger implemented (via _ON_CHIP_DEBUGGER_EN_ generic) |
| `15` | _SYSINFO_FEATURES_HW_RST_ | set if a dedicated hardware reset of all core registers is implemented (via package's _dedicated_reset_c_ constant) |
| `16` | _SYSINFO_FEATURES_IO_GPIO_ | set if the GPIO is implemented (via top's _IO_GPIO_EN_ generic) |
| `17` | _SYSINFO_FEATURES_IO_MTIME_ | set if the MTIME is implemented (via top's _IO_MTIME_EN_ generic) |
| `18` | _SYSINFO_FEATURES_IO_UART0_ | set if the primary UART0 is implemented (via top's _IO_UART0_EN_ generic) |
| `19` | _SYSINFO_FEATURES_IO_SPI_ | set if the SPI is implemented (via top's _IO_SPI_EN_ generic) |
| `20` | _SYSINFO_FEATURES_IO_TWI_ | set if the TWI is implemented (via top's _IO_TWI_EN_ generic) |
| `21` | _SYSINFO_FEATURES_IO_PWM_ | set if the PWM is implemented (via top's _IO_PWM_EN_ generic) |
| `22` | _SYSINFO_FEATURES_IO_WDT_ | set if the WDT is implemented (via top's _IO_WDT_EN_ generic) |
| `23` | _SYSINFO_FEATURES_IO_CFS_ | set if the custom functions subsystem is implemented (via top's _IO_CFS_EN_ generic) |
| `0` | _SYSINFO_FEATURES_BOOTLOADER_ | set if the processor-internal bootloader is implemented (via top's <<_int_bootloader_en>> generic) |
| `1` | _SYSINFO_FEATURES_MEM_EXT_ | set if the external Wishbone bus interface is implemented (via top's <<_mem_ext_en>> generic) |
| `2` | _SYSINFO_FEATURES_MEM_INT_IMEM_ | set if the processor-internal IMEM is implemented (via top's <<_mem_int_imem_en>> generic) |
| `3` | _SYSINFO_FEATURES_MEM_INT_DMEM_ | set if the processor-internal DMEM is implemented (via top's <<_mem_int_dmem_en>> generic) |
| `4` | _SYSINFO_FEATURES_MEM_EXT_ENDIAN_ | set if external bus interface uses BIG-endian byte-order (via top's <<_mem_ext_big_endian>> generic) |
| `5` | _SYSINFO_FEATURES_ICACHE_ | set if processor-internal instruction cache is implemented (via top's <<_icache_en>> generic) |
| `14` | _SYSINFO_FEATURES_HW_RESET_ | set if the on-chip debugger is implemented (via top's <<_on_chip_debugger_en>> generic) |
| `15` | _SYSINFO_FEATURES_HW_RST_ | set if a dedicated hardware reset of all core registers is implemented (via package's `dedicated_reset_c` constant) |
| `16` | _SYSINFO_FEATURES_IO_GPIO_ | set if the GPIO is implemented (via top's <<_io_gpio_en>> generic) |
| `17` | _SYSINFO_FEATURES_IO_MTIME_ | set if the MTIME is implemented (via top's <<_io_mtime_en>> generic) |
| `18` | _SYSINFO_FEATURES_IO_UART0_ | set if the primary UART0 is implemented (via top's <<_io_uart0_en>> generic) |
| `19` | _SYSINFO_FEATURES_IO_SPI_ | set if the SPI is implemented (via top's <<_io_spi_en>> generic) |
| `20` | _SYSINFO_FEATURES_IO_TWI_ | set if the TWI is implemented (via top's <<_io_twi_en>> generic) |
| `21` | _SYSINFO_FEATURES_IO_PWM_ | set if the PWM is implemented (via top's <<_io_pwm_en>> generic) |
| `22` | _SYSINFO_FEATURES_IO_WDT_ | set if the WDT is implemented (via top's <<_io_wdt_en>> generic) |
| `23` | _SYSINFO_FEATURES_IO_CFS_ | set if the custom functions subsystem is implemented (via top's <<_io_cfs_en>> generic) |
| `24` | _SYSINFO_FEATURES_IO_TRNG_ | set if the TRNG is implemented (via top's _IO_TRNG_EN_ generic) |
| `25` | _SYSINFO_FEATURES_IO_SLINK_ | set if the SLINK is implemented (via top's _SLINK_NUM_TX_ / _SLINK_NUM_RX_ generics) |
| `26` | _SYSINFO_FEATURES_IO_UART1_ | set if the secondary UART1 is implemented (via top's _IO_UART1_EN_ generic) |
| `27` | _SYSINFO_FEATURES_IO_NEOLED_ | set if the NEOLED is implemented (via top's _IO_NEOLED_EN_ generic) |
| `25` | _SYSINFO_FEATURES_IO_SLINK_ | set if the SLINK is implemented (via top's <<_slink_num_tx>> and/or <<_slink_num_rx>> generics) |
| `26` | _SYSINFO_FEATURES_IO_UART1_ | set if the secondary UART1 is implemented (via top's <<_io_uart1_en>> generic) |
| `27` | _SYSINFO_FEATURES_IO_NEOLED_ | set if the NEOLED is implemented (via top's <<_io_neoled_en>> generic) |
|======================= |
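The _SYSINFO_FEATURES_ bits can be combined to answer setup-level questions. The sketch below checks whether both the bootloader and UART0 are present, since the bootloader-based test setup uploads executables via UART, and counts the implemented IO modules in bits 16 to 27. The macro and function names are hypothetical, and treating "bootloader plus UART0" as the upload precondition is an illustrative assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the SYSINFO_FEATURES table; names are hypothetical. */
#define FEAT_BIT_BOOTLOADER 0u
#define FEAT_BIT_IO_UART0   18u
#define FEAT_IO_FIRST_BIT   16u /* IO_GPIO */
#define FEAT_IO_LAST_BIT    27u /* IO_NEOLED */

/* True if both the internal bootloader and UART0 are implemented. */
bool features_uart_upload_possible(uint32_t features) {
  bool bootloader = ((features >> FEAT_BIT_BOOTLOADER) & 1u) != 0u;
  bool uart0      = ((features >> FEAT_BIT_IO_UART0) & 1u) != 0u;
  return bootloader && uart0;
}

/* Count the IO module flags (bits 16..27) that are set. */
unsigned features_count_io_modules(uint32_t features) {
  unsigned count = 0;
  for (unsigned bit = FEAT_IO_FIRST_BIT; bit <= FEAT_IO_LAST_BIT; bit++) {
    count += (features >> bit) & 1u;
  }
  return count;
}
```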
|
|
===== SYSINFO - Cache Configuration |
|
[NOTE] |
Bit fields in this register are set to all-zero if the according cache is not implemented. |
|
._SYSINFO_CACHE_ bits |
[cols="^1,<10,<11"] |
[options="header",grid="all"] |
|======================= |
| Bit | Name [C] | Function |
| `3:0` | _SYSINFO_CACHE_IC_BLOCK_SIZE_3_ : _SYSINFO_CACHE_IC_BLOCK_SIZE_0_ | _log2_(i-cache block size in bytes), via top's <<_icache_block_size>> generic |
| `7:4` | _SYSINFO_CACHE_IC_NUM_BLOCKS_3_ : _SYSINFO_CACHE_IC_NUM_BLOCKS_0_ | _log2_(i-cache number of cache blocks), via top's <<_icache_num_blocks>> generic |
| `11:8` | _SYSINFO_CACHE_IC_ASSOCIATIVITY_3_ : _SYSINFO_CACHE_IC_ASSOCIATIVITY_0_ | _log2_(i-cache associativity), via top's <<_icache_associativity>> generic |
| `15:12` | _SYSINFO_CACHE_IC_REPLACEMENT_3_ : _SYSINFO_CACHE_IC_REPLACEMENT_0_ | i-cache replacement policy (`0001` = LRU if associativity > 0) |
| `31:16` | - | zero, reserved for d-cache |
|======================= |
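Because the cache fields store _log2_ values, software has to shift to recover the configured sizes. This is a minimal sketch of that decoding. It assumes that the associativity field occupies four bits starting at bit 8, as the `_3` : `_0` field names suggest; the function names are hypothetical.

```c
#include <stdint.h>

/* Extract a 4-bit field starting at bit position `lsb`. */
static uint32_t cache_field(uint32_t sysinfo_cache, unsigned lsb) {
  return (sysinfo_cache >> lsb) & 0xFu;
}

/* i-cache block size in bytes (bits 3:0 hold log2 of the size). */
uint32_t icache_block_size(uint32_t sysinfo_cache) {
  return 1u << cache_field(sysinfo_cache, 0);
}

/* Number of i-cache blocks (bits 7:4 hold log2 of the count). */
uint32_t icache_num_blocks(uint32_t sysinfo_cache) {
  return 1u << cache_field(sysinfo_cache, 4);
}

/* i-cache associativity (assumed to be log2-encoded in bits 11:8). */
uint32_t icache_associativity(uint32_t sysinfo_cache) {
  return 1u << cache_field(sysinfo_cache, 8);
}
```

With the default generics (`ICACHE_BLOCK_SIZE` = 64, `ICACHE_NUM_BLOCKS` = 4, `ICACHE_ASSOCIATIVITY` = 1) the lower bits would read `0x026`. Since all-zero fields also decode to 1, the _SYSINFO_FEATURES_ICACHE_ flag should be checked first to see whether a cache exists at all.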
/datasheet/soc_wishbone.adoc
133,7 → 133,7
|
**AXI4-Lite Connectivity** |
|
The AXI4-Lite wrapper (`rtl/templates/system/neorv32_SystemTop_axi4lite.vhd`) provides a Wishbone-to- |
The AXI4-Lite wrapper (`rtl/system_integration/neorv32_SystemTop_axi4lite.vhd`) provides a Wishbone-to- |
AXI4-Lite bridge, compatible with Xilinx Vivado (IP packager and block design editor). All entity signals of |
this wrapper are of type _std_logic_ or _std_logic_vector_, respectively. |
|
145,4 → 145,4
|
[WARNING] |
Using the auto-termination timeout feature (_MEM_EXT_TIMEOUT_ greater than zero) is **not AXI4 compliant** as the AXI protocol does not support canceling of |
bus transactions. Therefore, the NEORV32 top wrapper with AXI4-Lite interface (`rtl/templates/system/neorv32_SystemTop_axi4lite`) configures _MEM_EXT_TIMEOUT_ = 0 by default. |
bus transactions. Therefore, the NEORV32 top wrapper with AXI4-Lite interface (`rtl/system_integration/neorv32_SystemTop_axi4lite`) configures _MEM_EXT_TIMEOUT_ = 0 by default. |
/datasheet/software.adoc
124,6 → 124,7
exe - compile and generate <neorv32_exe.bin> executable for upload via bootloader |
hex - compile and generate <neorv32_exe.hex> executable raw file |
install - compile, generate and install VHDL IMEM boot image (for application) |
sim - in-console simulation using the default testbench and GHDL |
all - exe + hex + install |
elf_info - show ELF layout info |
clean - clean up project |
457,8 → 458,8
| `HWV` | Processor hardware version (from the `mimpid` CSR) in BCD format (example: `0x01040606` = v1.4.6.6). |
| `CLK` | Processor clock speed in Hz (via the SYSINFO module, from the _CLOCK_FREQUENCY_ generic). |
| `MISA` | CPU extensions (from the `misa` CSR). |
| `ZEXT` | CPU sub-extensions (from the `mzext` CSR) |
| `PROC` | Processor configuration (via the SYSINFO module, from the IO_* and MEM_* configuration generics). |
| `ZEXT` | CPU sub-extensions (via the _SYSINFO_CPU_ register in the SYSINFO module) |
| `PROC` | Processor configuration (via the _SYSINFO_FEATURES_ register in the SYSINFO module / from the IO_* and MEM_* configuration generics). |
| `IMEM` | IMEM memory base address and size in bytes (from the _MEM_INT_IMEM_SIZE_ generic). |
| `DMEM` | DMEM memory base address and size in bytes (from the _MEM_INT_DMEM_SIZE_ generic). |
|======================= |
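The `HWV` entry uses a BCD encoding in which each byte of `mimpid` holds two decimal digits. The following sketch decodes such a value into its four version fields. The function names `hwv_field` and `hwv_format` are hypothetical and not taken from the bootloader sources.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Decode one BCD-encoded version field; index 0 is the most significant byte. */
unsigned hwv_field(uint32_t mimpid, unsigned index) {
  assert(index < 4);
  uint32_t byte = (mimpid >> (8u * (3u - index))) & 0xFFu;
  return (unsigned)((byte >> 4) * 10u + (byte & 0x0Fu));
}

/* Format the version as "vA.B.C.D"; returns the snprintf result. */
int hwv_format(uint32_t mimpid, char *out, size_t out_len) {
  return snprintf(out, out_len, "v%u.%u.%u.%u",
                  hwv_field(mimpid, 0), hwv_field(mimpid, 1),
                  hwv_field(mimpid, 2), hwv_field(mimpid, 3));
}
```

For the example value from the table, `0x01040606`, the four fields decode to 1, 4, 6 and 6.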
/figures/neorv32_cpu.png
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
/figures/neorv32_processor.png
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
/references/bitmanip-draft.pdf
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
/references/riscv-privileged.pdf
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
/references/riscv-spec.pdf
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
/userguide/content.adoc
1,8 → 1,16
Let's Get It Started! |
|
To make your NEORV32 project run, follow the guides from the upcoming sections. Follow these guides |
step by step and in the presented order. |
This user guide uses the NEORV32 project _as is_ from the official `neorv32` repository. |
To make your first NEORV32 project run, follow the guides from the upcoming sections. It is recommended to |
follow these guides step by step and, ideally, in the presented order. |
|
[TIP] |
This guide uses the minimalistic and platform/toolchain agnostic SoC test setups from |
`rtl/test_setups` for illustration. You can use one of the provided test setups for |
your first FPGA tests. Alternatively, have a look at the `setups` folder, |
which provides more sophisticated example setups for various FPGAs/FPGA boards and toolchains. |
|
|
:sectnums: |
== Software Toolchain Setup |
|
9,20 → 17,15
To compile (and debug) executables for the NEORV32 a RISC-V toolchain is required. |
There are two possibilities to get this: |
|
1. Download and _build_ the official RISC-V GNU toolchain yourself |
1. Download and _build_ the official RISC-V GNU toolchain yourself. |
2. Download and install a prebuilt version of the toolchain; this might also be done via the package manager / app store of your OS. |
|
[TIP] |
The default toolchain prefix for this project is **`riscv32-unknown-elf-`**. Of course you can use any other RISC-V |
toolchain (like `riscv64-unknown-elf-`) that is capable to emit code for a `rv32` architecture. Just change the _RISCV_PREFIX_ variable in the application |
makefile(s) according to your needs or define this variable when invoking the makefile. |
[NOTE] |
The default toolchain prefix (`RISCV_PREFIX` variable) for this project is **`riscv32-unknown-elf-`**. Of course you can use any other RISC-V |
toolchain (like `riscv64-unknown-elf-`) that is capable of emitting code for an `rv32` architecture. Just change `RISCV_PREFIX` |
according to your needs. |
|
[IMPORTANT] |
Keep in mind that – for instance – a rv32imc toolchain only provides library code compiled with |
compressed (_C_) and `mul`/`div` instructions (_M_)! Hence, this code cannot be executed (without |
emulation) on an architecture without these extensions! |
|
|
:sectnums: |
=== Building the Toolchain from Scratch |
|
39,7 → 42,12
riscv-gnu-toolchain$ make |
---- |
|
[IMPORTANT] |
Keep in mind that – for instance – a toolchain built with `--with-arch=rv32imc` only provides library code compiled with |
compressed (`C`) and `mul`/`div` instructions (`M`)! Hence, this code cannot be executed (without |
emulation) on an architecture without these extensions! |
|
|
:sectnums: |
=== Downloading and Installing a Prebuilt Toolchain |
|
103,25 → 111,45
:sectnums: |
== General Hardware Setup |
|
This guide will setup a NEORV32 project for FPGA implementation (or simulation only) _from scratch_ |
This guide shows the basics of setting up a NEORV32 project for FPGA implementation (or simulation only) |
_from scratch_. It uses a _simplified_ test "SoC" setup of the processor to keep things simple at the beginning. |
This simple setup is intended for evaluation or as "hello world" project to check out the NEORV32 |
on _your_ FPGA board. |
|
[TIP] |
If you want to use a complete pre-defined setup to start with, check out the |
project's `setups` folder (https://github.com/stnolting/neorv32/tree/master/setups), |
which provides (script-based) demo setups for various FPGA boards and toolchains. |
If you want to use a more sophisticated pre-defined setup to start with, check out the |
`setups` folder, which provides example setups for various FPGAs, boards and toolchains. |
|
This tutorial uses a _simplified_ test setup of the processor |
to keeps things simple at the beginning as this setup is intended as |
evaluation or "hello world" project to check out the NEORV32. |
The NEORV32 project features two minimalistic pre-configured test setups in |
https://github.com/stnolting/neorv32/blob/master/rtl/test_setups[`rtl/test_setups`]. |
Both test setups only implement very basic processor and CPU features. |
The main difference between the two setups is the processor boot concept - so how to get a software executable |
_into_ the processor: |
|
* **`rtl/test_setups/neorv32_testsetup_approm.vhd`**: this setup does not require a connection via UART. The |
software executable is "installed" into the bitstream to initialize a read-only memory. Use this setup |
if your FPGA board does _not_ provide a UART interface. |
* **`rtl/test_setups/neorv32_testsetup_bootloader.vhd`**: this setup uses the UART and the default NEORV32 |
bootloader to upload new software executables. Use this setup if your board _does_ provide a UART interface. |
|
.NEORV32 "hello world" test setup (`rtl/test_setups/neorv32_testsetup_bootloader.vhd`) |
image::neorv32_test_setup.png[align=center] |
|
.External Clock Source |
[NOTE] |
These test setups are intended to be directly used as **design top entity**. Of course you can also instantiate them |
into another design unit. If your FPGA board only provides _very fast_ external clock sources (like on the FOMU board) |
you might need to add clock management components (PLLs, DCMs, MMCMs, ...) to the test setup or to the according top entity |
if you instantiate one of the test setups. |
|
[start=1] |
. Create a new project with your FPGA EDA tool of choice. |
. Add all VHDL files from the project's `rtl/core` folder to your project. Make sure to _reference_ the |
files only – do not copy them. |
. Add all VHDL files from the project's `rtl/core` folder to your project. |
. Make sure to add all the rtl files to a new library called `neorv32`. If your FPGA tool does not |
provide a field to enter the library name, check out the "properties" menu of the added rtl files. |
. The `rtl/core/neorv32_top.vhd` VHDL file is the top entity of the NEORV32 processor. If you |
already have a design, instantiate this unit into your design and proceed. |
. The `rtl/core/neorv32_top.vhd` VHDL file is the top entity of the NEORV32 processor, which can be |
instantiated into the "real" project. However, in this tutorial we will use one of the pre-defined |
test setups from `rtl/test_setups` (see above). |
|
[IMPORTANT] |
Make sure to include the `neorv32` package into your design when instantiating the processor: add |
128,92 → 156,95
`library neorv32;` and `use neorv32.neorv32_package.all;` to your design unit. |
|
[start=5] |
. If you do not have a design yet and just want to check out the NEORV32 – no problem! This guide |
uses a simplified top entity, that encapsulates the actual processor top entity: add the |
`rtl/templates/processor/neorv32_ProcessorTop_Test.vhd` VHDL file to your project, too, and |
select it as _top entity_. |
. This test setup provides a minimal test hardware setup: |
. Add the pre-defined test setup of choice to the project, too, and select it as _top entity_. |
. The entities of both test setups |
provide a minimal set of configuration generics that might have to be adapted to match your FPGA and board: |
|
.NEORV32 "hello world" test setup |
image::neorv32_test_setup.png[align=center] |
|
[start=7] |
. It only implements some very basic processor and CPU features. Also, only the |
minimum number of signals is propagated to the outer world. |
. However, a minimal setup-specific configuration of the NEORV32 processor is required to make it run |
on your FPGA board of choice. Only the absolutely required modifications will be made while |
keeping the default configuration for the remaining configuration options: |
|
.Cut-out of `neorv32_ProcessorTop_Test.vhd` showing the processor instance and its configuration |
.Test setup entity - configuration generics |
[source,vhdl] |
---- |
neorv32_top_inst: neorv32_top |
generic map ( |
-- General -- |
CLOCK_FREQUENCY => 100000000, -- in Hz # <1> |
INT_BOOTLOADER_EN => true, |
... |
-- Internal instruction memory -- |
MEM_INT_IMEM_EN => true, |
MEM_INT_IMEM_SIZE => 16*1024, # <2> |
-- Internal data memory -- |
MEM_INT_DMEM_EN => true, |
MEM_INT_DMEM_SIZE => 8*1024, # <3> |
... |
generic ( |
-- adapt these for your setup -- |
CLOCK_FREQUENCY : natural := 100000000; <1> |
MEM_INT_IMEM_SIZE : natural := 16*1024; <2> |
MEM_INT_DMEM_SIZE : natural := 8*1024 <3> |
); |
---- |
<1> Clock frequency of `clk_i` signal in Hertz |
<2> Default size of internal instruction memory: 16kB |
<3> Default size of internal data memory: 8kB |
|
[start=9] |
. There is one generic that has to be set according to your FPGA board setup: the actual clock frequency |
of the top's clock input signal (`clk_i`). Use the _CLOCK_FREQUENC_Y generic to specify your clock source's |
frequency in Hertz (Hz) (note "1"). |
. If you feel like it – or if your FPGA does not provide many resources – you can modify the |
**memory sizes** (_MEM_INT_IMEM_SIZE_ and _MEM_INT_DMEM_SIZE_ – marked with notes "2" and "3") or even |
exclude certain ISA extensions and peripheral modules from implementation - but as mentioned above, let's keep things |
simple at first and use the standard configuration for now. |
[start=7] |
. If you feel like it – or if your FPGA does not provide sufficient resources – you can modify the |
_memory sizes_ (`MEM_INT_IMEM_SIZE` and `MEM_INT_DMEM_SIZE` – marked with notes "2" and "3"). But as mentioned |
above, let's keep things simple at first and use the standard configuration for now. |
. There is one generic that _has to be set according to your FPGA board_ setup: the actual clock frequency |
of the top's clock input signal (`clk_i`). Use the `CLOCK_FREQUENCY` generic to specify your clock source's |
frequency in Hertz (Hz). |
|
[NOTE] |
If you have changed the default memory configuration (_MEM_INT_IMEM_SIZE_ and _MEM_INT_DMEM_SIZE_ generics) |
If you have changed the default memory configuration (`MEM_INT_IMEM_SIZE` and `MEM_INT_DMEM_SIZE` generics) |
keep those new sizes in mind – these values are required for setting |
up the software framework in the next section <<_general_software_framework_setup>>. |
|
[start=11] |
[start=9] |
. Depending on your FPGA tool of choice, it is time to assign the signals of the test setup top entity to |
the according pins of your FPGA board. All the signals can be found in the entity declaration: |
the according pins of your FPGA board. All the signals can be found in the entity declaration of the |
corresponding test setup: |
|
.Entity signals of `neorv32_test_setup.vhd` |
.Entity signals of `neorv32_testsetup_approm.vhd` |
[source,vhdl] |
---- |
entity neorv32_test_setup is |
port ( |
-- Global control -- |
clk_i : in std_ulogic := '0'; -- global clock, rising edge |
rstn_i : in std_ulogic := '0'; -- global reset, low-active, async |
clk_i : in std_ulogic; -- global clock, rising edge |
rstn_i : in std_ulogic; -- global reset, low-active, async |
-- GPIO -- |
gpio_o : out std_ulogic_vector(7 downto 0) -- parallel output |
); |
---- |
|
.Entity signals of `neorv32_testsetup_bootloader.vhd` |
[source,vhdl] |
---- |
port ( |
-- Global control -- |
clk_i : in std_ulogic; -- global clock, rising edge |
rstn_i : in std_ulogic; -- global reset, low-active, async |
-- GPIO -- |
gpio_o : out std_ulogic_vector(7 downto 0); -- parallel output |
-- UART0 -- |
uart0_txd_o : out std_ulogic; -- UART0 send data |
uart0_rxd_i : in std_ulogic := '0' -- UART0 receive data |
); |
end neorv32_test_setup; |
uart0_rxd_i : in std_ulogic -- UART0 receive data |
); |
---- |
|
[start=12] |
.Signal Polarity |
[NOTE] |
If your FPGA board has inverse polarity for certain input/output you can add `not` gates. Example: The reset signal |
`rstn_i` is low-active by default; the LEDs connected to `gpio_o` high-active by default. |
You can do this in your board top if you instantiate the test setup, |
or _inside_ the test setup if this is your top entity (low-active LEDs example: `gpio_o <= NOT con_gpio_o(7 downto 0);`). |
|
[start=10] |
. Attach the clock input `clk_i` to your clock source and connect the reset line `rstn_i` to a button of |
your FPGA board. Check whether it is low-active or high-active – the reset signal of the processor is |
**low-active**, so maybe you need to invert the input signal. |
. If possible, connected at least bit `0` of the GPIO output port `gpio_o` to a high-active LED (invert |
the signal when your LEDs are low-active). This LED will be used as status LED for the setup. |
. Finally, if your FPGA board provides a serial host interface (USB-to-serial converter) interface, |
connect the UART communication signals `uart0_txd_o` and `uart0_rxd_i`. |
. If possible, connect _at least_ bit `0` of the GPIO output port `gpio_o` to an LED (see "Signal Polarity" note above). |
. Finally, if you are using the UART-based test setup (`neorv32_testsetup_bootloader.vhd`) |
connect the UART communication signals `uart0_txd_o` and `uart0_rxd_i` to the host interface (e.g. USB-UART converter). |
. Perform the project HDL compilation (synthesis, mapping, bitstream generation). |
. Program the generated bitstream into your FPGA and press the button connected to the reset signal. |
. Done! The assigned status LED should be flashing now for some sections before permanently lighting up. |
. Done! The LED at `gpio_o(0)` should be flashing now. |
|
[TIP] |
After the GCC toolchain for compiling RISC-V source code is ready (chapter <<_general_software_framework_setup>>), |
you can advance to one of these chapters to learn how to get a software executable into your processor setup: |
* If you are using the `neorv32_testsetup_approm.vhd` setup: See section <<_installing_an_executable_directly_into_memory>>. |
* If you are using the `neorv32_testsetup_bootloader.vhd` setup: See section <<_uploading_and_starting_of_a_binary_executable_image_via_uart>>. |
|
|
|
<<< |
// #################################################################################################################### |
:sectnums: |
602,6 → 633,115
<<< |
// #################################################################################################################### |
:sectnums: |
== Application-Specific Processor Configuration |
|
Due to the processor's configuration options, which are mainly defined via the top entity VHDL generics, the SoC |
can be tailored to the application-specific requirements. Note that this chapter does not focus on optional |
_SoC features_ like IO/peripheral modules. It rather gives ideas on how to optimize for _overall goals_ |
like performance and area. |
|
[NOTE] |
Please keep in mind that optimizing the design in one direction (like performance) will also affect other potential |
optimization goals (like area and energy). |
|
=== Optimize for Performance |
|
The following points show some concepts to optimize the processor for performance regardless of the costs |
(i.e. increasing area and energy requirements): |
|
* Enable all performance-related RISC-V CPU extensions that implement dedicated hardware accelerators instead |
of emulating operations entirely in software: `M`, `C`, `Zfinx` |
* Enable mapping of complex CPU operations to dedicated hardware: `FAST_MUL_EN => true` to use DSP slices for |
multiplications, `FAST_SHIFT_EN => true` to use a fast barrel shifter for shift operations. |
* Implement the instruction cache: `ICACHE_EN => true` |
* Use as much _internal_ memory as possible to reduce memory access latency: `MEM_INT_IMEM_EN => true` and |
`MEM_INT_DMEM_EN => true`, maximize `MEM_INT_IMEM_SIZE` and `MEM_INT_DMEM_SIZE` |
* Increase the CPU's instruction prefetch buffer size: `CPU_IPB_ENTRIES` |
* _To be continued..._ |
|
|
=== Optimize for Size |
|
The NEORV32 is a size-optimized processor system that is intended to fit into tiny niches within large SoC |
designs or to be used as a customized microcontroller in really tiny / low-power FPGAs (like Lattice iCE40). |
Here are some ideas on how to make the processor even smaller while maintaining its _general purpose system_ |
concept and maximum RISC-V compatibility. |
|
**SoC** |
|
* This is obvious, but exclude all unused optional IO/peripheral modules from synthesis via the processor |
configuration generics. |
* If an IO module provides an option to configure the number of "channels", constrain this number to the |
actually required value (e.g. the PWM module `IO_PWM_NUM_CH` or the external interrupt controller `XIRQ_NUM_CH`). |
* Reduce the FIFO sizes of implemented modules (e.g. `SLINK_TX_FIFO`). |
* Disable the instruction cache (`ICACHE_EN => false`) if the design only uses processor-internal IMEM |
and DMEM memories. |
* _To be continued..._ |
|
**CPU** |
|
* Use the _embedded_ RISC-V CPU architecture extension (`CPU_EXTENSION_RISCV_E`) to reduce block RAM utilization. |
* The compressed instructions extension (`CPU_EXTENSION_RISCV_C`) requires additional logic for the decoder but |
also reduces program code size by approximately 30%. |
* If not explicitly used/required, constrain the CPU's counter sizes: `CPU_CNT_WIDTH` for `[m]instret[h]` |
(number of instructions) and `[m]cycle[h]` (number of cycles) counters. You can even remove these counters |
by setting `CPU_CNT_WIDTH => 0` if they are not used at all (note, this is not RISC-V compliant). |
* Reduce the CPU's prefetch buffer size (`CPU_IPB_ENTRIES`). |
* Map CPU shift operations to a small and iterative shifter unit (`FAST_SHIFT_EN => false`). |
* If you have unused DSP blocks available, you can map multiplication operations to those slices instead of |
using LUTs to implement the multiplier (`FAST_MUL_EN => true`). |
* If there is no need to execute division in hardware, use the `Zmmul` extension instead of the full-scale |
`M` extension. |
* Disable CPU extensions that are not explicitly used (`A`, `U`, `Zfinx`). |
* _To be continued..._ |
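
As a rough illustration of the SoC and CPU points above, a size-oriented configuration might look like the following
package of constants. Again, this is a sketch under assumptions: the package and its `_c` constants are hypothetical,
the channel counts and the prefetch buffer depth are arbitrary example values, and the generic names for the `A`, `U`,
`Zfinx` and `Zmmul` extensions are assumed to follow the `CPU_EXTENSION_RISCV_*` naming pattern.

[source, vhdl]
----
-- Hypothetical helper package (NOT part of the NEORV32 sources):
-- size-oriented values for the configuration generics discussed above.
package neorv32_cfg_size is

  -- SoC: constrain channel counts to what is actually needed (example values only)
  constant IO_PWM_NUM_CH_c : natural := 1;
  constant XIRQ_NUM_CH_c   : natural := 2;
  -- no instruction cache when only internal IMEM/DMEM are used
  constant ICACHE_EN_c     : boolean := false;

  -- CPU: embedded register file, compressed instructions for smaller code
  constant CPU_EXTENSION_RISCV_E_c : boolean := true;
  constant CPU_EXTENSION_RISCV_C_c : boolean := true;
  -- multiplication only, no hardware divider (generic name assumed)
  constant CPU_EXTENSION_RISCV_Zmmul_c : boolean := true;
  -- extensions that are not used by the application (generic names assumed)
  constant CPU_EXTENSION_RISCV_A_c     : boolean := false;
  constant CPU_EXTENSION_RISCV_U_c     : boolean := false;
  constant CPU_EXTENSION_RISCV_Zfinx_c : boolean := false;

  -- narrow cycle/instret counters (0 would remove them, which is not RISC-V compliant)
  constant CPU_CNT_WIDTH_c   : natural := 32;
  -- small prefetch buffer, example value only
  constant CPU_IPB_ENTRIES_c : natural := 2;
  -- iterative shifter instead of a barrel shifter
  constant FAST_SHIFT_EN_c   : boolean := false;
  -- set to true only if spare DSP blocks are available
  constant FAST_MUL_EN_c     : boolean := false;

end package neorv32_cfg_size;
----

Whether `CPU_EXTENSION_RISCV_C` is worth its extra decoder logic depends on the application: the smaller program image
may allow a smaller IMEM, which can outweigh the additional logic.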
|
=== Optimize for Clock Speed |
|
The NEORV32 Processor and CPU are designed to provide minimal logic between register stages to keep the |
critical path as short as possible. When enabling additional extensions or modules the impact on the existing |
logic is also kept at a minimum to prevent timing degradation. If there is a major impact on existing |
logic (example: many physical memory protection address configuration registers) the VHDL code automatically |
adds additional register stages to maintain critical path length. Obviously, this increases operation latency. |
|
In order to optimize for a minimal critical path (= maximum clock speed) the following points should be considered: |
|
* Complex CPU extensions (in terms of hardware requirements) should be avoided (examples: floating-point unit, physical memory protection). |
* Large carry chains (>32-bit) should be avoided (constrain CPU counter sizes: e.g. `CPU_CNT_WIDTH => 32` and `HPM_CNT_WIDTH => 32`). |
* If the target FPGA provides sufficient DSP resources, CPU multiplication operations can be mapped to DSP slices (`FAST_MUL_EN => true`) |
reducing LUT usage and critical path impact while also increasing overall performance. |
* Use the synchronous (registered) RX path configuration of the external memory interface (`MEM_EXT_ASYNC_RX => false`). |
* _To be continued..._ |
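
A timing-oriented starting point, based only on the generics named in the list above, could be sketched as follows. The
package and its `_c` constants are hypothetical (not part of the NEORV32 sources), and the `Zfinx` generic name is
assumed to follow the `CPU_EXTENSION_RISCV_*` naming pattern.

[source, vhdl]
----
-- Hypothetical helper package (NOT part of the NEORV32 sources):
-- values intended to keep the critical path short.
package neorv32_cfg_timing is

  -- avoid the hardware-intensive floating-point unit (generic name assumed)
  constant CPU_EXTENSION_RISCV_Zfinx_c : boolean := false;

  -- keep counter carry chains at 32 bits
  constant CPU_CNT_WIDTH_c : natural := 32;

  -- multiplier in DSP slices (only if the FPGA provides enough of them)
  constant FAST_MUL_EN_c : boolean := true;

  -- registered (synchronous) RX path of the external memory interface
  constant MEM_EXT_ASYNC_RX_c : boolean := false;

end package neorv32_cfg_timing;
----

The actual effect on the maximum clock frequency is technology dependent, so the result should be checked against the
timing report of the target FPGA toolchain.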
|
[NOTE] |
The short and fixed-length critical path allows the core to be integrated into existing clock domains, |
so no clock domain crossing and no sub-clock generation is required. However, for very high clock |
frequencies (this is technology / platform dependent) clock domain crossing becomes crucial for chip-internal |
connections. |
|
|
=== Optimize for Energy |
|
There are no _dedicated_ configuration options to optimize the processor for energy (minimal consumption; |
energy/instruction ratio) yet. However, a reduced processor area (<<_optimize_for_size>>) will also reduce |
static energy consumption. |
|
To optimize your setup for low-power applications, you can make use of the CPU sleep mode (`wfi` instruction). |
Put the CPU to sleep mode whenever possible. Disable all processor modules that are not actually used (exclude them |
from synthesis if they will _never_ be used; disable the module via its control register if the module is not |
_currently_ used). When in sleep mode, you can keep a timer module running (MTIME or the watchdog) to wake up |
the CPU again. Since the wake-up is triggered by _any_ interrupt, the external interrupt controller can also |
be used to wake up the CPU again. This way, all timers (and all other modules) can be deactivated as well. |
|
.Processor-internal clock generator shutdown |
[TIP] |
If _no_ IO/peripheral module is currently enabled, the processor's internal clock generator circuit will be |
shut down reducing switching activity and thus, dynamic energy consumption. |
|
|
|
<<< |
// #################################################################################################################### |
:sectnums: |
== Customizing the Internal Bootloader |
|
The NEORV32 bootloader provides several options to configure and customize it for a certain application setup. |
632,6 → 772,7
| `AUTO_BOOT_OCD_EN` | `0` | `0`, `1` | Set `1` to enable boot via on-chip debugger (OCD) |
| `AUTO_BOOT_TIMEOUT` | `8` | _any_ | Time in seconds after which the auto-boot sequence starts (if there is no UART input by the user); set to 0 to disable the auto-boot sequence |
4+^| SPI configuration |
| `SPI_EN` | `1` | `0`, `1` | Set `1` to enable the usage of the SPI module (including load/store executables from/to SPI flash options) |
| `SPI_FLASH_CS` | `0` | `0` ... `7` | SPI chip select output (`spi_csn_o`) for selecting flash |
| `SPI_FLASH_SECTOR_SIZE` | `65536` | _any_ | SPI flash sector size in bytes |
| `SPI_FLASH_CLK_PRSC` | `CLK_PRSC_8` | `CLK_PRSC_2` `CLK_PRSC_4` `CLK_PRSC_8` `CLK_PRSC_64` `CLK_PRSC_128` `CLK_PRSC_1024` `CLK_PRSC_2048` `CLK_PRSC_4096` | SPI clock pre-scaler (dividing main processor clock) |
820,19 → 961,13
:sectnums: |
== Simulating the Processor |
|
.WORK IN PROGRESS |
[WARNING] |
This Section Is Under Construction! + |
+ |
FIXME! |
|
:sectnums: |
=== Testbench |
|
The NEORV32 project features a simple default testbench (`sim/neorv32_tb.simple.vhd`) that can be used to simulate |
and test the processor setup. This testbench features a 100MHz clock and enables all optional peripheral and |
CPU extensions except for the `E` extension and the TRNG IO module (that CANNOT be simulated due to its |
combinatorial (looped) oscillator architecture). |
The NEORV32 project features a simple, plain-VHDL (no third-party libraries) default testbench (`sim/neorv32_tb.simple.vhd`) |
that can be used to simulate and test the processor setup. This testbench features a 100MHz clock and enables all optional |
peripheral and CPU extensions except for the `E` extension and the TRNG IO module (that CANNOT be simulated due to its |
combinatorial (looped) architecture). |
|
The simulation setup is configured via the "User Configuration" section located right at the beginning of |
the testbench's architecture. Each configuration constant provides comments to explain the functionality. |
860,26 → 995,17
| `0xff000000` | 4 bytes | `-/w/-, a, -/-/32` | memory-mapped register to trigger "machine external", "machine software" and "SoC Fast Interrupt" interrupts |
|======================= |
|
The simulated NEORV32 does not use the bootloader and directly boots the current application image (from |
the `rtl/core/neorv32_application_image.vhd` image file). Make sure to use the `all` target of the |
makefile to install your application as VHDL image after compilation: |
[NOTE] |
The simulated NEORV32 does not use the bootloader and _directly boots_ the current application image (from |
the `rtl/core/neorv32_application_image.vhd` image file). |
|
[source, bash] |
---- |
sw/example/blink_led$ make clean_all all |
---- |
|
.Simulation-Optimized CPU/Processors Modules |
.UART output during simulation |
[NOTE] |
The `sim/rtl_modules` folder provides simulation-optimized versions of certain CPU/processor modules. |
These alternatives can be used to replace the default CPU/processor HDL files to allow faster/easier/more |
efficient simulation. **These files are not intended for synthesis!** |
|
**Simulation Console Output** |
|
Data written to the NEORV32 UART0 / UART1 transmitter is sent to a virtual UART receiver implemented |
as part of the testbench. Received chars are sent to the simulator console and are also stored to a log file |
(`neorv32.testbench_uart0.out` for UART0, `neorv32.testbench_uart1.out` for UART1) inside the simulator home folder. |
(`neorv32.testbench_uart0.out` for UART0, `neorv32.testbench_uart1.out` for UART1) inside the simulation's home folder. |
**Please note that printing via the native UART receiver takes a lot of time.** For faster simulation console output |
see section <<_faster_simulation_console_output>>. |
|
|
:sectnums: |
909,7 → 1035,8
sw/example/blink_led$ make USER_FLAGS+=-DUART0_SIM_MODE clean_all all |
---- |
|
The provided define will change the default UART0/UART1 setup function in order to set the simulation mode flag in the according UART's control register. |
The provided define will change the default UART0/UART1 setup function in order to set the simulation |
mode flag in the according UART's control register. |
|
[NOTE] |
The UART simulation output (to file and to screen) outputs "complete lines" at once. A line is |
929,7 → 1056,92
---- |
|
|
:sectnums: |
=== In-Console Application Simulation |
|
To directly compile and run a program in the console (using the default testbench and GHDL |
as simulator) you can use the `sim` makefile target. Make sure to use the UART simulation mode |
(`USER_FLAGS+=-DUART0_SIM_MODE` and/or `USER_FLAGS+=-DUART1_SIM_MODE`) to get |
faster / direct-to-console UART output. |
|
[source, bash] |
---- |
sw/example/blink_led$ make USER_FLAGS+=-DUART0_SIM_MODE clean_all sim |
[...] |
Blinking LED demo program |
---- |
|
|
:sectnums: |
=== Hello World! |
|
To do a quick test of the NEORV32 make sure to have https://github.com/ghdl/ghdl[GHDL] and a |
https://github.com/stnolting/riscv-gcc-prebuilt[RISC-V gcc toolchain] installed, navigate to the project's |
`sw/example/hello_world` folder and run `make USER_FLAGS+=-DUART0_SIM_MODE MARCH=-march=rv32imac clean_all sim`: |
|
[TIP] |
The simulator will output some _sanity check_ notes (and warnings or even errors if something is ill-configured) |
right at the beginning of the simulation to give a brief overview of the actual NEORV32 SoC and CPU configurations. |
|
[source, bash] |
---- |
stnolting@Einstein:/mnt/n/Projects/neorv32/sw/example/hello_world$ make USER_FLAGS+=-DUART0_SIM_MODE MARCH=-march=rv32imac clean_all sim |
../../../sw/lib/source/neorv32_uart.c: In function 'neorv32_uart0_setup': |
../../../sw/lib/source/neorv32_uart.c:301:4: warning: #warning UART0_SIM_MODE (primary UART) enabled! Sending all UART0.TX data to text.io simulation output instead of real UART0 transmitter. Use this for simulations only! [-Wcpp] |
301 | #warning UART0_SIM_MODE (primary UART) enabled! Sending all UART0.TX data to text.io simulation output instead of real UART0 transmitter. Use this for simulations only! |
| ^~~~~~~ |
Memory utilization: |
text data bss dec hex filename |
4612 0 120 4732 127c main.elf |
Compiling ../../../sw/image_gen/image_gen |
Installing application image to ../../../rtl/core/neorv32_application_image.vhd |
Simulating neorv32_application_image.vhd... |
Tip: Compile application with USER_FLAGS+=-DUART[0/1]_SIM_MODE to auto-enable UART[0/1]'s simulation mode (redirect UART output to simulator console). |
Using simulation runtime args: --stop-time=10ms |
../rtl/core/neorv32_top.vhd:347:3:@0ms:(assertion note): NEORV32 PROCESSOR IO Configuration: GPIO MTIME UART0 UART1 SPI TWI PWM WDT CFS SLINK NEOLED XIRQ |
../rtl/core/neorv32_top.vhd:370:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Boot configuration: Direct boot from memory (processor-internal IMEM). |
../rtl/core/neorv32_top.vhd:394:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Implementing on-chip debugger (OCD). |
../rtl/core/neorv32_cpu.vhd:169:3:@0ms:(assertion note): NEORV32 CPU ISA Configuration (MARCH): RV32IMACU_Zbb_Zicsr_Zifencei_Zfinx_Debug |
../rtl/core/neorv32_cpu.vhd:189:3:@0ms:(assertion note): NEORV32 CPU CONFIG NOTE: Implementing NO dedicated hardware reset for uncritical registers (default, might reduce area). Set package constant <dedicated_reset_c> = TRUE to configure a DEFINED reset value for all CPU registers. |
../rtl/core/neorv32_imem.vhd:107:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Implementing processor-internal IMEM as ROM (16384 bytes), pre-initialized with application (4612 bytes). |
../rtl/core/neorv32_dmem.vhd:89:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Implementing processor-internal DMEM (RAM, 8192 bytes). |
../rtl/core/neorv32_wishbone.vhd:136:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: External Bus Interface - Implementing STANDARD Wishbone protocol. |
../rtl/core/neorv32_wishbone.vhd:140:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: External Bus Interface - Implementing auto-timeout (255 cycles). |
../rtl/core/neorv32_wishbone.vhd:144:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: External Bus Interface - Implementing LITTLE-endian byte order. |
../rtl/core/neorv32_wishbone.vhd:148:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: External Bus Interface - Implementing registered RX path. |
../rtl/core/neorv32_slink.vhd:161:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Implementing 8 RX and 8 TX stream links. |
|
## |
## ## ## ## |
## ## ######### ######## ######## ## ## ######## ######## ## ################ |
#### ## ## ## ## ## ## ## ## ## ## ## ## ## #### #### |
## ## ## ## ## ## ## ## ## ## ## ## ## ## ###### ## |
## ## ## ######### ## ## ######### ## ## ##### ## ## #### ###### #### |
## ## ## ## ## ## ## ## ## ## ## ## ## ## ###### ## |
## #### ## ## ## ## ## ## ## ## ## ## ## #### #### |
## ## ######### ######## ## ## ## ######## ########## ## ################ |
## ## ## ## |
## |
Hello world! :) |
---- |
|
|
:sectnums: |
=== Advanced Simulation using VUNIT |
|
.WORK IN PROGRESS |
[WARNING] |
This Section Is Under Construction! + |
+ |
FIXME! |
|
The NEORV32 provides a more sophisticated simulation setup using https://vunit.github.io/[VUNIT]. |
The according VUNIT-based testbench is `sim/neorv32_tb.vhd`. |
|
**WORK-IN-PROGRESS** |
|
|
|
<<< |
// #################################################################################################################### |
:sectnums: |
/attrs.adoc
1,7 → 1,7
:author: Dipl.-Ing. Stephan Nolting |
:email: stnolting@gmail.com |
:description: A size-optimized, customizable and open-source full-scale 32-bit RISC-V soft-core CPU and SoC written in platform-independent VHDL. |
:revnumber: v1.5.9 |
:revnumber: v1.6.0 |
:doctype: book |
:sectnums: |
:stem: |