OpenCores
URL https://opencores.org/ocsvn/neorv32/neorv32/trunk

Subversion Repositories neorv32

Compare Revisions

  • This comparison shows the changes necessary to convert path
    /neorv32/trunk/docs
    from Rev 62 to Rev 63

Rev 62 → Rev 63

/datasheet/cpu.adoc
13,6 → 13,7
** `E` - embedded CPU version (reduced register file size)
** `M` - integer multiplication and division hardware
** `U` - less-privileged _user_ mode
** `Zbb` - basic bit-manipulation operations
** `Zfinx` - single-precision floating-point unit
** `Zicsr` - control and status register access (privileged architecture)
** `Zifencei` - instruction stream synchronization
342,10 → 343,15
Volume II: Privileged Architecture_, which are available in the project's `docs/references` folder.
 
[TIP]
The CPU can discover available ISA extensions via the <<_misa>> and <<_mzext>> CSRs or by executing an instruction
and checking for an _illegal instruction exception_.
The CPU can discover available ISA extensions via the <<_misa>> CSR and the
_SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register
or by executing an instruction and checking for an _illegal instruction exception_.
 
[NOTE]
Executing an instruction from an extension that is not implemented or not enabled (for example via the according
top entity generic) will raise an _illegal instruction_ exception.
 
 
==== **`A`** - Atomic Memory Access
 
Atomic memory access instructions (for implementing semaphores and mutexes) are available when the
387,7 → 393,8
requirements. This extension is enabled when the `CPU_EXTENSION_RISCV_E` configuration generic is _true_. Accesses to registers beyond
`x15` will raise an _illegal instruction exception_.
 
Due to the reduced register file an alternate ABI (**`ilp32e`**) is required for the toolchain.
[IMPORTANT]
Due to the reduced register file size an alternate toolchain ABI (**`ilp32e`**) is required.
 
 
==== **`I`** - Base Integer ISA
439,10 → 446,10
 
* multiplication: `mul`, `mulh`, `mulhsu`, `mulhu`
 
If `Zmmul` is enabled, executing any division instruction from the `M` ISA (`div`, `divu`, `rem`, `remu`)
will raise an illegal instruction exception.
If `Zmmul` is enabled, executing any division instruction from the `M` ISA extension (`div`, `divu`, `rem`, `remu`)
will raise an _illegal instruction exception_.
 
Note that `M` and `Zmmul` extensions _cannot_ be enabled in parallel.
Note that `M` and `Zmmul` extensions _cannot_ be enabled at the same time.
 
[TIP]
If your RISC-V GCC toolchain does not (yet) support the `_Zmmul` ISA extensions, it can be "emulated"
452,7 → 459,7
 
==== **`U`** - Less-Privileged User Mode
 
Adds the less-privileged _user mode_ when the `CPU_EXTENSION_RISCV_U` configuration generic is _true_. For
Adds the less-privileged _user mode_ if the `CPU_EXTENSION_RISCV_U` configuration generic is _true_. For
instance, user-level code cannot access machine-mode CSRs. Furthermore, access to the address space (like
peripheral/IO devices) can be limited via the physical memory protection (_PMP_) unit for code running in user mode.
 
461,25 → 468,19
 
The NEORV32-specific extensions are always enabled and are indicated by the set `X` bit in the `misa` CSR.
 
[NOTE]
The CPU provides 16 _fast interrupt_ requests (`FIRQ`), which are controlled via custom bits in the `mie`
The most important points of the NEORV32-specific extensions are:
* The CPU provides 16 _fast interrupt_ requests (`FIRQ`), which are controlled via custom bits in the `mie`
and `mip` CSR. This extension is mapped to bits that are available for custom use (according to the
RISC-V specs). Also, custom trap codes for `mcause` are implemented.
* The CPU provides a single _non-maskable_ interrupt (`NMI`) that also provides a custom trap code for `mcause`.
* All undefined/unimplemented/malformed/illegal instructions do raise an illegal instruction exception (see <<_full_virtualization>>).
 
[NOTE]
The CPU provides a single _non-maskable_ interrupt (`NMI`) that also provides a custom trap code for `mcause`.
 
[NOTE]
A custom CSR `mzext` is available that can be used to check for implemented `Z*` CPU extensions
(for example `Zifencei`). This CSR is mapped to the official "custom CSR address region".
==== **`Zfinx`** Single-Precision Floating-Point Operations
 
[NOTE]
All undefined/unimplemented/malformed/illegal instructions do raise an illegal instruction exception
(see <<_full_virtualization>>).
[WARNING]
The NEORV32 `Zfinx` extension is specification-compliant and operational but still _experimental_.
 
 
==== **`Zfinx`** Single-Precision Floating-Point Operations
 
The `Zfinx` floating-point extension is an alternative to the `F` floating-point extension that uses the
integer register file `x` to store and operate on floating-point data (hence, `F-in-x`). Since no dedicated floating-point `f`
register file exists, the `Zfinx` extension requires fewer hardware resources and features faster context changes.
516,9 → 517,36
intrinsic library is provided to utilize the provided `Zfinx` floating-point extension from C-language
code (see `sw/example/floating_point_test`).
 
 
==== **`Zbb`** Basic Bit-Manipulation Operations
 
[WARNING]
The NEORV32 `Zbb` extension is specification-compliant and operational but still _experimental_.
 
The `Zbb` extension implements the _basic_ sub-set of the RISC-V bit-manipulation extensions `B`.
The official RISC-V specifications can be found here: https://github.com/riscv/riscv-bitmanip
 
The `Zbb` extension is implemented when the `CPU_EXTENSION_RISCV_Zbb` configuration
generic is _true_. In this case the following instructions are available:
 
* `andn`, `orn`, `xnor`
* `clz`, `ctz`, `cpop`
* `max`, `maxu`, `min`, `minu`
* `sext.b`, `sext.h`, `zext.h`
* `rol`, `ror`, `rori`
* `orc.b`, `rev8`
 
[TIP]
By default, the bit-manipulation unit uses an _iterative_ approach to compute shift-related operations
like `clz` and `rol`. To increase performance (at the cost of additional hardware resources) the
<<_fast_shift_en>> generic can be enabled to implement full-parallel logic (like barrel shifters) for all
shift-related `Zbb` instructions.
 
[IMPORTANT]
Note that any FPU instruction including all FPU-related CSR accesses will raise an illegal instruction exception
if the FPU is not enabled via the <<_mstatus>> CSR (`FS` bits).
The `Zbb` extension is frozen but not officially ratified yet. There is no
software support for this extension in the upstream GCC RISC-V port yet. However, an
intrinsic library is provided to utilize the provided `Zbb` extension from C-language
code (see `sw/example/bitmanip_test`).
 
 
==== **`Zicsr`** Control and Status Register Access / Privileged Architecture
706,6 → 734,10
| Floating-point - misc | `Zfinx` | `fsgnj.s` `fsgnjn.s` `fsgnjx.s` `fclass.s` | 12
| Floating-point - conversion | `Zfinx` | `fcvt.w.s` `fcvt.wu.s` | 47
| Floating-point - conversion | `Zfinx` | `fcvt.s.w` `fcvt.s.wu` | 48
| Basic bit-manip - logic | `Zbb` | `andn` `orn` `xnor` | 3
| Basic bit-manip - shift | `Zbb` | `clz` `ctz` `cpop` `rol` `ror` `rori` | 4+SA, FAST_SHIFT: 4
| Basic bit-manip - arith | `Zbb` | `max` `maxu` `min` `minu` | 3
| Basic bit-manip - misc | `Zbb` | `sext.b` `sext.h` `zext.h` `orc.b` `rev8` | 3
|=======================
 
[NOTE]
/datasheet/cpu_csr.adoc
107,8 → 107,6
| 0xf13 | <<_mimpid>> | _CSR_MIMPID_ | r/- | Machine implementation ID / version |
| 0xf14 | <<_mhartid>> | _CSR_MHARTID_ | r/- | Machine thread ID |
| 0xf15 | <<_mconfigptr>> | _CSR_MCONFIGPTR_ | r/- | Machine configuration pointer register |
6+^| **<<_neorv32_specific_custom_csrs>>**
| 0xfc0 | <<_mzext>> | _CSR_MZEXT_ | r/- | Available `Z*` CPU extensions |
|=======================
 
 
188,9 → 186,6
| Bit | Name [C] | R/W | Function
| 31 | _CSR_MSTATUS_SD_ | r/- | Read-only bit that is set if the FS field is not all-zero (state _OFF_)
| 21 | _CSR_MSTATUS_TW_ | r/w | Timeout wait: raise illegal instruction exception if `WFI` instruction is executed outside of M-mode when set
| 14:13 | _CSR_MSTATUS_FS_H_ : _CSR_MSTATUS_FS_L_ | r/w | Floating-point extension state; `00` = _OFF_, `11` = _DIRTY_; writing any other value will
always set _DIRTY_; if `FS` is _off_ all FPU instructions and FPU CSR access will raise an illegal instruction exception; these status bits are hardwired
to zero if no FPU is present (_CPU_MZEXT_ZFINX_ = false)
| 12:11 | _CSR_MSTATUS_MPP_H_ : _CSR_MSTATUS_MPP_L_ | r/w | Previous machine privilege level, 11 = machine (M) level, 00 = user (U) level
| 7 | _CSR_MSTATUS_MPIE_ | r/w | Previous machine global interrupt enable flag state
| 3 | _CSR_MSTATUS_MIE_ | r/w | Machine global interrupt enable flag
233,7 → 228,8
|=======================
 
[TIP]
Information regarding the implemented RISC-V `Z*` _sub-extensions_ (like `Zicsr` or `Zfinx`) can be found in the <<_mzext>> CSR.
Information regarding the implemented RISC-V `Z*` _sub-extensions_ (like `Zicsr` or `Zfinx`) can be found
in the _SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register.
 
 
:sectnums!:
512,16 → 508,16
[IMPORTANT]
If _CPU_CNT_WIDTH_ is less than 64 (the default value) and greater than or equal to 32, the according
MSBs of `[m]cycleh` and `[m]instreth` are read-only and always read as zero. This configuration
will also set the _ZXSCNT_ flag in the <<_mzext>> CSR. +
will also set the _SYSINFO_CPU_ZXSCNT_ flag in the _SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register. +
+
If _CPU_CNT_WIDTH_ is less than 32 and greater than 0, the `[m]cycleh` and `[m]instreth` do not
exist and any access will raise an illegal instruction exception. Furthermore, the according MSBs of
`[m]cycle` and `[m]instret` are read-only and always read as zero. This configuration will also
set the _ZXSCNT_ flag in the <<_mzext>> CSR. +
set the _SYSINFO_CPU_ZXSCNT_ flag in the _SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register. +
+
If _CPU_CNT_WIDTH_ is 0, <<_cycleh>> and <<_instreth>> / <<_mcycleh>> and <<_minstreth>> do not
exist and any access will raise an illegal instruction exception. This configuration will also set the
_ZXNOCNT_ flag in the <<_mzext>> CSR.
_SYSINFO_CPU_ZXNOCNT_ flag in the _SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register.
 
 
:sectnums!:
782,39 → 778,3
Software can traverse this data structure to discover information about the harts, the platform, and their configuration.
**NOTE: Not assigned yet.**
|======
 
 
 
<<<
// ####################################################################################################################
:sectnums:
==== NEORV32-Specific Custom CSRs
 
 
:sectnums!:
===== **`mzext`**
 
[cols="4,27,>7"]
[frame="topbot",grid="none"]
|======
| 0xfc0 | **Available Z* extensions** | `mzext`
3+| Reset value: _0x00000000_
3+| The `mzext` CSR is a custom read-only CSR that shows the implemented Z* extensions. The following bits
are implemented (all remaining bits are always zero). The entire CSR is read-only.
|======
 
.Machine counter-inhibit register
[cols="^1,<3,^1,<5"]
[options="header",grid="rows"]
|=======================
| Bit | Name [C] | R/W | Event
| 0 | _CPU_MZEXT_ZICSR_ | r/- | `Zicsr` extensions available (enabled via <<_cpu_extension_riscv_zicsr>> generic)
| 1 | _CPU_MZEXT_ZIFENCEI_ | r/- | `Zifencei` extensions available (enabled via <<_cpu_extension_riscv_zifencei>> generic)
| 2 | _CPU_MZEXT_ZMMUL_ | r/- | `Zmmul` extensions available (enabled via <<_cpu_extension_riscv_zmmul>> generic)
| 5 | _CPU_MZEXT_ZFINX_ | r/- | `Zfinx` extensions available (enabled via <<_cpu_extension_riscv_zfinx>> generic)
| 6 | _CPU_MZEXT_ZXSCNT_ | r/- | custom extension: "Small CPU counters": `cycle[h]` & `instret[h]` CSRs have less than 64-bit when set (when <<_cpu_cnt_width>> generic is less than 64)
| 7 | _CPU_MZEXT_ZXNOCNT_ | r/- | custom extension: "NO CPU counters": `cycle[h]` & `instret[h]` CSRs are not available at all when set (when <<_cpu_cnt_width>> generic is 0)
| 8 | _CSR_MZEXT_PMP_ | r/- | PMP (physical memory protection) extension available (<<_pmp_num_regions>> generic > 0)
| 9 | _CSR_MZEXT_HPM_ | r/- | HPM (hardware performance monitors) extension available (<<_hpm_num_cnts>> generic > 0)
| 10 | _CSR_MZEXT_DEBUGMODE_ | r/- | RISC-V "CPU debug mode" extension available (enabled via <<_cpu_top_entity_generics,_CPU_EXTENSION_RISCV_DEBUG_>> generic)
|=======================
/datasheet/overview.adoc
144,41 → 144,41
=== Project Folder Structure
 
...................................
neorv32 - Project home folder
neorv32 - Project home folder
│
├docs - Project documentation
│├datasheet - .adoc sources for NEORV32 data sheet
│├doxygen_build - Software framework documentation (generated by doxygen)
│├figures - Figures and logos
│├icons - Misc. symbols
│├references - Data sheets and RISC-V specs.
│└src_adoc - AsciiDoc sources for this document
├docs - Project documentation
│├datasheet - .adoc sources for NEORV32 data sheet
│├doxygen_build - Software framework documentation (generated by doxygen)
│├figures - Figures and logos
│├icons - Misc. symbols
│├references - Data sheets and RISC-V specs.
│└src_adoc - AsciiDoc sources for this document
│
├rtl - VHDL sources
│├core - Core sources of the CPU & SoC
│└templates - Alternate/additional top entities & wrappers
│ ├processor - Processor SoC wrappers
│ └system - System wrappers for advanced connectivity
├rtl - VHDL sources
│├core - Core sources of the CPU & SoC
│├processor_templates - Pre-configured SoC wrappers
│├system_integration - System wrappers for advanced connectivity
│└test_setups - Minimal test setup "SoCs" used in the User Guide
│
├setups - Example setups for various FPGAs, boards and toolchains
├setups - Example setups for various FPGAs, boards and toolchains
│└...
│
├sim - Simulation files (see User Guide)
├sim - Simulation files (see User Guide)
│
└sw - Software framework
├bootloader - Sources and scripts for the NEORV32 internal bootloader
├common - Linker script and crt0.S start-up code
├example - Various example programs
└sw - Software framework
├bootloader - Sources and scripts for the NEORV32 internal bootloader
├common - Linker script and crt0.S start-up code
├example - Various example programs
│└...
├isa-test
│├riscv-arch-test - RISC-V spec. compatibility test framework (submodule)
│└port-neorv32 - Port files for the official RISC-V architecture tests
├ocd_firmware - source code for on-chip debugger's "park loop"
├openocd - OpenOCD on-chip debugger configuration files
├image_gen - Helper program to generate NEORV32 executables
└lib - Processor core library
├include - Header files (*.h)
└source - Source files (*.c)
│├riscv-arch-test - RISC-V spec. compatibility test framework (submodule)
│└port-neorv32 - Port files for the official RISC-V architecture tests
├ocd_firmware - source code for on-chip debugger's "park loop"
├openocd - OpenOCD on-chip debugger configuration files
├image_gen - Helper program to generate NEORV32 executables
└lib - Processor core library
├include - Header files (*.h)
└source - Source files (*.c)
...................................
 
 
203,6 → 203,7
│
├neorv32_cpu.vhd - NEORV32 CPU top entity
│├neorv32_cpu_alu.vhd - Arithmetic/logic unit
││├neorv32_cpu_cp_bitmanip.vhd - Bit-manipulation co-processor (B ext.)
││├neorv32_cpu_cp_fpu.vhd - Floating-point co-processor (Zfinx ext.)
││├neorv32_cpu_cp_muldiv.vhd - Mul/Div co-processor (M extension)
││└neorv32_cpu_cp_shifter.vhd - Bit-shift co-processor
275,7 → 276,8
[TIP]
The CPU provides further options to reduce the area footprint (for example by constraining the CPU-internal
counter sizes) or to increase performance (for example by using a barrel-shifter; at cost of extra hardware).
See section <<_processor_top_entity_generics>> for more information.
See section <<_processor_top_entity_generics>> for more information. Also, take a look at the User Guide section
https://stnolting.github.io/neorv32/ug/#_application_specific_processor_configuration[Application-Specific Processor Configuration].
 
 
:sectnums:
335,6 → 337,10
This benchmark focuses on testing the capabilities of the CPU core itself rather than the performance of the whole
system. The according sources can be found in the `sw/example/coremark` folder.
 
.Dhrystone
[TIP]
A _simple_ port of the Dhrystone benchmark is also available in `sw/example/dhrystone`.
 
The resulting CoreMark score is defined as CoreMark iterations per second.
The execution time is determined via the RISC-V `[m]cycle[h]` CSRs. The relative CoreMark score is
defined as CoreMark score divided by the CPU's clock frequency in MHz.
/datasheet/soc.adoc
131,18 → 131,20
[TIP]
The NEORV32 generics allow you to configure the system according to your needs. The generics are
used to control implementation of certain CPU extensions and peripheral modules and even allow you to
optimize the system for certain design goals like minimal area or maximum performance.
optimize the system for certain design goals like minimal area or maximum performance. +
**More information can be found in the User Guide section
https://stnolting.github.io/neorv32/ug/#_application_specific_processor_configuration[Application-Specific Processor Configuration]**.
 
[TIP]
Privileged software can determine the actual CPU and processor configuration via the `misa` and
`mzext` (see <<_machine_trap_setup>> and <<_neorv32_specific_custom_csrs>>) CSRs and via the memory-mapped _SYSINFO_ module (see <<_system_configuration_information_memory_sysinfo>>),
respectively.
Privileged software can determine the actual CPU and processor configuration via the `misa` CSR and the
_SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register.
 
[TIP]
If optional modules (like CPU extensions or peripheral devices) are *not enabled* the according circuitry **will not be synthesized at all**.
Hence, the disabled modules do not increase area and power requirements and do not impact the timing.
[NOTE]
If optional modules (like CPU extensions or peripheral devices) are *not enabled* the according circuitry
**will not be synthesized at all**. Hence, the disabled modules do not increase area and power requirements
and do not impact the timing.
 
[TIP]
[NOTE]
Not all configuration combinations are valid. The processor RTL code provides sanity checks to inform the user
during synthesis/simulation if an invalid combination has been detected.
 
172,7 → 174,8
[frame="all",grid="none"]
|======
| **CLOCK_FREQUENCY** | _natural_ | _none_
3+| The clock frequency of the processor's `clk_i` input port in Hertz (Hz).
3+| The clock frequency of the processor's `clk_i` input port in Hertz (Hz). This value can be retrieved by software
from the <<_system_configuration_information_memory_sysinfo, SYSINFO>> module.
|======
 
 
190,17 → 193,6
 
 
:sectnums!:
===== _USER_CODE_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **USER_CODE** | _std_ulogic_vector(31 downto 0)_ | x"00000000"
3+| Custom user code that can be read by software via the _SYSINFO_ module.
|======
 
 
:sectnums!:
===== _HW_THREAD_ID_
 
[cols="4,4,2"]
207,7 → 199,8
[frame="all",grid="none"]
|======
| **HW_THREAD_ID** | _natural_ | 0
3+| The hart ID of the CPU. Can be read via the `mhartid` CSR. Hart IDs must be unique within a system.
3+| The hart ID of the CPU. Software can retrieve this value from the `mhartid` CSR.
Note that hart IDs must be unique within a system.
|======
 
 
218,7 → 211,8
[frame="all",grid="none"]
|======
| **ON_CHIP_DEBUGGER_EN** | _boolean_ | false
3+| Implement on-chip debugger (OCD). See chapter <<_on_chip_debugger_ocd>>.
3+| Implement the on-chip debugger (OCD) and the CPU debug mode.
See chapter <<_on_chip_debugger_ocd>> for more information.
|======
 
 
226,7 → 220,10
:sectnums:
==== RISC-V CPU Extensions
 
See section <<_instruction_sets_and_extensions>> for more information.
[TIP]
See section <<_instruction_sets_and_extensions>> for more information. The configuration of the RISC-V _main_ ISA extensions
(like `M`) can be determined via the <<_misa>> CSR. The configuration of ISA _sub-extensions_ (like `Zicsr`) and _extension options_
can be determined via memory-mapped registers of the <<_system_configuration_information_memory_sysinfo>> module.
 
 
:sectnums!:
248,8 → 245,8
[frame="all",grid="none"]
|======
| **CPU_EXTENSION_RISCV_C** | _boolean_ | false
3+| Implement compressed instructions (16-bit) when _true_.
See section <<_c_compressed_instructions>>.
3+| Implement compressed instructions (16-bit) when _true_. Compressed instructions can reduce program code
size by approx. 30%. See section <<_c_compressed_instructions>>.
|======
 
 
260,8 → 257,9
[frame="all",grid="none"]
|======
| **CPU_EXTENSION_RISCV_E** | _boolean_ | false
3+| Implement the embedded CPU extension (only implement the first 16 data registers) when _true_.
See section <<_e_embedded_cpu>>.
3+| Implement the embedded CPU extension (only implement the first 16 data registers) when _true_. This reduces embedded memory
requirements for the register file. See section <<_e_embedded_cpu>> for more information. Note that this RISC-V extension
requires a different application binary interface (ABI).
|======
 
 
272,8 → 270,11
[frame="all",grid="none"]
|======
| **CPU_EXTENSION_RISCV_M** | _boolean_ | false
3+| Implement integer multiplication and division instructions when _true_.
See section <<_m_integer_multiplication_and_division>>.
3+| Implement hardware accelerators for integer multiplication and division instructions when _true_.
If this extension is not enabled, multiplication and division operations (_not_ instructions) will be computed entirely in software.
If only a hardware multiplier is required use the <<_cpu_extension_riscv_zmmul>> extension. Multiplication can also be mapped
to DSP slices via the <<_fast_mul_en>> generic.
See section <<_m_integer_multiplication_and_division>> for more information.
|======
 
 
285,11 → 286,23
|======
| **CPU_EXTENSION_RISCV_U** | _boolean_ | false
3+| Implement less-privileged user mode when _true_.
See section <<_u_less_privileged_user_mode>>.
See section <<_u_less_privileged_user_mode>> for more information.
|======
 
 
:sectnums!:
===== _CPU_EXTENSION_RISCV_Zbb_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **CPU_EXTENSION_RISCV_Zbb** | _boolean_ | false
3+| Implement the `Zbb` _basic_ bit-manipulation sub-extension when _true_.
See section <<_zbb_basic_bit_manipulation_operations>> for more information.
|======
 
 
:sectnums!:
===== _CPU_EXTENSION_RISCV_Zfinx_
 
[cols="4,4,2"]
297,7 → 310,7
|======
| **CPU_EXTENSION_RISCV_Zfinx** | _boolean_ | false
3+| Implement the 32-bit single-precision floating-point extension (using integer registers) when _true_.
See section <<_zfinx_single_precision_floating_point_operations>>.
See section <<_zfinx_single_precision_floating_point_operations>> for more information.
|======
 
 
311,7 → 324,7
3+| Implement the control and status register (CSR) access instructions when true. Note: When this option is
disabled, the complete privileged architecture / trap system will be excluded from synthesis. Hence, no interrupts, no exceptions and
no machine information will be available.
See section <<_zicsr_control_and_status_register_access_privileged_architecture>>.
See section <<_zicsr_control_and_status_register_access_privileged_architecture>> for more information.
|======
 
 
323,8 → 336,8
|======
| **CPU_EXTENSION_RISCV_Zifencei** | _boolean_ | false
3+| Implement the instruction fetch synchronization instruction `fence.i`. For example, this option is required
for self-modifying code (and/or for i-cache flushes).
See section <<_zifencei_instruction_stream_synchronization>>.
for self-modifying code (and/or for instruction cache and CPU prefetch buffer flushes).
See section <<_zifencei_instruction_stream_synchronization>> for more information.
|======
 
 
335,8 → 348,8
[frame="all",grid="none"]
|======
| **CPU_EXTENSION_RISCV_Zmmul** | _boolean_ | false
3+| Implement integer multiplication-only instructions when _true_. This is a sub-extensions of the `M` extension.
See section <<_zmmul_integer_multiplication>>.
3+| Implement integer multiplication-only instructions when _true_. This is a sub-extension of the `M` extension, which
cannot be used together with the `M` extension. See section <<_zmmul_integer_multiplication>> for more information.
|======
 
 
354,9 → 367,11
[frame="all",grid="none"]
|======
| **FAST_MUL_EN** | _boolean_ | false
3+| When this generic is enabled, the multiplier of the `M` extension is realized using DSPs blocks instead of an
iterative bit-serial approach. This generic is only relevant when the multiplier and divider CPU extension is
enabled (<<_cpu_extension_riscv_m>> is _true_).
3+| When this generic is enabled, the multiplier of the `M` extension is implemented using DSP blocks instead of an
iterative bit-serial approach. Performance will be increased and LUT utilization will be reduced at the cost of DSP slice
utilization. This generic is only relevant when a hardware multiplier CPU extension is
enabled (<<_cpu_extension_riscv_m>> or <<_cpu_extension_riscv_zmmul>> is _true_). **Note that the multipliers of the
<<_zfinx_single_precision_floating_point_operations>> extension are always mapped to DSP blocks (if available).**
|======
 
 
367,9 → 382,11
[frame="all",grid="none"]
|======
| **FAST_SHIFT_EN** | _boolean_ | false
3+| When this generic is set _true_ the shifter unit of the CPU's ALU is implemented as fast barrel shifter (requiring
more hardware resources). If it is set _false_ the CPU uses a serial shifter that only performs a single bit shift per cycle
(small but slow).
3+| If this generic is set _true_ the shifter unit of the CPU's ALU is implemented as a fast barrel shifter (requiring
more hardware resources but completing within two clock cycles). If it is set _false_, the CPU uses a serial shifter
that only performs a single bit shift per cycle (requiring fewer hardware resources but up to 32 clock
cycles to complete, depending on the shift amount). **Note that this option also implements barrel shifters for _all_
shift-related operations of the <<_zbb_basic_bit_manipulation_operations>> extension.**
|======
 
 
381,9 → 398,8
|======
| **CPU_CNT_WIDTH** | _natural_ | 64
3+| This generic configures the total size of the CPU's `cycle` and `instret` CSRs (low word + high word).
The maximum value is 64, the minimum value is 0. See
section <<_machine_counters_and_timers>> for more information. Note: configurations with <<_cpu_cnt_width>>
less than 64 bits do not comply to the RISC-V specs.
The maximum value is 64, the minimum value is 0. See section <<_machine_counters_and_timers>> for more information.
Note: configurations with <<_cpu_cnt_width>> less than 64 bits do not comply with the RISC-V specs.
|======
 
 
396,8 → 412,7
| **CPU_IPB_ENTRIES** | _natural_ | 2
3+| This generic configures the number of entries in the CPU's instruction prefetch buffer (a FIFO).
The value has to be a power of two and has to be greater than zero.
Long linear sequences of code can benefit from an increased IPB size. For setups that use the instruction
cache (<<_icache_en>>) this generic should be set to 1.
Long linear sequences of code can benefit from an increased IPB size.
|======
 
 
416,8 → 431,8
|======
| **PMP_NUM_REGIONS** | _natural_ | 0
3+| Total number of implemented protection regions (0..64). If this generic is zero no physical memory
protection logic will be implemented at all. Setting <<_pmp_num_regions>>_ > 0 will set the _CSR_MZEXT_PMP_ flag
in the <<_mzext>> CSR.
protection logic will be implemented at all. Setting <<_pmp_num_regions>> > 0 will set the _SYSINFO_CPU_PMP_ flag
in the _SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register.
|======
 
 
446,9 → 461,9
[frame="all",grid="none"]
|======
| **HPM_NUM_CNTS** | _natural_ | 0
3+| Total number of implemented hardware performance monitor counters (0..29). If this generics is zero no
hardware performance monitor logic will be implemented at all. Setting <<_hpm_num_cnts>> > 0 will set the _CSR_MZEXT_HPM_ flag
in the <<_mzext>> CSR.
3+| Total number of implemented hardware performance monitor counters (0..29). If this generic is zero, no
hardware performance monitor logic will be implemented at all. Setting <<_hpm_num_cnts>> > 0 will set the _SYSINFO_CPU_HPM_ flag
in the _SYSINFO_CPU_ <<_system_configuration_information_memory_sysinfo, SYSINFO>> register.
|======
 
 
459,8 → 474,8
[frame="all",grid="none"]
|======
| **HPM_CNT_WIDTH** | _natural_ | 40
3+| This generic defines the total LSB-aligned size of each HPM counter (size(`[m]hpmcounter*h`) +
size(`[m]hpmcounter*`)). The maximum value is 64, the minimal is 0. If the size is less than 64-bit, the
3+| This generic defines the total LSB-aligned size of each HPM counter (`size([m]hpmcounter*h)` +
`size([m]hpmcounter*)`). The maximum value is 64, the minimal is 0. If the size is less than 64-bit, the
unused MSB-aligned counter bits are hardwired to zero.
|======
 
490,7 → 505,7
[frame="all",grid="none"]
|======
| **MEM_INT_IMEM_SIZE** | _natural_ | 16*1024
3+| Size in bytes of the processor internal instruction memory (IMEM). Has no effect when _MEM_INT_IMEM_EN_ is _false_.
3+| Size in bytes of the processor internal instruction memory (IMEM). Has no effect when <<_mem_int_imem_en>> is _false_.
|======
 
 
519,7 → 534,7
[frame="all",grid="none"]
|======
| **MEM_INT_DMEM_SIZE** | _natural_ | 8*1024
3+| Size in bytes of the processor-internal data memory (DMEM). Has no effect when _MEM_INT_DMEM_EN_ is _false_.
3+| Size in bytes of the processor-internal data memory (DMEM). Has no effect when <<_mem_int_dmem_en>> is _false_.
|======
 
 
537,7 → 552,8
[frame="all",grid="none"]
|======
| **ICACHE_EN** | _boolean_ | false
3+| Implement processor internal instruction cache when _true_.
3+| Implement processor internal instruction cache when _true_. Note: if the setup only uses processor-internal data
and instruction memories there is no point in implementing the i-cache.
|======
 
 
549,7 → 565,7
|======
| **ICACHE_NUM_BLOCKS** | _natural_ | 4
3+| Number of blocks (cache "pages" or "lines") in the instruction cache. Has to be a power of two. Has no
effect when _ICACHE_DMEM_EN_ is false.
effect when <<_icache_en>> is _false_.
|======
 
 
561,7 → 577,7
|======
| **ICACHE_BLOCK_SIZE** | _natural_ | 64
3+| Size in bytes of each block in the instruction cache. Has to be a power of two. Has no effect when
_ICACHE_EN_ is _false_.
<<_icache_en>> is _false_.
|======
 
 
573,7 → 589,7
|======
| **ICACHE_ASSOCIATIVITY** | _natural_ | 1
3+| Associativity (= number of sets) of the instruction cache. Has to be a power of two. Allowed configurations:
`1` = 1 set, direct mapped; `2` = 2-way set-associative. Has no effect when _ICACHE_EN_ is _false_.
`1` = 1 set, direct mapped; `2` = 2-way set-associative. Has no effect when <<_icache_en>> is _false_.
|======
 
 
602,7 → 618,8
[frame="all",grid="none"]
|======
| **MEM_EXT_TIMEOUT** | _natural_ | 255
3+| Clock cycles after which a pending external bus access will auto-terminate and raise a bus fault exception. Set to 0 to disable auto-timeout.
3+| Clock cycles after which a pending external bus access will auto-terminate and raise a bus fault exception.
If set to zero, there will be no auto-timeout and no bus fault exception (might permanently stall system!).
|======
 
 
613,7 → 630,8
[frame="all",grid="none"]
|======
| **MEM_EXT_PIPE_MODE** | _boolean_ | false
3+| Use _standard_ ("classic") Wishbone protocol for external bus when _false_; use _pipelined_ Wishbone protocol when _true_.
3+| Use _standard_ ("classic") Wishbone protocol for external bus when _false_.
Use _pipelined_ Wishbone protocol when _true_.
|======
 
 
624,7 → 642,7
[frame="all",grid="none"]
|======
| **MEM_EXT_BIG_ENDIAN** | _boolean_ | false
3+| Use BIG endian interface for external bus when _true_; use little endian interface when _false_.
3+| Use BIG endian interface for external bus when _true_. Use little endian interface when _false_.
|======
 
 
637,7 → 655,7
| **MEM_EXT_ASYNC_RX** | _boolean_ | false
3+| By default, _MEM_EXT_ASYNC_RX_ = _false_ implements a registered read-back path (RX) for incoming data in the bus interface
in order to shorten the critical path. By setting _MEM_EXT_ASYNC_RX_ = _true_ an _asynchronous_ ("direct") read-back path is
implemented reducing access latency by one cycle.
implemented, reducing access latency by one cycle but potentially lengthening the critical path.
|======
 
 
718,7 → 736,7
|======
| **XIRQ_TRIGGER_TYPE** | _std_ulogic_vector(31 downto 0)_ | 0xFFFFFFFF
3+| Interrupt trigger type configuration (one bit for each IRQ channel): `0` = level-triggered, `1` = edge-triggered.
_XIRQ_TRIGGER_POLARITY_ generic is used to specify the actual level (high/low) or edge (falling/rising).
The <<_xirq_trigger_polarity>> generic specifies the actual level (high/low) or edge (falling/rising).
|======
 
 
730,7 → 748,7
|======
| **XIRQ_TRIGGER_POLARITY** | _std_ulogic_vector(31 downto 0)_ | 0xFFFFFFFF
3+| Interrupt trigger polarity configuration (one bit for each IRQ channel): `0` = low-level/falling-edge,
'1' = high-level/rising-edge. _XIRQ_TRIGGER_TYPE_ generic is used to specify the actual type (level or edge).
`1` = high-level/rising-edge. The <<_xirq_trigger_type>> generic specifies the actual type (level or edge).
|======
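
Because the trigger of each channel is defined by one bit in _each_ of the two generics, it can help to derive both 32-bit values from a per-channel description. The following C sketch shows one possible way to do this; the type `xirq_mode_t` and the function `xirq_set` are hypothetical names introduced for illustration only.

```c
#include <stdint.h>

/* The four trigger modes that result from combining type and polarity. */
typedef enum {
  XIRQ_LOW_LEVEL,
  XIRQ_HIGH_LEVEL,
  XIRQ_FALLING_EDGE,
  XIRQ_RISING_EDGE
} xirq_mode_t;

/* Update bit `ch` (0..31) of the type and polarity values according to `mode`.
 * type:     0 = level-triggered,        1 = edge-triggered
 * polarity: 0 = low-level/falling-edge, 1 = high-level/rising-edge */
static void xirq_set(uint32_t *type, uint32_t *polarity, unsigned ch, xirq_mode_t mode) {
  const uint32_t bit = (uint32_t)1 << ch;
  const int edge = (mode == XIRQ_FALLING_EDGE) || (mode == XIRQ_RISING_EDGE);
  const int high = (mode == XIRQ_HIGH_LEVEL) || (mode == XIRQ_RISING_EDGE);
  *type     = edge ? (*type | bit)     : (*type & ~bit);
  *polarity = high ? (*polarity | bit) : (*polarity & ~bit);
}
```

Starting from the defaults (`0xFFFFFFFF` for both generics, i.e. rising-edge on all channels), the resulting values can be copied into the generic map of the top entity.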
 
 
/datasheet/soc_sysinfo.adoc
26,41 → 26,82
[options="header",grid="all"]
|=======================
| Address | Name [C] | Function
| `0xffffffe0` | _SYSINFO_CLK_ | clock speed in Hz (via top's _CLOCK_FREQUENCY_ generic)
| `0xffffffe4` | _SYSINFO_USER_CODE_ | custom user code, assigned via top's _USER_CODE_ generic
| `0xffffffe8` | _SYSINFO_FEATURES_ | specific hardware configuration (see next table)
| `0xffffffec` | _SYSINFO_CACHE_ | cache configuration information (see next table)
| `0xfffffff0` | _SYSINFO_ISPACE_BASE_ | instruction address space base (defined via `ispace_base_c` constant in the `neorv32_package.vhd` file)
| `0xfffffff4` | _SYSINFO_IMEM_SIZE_ | internal IMEM size in bytes (defined via top's _MEM_INT_IMEM_SIZE_ generic)
| `0xfffffff8` | _SYSINFO_DSPACE_BASE_ | data address space base (defined via `sdspace_base_c` constant in the `neorv32_package.vhd` file)
| `0xfffffffc` | _SYSINFO_DMEM_SIZE_ | internal DMEM size in bytes (defined via top's _MEM_INT_DMEM_SIZE_ generic)
| `0xffffffe0` | _SYSINFO_CLK_ | clock speed in Hz (via top's <<_clock_frequency>> generic)
| `0xffffffe4` | _SYSINFO_CPU_ | specific CPU configuration (see <<_sysinfo_cpu_configuration>>)
| `0xffffffe8` | _SYSINFO_FEATURES_ | specific SoC configuration (see <<_sysinfo_soc_configuration>>)
| `0xffffffec` | _SYSINFO_CACHE_ | cache configuration information (see <<_sysinfo_cache_configuration>>)
| `0xfffffff0` | _SYSINFO_ISPACE_BASE_ | instruction address space base (via package's `ispace_base_c` constant)
| `0xfffffff4` | _SYSINFO_IMEM_SIZE_ | internal IMEM size in bytes (via top's <<_mem_int_imem_size>> generic)
| `0xfffffff8` | _SYSINFO_DSPACE_BASE_ | data address space base (via package's `dspace_base_c` constant)
| `0xfffffffc` | _SYSINFO_DMEM_SIZE_ | internal DMEM size in bytes (via top's <<_mem_int_dmem_size>> generic)
|=======================
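
The register layout of the revised table can be mirrored in software by a plain structure of eight 32-bit words. The C sketch below is an illustration under assumptions: `sysinfo_regs_t`, `SYSINFO_BASE` and `SYSINFO_ADDR` are hypothetical names, not the definitions used by the NEORV32 software framework.

```c
#include <stddef.h>
#include <stdint.h>

/* Address of the first SYSINFO register according to the table. */
#define SYSINFO_BASE 0xFFFFFFE0u

/* Hypothetical overlay; the member order follows the table row by row. */
typedef struct {
  uint32_t clk;          /* 0xffffffe0: clock speed in Hz */
  uint32_t cpu;          /* 0xffffffe4: CPU configuration */
  uint32_t features;     /* 0xffffffe8: SoC configuration */
  uint32_t cache;        /* 0xffffffec: cache configuration */
  uint32_t ispace_base;  /* 0xfffffff0: instruction address space base */
  uint32_t imem_size;    /* 0xfffffff4: internal IMEM size in bytes */
  uint32_t dspace_base;  /* 0xfffffff8: data address space base */
  uint32_t dmem_size;    /* 0xfffffffc: internal DMEM size in bytes */
} sysinfo_regs_t;

/* Absolute address of one member of the overlay. */
#define SYSINFO_ADDR(member) (SYSINFO_BASE + (uint32_t)offsetof(sysinfo_regs_t, member))
```

On the target, a pointer of type `const volatile sysinfo_regs_t *` set to `SYSINFO_BASE` would give read access to the individual registers; on a host the structure is only useful for checking the offsets.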
 
 
===== SYSINFO - CPU Configuration
 
._SYSINFO_CPU_ bits
[cols="^1,<10,<11"]
[options="header",grid="all"]
|=======================
| Bit | Name [C] | Function
| `0` | _SYSINFO_CPU_ZICSR_ | `Zicsr` extension (`I` sub-extension) available when set (via top's <<_cpu_extension_riscv_zicsr>> generic)
| `1` | _SYSINFO_CPU_ZIFENCEI_ | `Zifencei` extension (`I` sub-extension) available when set (via top's <<_cpu_extension_riscv_zifencei>> generic)
| `2` | _SYSINFO_CPU_ZMMUL_ | `Zmmul` extension (`M` sub-extension) available when set (via top's <<_cpu_extension_riscv_zmmul>> generic)
| `3` | _SYSINFO_CPU_ZBB_ | `Zbb` extension (`B` sub-extension) available when set (via top's <<_cpu_extension_riscv_zbb>> generic)
| `5` | _SYSINFO_CPU_ZFINX_ | `Zfinx` extension (`F` sub-/alternative-extension) available when set (via top's <<_cpu_extension_riscv_zfinx>> generic)
| `6` | _SYSINFO_CPU_ZXSCNT_ | Custom extension - _Small_ CPU counters: `[m]cycle` & `[m]instret` CSRs are less than 64 bits wide when set (via top's <<_cpu_cnt_width>> generic)
| `7` | _SYSINFO_CPU_ZXNOCNT_ | Custom extension - _NO_ CPU counters: `[m]cycle` & `[m]instret` CSRs are NOT available at all when set (via top's <<_cpu_cnt_width>> generic)
| `8` | _SYSINFO_CPU_PMP_ | `PMP` (physical memory protection) extension available when set (via top's <<_>> generic)
| `9` | _SYSINFO_CPU_HPM_ | `HPM` (hardware performance monitors) extension available when set (via top's <<_>> generic)
| `10` | _SYSINFO_CPU_DEBUGMODE_ | RISC-V CPU `debug_mode` available when set (via top's <<_>> generic)
| `30` | _SYSINFO_CPU_FASTMUL_ | fast multiplication available when set (via top's <<_fast_mul_en>> generic)
| `31` | _SYSINFO_CPU_FASTSHIFT_ | fast shifts available when set (via top's <<_fast_shift_en>> generic)
|=======================
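
Software can test single flags of the _SYSINFO_CPU_ value with a shift and a mask. The following C sketch uses the bit positions from the table above; the enumeration only lists a subset of the flags, and the helper `sysinfo_cpu_has` is a hypothetical name. How the register value is read from the hardware is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the SYSINFO_CPU table (subset). */
enum {
  SYSINFO_CPU_ZICSR     = 0,
  SYSINFO_CPU_ZIFENCEI  = 1,
  SYSINFO_CPU_ZMMUL     = 2,
  SYSINFO_CPU_ZBB       = 3,
  SYSINFO_CPU_ZFINX     = 5,
  SYSINFO_CPU_FASTMUL   = 30,
  SYSINFO_CPU_FASTSHIFT = 31
};

/* True if flag `bit` (0..31) is set in the given SYSINFO_CPU register value. */
static bool sysinfo_cpu_has(uint32_t sysinfo_cpu, unsigned bit) {
  return ((sysinfo_cpu >> bit) & 1u) != 0u;
}
```

A program could, for example, select a software floating-point fallback when `sysinfo_cpu_has(value, SYSINFO_CPU_ZFINX)` returns `false`.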
 
 
===== SYSINFO - SoC Configuration
 
._SYSINFO_FEATURES_ bits
[cols="^1,<10,<11"]
[options="header",grid="all"]
|=======================
| Bit | Name [C] | Function
| `0` | _SYSINFO_FEATURES_BOOTLOADER_ | set if the processor-internal bootloader is implemented (via top's _INT_BOOTLOADER_EN_ generic)
| `1` | _SYSINFO_FEATURES_MEM_EXT_ | set if the external Wishbone bus interface is implemented (via top's _MEM_EXT_EN_ generic)
| `2` | _SYSINFO_FEATURES_MEM_INT_IMEM_ | set if the processor-internal DMEM implemented (via top's _MEM_INT_DMEM_EN_ generic)
| `3` | _SYSINFO_FEATURES_MEM_INT_DMEM_ | set if the processor-internal IMEM is implemented (via top's _MEM_INT_IMEM_EN_ generic)
| `4` | _SYSINFO_FEATURES_MEM_EXT_ENDIAN_ | set if external bus interface uses BIG-endian byte-order (via top's _MEM_EXT_BIG_ENDIAN_ generic)
| `5` | _SYSINFO_FEATURES_ICACHE_ | set if processor-internal instruction cache is implemented (via _ICACHE_EN_ generic)
| `14` | _SYSINFO_FEATURES_HW_RESET_ | set if on-chip debugger implemented (via _ON_CHIP_DEBUGGER_EN_ generic)
| `15` | _SYSINFO_FEATURES_HW_RST_ | set if a dedicated hardware reset of all core registers is implemented (via package's _dedicated_reset_c_ constant)
| `16` | _SYSINFO_FEATURES_IO_GPIO_ | set if the GPIO is implemented (via top's _IO_GPIO_EN_ generic)
| `17` | _SYSINFO_FEATURES_IO_MTIME_ | set if the MTIME is implemented (via top's _IO_MTIME_EN_ generic)
| `18` | _SYSINFO_FEATURES_IO_UART0_ | set if the primary UART0 is implemented (via top's _IO_UART0_EN_ generic)
| `19` | _SYSINFO_FEATURES_IO_SPI_ | set if the SPI is implemented (via top's _IO_SPI_EN_ generic)
| `20` | _SYSINFO_FEATURES_IO_TWI_ | set if the TWI is implemented (via top's _IO_TWI_EN_ generic)
| `21` | _SYSINFO_FEATURES_IO_PWM_ | set if the PWM is implemented (via top's _IO_PWM_EN_ generic)
| `22` | _SYSINFO_FEATURES_IO_WDT_ | set if the WDT is implemented (via top's _IO_WDT_EN_ generic)
| `23` | _SYSINFO_FEATURES_IO_CFS_ | set if the custom functions subsystem is implemented (via top's _IO_CFS_EN_ generic)
| `0` | _SYSINFO_FEATURES_BOOTLOADER_ | set if the processor-internal bootloader is implemented (via top's <<_int_bootloader_en>> generic)
| `1` | _SYSINFO_FEATURES_MEM_EXT_ | set if the external Wishbone bus interface is implemented (via top's <<_mem_ext_en>> generic)
| `2` | _SYSINFO_FEATURES_MEM_INT_IMEM_ | set if the processor-internal IMEM is implemented (via top's <<_mem_int_imem_en>> generic)
| `3` | _SYSINFO_FEATURES_MEM_INT_DMEM_ | set if the processor-internal DMEM is implemented (via top's <<_mem_int_dmem_en>> generic)
| `4` | _SYSINFO_FEATURES_MEM_EXT_ENDIAN_ | set if external bus interface uses BIG-endian byte-order (via top's <<_mem_ext_big_endian>> generic)
| `5` | _SYSINFO_FEATURES_ICACHE_ | set if processor-internal instruction cache is implemented (via top's <<_icache_en>> generic)
| `14` | _SYSINFO_FEATURES_HW_RESET_ | set if the on-chip debugger is implemented (via top's <<_on_chip_debugger_en>> generic)
| `15` | _SYSINFO_FEATURES_HW_RST_ | set if a dedicated hardware reset of all core registers is implemented (via package's `dedicated_reset_c` constant)
| `16` | _SYSINFO_FEATURES_IO_GPIO_ | set if the GPIO is implemented (via top's <<_io_gpio_en>> generic)
| `17` | _SYSINFO_FEATURES_IO_MTIME_ | set if the MTIME is implemented (via top's <<_io_mtime_en>> generic)
| `18` | _SYSINFO_FEATURES_IO_UART0_ | set if the primary UART0 is implemented (via top's <<_io_uart0_en>> generic)
| `19` | _SYSINFO_FEATURES_IO_SPI_ | set if the SPI is implemented (via top's <<_io_spi_en>> generic)
| `20` | _SYSINFO_FEATURES_IO_TWI_ | set if the TWI is implemented (via top's <<_io_twi_en>> generic)
| `21` | _SYSINFO_FEATURES_IO_PWM_ | set if the PWM is implemented (via top's <<_io_pwm_en>> generic)
| `22` | _SYSINFO_FEATURES_IO_WDT_ | set if the WDT is implemented (via top's <<_io_wdt_en>> generic)
| `23` | _SYSINFO_FEATURES_IO_CFS_ | set if the custom functions subsystem is implemented (via top's <<_io_cfs_en>> generic)
| `24` | _SYSINFO_FEATURES_IO_TRNG_ | set if the TRNG is implemented (via top's _IO_TRNG_EN_ generic)
| `25` | _SYSINFO_FEATURES_IO_SLINK_ | set if the SLINK is implemented (via top's _SLINK_NUM_TX_ / _SLINK_NUM_RX_ generics)
| `26` | _SYSINFO_FEATURES_IO_UART1_ | set if the secondary UART1 is implemented (via top's _IO_UART1_EN_ generic)
| `27` | _SYSINFO_FEATURES_IO_NEOLED_ | set if the NEOLED is implemented (via top's _IO_NEOLED_EN_ generic)
| `25` | _SYSINFO_FEATURES_IO_SLINK_ | set if the SLINK is implemented (via top's <<_slink_num_tx>> and/or <<_slink_num_rx>> generics)
| `26` | _SYSINFO_FEATURES_IO_UART1_ | set if the secondary UART1 is implemented (via top's <<_io_uart1_en>> generic)
| `27` | _SYSINFO_FEATURES_IO_NEOLED_ | set if the NEOLED is implemented (via top's <<_io_neoled_en>> generic)
|=======================
 
 
===== SYSINFO - Cache Configuration
 
[NOTE]
Bit fields in this register are set to all-zero if the corresponding cache is not implemented.
 
._SYSINFO_CACHE_ bits
[cols="^1,<10,<11"]
[options="header",grid="all"]
|=======================
| Bit | Name [C] | Function
| `3:0` | _SYSINFO_CACHE_IC_BLOCK_SIZE_3_ : _SYSINFO_CACHE_IC_BLOCK_SIZE_0_ | _log2_(i-cache block size in bytes), via top's <<_icache_block_size>> generic
| `7:4` | _SYSINFO_CACHE_IC_NUM_BLOCKS_3_ : _SYSINFO_CACHE_IC_NUM_BLOCKS_0_ | _log2_(i-cache number of cache blocks), via top's <<_icache_num_blocks>> generic
| `11:8` | _SYSINFO_CACHE_IC_ASSOCIATIVITY_3_ : _SYSINFO_CACHE_IC_ASSOCIATIVITY_0_ | _log2_(i-cache associativity), via top's <<_icache_associativity>> generic
| `15:12` | _SYSINFO_CACHE_IC_REPLACEMENT_3_ : _SYSINFO_CACHE_IC_REPLACEMENT_0_ | i-cache replacement policy (`0001` = LRU if associativity > 0)
| `31:16` | - | zero, reserved for d-cache
|=======================
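
Since the size fields hold _log2_ values, software has to shift a one by the field value to recover the actual numbers. The C sketch below decodes a _SYSINFO_CACHE_ value under the assumption that every i-cache field is four bits wide (bits 3:0, 7:4, 11:8 and 15:12); `icache_info_t` and `decode_sysinfo_cache` are hypothetical names.

```c
#include <stdint.h>

/* Decoded instruction cache configuration. */
typedef struct {
  uint32_t block_size;     /* bytes per cache block */
  uint32_t num_blocks;     /* number of cache blocks */
  uint32_t associativity;  /* 1 = direct mapped, 2 = 2-way set-associative */
  uint32_t replacement;    /* raw replacement policy field */
} icache_info_t;

/* Decode the four 4-bit i-cache fields; the three size fields are stored as log2. */
static icache_info_t decode_sysinfo_cache(uint32_t reg) {
  icache_info_t info;
  info.block_size    = (uint32_t)1 << ((reg >> 0) & 0xFu);
  info.num_blocks    = (uint32_t)1 << ((reg >> 4) & 0xFu);
  info.associativity = (uint32_t)1 << ((reg >> 8) & 0xFu);
  info.replacement   = (reg >> 12) & 0xFu;
  return info;
}
```

An all-zero register decodes to a size of one for every field, so the result is only meaningful if the _SYSINFO_FEATURES_ICACHE_ flag reports that the cache is implemented.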
/datasheet/soc_wishbone.adoc
133,7 → 133,7
 
**AXI4-Lite Connectivity**
 
The AXI4-Lite wrapper (`rtl/templates/system/neorv32_SystemTop_axi4lite.vhd`) provides a Wishbone-to-
The AXI4-Lite wrapper (`rtl/system_integration/neorv32_SystemTop_axi4lite.vhd`) provides a Wishbone-to-
AXI4-Lite bridge, compatible with Xilinx Vivado (IP packager and block design editor). All entity signals of
this wrapper are of type _std_logic_ or _std_logic_vector_, respectively.
 
145,4 → 145,4
 
[WARNING]
Using the auto-termination timeout feature (_MEM_EXT_TIMEOUT_ greater than zero) is **not AXI4 compliant** as the AXI protocol does not support canceling of
bus transactions. Therefore, the NEORV32 top wrapper with AXI4-Lite interface (`rtl/templates/system/neorv32_SystemTop_axi4lite`) configures _MEM_EXT_TIMEOUT_ = 0 by default.
bus transactions. Therefore, the NEORV32 top wrapper with AXI4-Lite interface (`rtl/system_integration/neorv32_SystemTop_axi4lite`) configures _MEM_EXT_TIMEOUT_ = 0 by default.
/datasheet/software.adoc
124,6 → 124,7
exe - compile and generate <neorv32_exe.bin> executable for upload via bootloader
hex - compile and generate <neorv32_exe.hex> executable raw file
install - compile, generate and install VHDL IMEM boot image (for application)
sim - in-console simulation using the default testbench and GHDL
all - exe + hex + install
elf_info - show ELF layout info
clean - clean up project
457,8 → 458,8
| `HWV` | Processor hardware version (from the `mimpid` CSR) in BCD format (example: `0x01040606` = v1.4.6.6).
| `CLK` | Processor clock speed in Hz (via the SYSINFO module, from the _CLOCK_FREQUENCY_ generic).
| `MISA` | CPU extensions (from the `misa` CSR).
| `ZEXT` | CPU sub-extensions (from the `mzext` CSR)
| `PROC` | Processor configuration (via the SYSINFO module, from the IO_* and MEM_* configuration generics).
| `ZEXT` | CPU sub-extensions (via the _SYSINFO_CPU_ register in the SYSINFO module).
| `PROC` | Processor configuration (via the _SYSINFO_FEATURES_ register in the SYSINFO module / from the IO_* and MEM_* configuration generics).
| `IMEM` | IMEM memory base address and size in bytes (from the _MEM_INT_IMEM_SIZE_ generic).
| `DMEM` | DMEM memory base address and size in bytes (from the _MEM_INT_DMEM_SIZE_ generic).
|=======================
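
The BCD format of the `HWV` value means that each byte holds two decimal digits. The C sketch below converts such a word into a readable version string, using the example value from the table; `bcd_byte` and `format_hwv` are hypothetical helper names and not part of the bootloader sources.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Convert one BCD byte (two decimal digits) into its binary value. */
static unsigned bcd_byte(uint8_t b) {
  return (unsigned)(b >> 4) * 10u + (unsigned)(b & 0x0Fu);
}

/* Format a BCD-encoded version word as "vA.B.C.D".
 * Returns the result of snprintf (number of characters that were needed). */
static int format_hwv(uint32_t hwv, char *buf, size_t len) {
  return snprintf(buf, len, "v%u.%u.%u.%u",
                  bcd_byte((uint8_t)(hwv >> 24)),
                  bcd_byte((uint8_t)(hwv >> 16)),
                  bcd_byte((uint8_t)(hwv >> 8)),
                  bcd_byte((uint8_t)hwv));
}
```

For the value `0x01040606` the buffer then contains `v1.4.6.6`.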
/figures/neorv32_cpu.png Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream
/figures/neorv32_processor.png Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream
/references/bitmanip-draft.pdf Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream
/references/riscv-privileged.pdf Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream
/references/riscv-spec.pdf Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream
/userguide/content.adoc
1,8 → 1,16
Let's Get It Started!
 
To make your NEORV32 project run, follow the guides from the upcoming sections. Follow these guides
step by step and in the presented order.
This user guide uses the NEORV32 project _as is_ from the official `neorv32` repository.
To make your first NEORV32 project run, follow the guides from the upcoming sections. It is recommended to
follow these guides step by step and, ideally, in the presented order.
 
[TIP]
This guide uses the minimalistic and platform/toolchain agnostic SoC test setups from
`rtl/test_setups` for illustration. You can use one of the provided test setups for
your first FPGA tests. Alternatively, have a look at the `setups` folder,
which provides more sophisticated example setups for various FPGAs/FPGA boards and toolchains.
 
 
:sectnums:
== Software Toolchain Setup
 
9,20 → 17,15
To compile (and debug) executables for the NEORV32 a RISC-V toolchain is required.
There are two possibilities to get this:
 
1. Download and _build_ the official RISC-V GNU toolchain yourself
1. Download and _build_ the official RISC-V GNU toolchain yourself.
2. Download and install a prebuilt version of the toolchain; this might also be done via the package manager / app store of your OS.
 
[TIP]
The default toolchain prefix for this project is **`riscv32-unknown-elf-`**. Of course you can use any other RISC-V
toolchain (like `riscv64-unknown-elf-`) that is capable to emit code for a `rv32` architecture. Just change the _RISCV_PREFIX_ variable in the application
makefile(s) according to your needs or define this variable when invoking the makefile.
[NOTE]
The default toolchain prefix (`RISCV_PREFIX` variable) for this project is **`riscv32-unknown-elf-`**. Of course you can use any other RISC-V
toolchain (like `riscv64-unknown-elf-`) that is capable of emitting code for a `rv32` architecture. Just change `RISCV_PREFIX`
according to your needs.
 
[IMPORTANT]
Keep in mind that – for instance – a rv32imc toolchain only provides library code compiled with
compressed (_C_) and `mul`/`div` instructions (_M_)! Hence, this code cannot be executed (without
emulation) on an architecture without these extensions!
 
 
:sectnums:
=== Building the Toolchain from Scratch
 
39,7 → 42,12
riscv-gnu-toolchain$ make
----
 
[IMPORTANT]
Keep in mind that – for instance – a toolchain built with `--with-arch=rv32imc` only provides library code compiled with
compressed (`C`) and `mul`/`div` instructions (`M`)! Hence, this code cannot be executed (without
emulation) on an architecture without these extensions!
 
 
:sectnums:
=== Downloading and Installing a Prebuilt Toolchain
 
103,25 → 111,45
:sectnums:
== General Hardware Setup
 
This guide will setup a NEORV32 project for FPGA implementation (or simulation only) _from scratch_
This guide shows the basics of setting up a NEORV32 project for FPGA implementation (or simulation only)
_from scratch_. It uses a _simplified_ test "SoC" setup of the processor to keep things simple at the beginning.
This simple setup is intended for evaluation or as a "hello world" project to check out the NEORV32
on _your_ FPGA board.
 
[TIP]
If you want to use a complete pre-defined setup to start with, check out the
project's `setups` folder (https://github.com/stnolting/neorv32/tree/master/setups),
which provides (script-based) demo setups for various FPGA boards and toolchains.
If you want to use a more sophisticated pre-defined setup to start with, check out the
`setups` folder, which provides example setups for various FPGAs, boards and toolchains.
 
This tutorial uses a _simplified_ test setup of the processor
to keeps things simple at the beginning as this setup is intended as
evaluation or "hello world" project to check out the NEORV32.
The NEORV32 project features two minimalistic pre-configured test setups in
https://github.com/stnolting/neorv32/blob/master/rtl/test_setups[`rtl/test_setups`].
Both test setups only implement very basic processor and CPU features.
The main difference between the two setups is the processor boot concept, that is, how to get a software executable
_into_ the processor:
 
* **`rtl/test_setups/neorv32_testsetup_approm.vhd`**: this setup does not require a connection via UART. The
software executable is "installed" into the bitstream to initialize a read-only memory. Use this setup
if your FPGA board does _not_ provide a UART interface.
* **`rtl/test_setups/neorv32_testsetup_bootloader.vhd`**: this setup uses the UART and the default NEORV32
bootloader to upload new software executables. Use this setup if your board _does_ provide a UART interface.
 
.NEORV32 "hello world" test setup (`rtl/test_setups/neorv32_testsetup_bootloader.vhd`)
image::neorv32_test_setup.png[align=center]
 
.External Clock Source
[NOTE]
These test setups are intended to be directly used as **design top entity**. Of course you can also instantiate them
into another design unit. If your FPGA board only provides _very fast_ external clock sources (like on the FOMU board)
you might need to add clock management components (PLLs, DCMs, MMCMs, ...) to the test setup or to the according top entity
if you instantiate one of the test setups.
 
[start=1]
. Create a new project with your FPGA EDA tool of choice.
. Add all VHDL files from the project's `rtl/core` folder to your project. Make sure to _reference_ the
files only – do not copy them.
. Add all VHDL files from the project's `rtl/core` folder to your project.
. Make sure to add all the rtl files to a new library called `neorv32`. If your FPGA tool does not
provide a field to enter the library name, check out the "properties" menu of the added rtl files.
. The `rtl/core/neorv32_top.vhd` VHDL file is the top entity of the NEORV32 processor. If you
already have a design, instantiate this unit into your design and proceed.
. The `rtl/core/neorv32_top.vhd` VHDL file is the top entity of the NEORV32 processor, which can be
instantiated into the "real" project. However, in this tutorial we will use one of the pre-defined
test setups from `rtl/test_setups` (see above).
 
[IMPORTANT]
Make sure to include the `neorv32` package into your design when instantiating the processor: add
128,92 → 156,95
`library neorv32;` and `use neorv32.neorv32_package.all;` to your design unit.
 
[start=5]
. If you do not have a design yet and just want to check out the NEORV32 – no problem! This guide
uses a simplified top entity, that encapsulates the actual processor top entity: add the
`rtl/templates/processor/neorv32_ProcessorTop_Test.vhd` VHDL file to your project, too, and
select it as _top entity_.
. This test setup provides a minimal test hardware setup:
. Add the pre-defined test setup of choice to the project, too, and select it as _top entity_.
. The entities of both test setups
provide a minimal set of configuration generics that might have to be adapted to match your FPGA and board:
 
.NEORV32 "hello world" test setup
image::neorv32_test_setup.png[align=center]
 
[start=7]
. It only implements some very basic processor and CPU features. Also, only the
minimum number of signals is propagated to the outer world.
. However, a minimal setup-specific configuration of the NEORV32 processor is required to make it run
on your FPGA board of choice. Only the absolutely required modifications will be made while
keeping the default configuration for the remaining configuration options:
 
.Cut-out of `neorv32_ProcessorTop_Test.vhd` showing the processor instance and its configuration
.Test setup entity - configuration generics
[source,vhdl]
----
neorv32_top_inst: neorv32_top
generic map (
-- General --
CLOCK_FREQUENCY => 100000000, -- in Hz # <1>
INT_BOOTLOADER_EN => true,
...
-- Internal instruction memory --
MEM_INT_IMEM_EN => true,
MEM_INT_IMEM_SIZE => 16*1024, # <2>
-- Internal data memory --
MEM_INT_DMEM_EN => true,
MEM_INT_DMEM_SIZE => 8*1024, # <3>
...
generic (
-- adapt these for your setup --
CLOCK_FREQUENCY : natural := 100000000; <1>
MEM_INT_IMEM_SIZE : natural := 16*1024; <2>
MEM_INT_DMEM_SIZE : natural := 8*1024 <3>
);
----
<1> Clock frequency of `clk_i` signal in Hertz
<2> Default size of internal instruction memory: 16kB
<3> Default size of internal data memory: 8kB
 
[start=9]
. There is one generic that has to be set according to your FPGA board setup: the actual clock frequency
of the top's clock input signal (`clk_i`). Use the _CLOCK_FREQUENC_Y generic to specify your clock source's
frequency in Hertz (Hz) (note "1").
. If you feel like it – or if your FPGA does not provide many resources – you can modify the
**memory sizes** (_MEM_INT_IMEM_SIZE_ and _MEM_INT_DMEM_SIZE_ – marked with notes "2" and "3") or even
exclude certain ISA extensions and peripheral modules from implementation - but as mentioned above, let's keep things
simple at first and use the standard configuration for now.
[start=7]
. If you feel like it – or if your FPGA does not provide sufficient resources – you can modify the
_memory sizes_ (`MEM_INT_IMEM_SIZE` and `MEM_INT_DMEM_SIZE` – marked with notes "2" and "3"). But as mentioned
above, let's keep things simple at first and use the standard configuration for now.
. There is one generic that _has to be set according to your FPGA board_ setup: the actual clock frequency
of the top's clock input signal (`clk_i`). Use the `CLOCK_FREQUENCY` generic to specify your clock source's
frequency in Hertz (Hz).
 
[NOTE]
If you have changed the default memory configuration (_MEM_INT_IMEM_SIZE_ and _MEM_INT_DMEM_SIZE_ generics)
If you have changed the default memory configuration (`MEM_INT_IMEM_SIZE` and `MEM_INT_DMEM_SIZE` generics)
keep those new sizes in mind – these values are required for setting
up the software framework in the next section <<_general_software_framework_setup>>.
 
[start=11]
[start=9]
. Depending on your FPGA tool of choice, it is time to assign the signals of the test setup top entity to
the according pins of your FPGA board. All the signals can be found in the entity declaration:
the according pins of your FPGA board. All the signals can be found in the entity declaration of the
corresponding test setup:
 
.Entity signals of `neorv32_test_setup.vhd`
.Entity signals of `neorv32_testsetup_approm.vhd`
[source,vhdl]
----
entity neorv32_test_setup is
port (
-- Global control --
clk_i : in std_ulogic := '0'; -- global clock, rising edge
rstn_i : in std_ulogic := '0'; -- global reset, low-active, async
clk_i : in std_ulogic; -- global clock, rising edge
rstn_i : in std_ulogic; -- global reset, low-active, async
-- GPIO --
gpio_o : out std_ulogic_vector(7 downto 0) -- parallel output
);
----
 
.Entity signals of `neorv32_testsetup_bootloader.vhd`
[source,vhdl]
----
port (
-- Global control --
clk_i : in std_ulogic; -- global clock, rising edge
rstn_i : in std_ulogic; -- global reset, low-active, async
-- GPIO --
gpio_o : out std_ulogic_vector(7 downto 0); -- parallel output
-- UART0 --
uart0_txd_o : out std_ulogic; -- UART0 send data
uart0_rxd_i : in std_ulogic := '0' -- UART0 receive data
);
end neorv32_test_setup;
uart0_rxd_i : in std_ulogic -- UART0 receive data
);
----
 
[start=12]
.Signal Polarity
[NOTE]
If your FPGA board has inverse polarity for certain inputs/outputs you can add `not` gates. Example: The reset signal
`rstn_i` is low-active by default; the LEDs connected to `gpio_o` are high-active by default.
You can do this in your board top if you instantiate the test setup,
or _inside_ the test setup if this is your top entity (low-active LEDs example: `gpio_o <= NOT con_gpio_o(7 downto 0);`).
 
[start=10]
. Attach the clock input `clk_i` to your clock source and connect the reset line `rstn_i` to a button of
your FPGA board. Check whether it is low-active or high-active – the reset signal of the processor is
**low-active**, so maybe you need to invert the input signal.
. If possible, connected at least bit `0` of the GPIO output port `gpio_o` to a high-active LED (invert
the signal when your LEDs are low-active). This LED will be used as status LED for the setup.
. Finally, if your FPGA board provides a serial host interface (USB-to-serial converter) interface,
connect the UART communication signals `uart0_txd_o` and `uart0_rxd_i`.
. If possible, connect _at least_ bit `0` of the GPIO output port `gpio_o` to an LED (see "Signal Polarity" note above).
. Finally, if you are using the UART-based test setup (`neorv32_testsetup_bootloader.vhd`)
connect the UART communication signals `uart0_txd_o` and `uart0_rxd_i` to the host interface (e.g. USB-UART converter).
. Perform the project HDL compilation (synthesis, mapping, bitstream generation).
. Program the generated bitstream into your FPGA and press the button connected to the reset signal.
. Done! The assigned status LED should be flashing now for some sections before permanently lighting up.
. Done! The LED at `gpio_o(0)` should be flashing now.
 
[TIP]
After the GCC toolchain for compiling RISC-V source code is ready (chapter <<_general_software_framework_setup>>),
you can advance to one of these chapters to learn how to get a software executable into your processor setup:
* If you are using the `neorv32_testsetup_approm.vhd` setup: See section <<_installing_an_executable_directly_into_memory>>.
* If you are using the `neorv32_testsetup_bootloader.vhd` setup: See section <<_uploading_and_starting_of_a_binary_executable_image_via_uart>>.
 
 
 
<<<
// ####################################################################################################################
:sectnums:
602,6 → 633,115
<<<
// ####################################################################################################################
:sectnums:
== Application-Specific Processor Configuration
 
Due to the processor's configuration options, which are mainly defined via the top entity VHDL generics, the SoC
can be tailored to the application-specific requirements. Note that this chapter does not focus on optional
_SoC features_ like IO/peripheral modules. It rather gives ideas on how to optimize for _overall goals_
like performance and area.
 
[NOTE]
Please keep in mind that optimizing the design in one direction (like performance) will also affect other potential
optimization goals (like area and energy).
 
=== Optimize for Performance
 
The following points show some concepts to optimize the processor for performance regardless of the costs
(i.e. increasing area and energy requirements):
 
* Enable all performance-related RISC-V CPU extensions that implement dedicated hardware accelerators instead
of emulating operations entirely in software: `M`, `C`, `Zfinx`
* Enable mapping of complex CPU operations to dedicated hardware: `FAST_MUL_EN => true` to use DSP slices for
multiplications, `FAST_SHIFT_EN => true` to use a fast barrel shifter for shift operations.
* Implement the instruction cache: `ICACHE_EN => true`
* Use as much _internal_ memory as possible to reduce memory access latency: `MEM_INT_IMEM_EN => true` and
`MEM_INT_DMEM_EN => true`, maximize `MEM_INT_IMEM_SIZE` and `MEM_INT_DMEM_SIZE`
* Increase the CPU's instruction prefetch buffer size: `CPU_IPB_ENTRIES`
* _To be continued..._
 
 
=== Optimize for Size
 
The NEORV32 is a size-optimized processor system that is intended to fit into tiny niches within large SoC
designs or to be used as a customized microcontroller in really tiny / low-power FPGAs (like Lattice iCE40).
Here are some ideas how to make the processor even smaller while maintaining its _general purpose system_
concept and maximum RISC-V compatibility.
 
**SoC**
 
* This is obvious, but exclude all unused optional IO/peripheral modules from synthesis via the processor
configuration generics.
* If an IO module provides an option to configure the number of "channels", constrain this number to the
actually required value (e.g. the PWM module `IO_PWM_NUM_CH` or the external interrupt controller `XIRQ_NUM_CH`).
* Reduce the FIFO sizes of implemented modules (e.g. `SLINK_TX_FIFO`).
* Disable the instruction cache (`ICACHE_EN => false`) if the design only uses processor-internal IMEM
and DMEM memories.
* _To be continued..._
 
**CPU**
 
* Use the _embedded_ RISC-V CPU architecture extension (`CPU_EXTENSION_RISCV_E`) to reduce block RAM utilization.
* The compressed instructions extension (`CPU_EXTENSION_RISCV_C`) requires additional logic for the decoder but
also reduces program code size by approximately 30%.
* If not explicitly used/required, constrain the CPU's counter sizes: `CPU_CNT_WIDTH` for `[m]instret[h]`
(number of instructions) and `[m]cycle[h]` (number of cycles) counters. You can even remove these counters
by setting `CPU_CNT_WIDTH => 0` if they are not used at all (note, this is not RISC-V compliant).
* Reduce the CPU's prefetch buffer size (`CPU_IPB_ENTRIES`).
* Map CPU shift operations to a small and iterative shifter unit (`FAST_SHIFT_EN => false`).
* If you have unused DSP blocks available, you can map multiplication operations to those slices instead of
using LUTs to implement the multiplier (`FAST_MUL_EN => true`).
* If there is no need to execute division in hardware, use the `Zmmul` extension instead of the full-scale
`M` extension.
* Disable CPU extensions that are not explicitly used (`A`, `U`, `Zfinx`).
* _To be continued..._
 
=== Optimize for Clock Speed
 
The NEORV32 Processor and CPU are designed to provide minimal logic between register stages to keep the
critical path as short as possible. When enabling additional extensions or modules the impact on the existing
logic is also kept at a minimum to prevent timing degradation. If there is a major impact on existing
logic (example: many physical memory protection address configuration registers) the VHDL code automatically
adds additional register stages to maintain critical path length. Obviously, this increases operation latency.
 
In order to optimize for a minimal critical path (= maximum clock speed) the following points should be considered:
 
* Complex CPU extensions (in terms of hardware requirements) should be avoided (examples: floating-point unit, physical memory protection).
* Large carry chains (>32-bit) should be avoided (constrain CPU counter sizes: e.g. `CPU_CNT_WIDTH => 32` and `HPM_NUM_CNTS => 32`).
* If the target FPGA provides sufficient DSP resources, CPU multiplication operations can be mapped to DSP slices (`FAST_MUL_EN => true`)
reducing LUT usage and critical path impact while also increasing overall performance.
* Use the synchronous (registered) RX path configuration of the external memory interface (`MEM_EXT_ASYNC_RX => false`).
* _To be continued..._
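
A minimal sketch of how these points might look in the generic map of the processor top entity is given below.
It is a hypothetical excerpt and only uses generics named in this section, except for
`CPU_EXTENSION_RISCV_Zfinx`, whose name is assumed by analogy to the other extension generics.

[source, vhdl]
----
-- Hypothetical excerpt of a timing-optimized configuration (all other generics omitted).
generic map (
  CPU_EXTENSION_RISCV_Zfinx => false, -- assumed generic name: avoid the complex floating-point unit
  CPU_CNT_WIDTH             => 32,    -- limit the carry chains of the CPU counters to 32 bit
  FAST_MUL_EN               => true,  -- map multiplications to DSP slices (if available)
  MEM_EXT_ASYNC_RX          => false  -- registered RX path of the external memory interface
)
----

The actual gain in maximum clock speed depends on the target technology and should be verified with the timing
report of the synthesis toolchain.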
 
[NOTE]
The short and fixed-length critical path allows integrating the core into existing clock domains,
so no clock domain crossing and no sub-clock generation is required. However, for very high clock
frequencies (this is technology / platform dependent) clock domain crossing becomes crucial for chip-internal
connections.
 
 
=== Optimize for Energy
 
There are no _dedicated_ configuration options to optimize the processor for energy (minimal consumption;
energy/instruction ratio) yet. However, a reduced processor area (<<_optimize_for_size>>) will also reduce
static energy consumption.
 
To optimize your setup for low-power applications, you can make use of the CPU sleep mode (`wfi` instruction).
Put the CPU into sleep mode whenever possible. Disable all processor modules that are not actually used (exclude them
from synthesis if they will _never_ be used; disable the module via its control register if the module is not
_currently_ used). When in sleep mode, you can keep a timer module running (MTIME or the watchdog) to wake up
the CPU again. Since the wake-up is triggered by _any_ interrupt, the external interrupt controller can also
be used to wake up the CPU again. This way, all timers (and all other modules) can be deactivated as well.
 
.Processor-internal clock generator shutdown
[TIP]
If _no_ IO/peripheral module is currently enabled, the processor's internal clock generator circuit will be
shut down reducing switching activity and thus, dynamic energy consumption.
 
 
 
<<<
// ####################################################################################################################
:sectnums:
== Customizing the Internal Bootloader
 
The NEORV32 bootloader provides several options to configure and customize it for a certain application setup.
632,6 → 772,7
| `AUTO_BOOT_OCD_EN` | `0` | `0`, `1` | Set `1` to enable boot via on-chip debugger (OCD)
| `AUTO_BOOT_TIMEOUT` | `8` | _any_ | Time in seconds after the auto-boot sequence starts (if there is no UART input by user); set to 0 to disable auto-boot sequence
4+^| SPI configuration
| `SPI_EN` | `1` | `0`, `1` | Set `1` to enable the usage of the SPI module (including load/store executables from/to SPI flash options)
| `SPI_FLASH_CS` | `0` | `0` ... `7` | SPI chip select output (`spi_csn_o`) for selecting flash
| `SPI_FLASH_SECTOR_SIZE` | `65536` | _any_ | SPI flash sector size in bytes
| `SPI_FLASH_CLK_PRSC` | `CLK_PRSC_8` | `CLK_PRSC_2` `CLK_PRSC_4` `CLK_PRSC_8` `CLK_PRSC_64` `CLK_PRSC_128` `CLK_PRSC_1024` `CLK_PRSC_2048` `CLK_PRSC_4096` | SPI clock pre-scaler (dividing main processor clock)
820,19 → 961,13
:sectnums:
== Simulating the Processor
 
.WORK IN PROGRESS
[WARNING]
This Section Is Under Construction! +
+
FIXME!
 
:sectnums:
=== Testbench
 
The NEORV32 project features a simple default testbench (`sim/neorv32_tb.simple.vhd`) that can be used to simulate
and test the processor setup. This testbench features a 100MHz clock and enables all optional peripheral and
CPU extensions except for the `E` extension and the TRNG IO module (that CANNOT be simulated due to its
combinatorial (looped) oscillator architecture).
The NEORV32 project features a simple, plain-VHDL (no third-party libraries) default testbench (`sim/neorv32_tb.simple.vhd`)
that can be used to simulate and test the processor setup. This testbench features a 100MHz clock and enables all optional
peripheral and CPU extensions except for the `E` extension and the TRNG IO module (that CANNOT be simulated due to its
combinatorial (looped) architecture).
 
The simulation setup is configured via the "User Configuration" section located right at the beginning of
the testbench's architecture. Each configuration constant provides comments to explain the functionality.
860,26 → 995,17
| `0xff000000` | 4 bytes | `-/w/-, a, -/-/32` | memory-mapped register to trigger "machine external", "machine software" and "SoC Fast Interrupt" interrupts
|=======================
 
The simulated NEORV32 does not use the bootloader and directly boots the current application image (from
the `rtl/core/neorv32_application_image.vhd` image file). Make sure to use the `all` target of the
makefile to install your application as VHDL image after compilation:
[NOTE]
The simulated NEORV32 does not use the bootloader and _directly boots_ the current application image (from
the `rtl/core/neorv32_application_image.vhd` image file).
 
[source, bash]
----
sw/example/blink_led$ make clean_all all
----
 
.Simulation-Optimized CPU/Processors Modules
.UART output during simulation
[NOTE]
The `sim/rtl_modules` folder provides simulation-optimized versions of certain CPU/processor modules.
These alternatives can be used to replace the default CPU/processor HDL files to allow faster/easier/more
efficient simulation. **These files are not intended for synthesis!**
 
**Simulation Console Output**
 
Data written to the NEORV32 UART0 / UART1 transmitter is sent to a virtual UART receiver implemented
as part of the testbench. Received chars are sent to the simulator console and are also stored to a log file
(`neorv32.testbench_uart0.out` for UART0, `neorv32.testbench_uart1.out` for UART1) inside the simulator home folder.
(`neorv32.testbench_uart0.out` for UART0, `neorv32.testbench_uart1.out` for UART1) inside the simulation's home folder.
**Please note that printing via the native UART receiver takes a lot of time.** For faster simulation console output
see section <<_faster_simulation_console_output>>.
 
 
:sectnums:
909,7 → 1035,8
sw/example/blink_led$ make USER_FLAGS+=-DUART0_SIM_MODE clean_all all
----
 
The provided define will change the default UART0/UART1 setup function in order to set the simulation mode flag in the according UART's control register.
The provided define will change the default UART0/UART1 setup function in order to set the simulation
mode flag in the according UART's control register.
 
[NOTE]
The UART simulation output (to file and to screen) outputs "complete lines" at once. A line is
929,7 → 1056,92
----
 
 
:sectnums:
=== In-Console Application Simulation
 
To directly compile and run a program in the console (using the default testbench and GHDL
as simulator) you can use the `sim` makefile target. Make sure to use the UART simulation mode
(`USER_FLAGS+=-DUART0_SIM_MODE` and/or `USER_FLAGS+=-DUART1_SIM_MODE`) to get
faster / direct-to-console UART output.
 
[source, bash]
----
sw/example/blink_led$ make USER_FLAGS+=-DUART0_SIM_MODE clean_all sim
[...]
Blinking LED demo program
----
 
 
:sectnums:
=== Hello World!
 
To do a quick test of the NEORV32 make sure to have https://github.com/ghdl/ghdl[GHDL] and a
https://github.com/stnolting/riscv-gcc-prebuilt[RISC-V gcc toolchain] installed, navigate to the project's
`sw/example/hello_world` folder and run `make USER_FLAGS+=-DUART0_SIM_MODE MARCH=-march=rv32imac clean_all sim`:
 
[TIP]
The simulator will output some _sanity check_ notes (and warnings or even errors if something is ill-configured)
right at the beginning of the simulation to give a brief overview of the actual NEORV32 SoC and CPU configurations.
 
[source, bash]
----
stnolting@Einstein:/mnt/n/Projects/neorv32/sw/example/hello_world$ make USER_FLAGS+=-DUART0_SIM_MODE MARCH=-march=rv32imac clean_all sim
../../../sw/lib/source/neorv32_uart.c: In function 'neorv32_uart0_setup':
../../../sw/lib/source/neorv32_uart.c:301:4: warning: #warning UART0_SIM_MODE (primary UART) enabled! Sending all UART0.TX data to text.io simulation output instead of real UART0 transmitter. Use this for simulations only! [-Wcpp]
301 | #warning UART0_SIM_MODE (primary UART) enabled! Sending all UART0.TX data to text.io simulation output instead of real UART0 transmitter. Use this for simulations only!
| ^~~~~~~
Memory utilization:
   text    data     bss     dec     hex filename
   4612       0     120    4732    127c main.elf
Compiling ../../../sw/image_gen/image_gen
Installing application image to ../../../rtl/core/neorv32_application_image.vhd
Simulating neorv32_application_image.vhd...
Tip: Compile application with USER_FLAGS+=-DUART[0/1]_SIM_MODE to auto-enable UART[0/1]'s simulation mode (redirect UART output to simulator console).
Using simulation runtime args: --stop-time=10ms
../rtl/core/neorv32_top.vhd:347:3:@0ms:(assertion note): NEORV32 PROCESSOR IO Configuration: GPIO MTIME UART0 UART1 SPI TWI PWM WDT CFS SLINK NEOLED XIRQ
../rtl/core/neorv32_top.vhd:370:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Boot configuration: Direct boot from memory (processor-internal IMEM).
../rtl/core/neorv32_top.vhd:394:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Implementing on-chip debugger (OCD).
../rtl/core/neorv32_cpu.vhd:169:3:@0ms:(assertion note): NEORV32 CPU ISA Configuration (MARCH): RV32IMACU_Zbb_Zicsr_Zifencei_Zfinx_Debug
../rtl/core/neorv32_cpu.vhd:189:3:@0ms:(assertion note): NEORV32 CPU CONFIG NOTE: Implementing NO dedicated hardware reset for uncritical registers (default, might reduce area). Set package constant <dedicated_reset_c> = TRUE to configure a DEFINED reset value for all CPU registers.
../rtl/core/neorv32_imem.vhd:107:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Implementing processor-internal IMEM as ROM (16384 bytes), pre-initialized with application (4612 bytes).
../rtl/core/neorv32_dmem.vhd:89:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Implementing processor-internal DMEM (RAM, 8192 bytes).
../rtl/core/neorv32_wishbone.vhd:136:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: External Bus Interface - Implementing STANDARD Wishbone protocol.
../rtl/core/neorv32_wishbone.vhd:140:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: External Bus Interface - Implementing auto-timeout (255 cycles).
../rtl/core/neorv32_wishbone.vhd:144:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: External Bus Interface - Implementing LITTLE-endian byte order.
../rtl/core/neorv32_wishbone.vhd:148:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: External Bus Interface - Implementing registered RX path.
../rtl/core/neorv32_slink.vhd:161:3:@0ms:(assertion note): NEORV32 PROCESSOR CONFIG NOTE: Implementing 8 RX and 8 TX stream links.
 
##
## ## ## ##
## ## ######### ######## ######## ## ## ######## ######## ## ################
#### ## ## ## ## ## ## ## ## ## ## ## ## ## #### ####
## ## ## ## ## ## ## ## ## ## ## ## ## ## ###### ##
## ## ## ######### ## ## ######### ## ## ##### ## ## #### ###### ####
## ## ## ## ## ## ## ## ## ## ## ## ## ## ###### ##
## #### ## ## ## ## ## ## ## ## ## ## ## #### ####
## ## ######### ######## ## ## ## ######## ########## ## ################
## ## ## ##
##
Hello world! :)
----
 
 
:sectnums:
=== Advanced Simulation using VUNIT
 
.WORK IN PROGRESS
[WARNING]
This Section Is Under Construction! +
+
FIXME!
 
The NEORV32 provides a more sophisticated simulation setup using https://vunit.github.io/[VUNIT].
The according VUNIT-based testbench is `sim/neorv32_tb.vhd`.
 
**WORK-IN-PROGRESS**
 
 
 
<<<
// ####################################################################################################################
:sectnums:
/attrs.adoc
1,7 → 1,7
:author: Dipl.-Ing. Stephan Nolting
:email: stnolting@gmail.com
:description: A size-optimized, customizable and open-source full-scale 32-bit RISC-V soft-core CPU and SoC written in platform-independent VHDL.
:revnumber: v1.5.9
:revnumber: v1.6.0
:doctype: book
:sectnums:
:stem:
