Rev 72	Rev 73
Line 1...	Line 1...
`:sectnums:`	`:sectnums:`
`== NEORV32 Central Processing Unit (CPU)`	`== NEORV32 Central Processing Unit (CPU)`

`image::neorv32_cpu_block.png[width=600,align=center]`	`image::neorv32_cpu_block.png[width=600,align=center]`

	`Section Structure`

	`* <<_architecture>>, <<_full_virtualization>> and <<_risc_v_compatibility>>`
	`* <<_cpu_top_entity_signals>> and <<_cpu_top_entity_generics>>`
	`* <<_instruction_sets_and_extensions>>, <<_custom_functions_unit_cfu>> and <<_instruction_timing>>`
	`* <<_control_and_status_registers_csrs>>`
	`* <<_traps_exceptions_and_interrupts>>`
	`* <<_bus_interface>>`


`Key Features`	`Key Features`

* 32-bit multi-cycle in-order `rv32` RISC-V CPU	* 32-bit little-endian, multi-cycle, in-order `rv32` RISC-V CPU
`* Optional RISC-V extensions:`	`* Compatible to the RISC-V Privileged Architecture - Machine ISA Version 1.12 specifications`
	`* Available <<_instruction_sets_and_extensions>>:`
** `A` - atomic memory access operations	** `A` - atomic memory access operations
** `B` - bit-manipulation instructions	** `B` - bit-manipulation instructions
** `C` - 16-bit compressed instructions	** `C` - 16-bit compressed instructions
** `I` - integer base ISA (always enabled)	** `I` - integer base ISA (always enabled)
** `E` - embedded CPU version (reduced register file size)	** `E` - embedded CPU version (reduced register file size)
Line 20...	Line 31...
** `Zihpm` - hardware performance monitors	** `Zihpm` - hardware performance monitors
** `Zifencei` - instruction stream synchronization	** `Zifencei` - instruction stream synchronization
** `Zmmul` - integer multiplication hardware	** `Zmmul` - integer multiplication hardware
** `Zxcfu` - custom instructions extension	** `Zxcfu` - custom instructions extension
** `PMP` - physical memory protection	** `PMP` - physical memory protection
** `Debug` - debug mode (part of the on-chip debugger) including hardware trigger module	** `Debug` - <<_cpu_debug_mode>> (part of the on-chip debugger) including hardware <<_trigger_module>>
`* Compatible to the RISC-V user specifications and a subset of the RISC-V privileged architecture specifications - passes the official RISC-V Architecture Tests (v2+)`	`* <<_risc_v_compatibility>>: Compatible to the RISC-V user specifications and a subset of the RISC-V privileged architecture specifications - passes the official RISC-V Architecture Tests (v2+)`
`* Official RISC-V open-source architecture ID`	`* Official RISC-V open-source architecture ID`
`* Standard RISC-V interrupts (_external_, _timer_, _software_) plus 16 _fast_ interrupts`	`* Supports _all_ of the machine-level <<_traps_exceptions_and_interrupts>> from the RISC-V specifications (including bus access exceptions and all unimplemented/illegal/malformed instructions)`
`* Supports _all_ of the machine-level traps from the RISC-V specifications (including bus access exceptions and all unimplemented/illegal/malformed instructions)`
`** This is a special aspect on _execution safety_ by <<_full_virtualization>>`	`** This is a special aspect on _execution safety_ by <<_full_virtualization>>`
	`** Standard RISC-V interrupts (_external_, _timer_, _software_) plus 16 custom _fast_ interrupts`
`* Optional physical memory configuration (PMP), compatible to the RISC-V specifications`	`* Optional physical memory configuration (PMP), compatible to the RISC-V specifications`
`* Optional hardware performance monitors (HPM) for application benchmarking`	`* Optional hardware performance monitors (HPM) for application benchmarking`
`* Separated interfaces for instruction fetch and data access (merged into a single processor bus)`	`* Separated <<_bus_interface>>s for instruction fetch and data access`
`* little-endian byte order`
`* Configurable hardware reset`
`* No hardware support of unaligned data/instruction accesses - they will trigger an exception.`

`[NOTE]`	`[NOTE]`
`It is recommended to use the NEORV32 Processor as default top instance even if you only want to use the actual`	`It is recommended to use the NEORV32 Processor as default top instance even if you only want to use the actual`
`CPU. Simply disable all the processor-internal modules via the generics and you will get a "CPU`	`CPU. Simply disable all the processor-internal modules via the generics and you will get a "CPU`
`wrapper" that provides a minimal CPU environment and an external bus interface (like AXI4). This`	`wrapper" that provides a minimal CPU environment and an external bus interface (like AXI4). This`
Line 249...	Line 257...


`:sectnums:`	`:sectnums:`
`==== RISC-V Incompatibility Issues and Limitations`	`==== RISC-V Incompatibility Issues and Limitations`

`This list shows the currently identified issues regarding full RISC-V-compatibility. More specific information`	`This list shows the currently identified issues regarding full RISC-V-compatibility.`
`can be found in section <<_instruction_sets_and_extensions>>.`

`.Read-Only "Read-Write" CSRs`	`.Read-Only "Read-Write" CSRs`
`[IMPORTANT]`	`[IMPORTANT]`
`The <<_misa>> and <<_mtval>> CSRs in the NEORV32 are _read-only_.`	`The <<_misa>> and <<_mtval>> CSRs in the NEORV32 are _read-only_.`
`Any machine-mode write access to them is ignored and will _not_ cause any exceptions or side-effects to maintain`	`Any machine-mode write access to them is ignored and will _not_ cause any exceptions or`
`RISC-V compatibility.`	`side-effects to maintain RISC-V compatibility.`

`.Physical Memory Protection`	`.Physical Memory Protection`
`[IMPORTANT]`	`[IMPORTANT]`
`The physical memory protection (see section <<_machine_physical_memory_protection_csrs>>)`	`The RISC-V-compatible NEORV32 <<_machine_physical_memory_protection_csrs>> only implements the TOR`
`only supports the modes _OFF_ and _NAPOT_ yet and a minimal granularity of 8 bytes per region.`	`(top of region) mode and only up to 16 PMP regions. Furthermore, the <<_pmpcfg>>'s _lock bits_ only lock`
	`the according PMP entry and not the entries below. All region rules are checked in parallel without`
	`prioritization so for identical memory regions the most restrictive PMP rule will be enforced.`

`.Atomic Memory Operations`	`.Atomic Memory Operations`
`[IMPORTANT]`	`[IMPORTANT]`
The `A` CPU extension only implements the `lr.w` and `sc.w` instructions yet.	The `A` CPU extension only implements the `lr.w` and `sc.w` instructions yet.
`However, these instructions are sufficient to emulate all further atomic memory operations.`	`However, these instructions are sufficient to emulate all further atomic memory operations.`
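The note above says `lr.w` and `sc.w` are enough to emulate the remaining atomic memory operations. The usual pattern is a retry loop: load the old value, compute the new one, and repeat until the conditional store succeeds. The following is a minimal host-side sketch of that pattern, not NEORV32 code. The function name `emulated_amoadd_w` is hypothetical, and C11 `atomic_compare_exchange_weak` only stands in for the `lr.w`/`sc.w` pair (like `sc.w`, it is allowed to fail and require a retry).

```c
#include <stdatomic.h>
#include <stdint.h>

/* Emulated atomic fetch-and-add built from a "load / conditional store" retry loop.
 * Assumption: on a RISC-V target the load would be lr.w and the store sc.w;
 * here atomic_compare_exchange_weak is used as a portable stand-in. */
uint32_t emulated_amoadd_w(_Atomic uint32_t *addr, uint32_t operand)
{
    uint32_t old = atomic_load(addr); /* plays the role of lr.w */

    /* Plays the role of sc.w: on failure, 'old' is refreshed with the
     * current memory value and the update is attempted again. */
    while (!atomic_compare_exchange_weak(addr, &old, old + operand)) {
        /* retry */
    }
    return old; /* like an AMO, return the value that was in memory before */
}
```

Other read-modify-write operations (swap, and, or, min, max) follow the same loop with a different expression for the new value.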

	`.No HW-Support of Misaligned Memory Accesses`
	`[WARNING]`
	`The CPU does not support the resolution of unaligned memory access by the hardware. This is not a`
	`RISC-V-compatibility issue but an important thing to know. Any kind of unaligned memory access`
	`will raise an exception to allow a software-based emulation.`



`// ####################################################################################################################`	`// ####################################################################################################################`
`:sectnums:`	`:sectnums:`
`=== CPU Top Entity - Signals`	`=== CPU Top Entity - Signals`
Line 282...	Line 297...

`.NEORV32 CPU top entity signals`	`.NEORV32 CPU top entity signals`
`[cols="<2,^1,^1,<6"]`	`[cols="<2,^1,^1,<6"]`
`[options="header", grid="rows"]`	`[options="header", grid="rows"]`
`\|=======================`	`\|=======================`
`\| Signal \| Width \| Dir. \| Function`	`\| Signal \| Width \| Dir. \| Description`
`4+^\| Global Signals`	`4+^\| Global Signals`
\| `clk_i` \| 1 \| in \| global clock line, all registers triggering on rising edge	\| `clk_i` \| 1 \| in \| global clock line, all registers triggering on rising edge
\| `rstn_i` \| 1 \| in \| global reset, low-active	\| `rstn_i` \| 1 \| in \| global reset, low-active
\| `sleep_o` \| 1 \| out \| CPU is in sleep mode when set	\| `sleep_o` \| 1 \| out \| CPU is in sleep mode when set
\| `debug_o` \| 1 \| out \| CPU is in debug mode when set	\| `debug_o` \| 1 \| out \| CPU is in debug mode when set
`4+^\| Instruction Bus Interface (<<_bus_interface>>)`	`4+^\| Instruction <<_bus_interface>>`
\| `i_bus_addr_o` \| 32 \| out \| destination address	\| `i_bus_addr_o` \| 32 \| out \| access address
\| `i_bus_rdata_i` \| 32 \| in \| read data	\| `i_bus_rdata_i` \| 32 \| in \| read data
\| `i_bus_wdata_o` \| 32 \| out \| write data (always zero)	\| `i_bus_wdata_o` \| 32 \| out \| write data (always zero)
\| `i_bus_ben_o` \| 4 \| out \| byte enable	\| `i_bus_ben_o` \| 4 \| out \| byte enable
\| `i_bus_we_o` \| 1 \| out \| write transaction (always zero)	\| `i_bus_we_o` \| 1 \| out \| write transaction (always zero)
\| `i_bus_re_o` \| 1 \| out \| read transaction	\| `i_bus_re_o` \| 1 \| out \| read transaction
\| `i_bus_lock_o` \| 1 \| out \| exclusive access request (always zero)	\| `i_bus_lock_o` \| 1 \| out \| exclusive access request (always zero)
\| `i_bus_ack_i` \| 1 \| in \| bus transfer acknowledge from accessed peripheral	\| `i_bus_ack_i` \| 1 \| in \| bus transfer acknowledge from accessed peripheral
\| `i_bus_err_i` \| 1 \| in \| bus transfer terminate from accessed peripheral	\| `i_bus_err_i` \| 1 \| in \| bus transfer terminate from accessed peripheral
\| `i_bus_fence_o` \| 1 \| out \| indicates an executed _fence.i_ instruction	\| `i_bus_fence_o` \| 1 \| out \| indicates an executed `fence.i` instruction
\| `i_bus_priv_o` \| 2 \| out \| current CPU privilege level	\| `i_bus_priv_o` \| 1 \| out \| current _effective_ CPU privilege level (`0` user, `1` machine or debug)
`4+^\| Data Bus Interface (<<_bus_interface>>)`	`4+^\| Data <<_bus_interface>>`
\| `d_bus_addr_o` \| 32 \| out \| destination address	\| `d_bus_addr_o` \| 32 \| out \| access address
\| `d_bus_rdata_i` \| 32 \| in \| read data	\| `d_bus_rdata_i` \| 32 \| in \| read data
\| `d_bus_wdata_o` \| 32 \| out \| write data	\| `d_bus_wdata_o` \| 32 \| out \| write data
\| `d_bus_ben_o` \| 4 \| out \| byte enable	\| `d_bus_ben_o` \| 4 \| out \| byte enable
\| `d_bus_we_o` \| 1 \| out \| write transaction	\| `d_bus_we_o` \| 1 \| out \| write transaction
\| `d_bus_re_o` \| 1 \| out \| read transaction	\| `d_bus_re_o` \| 1 \| out \| read transaction
\| `d_bus_lock_o` \| 1 \| out \| exclusive access request	\| `d_bus_lock_o` \| 1 \| out \| exclusive access request
\| `d_bus_ack_i` \| 1 \| in \| bus transfer acknowledge from accessed peripheral	\| `d_bus_ack_i` \| 1 \| in \| bus transfer acknowledge from accessed peripheral
\| `d_bus_err_i` \| 1 \| in \| bus transfer terminate from accessed peripheral	\| `d_bus_err_i` \| 1 \| in \| bus transfer terminate from accessed peripheral
\| `d_bus_fence_o` \| 1 \| out \| indicates an executed _fence_ instruction	\| `d_bus_fence_o` \| 1 \| out \| indicates an executed `fence` instruction
\| `d_bus_priv_o` \| 2 \| out \| current CPU privilege level	\| `d_bus_priv_o` \| 1 \| out \| current _effective_ CPU privilege level (`0` user, `1` machine or debug)
`4+^\| System Time (see <<_timeh>> CSR)`	`4+^\| System Time (for <<_timeh>> CSR)`
\| `time_i` \| 64 \| in \| system time input (from MTIME)	\| `time_i` \| 64 \| in \| system time input from <<_machine_system_timer_mtime>>
`4+^\| Interrupts, RISC-V-compatible (<<_traps_exceptions_and_interrupts>>)`	`4+^\| Interrupts, RISC-V-compatible (<<_traps_exceptions_and_interrupts>>)`
\| `msw_irq_i` \| 1 \| in \| RISC-V machine software interrupt	\| `msw_irq_i` \| 1 \| in \| RISC-V machine software interrupt
\| `mext_irq_i` \| 1 \| in \| RISC-V machine external interrupt	\| `mext_irq_i` \| 1 \| in \| RISC-V machine external interrupt
\| `mtime_irq_i` \| 1 \| in \| RISC-V machine timer interrupt	\| `mtime_irq_i` \| 1 \| in \| RISC-V machine timer interrupt
`4+^\| Fast Interrupts, NEORV32-specific (<<_traps_exceptions_and_interrupts>>)`	`4+^\| Interrupts, NEORV32-specific (<<_traps_exceptions_and_interrupts>>)`
\| `firq_i` \| 16 \| in \| fast interrupt request signals	\| `firq_i` \| 16 \| in \| fast interrupt request signals
`4+^\| Enter Debug Mode Request (<<_on_chip_debugger_ocd>>)`	`4+^\| Enter Debug Mode Request (<<_on_chip_debugger_ocd>>)`
\| `db_halt_req_i` \| 1 \| in \| request CPU to halt and enter debug mode	\| `db_halt_req_i` \| 1 \| in \| request CPU to halt and enter debug mode
`\|=======================`	`\|=======================`

Line 337...	Line 352...
`The _specific_ generics are listed below.`	`The _specific_ generics are listed below.`

`[cols="4,4,2"]`	`[cols="4,4,2"]`
`[frame="all",grid="none"]`	`[frame="all",grid="none"]`
`\|======`	`\|======`
`\| CPU_BOOT_ADDR \| _std_ulogic_vector(31 downto 0)_ \| -`	`\| CPU_BOOT_ADDR \| _std_ulogic_vector(31 downto 0)_ \| _no default value_`
`3+\| This address defines the reset address at which the CPU starts fetching instructions after reset. In terms of the NEORV32 processor, this`	`3+\| This address defines the reset address at which the CPU starts fetching instructions after reset. In terms of the NEORV32 processor, this`
`generic is configured with the base address of the bootloader ROM (default) or with the base address of the processor-internal instruction`	`generic is configured with the base address of the bootloader ROM (default) or with the base address of the processor-internal instruction`
`memory (IMEM) if the bootloader is disabled (_INT_BOOTLOADER_EN_ = _false_). See section <<_address_space>> for more information.`	`memory (IMEM) if the bootloader is disabled (_INT_BOOTLOADER_EN_ = _false_). See section <<_address_space>> for more information.`
`\|======`	`\|======`

`[cols="4,4,2"]`	`[cols="4,4,2"]`
`[frame="all",grid="none"]`	`[frame="all",grid="none"]`
`\|======`	`\|======`
`\| CPU_DEBUG_ADDR \| _std_ulogic_vector(31 downto 0)_ \| -`	`\| CPU_DEBUG_ADDR \| _std_ulogic_vector(31 downto 0)_ \| _no default value_`
`3+\| This address defines the entry address for the "execution based" on-chip debugger. By default, this generic is configured with the base address`	`3+\| This address defines the entry address for the "execution based" on-chip debugger. By default, this generic is configured with the base address`
`of the debugger memory. See section <<_on_chip_debugger_ocd>> for more information.`	`of the debugger memory. See section <<_on_chip_debugger_ocd>> for more information.`
`\|======`	`\|======`

`[cols="4,4,2"]`	`[cols="4,4,2"]`
`[frame="all",grid="none"]`	`[frame="all",grid="none"]`
`\|======`	`\|======`
`\| CPU_EXTENSION_RISCV_DEBUG \| _boolean_ \| -`	`\| CPU_EXTENSION_RISCV_DEBUG \| _boolean_ \| _no default value_`
`3+\| Implement RISC-V-compatible "debug" CPU operation mode. See section <<_cpu_debug_mode>> for more information.`	`3+\| Implement RISC-V-compatible "debug" CPU operation mode. See section <<_cpu_debug_mode>> for more information.`
`\|======`	`\|======`



Line 513...	Line 528...
* multiplication: `mul` `mulh` `mulhsu` `mulhu`	* multiplication: `mul` `mulh` `mulhsu` `mulhu`
* division: `div` `divu` `rem` `remu`	* division: `div` `divu` `rem` `remu`

`[NOTE]`	`[NOTE]`
`By default, multiplication and division operations are executed in a bit-serial approach.`	`By default, multiplication and division operations are executed in a bit-serial approach.`
Alternatively, the multiplier core can be implemented using DSP blocks if the `FAST_MUL_EN`	`Alternatively, the multiplier core can be implemented using DSP blocks if the <<_fast_mul_en>>`
`generic is _true_ allowing faster execution. Multiplications and divisions`	`generic is _true_ allowing faster execution. Multiplications and divisions`
`always require a fixed amount of cycles to complete - regardless of the input operands.`	`always require a fixed amount of cycles to complete - regardless of the input operands.`

	`[NOTE]`
	`Regardless of the setting of the <<_fast_mul_en>> generic`
	`multiplication and division instructions operate _independently_ of the input operands.`
	`Hence, there is no early completion of multiply by one/zero and divide by zero operations.`


==== `Zmmul` - Integer Multiplication	==== `Zmmul` - Integer Multiplication

This is a _sub-extension_ of the `M` ISA extension. It implements the multiplication-only operations	This is a _sub-extension_ of the `M` ISA extension. It implements the multiplication-only operations
of the `M` extensions and is intended for size-constrained setups that require hardware-based	of the `M` extensions and is intended for size-constrained setups that require hardware-based
Line 547...	Line 567...
`operation mode. It is implemented if the <<_cpu_extension_riscv_u>> configuration generic is _true_.`	`operation mode. It is implemented if the <<_cpu_extension_riscv_u>> configuration generic is _true_.`
`Code executed in user-mode cannot access machine-mode CSRs. Furthermore, user-mode access to the address space (like`	`Code executed in user-mode cannot access machine-mode CSRs. Furthermore, user-mode access to the address space (like`
`peripheral/IO devices) can be constrained via the physical memory protection (_PMP_).`	`peripheral/IO devices) can be constrained via the physical memory protection (_PMP_).`
`Any kind of privilege rights violation will raise an exception to allow <<_full_virtualization>>.`	`Any kind of privilege rights violation will raise an exception to allow <<_full_virtualization>>.`

	`Additional CSRs:`

	`* <<_mcounteren>> - machine counter enable to constrain user-mode access to timer/counter CSRs`


==== `X` - NEORV32-Specific (Custom) Extensions	==== `X` - NEORV32-Specific (Custom) Extensions

The NEORV32-specific extensions are always enabled and are indicated by the set `X` bit in the <<_misa>> CSR.	The NEORV32-specific extensions are always enabled and are indicated by the set `X` bit in the <<_misa>> CSR.

`The most important points of the NEORV32-specific extensions are:`	`The most important points of the NEORV32-specific extensions are:`
* The CPU provides 16 _fast interrupt_ interrupts (`FIRQ)`, which are controlled via custom bits in the `mie`	* The CPU provides 16 _fast interrupt_ interrupts (`FIRQ`), which are controlled via custom bits in the <<_mie>>
`and <<_mip>> CSR. This extension is mapped to CSR bits, that are available for custom use (according to the`	`and <<_mip>> CSRs. These extensions are mapped to CSR bits, that are available for custom use according to the`
`RISC-V specs). Also, custom trap codes for <<_mcause>> are implemented.`	`RISC-V specs. Also, custom trap codes for <<_mcause>> are implemented.`
`* All undefined/unimplemented/malformed/illegal instructions do raise an illegal instruction exception (see <<_full_virtualization>>).`	`* All undefined/unimplemented/malformed/illegal instructions do raise an illegal instruction exception (see <<_full_virtualization>>).`
`* There are <<_neorv32_specific_csrs>>.`	`* There are <<_neorv32_specific_csrs>>.`


==== `Zfinx` Single-Precision Floating-Point Operations	==== `Zfinx` Single-Precision Floating-Point Operations
Line 582...	Line 606...
* comparison: `fmin.s` `fmax.s` `feq.s` `flt.s` `fle.s`	* comparison: `fmin.s` `fmax.s` `feq.s` `flt.s` `fle.s`
* computational: `fadd.s` `fsub.s` `fmul.s`	* computational: `fadd.s` `fsub.s` `fmul.s`
* sign-injection: `fsgnj.s` `fsgnjn.s` `fsgnjx.s`	* sign-injection: `fsgnj.s` `fsgnjn.s` `fsgnjx.s`
* number classification: `fclass.s`	* number classification: `fclass.s`

`* additional CSRs: <<_fcsr>>, <<_frm>>, <<_fflags>>`	* compressed instructions: `c.flw` `c.flwsp` `c.fsw` `c.fswsp`

	`Additional CSRs:`

	`* <<_fcsr>> - FPU control register`
	`* <<_frm>> - rounding mode control`
	`* <<_fflags>> - FPU status flags`

`[WARNING]`	`[WARNING]`
Fused multiply-add instructions `f[n]m[add/sub].s` are not supported!	Fused multiply-add instructions `f[n]m[add/sub].s` are not supported!
Division `fdiv.s` and square root `fsqrt.s` instructions are not supported yet!	Division `fdiv.s` and square root `fsqrt.s` instructions are not supported yet!

Line 621...	Line 651...
`[NOTE]`	`[NOTE]`
If `rd=x0` for the `csrrw[i]` instructions there will be no actual read access to the according CSR.	If `rd=x0` for the `csrrw[i]` instructions there will be no actual read access to the according CSR.
`However, access privileges are still enforced so these instruction variants _do_ cause side-effects`	`However, access privileges are still enforced so these instruction variants _do_ cause side-effects`
`(the RISC-V spec. states that these combinations "_shall_ not cause any side-effects").`	`(the RISC-V spec. states that these combinations "_shall_ not cause any side-effects").`

`[NOTE]`	`wfi` Instruction

The "wait for interrupt instruction" `wfi` acts like a sleep command. When executed, the CPU is	The "wait for interrupt instruction" `wfi` acts like a sleep command. When executed, the CPU is
`halted until a valid interrupt request occurs. To wake up again, the according interrupt source has to`	`halted until a valid interrupt request occurs. To wake up again, at least one interrupt source has to`
`be enabled via the <<_mie>> CSR and the global interrupt enable flag in <<_mstatus>> has to be set.`	`be enabled via the <<_mie>> CSR and the global interrupt enable flag in <<_mstatus>> has to be set.`
The `wfi` instruction may also be executed in user-mode without causing an exception as <<_mstatus>> bit
`TW` (timeout wait) is _hardwired_ to zero.

	If the <<_mstatus>> `TW` bit is cleared, the `wfi` instruction is also allowed to execute when in user-mode.
	This is always the case if user-mode is not implemented. If the `TW` bit is set the execution of `wfi` in
	`user-mode will raise an illegal instruction exception.`
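The Rev 73 rule for `wfi` can be written as a single predicate. This is only an illustrative sketch of the text above; the type and function names are hypothetical and do not come from the NEORV32 sources.

```c
#include <stdbool.h>

/* Hypothetical privilege encoding, for illustration only. */
typedef enum { PRIV_USER = 0, PRIV_MACHINE = 1 } priv_t;

/* Returns true if executing wfi raises an illegal instruction exception:
 * only when running in user-mode while mstatus.TW is set. */
bool wfi_raises_illegal_instruction(priv_t priv, bool mstatus_tw)
{
    return (priv == PRIV_USER) && mstatus_tw;
}
```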


==== `Zicntr` CPU Base Counters	==== `Zicntr` CPU Base Counters

The `Zicntr` ISA extension adds the basic cycle (`[m]cycle[h]`), instruction-retired (`[m]instret[h]`) and time (`time[h]`)	The `Zicntr` ISA extension adds the basic cycle (`[m]cycle[h]`), instruction-retired (`[m]instret[h]`) and time (`time[h]`)
`counters. This extension is stated as _mandatory_ by the RISC-V spec. However, size-constrained setups may remove support for`	`counters. This extension is stated as _mandatory_ by the RISC-V spec. However, size-constrained setups may remove support for`
these counters. Section <<_machine_counter_and_timer_csrs>> shows a list of all `Zicntr`-related CSRs.	these counters. Section <<_machine_counter_and_timer_csrs>> shows a list of all `Zicntr`-related CSRs.
These are available if the `Zicntr` ISA extension is enabled via the <<_cpu_extension_riscv_zicntr>> generic.	These are available if the `Zicntr` ISA extension is enabled via the <<_cpu_extension_riscv_zicntr>> generic.

	`Additional CSRs:`

	`* <<_cycleh>>, <<_mcycleh>> - cycle counter`
	`* <<_instreth>>, <<_minstreth>> - instructions-retired counter`
	`* <<_timeh>> - system _wall-clock_ time`

`[NOTE]`	`[NOTE]`
Disabling the `Zicntr` extension does not remove the `time[h]`-driving MTIME unit.	Disabling the `Zicntr` extension does not remove the `time[h]`-driving MTIME unit.

If `Zicntr` is disabled, all accesses to the according counter CSRs will raise an illegal instruction exception.	If `Zicntr` is disabled, all accesses to the according counter CSRs will raise an illegal instruction exception.



==== `Zihpm` Hardware Performance Monitors	==== `Zihpm` Hardware Performance Monitors

`In addition to the base cycle, instructions-retired and time counters the NEORV32 CPU provides`	`In addition to the base cycle, instructions-retired and time counters the NEORV32 CPU provides`
`up to 29 hardware performance monitors (HPM 3..31), which can be used to benchmark applications. Each HPM consists of an`	`up to 29 hardware performance monitors (HPM 3..31), which can be used to benchmark applications. Each HPM consists of an`
`N-bit wide counter (split in a high-word 32-bit CSR and a low-word 32-bit CSR), where N is defined via the top's`	`N-bit wide counter (split in a high-word 32-bit CSR and a low-word 32-bit CSR), where N is defined via the top's`
`<<_hpm_cnt_width>> generic (0..64-bit) and a corresponding event configuration CSR. The event configuration`	`<<_hpm_cnt_width>> generic (0..64-bit) and a corresponding event configuration CSR. The event configuration`
`CSR defines the architectural events that lead to an increment of the associated HPM counter.`	`CSR defines the architectural events that lead to an increment of the associated HPM counter.`

The HPM counters are available if the `Zihpm` ISA extension is enabled via the <<_cpu_extension_riscv_zihpm>> generic.	The HPM counters are available if the `Zihpm` ISA extension is enabled via the <<_cpu_extension_riscv_zihpm>> generic.
	`The actual number of implemented HPM counters is defined by the <<_hpm_num_cnts>> generic.`
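Since each HPM counter is N bits wide but exposed as a high-word and a low-word 32-bit CSR, software has to reassemble the two halves. The sketch below shows one way to do that. It assumes that the counter wraps at 2^N and that bits above N read as zero, which this section does not state explicitly. The function name `hpm_counter_value` is hypothetical.

```c
#include <stdint.h>

/* Combine the high and low CSR words of one HPM counter and limit the
 * result to the configured counter width (0..64 bits).
 * Assumption: bits above 'width_bits' are not implemented and carry no data. */
uint64_t hpm_counter_value(uint32_t high_word, uint32_t low_word, unsigned width_bits)
{
    uint64_t raw = ((uint64_t)high_word << 32) | low_word;

    if (width_bits >= 64) {
        return raw; /* full width: avoid an undefined 64-bit shift by 64 */
    }
    return raw & ((UINT64_C(1) << width_bits) - 1); /* keep the lower N bits */
}
```

When reading a running counter on real hardware, the two CSR reads are not atomic; a common remedy is to read high, low, then high again and retry if the high word changed.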

`Depending on the configuration the following additional CSR are available:`	`Additional CSRs:`

* counters: `mhpmcounter*[h]` (3..31, depending on `HPM_NUM_CNTS`)	`* <<_mhpmevent>> 3..31 (depending on <<_hpm_num_cnts>>) - event configuration CSRs`
* event configuration: `mhpmevent*` (3..31, depending on `HPM_NUM_CNTS`)	`* <<_mhpmcounterh>> 3..31 (depending on <<_hpm_num_cnts>>) - counter CSRs`

`[IMPORTANT]`	`[IMPORTANT]`
`The HPM counter CSR can only be accessed in machine-mode. Hence, the according <<_mcounteren>> CSR bits`	`The HPM counter CSRs can only be accessed in machine-mode. Hence, the according <<_mcounteren>> CSR bits`
`are always zero and read-only. Any access from less-privileged modes will raise an illegal instruction`	`are always zero and read-only. Any access from less-privileged modes will raise an illegal instruction`
`exception.`	`exception.`

`[TIP]`	`[TIP]`
`Auto-increment of the HPMs can be individually deactivated via the <<_mcountinhibit>> CSR.`	`Auto-increment of the HPMs can be deactivated individually via the <<_mcountinhibit>> CSR.`

`[TIP]`
`For a list of all HPM-related CSRs and all provided event configurations`
`see section <<_hardware_performance_monitors_hpm>>.`


==== `Zifencei` Instruction Stream Synchronization	==== `Zifencei` Instruction Stream Synchronization

The `Zifencei` CPU extension is implemented if the <<_cpu_extension_riscv_zifencei>> configuration	The `Zifencei` CPU extension is implemented if the <<_cpu_extension_riscv_zifencei>> configuration
Line 711...	Line 745...
`the CPU take a look at the memory-mapped <<_custom_functions_subsystem_cfs>>.`	`the CPU take a look at the memory-mapped <<_custom_functions_subsystem_cfs>>.`


==== `PMP` Physical Memory Protection	==== `PMP` Physical Memory Protection

`The NEORV32 physical memory protection (PMP) is compatible to the RISC-V PMP specifications. It can be used`	`The NEORV32 physical memory protection (PMP) provides an elementary memory protection mechanism that can be used`
`to constrain memory read/write/execute rights for each available privilege level.`	`to constrain read, write and execute rights of arbitrary memory regions. The PMP is compatible`
	`to the _RISC-V Privileged Architecture Specifications_. For detailed information see the according spec.'s sections.`

	`[IMPORTANT]`
	`The NEORV32 PMP only supports TOR (top of region) mode, which basically is a "base-and-bound" concept, and only`
	`up to 16 PMP regions.`

	`The physical memory protection logic is implemented if the <<_pmp_num_regions>> configuration generic is greater`
	`than zero. This generic also defines the total number of available configurable protection`
	`regions. The minimal granularity of a protected region is defined by the <<_pmp_min_granularity>> generic. A larger`
	`granularity value reduces hardware complexity but makes the protection coarser, as the minimal region size increases.`
	`The default value is 4 bytes, which allows a minimal region size of 4 bytes.`

	`If implemented the PMP provides the following additional CSRs:`

	`* <<_pmpcfg>> 0..3 (depending on configuration) - PMP configuration registers, 4 entries per CSR`
	`* <<_pmpaddr>> 0..15 (depending on configuration) - PMP address registers`


`The NEORV32 PMP only supports _NAPOT_ mode yet and a minimal region size (granularity) of 8 bytes. Larger`	`Operation Summary`
minimal sizes can be configured via the top `PMP_MIN_GRANULARITY` generic to reduce hardware requirements.
The physical memory protection system is implemented when the `PMP_NUM_REGIONS` configuration generic is >0.
`In this case the following additional CSRs are available:`

* `pmpcfg*` (0..15, depending on configuration): PMP configuration registers	`Any CPU access address (from the instruction fetch or data access interface) is tested if it matches _any_`
* `pmpaddr*` (0..63, depending on configuration): PMP address registers	`of the specified PMP regions. If there is a match, the configured access rights are enforced:`

	`* a write access (store) will fail if no write attribute is set`
	`* a read access (load) will fail if no read attribute is set`
	`* an instruction fetch access will fail if no execute attribute is set`

	`If an access to a protected region does not have the according access rights it will raise the according`
	`instruction/load/store _bus access fault_ exception.`

	`By default, all PMP checks are enforced for user-mode only. However, PMP rules can also be enforced for`
	`machine-mode when the according PMP region has the "LOCK" bit set. This will also prevent any write access`
	`to the according region's PMP CSRs until the CPU is reset.`

	`.Rule Prioritization`
	`[IMPORTANT]`
	`All rules are checked in parallel without prioritization so for identical memory regions the most restrictive`
	`PMP rule will be enforced.`
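The Rev 73 behavior described above (TOR regions, parallel checking, most restrictive rule wins, machine-mode only bound by locked entries) can be modeled in a few lines. This is a behavioral sketch under stated assumptions, not the hardware implementation. All type and function names are hypothetical, bounds are plain byte addresses rather than the shifted `pmpaddr` encoding, and the lower bound of region 0 is taken as address 0 as in the RISC-V TOR definition. The case where no region matches is reported separately, because this section does not say how it is handled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { ACCESS_READ, ACCESS_WRITE, ACCESS_EXECUTE } access_t;
typedef enum { PMP_NO_MATCH, PMP_ALLOWED, PMP_FAULT } pmp_result_t;

/* One TOR region: covers [top of previous entry, top). Hypothetical model type. */
typedef struct {
    uint32_t top;     /* exclusive upper bound, as a byte address (simplification) */
    bool     enabled; /* entry is active (TOR mode selected) */
    bool     r, w, x; /* read / write / execute attributes */
    bool     lock;    /* rule also applies to machine-mode */
} pmp_region_t;

pmp_result_t pmp_check(const pmp_region_t *regions, size_t num_regions,
                       uint32_t addr, access_t type, bool machine_mode)
{
    bool matched = false;
    bool allowed = true;

    /* Every entry is evaluated; there is no "first match wins" priority. */
    for (size_t i = 0; i < num_regions; i++) {
        uint32_t base = (i == 0) ? 0 : regions[i - 1].top;

        if (!regions[i].enabled || addr < base || addr >= regions[i].top) {
            continue; /* address is outside this region */
        }
        if (machine_mode && !regions[i].lock) {
            continue; /* unlocked rules are enforced for user-mode only */
        }
        matched = true;
        bool ok = (type == ACCESS_READ)  ? regions[i].r
                : (type == ACCESS_WRITE) ? regions[i].w
                                         : regions[i].x;
        allowed = allowed && ok; /* most restrictive matching rule wins */
    }

    if (!matched) {
        return PMP_NO_MATCH; /* handling of this case is left to the caller */
    }
    return allowed ? PMP_ALLOWED : PMP_FAULT;
}
```

A `PMP_FAULT` result corresponds to the instruction/load/store bus access fault exception mentioned above.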

	`.PMP Example Program`
`[TIP]`	`[TIP]`
`See section <<_machine_physical_memory_protection_csrs>> for more information regarding the PMP CSRs.`	A simple PMP example program can be found in `sw/example/demo_pmp`.

`The actual number of regions and the minimal region granularity are defined via the top entity`
`PMP_MIN_GRANULARITY` and `PMP_NUM_REGIONS` generics. `PMP_MIN_GRANULARITY` defines the minimal available
granularity of each region in bytes. `PMP_NUM_REGIONS` defines the total number of implemented regions and thus, the
number of available `pmpcfg` and `pmpaddr` CSRs.

`When implementing more PMP regions that a _certain critical limit_ *an additional register stage`
`is automatically inserted* into the CPU's memory interfaces to reduce critical path length. Unfortunately, this will also`
`increase the latency of instruction fetches and data access by +1 cycle.`

`The critical limit can be adapted for custom use by a constant from the main VHDL package file`	`Impact on Critical Path`
(`rtl/core/neorv32_package.vhd`). The default value is 8:
	`When implementing more PMP regions than a "_certain critical limit_", an additional register stage is automatically`
	`inserted into the CPU's memory interfaces to keep the impact on the critical path as small as possible.`
	`Unfortunately, this will also increase the latency of instruction fetches and data accesses by one cycle.`
	`The _critical limit_ can be modified by a constant from the main VHDL package file`
	(`rtl/core/neorv32_package.vhd`, default value = 8):

`[source,vhdl]`	`[source,vhdl]`
`----`	`----`
`-- "critical" number of PMP regions --`	`-- "critical" number of PMP regions --`
`constant pmp_num_regions_critical_c : natural := 8;`	`constant pmp_num_regions_critical_c : natural := 8;`
`----`	`----`

`Operation`	`[TIP]`
	`Reducing the minimal PMP region size / granularity via the <<_pmp_min_granularity>> top entity generic`
`Any CPU memory access address (from the instruction fetch or data access interface) is tested if it is accessing _any_`	`will also reduce hardware utilization and impact on critical path.`
of the specified PMP regions(configured via `pmpaddr` and enabled via `pmpcfg`). If an
address matches one of these regions, the configured access rights (attributes in `pmpcfg*`) are enforced:

`* a write access (store) will fail if no write attribute is set`
`* a read access (load) will fail if no read attribute is set`
`* an instruction fetch access will fail if no execute attribute is set`

`If an access to a protected region does not have the required access rights, it will raise the corresponding`
`instruction/load/store _access fault_ exception.`

`By default, all PMP checks are enforced for user-level programs only. If you wish to enforce the physical`
`memory protection also for machine-level programs you need to set the _locked bit_ in the according`
`pmpcfg*` configuration CSR.
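
The access rules listed above can be sketched as a C model. The bit positions of the read, write, execute and locked attributes follow the RISC-V privileged specification; the type and function names are hypothetical and the sketch only covers an access that has already matched a region.

```c
#include <stdbool.h>
#include <stdint.h>

/* pmpcfg attribute bits as defined by the RISC-V privileged spec. */
#define PMP_R 0x01u /* read permission    */
#define PMP_W 0x02u /* write permission   */
#define PMP_X 0x04u /* execute permission */
#define PMP_L 0x80u /* locked bit         */

typedef enum { ACCESS_LOAD, ACCESS_STORE, ACCESS_FETCH } access_t;

/* Returns true if an access that matches a region with configuration
 * `cfg` raises an access fault. `machine_mode` is true for
 * machine-level code, false for user-level code. */
bool pmp_access_faults(uint8_t cfg, access_t type, bool machine_mode)
{
    /* Machine-level accesses are only checked if the region is locked. */
    if (machine_mode && !(cfg & PMP_L)) {
        return false;
    }
    switch (type) {
    case ACCESS_LOAD:  return !(cfg & PMP_R);
    case ACCESS_STORE: return !(cfg & PMP_W);
    case ACCESS_FETCH: return !(cfg & PMP_X);
    }
    return true; /* unknown access type: fail safe */
}
```

For example, a read-only region rejects a user-level store, but the same store from machine-level code passes unless the locked bit is set.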

`[IMPORTANT]`
After updating the address configuration registers `pmpaddr*` the system requires up to 33 cycles for
`internal (iterative) computations before the configuration becomes valid.`

`[NOTE]`
`For more information regarding RISC-V physical memory protection see the official _The RISC-V`	`// ####################################################################################################################`
`Instruction Set Manual - Volume II: Privileged Architecture_ specifications.`

	`include::cpu_cfu.adoc[]`



`// ####################################################################################################################`	`// ####################################################################################################################`
`:sectnums:`	`:sectnums:`
Line 800...	Line 846...
\| Memory access \| `I/E` \| `lb` `lh` `lw` `lbu` `lhu` `sb` `sh` `sw` \| 4 + ML	\| Memory access \| `I/E` \| `lb` `lh` `lw` `lbu` `lhu` `sb` `sh` `sw` \| 4 + ML
\| Memory access \| `C` \| `c.lw` `c.sw` `c.lwsp` `c.swsp` \| 4 + ML	\| Memory access \| `C` \| `c.lw` `c.sw` `c.lwsp` `c.swsp` \| 4 + ML
\| Memory access \| `A` \| `lr.w` `sc.w` \| 4 + ML	\| Memory access \| `A` \| `lr.w` `sc.w` \| 4 + ML
\| Multiplication \| `M` \| `mul` `mulh` `mulhsu` `mulhu` \| 2+32+2; FAST_MULfootnote:[DSP-based multiplication; enabled via `FAST_MUL_EN`.]: 4	\| Multiplication \| `M` \| `mul` `mulh` `mulhsu` `mulhu` \| 2+32+2; FAST_MULfootnote:[DSP-based multiplication; enabled via `FAST_MUL_EN`.]: 4
\| Division \| `M` \| `div` `divu` `rem` `remu` \| 2+32+2	\| Division \| `M` \| `div` `divu` `rem` `remu` \| 2+32+2
\| CSR access \| `Zicsr` \| `csrrw` `csrrs` `csrrc` `csrrwi` `csrrsi` `csrrci` \| 4	\| CSR access \| `Zicsr` \| `csrrw` `csrrs` `csrrc` `csrrwi` `csrrsi` `csrrci` \| 3
\| System \| `I/E`+`Zicsr` \| `ecall` `ebreak` \| 4
\| System \| `I/E` \| `fence` \| 3	\| System \| `I/E` \| `fence` \| 3
\| System \| `C`+`Zicsr` \| `c.break` \| 4	\| System \| `Zicsr` \| `ecall` `ebreak` \| 3
\| System \| `Zicsr` \| `mret` `wfi` \| 5	\| System \| `Zicsr`+`C` \| `c.break` \| 3
	\| System \| `Zicsr` \| `mret` `wfi` \| 6
\| System \| `Zifencei` \| `fence.i` \| 3 + ML	\| System \| `Zifencei` \| `fence.i` \| 3 + ML
\| Floating-point - arithmetic \| `Zfinx` \| `fadd.s` \| 110	\| Floating-point - arithmetic \| `Zfinx` \| `fadd.s` \| 110
\| Floating-point - arithmetic \| `Zfinx` \| `fsub.s` \| 112	\| Floating-point - arithmetic \| `Zfinx` \| `fsub.s` \| 112
\| Floating-point - arithmetic \| `Zfinx` \| `fmul.s` \| 22	\| Floating-point - arithmetic \| `Zfinx` \| `fmul.s` \| 22
\| Floating-point - compare \| `Zfinx` \| `fmin.s` `fmax.s` `feq.s` `flt.s` `fle.s` \| 13	\| Floating-point - compare \| `Zfinx` \| `fmin.s` `fmax.s` `feq.s` `flt.s` `fle.s` \| 13
Line 821...	Line 867...
\| Bit-manipulation - shifts \| `B(Zbb)` \| `cpop` \| 3 + 32	\| Bit-manipulation - shifts \| `B(Zbb)` \| `cpop` \| 3 + 32
\| Bit-manipulation - shifts \| `B(Zbb)` \| `rol` `ror` `rori` \| 3 + SA	\| Bit-manipulation - shifts \| `B(Zbb)` \| `rol` `ror` `rori` \| 3 + SA
\| Bit-manipulation - single-bit \| `B(Zbs)` \| `sbset[i]` `sbclr[i]` `sbinv[i]` `sbext[i]` \| 3	\| Bit-manipulation - single-bit \| `B(Zbs)` \| `sbset[i]` `sbclr[i]` `sbinv[i]` `sbext[i]` \| 3
\| Bit-manipulation - shifted-add \| `B(Zba)` \| `sh1add` `sh2add` `sh3add` \| 3	\| Bit-manipulation - shifted-add \| `B(Zba)` \| `sh1add` `sh2add` `sh3add` \| 3
\| Bit-manipulation - carry-less multiply \| `B(Zbc)` \| `clmul` `clmulh` `clmulr` \| 3 + 32	\| Bit-manipulation - carry-less multiply \| `B(Zbc)` \| `clmul` `clmulh` `clmulr` \| 3 + 32
\| CFU: custom instructions \| `Zxcfu` \| - \| min. 4	\| Custom instructions (CFU) \| `Zxcfu` \| - \| min. 4
	`\| \| \| \|`
	\| _Illegal instructions_ \| `Zicsr` \| - \| 2
`\|=======================`	`\|=======================`
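
As a rough aid for cycle estimates, some of the table rows that are identical in both revisions can be written as C helpers. This is only a sketch: the function names are hypothetical, and `ml` and `sa` are assumed to stand for the memory latency in cycles and the shift amount, which this excerpt does not define.

```c
/* Cycle estimates taken from the timing table above. */

/* lb/lh/lw/sb/sh/sw and lr.w/sc.w: 4 + ML */
unsigned cycles_mem_access(unsigned ml) { return 4u + ml; }

/* mul/mulh/mulhsu/mulhu: 2+32+2 cycles, or 4 cycles with FAST_MUL_EN */
unsigned cycles_mul(int fast_mul_en) { return fast_mul_en ? 4u : 2u + 32u + 2u; }

/* div/divu/rem/remu: 2+32+2 cycles */
unsigned cycles_div(void) { return 2u + 32u + 2u; }

/* rol/ror/rori: 3 + SA */
unsigned cycles_rotate(unsigned sa) { return 3u + sa; }
```

A bit-serial multiplication therefore costs 36 cycles, compared to 4 cycles for the DSP-based variant.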

`[NOTE]`	`[NOTE]`
`The presented values of the floating-point execution cycles are average values - obtained from`	`The presented values of the floating-point execution cycles are average values - obtained from`
`4096 instruction executions using pseudo-random input values. The execution time for emulating the`	`4096 instruction executions using pseudo-random input values. The execution time for emulating the`
Line 865...	Line 913...
`until it is explicitly acknowledged by the CPU software (for example by writing to a specific memory-mapped register).`	`until it is explicitly acknowledged by the CPU software (for example by writing to a specific memory-mapped register).`

`.Interrupt Signal Requirements - Fast Interrupt Requests`	`.Interrupt Signal Requirements - Fast Interrupt Requests`
`[IMPORTANT]`	`[IMPORTANT]`
`The NEORV32-specific FIRQ request lines are triggered by a one-shot high-level (i.e. rising edge). Each request is buffered in the CPU control`	`The NEORV32-specific FIRQ request lines are triggered by a one-shot high-level (i.e. rising edge). Each request is buffered in the CPU control`
`unit until the channel is either disabled (by clearing the according <<_mie>> CSR bit) or the request is explicitly cleared (by setting`	`unit until the channel is either disabled (by clearing the according <<_mie>> CSR bit) or the request is explicitly cleared (by writing`
`the according <<_mip>> CSR bit).`	`zero to the according <<_mip>> CSR bit).`
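
The buffering described above can be modeled in C as follows. This is a simplified sketch of the newer wording (a request is cleared by writing zero to its `mip` bit), not the actual control unit logic. The names are hypothetical, and the mapping of FIRQ channel n to bit 16+n of `mie`/`mip` is an assumption of this sketch.

```c
#include <stdint.h>

/* Assumption: FIRQ channel n uses bit 16+n of mie/mip. */
#define FIRQ_BIT(n) (UINT32_C(1) << (16u + (n)))

typedef struct {
    uint32_t mie;  /* interrupt enable bits */
    uint32_t mip;  /* buffered (pending) requests */
    uint16_t prev; /* firq_i as sampled in the previous cycle */
} firq_model_t;

/* One clock cycle: a rising edge on a channel sets its pending bit;
 * pending bits of disabled channels are dropped. */
void firq_clock(firq_model_t *m, uint16_t firq_i)
{
    uint16_t rising = (uint16_t)(firq_i & ~m->prev);
    m->prev = firq_i;
    m->mip |= (uint32_t)rising << 16;
    m->mip &= m->mie | UINT32_C(0x0000FFFF);
}

/* Software acknowledge: writing zero to a FIRQ bit of mip clears the
 * buffered request, writing one leaves it unchanged. */
void firq_write_mip(firq_model_t *m, uint32_t value)
{
    m->mip &= value | UINT32_C(0x0000FFFF);
}
```

In this model a request stays pending after the input line returns to zero, until software clears the bit or disables the channel.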

`.Instruction Atomicity`	`.Instruction Atomicity`
`[NOTE]`	`[NOTE]`
`All instructions execute as atomic operations - interrupts can only trigger _between_ two instructions.`	`All instructions execute as atomic operations - interrupts can only trigger _between_ two instructions.`
`So even if there is a permanent interrupt request, exactly one instruction from the interrupt program will be executed before`	`So even if there is a permanent interrupt request, exactly one instruction from the interrupt program will be executed before`
`another interrupt handler can start. This allows program progress even if there are permanent interrupt requests.`	`another interrupt handler can start. This allows program progress even if there are permanent interrupt requests.`


`:sectnums:`	`:sectnums:`
`==== Memory Access Exceptions`	`===== Memory Access Exceptions`

`If a load operation causes any exception, the instruction's destination register is`	`If a load operation causes any exception, the instruction's destination register is`
`_not written_ at all. Load exceptions caused by a misalignment or a physical memory protection fault do not`	`_not written_ at all. Load exceptions caused by a misalignment or a physical memory protection fault do not`
`trigger a bus/memory read-operation at all. Vice versa, exceptions caused by a store address misalignment or a store physical`	`trigger a bus/memory read-operation at all. Vice versa, exceptions caused by a store address misalignment or a store physical`
`memory protection fault do not trigger a bus/memory write-operation at all.`	`memory protection fault do not trigger a bus/memory write-operation at all.`
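
The behavior for loads can be illustrated with a small C model. The structure and function names are hypothetical, and the out-of-range check merely stands in for a bus error signalled by the accessed device.

```c
#include <stdbool.h>
#include <stdint.h>

#define MEM_SIZE 64u

typedef struct {
    uint32_t x[32];         /* register file */
    uint8_t  mem[MEM_SIZE]; /* tiny byte-addressed memory */
    unsigned bus_reads;     /* number of bus read operations issued */
} cpu_model_t;

/* Models `lw rd, 0(addr)`. Returns true on success and false if an
 * exception is raised; rd is never written in the exception case. */
bool model_lw(cpu_model_t *cpu, unsigned rd, uint32_t addr)
{
    if ((addr & 3u) != 0u) {
        return false; /* misaligned: no bus read at all, rd untouched */
    }
    cpu->bus_reads++;
    if (addr > MEM_SIZE - 4u) {
        return false; /* stands in for a bus error: rd untouched */
    }
    uint32_t value = (uint32_t)cpu->mem[addr]
                   | ((uint32_t)cpu->mem[addr + 1u] << 8)
                   | ((uint32_t)cpu->mem[addr + 2u] << 16)
                   | ((uint32_t)cpu->mem[addr + 3u] << 24);
    if (rd != 0u && rd < 32u) {
        cpu->x[rd] = value; /* x0 is hardwired to zero */
    }
    return true;
}
```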


`:sectnums:`	`:sectnums:`
`==== Custom Fast Interrupt Request Lines`	`===== Custom Fast Interrupt Request Lines`

As a custom extension, the NEORV32 CPU features 16 fast interrupt request (FIRQ) lines via the `firq_i` CPU top	As a custom extension, the NEORV32 CPU features 16 fast interrupt request (FIRQ) lines via the `firq_i` CPU top
`entity signals. These interrupts have custom configuration and status flags in the <<_mie>> and <<_mip>> CSRs and also`	`entity signals. These interrupts have custom configuration and status flags in the <<_mie>> and <<_mip>> CSRs and also`
`provide custom trap codes in <<_mcause>>. These FIRQs are reserved for NEORV32 processor-internal usage only.`	`provide custom trap codes in <<_mcause>>. These FIRQs are reserved for NEORV32 processor-internal usage only.`




`// ####################################################################################################################`
`:sectnums:`	`:sectnums:`
`==== NEORV32 Trap Listing`	`===== NEORV32 Trap Listing`

`The following table shows all traps that are currently supported by the NEORV32 CPU. It also shows the prioritization`	`The following table shows all traps that are currently supported by the NEORV32 CPU. It also shows the prioritization`
`and the CSR side-effects. A more detailed description of the actual trap triggering events is provided in a further table.`	`and the CSR side-effects. A more detailed description of the actual trap triggering events is provided in a further table.`

`[NOTE]`	`[NOTE]`
Line 909...	Line 954...

`Table Annotations`	`Table Annotations`

The "Prio." column shows the priority of each trap. The highest priority is 1. The "`mcause`" column shows the	The "Prio." column shows the priority of each trap. The highest priority is 1. The "`mcause`" column shows the
`cause ID of the according trap that is written to <<_mcause>> CSR. The "[RISC-V]" columns show the interrupt/exception code value from the`	`cause ID of the according trap that is written to <<_mcause>> CSR. The "[RISC-V]" columns show the interrupt/exception code value from the`
`official RISC-V privileged architecture manual. The "[C]" names are defined by the NEORV32 core library (the runtime environment _RTE_) and can`	`official RISC-V privileged architecture spec. The "ID [C]" names are defined by the NEORV32 core library (the runtime environment _RTE_) and can`
be used in plain C code. The "`mepc`" and "`mtval`" columns show the value written to	be used in plain C code. The "`mepc`" and "`mtval`" columns show the value written to <<_mepc>> and <<_mtval>> CSRs when a trap is triggered:
`<<_mepc>> and <<_mtval>> CSRs when a trap is triggered:`
	`* IPC - address of interrupted instruction (instruction has not been executed yet)`
`* _I-PC_ - address of interrupted instruction (instruction has not been executed/completed yet)`	`* PC - address of instruction that caused the trap`
`* _B-ADR_ - bad memory access address that caused the trap`	`* ADR - bad memory access address that caused the trap`
`* _PC_ - address of instruction that caused the trap`	`* INST - the faulting instruction word itself`
`* _0_ - zero`	`* 0 - zero`
`* _Inst_ - the faulting instruction itself`

`.NEORV32 Trap Listing`	`.NEORV32 Trap Listing`
`[cols="3,6,5,14,11,4,4"]`	`[cols="3,6,5,14,11,4,4"]`
`[options="header",grid="rows"]`	`[options="header",grid="rows"]`
`\|=======================`	`\|=======================`

Line 1...

:sectnums:

:sectnums:

== NEORV32 Central Processing Unit (CPU)

== NEORV32 Central Processing Unit (CPU)

image::neorv32_cpu_block.png[width=600,align=center]

image::neorv32_cpu_block.png[width=600,align=center]

**Section Structure**

* <<_architecture>>, <<_full_virtualization>> and <<_risc_v_compatibility>>

* <<_cpu_top_entity_signals>> and <<_cpu_top_entity_generics>>

* <<_instruction_sets_and_extensions>>, <<_custom_functions_unit_cfu>> and <<_instruction_timing>>

* <<_control_and_status_registers_csrs>>

* <<_traps_exceptions_and_interrupts>>

* <<_bus_interface>>

**Key Features**

**Key Features**

* 32-bit multi-cycle in-order `rv32` RISC-V CPU

* 32-bit little-endian, multi-cycle, in-order `rv32` RISC-V CPU

* Optional RISC-V extensions:

* Compatible to the RISC-V **Privileged Architecture - Machine ISA Version 1.12** specifications

* Available <<_instruction_sets_and_extensions>>:

** `A` - atomic memory access operations

** `A` - atomic memory access operations

** `B` - bit-manipulation instructions

** `B` - bit-manipulation instructions

** `C` - 16-bit compressed instructions

** `C` - 16-bit compressed instructions

** `I` - integer base ISA (always enabled)

** `I` - integer base ISA (always enabled)

** `E` - embedded CPU version (reduced register file size)

** `E` - embedded CPU version (reduced register file size)

Line 20...

Line 31...

** `Zihpm` - hardware performance monitors

** `Zihpm` - hardware performance monitors

** `Zifencei` - instruction stream synchronization

** `Zifencei` - instruction stream synchronization

** `Zmmul` - integer multiplication hardware

** `Zmmul` - integer multiplication hardware

** `Zxcfu` - custom instructions extension

** `Zxcfu` - custom instructions extension

** `PMP` - physical memory protection

** `PMP` - physical memory protection

** `Debug` - debug mode (part of the on-chip debugger) including hardware trigger module

** `Debug` - <<_cpu_debug_mode>> (part of the on-chip debugger) including hardware <<_trigger_module>>

* Compatible to the RISC-V user specifications and a subset of the RISC-V privileged architecture specifications - passes the official RISC-V Architecture Tests (v2+)

* <<_risc_v_compatibility>>: Compatible to the RISC-V user specifications and a subset of the RISC-V privileged architecture specifications - passes the official RISC-V Architecture Tests (v2+)

* Official RISC-V open-source architecture ID

* Official RISC-V open-source architecture ID

* Standard RISC-V interrupts (_external_, _timer_, _software_) plus 16 _fast_ interrupts

* Supports _all_ of the machine-level <<_traps_exceptions_and_interrupts>> from the RISC-V specifications (including bus access exceptions and all unimplemented/illegal/malformed instructions)

* Supports _all_ of the machine-level traps from the RISC-V specifications (including bus access exceptions and all unimplemented/illegal/malformed instructions)

** This is a special aspect on _execution safety_ by <<_full_virtualization>>

** This is a special aspect on _execution safety_ by <<_full_virtualization>>

** Standard RISC-V interrupts (_external_, _timer_, _software_) plus 16 custom _fast_ interrupts

* Optional physical memory configuration (PMP), compatible to the RISC-V specifications

* Optional physical memory configuration (PMP), compatible to the RISC-V specifications

* Optional hardware performance monitors (HPM) for application benchmarking

* Optional hardware performance monitors (HPM) for application benchmarking

* Separated interfaces for instruction fetch and data access (merged into a single processor bus)

* Separated <<_bus_interface>>s for instruction fetch and data access

* little-endian byte order

* Configurable hardware reset

* No hardware support of unaligned data/instruction accesses - they will trigger an exception.

[NOTE]

[NOTE]

It is recommended to use the **NEORV32 Processor** as default top instance even if you only want to use the actual

It is recommended to use the **NEORV32 Processor** as default top instance even if you only want to use the actual

CPU. Simply disable all the processor-internal modules via the generics and you will get a "CPU

CPU. Simply disable all the processor-internal modules via the generics and you will get a "CPU

wrapper" that provides a minimal CPU environment and an external bus interface (like AXI4). This

wrapper" that provides a minimal CPU environment and an external bus interface (like AXI4). This

Line 249...

Line 257...

:sectnums:

:sectnums:

==== RISC-V Incompatibility Issues and Limitations

==== RISC-V Incompatibility Issues and Limitations

This list shows the currently identified issues regarding full RISC-V-compatibility. More specific information

This list shows the currently identified issues regarding full RISC-V-compatibility.

can be found in section <<_instruction_sets_and_extensions>>.

.Read-Only "Read-Write" CSRs

.Read-Only "Read-Write" CSRs

[IMPORTANT]

[IMPORTANT]

The <<_misa>> and <<_mtval>> CSRs in the NEORV32 are _read-only_.

The <<_misa>> and <<_mtval>> CSRs in the NEORV32 are _read-only_.

Any machine-mode write access to them is ignored and will _not_ cause any exceptions or side-effects to maintain

Any machine-mode write access to them is ignored and will _not_ cause any exceptions or

RISC-V compatibility.

side-effects to maintain RISC-V compatibility.

.Physical Memory Protection

.Physical Memory Protection

[IMPORTANT]

[IMPORTANT]

The physical memory protection (see section <<_machine_physical_memory_protection_csrs>>)

The RISC-V-compatible NEORV32 <<_machine_physical_memory_protection_csrs>> only implements the **TOR**

only supports the modes _OFF_ and _NAPOT_ yet and a minimal granularity of 8 bytes per region.

(top of region) mode and only up to 16 PMP regions. Furthermore, the <<_pmpcfg>>'s _lock bits_ only lock

the according PMP entry and not the entries below. All region rules are checked in parallel **without**

prioritization so for identical memory regions the most restrictive PMP rule will be enforced.
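
The parallel, non-prioritized check can be sketched in C. The TOR address matching and the `pmpcfg` bit layout follow the RISC-V privileged specification; the function name, the return convention for "no match" and the loop (which is parallel logic in hardware) are assumptions of this sketch.

```c
#include <stdint.h>

#define PMP_R   0x01u
#define PMP_W   0x02u
#define PMP_X   0x04u
#define PMP_A   0x18u /* address-matching mode field */
#define PMP_TOR 0x08u /* A = 1: top of region */

/* Returns the R/W/X bits granted for `addr`, or 0xFF if no region
 * matches. pmpaddr[] holds address bits 33:2 as in the RISC-V spec. */
uint8_t pmp_tor_rights(const uint8_t cfg[], const uint32_t pmpaddr[],
                       unsigned num_regions, uint32_t addr)
{
    uint8_t rights = 0xFFu;
    uint32_t a = addr >> 2;
    for (unsigned i = 0; i < num_regions; i++) {
        /* TOR: region i spans pmpaddr[i-1] (or 0) up to pmpaddr[i] */
        uint32_t lo = (i == 0u) ? 0u : pmpaddr[i - 1u];
        if ((cfg[i] & PMP_A) == PMP_TOR && a >= lo && a < pmpaddr[i]) {
            /* no prioritization: intersect the rights of all matches */
            rights &= (uint8_t)(cfg[i] & (PMP_R | PMP_W | PMP_X));
        }
    }
    return rights;
}
```

If two entries describe the same address range, one with read/write and one with read-only rights, the intersection leaves only the read permission, which is the most restrictive rule.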

.Atomic Memory Operations

.Atomic Memory Operations

[IMPORTANT]

[IMPORTANT]

The `A` CPU extension only implements the `lr.w` and `sc.w` instructions yet.

The `A` CPU extension only implements the `lr.w` and `sc.w` instructions yet.

However, these instructions are sufficient to emulate all further atomic memory operations.

However, these instructions are sufficient to emulate all further atomic memory operations.
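
The emulation idea can be illustrated with a host-runnable C11 sketch. Here the atomic load stands in for `lr.w` and the compare-exchange stands in for `sc.w`; this is an analogy and not identical to real load-reservation semantics. The function name is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Emulates an atomic fetch-and-add (comparable to amoadd.w) with a
 * retry loop. A failed exchange corresponds to a failed
 * store-conditional and triggers another attempt. */
uint32_t emulated_amoadd(_Atomic uint32_t *addr, uint32_t value)
{
    uint32_t old = atomic_load(addr);                /* "lr.w" */
    while (!atomic_compare_exchange_weak(addr, &old, /* "sc.w" */
                                         old + value)) {
        /* `old` now holds the reloaded value; try again */
    }
    return old; /* value before the addition */
}
```

Other read-modify-write operations (swap, and, or, min, max) follow the same pattern with a different update expression.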

.No HW-Support of Misaligned Memory Accesses

[WARNING]

The CPU does not support the resolution of unaligned memory access by the hardware. This is not a

RISC-V-compatibility issue but an important thing to know. Any kind of unaligned memory access

will raise an exception to allow a software-based emulation.
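
A software-based emulation of a misaligned word load could, for example, assemble the value from four byte accesses, which are always aligned. The following C sketch shows only this core step; the function name is hypothetical and the surrounding trap handling (decoding the instruction, writing the destination register, advancing `mepc`) is omitted.

```c
#include <stdint.h>

/* Assembles a 32-bit little-endian word from four single-byte loads. */
uint32_t emulate_unaligned_lw(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr]
         | ((uint32_t)mem[addr + 1u] << 8)
         | ((uint32_t)mem[addr + 2u] << 16)
         | ((uint32_t)mem[addr + 3u] << 24);
}
```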

// ####################################################################################################################

// ####################################################################################################################

:sectnums:

:sectnums:

=== CPU Top Entity - Signals

=== CPU Top Entity - Signals

Line 282...

Line 297...

.NEORV32 CPU top entity signals

.NEORV32 CPU top entity signals

[cols="<2,^1,^1,<6"]

[cols="<2,^1,^1,<6"]

[options="header", grid="rows"]

[options="header", grid="rows"]

|=======================

|=======================

| Signal           | Width | Dir.   | Function

| Signal           | Width | Dir. | Description

4+^| **Global Signals**

4+^| **Global Signals**

| `clk_i`          |     1 | in  | global clock line, all registers triggering on rising edge

| `clk_i`          |     1 | in  | global clock line, all registers triggering on rising edge

| `rstn_i`         |     1 | in  | global reset, low-active

| `rstn_i`         |     1 | in  | global reset, low-active

| `sleep_o`        |     1 | out | CPU is in sleep mode when set

| `sleep_o`        |     1 | out | CPU is in sleep mode when set

| `debug_o`        |     1 | out | CPU is in debug mode when set

| `debug_o`        |     1 | out | CPU is in debug mode when set

4+^| **Instruction Bus Interface (<<_bus_interface>>)**

4+^| **Instruction <<_bus_interface>>**

| `i_bus_addr_o`   |    32 | out | destination address

| `i_bus_addr_o`   |    32 | out | access address

| `i_bus_rdata_i`  |    32 | in  | read data

| `i_bus_rdata_i`  |    32 | in  | read data

| `i_bus_wdata_o`  |    32 | out | write data (always zero)

| `i_bus_wdata_o`  |    32 | out | write data (always zero)

| `i_bus_ben_o`    |     4 | out | byte enable

| `i_bus_ben_o`    |     4 | out | byte enable

| `i_bus_we_o`     |     1 | out | write transaction (always zero)

| `i_bus_we_o`     |     1 | out | write transaction (always zero)

| `i_bus_re_o`     |     1 | out | read transaction

| `i_bus_re_o`     |     1 | out | read transaction

| `i_bus_lock_o`   |     1 | out | exclusive access request (always zero)

| `i_bus_lock_o`   |     1 | out | exclusive access request (always zero)

| `i_bus_ack_i`    |     1 | in  | bus transfer acknowledge from accessed peripheral

| `i_bus_ack_i`    |     1 | in  | bus transfer acknowledge from accessed peripheral

| `i_bus_err_i`    |     1 | in  | bus transfer terminate from accessed peripheral

| `i_bus_err_i`    |     1 | in  | bus transfer terminate from accessed peripheral

| `i_bus_fence_o`  |     1 | out | indicates an executed _fence.i_ instruction

| `i_bus_fence_o`  |     1 | out | indicates an executed `fence.i` instruction

| `i_bus_priv_o`   |     2 | out | current CPU privilege level

| `i_bus_priv_o`   |     1 | out | current _effective_ CPU privilege level (`0` user, `1` machine or debug)

4+^| **Data Bus Interface (<<_bus_interface>>)**

4+^| **Data <<_bus_interface>>**

| `d_bus_addr_o`   |    32 | out | destination address

| `d_bus_addr_o`   |    32 | out | access address

| `d_bus_rdata_i`  |    32 | in  | read data

| `d_bus_rdata_i`  |    32 | in  | read data

| `d_bus_wdata_o`  |    32 | out | write data

| `d_bus_wdata_o`  |    32 | out | write data

| `d_bus_ben_o`    |     4 | out | byte enable

| `d_bus_ben_o`    |     4 | out | byte enable

| `d_bus_we_o`     |     1 | out | write transaction

| `d_bus_we_o`     |     1 | out | write transaction

| `d_bus_re_o`     |     1 | out | read transaction

| `d_bus_re_o`     |     1 | out | read transaction

| `d_bus_lock_o`   |     1 | out | exclusive access request

| `d_bus_lock_o`   |     1 | out | exclusive access request

| `d_bus_ack_i`    |     1 | in  | bus transfer acknowledge from accessed peripheral

| `d_bus_ack_i`    |     1 | in  | bus transfer acknowledge from accessed peripheral

| `d_bus_err_i`    |     1 | in  | bus transfer terminate from accessed peripheral

| `d_bus_err_i`    |     1 | in  | bus transfer terminate from accessed peripheral

| `d_bus_fence_o`  |     1 | out | indicates an executed _fence_ instruction

| `d_bus_fence_o`  |     1 | out | indicates an executed `fence` instruction

| `d_bus_priv_o`   |     2 | out | current CPU privilege level

| `d_bus_priv_o`   |     1 | out | current _effective_ CPU privilege level (`0` user, `1` machine or debug)

4+^| **System Time (see <<_timeh>> CSR)**

4+^| **System Time (for <<_timeh>> CSR)**

| `time_i`         |    64 | in  | system time input (from MTIME)

| `time_i`         |    64 | in  | system time input from <<_machine_system_timer_mtime>>

4+^| **Interrupts, RISC-V-compatible (<<_traps_exceptions_and_interrupts>>)**

4+^| **Interrupts, RISC-V-compatible (<<_traps_exceptions_and_interrupts>>)**

| `msw_irq_i`      |     1 | in  | RISC-V machine software interrupt

| `msw_irq_i`      |     1 | in  | RISC-V machine software interrupt

| `mext_irq_i`     |     1 | in  | RISC-V machine external interrupt

| `mext_irq_i`     |     1 | in  | RISC-V machine external interrupt

| `mtime_irq_i`    |     1 | in  | RISC-V machine timer interrupt

| `mtime_irq_i`    |     1 | in  | RISC-V machine timer interrupt

4+^| **Fast Interrupts, NEORV32-specific (<<_traps_exceptions_and_interrupts>>)**

4+^| **Interrupts, NEORV32-specific (<<_traps_exceptions_and_interrupts>>)**

| `firq_i`         |    16 | in  | fast interrupt request signals

| `firq_i`         |    16 | in  | fast interrupt request signals

4+^| **Enter Debug Mode Request (<<_on_chip_debugger_ocd>>)**

4+^| **Enter Debug Mode Request (<<_on_chip_debugger_ocd>>)**

| `db_halt_req_i`  |     1 | in  | request CPU to halt and enter debug mode

| `db_halt_req_i`  |     1 | in  | request CPU to halt and enter debug mode

|=======================

|=======================

Line 337...

Line 352...

The _specific_ generics are listed below.

The _specific_ generics are listed below.

[cols="4,4,2"]

[cols="4,4,2"]

[frame="all",grid="none"]

[frame="all",grid="none"]

|======

|======

| **CPU_BOOT_ADDR** | _std_ulogic_vector(31 downto 0)_ | -

| **CPU_BOOT_ADDR** | _std_ulogic_vector(31 downto 0)_ | _no default value_

3+| This address defines the reset address at which the CPU starts fetching instructions after reset. In terms of the NEORV32 processor, this

3+| This address defines the reset address at which the CPU starts fetching instructions after reset. In terms of the NEORV32 processor, this

generic is configured with the base address of the bootloader ROM (default) or with the base address of the processor-internal instruction

generic is configured with the base address of the bootloader ROM (default) or with the base address of the processor-internal instruction

memory (IMEM) if the bootloader is disabled (_INT_BOOTLOADER_EN_ = _false_). See section <<_address_space>> for more information.

memory (IMEM) if the bootloader is disabled (_INT_BOOTLOADER_EN_ = _false_). See section <<_address_space>> for more information.

|======

|======

[cols="4,4,2"]

[cols="4,4,2"]

[frame="all",grid="none"]

[frame="all",grid="none"]

|======

|======

| **CPU_DEBUG_ADDR** | _std_ulogic_vector(31 downto 0)_ | -

| **CPU_DEBUG_ADDR** | _std_ulogic_vector(31 downto 0)_ | _no default value_

3+| This address defines the entry address for the "execution based" on-chip debugger. By default, this generic is configured with the base address

3+| This address defines the entry address for the "execution based" on-chip debugger. By default, this generic is configured with the base address

of the debugger memory. See section <<_on_chip_debugger_ocd>> for more information.

of the debugger memory. See section <<_on_chip_debugger_ocd>> for more information.

|======

|======

[cols="4,4,2"]

[cols="4,4,2"]

[frame="all",grid="none"]

[frame="all",grid="none"]

|======

|======

| **CPU_EXTENSION_RISCV_DEBUG** | _boolean_ | -

| **CPU_EXTENSION_RISCV_DEBUG** | _boolean_ | _no default value_

3+| Implement RISC-V-compatible "debug" CPU operation mode. See section <<_cpu_debug_mode>> for more information.

3+| Implement RISC-V-compatible "debug" CPU operation mode. See section <<_cpu_debug_mode>> for more information.

|======

|======

Line 513...

Line 528...

* multiplication: `mul` `mulh` `mulhsu` `mulhu`

* multiplication: `mul` `mulh` `mulhsu` `mulhu`

* division: `div` `divu` `rem` `remu`

* division: `div` `divu` `rem` `remu`

[NOTE]

[NOTE]

By default, multiplication and division operations are executed in a bit-serial approach.

By default, multiplication and division operations are executed in a bit-serial approach.

Alternatively, the multiplier core can be implemented using DSP blocks if the `FAST_MUL_EN`

Alternatively, the multiplier core can be implemented using DSP blocks if the <<_fast_mul_en>>

generic is _true_ allowing faster execution. Multiplications and divisions

generic is _true_ allowing faster execution. Multiplications and divisions

always require a fixed amount of cycles to complete - regardless of the input operands.

always require a fixed amount of cycles to complete - regardless of the input operands.

[NOTE]

Regardless of the setting of the <<_fast_mul_en>> generic

multiplication and division instructions operate _independently_ of the input operands.

Hence, there is **no early completion** of multiply by one/zero and divide by zero operations.
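
The operand-independent timing of a bit-serial approach can be illustrated with a shift-and-add multiplication in C. This is a behavioral sketch with hypothetical names, not a description of the actual VHDL implementation.

```c
#include <stdint.h>

/* Bit-serial (shift-and-add) multiplication: always 32 iterations,
 * regardless of the operand values. Returns the low 32 bits of the
 * product (like `mul`); the iteration count is written to *steps. */
uint32_t serial_mul(uint32_t a, uint32_t b, unsigned *steps)
{
    uint32_t product = 0;
    *steps = 0;
    for (unsigned i = 0; i < 32u; i++) {
        if ((b >> i) & 1u) {
            product += a << i; /* wraps modulo 2^32 */
        }
        (*steps)++; /* no early exit, even if the remaining bits are zero */
    }
    return product;
}
```

Multiplying by zero or one takes the same 32 steps as any other operand pair.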

==== **`Zmmul`** - Integer Multiplication

==== **`Zmmul`** - Integer Multiplication

This is a _sub-extension_ of the `M` ISA extension. It implements the multiplication-only operations

This is a _sub-extension_ of the `M` ISA extension. It implements the multiplication-only operations

of the `M` extensions and is intended for size-constrained setups that require hardware-based

of the `M` extensions and is intended for size-constrained setups that require hardware-based

Line 547...

Line 567...

operation mode. It is implemented if the <<_cpu_extension_riscv_u>> configuration generic is _true_.

operation mode. It is implemented if the <<_cpu_extension_riscv_u>> configuration generic is _true_.

Code executed in user-mode cannot access machine-mode CSRs. Furthermore, user-mode access to the address space (like

Code executed in user-mode cannot access machine-mode CSRs. Furthermore, user-mode access to the address space (like

peripheral/IO devices) can be constrained via the physical memory protection (_PMP_).

peripheral/IO devices) can be constrained via the physical memory protection (_PMP_).

Any kind of privilege rights violation will raise an exception to allow <<_full_virtualization>>.

Any kind of privilege rights violation will raise an exception to allow <<_full_virtualization>>.

Additional CSRs:

* <<_mcounteren>> - machine counter enable to constrain user-mode access to timer/counter CSRs

==== **`X`** - NEORV32-Specific (Custom) Extensions

==== **`X`** - NEORV32-Specific (Custom) Extensions

The NEORV32-specific extensions are always enabled and are indicated by the set `X` bit in the <<_misa>> CSR.

The NEORV32-specific extensions are always enabled and are indicated by the set `X` bit in the <<_misa>> CSR.

The most important points of the NEORV32-specific extensions are:

The most important points of the NEORV32-specific extensions are:

* The CPU provides 16 _fast interrupt_ interrupts (`FIRQ)`, which are controlled via custom bits in the `mie`

* The CPU provides 16 _fast interrupt_ interrupts (`FIRQ`), which are controlled via custom bits in the <<_mie>>

and <<_mip>> CSR. This extension is mapped to CSR bits, that are available for custom use (according to the

and <<_mip>> CSRs. These extensions are mapped to CSR bits, that are available for custom use according to the

RISC-V specs). Also, custom trap codes for <<_mcause>> are implemented.

RISC-V specs. Also, custom trap codes for <<_mcause>> are implemented.

* All undefined/unimplemented/malformed/illegal instructions do raise an illegal instruction exception (see <<_full_virtualization>>).

* All undefined/unimplemented/malformed/illegal instructions do raise an illegal instruction exception (see <<_full_virtualization>>).

* There are <<_neorv32_specific_csrs>>.

* There are <<_neorv32_specific_csrs>>.

==== **`Zfinx`** Single-Precision Floating-Point Operations

==== **`Zfinx`** Single-Precision Floating-Point Operations

Line 582...

Line 606...

* comparison: `fmin.s` `fmax.s` `feq.s` `flt.s` `fle.s`

* comparison: `fmin.s` `fmax.s` `feq.s` `flt.s` `fle.s`

* computational: `fadd.s` `fsub.s` `fmul.s`

* computational: `fadd.s` `fsub.s` `fmul.s`

* sign-injection: `fsgnj.s` `fsgnjn.s` `fsgnjx.s`

* sign-injection: `fsgnj.s` `fsgnjn.s` `fsgnjx.s`

* number classification: `fclass.s`

* number classification: `fclass.s`

* additional CSRs: <<_fcsr>>, <<_frm>>, <<_fflags>>

* compressed instructions: `c.flw` `c.flwsp` `c.fsw` `c.fswsp`

Additional CSRs:

* <<_fcsr>> - FPU control register

* <<_frm>> - rounding mode control

* <<_fflags>> - FPU status flags

[WARNING]

[WARNING]

Fused multiply-add instructions `f[n]m[add/sub].s` are not supported!

Fused multiply-add instructions `f[n]m[add/sub].s` are not supported!

Division `fdiv.s` and square root `fsqrt.s` instructions are not supported yet!

Division `fdiv.s` and square root `fsqrt.s` instructions are not supported yet!

Line 621...

Line 651...

[NOTE]

[NOTE]

If `rd=x0` for the `csrrw[i]` instructions there will be no actual read access to the according CSR.

If `rd=x0` for the `csrrw[i]` instructions there will be no actual read access to the according CSR.

However, access privileges are still enforced so these instruction variants _do_ cause side-effects

However, access privileges are still enforced so these instruction variants _do_ cause side-effects

(the RISC-V spec. states that these combinations "_shall_ not cause any side-effects").

(the RISC-V spec. states that these combinations "_shall_ not cause any side-effects").

[NOTE]

**`wfi` Instruction**

The "wait for interrupt instruction" `wfi` acts like a sleep command. When executed, the CPU is

The "wait for interrupt instruction" `wfi` acts like a sleep command. When executed, the CPU is

halted until a valid interrupt request occurs. To wake up again, the according interrupt source has to

halted until a valid interrupt request occurs. To wake up again, at least one interrupt source has to

be enabled via the <<_mie>> CSR and the global interrupt enable flag in <<_mstatus>> has to be set.

be enabled via the <<_mie>> CSR and the global interrupt enable flag in <<_mstatus>> has to be set.

The `wfi` instruction may also be executed in user-mode without causing an exception as <<_mstatus>> bit

`TW` (timeout wait) is _hardwired_ to zero.

If the <<_mstatus>> `TW` bit is cleared the `wfi` instruction is also allowed to execute when in user-mode.

This is always the case if user-mode is not implemented. If the `TW` bit is set the execution of `wfi` in

user-mode will raise an illegal instruction exception.
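
The user-mode rule described above can be summarized as a small decision function. The following is a minimal sketch in C, not NEORV32 source code: `wfi_is_illegal`, `priv_t` and `MSTATUS_TW_BIT` are hypothetical names chosen for illustration, and the bit position 21 of `TW` is taken from the RISC-V privileged specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Position of the TW (timeout wait) bit in mstatus (RISC-V privileged spec). */
#define MSTATUS_TW_BIT 21u

/* Privilege levels relevant here (hypothetical type, encoding as in RISC-V). */
typedef enum { PRIV_USER = 0, PRIV_MACHINE = 3 } priv_t;

/* Returns true if executing `wfi` would raise an illegal instruction
 * exception for the given privilege level and mstatus value. */
bool wfi_is_illegal(priv_t priv, uint32_t mstatus)
{
    assert(priv == PRIV_USER || priv == PRIV_MACHINE);
    bool tw_set = ((mstatus >> MSTATUS_TW_BIT) & 1u) != 0u;
    /* Machine-mode may always execute wfi; user-mode only if TW is cleared. */
    return (priv == PRIV_USER) && tw_set;
}
```

If user-mode is not implemented, only `PRIV_MACHINE` ever occurs and the function always returns `false`.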

==== **`Zicntr`** CPU Base Counters

==== **`Zicntr`** CPU Base Counters

The `Zicntr` ISA extension adds the basic cycle (`[m]cycle[h]`), instruction-retired (`[m]instret[h]`) and time (`time[h]`)

The `Zicntr` ISA extension adds the basic cycle (`[m]cycle[h]`), instruction-retired (`[m]instret[h]`) and time (`time[h]`)

counters. This extension is stated as _mandatory_ by the RISC-V spec. However, size-constrained setups may remove support for

counters. This extension is stated as _mandatory_ by the RISC-V spec. However, size-constrained setups may remove support for

these counters. Section <<_machine_counter_and_timer_csrs>> shows a list of all `Zicntr`-related CSRs.

these counters. Section <<_machine_counter_and_timer_csrs>> shows a list of all `Zicntr`-related CSRs.

These are available if the `Zicntr` ISA extension is enabled via the <<_cpu_extension_riscv_zicntr>> generic.

These are available if the `Zicntr` ISA extension is enabled via the <<_cpu_extension_riscv_zicntr>> generic.

Additional CSRs:

* <<_cycleh>>, <<_mcycleh>> - cycle counter

* <<_instreth>>, <<_minstreth>> - instructions-retired counter

* <<_timeh>> - system _wall-clock_ time

[NOTE]

[NOTE]

Disabling the `Zicntr` extension does not remove the `time[h]`-driving MTIME unit.

Disabling the `Zicntr` extension does not remove the `time[h]`-driving MTIME unit.

If `Zicntr` is disabled, all accesses to the according counter CSRs will raise an illegal instruction exception.

If `Zicntr` is disabled, all accesses to the according counter CSRs will raise an illegal instruction exception.
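
Because each 64-bit counter is split into a low-word and a high-word CSR, software has to guard against the low word overflowing between the two reads. The sketch below shows the common high-low-high read sequence. It is an illustration only: the CSR reads are abstracted behind function pointers so the code also runs on a host, and all names (`read_counter64`, `csr_read_fn`, `sim_*`) are hypothetical. On the target, each accessor would wrap a single `csrr` instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accessor type: returns the current value of one 32-bit counter CSR. */
typedef uint32_t (*csr_read_fn)(void);

/* Read a 64-bit counter that is split into a low-word and a high-word CSR.
 * The high word is read twice; if it changed, the low word wrapped around
 * in between and the whole sequence is repeated. */
uint64_t read_counter64(csr_read_fn read_lo, csr_read_fn read_hi)
{
    assert(read_lo != 0 && read_hi != 0);
    uint32_t hi, lo, hi_check;
    do {
        hi       = read_hi();
        lo       = read_lo();
        hi_check = read_hi();
    } while (hi != hi_check);
    return ((uint64_t)hi << 32) | lo;
}

/* Host-side stand-in for a counter: it advances by one on every low-word read
 * and starts right before a low-word overflow. */
static uint64_t sim_counter = UINT64_C(0x00000000FFFFFFFF);

uint32_t sim_read_lo(void)
{
    uint32_t value = (uint32_t)sim_counter;
    sim_counter += 1;
    return value;
}

uint32_t sim_read_hi(void)
{
    return (uint32_t)(sim_counter >> 32);
}
```

With the simulated counter, the first pass observes a changed high word and is discarded; the second pass returns a consistent 64-bit value.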

==== **`Zihpm`** Hardware Performance Monitors

==== **`Zihpm`** Hardware Performance Monitors

In addition to the base cycle, instructions-retired and time counters the NEORV32 CPU provides

In addition to the base cycle, instructions-retired and time counters the NEORV32 CPU provides

up to 29 hardware performance monitors (HPM 3..31), which can be used to benchmark applications. Each HPM consists of an

up to 29 hardware performance monitors (HPM 3..31), which can be used to benchmark applications. Each HPM consists of an

N-bit wide counter (split in a high-word 32-bit CSR and a low-word 32-bit CSR), where N is defined via the top's

N-bit wide counter (split in a high-word 32-bit CSR and a low-word 32-bit CSR), where N is defined via the top's

<<_hpm_cnt_width>> generic (0..64-bit) and a corresponding event configuration CSR. The event configuration

<<_hpm_cnt_width>> generic (0..64-bit) and a corresponding event configuration CSR. The event configuration

CSR defines the architectural events that lead to an increment of the associated HPM counter.

CSR defines the architectural events that lead to an increment of the associated HPM counter.

The HPM counters are available if the `Zihpm` ISA extension is enabled via the <<_cpu_extension_riscv_zihpm>> generic.

The HPM counters are available if the `Zihpm` ISA extension is enabled via the <<_cpu_extension_riscv_zihpm>> generic.

The actual number of implemented HPM counters is defined by the <<_hpm_num_cnts>> generic.

Depending on the configuration the following additional CSRs are available:

Additional CSRs:

* counters: `mhpmcounter*[h]` (3..31, depending on `HPM_NUM_CNTS`)

* <<_mhpmevent>> 3..31 (depending on <<_hpm_num_cnts>>) - event configuration CSRs

* event configuration: `mhpmevent*` (3..31, depending on `HPM_NUM_CNTS`)

* <<_mhpmcounterh>> 3..31 (depending on <<_hpm_num_cnts>>) - counter CSRs

[IMPORTANT]

[IMPORTANT]

The HPM counter CSR can only be accessed in machine-mode. Hence, the according <<_mcounteren>> CSR bits

The HPM counter CSRs can only be accessed in machine-mode. Hence, the according <<_mcounteren>> CSR bits

are always zero and read-only. Any access from less-privileged modes will raise an illegal instruction

are always zero and read-only. Any access from less-privileged modes will raise an illegal instruction

exception.

exception.

[TIP]

[TIP]

Auto-increment of the HPMs can be individually deactivated via the <<_mcountinhibit>> CSR.

Auto-increment of the HPMs can be deactivated individually via the <<_mcountinhibit>> CSR.

[TIP]

For a list of all HPM-related CSRs and all provided event configurations

see section <<_hardware_performance_monitors_hpm>>.
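
The numbering of the HPM CSRs (3..31) and the configurable counter width can be illustrated with a few helper functions. This is a sketch under stated assumptions: the CSR base addresses are those of the RISC-V privileged specification, the helper names are hypothetical, and `hpm_value_mask` assumes that counter bits above the configured width simply read as zero.

```c
#include <assert.h>
#include <stdint.h>

/* CSR base addresses as defined by the RISC-V privileged specification. */
#define CSR_MHPMCOUNTER3  0xB03u
#define CSR_MHPMCOUNTER3H 0xB83u
#define CSR_MHPMEVENT3    0x323u

/* CSR address of mhpmcounter<n> (low word), n = 3..31. */
uint32_t hpm_counter_csr(unsigned n)
{
    assert(n >= 3u && n <= 31u);
    return CSR_MHPMCOUNTER3 + (n - 3u);
}

/* CSR address of mhpmcounter<n>h (high word), n = 3..31. */
uint32_t hpm_counterh_csr(unsigned n)
{
    assert(n >= 3u && n <= 31u);
    return CSR_MHPMCOUNTER3H + (n - 3u);
}

/* CSR address of mhpmevent<n>, n = 3..31. */
uint32_t hpm_event_csr(unsigned n)
{
    assert(n >= 3u && n <= 31u);
    return CSR_MHPMEVENT3 + (n - 3u);
}

/* Value mask of an N-bit wide HPM counter (N = 0..64).
 * Assumption: bits above the configured width read as zero. */
uint64_t hpm_value_mask(unsigned width)
{
    assert(width <= 64u);
    if (width == 0u)  return 0;
    if (width == 64u) return UINT64_MAX;
    return (UINT64_C(1) << width) - 1u;
}
```

For example, a counter width of 40 bits would give a counter that wraps after `hpm_value_mask(40) + 1` events.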

==== **`Zifencei`** Instruction Stream Synchronization

==== **`Zifencei`** Instruction Stream Synchronization

The `Zifencei` CPU extension is implemented if the <<_cpu_extension_riscv_zifencei>> configuration

The `Zifencei` CPU extension is implemented if the <<_cpu_extension_riscv_zifencei>> configuration

Line 711...

Line 745...

the CPU take a look at the memory-mapped <<_custom_functions_subsystem_cfs>>.

the CPU take a look at the memory-mapped <<_custom_functions_subsystem_cfs>>.

==== **`PMP`** Physical Memory Protection

==== **`PMP`** Physical Memory Protection

The NEORV32 physical memory protection (PMP) is compatible to the RISC-V PMP specifications. It can be used

The NEORV32 physical memory protection (PMP) provides an elementary memory protection mechanism that can be used

to constrain memory read/write/execute rights for each available privilege level.

to constrain read, write and execute rights of arbitrary memory regions. The PMP is compatible

to the _RISC-V Privileged Architecture Specifications_. For detailed information see the according spec.'s sections.

[IMPORTANT]

The NEORV32 PMP only supports **TOR** (top of region) mode, which basically is a "base-and-bound" concept, and only

up to 16 PMP regions.

The physical memory protection logic is implemented if the <<_pmp_num_regions>> configuration generic is greater

than zero. This generic also defines the total number of available configurable protection

regions. The minimal granularity of a protected region is defined by the <<_pmp_min_granularity>> generic. Larger

granularity will reduce hardware complexity but will also make the protection coarser as the minimal region size increases.

The default value is 4 bytes, which allows a minimal region size of 4 bytes.

If implemented the PMP provides the following additional CSRs:

* <<_pmpcfg>> 0..3 (depending on configuration) - PMP configuration registers, 4 entries per CSR

* <<_pmpaddr>> 0..15 (depending on configuration) - PMP address registers

The NEORV32 PMP only supports _NAPOT_ mode yet and a minimal region size (granularity) of 8 bytes. Larger

**Operation Summary**

minimal sizes can be configured via the top `PMP_MIN_GRANULARITY` generic to reduce hardware requirements.

The physical memory protection system is implemented when the `PMP_NUM_REGIONS` configuration generic is >0.

In this case the following additional CSRs are available:

* `pmpcfg*` (0..15, depending on configuration): PMP configuration registers

Any CPU access address (from the instruction fetch or data access interface) is tested if it matches _any_

* `pmpaddr*` (0..63, depending on configuration): PMP address registers

of the specified PMP regions. If there is a match, the configured access rights are enforced:

* a write access (store) will fail if no **write** attribute is set

* a read access (load) will fail if no **read** attribute is set

* an instruction fetch access will fail if no **execute** attribute is set

If an access to a protected region does not have the according access rights it will raise the according

instruction/load/store _bus access fault_ exception.

By default, all PMP checks are enforced for user-mode only. However, PMP rules can also be enforced for

machine-mode when the according PMP region has the "LOCK" bit set. This will also prevent any write access

to the according region's PMP CSRs until the CPU is reset.

.Rule Prioritization

[IMPORTANT]

All rules are checked in parallel **without** prioritization so for identical memory regions the most restrictive

PMP rule will be enforced.
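
The operation summary and the prioritization rule can be modelled in software as follows. This is a hypothetical sketch, not the hardware implementation: the `pmpcfg` bit positions and the TOR address matching (`pmpaddr` holds the byte address shifted right by two; region 0 uses zero as its lower bound) follow the RISC-V privileged specification, while `pmp_check` and its types are made-up names. How an access that matches no region is treated is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* pmpcfg entry bits (RISC-V privileged specification). */
#define PMP_R     0x01u /* read permission */
#define PMP_W     0x02u /* write permission */
#define PMP_X     0x04u /* execute permission */
#define PMP_A_MSK 0x18u /* address-matching mode field */
#define PMP_A_TOR 0x08u /* mode = TOR (top of region) */
#define PMP_L     0x80u /* lock bit: also enforce for machine-mode */

typedef enum { ACC_READ, ACC_WRITE, ACC_EXEC } access_t;
typedef enum { PMP_NO_MATCH, PMP_ALLOW, PMP_DENY } pmp_result_t;

/* cfg[i]  : 8-bit configuration of region i
 * addr[i] : pmpaddr value of region i (byte address >> 2)
 * All regions are evaluated without prioritization; if any matching region
 * lacks the requested right, the access is denied. */
pmp_result_t pmp_check(const uint8_t *cfg, const uint32_t *addr, unsigned num_regions,
                       uint32_t byte_addr, access_t acc, bool machine_mode)
{
    assert(num_regions <= 16u);
    const uint8_t need = (acc == ACC_READ)  ? PMP_R
                       : (acc == ACC_WRITE) ? PMP_W : PMP_X;
    const uint32_t a = byte_addr >> 2;
    pmp_result_t result = PMP_NO_MATCH;

    for (unsigned i = 0; i < num_regions; i++) {
        if ((cfg[i] & PMP_A_MSK) != PMP_A_TOR) {
            continue; /* region is disabled */
        }
        uint32_t base = (i == 0u) ? 0u : addr[i - 1u];
        if (a < base || a >= addr[i]) {
            continue; /* address is outside of this region */
        }
        if (machine_mode && (cfg[i] & PMP_L) == 0u) {
            continue; /* machine-mode is only checked for locked regions */
        }
        if ((cfg[i] & need) != 0u) {
            if (result == PMP_NO_MATCH) {
                result = PMP_ALLOW;
            }
        } else {
            result = PMP_DENY; /* most restrictive rule wins */
        }
    }
    return result;
}
```

A caller would map `PMP_DENY` to the according instruction/load/store access fault exception.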

.PMP Example Program

[TIP]

[TIP]

See section <<_machine_physical_memory_protection_csrs>> for more information regarding the PMP CSRs.

A simple PMP example program can be found in `sw/example/demo_pmp`.

The actual number of regions and the minimal region granularity are defined via the top entity

`PMP_MIN_GRANULARITY` and `PMP_NUM_REGIONS` generics. `PMP_MIN_GRANULARITY` defines the minimal available

granularity of each region in bytes. `PMP_NUM_REGIONS` defines the total number of implemented regions and thus, the

number of available `pmpcfg*` and `pmpaddr*` CSRs.

When implementing more PMP regions than a _certain critical limit_ *an additional register stage

is automatically inserted* into the CPU's memory interfaces to reduce critical path length. Unfortunately, this will also

increase the latency of instruction fetches and data access by +1 cycle.

The critical limit can be adapted for custom use by a constant from the main VHDL package file

**Impact on Critical Path**

(`rtl/core/neorv32_package.vhd`). The default value is 8:

When implementing more PMP regions than a "_certain critical limit_" an **additional register stage** is automatically

inserted into the CPU's memory interfaces to keep the impact on the critical path as small as possible.

Unfortunately, this will also increase the latency of instruction fetches and data access by one cycle.

The _critical limit_ can be modified by a constant from the main VHDL package file

(`rtl/core/neorv32_package.vhd`, default value = 8):

[source,vhdl]

[source,vhdl]

----

----

-- "critical" number of PMP regions --

-- "critical" number of PMP regions --

constant pmp_num_regions_critical_c : natural := 8;

constant pmp_num_regions_critical_c : natural := 8;

----

----

**Operation**

[TIP]

Reducing the minimal PMP region size / granularity via the <<_pmp_min_granularity>> top entity generic

Any CPU memory access address (from the instruction fetch or data access interface) is tested if it is accessing _any_

will also reduce hardware utilization and impact on critical path.

of the specified PMP regions (configured via `pmpaddr*` and enabled via `pmpcfg*`). If an

address matches one of these regions, the configured access rights (attributes in `pmpcfg*`) are enforced:

* a write access (store) will fail if no write attribute is set

* a read access (load) will fail if no read attribute is set

* an instruction fetch access will fail if no execute attribute is set

If an access to a protected region does not have the according access rights it will raise the according

instruction/load/store _access fault_ exception.

By default, all PMP checks are enforced for user-level programs only. If you wish to enforce the physical

memory protection also for machine-level programs you need to set the _locked bit_ in the according

`pmpcfg*` configuration CSR.

[IMPORTANT]

After updating the address configuration registers `pmpaddr*` the system requires up to 33 cycles for

internal (iterative) computations before the configuration becomes valid.

[NOTE]

For more information regarding RISC-V physical memory protection see the official _The RISC-V

// ####################################################################################################################

Instruction Set Manual - Volume II: Privileged Architecture_ specifications.

include::cpu_cfu.adoc[]

// ####################################################################################################################

// ####################################################################################################################

:sectnums:

:sectnums:

Line 800...

Line 846...

| Memory access | `I/E` | `lb` `lh` `lw` `lbu` `lhu` `sb` `sh` `sw` | 4 + ML

| Memory access | `I/E` | `lb` `lh` `lw` `lbu` `lhu` `sb` `sh` `sw` | 4 + ML

| Memory access | `C`   | `c.lw` `c.sw` `c.lwsp` `c.swsp`           | 4 + ML

| Memory access | `C`   | `c.lw` `c.sw` `c.lwsp` `c.swsp`           | 4 + ML

| Memory access | `A`   | `lr.w` `sc.w`                             | 4 + ML

| Memory access | `A`   | `lr.w` `sc.w`                             | 4 + ML

| Multiplication | `M`  | `mul` `mulh` `mulhsu` `mulhu` | 2+32+2; FAST_MULfootnote:[DSP-based multiplication; enabled via `FAST_MUL_EN`.]: 4

| Multiplication | `M`  | `mul` `mulh` `mulhsu` `mulhu` | 2+32+2; FAST_MULfootnote:[DSP-based multiplication; enabled via `FAST_MUL_EN`.]: 4

| Division       | `M`  | `div` `divu` `rem` `remu`     | 2+32+2

| Division       | `M`  | `div` `divu` `rem` `remu`     | 2+32+2

| CSR access | `Zicsr` | `csrrw` `csrrs` `csrrc` `csrrwi` `csrrsi` `csrrci` | 4

| CSR access     | `Zicsr`     | `csrrw` `csrrs` `csrrc` `csrrwi` `csrrsi` `csrrci` | 3

| System | `I/E`+`Zicsr` | `ecall` `ebreak` | 4

| System | `I/E` | `fence` | 3

| System | `I/E` | `fence` | 3

| System | `C`+`Zicsr` | `c.break` | 4

| System         | `Zicsr`     | `ecall` `ebreak` | 3

| System | `Zicsr` | `mret` `wfi` | 5

| System         | `Zicsr`+`C` | `c.break` | 3

| System         | `Zicsr`     | `mret` `wfi` | 6

| System | `Zifencei` | `fence.i` | 3 + ML

| System | `Zifencei` | `fence.i` | 3 + ML

| Floating-point - arithmetic | `Zfinx` | `fadd.s` | 110

| Floating-point - arithmetic | `Zfinx` | `fadd.s` | 110

| Floating-point - arithmetic | `Zfinx` | `fsub.s` | 112

| Floating-point - arithmetic | `Zfinx` | `fsub.s` | 112

| Floating-point - arithmetic | `Zfinx` | `fmul.s` | 22

| Floating-point - arithmetic | `Zfinx` | `fmul.s` | 22

| Floating-point - compare | `Zfinx` | `fmin.s` `fmax.s` `feq.s` `flt.s` `fle.s` | 13

| Floating-point - compare | `Zfinx` | `fmin.s` `fmax.s` `feq.s` `flt.s` `fle.s` | 13

Line 821...

Line 867...

| Bit-manipulation - shifts | `B(Zbb)` | `cpop` | 3 + 32

| Bit-manipulation - shifts | `B(Zbb)` | `cpop` | 3 + 32

| Bit-manipulation - shifts | `B(Zbb)` | `rol` `ror` `rori` | 3 + SA

| Bit-manipulation - shifts | `B(Zbb)` | `rol` `ror` `rori` | 3 + SA

| Bit-manipulation - single-bit  | `B(Zbs)` | `sbset[i]` `sbclr[i]` `sbinv[i]` `sbext[i]` | 3

| Bit-manipulation - single-bit  | `B(Zbs)` | `sbset[i]` `sbclr[i]` `sbinv[i]` `sbext[i]` | 3

| Bit-manipulation - shifted-add | `B(Zba)` | `sh1add` `sh2add` `sh3add` | 3

| Bit-manipulation - shifted-add | `B(Zba)` | `sh1add` `sh2add` `sh3add` | 3

| Bit-manipulation - carry-less multiply | `B(Zbc)` | `clmul` `clmulh` `clmulr` | 3 + 32

| Bit-manipulation - carry-less multiply | `B(Zbc)` | `clmul` `clmulh` `clmulr` | 3 + 32

| CFU: custom instructions | `Zxcfu` | - | min. 4

| Custom instructions (CFU) | `Zxcfu` | - | min. 4

| | | |

| _Illegal instructions_    | `Zicsr` | - | 2

|=======================

|=======================
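
The cycle expressions in the table can be evaluated with a few helper functions. This is only a sketch for rough estimates: the function names are hypothetical, and it assumes that `ML` denotes the memory latency in cycles and `SA` the shift amount (0..31).

```c
#include <assert.h>
#include <stdbool.h>

/* Load/store instructions: 4 + ML (ML = assumed memory latency in cycles). */
unsigned cycles_mem_access(unsigned ml)
{
    return 4u + ml;
}

/* Multiplication: 2+32+2 cycles, or 4 cycles with DSP-based FAST_MUL. */
unsigned cycles_mul(bool fast_mul)
{
    return fast_mul ? 4u : (2u + 32u + 2u);
}

/* Division and remainder: 2+32+2 cycles. */
unsigned cycles_div(void)
{
    return 2u + 32u + 2u;
}

/* Rotate instructions (rol, ror, rori): 3 + SA (SA = assumed shift amount). */
unsigned cycles_rotate(unsigned shamt)
{
    assert(shamt < 32u);
    return 3u + shamt;
}
```

For instance, a serial multiplication takes `cycles_mul(false)` = 36 cycles, whereas the DSP-based variant takes 4.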

[NOTE]

[NOTE]

The presented values of the *floating-point execution cycles* are average values - obtained from

The presented values of the *floating-point execution cycles* are average values - obtained from

4096 instruction executions using pseudo-random input values. The execution time for emulating the

4096 instruction executions using pseudo-random input values. The execution time for emulating the

Line 865...

Line 913...

until it is explicitly acknowledged by the CPU software (for example by writing to a specific memory-mapped register).

until it is explicitly acknowledged by the CPU software (for example by writing to a specific memory-mapped register).

.Interrupt Signal Requirements - Fast Interrupt Requests

.Interrupt Signal Requirements - Fast Interrupt Requests

[IMPORTANT]

[IMPORTANT]

The NEORV32-specific FIRQ request lines are triggered by a one-shot high-level (i.e. rising edge). Each request is buffered in the CPU control

The NEORV32-specific FIRQ request lines are triggered by a one-shot high-level (i.e. rising edge). Each request is buffered in the CPU control

unit until the channel is either disabled (by clearing the according <<_mie>> CSR bit) or the request is explicitly cleared (by setting

unit until the channel is either disabled (by clearing the according <<_mie>> CSR bit) or the request is explicitly cleared (by writing

the according <<_mip>> CSR bit).

zero to the according <<_mip>> CSR bit).

.Instruction Atomicity

.Instruction Atomicity

[NOTE]

[NOTE]

All instructions execute as atomic operations - interrupts can only trigger _between_ two instructions.

All instructions execute as atomic operations - interrupts can only trigger _between_ two instructions.

So even if there is a permanent interrupt request, exactly one instruction from the interrupt program will be executed before

So even if there is a permanent interrupt request, exactly one instruction from the interrupt program will be executed before

another interrupt handler can start. This allows program progress even if there are permanent interrupt requests.

another interrupt handler can start. This allows program progress even if there are permanent interrupt requests.

:sectnums:

:sectnums:

==== Memory Access Exceptions

===== Memory Access Exceptions

If a load operation causes any exception, the instruction's destination register is

If a load operation causes any exception, the instruction's destination register is

_not written_ at all. Load exceptions caused by a misalignment or a physical memory protection fault do not

_not written_ at all. Load exceptions caused by a misalignment or a physical memory protection fault do not

trigger a bus/memory read-operation at all. Vice versa, exceptions caused by a store address misalignment or a store physical

trigger a bus/memory read-operation at all. Vice versa, exceptions caused by a store address misalignment or a store physical

memory protection fault do not trigger a bus/memory write-operation at all.

memory protection fault do not trigger a bus/memory write-operation at all.
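
The misalignment condition ("not naturally aligned to the data size") can be expressed as a one-line check. The helper below is an illustrative sketch with a hypothetical name, not part of the NEORV32 software framework.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if an access of `size` bytes (1 = byte, 2 = half, 4 = word)
 * to `addr` is not naturally aligned. */
bool is_misaligned(uint32_t addr, unsigned size)
{
    assert(size == 1u || size == 2u || size == 4u);
    return (addr & (size - 1u)) != 0u;
}
```

A word access to `0x1002` is misaligned, while a half-word access to the same address is not; byte accesses are always aligned.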

:sectnums:

:sectnums:

==== Custom Fast Interrupt Request Lines

===== Custom Fast Interrupt Request Lines

As a custom extension, the NEORV32 CPU features 16 fast interrupt request (FIRQ) lines via the `firq_i` CPU top

As a custom extension, the NEORV32 CPU features 16 fast interrupt request (FIRQ) lines via the `firq_i` CPU top

entity signals. These interrupts have custom configuration and status flags in the <<_mie>> and <<_mip>> CSRs and also

entity signals. These interrupts have custom configuration and status flags in the <<_mie>> and <<_mip>> CSRs and also

provide custom trap codes in <<_mcause>>. These FIRQs are reserved for NEORV32 processor-internal usage only.

provide custom trap codes in <<_mcause>>. These FIRQs are reserved for NEORV32 processor-internal usage only.

// ####################################################################################################################

:sectnums:

:sectnums:

==== NEORV32 Trap Listing

===== NEORV32 Trap Listing

The following table shows all traps that are currently supported by the NEORV32 CPU. It also shows the prioritization

The following table shows all traps that are currently supported by the NEORV32 CPU. It also shows the prioritization

and the CSR side-effects. A more detailed description of the actual trap triggering events is provided in a further table.

and the CSR side-effects. A more detailed description of the actual trap triggering events is provided in a further table.

[NOTE]

[NOTE]

Line 909...

Line 954...

**Table Annotations**

**Table Annotations**

The "Prio." column shows the priority of each trap. The highest priority is 1. The "`mcause`" column shows the

The "Prio." column shows the priority of each trap. The highest priority is 1. The "`mcause`" column shows the

cause ID of the according trap that is written to <<_mcause>> CSR. The "[RISC-V]" columns show the interrupt/exception code value from the

cause ID of the according trap that is written to <<_mcause>> CSR. The "[RISC-V]" columns show the interrupt/exception code value from the

official RISC-V privileged architecture manual. The "[C]" names are defined by the NEORV32 core library (the runtime environment _RTE_) and can

official RISC-V privileged architecture spec. The "ID [C]" names are defined by the NEORV32 core library (the runtime environment _RTE_) and can

be used in plain C code. The "`mepc`" and "`mtval`" columns show the value written to

be used in plain C code. The "`mepc`" and "`mtval`" columns show the value written to <<_mepc>> and <<_mtval>> CSRs when a trap is triggered:

<<_mepc>> and <<_mtval>> CSRs when a trap is triggered:

* **IPC** - address of interrupted instruction (instruction has not been executed yet)

* _I-PC_ - address of interrupted instruction (instruction has not been executed/completed yet)

* **PC** - address of instruction that caused the trap

* _B-ADR_ - bad memory access address that caused the trap

* **ADR** - bad memory access address that caused the trap

* _PC_ - address of instruction that caused the trap

* **INST** - the faulting instruction word itself

* _0_ - zero

* **0** - zero

* _Inst_ - the faulting instruction itself

.NEORV32 Trap Listing

.NEORV32 Trap Listing

[cols="3,6,5,14,11,4,4"]

[cols="3,6,5,14,11,4,4"]

[options="header",grid="rows"]

[options="header",grid="rows"]

|=======================

|=======================

| Prio. | `mcause` | [RISC-V] | ID [C] | Cause | `mepc` | `mtval`

| Prio. | `mcause` | [RISC-V] | ID [C] | Cause | `mepc` | `mtval`

| 1  | `0x00000000` | 0.0  | _TRAP_CODE_I_MISALIGNED_ | instruction address misaligned | _B-ADR_ | _PC_

7+^| **Synchronous Exceptions**

| 2  | `0x00000001` | 0.1  | _TRAP_CODE_I_ACCESS_     | instruction access fault | _B-ADR_ | _PC_

| 1     | `0x00000000` | 0.0      | _TRAP_CODE_I_MISALIGNED_ | instruction address misaligned    | **PC**  | **ADR**

| 3  | `0x00000002` | 0.2  | _TRAP_CODE_I_ILLEGAL_    | illegal instruction | _PC_ | _Inst_

| 2     | `0x00000001` | 0.1      | _TRAP_CODE_I_ACCESS_     | instruction access bus fault      | **PC**  | **ADR**

| 4  | `0x0000000B` | 0.11 | _TRAP_CODE_MENV_CALL_    | environment call from M-mode (`ecall` in machine-mode) | _PC_ | _PC_

| 3     | `0x00000002` | 0.2      | _TRAP_CODE_I_ILLEGAL_    | illegal instruction               | **PC**  | **INST**

| 5  | `0x00000008` | 0.8  | _TRAP_CODE_UENV_CALL_    | environment call from U-mode (`ecall` in user-mode) | _PC_ | _PC_

| 4     | `0x0000000B` | 0.11     | _TRAP_CODE_MENV_CALL_    | environment call from M-mode      | **PC**  | **0**

| 6  | `0x00000003` | 0.3  | _TRAP_CODE_BREAKPOINT_   | breakpoint (`ebreak`) | _PC_ | _PC_

| 5     | `0x00000008` | 0.8      | _TRAP_CODE_UENV_CALL_    | environment call from U-mode      | **PC**  | **0**

| 7  | `0x00000006` | 0.6  | _TRAP_CODE_S_MISALIGNED_ | store address misaligned | _B-ADR_ | _B-ADR_

| 6     | `0x00000003` | 0.3      | _TRAP_CODE_BREAKPOINT_   | breakpoint instruction            | **PC**  | **PC**

| 8  | `0x00000004` | 0.4  | _TRAP_CODE_L_MISALIGNED_ | load address misaligned | _B-ADR_ | _B-ADR_

| 7     | `0x00000006` | 0.6      | _TRAP_CODE_S_MISALIGNED_ | store address misaligned          | **PC**  | **ADR**

| 9  | `0x00000007` | 0.7  | _TRAP_CODE_S_ACCESS_     | store access fault | _B-ADR_ | _B-ADR_

| 8     | `0x00000004` | 0.4      | _TRAP_CODE_L_MISALIGNED_ | load address misaligned           | **PC**  | **ADR**

| 10 | `0x00000005` | 0.5  | _TRAP_CODE_L_ACCESS_     | load access fault | _B-ADR_ | _B-ADR_

| 9     | `0x00000007` | 0.7      | _TRAP_CODE_S_ACCESS_     | store access bus fault            | **PC**  | **ADR**

| 11 | `0x80000010` | 1.16 | _TRAP_CODE_FIRQ_0_       | fast interrupt request channel 0 | _I-PC_ | _0_

| 10    | `0x00000005` | 0.5      | _TRAP_CODE_L_ACCESS_     | load access bus fault             | **PC**  | **ADR**

| 12 | `0x80000011` | 1.17 | _TRAP_CODE_FIRQ_1_       | fast interrupt request channel 1 | _I-PC_ | _0_

7+^| **Asynchronous Exceptions (Interrupts)**

| 13 | `0x80000012` | 1.18 | _TRAP_CODE_FIRQ_2_       | fast interrupt request channel 2 | _I-PC_ | _0_

| 11    | `0x80000010` | 1.16     | _TRAP_CODE_FIRQ_0_       | fast interrupt request channel 0  | **IPC** | **0**

| 14 | `0x80000013` | 1.19 | _TRAP_CODE_FIRQ_3_       | fast interrupt request channel 3 | _I-PC_ | _0_

| 12    | `0x80000011` | 1.17     | _TRAP_CODE_FIRQ_1_       | fast interrupt request channel 1  | **IPC** | **0**

| 15 | `0x80000014` | 1.20 | _TRAP_CODE_FIRQ_4_       | fast interrupt request channel 4 | _I-PC_ | _0_

| 13    | `0x80000012` | 1.18     | _TRAP_CODE_FIRQ_2_       | fast interrupt request channel 2  | **IPC** | **0**

| 16 | `0x80000015` | 1.21 | _TRAP_CODE_FIRQ_5_       | fast interrupt request channel 5 | _I-PC_ | _0_

| 14    | `0x80000013` | 1.19     | _TRAP_CODE_FIRQ_3_       | fast interrupt request channel 3  | **IPC** | **0**

| 17 | `0x80000016` | 1.22 | _TRAP_CODE_FIRQ_6_       | fast interrupt request channel 6 | _I-PC_ | _0_

| 15    | `0x80000014` | 1.20     | _TRAP_CODE_FIRQ_4_       | fast interrupt request channel 4  | **IPC** | **0**

| 18 | `0x80000017` | 1.23 | _TRAP_CODE_FIRQ_7_       | fast interrupt request channel 7 | _I-PC_ | _0_

| 16    | `0x80000015` | 1.21     | _TRAP_CODE_FIRQ_5_       | fast interrupt request channel 5  | **IPC** | **0**

| 19 | `0x80000018` | 1.24 | _TRAP_CODE_FIRQ_8_       | fast interrupt request channel 8 | _I-PC_ | _0_

| 17    | `0x80000016` | 1.22     | _TRAP_CODE_FIRQ_6_       | fast interrupt request channel 6  | **IPC** | **0**

| 20 | `0x80000019` | 1.25 | _TRAP_CODE_FIRQ_9_       | fast interrupt request channel 9 | _I-PC_ | _0_

| 18    | `0x80000017` | 1.23     | _TRAP_CODE_FIRQ_7_       | fast interrupt request channel 7  | **IPC** | **0**

| 21 | `0x8000001a` | 1.26 | _TRAP_CODE_FIRQ_10_      | fast interrupt request channel 10 | _I-PC_ | _0_

| 19    | `0x80000018` | 1.24     | _TRAP_CODE_FIRQ_8_       | fast interrupt request channel 8  | **IPC** | **0**

| 22 | `0x8000001b` | 1.27 | _TRAP_CODE_FIRQ_11_      | fast interrupt request channel 11 | _I-PC_ | _0_

| 20    | `0x80000019` | 1.25     | _TRAP_CODE_FIRQ_9_       | fast interrupt request channel 9  | **IPC** | **0**

| 23 | `0x8000001c` | 1.28 | _TRAP_CODE_FIRQ_12_      | fast interrupt request channel 12 | _I-PC_ | _0_

| 21    | `0x8000001a` | 1.26     | _TRAP_CODE_FIRQ_10_      | fast interrupt request channel 10 | **IPC** | **0**

| 24 | `0x8000001d` | 1.29 | _TRAP_CODE_FIRQ_13_      | fast interrupt request channel 13 | _I-PC_ | _0_

| 22    | `0x8000001b` | 1.27     | _TRAP_CODE_FIRQ_11_      | fast interrupt request channel 11 | **IPC** | **0**

| 25 | `0x8000001e` | 1.30 | _TRAP_CODE_FIRQ_14_      | fast interrupt request channel 14 | _I-PC_ | _0_

| 23    | `0x8000001c` | 1.28     | _TRAP_CODE_FIRQ_12_      | fast interrupt request channel 12 | **IPC** | **0**

| 26 | `0x8000001f` | 1.31 | _TRAP_CODE_FIRQ_15_      | fast interrupt request channel 15 | _I-PC_ | _0_

| 24    | `0x8000001d` | 1.29     | _TRAP_CODE_FIRQ_13_      | fast interrupt request channel 13 | **IPC** | **0**

| 27 | `0x8000000B` | 1.11 | _TRAP_CODE_MEI_          | machine external interrupt | _I-PC_ | _0_

| 25    | `0x8000001e` | 1.30     | _TRAP_CODE_FIRQ_14_      | fast interrupt request channel 14 | **IPC** | **0**

| 28 | `0x80000003` | 1.3  | _TRAP_CODE_MSI_          | machine software interrupt | _I-PC_ | _0_

| 26    | `0x8000001f` | 1.31     | _TRAP_CODE_FIRQ_15_      | fast interrupt request channel 15 | **IPC** | **0**

| 29 | `0x80000007` | 1.7  | _TRAP_CODE_MTI_          | machine timer interrupt | _I-PC_ | _0_

| 27    | `0x8000000B` | 1.11     | _TRAP_CODE_MEI_          | machine external interrupt (MEI)  | **IPC** | **0**

| 28    | `0x80000003` | 1.3      | _TRAP_CODE_MSI_          | machine software interrupt (MSI)  | **IPC** | **0**

| 29    | `0x80000007` | 1.7      | _TRAP_CODE_MTI_          | machine timer interrupt (MTI)     | **IPC** | **0**

|=======================

|=======================
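
The `mcause` values in the trap listing follow a regular pattern: bit 31 distinguishes interrupts from synchronous exceptions and the remaining bits hold the cause code, with the FIRQ channels occupying interrupt codes 16 to 31. The following sketch decodes such a value; the helper names are hypothetical and are not the identifiers of the NEORV32 runtime environment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 31 of mcause is set for interrupts (asynchronous exceptions). */
bool mcause_is_interrupt(uint32_t mcause)
{
    return (mcause >> 31) != 0u;
}

/* Cause code without the interrupt flag. */
uint32_t mcause_code(uint32_t mcause)
{
    return mcause & 0x7FFFFFFFu;
}

/* mcause value of fast interrupt request channel `ch` (0..15). */
uint32_t firq_mcause(unsigned ch)
{
    assert(ch < 16u);
    return 0x80000010u + ch;
}

/* Returns the FIRQ channel (0..15) for a FIRQ trap, or -1 for any other trap. */
int mcause_firq_channel(uint32_t mcause)
{
    uint32_t code = mcause_code(mcause);
    if (mcause_is_interrupt(mcause) && code >= 16u && code <= 31u) {
        return (int)(code - 16u);
    }
    return -1;
}
```

A trap handler could use `mcause_firq_channel` to dispatch to a per-channel handler and fall back to a generic path when it returns -1.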

The following table provides a summarized description of the actual events for triggering a specific trap.

The following table provides a summarized description of the actual events for triggering a specific trap.

.NEORV32 Trap Description

.NEORV32 Trap Description

[cols="<3,<7"]

[cols="<3,<7"]

[options="header",grid="rows"]

[options="header",grid="rows"]

|=======================

|=======================

| Trap ID [C] | Triggered when ...

| Trap ID [C] | Triggered when ...

| _TRAP_CODE_I_MISALIGNED_ | fetching an 32-bit instruction word that is not 32-bit-aligned (_see note below!_)

| _TRAP_CODE_I_MISALIGNED_ | fetching a 32-bit instruction word that is not 32-bit-aligned (_see note below!_)

| _TRAP_CODE_I_ACCESS_     | bus timeout or bus error during instruction word fetch

| _TRAP_CODE_I_ACCESS_     | bus timeout or bus error during instruction word fetch

| _TRAP_CODE_I_ILLEGAL_    | trying to execute an invalid instruction word (malformed or not supported) or on a privilege violation

| _TRAP_CODE_I_ILLEGAL_    | trying to execute an invalid instruction word (malformed or not supported) or on a privilege violation

| _TRAP_CODE_MENV_CALL_    | executing `ecall` instruction in machine-mode

| _TRAP_CODE_MENV_CALL_    | executing `ecall` instruction in machine-mode

| _TRAP_CODE_UENV_CALL_    | executing `ecall` instruction in user-mode

| _TRAP_CODE_UENV_CALL_    | executing `ecall` instruction in user-mode

| _TRAP_CODE_BREAKPOINT_   | executing `ebreak` instruction (or triggered by on-chip debugger)

| _TRAP_CODE_BREAKPOINT_   | executing `ebreak` instruction

| _TRAP_CODE_S_MISALIGNED_ | storing data to an address that is not naturally aligned to the data size (byte, half, word) being stored

| _TRAP_CODE_S_MISALIGNED_ | storing data to an address that is not naturally aligned to the data size (byte, half, word) being stored

| _TRAP_CODE_L_MISALIGNED_ | loading data from an address that is not naturally aligned to the data size  (byte, half, word) being loaded

| _TRAP_CODE_L_MISALIGNED_ | loading data from an address that is not naturally aligned to the data size  (byte, half, word) being loaded

| _TRAP_CODE_S_ACCESS_     | bus timeout or bus error during store data operation

| _TRAP_CODE_S_ACCESS_     | bus timeout or bus error during store data operation

| _TRAP_CODE_L_ACCESS_     | bus timeout or bus error during load data operation

| _TRAP_CODE_L_ACCESS_     | bus timeout or bus error during load data operation

| _TRAP_CODE_FIRQ_0_ ... _TRAP_CODE_FIRQ_15_| caused by interrupt-condition of processor-internal modules, see <<_neorv32_specific_fast_interrupt_requests>>

| _TRAP_CODE_FIRQ_0_ ... _TRAP_CODE_FIRQ_15_| caused by interrupt-condition of processor-internal modules, see <<_neorv32_specific_fast_interrupt_requests>>

Line 1040...

Line 1086...

completed when the accessed peripheral/memory either sets the `*_bus_ack_i` signal (-> successful completion) or the

completed when the accessed peripheral/memory either sets the `*_bus_ack_i` signal (-> successful completion) or the

`*_bus_err_i` signal (-> failed completion). These bus response signals are also active for only one cycle.

`*_bus_err_i` signal (-> failed completion). These bus response signals are also active for only one cycle.

An error indicated by the `*_bus_err_i` signal will raise the according "instruction bus access fault" or

An error indicated by the `*_bus_err_i` signal will raise the according "instruction bus access fault" or

"load/store bus access fault" exception.

"load/store bus access fault" exception.

**Minimal Response Latency**

The transfer can be completed directly in the same cycle as it was initiated (via the `*_bus_re_o` or `*_bus_we_o`
signal) if the peripheral sets `*_bus_ack_i` or `*_bus_err_i` high for one cycle. However, in order to shorten the
critical path, such "asynchronous" completion should be avoided. The default NEORV32 processor-internal modules provide
exactly **one cycle delay** between initiation and completion of transfers.

**Maximal Response Latency**

Processor-internal peripherals or memories do not have to respond within one cycle after a bus request has been initiated.
However, the bus transaction has to be completed (= acknowledged) within a certain **response time window**. This time window
is defined by the global `max_proc_int_response_time_c` constant (default = 15 cycles; processor's VHDL package file `rtl/neorv32_package.vhd`).
It defines the maximum number of cycles after which an _unacknowledged_ (neither `*_bus_ack_i` nor `*_bus_err_i` set) **processor-internal bus**
transfer will time out and raise a **bus fault exception**. The <<_internal_bus_monitor_buskeeper>> keeps track of all _internal_ bus
transactions to enforce this time window.

If any bus operation times out (for example when accessing "address space holes") the BUSKEEPER will issue a bus
error to the CPU that will raise the corresponding instruction fetch or data access bus exception.
Note that **the bus keeper does not track external accesses via the external memory bus interface**. However,
the external memory bus interface also provides an _optional_ bus timeout (see section <<_processor_external_memory_interface_wishbone_axi4_lite>>).

.Interface Response
[NOTE]
Please note that any CPU access via the data or instruction interface has to be terminated by asserting either the
CPU's `*_bus_ack_i` or `*_bus_err_i` signal. Otherwise the CPU will be stalled permanently. The BUSKEEPER ensures that
any kind of access is always properly terminated.
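
The response time window described above can be illustrated with a small watchdog. The following is only a sketch and not the actual BUSKEEPER implementation: the entity name, the port names and the `MAX_RESPONSE_TIME` generic are hypothetical, and the generic is merely assumed to play the role of `max_proc_int_response_time_c`. The exact cycle in which the timeout fires is a modelling choice of this sketch.

[source,vhdl]
----
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical bus timeout monitor (illustration only, not the NEORV32 BUSKEEPER).
entity bus_timeout_sketch is
  generic (
    MAX_RESPONSE_TIME : natural := 15 -- assumed stand-in for max_proc_int_response_time_c
  );
  port (
    clk_i     : in  std_ulogic; -- clock, rising edge
    rstn_i    : in  std_ulogic; -- asynchronous reset, low-active
    bus_re_i  : in  std_ulogic; -- read request, high for one cycle
    bus_we_i  : in  std_ulogic; -- write request, high for one cycle
    bus_ack_i : in  std_ulogic; -- response: successful completion
    bus_err_i : in  std_ulogic; -- response: failed completion
    timeout_o : out std_ulogic  -- one-cycle pulse if no response arrived in time
  );
end entity;

architecture rtl of bus_timeout_sketch is
  signal pending : std_ulogic; -- a transfer is waiting for its response
  signal cnt     : natural range 0 to MAX_RESPONSE_TIME;
begin

  monitor: process(rstn_i, clk_i)
  begin
    if (rstn_i = '0') then
      pending   <= '0';
      cnt       <= 0;
      timeout_o <= '0';
    elsif rising_edge(clk_i) then
      timeout_o <= '0'; -- default: no timeout
      if (pending = '0') then
        cnt <= 0;
        -- start tracking unless the request is already answered in the same cycle
        if ((bus_re_i = '1') or (bus_we_i = '1')) and (bus_ack_i = '0') and (bus_err_i = '0') then
          pending <= '1';
        end if;
      elsif (bus_ack_i = '1') or (bus_err_i = '1') then
        pending <= '0'; -- regular termination by the accessed module
      elsif (cnt = MAX_RESPONSE_TIME) then
        pending   <= '0';
        timeout_o <= '1'; -- would be forwarded to the CPU as a bus error
      else
        cnt <= cnt + 1;
      end if;
    end if;
  end process;

end architecture;
----

In such a scheme the `timeout_o` pulse would be combined with the regular error responses, so that every access is terminated even if no module responds.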

**Exemplary Bus Accesses**

.Example bus accesses: see read/write access description below
[cols="^2,^2"]
[grid="none"]

Line 1072...

Line 1128...

a| image::cpu_interface_read_long.png[read,300,150]
a| image::cpu_interface_write_long.png[write,300,150]
| Read access | Write access
|=======================

**Write Access**

For a write access, the access address (`bus_addr_o`), the data to be written (`bus_wdata_o`) and the byte
enable signals (`bus_ben_o`) are set when `bus_we_o` goes high. These three signals are kept stable until the
transaction is completed. In the example, the accessed peripheral cannot answer directly in the cycle
after issuing. Here, the transaction is successful and the peripheral sets the `bus_ack_i` signal several
cycles after issuing.

**Read Access**

For a read access, the accessed address (`bus_addr_o`) is set when `bus_re_o` goes high. The address is kept
stable until the transaction is completed. In the example, the accessed peripheral cannot answer
directly in the cycle after issuing. The peripheral has to provide the read data in the same cycle in which
the bus transaction is completed (here, the transaction is successful and the peripheral sets the `bus_ack_i`
signal).
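
The read and write handshakes described above can be sketched from the peripheral's point of view. The following is a minimal, hypothetical module (entity and port names are illustrative, mirrored from the CPU-side `bus_*` signals; address decoding is omitted and a single 32-bit register is assumed). It acknowledges every access exactly one cycle after the request and presents the read data in the same cycle as the acknowledge.

[source,vhdl]
----
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical single-register peripheral (illustration only, no address decoding).
entity simple_reg_sketch is
  port (
    clk_i       : in  std_ulogic; -- clock, rising edge
    rstn_i      : in  std_ulogic; -- asynchronous reset, low-active
    bus_re_i    : in  std_ulogic; -- read request, high for one cycle
    bus_we_i    : in  std_ulogic; -- write request, high for one cycle
    bus_ben_i   : in  std_ulogic_vector(3 downto 0);  -- byte enables
    bus_wdata_i : in  std_ulogic_vector(31 downto 0); -- write data
    bus_rdata_o : out std_ulogic_vector(31 downto 0); -- read data, valid together with ack
    bus_ack_o   : out std_ulogic  -- access acknowledge, high for one cycle
  );
end entity;

architecture rtl of simple_reg_sketch is
  signal reg : std_ulogic_vector(31 downto 0);
begin

  bus_access: process(rstn_i, clk_i)
  begin
    if (rstn_i = '0') then
      reg         <= (others => '0');
      bus_rdata_o <= (others => '0');
      bus_ack_o   <= '0';
    elsif rising_edge(clk_i) then
      -- acknowledge one cycle after the request (registered response)
      bus_ack_o <= bus_re_i or bus_we_i;
      -- write access: update only the bytes selected by the byte enables
      if (bus_we_i = '1') then
        for i in 0 to 3 loop
          if (bus_ben_i(i) = '1') then
            reg(8*i+7 downto 8*i) <= bus_wdata_i(8*i+7 downto 8*i);
          end if;
        end loop;
      end if;
      -- read access: data is registered together with the acknowledge
      bus_rdata_o <= (others => '0');
      if (bus_re_i = '1') then
        bus_rdata_o <= reg;
      end if;
    end if;
  end process;

end architecture;
----

Registering the response avoids the combinational ("asynchronous") completion path; a slower peripheral would simply delay `bus_ack_o` by further cycles while staying within the response time window.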

**Access Boundaries**

The instruction interface will always access memory on word (= 32-bit) boundaries even if fetching
compressed (16-bit) instructions. The data interface can access memory on byte (= 8-bit), half-word (= 16-bit)
and word (= 32-bit) boundaries.

**Exclusive (Atomic) Access**

The CPU can access memory in an exclusive manner by generating a load-reserved and store-conditional
combination. Normally, these combinations should target the same memory address.

Line 1119...

Line 1179...

[TIP]
For more information regarding the SoC-level behavior and requirements of atomic operations see
section <<_processor_external_memory_interface_wishbone_axi4_lite>>.

**Memory Barriers**

Whenever the CPU executes a _fence_ instruction, the corresponding interface signal is set high for one cycle
(`d_bus_fence_o` for a `fence` instruction; `i_bus_fence_o` for a `fence.i` instruction). It is the task of the
memory system to perform the necessary operations (for example a cache flush and refill).

Line 1137...

Line 1198...

In order to reduce routing constraints (and thereby the actual hardware requirements), most uncritical
registers of the NEORV32 CPU as well as most registers of the whole NEORV32 Processor do not use **a
dedicated hardware reset**. "Uncritical registers" in this context means that the initial value of these registers
after power-up is not relevant for a defined CPU boot process.

**Rationale**

A good example to illustrate the concept of uncritical registers is a pipelined processing engine. Each stage
of the engine features an N-bit _data register_ and a 1-bit _status register_. The status register is set when the
data in the corresponding data register is valid. At the end of the pipeline the status register might trigger a write-back

Line 1148...

Line 1210...

irrelevant as long as the status registers are all reset to a defined value that indicates there is no valid data in
the pipeline's data registers. Therefore, the pipeline data registers do not require a dedicated reset as they do not
control the actual operation (in contrast to the status registers). This makes the pipeline data registers from
this example "uncritical registers".
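
The pipeline example can be sketched as a single stage in VHDL. This is only an illustration with hypothetical names (`pipe_stage_sketch`, `valid_*`, `data_*`) and is not taken from the NEORV32 sources: the status register gets a dedicated reset, while the data register is written without any reset.

[source,vhdl]
----
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical pipeline stage (illustration only).
entity pipe_stage_sketch is
  generic (
    N : positive := 32 -- data width
  );
  port (
    clk_i   : in  std_ulogic; -- clock, rising edge
    rstn_i  : in  std_ulogic; -- asynchronous reset, low-active
    valid_i : in  std_ulogic; -- incoming data is valid
    data_i  : in  std_ulogic_vector(N-1 downto 0);
    valid_o : out std_ulogic; -- status register output
    data_o  : out std_ulogic_vector(N-1 downto 0) -- data register output
  );
end entity;

architecture rtl of pipe_stage_sketch is
begin

  -- status register: controls the operation, so it needs a defined reset value
  status_reg: process(rstn_i, clk_i)
  begin
    if (rstn_i = '0') then
      valid_o <= '0'; -- "no valid data in this stage"
    elsif rising_edge(clk_i) then
      valid_o <= valid_i;
    end if;
  end process;

  -- data register: "uncritical", its content is ignored while valid_o is low
  data_reg: process(clk_i)
  begin
    if rising_edge(clk_i) then
      data_o <= data_i;
    end if;
  end process;

end architecture;
----

Since `data_o` is only evaluated when `valid_o` is set, its undefined power-up value never affects the operation of the pipeline.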

**NEORV32 CPU Reset**

In terms of the NEORV32 CPU, there are several pipeline registers, state machine registers and even status
and control registers (CSRs) that do not require a defined initial state to ensure a correct boot process. The
pipeline registers will get initialized by the CPU's internal state machines, which are initialized from the main

Line 1162...

Line 1225...

the lack of dedicated hardware resets of certain CSRs. For example the machine interrupt-enable CSR <<_mie>>
does not provide a dedicated reset. The value after reset of this register is uncritical as interrupts cannot fire
because the global interrupt enable flag in the status register (`mstatus(mie)`) _does_ provide a dedicated
hardware reset setting this bit to low (globally disabling interrupts).

**Reset Configuration**

Most CPU-internal registers do provide an asynchronous reset in the VHDL code, but the "don't care" value
(VHDL `'-'`) is used for initialization of all uncritical registers, effectively generating a flip-flop without a
reset. However, certain applications or situations (like advanced gate-level / timing simulations) might

Line 1176...

Line 1240...

----
-- use dedicated hardware reset value for UNCRITICAL registers --
-- FALSE=reset value is irrelevant (might simplify HW), default; TRUE=defined LOW reset value
constant dedicated_reset_c : boolean := false;
----
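
How such a configuration constant could select the reset value of an uncritical register is sketched below. This is an assumption-based illustration, not the code from `rtl/neorv32_package.vhd`: the entity, the `DEDICATED_RESET` generic (standing in for `dedicated_reset_c`) and the helper function `sel_rst_val` are hypothetical.

[source,vhdl]
----
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical uncritical register with configurable reset value (illustration only).
entity uncritical_reg_sketch is
  generic (
    DEDICATED_RESET : boolean := false -- assumed stand-in for dedicated_reset_c
  );
  port (
    clk_i  : in  std_ulogic; -- clock, rising edge
    rstn_i : in  std_ulogic; -- asynchronous reset, low-active
    d_i    : in  std_ulogic_vector(7 downto 0);
    q_o    : out std_ulogic_vector(7 downto 0)
  );
end entity;

architecture rtl of uncritical_reg_sketch is

  -- hypothetical helper: defined LOW value if requested, "don't care" otherwise
  function sel_rst_val(dedicated : boolean) return std_ulogic is
  begin
    if dedicated then
      return '0';
    else
      return '-';
    end if;
  end function;

  constant rst_val_c : std_ulogic := sel_rst_val(DEDICATED_RESET);

begin

  reg: process(rstn_i, clk_i)
  begin
    if (rstn_i = '0') then
      q_o <= (others => rst_val_c); -- '-' lets the tool omit the reset logic
    elsif rising_edge(clk_i) then
      q_o <= d_i;
    end if;
  end process;

end architecture;
----

With the "don't care" value, a synthesis tool is free to implement the register without reset logic, while a defined value keeps gate-level or timing simulations free of unknown states.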

// ####################################################################################################################

include::cpu_cfu.adoc[]


[/] [neorv32/] [trunk/] [docs/] [datasheet/] [cpu.adoc] - Diff between revs 72 and 73