OpenCores

Rev 64	Rev 65
Line 19...	Line 19...
** `Zifencei` - instruction stream synchronization	** `Zifencei` - instruction stream synchronization
** `Zmmul` - integer multiplication hardware	** `Zmmul` - integer multiplication hardware
** `PMP` - physical memory protection	** `PMP` - physical memory protection
** `HPM` - hardware performance monitors	** `HPM` - hardware performance monitors
** `DB` - debug mode	** `DB` - debug mode
`* Compatible to the RISC-V user specifications and a subset of the RISC-V privileged architecture specifications – passes the official RISC-V Architecture Tests (v2+)`	`* Compatible to the RISC-V user specifications and a subset of the RISC-V privileged architecture specifications - passes the official RISC-V Architecture Tests (v2+)`
`* Official RISC-V open-source architecture ID`	`* Official RISC-V open-source architecture ID`
`* Standard RISC-V interrupts (_external_, _timer_, _software_) plus 16 _fast_ interrupts and 1 non-maskable interrupt`	`* Standard RISC-V interrupts (_external_, _timer_, _software_) plus 16 _fast_ interrupts`
`* Supports most of the traps from the RISC-V specifications (including bus access exceptions) and traps on all unimplemented/illegal/malformed instructions`	`* Supports most of the traps from the RISC-V specifications (including bus access exceptions) and traps on all unimplemented/illegal/malformed instructions`
`* Optional physical memory configuration (PMP), compatible to the RISC-V specifications`	`* Optional physical memory configuration (PMP), compatible to the RISC-V specifications`
`* Optional hardware performance monitors (HPM) for application benchmarking`	`* Optional hardware performance monitors (HPM) for application benchmarking`
`* Separated interfaces for instruction fetch and data access (merged into single bus via a bus switch for`	`* Separated interfaces for instruction fetch and data access (merged into single bus via a bus switch for`
`the NEORV32 processor)`	`the NEORV32 processor)`
`* little-endian byte order`	`* little-endian byte order`
`* Configurable hardware reset`	`* Configurable hardware reset`
`* No hardware support of unaligned data/instruction accesses – they will trigger an exception.`	`* No hardware support of unaligned data/instruction accesses - they will trigger an exception.`

`[NOTE]`	`[NOTE]`
`It is recommended to use the NEORV32 Processor as default top instance even if you only want to use the actual`	`It is recommended to use the NEORV32 Processor as default top instance even if you only want to use the actual`
`CPU. Simply disable all the processor-internal modules via the generics and you will get a "CPU`	`CPU. Simply disable all the processor-internal modules via the generics and you will get a "CPU`
`wrapper" that provides a minimal CPU environment and an external bus interface (like AXI4). This`	`wrapper" that provides a minimal CPU environment and an external bus interface (like AXI4). This`
Line 51...	Line 51...
`The NEORV32 CPU was designed from scratch based only on the official ISA / privileged architecture`	`The NEORV32 CPU was designed from scratch based only on the official ISA / privileged architecture`
`specifications. The following figure shows the simplified architecture of the CPU.`	`specifications. The following figure shows the simplified architecture of the CPU.`

`image::neorv32_cpu.png[align=center]`	`image::neorv32_cpu.png[align=center]`

`The CPU uses a pipelined architecture with basically two main stages. The first stage (IF – instruction fetch)`	`The CPU uses a pipelined architecture with basically two main stages. The first stage (IF - instruction fetch)`
`is responsible for fetching new instruction data from memory via the fetch engine. The instruction data is`	`is responsible for fetching new instruction data from memory via the fetch engine. The instruction data is`
`stored to a FIFO – the instruction prefetch buffer. The issue engine takes this data and assembles 32-bit`	`stored to a FIFO - the instruction prefetch buffer. The issue engine takes this data and assembles 32-bit`
`instruction words for the next pipeline stage. Compressed instructions – if enabled – are also decompressed`	`instruction words for the next pipeline stage. Compressed instructions - if enabled - are also decompressed`
`in this stage. The second stage (EX – execution) is responsible for actually executing the fetched instructions`	`in this stage. The second stage (EX - execution) is responsible for actually executing the fetched instructions`
`via the execute engine.`	`via the execute engine.`

`These two pipeline stages are based on a multi-cycle processing engine. So the processing of each stage for`	`These two pipeline stages are based on a multi-cycle processing engine. So the processing of each stage for`
`certain operations can take several cycles. Since the IF and EX stages are decoupled via the instruction`	`certain operations can take several cycles. Since the IF and EX stages are decoupled via the instruction`
`prefetch buffer, both stages can operate in parallel and with overlapping operations. Hence, the optimal CPI`	`prefetch buffer, both stages can operate in parallel and with overlapping operations. Hence, the optimal CPI`
Line 221...	Line 221...

`.Hardwired R/W CSRs`	`.Hardwired R/W CSRs`
`[IMPORTANT]`	`[IMPORTANT]`
The `misa`, `mip` and `mtval` CSRs in the NEORV32 are _read-only_.	The `misa`, `mip` and `mtval` CSRs in the NEORV32 are _read-only_.
`Any write accesses to them (in machine mode) are ignored and will _not_ cause any exceptions or side-effects.`	`Any write accesses to them (in machine mode) are ignored and will _not_ cause any exceptions or side-effects.`
	`Pending interrupts can only be cleared by acknowledging the interrupt-causing device. However, pending interrupts`
	can still be ignored by clearing the corresponding `mie` register bits.

`.Physical memory protection`	`.Physical memory protection`
`[IMPORTANT]`	`[IMPORTANT]`
`The physical memory protection (see section <<_machine_physical_memory_protection>>)`	`The physical memory protection (see section <<_machine_physical_memory_protection>>)`
`only supports the modes _OFF_ and _NAPOT_ yet and a minimal granularity of 8 bytes per region.`	`only supports the modes _OFF_ and _NAPOT_ yet and a minimal granularity of 8 bytes per region.`
Line 335...	Line 337...

`// ####################################################################################################################`	`// ####################################################################################################################`
`:sectnums:`	`:sectnums:`
`=== Instruction Sets and Extensions`	`=== Instruction Sets and Extensions`

The NEORV32 is an RISC-V `rv32i` architecture that provides several optional RISC-V CPU and ISA	The basic NEORV32 is a RISC-V `rv32i` architecture that provides several _optional_ RISC-V CPU and ISA
`(instruction set architecture) extensions. For more information regarding the RISC-V ISA extensions please`	`(instruction set architecture) extensions. For more information regarding the RISC-V ISA extensions please`
`see _The RISC-V Instruction Set Manual – Volume I: Unprivileged ISA_ and _The RISC-V Instruction Set Manual`	`see _The RISC-V Instruction Set Manual - Volume I: Unprivileged ISA_ and _The RISC-V Instruction Set Manual`
Volume II: Privileged Architecture_, which are available in the project's `docs/references` folder.	Volume II: Privileged Architecture_, which are available in the project's `docs/references` folder.

`[TIP]`	`[TIP]`
`The CPU can discover available ISA extensions via the <<_misa>> CSR and the`	`The CPU can discover available ISA extensions via the <<_misa>> CSR and the`
`CPU` <<_system_configuration_information_memory_sysinfo, SYSINFO>> register	`CPU` <<_system_configuration_information_memory_sysinfo, SYSINFO>> register
`or by executing an instruction and checking for an _illegal instruction exception_.`	`or by executing an instruction and checking for an _illegal instruction exception_.`
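
The `misa`-based discovery mentioned in the tip can be sketched as follows. This is a minimal sketch, assuming the standard RISC-V `misa` layout in which each single-letter extension maps to one bit (bit 0 = `A` up to bit 25 = `Z`). The helper name `misa_has_ext` is hypothetical and not part of the NEORV32 software framework; reading the CSR itself is not shown.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper (not a NEORV32 library function): test whether a
 * single-letter extension bit is set in a value read from the misa CSR.
 * Assumed layout (RISC-V privileged spec): bit 0 = 'A', ..., bit 25 = 'Z'. */
static bool misa_has_ext(uint32_t misa, char ext)
{
    if (ext < 'A' || ext > 'Z') {
        return false; /* only upper-case single-letter extensions are encoded */
    }
    return ((misa >> (ext - 'A')) & 1u) != 0u;
}
```

For a `misa` value with the `I`, `M` and `X` bits set, `misa_has_ext(misa, 'M')` would return true and `misa_has_ext(misa, 'A')` would return false.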

`[NOTE]`	`[NOTE]`
`Executing an instruction from an extension that is not implemented or not enabled (for example via the according`	`Executing an instruction from an extension that is not supported yet or that is currently not enabled`
`top entity generic) will raise an _illegal instruction_ exception.`	`(via the according top entity generic) will raise an _illegal instruction_ exception.`


==== `A` - Atomic Memory Access	==== `A` - Atomic Memory Access

`Atomic memory access instructions (for implementing semaphores and mutexes) are available when the`	`Atomic memory access instructions allow more sophisticated memory operations like implementing semaphores and mutexes.`
`CPU_EXTENSION_RISCV_A` configuration generic is _true_. In this case the following additional instructions	The RISC-V specs. define a specific _atomic_ extension that provides instructions for atomic memory accesses. The `A`
`are available:`	ISA extension is enabled if the `CPU_EXTENSION_RISCV_A` configuration generic is _true_.
	`In this case the following additional instructions are available:`

* `lr.w`: load-reservate	* `lr.w`: load-reservate
* `sc.w`: store-conditional	* `sc.w`: store-conditional

`[NOTE]`	`[NOTE]`
Even though only `lr.w` and `sc.w` instructions are implemented yet, all further atomic operations	Even though only `lr.w` and `sc.w` instructions are implemented yet, all further atomic operations
`(load-modify-write instructions) can be emulated using these two instructions. Furthermore, the`	`(load-modify-write instructions) can be emulated using these two instructions. Furthermore, the`
instruction’s ordering flags (`aq` and `rl`) are ignored by the CPU hardware. Using any other (not yet	instruction's ordering flags (`aq` and `rl`) are ignored by the CPU hardware. Using any other (not yet
`implemented) AMO (atomic memory operation) will trigger an illegal instruction exception.`	`implemented) AMO (atomic memory operation) will raise an illegal instruction exception.`

	The load-reservate instruction behaves as a "normal" load-word instruction (`lw`) but will also set a CPU-internal
	_data memory access lock_. A store-conditional behaves as a "normal" store-word instruction (`sw`) that will
	`only conduct an actual memory write operation if the lock is still intact. Additionally, the store-conditional instruction`
	`will also return the lock state (returns zero if the lock is still intact or non-zero if the lock has been broken).`
	After the execution of the `sc` instruction, the lock is automatically removed.

	`The lock is broken if at least one of the following conditions occurs:`
	. executing any data memory access instruction other than `lr.w`
	`. raising _any_ trap (for example an interrupt or a memory access exception)`
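
The lock behavior described above can be illustrated with a small software model. This is only a sketch of the stated rules, not the actual hardware implementation; the type `lrsc_model_t` and all `model_*` function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical software model of the CPU-internal data memory access lock. */
typedef struct {
    bool lock; /* true while the lock is intact */
} lrsc_model_t;

/* lr.w: behaves like a normal load-word and additionally sets the lock. */
static uint32_t model_lr_w(lrsc_model_t *cpu, const uint32_t *addr)
{
    cpu->lock = true;
    return *addr;
}

/* Any data memory access other than lr.w breaks the lock (shown for a load). */
static uint32_t model_lw(lrsc_model_t *cpu, const uint32_t *addr)
{
    cpu->lock = false;
    return *addr;
}

/* Any trap (interrupt or exception) breaks the lock as well. */
static void model_trap(lrsc_model_t *cpu)
{
    cpu->lock = false;
}

/* sc.w: writes only if the lock is still intact. Returns zero on success and
 * non-zero if the lock was broken; the lock is removed afterwards either way. */
static uint32_t model_sc_w(lrsc_model_t *cpu, uint32_t *addr, uint32_t data)
{
    uint32_t status = cpu->lock ? 0u : 1u;
    if (cpu->lock) {
        *addr = data;
    }
    cpu->lock = false;
    return status;
}
```

In this model an `lr.w`/`sc.w` pair without any intermediate access succeeds, while an intermediate `model_lw` or `model_trap` call makes the following `model_sc_w` fail and leaves memory untouched. Software would then typically retry the whole sequence.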

`[NOTE]`	`[NOTE]`
`The atomic instructions have special requirements for memory system / bus interconnect. More`	`The atomic instructions have special requirements for memory system / bus interconnect. More`
`information can be found in sections <<_bus_interface>> and <<_processor_external_memory_interface_wishbone_axi4_lite>>, respectively.`	`information can be found in sections <<_bus_interface>> and <<_processor_external_memory_interface_wishbone_axi4_lite>>, respectively.`


==== `C` - Compressed Instructions	==== `C` - Compressed Instructions

Compressed 16-bit instructions are available when the `CPU_EXTENSION_RISCV_C` configuration generic is	`The _compressed_ ISA extension provides 16-bit encodings of commonly used instructions to reduce code space size.`
`_true_. In this case the following instructions are available:`	The `C` extension is available when the `CPU_EXTENSION_RISCV_C` configuration generic is _true_.
	`In this case the following instructions are available:`

* `c.addi4spn`, `c.lw`, `c.sw`, `c.nop`, `c.addi`, `c.jal`, `c.li`, `c.addi16sp`, `c.lui`, `c.srli`, `c.srai` `c.andi`, `c.sub`,	* `c.addi4spn`, `c.lw`, `c.sw`, `c.nop`, `c.addi`, `c.jal`, `c.li`, `c.addi16sp`, `c.lui`, `c.srli`, `c.srai` `c.andi`, `c.sub`,
`c.xor`, `c.or`, `c.and`, `c.j`, `c.beqz`, `c.bnez`, `c.slli`, `c.lwsp`, `c.jr`, `c.mv`, `c.ebreak`, `c.jalr`, `c.add`, `c.swsp`	`c.xor`, `c.or`, `c.and`, `c.j`, `c.beqz`, `c.bnez`, `c.slli`, `c.lwsp`, `c.jr`, `c.mv`, `c.ebreak`, `c.jalr`, `c.add`, `c.swsp`

`[NOTE]`	`[NOTE]`
`When the compressed instructions extension is enabled, branches to an _unaligned_ and _uncompressed_ address require`	`When the compressed instructions extension is enabled, branches to an _unaligned_ and _uncompressed_ instruction require`
`an additional instruction fetch to load the required second half-word of that instruction. The performance can be increased`	`an additional instruction fetch to load the according second half-word of that instruction. The performance can be increased`
again by forcing a 32-bit alignment of branch target addresses. By default, this is enforced via the GCC `-falign-functions=4`,	again by forcing a 32-bit alignment of branch target addresses. By default, this is enforced via the GCC `-falign-functions=4`,
`-falign-labels=4`, `-falign-loops=4` and `-falign-jumps=4` compile flags (via the makefile).	`-falign-labels=4`, `-falign-loops=4` and `-falign-jumps=4` compile flags (via the makefile).


==== `E` - Embedded CPU	==== `E` - Embedded CPU

`The embedded CPU extension reduces the size of the general purpose register file from 32 entries to 16 entries to reduce hardware`	`The embedded CPU extension reduces the size of the general purpose register file from 32 entries to 16 entries to`
requirements. This extension is enabled when the `CPU_EXTENSION_RISCV_E` configuration generic is _true_. Accesses to registers beyond	decrease physical hardware requirements (for example block RAM). This extension is enabled when the `CPU_EXTENSION_RISCV_E`
`x15` will raise an _illegal instruction exception_.	configuration generic is _true_. Accesses to registers beyond `x15` will raise an _illegal instruction exception_.
	`This extension does not add any additional instructions or features.`

`[IMPORTANT]`	`[IMPORTANT]`
Due to the reduced register file size an alternate toolchain ABI (`ilp32e`) is required.	Due to the reduced register file size an alternate toolchain ABI (`ilp32e`) is required.


==== `I` - Base Integer ISA	==== `I` - Base Integer ISA

The CPU always supports the complete `rv32i` base integer instruction set. This base set is always enabled	The CPU always supports the complete `rv32i` base integer instruction set. This base set is always enabled
`regardless of the setting of the remaining extensions. The base instruction set includes the following`	`regardless of the setting of the remaining extensions. The base instruction set includes the following`
`instructions:`	`instructions:`

* immediates: `lui`, `auipc`	* immediate: `lui`, `auipc`
* jumps: `jal`, `jalr`	* jumps: `jal`, `jalr`
* branches: `beq`, `bne`, `blt`, `bge`, `bltu`, `bgeu`	* branches: `beq`, `bne`, `blt`, `bge`, `bltu`, `bgeu`
* memory: `lb`, `lh`, `lw`, `lbu`, `lhu`, `sb`, `sh`, `sw`	* memory: `lb`, `lh`, `lw`, `lbu`, `lhu`, `sb`, `sh`, `sw`
* alu: `addi`, `slti`, `sltiu`, `xori`, `ori`, `andi`, `slli`, `srli`, `srai`, `add`, `sub`, `sll`, `slt`, `sltu`, `xor`, `srl`, `sra`, `or`, `and`	* alu: `addi`, `slti`, `sltiu`, `xori`, `ori`, `andi`, `slli`, `srli`, `srai`, `add`, `sub`, `sll`, `slt`, `sltu`, `xor`, `srl`, `sra`, `or`, `and`
* environment: `ecall`, `ebreak`, `fence`	* environment: `ecall`, `ebreak`, `fence`
Line 421...	Line 437...
executed. Any flags within the `fence` instruction word are ignored by the hardware.	executed. Any flags within the `fence` instruction word are ignored by the hardware.


==== `M` - Integer Multiplication and Division	==== `M` - Integer Multiplication and Division

`Hardware-accelerated integer multiplication and division instructions are available when the`	`Hardware-accelerated integer multiplication and division operations are available when the`
`CPU_EXTENSION_RISCV_M` configuration generic is _true_. In this case the following instructions are	`CPU_EXTENSION_RISCV_M` configuration generic is _true_. In this case the following instructions are
`available:`	`available:`

* multiplication: `mul`, `mulh`, `mulhsu`, `mulhu`	* multiplication: `mul`, `mulh`, `mulhsu`, `mulhu`
* division: `div`, `divu`, `rem`, `remu`	* division: `div`, `divu`, `rem`, `remu`
Line 438...	Line 454...


==== `Zmmul` - Integer Multiplication	==== `Zmmul` - Integer Multiplication

This is a _sub-extension_ of the `M` ISA extension. It implements the multiplication-only operations	This is a _sub-extension_ of the `M` ISA extension. It implements the multiplication-only operations
of the `M` extension and is intended for small scale applications that require hardware-based	of the `M` extension and is intended for size-constrained setups that require hardware-based
`integer multiplications but not hardware-based divisions, which will be computed entirely in software.`	`integer multiplications but not hardware-based divisions, which will be computed entirely in software.`
This extension requires only ~50% of the hardware utilization of the `M` extension.	This extension requires only ~50% of the hardware utilization of the "full" `M` extension.

* multiplication: `mul`, `mulh`, `mulhsu`, `mulhu`	* multiplication: `mul`, `mulh`, `mulhsu`, `mulhu`

If `Zmmul` is enabled, executing any division instruction from the `M` ISA extension (`div`, `divu`, `rem`, `remu`)	If `Zmmul` is enabled, executing any division instruction from the `M` ISA extension (`div`, `divu`, `rem`, `remu`)
`will raise an _illegal instruction exception_.`	`will raise an _illegal instruction exception_.`
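
When only `Zmmul` is available, divisions are carried out in software, normally by routines supplied with the compiler's runtime library. The following is a minimal sketch of what such a routine does, using a plain shift-and-subtract algorithm. The names `udiv32_result_t` and `soft_udiv32` are hypothetical, and the division-by-zero results are chosen here to mirror the RISC-V `divu`/`remu` convention (all-ones quotient, dividend as remainder).

```c
#include <stdint.h>

/* Hypothetical result type: quotient and remainder of an unsigned division. */
typedef struct {
    uint32_t quot;
    uint32_t rem;
} udiv32_result_t;

/* Hypothetical software division using shift-and-subtract (one bit per step). */
static udiv32_result_t soft_udiv32(uint32_t dividend, uint32_t divisor)
{
    udiv32_result_t r = { 0u, 0u };

    if (divisor == 0u) {
        /* assumption: mirror the RISC-V divu/remu results for division by zero */
        r.quot = UINT32_MAX;
        r.rem  = dividend;
        return r;
    }

    for (int i = 31; i >= 0; i--) {
        /* shift the next dividend bit (MSB first) into the partial remainder */
        r.rem = (r.rem << 1) | ((dividend >> i) & 1u);
        if (r.rem >= divisor) {
            r.rem  -= divisor;
            r.quot |= (1u << i);
        }
    }
    return r;
}
```

For example, `soft_udiv32(100, 7)` yields a quotient of 14 and a remainder of 2. The loop only needs shifts, comparisons and subtractions, so it runs on a CPU without any hardware division support.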
Line 452...	Line 468...
Note that `M` and `Zmmul` extensions _cannot_ be enabled at the same time.	Note that `M` and `Zmmul` extensions _cannot_ be enabled at the same time.

`[TIP]`	`[TIP]`
If your RISC-V GCC toolchain does not (yet) support the `_Zmmul` ISA extension, it can be "emulated"	If your RISC-V GCC toolchain does not (yet) support the `_Zmmul` ISA extension, it can be "emulated"
using a `rv32im` machine architecture and setting the `-mno-div` compiler flag	using a `rv32im` machine architecture and setting the `-mno-div` compiler flag
(example `$ make MARCH=-march=rv32im USER_FLAGS+=-mno-div clean_all exe`).	(example `$ make MARCH=rv32im USER_FLAGS+=-mno-div clean_all exe`).


==== `U` - Less-Privileged User Mode	==== `U` - Less-Privileged User Mode

Adds the less-privileged _user mode_ if the `CPU_EXTENSION_RISCV_U` configuration generic is _true_. For	`In addition to the basic (and highest-privileged) machine-mode, the _user-mode_ ISA extension adds a second less-privileged`
`instance, user-level code cannot access machine-mode CSRs. Furthermore, access to the address space (like`	operation mode. It is implemented if the `CPU_EXTENSION_RISCV_U` configuration generic is _true_.
`peripheral/IO devices) can be limited via the physical memory protection (_PMP_) unit for code running in user mode.`	`Code executed in user-mode cannot access machine-mode CSRs. Furthermore, user-mode access to the address space (like`
	`peripheral/IO devices) can be constrained via the physical memory protection (_PMP_).`
	`Any kind of privilege rights violation will raise an exception to allow full virtualization.`


==== `X` - NEORV32-Specific (Custom) Extensions	==== `X` - NEORV32-Specific (Custom) Extensions

The NEORV32-specific extensions are always enabled and are indicated by the set `X` bit in the `misa` CSR.	The NEORV32-specific extensions are always enabled and are indicated by the set `X` bit in the `misa` CSR.
Line 475...	Line 493...
`* All undefined/unimplemented/malformed/illegal instructions do raise an illegal instruction exception (see <<_full_virtualization>>).`	`* All undefined/unimplemented/malformed/illegal instructions do raise an illegal instruction exception (see <<_full_virtualization>>).`


==== `Zfinx` Single-Precision Floating-Point Operations	==== `Zfinx` Single-Precision Floating-Point Operations

`[WARNING]`	The `Zfinx` floating-point extension is an _alternative_ to the standard `F` floating-point ISA extension.
The NEORV32 `Zfinx` extension is specification-compliant and operational but still _experimental_.	The `Zfinx` extension uses the integer register file `x` to store and operate on floating-point data
	instead of a dedicated floating-point register file (hence, `F-in-x`). Thus, the `Zfinx` extension requires
The `Zfinx` floating-point extension is an alternative to the `F` floating-point extension that uses the	fewer hardware resources and features faster context changes. This also implies that there are NO dedicated `f`
integer register file `x` to store and operate on floating-point data (hence, `F-in-x`). Since no dedicated floating-point `f`	`register file-related load/store or move instructions.`
register file exists, the `Zfinx` extension requires fewer hardware resources and features faster context changes.	`The official RISC-V specifications can be found here: https://github.com/riscv/riscv-zfinx`
This also implies that there are NO dedicated `f` register file related load/store or move instructions. The
`official RISC-V specifications can be found here: https://github.com/riscv/riscv-zfinx`

	`[TIP]`
The NEORV32 floating-point unit used by the `Zfinx` extension is compatible to the _IEEE-754_ specifications.	The NEORV32 floating-point unit used by the `Zfinx` extension is compatible to the _IEEE-754_ specifications.

The `Zfinx` extension only supports single-precision (`.s` suffix) yet (so it is a direct alternative to the `F`	The `Zfinx` extension only supports single-precision (`.s` instruction suffix), so it is a direct alternative
extension). The `Zfinx` extension is implemented when the `CPU_EXTENSION_RISCV_Zfinx` configuration	to the `F` extension. The `Zfinx` extension is implemented when the `CPU_EXTENSION_RISCV_Zfinx` configuration
`generic is _true_. In this case the following instructions and CSRs are available:`	`generic is _true_. In this case the following instructions and CSRs are available:`

* conversion: `fcvt.s.w`, `fcvt.s.wu`, `fcvt.w.s`, `fcvt.wu.s`	* conversion: `fcvt.s.w`, `fcvt.s.wu`, `fcvt.w.s`, `fcvt.wu.s`
* comparison: `fmin.s`, `fmax.s`, `feq.s`, `flt.s`, `fle.s`	* comparison: `fmin.s`, `fmax.s`, `feq.s`, `flt.s`, `fle.s`
* computational: `fadd.s`, `fsub.s`, `fmul.s`	* computational: `fadd.s`, `fsub.s`, `fmul.s`
Line 503...	Line 520...
`[WARNING]`	`[WARNING]`
Fused multiply-add instructions `f[n]m[add/sub].s` are not supported!	Fused multiply-add instructions `f[n]m[add/sub].s` are not supported!
Division `fdiv.s` and square root `fsqrt.s` instructions are not supported yet!	Division `fdiv.s` and square root `fsqrt.s` instructions are not supported yet!

`[WARNING]`	`[WARNING]`
`Subnormal numbers (also "de-normalized" numbers) are not supported by the NEORV32 FPU.`	`Subnormal numbers ("de-normalized" numbers) are not supported by the NEORV32 FPU.`
`Subnormal numbers (exponent = 0) are _flushed to zero_ (setting them to +/- 0) before entering the`	`Subnormal numbers (exponent = 0) are _flushed to zero_ setting them to +/- 0 before entering the`
FPU's processing core. If a computational instruction (like `fmul.s`) generates a subnormal result, the	FPU's processing core. If a computational instruction (like `fmul.s`) generates a subnormal result, the
`result is also flushed to zero during normalization.`	`result is also flushed to zero during normalization.`
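
The flush-to-zero rule can be expressed on the raw single-precision bit pattern. This is a minimal sketch of the described behavior, not the FPU's actual logic; the function name `flush_subnormal_to_zero` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper operating on a raw IEEE-754 single-precision bit
 * pattern (with Zfinx, floating-point data lives in the integer registers). */
static uint32_t flush_subnormal_to_zero(uint32_t bits)
{
    const uint32_t sign     = bits & 0x80000000u; /* bit 31      */
    const uint32_t exponent = bits & 0x7F800000u; /* bits 30..23 */

    /* exponent field == 0: the value is subnormal (or already zero), so only
     * the sign bit is kept and the result is +0 or -0 */
    if (exponent == 0u) {
        return sign;
    }
    return bits; /* normal numbers, infinities and NaNs pass through unchanged */
}
```

Software that has to match FPU results bit-exactly (for example a reference model used in a test program) would need to apply the same flushing to operands and to results.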

`[WARNING]`	`[WARNING]`
The `Zfinx` extension is not yet officially ratified, but is expected to stay unchanged. There is no	The `Zfinx` extension is not yet officially ratified, but is expected to stay unchanged. There is no
Line 517...	Line 534...
code (see `sw/example/floating_point_test`).	code (see `sw/example/floating_point_test`).


==== `Zbb` Basic Bit-Manipulation Operations	==== `Zbb` Basic Bit-Manipulation Operations

`[WARNING]`
The NEORV32 `Zbb` extension is specification-compliant and operational but still _experimental_.

The `Zbb` extension implements the _basic_ sub-set of the RISC-V bit-manipulation extensions `B`.	The `Zbb` extension implements the _basic_ sub-set of the RISC-V bit-manipulation extensions `B`.
`The official RISC-V specifications can be found here: https://github.com/riscv/riscv-bitmanip`	`The official RISC-V specifications can be found here: https://github.com/riscv/riscv-bitmanip`

The `Zbb` extension is implemented when the `CPU_EXTENSION_RISCV_Zbb` configuration	The `Zbb` extension is implemented when the `CPU_EXTENSION_RISCV_Zbb` configuration
`generic is _true_. In this case the following instructions are available:`	`generic is _true_. In this case the following instructions are available:`
Line 539...	Line 553...
`By default, the bit-manipulation unit uses an _iterative_ approach to compute shift-related operations`	`By default, the bit-manipulation unit uses an _iterative_ approach to compute shift-related operations`
like `clz` and `rol`. To increase performance (at the cost of additional hardware resources) the	like `clz` and `rol`. To increase performance (at the cost of additional hardware resources) the
`<<_fast_shift_en>> generic can be enabled to implement full-parallel logic (like barrel shifters) for all`	`<<_fast_shift_en>> generic can be enabled to implement full-parallel logic (like barrel shifters) for all`
shift-related `Zbb` instructions.	shift-related `Zbb` instructions.

`[IMPORTANT]`	`[WARNING]`
The `Zbb` extension is frozen but not officially ratified yet. There is no	The `Zbb` extension is frozen but not officially ratified yet. There is no
`software support for this extension in the upstream GCC RISC-V port yet. However, an`	`software support for this extension in the upstream GCC RISC-V port yet. However, an`
intrinsic library is provided to utilize the provided `Zbb` extension from C-language	intrinsic library is provided to utilize the provided `Zbb` extension from C-language
code (see `sw/example/bitmanip_test`).	code (see `sw/example/bitmanip_test`).


==== `Zicsr` Control and Status Register Access / Privileged Architecture	==== `Zicsr` Control and Status Register Access / Privileged Architecture

`The CSR access instructions as well as the exception and interrupt system (= the privileged architecture) is implemented when the`	`The CSR access instructions as well as the exception and interrupt system (= the privileged architecture)`
`CPU_EXTENSION_RISCV_Zicsr` configuration generic is _true_. In this case the following instructions are	is implemented when the `CPU_EXTENSION_RISCV_Zicsr` configuration generic is _true_.
`available:`	`In this case the following instructions are available:`

* CSR access: `csrrw`, `csrrs`, `csrrc`, `csrrwi`, `csrrsi`, `csrrci`	* CSR access: `csrrw`, `csrrs`, `csrrc`, `csrrwi`, `csrrsi`, `csrrci`
* environment: `mret`, `wfi`	* environment: `mret`, `wfi`

`[WARNING]`	`[WARNING]`
If the `Zicsr` extension is disabled the CPU does not provide any kind of interrupt or exception	If the `Zicsr` extension is disabled the CPU does not provide any _privileged architecture_ features at all!
`support at all. In order to provide the full spectrum of functions and to allow a secure execution`	`In order to provide the full set of functions and to allow a secure execution`
environment, the `Zicsr` extension should always be enabled.	environment the `Zicsr` extension should always be enabled.

`[NOTE]`	`[NOTE]`
The "wait for interrupt instruction" `wfi` works like a sleep command. When executed, the CPU is	The "wait for interrupt instruction" `wfi` works like a sleep command. When executed, the CPU is
`halted until a valid interrupt request occurs. To wake up again, the according interrupt source has to`	`halted until a valid interrupt request occurs. To wake up again, the according interrupt source has to`
be enabled via the `mie` CSR and the global interrupt enable flag in `mstatus` has to be set.	be enabled via the `mie` CSR and the global interrupt enable flag in `mstatus` has to be set.

`[IMPORTANT]`	`[NOTE]`
The `wfi` instruction will raise an illegal instruction exception when executed outside of machine-mode	The `wfi` instruction may also be executed in user-mode without causing an exception as <<_mstatus>> bit
and <<_mstatus>> bit `TW` (timeout wait) is set.	`TW` (timeout wait) is hardwired to zero.


==== `Zifencei` Instruction Stream Synchronization	==== `Zifencei` Instruction Stream Synchronization

The `Zifencei` CPU extension is implemented if the `CPU_EXTENSION_RISCV_Zifencei` configuration	The `Zifencei` CPU extension is implemented if the `CPU_EXTENSION_RISCV_Zifencei` configuration
`generic is _true_. It allows manual synchronization of the instruction stream via the following instruction:`	`generic is _true_. It allows manual synchronization of the instruction stream via the following instruction:`

* `fence.i`	* `fence.i`

`[NOTE]`
The `fence.i` instruction resets the CPU's internal instruction fetch engine and flushes the prefetch buffer.	The `fence.i` instruction resets the CPU's internal instruction fetch engine and flushes the prefetch buffer.
This allows a clean re-fetch of modified instructions from memory. Also, the top's `i_bus_fencei_o` signal is set	This allows a clean re-fetch of modified instructions from memory. Also, the top's `i_bus_fencei_o` signal is set
`high for one cycle to inform the memory system (like the i-cache) to perform a flush/reload.`	`high for one cycle to inform the memory system (like the i-cache) to perform a flush/reload.`
Any additional flags within the `fence.i` instruction word are ignored by the hardware.	Any additional flags within the `fence.i` instruction word are ignored by the hardware.


==== `PMP` Physical Memory Protection	==== `PMP` Physical Memory Protection

`The NEORV32 physical memory protection (PMP) is compatible to the PMP specified by the RISC-V specs.`	`The NEORV32 physical memory protection (PMP) is compatible to the RISC-V PMP specifications. It can be used`
`The CPU PMP only supports _NAPOT_ mode yet and a minimal region size (granularity) of 8 bytes. Larger minimal sizes can be configured`	`to constrain memory read/write/execute rights for each available privilege level.`
via the top `PMP_MIN_GRANULARITY` generic to reduce hardware requirements. The physical memory protection system is implemented when the
`PMP_NUM_REGIONS` configuration generic is >0. In this case the following additional CSRs are available:	`The NEORV32 PMP only supports _NAPOT_ mode yet and a minimal region size (granularity) of 8 bytes. Larger`
	minimal sizes can be configured via the top `PMP_MIN_GRANULARITY` generic to reduce hardware requirements.
	The physical memory protection system is implemented when the `PMP_NUM_REGIONS` configuration generic is >0.
	`In this case the following additional CSRs are available:`

* `pmpcfg*` (0..15, depending on configuration): PMP configuration registers	* `pmpcfg*` (0..15, depending on configuration): PMP configuration registers
* `pmpaddr*` (0..63, depending on configuration): PMP address registers	* `pmpaddr*` (0..63, depending on configuration): PMP address registers

	`[TIP]`
`See section <<_machine_physical_memory_protection>> for more information regarding the PMP CSRs.`	`See section <<_machine_physical_memory_protection>> for more information regarding the PMP CSRs.`
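
Since only _NAPOT_ (naturally aligned power-of-two) regions are supported, a region has to be encoded into a single `pmpaddr` value. The following is a minimal sketch of that encoding as defined by the RISC-V privileged specification, assuming the default minimal granularity of 8 bytes. The function name `pmp_napot_encode` is hypothetical and not part of the NEORV32 software framework.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: compute the pmpaddr value for a NAPOT region.
 * base: region start address (must be aligned to size)
 * size: region size in bytes (power of two, at least 8)
 * Returns false if the region cannot be expressed in NAPOT mode. */
static bool pmp_napot_encode(uint32_t base, uint32_t size, uint32_t *pmpaddr)
{
    /* size has to be a power of two and not smaller than 8 bytes */
    if (size < 8u || (size & (size - 1u)) != 0u) {
        return false;
    }
    /* base has to be naturally aligned to the region size */
    if ((base & (size - 1u)) != 0u) {
        return false;
    }
    /* pmpaddr holds address bits [33:2]; the number of trailing one-bits
     * encodes the region size (no trailing ones = 8 bytes, one = 16 bytes, ...) */
    *pmpaddr = (base >> 2) | ((size >> 3) - 1u);
    return true;
}
```

For example, a 64 KiB region at `0x80000000` would be encoded as `0x20001FFF`. The region only becomes active once the corresponding `pmpcfg` entry selects NAPOT mode and sets the desired access attributes.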

`Configuration`

`The actual number of regions and the minimal region granularity are defined via the top entity`	`The actual number of regions and the minimal region granularity are defined via the top entity`
`PMP_MIN_GRANULARITY` and `PMP_NUM_REGIONS` generics. `PMP_MIN_GRANULARITY` defines the minimal available	`PMP_MIN_GRANULARITY` and `PMP_NUM_REGIONS` generics. `PMP_MIN_GRANULARITY` defines the minimal available
granularity of each region in bytes. `PMP_NUM_REGIONS` defines the total number of implemented regions and thus, the	granularity of each region in bytes. `PMP_NUM_REGIONS` defines the total number of implemented regions and thus, the
number of available `pmpcfg` and `pmpaddr` CSRs.	number of available `pmpcfg` and `pmpaddr` CSRs.

Line 618...	Line 633...
`constant pmp_num_regions_critical_c : natural := 8;`	`constant pmp_num_regions_critical_c : natural := 8;`
`----`	`----`

`Operation`	`Operation`

`Any memory access address (from the CPU's instruction fetch or data access interface) is tested if it is accessing any`	`Any CPU memory access address (from the instruction fetch or data access interface) is tested if it is accessing _any_`
of the specified (configured via `pmpaddr` and enabled via `pmpcfg`) PMP regions. If an	of the specified PMP regions (configured via `pmpaddr` and enabled via `pmpcfg`). If an
address accesses one of these regions, the configured access rights (attributes in `pmpcfg*`) are checked:	address matches one of these regions, the configured access rights (attributes in `pmpcfg*`) are enforced:

`* a write access (store) will fail if no write attribute is set`	`* a write access (store) will fail if no write attribute is set`
`* a read access (load) will fail if no read attribute is set`	`* a read access (load) will fail if no read attribute is set`
`* an instruction fetch access will fail if no execute attribute is set`	`* an instruction fetch access will fail if no execute attribute is set`

`If an access to a protected region does not have the according access rights (attributes) it will raise the according`	`If an access to a protected region does not have the according access rights it will raise the according`
`_instruction/load/store access fault exception_.`	`instruction/load/store _access fault_ exception.`

`By default, all PMP checks are enforced for user-level programs only. If you wish to enforce the physical`	`By default, all PMP checks are enforced for user-level programs only. If you wish to enforce the physical`
`memory protection also for machine-level programs you need to active the _locked bit_ in the according`	`memory protection also for machine-level programs you need to set the _locked bit_ in the according`
`pmpcfg*` configuration.	`pmpcfg*` configuration CSR.
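
As a rough illustration of the rules above, the following C sketch models the permission check for an access that has already matched a region. All function and type names are hypothetical and not taken from the NEORV32 sources. The bit positions of the R, W, X and L attributes are those of the `pmpcfg` layout in the RISC-V privileged specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* pmpcfg attribute bits (RISC-V privileged specification) */
#define PMP_R 0x01u  /* read permitted */
#define PMP_W 0x02u  /* write permitted */
#define PMP_X 0x04u  /* execute permitted */
#define PMP_L 0x80u  /* locked: also enforced in machine mode */

typedef enum { ACC_LOAD, ACC_STORE, ACC_FETCH } access_t;

/* Returns true if an access that matched a region with configuration
 * byte cfg is permitted; false means an access fault would be raised. */
static bool pmp_access_ok(uint8_t cfg, access_t acc, bool machine_mode)
{
    if (machine_mode && !(cfg & PMP_L))
        return true;  /* unlocked regions are not enforced for machine mode */
    switch (acc) {
    case ACC_LOAD:  return (cfg & PMP_R) != 0;
    case ACC_STORE: return (cfg & PMP_W) != 0;
    case ACC_FETCH: return (cfg & PMP_X) != 0;
    }
    return false;
}
```

The model only covers the attribute check; region matching and the handling of accesses that hit no region are left out on purpose.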

`[IMPORTANT]`	`[IMPORTANT]`
After updating the address configuration registers `pmpaddr*` the system requires up to 33 cycles for	After updating the address configuration registers `pmpaddr*` the system requires up to 33 cycles for
`internal (iterative) computations before the configuration becomes valid.`	`internal (iterative) computations before the configuration becomes valid.`

`[NOTE]`	`[NOTE]`
`For more information regarding RISC-V physical memory protection see the official _The RISC-V`	`For more information regarding RISC-V physical memory protection see the official _The RISC-V`
`Instruction Set Manual – Volume II: Privileged Architecture_ specifications.`	`Instruction Set Manual - Volume II: Privileged Architecture_ specifications.`


==== `HPM` Hardware Performance Monitors	==== `HPM` Hardware Performance Monitors

In addition to the mandatory cycles (`[m]cycle[h]`) and instruction (`[m]instret[h]`) counters the NEORV32 CPU provides	In addition to the mandatory cycle (`[m]cycle[h]`) and instruction (`[m]instret[h]`) counters the NEORV32 CPU provides
`up to 29 hardware performance monitors (HPM 3..31), which can be used to benchmark applications. Each HPM consists of an`	`up to 29 hardware performance monitors (HPM 3..31), which can be used to benchmark applications. Each HPM consists of an`
`N-bit wide counter (split in a high-word 32-bit CSR and a low-word 32-bit CSR), where N is defined via the top's`	`N-bit wide counter (split in a high-word 32-bit CSR and a low-word 32-bit CSR), where N is defined via the top's`
`HPM_CNT_WIDTH` generic (0..64-bit), and a corresponding event configuration CSR. The event configuration	`HPM_CNT_WIDTH` generic (0..64-bit) and a corresponding event configuration CSR. The event configuration
`CSR defines the architectural events that lead to an increment of the associated HPM counter.`	`CSR defines the architectural events that lead to an increment of the associated HPM counter.`
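
Because each counter is split into a high-word and a low-word CSR, software that needs the full value has to combine two separate reads. A common way to do this is sketched below in C. The accessor type and the simulated counter are hypothetical stand-ins so the example can run on a host; on real hardware the accessors would wrap CSR read instructions.

```c
#include <stdint.h>

/* Hypothetical accessor type: returns the current value of one 32-bit CSR. */
typedef uint32_t (*csr_read_fn)(void);

/* Combine the high-word and low-word CSRs into one 64-bit value.
 * The high word is read twice; if it changed, the low word overflowed
 * between the reads and the sequence is repeated. */
static uint64_t hpm_read64(csr_read_fn read_hi, csr_read_fn read_lo)
{
    uint32_t hi, lo, hi2;
    do {
        hi  = read_hi();
        lo  = read_lo();
        hi2 = read_hi();
    } while (hi != hi2);
    return ((uint64_t)hi << 32) | lo;
}

/* Host-side stand-in: a counter that advances on every low-word read. */
static uint64_t sim_counter = 0x00000001FFFFFFFFull;
static uint32_t sim_read_hi(void) { return (uint32_t)(sim_counter >> 32); }
static uint32_t sim_read_lo(void)
{
    uint32_t lo = (uint32_t)sim_counter;
    sim_counter++;  /* simulate the counter ticking during the read */
    return lo;
}
```

With the simulated counter starting right before a low-word overflow, the first attempt sees inconsistent high words and the loop retries once. If `HPM_CNT_WIDTH` is 32 or less, reading the low word alone would be sufficient.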

The cycle, time and instructions-retired counters (`[m]cycle[h]`, `time[h]`, `[m]instret[h]`) are	The cycle, time and instructions-retired counters (`[m]cycle[h]`, `time[h]`, `[m]instret[h]`) are
`mandatory performance monitors on every RISC-V platform and have fixed increment events. For example,`	`mandatory performance monitors on every RISC-V platform and have fixed increment events. For example,`
`the instructions-retired counter increments with each executed instruction. The actual hardware performance`	`the instructions-retired counter increments with each executed instruction. The actual hardware performance`
`monitors are optional and can be configured to increment on arbitrary hardware events. The number of`	`monitors are optional and can be configured to increment on arbitrary hardware events. The number of`
available HPM is configured via the top's `HPM_NUM_CNTS` generic at synthesis time. Assigning a zero will exclude	available HPM is configured via the top's `HPM_NUM_CNTS` generic at synthesis time. Assigning a zero will remove
`all HPM logic from the design.`	`all HPM logic from the design.`

`Depending on the configuration, the following additional CSRs are available:`	If `HPM_NUM_CNTS` is lower than the maximum value (=29) the remaining HPM CSRs are not implemented and the
	according `mcountinhibit` CSR bits are hardwired to zero.
	`However, accessing their associated CSRs will not raise an illegal instruction exception (if in machine mode).`
	`The according CSRs are read-only and will always return 0.`

	`Depending on the configuration the following additional CSRs are available:`

* counters: `mhpmcounter*[h]` (3..31, depending on configuration)	* counters: `mhpmcounter*[h]` (3..31, depending on `HPM_NUM_CNTS`)
* event configuration: `mhpmevent*` (3..31, depending on configuration)	* event configuration: `mhpmevent*` (3..31, depending on `HPM_NUM_CNTS`)

`[IMPORTANT]`	`[IMPORTANT]`
The HPM counter CSRs can only be accessed in machine-mode. Hence, the according `mcounteren` CSR bits	The HPM counter CSRs can only be accessed in machine-mode. Hence, the according `mcounteren` CSR bits
`are always zero and read-only.`	`are always zero and read-only. Any access from less-privileged modes will raise an illegal instruction`
	`exception.`

	`[TIP]`
Auto-increment of the HPMs can be individually deactivated via the `mcountinhibit` CSR.	Auto-increment of the HPMs can be individually deactivated via the `mcountinhibit` CSR.

If `HPM_NUM_CNTS` is lower than the maximum value (=29) the remaining HPM CSRs are not implemented and the	`[TIP]`
according `mcountinhibit` CSR bits are hardwired to zero.	`For a list of all HPM-related CSRs and all provided event configurations`
`However, accessing their associated CSRs will not raise an illegal instruction exception (if in machine mode).`	`see section <<_hardware_performance_monitors_hpm>>.`
`The according CSRs are read-only and will always return 0.`

`[NOTE]`
`For a list of all allocated HPM-related CSRs and all provided event configurations see section <<_hardware_performance_monitors_hpm>>.`



`// ####################################################################################################################`	`// ####################################################################################################################`
`:sectnums:`	`:sectnums:`
Line 733...	Line 751...
\| Basic bit-manip - arith \| `Zbb` \| `max` `maxu` `min` `minu` \| 3	\| Basic bit-manip - arith \| `Zbb` \| `max` `maxu` `min` `minu` \| 3
\| Basic bit-manip - misc \| `Zbb` \| `sext.b` `sext.h` `zext.h` `orc.b` `rev8` \| 3	\| Basic bit-manip - misc \| `Zbb` \| `sext.b` `sext.h` `zext.h` `orc.b` `rev8` \| 3
`\|=======================`	`\|=======================`

`[NOTE]`	`[NOTE]`
`The presented values of the floating-point execution cycles are average values – obtained from`	`The presented values of the floating-point execution cycles are average values - obtained from`
`4096 instruction executions using pseudo-random input values. The execution time for emulating the`	`4096 instruction executions using pseudo-random input values. The execution time for emulating the`
`instructions (using pure-software libraries) is ~17..140 times higher.`	`instructions (using pure-software libraries) is ~17..140 times higher.`



Line 764...	Line 782...

`* Due to the acknowledged memory accesses the CPU is _always_ sync with the memory system`	`* Due to the acknowledged memory accesses the CPU is _always_ sync with the memory system`
`(i.e. there is no speculative execution / no out-of-order states).`	`(i.e. there is no speculative execution / no out-of-order states).`
`* The CPU supports _all_ RISC-V bus exceptions including access exceptions that are triggered if an`	`* The CPU supports _all_ RISC-V bus exceptions including access exceptions that are triggered if an`
`accessed address does not respond or encounters an internal error during access.`	`accessed address does not respond or encounters an internal error during access.`
`* The CPU raises an illegal instruction trap for _all_ unimplemented/malformed/illegal instructions.`	`* The RISC-V specs. state that executing a malformed instruction results in unpredictable behavior. As an additional security feature,`
	`the NEORV32 CPU ensures that _all_ unimplemented/malformed/illegal instructions _do raise an illegal instruction trap_ and`
	`_do not commit any operation_ (like writing registers or triggering memory operations).`
`* To be continued...`	`* To be continued...`



`// ####################################################################################################################`	`// ####################################################################################################################`
Line 789...	Line 809...
`The traps are prioritized. If several _exceptions_ occur at once only the one with highest priority is triggered`	`The traps are prioritized. If several _exceptions_ occur at once only the one with highest priority is triggered`
`while all remaining exceptions are ignored. If several _interrupts_ trigger at once, the one with highest priority`	`while all remaining exceptions are ignored. If several _interrupts_ trigger at once, the one with highest priority`
`is serviced first while the remaining ones stay _pending_. After completing the interrupt handler the interrupt with`	`is serviced first while the remaining ones stay _pending_. After completing the interrupt handler the interrupt with`
`the second highest priority will get serviced and so on until no further interrupts are pending.`	`the second highest priority will get serviced and so on until no further interrupts are pending.`

`.RISC-V interrupts`	`.Interrupt Signal Requirements`
`[IMPORTANT]`	`[IMPORTANT]`
`All RISC-V defined machine level interrupt request signals are high-active. A request has to stay at high-level until`	`All interrupt request signals (including FIRQs) are high-active. A request has to stay at high-level (=asserted)`
`it is acknowledged by the CPU (for example by writing to a specific memory-mapped register).`	`until it is explicitly acknowledged by the CPU software (for example by writing to a specific memory-mapped register).`

`.Instruction Atomicity`	`.Instruction Atomicity`
`[NOTE]`	`[NOTE]`
`All instructions execute as atomic operations – interrupts can only trigger between two instructions.`	`All instructions execute as atomic operations - interrupts can only trigger between two instructions.`
`So if there is a permanent interrupt request, exactly one instruction from the interrupt program will be executed before`	`So if there is a permanent interrupt request, exactly one instruction from the interrupt program will be executed before`
`a new interrupt handler can start.`	`a new interrupt handler can start.`


`:sectnums:`	`:sectnums:`
Line 815...	Line 835...
`:sectnums:`	`:sectnums:`
`==== Custom Fast Interrupt Request Lines`	`==== Custom Fast Interrupt Request Lines`

As a custom extension, the NEORV32 CPU features 16 fast interrupt request (FIRQ) lines via the `firq_i` CPU top	As a custom extension, the NEORV32 CPU features 16 fast interrupt request (FIRQ) lines via the `firq_i` CPU top
entity signals. These interrupts have custom configuration and status flags in the `mie` and `mip` CSRs and also	entity signals. These interrupts have custom configuration and status flags in the `mie` and `mip` CSRs and also
provide custom trap codes in `mcause`. These FIRQs are reserved for processor-internal usage only.	provide custom trap codes in `mcause`. These FIRQs are reserved for NEORV32 processor-internal usage only.

`[NOTE]`
`The fast interrupt request lines trigger on a rising-edge.`



`// ####################################################################################################################`	`// ####################################################################################################################`
`:sectnums!:`	`:sectnums!:`
Line 892...	Line 910...
`===== Address Space`	`===== Address Space`

`The CPU is a 32-bit architecture with separated instruction and data interfaces making it a Harvard`	`The CPU is a 32-bit architecture with separated instruction and data interfaces making it a Harvard`
`Architecture. Each of these interfaces can access an address space of up to 2^32^ bytes (4GB). The memory`	`Architecture. Each of these interfaces can access an address space of up to 2^32^ bytes (4GB). The memory`
`system is based on 32-bit words with a minimal granularity of 1 byte. Please note, that the NEORV32 CPU`	`system is based on 32-bit words with a minimal granularity of 1 byte. Please note, that the NEORV32 CPU`
`does not support unaligned memory accesses _in hardware_ – however, a software-based handling can be`	`does not support unaligned memory accesses _in hardware_ - however, a software-based handling can be`
`implemented as any unaligned memory access will trigger an according exception.`	`implemented as any unaligned memory access will trigger an according exception.`

`:sectnums:`	`:sectnums:`
`===== Interface Signals`	`===== Interface Signals`

Line 19...

** `Zifencei` - instruction stream synchronization

** `Zifencei` - instruction stream synchronization

** `Zmmul` - integer multiplication hardware

** `Zmmul` - integer multiplication hardware

** `PMP` - physical memory protection

** `PMP` - physical memory protection

** `HPM` - hardware performance monitors

** `HPM` - hardware performance monitors

** `DB` - debug mode

** `DB` - debug mode

* Compatible to the RISC-V user specifications and a subset of the RISC-V privileged architecture specifications – passes the official RISC-V Architecture Tests (v2+)

* Compatible to the RISC-V user specifications and a subset of the RISC-V privileged architecture specifications - passes the official RISC-V Architecture Tests (v2+)

* Official RISC-V open-source architecture ID

* Official RISC-V open-source architecture ID

* Standard RISC-V interrupts (_external_, _timer_, _software_) plus 16 _fast_ interrupts and 1 non-maskable interrupt

* Standard RISC-V interrupts (_external_, _timer_, _software_) plus 16 _fast_ interrupts

* Supports most of the traps from the RISC-V specifications (including bus access exceptions) and traps on all unimplemented/illegal/malformed instructions

* Supports most of the traps from the RISC-V specifications (including bus access exceptions) and traps on all unimplemented/illegal/malformed instructions

* Optional physical memory configuration (PMP), compatible to the RISC-V specifications

* Optional physical memory configuration (PMP), compatible to the RISC-V specifications

* Optional hardware performance monitors (HPM) for application benchmarking

* Optional hardware performance monitors (HPM) for application benchmarking

* Separated interfaces for instruction fetch and data access (merged into single bus via a bus switch for

* Separated interfaces for instruction fetch and data access (merged into single bus via a bus switch for

the NEORV32 processor)

the NEORV32 processor)

* little-endian byte order

* little-endian byte order

* Configurable hardware reset

* Configurable hardware reset

* No hardware support of unaligned data/instruction accesses – they will trigger an exception.

* No hardware support of unaligned data/instruction accesses - they will trigger an exception.

[NOTE]

[NOTE]

It is recommended to use the **NEORV32 Processor** as default top instance even if you only want to use the actual

It is recommended to use the **NEORV32 Processor** as default top instance even if you only want to use the actual

CPU. Simply disable all the processor-internal modules via the generics and you will get a "CPU

CPU. Simply disable all the processor-internal modules via the generics and you will get a "CPU

wrapper" that provides a minimal CPU environment and an external bus interface (like AXI4). This

wrapper" that provides a minimal CPU environment and an external bus interface (like AXI4). This

Line 51...

The NEORV32 CPU was designed from scratch based only on the official ISA / privileged architecture

The NEORV32 CPU was designed from scratch based only on the official ISA / privileged architecture

specifications. The following figure shows the simplified architecture of the CPU.

specifications. The following figure shows the simplified architecture of the CPU.

image::neorv32_cpu.png[align=center]

image::neorv32_cpu.png[align=center]

The CPU uses a pipelined architecture with basically two main stages. The first stage (IF – instruction fetch)

The CPU uses a pipelined architecture with basically two main stages. The first stage (IF - instruction fetch)

is responsible for fetching new instruction data from memory via the fetch engine. The instruction data is

is responsible for fetching new instruction data from memory via the fetch engine. The instruction data is

stored to a FIFO – the instruction prefetch buffer. The issue engine takes this data and assembles 32-bit

stored to a FIFO - the instruction prefetch buffer. The issue engine takes this data and assembles 32-bit

instruction words for the next pipeline stage. Compressed instructions – if enabled – are also decompressed

instruction words for the next pipeline stage. Compressed instructions - if enabled - are also decompressed

in this stage. The second stage (EX – execution) is responsible for actually executing the fetched instructions

in this stage. The second stage (EX - execution) is responsible for actually executing the fetched instructions

via the execute engine.

via the execute engine.

These two pipeline stages are based on a multi-cycle processing engine. So the processing of each stage for

These two pipeline stages are based on a multi-cycle processing engine. So the processing of each stage for

certain operations can take several cycles. Since the IF and EX stages are decoupled via the instruction

certain operations can take several cycles. Since the IF and EX stages are decoupled via the instruction

prefetch buffer, both stages can operate in parallel and with overlapping operations. Hence, the optimal CPI

prefetch buffer, both stages can operate in parallel and with overlapping operations. Hence, the optimal CPI

Line 221...

.Hardwired R/W CSRs

.Hardwired R/W CSRs

[IMPORTANT]

[IMPORTANT]

The `misa`, `mip` and `mtval` CSRs in the NEORV32 are _read-only_.

The `misa`, `mip` and `mtval` CSRs in the NEORV32 are _read-only_.

Any write accesses to them (in machine mode) are ignored and will _not_ cause any exceptions or side-effects.

Any write accesses to them (in machine mode) are ignored and will _not_ cause any exceptions or side-effects.

Pending interrupts can only be cleared by acknowledging the interrupt-causing device. However, pending interrupts

can still be ignored by clearing the according `mie` register bits.

.Physical memory protection

.Physical memory protection

[IMPORTANT]

[IMPORTANT]

The physical memory protection (see section <<_machine_physical_memory_protection>>)

The physical memory protection (see section <<_machine_physical_memory_protection>>)

only supports the modes _OFF_ and _NAPOT_ yet and a minimal granularity of 8 bytes per region.

only supports the modes _OFF_ and _NAPOT_ yet and a minimal granularity of 8 bytes per region.

Line 335...

Line 337...

// ####################################################################################################################

// ####################################################################################################################

:sectnums:

:sectnums:

=== Instruction Sets and Extensions

=== Instruction Sets and Extensions

The NEORV32 is an RISC-V `rv32i` architecture that provides several optional RISC-V CPU and ISA

The basic NEORV32 is a RISC-V `rv32i` architecture that provides several _optional_ RISC-V CPU and ISA

(instruction set architecture) extensions. For more information regarding the RISC-V ISA extensions please

(instruction set architecture) extensions. For more information regarding the RISC-V ISA extensions please

see _The RISC-V Instruction Set Manual – Volume I: Unprivileged ISA_ and _The RISC-V Instruction Set Manual

see _The RISC-V Instruction Set Manual - Volume I: Unprivileged ISA_ and _The RISC-V Instruction Set Manual

Volume II: Privileged Architecture_, which are available in the project's `docs/references` folder.

Volume II: Privileged Architecture_, which are available in the project's `docs/references` folder.

[TIP]

[TIP]

The CPU can discover available ISA extensions via the <<_misa>> CSR and the

The CPU can discover available ISA extensions via the <<_misa>> CSR and the

`CPU` <<_system_configuration_information_memory_sysinfo, SYSINFO>> register

`CPU` <<_system_configuration_information_memory_sysinfo, SYSINFO>> register

or by executing an instruction and checking for an _illegal instruction exception_.

or by executing an instruction and checking for an _illegal instruction exception_.
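
A minimal C sketch of the `misa`-based discovery is shown below. It assumes the standard RISC-V encoding where the extension letters `A` to `Z` map to `misa` bits 0 to 25. The function name is hypothetical, and the CSR value is passed in as a plain argument so the example does not depend on any particular CSR access helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Check whether the single-letter extension 'letter' ('A'..'Z') is
 * flagged in the given misa value. Hypothetical helper for illustration. */
static bool misa_has_ext(uint32_t misa, char letter)
{
    assert(letter >= 'A' && letter <= 'Z');
    return ((misa >> (letter - 'A')) & 1u) != 0;
}
```

For instance, a `misa` value of `0x40801100` would indicate the `I`, `M` and `X` extensions, but neither `A` nor `C`. Sub-extensions such as `Zmmul` or `Zifencei` have no `misa` bit and would have to be discovered by other means, for example via the SYSINFO register mentioned above.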

[NOTE]

[NOTE]

Executing an instruction from an extension that is not implemented or not enabled (for example via the according

Executing an instruction from an extension that is not supported yet or that is currently not enabled

top entity generic) will raise an _illegal instruction_ exception.

(via the according top entity generic) will raise an _illegal instruction_ exception.

==== **`A`** - Atomic Memory Access

==== **`A`** - Atomic Memory Access

Atomic memory access instructions (for implementing semaphores and mutexes) are available when the

Atomic memory access instructions allow more sophisticated memory operations like implementing semaphores and mutexes.

`CPU_EXTENSION_RISCV_A` configuration generic is _true_. In this case the following additional instructions

The RISC-V specs. define a specific _atomic_ extension that provides instructions for atomic memory accesses. The `A`

are available:

ISA extension is enabled if the `CPU_EXTENSION_RISCV_A` configuration generic is _true_.

In this case the following additional instructions are available:

* `lr.w`: load-reservate

* `lr.w`: load-reservate

* `sc.w`: store-conditional

* `sc.w`: store-conditional

[NOTE]

[NOTE]

Even though only `lr.w` and `sc.w` instructions are implemented yet, all further atomic operations

Even though only `lr.w` and `sc.w` instructions are implemented yet, all further atomic operations

(load-modify-write instructions) can be emulated using these two instructions. Furthermore, the

(load-modify-write instructions) can be emulated using these two instructions. Furthermore, the

instruction’s ordering flags (`aq` and `rl`) are ignored by the CPU hardware. Using any other (not yet

instruction's ordering flags (`aq` and `rl`) are ignored by the CPU hardware. Using any other (not yet

implemented) AMO (atomic memory operation) will trigger an illegal instruction exception.

implemented) AMO (atomic memory operation) will raise an illegal instruction exception.

The *load-reservate* instruction behaves as a "normal" load-word instruction (`lw`) but will also set a CPU-internal

_data memory access lock_. Executing a *store-conditional* behaves as a "normal" store-word instruction (`sw`) that will

only conduct an actual memory write operation if the lock is still intact. Additionally, the store-conditional instruction

will also return the lock state (returns zero if the lock is still intact or non-zero if the lock has been broken).

After the execution of the `sc` instruction, the lock is automatically removed.

The lock is broken if at least one of the following conditions occurs:

. executing any data memory access instruction other than `lr.w`

. raising _any_ trap (for example an interrupt or a memory access exception)
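
The lock behavior described above can be mimicked by a small host-side model in C. This is only a sketch under stated assumptions: the structure, the function names and the tiny word-indexed memory are hypothetical and do not reflect the actual hardware implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MEM_WORDS 16u

typedef struct {
    uint32_t mem[MEM_WORDS]; /* tiny word-indexed data memory */
    bool     lock;           /* CPU-internal data memory access lock */
} cpu_model_t;

/* lr.w: behaves like a normal load and additionally sets the lock */
static uint32_t lr_w(cpu_model_t *c, uint32_t idx)
{
    assert(idx < MEM_WORDS);
    c->lock = true;
    return c->mem[idx];
}

/* any other data memory access (here: a plain store) breaks the lock */
static void sw(cpu_model_t *c, uint32_t idx, uint32_t val)
{
    assert(idx < MEM_WORDS);
    c->lock = false;
    c->mem[idx] = val;
}

/* any trap (interrupt or exception) breaks the lock */
static void trap(cpu_model_t *c) { c->lock = false; }

/* sc.w: writes only if the lock is still intact. Returns zero on
 * success and non-zero otherwise; the lock is removed in both cases. */
static uint32_t sc_w(cpu_model_t *c, uint32_t idx, uint32_t val)
{
    assert(idx < MEM_WORDS);
    uint32_t status = c->lock ? 0u : 1u;
    if (c->lock)
        c->mem[idx] = val;
    c->lock = false;
    return status;
}
```

A typical usage pattern would be a retry loop: load the old value with `lr_w`, compute the new value, and repeat the whole sequence while `sc_w` returns non-zero.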

[NOTE]

[NOTE]

The atomic instructions have special requirements for memory system / bus interconnect. More

The atomic instructions have special requirements for memory system / bus interconnect. More

information can be found in sections <<_bus_interface>> and <<_processor_external_memory_interface_wishbone_axi4_lite>>, respectively.

information can be found in sections <<_bus_interface>> and <<_processor_external_memory_interface_wishbone_axi4_lite>>, respectively.

==== **`C`** - Compressed Instructions

==== **`C`** - Compressed Instructions

Compressed 16-bit instructions are available when the `CPU_EXTENSION_RISCV_C` configuration generic is

The _compressed_ ISA extension provides 16-bit encodings of commonly used instructions to reduce code space size.

_true_. In this case the following instructions are available:

The `C` extension is available when the `CPU_EXTENSION_RISCV_C` configuration generic is _true_.

In this case the following instructions are available:

* `c.addi4spn`, `c.lw`, `c.sw`, `c.nop`, `c.addi`, `c.jal`, `c.li`, `c.addi16sp`, `c.lui`, `c.srli`, `c.srai` `c.andi`, `c.sub`,

* `c.addi4spn`, `c.lw`, `c.sw`, `c.nop`, `c.addi`, `c.jal`, `c.li`, `c.addi16sp`, `c.lui`, `c.srli`, `c.srai` `c.andi`, `c.sub`,

`c.xor`, `c.or`, `c.and`, `c.j`, `c.beqz`, `c.bnez`, `c.slli`, `c.lwsp`, `c.jr`, `c.mv`, `c.ebreak`, `c.jalr`, `c.add`, `c.swsp`

`c.xor`, `c.or`, `c.and`, `c.j`, `c.beqz`, `c.bnez`, `c.slli`, `c.lwsp`, `c.jr`, `c.mv`, `c.ebreak`, `c.jalr`, `c.add`, `c.swsp`

[NOTE]

[NOTE]

When the compressed instructions extension is enabled, branches to an _unaligned_ and _uncompressed_ address require

When the compressed instructions extension is enabled, branches to an _unaligned_ and _uncompressed_ instruction require

an additional instruction fetch to load the required second half-word of that instruction. The performance can be increased

an additional instruction fetch to load the according second half-word of that instruction. The performance can be increased

again by forcing a 32-bit alignment of branch target addresses. By default, this is enforced via the GCC `-falign-functions=4`,

again by forcing a 32-bit alignment of branch target addresses. By default, this is enforced via the GCC `-falign-functions=4`,

`-falign-labels=4`, `-falign-loops=4` and `-falign-jumps=4` compile flags (via the makefile).

`-falign-labels=4`, `-falign-loops=4` and `-falign-jumps=4` compile flags (via the makefile).

==== **`E`** - Embedded CPU

==== **`E`** - Embedded CPU

The embedded CPU extension reduces the size of the general purpose register file from 32 entries to 16 entries to reduce hardware

The embedded CPU extension reduces the size of the general purpose register file from 32 entries to 16 entries to

requirements. This extension is enabled when the `CPU_EXTENSION_RISCV_E` configuration generic is _true_. Accesses to registers beyond

decrease physical hardware requirements (for example block RAM). This extension is enabled when the `CPU_EXTENSION_RISCV_E`

`x15` will raise an _illegal instruction exception_.

configuration generic is _true_. Accesses to registers beyond `x15` will raise an _illegal instruction exception_.

This extension does not add any additional instructions or features.

[IMPORTANT]

[IMPORTANT]

Due to the reduced register file size an alternate toolchain ABI (**`ilp32e`**) is required.

Due to the reduced register file size an alternate toolchain ABI (**`ilp32e`**) is required.

==== **`I`** - Base Integer ISA

==== **`I`** - Base Integer ISA

The CPU always supports the complete `rv32i` base integer instruction set. This base set is always enabled

The CPU always supports the complete `rv32i` base integer instruction set. This base set is always enabled

regardless of the setting of the remaining extensions. The base instruction set includes the following

regardless of the setting of the remaining extensions. The base instruction set includes the following

instructions:

instructions:

* immediates: `lui`, `auipc`

* immediate: `lui`, `auipc`

* jumps: `jal`, `jalr`

* jumps: `jal`, `jalr`

* branches: `beq`, `bne`, `blt`, `bge`, `bltu`, `bgeu`

* branches: `beq`, `bne`, `blt`, `bge`, `bltu`, `bgeu`

* memory: `lb`, `lh`, `lw`, `lbu`, `lhu`, `sb`, `sh`, `sw`

* memory: `lb`, `lh`, `lw`, `lbu`, `lhu`, `sb`, `sh`, `sw`

* alu: `addi`, `slti`, `sltiu`, `xori`, `ori`, `andi`, `slli`, `srli`, `srai`, `add`, `sub`, `sll`, `slt`, `sltu`, `xor`, `srl`, `sra`, `or`, `and`

* alu: `addi`, `slti`, `sltiu`, `xori`, `ori`, `andi`, `slli`, `srli`, `srai`, `add`, `sub`, `sll`, `slt`, `sltu`, `xor`, `srl`, `sra`, `or`, `and`

* environment: `ecall`, `ebreak`, `fence`

* environment: `ecall`, `ebreak`, `fence`

Line 421...

Line 437...

executed. Any flags within the `fence` instruction word are ignored by the hardware.

executed. Any flags within the `fence` instruction word are ignored by the hardware.

==== **`M`** - Integer Multiplication and Division

==== **`M`** - Integer Multiplication and Division

Hardware-accelerated integer multiplication and division instructions are available when the

Hardware-accelerated integer multiplication and division operations are available when the

`CPU_EXTENSION_RISCV_M` configuration generic is _true_. In this case the following instructions are

`CPU_EXTENSION_RISCV_M` configuration generic is _true_. In this case the following instructions are

available:

available:

* multiplication: `mul`, `mulh`, `mulhsu`, `mulhu`

* multiplication: `mul`, `mulh`, `mulhsu`, `mulhu`

* division: `div`, `divu`, `rem`, `remu`

* division: `div`, `divu`, `rem`, `remu`

Line 438...

Line 454...

==== **`Zmmul`** - Integer Multiplication

==== **`Zmmul`** - Integer Multiplication

This is a _sub-extension_ of the `M` ISA extension. It implements the multiplication-only operations

This is a _sub-extension_ of the `M` ISA extension. It implements the multiplication-only operations

of the `M` extension and is intended for small scale applications, that require hardware-based

of the `M` extension and is intended for size-constrained setups that require hardware-based

integer multiplications but not hardware-based divisions, which will be computed entirely in software.

integer multiplications but not hardware-based divisions, which will be computed entirely in software.

This extension requires only ~50% of the hardware utilization of the `M` extension.

This extension requires only ~50% of the hardware utilization of the "full" `M` extension.

* multiplication: `mul`, `mulh`, `mulhsu`, `mulhu`

* multiplication: `mul`, `mulh`, `mulhsu`, `mulhu`

If `Zmmul` is enabled, executing any division instruction from the `M` ISA extension (`div`, `divu`, `rem`, `remu`)

If `Zmmul` is enabled, executing any division instruction from the `M` ISA extension (`div`, `divu`, `rem`, `remu`)

will raise an _illegal instruction exception_.

will raise an _illegal instruction exception_.
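
To give an idea of what "computed entirely in software" can look like, the following C sketch performs an unsigned 32-bit division using only shifts, compares and subtractions. It is an illustrative routine with a hypothetical name, not the code that a particular toolchain emits; in practice such routines are normally supplied by the compiler's runtime library. The result for a division by zero follows the RISC-V `divu`/`remu` convention (quotient all ones, remainder equal to the dividend).

```c
#include <stdint.h>

/* Unsigned restoring division: returns n / d and stores n % d in *rem
 * (if rem is not NULL). Uses no hardware divide instruction. */
static uint32_t soft_divu(uint32_t n, uint32_t d, uint32_t *rem)
{
    uint32_t q = 0, r = 0;
    if (d == 0) {                 /* RISC-V convention for divide by zero */
        if (rem) *rem = n;
        return UINT32_MAX;
    }
    for (int i = 31; i >= 0; i--) {
        /* shift in the next dividend bit, MSB first; after k steps r < 2^k,
         * so the shift cannot overflow 32 bits */
        r = (r << 1) | ((n >> i) & 1u);
        if (r >= d) {             /* divisor fits: subtract, set quotient bit */
            r -= d;
            q |= (1u << i);
        }
    }
    if (rem) *rem = r;
    return q;
}
```

The loop always runs 32 iterations, which hints at why a software division is considerably slower than the multiplication instructions that `Zmmul` keeps in hardware.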

Line 452...

Line 468...

Note that `M` and `Zmmul` extensions _cannot_ be enabled at the same time.

Note that `M` and `Zmmul` extensions _cannot_ be enabled at the same time.

[TIP]

[TIP]

If your RISC-V GCC toolchain does not (yet) support the `_Zmmul` ISA extensions, it can be "emulated"

If your RISC-V GCC toolchain does not (yet) support the `_Zmmul` ISA extensions, it can be "emulated"

using a `rv32im` machine architecture and setting the `-mno-div` compiler flag

using a `rv32im` machine architecture and setting the `-mno-div` compiler flag

(example `$ make MARCH=-march=rv32im USER_FLAGS+=-mno-div clean_all exe`).

(example `$ make MARCH=rv32im USER_FLAGS+=-mno-div clean_all exe`).

==== **`U`** - Less-Privileged User Mode

==== **`U`** - Less-Privileged User Mode

Adds the less-privileged _user mode_ if the `CPU_EXTENSION_RISCV_U` configuration generic is _true_. For

In addition to the basic (and highest-privileged) machine-mode, the _user-mode_ ISA extension adds a second less-privileged

instance, user-level code cannot access machine-mode CSRs. Furthermore, access to the address space (like

operation mode. It is implemented if the `CPU_EXTENSION_RISCV_U` configuration generic is _true_.

peripheral/IO devices) can be limited via the physical memory protection (_PMP_) unit for code running in user mode.

Code executed in user-mode cannot access machine-mode CSRs. Furthermore, user-mode access to the address space (like

peripheral/IO devices) can be constrained via the physical memory protection (_PMP_).

Any kind of privilege rights violation will raise an exception to allow full virtualization.

==== **`X`** - NEORV32-Specific (Custom) Extensions

==== **`X`** - NEORV32-Specific (Custom) Extensions

The NEORV32-specific extensions are always enabled and are indicated by the set `X` bit in the `misa` CSR.

The NEORV32-specific extensions are always enabled and are indicated by the set `X` bit in the `misa` CSR.

Line 475...

Line 493...

* All undefined/unimplemented/malformed/illegal instructions do raise an illegal instruction exception (see <<_full_virtualization>>).

* All undefined/unimplemented/malformed/illegal instructions do raise an illegal instruction exception (see <<_full_virtualization>>).

==== **`Zfinx`** Single-Precision Floating-Point Operations

==== **`Zfinx`** Single-Precision Floating-Point Operations

[WARNING]

The `Zfinx` floating-point extension is an _alternative_ of the standard `F` floating-point ISA extension.

The NEORV32 `Zfinx` extension is specification-compliant and operational but still _experimental_.

The `Zfinx` extension also uses the integer register file `x` to store and operate on floating-point data

instead of a dedicated floating-point register file (hence, `F-in-x`). Thus, the `Zfinx` extension requires

The `Zfinx` floating-point extension is an alternative to the `F` floating-point extension that also uses the

less hardware resources and features faster context changes. This also implies that there are NO dedicated `f`

integer register file `x` to store and operate on floating-point data (hence, `F-in-x`). Since no dedicated floating-point `f`

register file-related load/store or move instructions.

register file exists, the `Zfinx` extension requires less hardware resources and features faster context changes.

The official RISC-V specifications can be found here: https://github.com/riscv/riscv-zfinx

This also implies that there are NO dedicated `f` register file related load/store or move instructions. The

official RISC-V specifications can be found here: https://github.com/riscv/riscv-zfinx

[TIP]

The NEORV32 floating-point unit used by the `Zfinx` extension is compatible to the _IEEE-754_ specifications.

The NEORV32 floating-point unit used by the `Zfinx` extension is compatible to the _IEEE-754_ specifications.

The `Zfinx` extension only supports single-precision (`.s` suffix) yet (so it is a direct alternative to the `F`

The `Zfinx` extension only supports single-precision (`.s` instruction suffix), so it is a direct alternative

extension). The `Zfinx` extension is implemented when the `CPU_EXTENSION_RISCV_Zfinx` configuration

to the `F` extension. The `Zfinx` extension is implemented when the `CPU_EXTENSION_RISCV_Zfinx` configuration

generic is _true_. In this case the following instructions and CSRs are available:

generic is _true_. In this case the following instructions and CSRs are available:

* conversion: `fcvt.s.w`, `fcvt.s.wu`, `fcvt.w.s`, `fcvt.wu.s`

* conversion: `fcvt.s.w`, `fcvt.s.wu`, `fcvt.w.s`, `fcvt.wu.s`

* comparison: `fmin.s`, `fmax.s`, `feq.s`, `flt.s`, `fle.s`

* comparison: `fmin.s`, `fmax.s`, `feq.s`, `flt.s`, `fle.s`

* computational: `fadd.s`, `fsub.s`, `fmul.s`

* computational: `fadd.s`, `fsub.s`, `fmul.s`

Line 503...

Line 520...

[WARNING]

[WARNING]

Fused multiply-add instructions `f[n]m[add/sub].s` are not supported!

Fused multiply-add instructions `f[n]m[add/sub].s` are not supported!

Division `fdiv.s` and square root `fsqrt.s` instructions are not supported yet!

Division `fdiv.s` and square root `fsqrt.s` instructions are not supported yet!

[WARNING]

[WARNING]

Subnormal numbers (also "de-normalized" numbers) are not supported by the NEORV32 FPU.

Subnormal numbers ("de-normalized" numbers) are not supported by the NEORV32 FPU.

Subnormal numbers (exponent = 0) are _flushed to zero_ (setting them to +/- 0) before entering the

Subnormal numbers (exponent = 0) are _flushed to zero_ setting them to +/- 0 before entering the

FPU's processing core. If a computational instruction (like `fmul.s`) generates a subnormal result, the

FPU's processing core. If a computational instruction (like `fmul.s`) generates a subnormal result, the

result is also flushed to zero during normalization.

result is also flushed to zero during normalization.
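
The flush-to-zero rule can be illustrated on the raw IEEE-754 single-precision bit pattern. This is only a software sketch of the described behavior, not the FPU's actual logic; `flush_subnormal` is a hypothetical name, and keeping the sign bit is an assumption derived from the "+/- 0" wording above.

```c
#include <stdint.h>

// Hypothetical model of the flush-to-zero rule: any value with a biased
// exponent of 0 (subnormal or zero) becomes a zero, here with the sign kept
// (assumption). All other bit patterns pass through unchanged.
uint32_t flush_subnormal(uint32_t bits) {
  const uint32_t exponent = (bits >> 23) & 0xFFu; // 8-bit biased exponent field
  if (exponent == 0u) {
    return bits & 0x80000000u; // keep only the sign bit -> +0 or -0
  }
  return bits;
}
```

For example, the smallest positive subnormal `0x00000001` becomes `0x00000000`, while the smallest normal number `0x00800000` is left untouched.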

[WARNING]

[WARNING]

The `Zfinx` extension is not yet officially ratified, but is expected to stay unchanged. There is no

The `Zfinx` extension is not yet officially ratified, but is expected to stay unchanged. There is no

Line 517...

Line 534...

code (see `sw/example/floating_point_test`).

code (see `sw/example/floating_point_test`).

==== **`Zbb`** Basic Bit-Manipulation Operations

==== **`Zbb`** Basic Bit-Manipulation Operations

[WARNING]

The NEORV32 `Zbb` extension is specification-compliant and operational but still _experimental_.

The `Zbb` extension implements the _basic_ sub-set of the RISC-V bit-manipulation extensions `B`.

The `Zbb` extension implements the _basic_ sub-set of the RISC-V bit-manipulation extensions `B`.

The official RISC-V specifications can be found here: https://github.com/riscv/riscv-bitmanip

The official RISC-V specifications can be found here: https://github.com/riscv/riscv-bitmanip

The `Zbb` extension is implemented when the `CPU_EXTENSION_RISCV_Zbb` configuration

The `Zbb` extension is implemented when the `CPU_EXTENSION_RISCV_Zbb` configuration

generic is _true_. In this case the following instructions are available:

generic is _true_. In this case the following instructions are available:

Line 539...

Line 553...

By default, the bit-manipulation unit uses an _iterative_ approach to compute shift-related operations

By default, the bit-manipulation unit uses an _iterative_ approach to compute shift-related operations

like `clz` and `rol`. To increase performance (at the cost of additional hardware resources) the

like `clz` and `rol`. To increase performance (at the cost of additional hardware resources) the

<<_fast_shift_en>> generic can be enabled to implement full-parallel logic (like barrel shifters) for all

<<_fast_shift_en>> generic can be enabled to implement full-parallel logic (like barrel shifters) for all

shift-related `Zbb` instructions.

shift-related `Zbb` instructions.

[IMPORTANT]

[WARNING]

The `Zbb` extension is frozen but not officially ratified yet. There is no

The `Zbb` extension is frozen but not officially ratified yet. There is no

software support for this extension in the upstream GCC RISC-V port yet. However, an

software support for this extension in the upstream GCC RISC-V port yet. However, an

intrinsic library is provided to utilize the provided `Zbb` extension from C-language

intrinsic library is provided to utilize the provided `Zbb` extension from C-language

code (see `sw/example/bitmanip_test`).

code (see `sw/example/bitmanip_test`).

==== **`Zicsr`** Control and Status Register Access / Privileged Architecture

==== **`Zicsr`** Control and Status Register Access / Privileged Architecture

The CSR access instructions as well as the exception and interrupt system (= the privileged architecture) is implemented when the

The CSR access instructions as well as the exception and interrupt system (= the privileged architecture)

`CPU_EXTENSION_RISCV_Zicsr` configuration generic is _true_. In this case the following instructions are

is implemented when the `CPU_EXTENSION_RISCV_Zicsr` configuration generic is _true_.

available:

In this case the following instructions are available:

* CSR access: `csrrw`, `csrrs`, `csrrc`, `csrrwi`, `csrrsi`, `csrrci`

* CSR access: `csrrw`, `csrrs`, `csrrc`, `csrrwi`, `csrrsi`, `csrrci`

* environment: `mret`, `wfi`

* environment: `mret`, `wfi`

[WARNING]

[WARNING]

If the `Zicsr` extension is disabled the CPU does not provide any kind of interrupt or exception

If the `Zicsr` extension is disabled the CPU does not provide any _privileged architecture_ features at all!

support at all. In order to provide the full spectrum of functions and to allow a secure executions

In order to provide the full set of functions and to allow a secure execution

environment, the `Zicsr` extension should always be enabled.

environment the `Zicsr` extension should always be enabled.

[NOTE]

[NOTE]

The "wait for interrupt instruction" `wfi` works like a sleep command. When executed, the CPU is

The "wait for interrupt instruction" `wfi` works like a sleep command. When executed, the CPU is

halted until a valid interrupt request occurs. To wake up again, the according interrupt source has to

halted until a valid interrupt request occurs. To wake up again, the according interrupt source has to

be enabled via the `mie` CSR and the global interrupt enable flag in `mstatus` has to be set.

be enabled via the `mie` CSR and the global interrupt enable flag in `mstatus` has to be set.

[IMPORTANT]

[NOTE]

The `wfi` instruction will raise an illegal instruction exception when executed outside of machine-mode

The `wfi` instruction may also be executed in user-mode without causing an exception as <<_mstatus>> bit

and <<_mstatus>> bit `TW` (timeout wait) is set.

`TW` (timeout wait) is hardwired to zero.

==== **`Zifencei`** Instruction Stream Synchronization

==== **`Zifencei`** Instruction Stream Synchronization

The `Zifencei` CPU extension is implemented if the `CPU_EXTENSION_RISCV_Zifencei` configuration

The `Zifencei` CPU extension is implemented if the `CPU_EXTENSION_RISCV_Zifencei` configuration

generic is _true_. It allows manual synchronization of the instruction stream via the following instruction:

generic is _true_. It allows manual synchronization of the instruction stream via the following instruction:

* `fence.i`

* `fence.i`

[NOTE]

The `fence.i` instruction resets the CPU's internal instruction fetch engine and flushes the prefetch buffer.

The `fence.i` instruction resets the CPU's internal instruction fetch engine and flushes the prefetch buffer.

This allows a clean re-fetch of modified instructions from memory. Also, the top's `i_bus_fencei_o` signal is set

This allows a clean re-fetch of modified instructions from memory. Also, the top's `i_bus_fencei_o` signal is set

high for one cycle to inform the memory system (like the i-cache) to perform a flush/reload.

high for one cycle to inform the memory system (like the i-cache) to perform a flush/reload.

Any additional flags within the `fence.i` instruction word are ignored by the hardware.

Any additional flags within the `fence.i` instruction word are ignored by the hardware.

==== **`PMP`** Physical Memory Protection

==== **`PMP`** Physical Memory Protection

The NEORV32 physical memory protection (PMP) is compatible to the PMP specified by the RISC-V specs.

The NEORV32 physical memory protection (PMP) is compatible to the RISC-V PMP specifications. It can be used

The CPU PMP only supports _NAPOT_ mode yet and a minimal region size (granularity) of 8 bytes. Larger minimal sizes can be configured

to constrain memory read/write/execute rights for each available privilege level.

via the top `PMP_MIN_GRANULARITY` generic to reduce hardware requirements. The physical memory protection system is implemented when the

`PMP_NUM_REGIONS` configuration generic is >0. In this case the following additional CSRs are available:

The NEORV32 PMP only supports _NAPOT_ mode yet and a minimal region size (granularity) of 8 bytes. Larger

minimal sizes can be configured via the top `PMP_MIN_GRANULARITY` generic to reduce hardware requirements.

The physical memory protection system is implemented when the `PMP_NUM_REGIONS` configuration generic is >0.

In this case the following additional CSRs are available:

* `pmpcfg*` (0..15, depending on configuration): PMP configuration registers

* `pmpcfg*` (0..15, depending on configuration): PMP configuration registers

* `pmpaddr*` (0..63, depending on configuration): PMP address registers

* `pmpaddr*` (0..63, depending on configuration): PMP address registers

[TIP]

See section <<_machine_physical_memory_protection>> for more information regarding the PMP CSRs.

See section <<_machine_physical_memory_protection>> for more information regarding the PMP CSRs.
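
Since only _NAPOT_ (naturally aligned power-of-two) regions are supported, a `pmpaddr*` value has to encode both the base address and the size of a region. The sketch below follows the NAPOT encoding of the RISC-V privileged specification (address bits [33:2] in the register, region size given by the number of trailing one bits). `pmp_napot_addr` is a hypothetical helper, not a NEORV32 library function, and it does not check the size against the configured `PMP_MIN_GRANULARITY`.

```c
#include <stdint.h>

// Hypothetical helper: compute the pmpaddr value for a NAPOT region.
// 'size' must be a power of two between 8 bytes and 4 GB, and 'base' must be
// aligned to 'size'. Returns 0 on success, -1 on an invalid base/size pair.
int pmp_napot_addr(uint32_t base, uint64_t size, uint32_t *pmpaddr) {
  if (size < 8u || size > ((uint64_t)1 << 32) || (size & (size - 1u)) != 0u) {
    return -1; // not a power of two in the supported range
  }
  if (((uint64_t)base & (size - 1u)) != 0u) {
    return -1; // base is not naturally aligned to the region size
  }
  // upper bits: base address >> 2; lower bits: (size / 8) - 1 trailing ones
  *pmpaddr = (base >> 2) | (uint32_t)((size >> 3) - 1u);
  return 0;
}
```

For example, a 64 kB region at `0x00010000` yields the value `0x00005FFF`.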

**Configuration**

The actual number of regions and the minimal region granularity are defined via the top entity

The actual number of regions and the minimal region granularity are defined via the top entity

`PMP_MIN_GRANULARITY` and `PMP_NUM_REGIONS` generics. `PMP_MIN_GRANULARITY` defines the minimal available

`PMP_MIN_GRANULARITY` and `PMP_NUM_REGIONS` generics. `PMP_MIN_GRANULARITY` defines the minimal available

granularity of each region in bytes. `PMP_NUM_REGIONS` defines the total number of implemented regions and thus, the

granularity of each region in bytes. `PMP_NUM_REGIONS` defines the total number of implemented regions and thus, the

number of available `pmpcfg*` and `pmpaddr*` CSRs.

number of available `pmpcfg*` and `pmpaddr*` CSRs.

Line 618...

Line 633...

constant pmp_num_regions_critical_c : natural := 8;

constant pmp_num_regions_critical_c : natural := 8;

----

----

**Operation**

**Operation**

Any memory access address (from the CPU's instruction fetch or data access interface) is tested if it is accessing any

Any CPU memory access address (from the instruction fetch or data access interface) is tested if it is accessing _any_

of the specified (configured via `pmpaddr*` and enabled via `pmpcfg*`) PMP regions. If an

of the specified PMP regions (configured via `pmpaddr*` and enabled via `pmpcfg*`). If an

address accesses one of these regions, the configured access rights (attributes in `pmpcfg*`) are checked:

address matches one of these regions, the configured access rights (attributes in `pmpcfg*`) are enforced:

* a write access (store) will fail if no write attribute is set

* a write access (store) will fail if no write attribute is set

* a read access (load) will fail if no read attribute is set

* a read access (load) will fail if no read attribute is set

* an instruction fetch access will fail if no execute attribute is set

* an instruction fetch access will fail if no execute attribute is set

If an access to a protected region does not have the according access rights (attributes) it will raise the according

If an access to a protected region does not have the according access rights it will raise the according

_instruction/load/store access fault exception_.

instruction/load/store _access fault_ exception.

By default, all PMP checks are enforced for user-level programs only. If you wish to enforce the physical

By default, all PMP checks are enforced for user-level programs only. If you wish to enforce the physical

memory protection also for machine-level programs you need to active the _locked bit_ in the according

memory protection also for machine-level programs you need to set the _locked bit_ in the according

`pmpcfg*` configuration.

`pmpcfg*` configuration CSR.
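
The rights check described above can be modeled in a few lines of C. This is a behavioral sketch only and assumes that the address has already been matched to a region; the bit positions of the R/W/X/L attributes follow the RISC-V privileged specification, while the function and type names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

// pmpcfg attribute bits as defined by the RISC-V privileged specification
#define PMPCFG_R 0x01u // read permission
#define PMPCFG_W 0x02u // write permission
#define PMPCFG_X 0x04u // execute permission
#define PMPCFG_L 0x80u // locked: also enforce the entry in machine mode

typedef enum { ACCESS_LOAD, ACCESS_STORE, ACCESS_FETCH } access_t;

// Hypothetical model: returns true if an access that hits a region with the
// given configuration byte is permitted, false if it would raise the
// corresponding access fault exception.
bool pmp_access_allowed(uint8_t cfg, access_t type, bool machine_mode) {
  if (machine_mode && (cfg & PMPCFG_L) == 0u) {
    return true; // machine-mode accesses are only checked for locked entries
  }
  switch (type) {
    case ACCESS_LOAD:  return (cfg & PMPCFG_R) != 0u;
    case ACCESS_STORE: return (cfg & PMPCFG_W) != 0u;
    case ACCESS_FETCH: return (cfg & PMPCFG_X) != 0u;
  }
  return false; // unknown access type: deny
}
```

With a read-only region (`PMPCFG_R`), a user-mode store is rejected, while the same store from machine mode passes unless the entry is locked.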

[IMPORTANT]

[IMPORTANT]

After updating the address configuration registers `pmpaddr*` the system requires up to 33 cycles for

After updating the address configuration registers `pmpaddr*` the system requires up to 33 cycles for

internal (iterative) computations before the configuration becomes valid.

internal (iterative) computations before the configuration becomes valid.

[NOTE]

[NOTE]

For more information regarding RISC-V physical memory protection see the official _The RISC-V

For more information regarding RISC-V physical memory protection see the official _The RISC-V

Instruction Set Manual – Volume II: Privileged Architecture_ specifications.

Instruction Set Manual - Volume II: Privileged Architecture_ specifications.

==== **`HPM`** Hardware Performance Monitors

==== **`HPM`** Hardware Performance Monitors

In addition to the mandatory cycles (`[m]cycle[h]`) and instruction (`[m]instret[h]`) counters the NEORV32 CPU provides

In addition to the mandatory cycle (`[m]cycle[h]`) and instruction (`[m]instret[h]`) counters the NEORV32 CPU provides

up to 29 hardware performance monitors (HPM 3..31), which can be used to benchmark applications. Each HPM consists of an

up to 29 hardware performance monitors (HPM 3..31), which can be used to benchmark applications. Each HPM consists of an

N-bit wide counter (split in a high-word 32-bit CSR and a low-word 32-bit CSR), where N is defined via the top's

N-bit wide counter (split in a high-word 32-bit CSR and a low-word 32-bit CSR), where N is defined via the top's

`HPM_CNT_WIDTH` generic (0..64-bit), and a corresponding event configuration CSR. The event configuration

`HPM_CNT_WIDTH` generic (0..64-bit) and a corresponding event configuration CSR. The event configuration

CSR defines the architectural events that lead to an increment of the associated HPM counter.

CSR defines the architectural events that lead to an increment of the associated HPM counter.

The cycle, time and instructions-retired counters (`[m]cycle[h]`, `time[h]`, `[m]instret[h]`) are

The cycle, time and instructions-retired counters (`[m]cycle[h]`, `time[h]`, `[m]instret[h]`) are

mandatory performance monitors on every RISC-V platform and have fixed increment events. For example,

mandatory performance monitors on every RISC-V platform and have fixed increment events. For example,

the instructions-retired counter increments with each executed instruction. The actual hardware performance

the instructions-retired counter increments with each executed instruction. The actual hardware performance

monitors are optional and can be configured to increment on arbitrary hardware events. The number of

monitors are optional and can be configured to increment on arbitrary hardware events. The number of

available HPM is configured via the top's `HPM_NUM_CNTS` generic at synthesis time. Assigning a zero will exclude

available HPM is configured via the top's `HPM_NUM_CNTS` generic at synthesis time. Assigning a zero will remove

all HPM logic from the design.

all HPM logic from the design.

Depending on the configuration, the following additional CSRs are available:

If `HPM_NUM_CNTS` is lower than the maximum value (=29) the remaining HPM CSRs are not implemented and the

according `mcountinhibit` CSR bits are hardwired to zero.

However, accessing their associated CSRs will not raise an illegal instruction exception (if in machine mode).

The according CSRs are read-only and will always return 0.

Depending on the configuration the following additional CSRs are available:

* counters: `mhpmcounter*[h]` (3..31, depending on configuration)

* counters: `mhpmcounter*[h]` (3..31, depending on `HPM_NUM_CNTS`)

* event configuration: `mhpmevent*` (3..31, depending on configuration)

* event configuration: `mhpmevent*` (3..31, depending on `HPM_NUM_CNTS`)

[IMPORTANT]

[IMPORTANT]

The HPM counter CSR can only be accessed in machine-mode. Hence, the according `mcounteren` CSR bits

The HPM counter CSR can only be accessed in machine-mode. Hence, the according `mcounteren` CSR bits

are always zero and read-only.

are always zero and read-only. Any access from less-privileged modes will raise an illegal instruction

exception.

[TIP]

Auto-increment of the HPMs can be individually deactivated via the `mcountinhibit` CSR.

Auto-increment of the HPMs can be individually deactivated via the `mcountinhibit` CSR.

If `HPM_NUM_CNTS` is lower than the maximum value (=29) the remaining HPM CSRs are not implemented and the

[TIP]

according `mcountinhibit` CSR bits are hardwired to zero.

For a list of all HPM-related CSRs and all provided event configurations

However, accessing their associated CSRs will not raise an illegal instruction exception (if in machine mode).

see section <<_hardware_performance_monitors_hpm>>.

The according CSRs are read-only and will always return 0.

[NOTE]

For a list of all allocated HPM-related CSRs and all provided event configurations see section <<_hardware_performance_monitors_hpm>>.

// ####################################################################################################################

// ####################################################################################################################

:sectnums:

:sectnums:

Line 733...

Line 751...

| Basic bit-manip - arith | `Zbb` | `max` `maxu` `min` `minu` | 3

| Basic bit-manip - arith | `Zbb` | `max` `maxu` `min` `minu` | 3

| Basic bit-manip - misc  | `Zbb` | `sext.b` `sext.h` `zext.h` `orc.b` `rev8` | 3

| Basic bit-manip - misc  | `Zbb` | `sext.b` `sext.h` `zext.h` `orc.b` `rev8` | 3

|=======================

|=======================

[NOTE]

[NOTE]

The presented values of the *floating-point execution cycles* are average values – obtained from

The presented values of the *floating-point execution cycles* are average values - obtained from

4096 instruction executions using pseudo-random input values. The execution time for emulating the

4096 instruction executions using pseudo-random input values. The execution time for emulating the

instructions (using pure-software libraries) is ~17..140 times higher.

instructions (using pure-software libraries) is ~17..140 times higher.

Line 764...

Line 782...

* Due to the acknowledged memory accesses the CPU is _always_ sync with the memory system

* Due to the acknowledged memory accesses the CPU is _always_ sync with the memory system

(i.e. there is no speculative execution / no out-of-order states).

(i.e. there is no speculative execution / no out-of-order states).

* The CPU supports _all_ RISC-V bus exceptions including access exceptions that are triggered if an

* The CPU supports _all_ RISC-V bus exceptions including access exceptions that are triggered if an

accessed address does not respond or encounters an internal error during access.

accessed address does not respond or encounters an internal error during access.

* The CPU raises an illegal instruction trap for _all_ unimplemented/malformed/illegal instructions.

* The RISC-V specs. state that executing a malformed instruction results in unpredictable behavior. As an additional security feature,

the NEORV32 CPU ensures that _all_ unimplemented/malformed/illegal instructions _do raise an illegal instruction trap_ and

_do not commit any operation_ (like writing registers or triggering memory operations).

* To be continued...

* To be continued...

// ####################################################################################################################

// ####################################################################################################################

Line 789...

Line 809...

The traps are prioritized. If several _exceptions_ occur at once only the one with highest priority is triggered

The traps are prioritized. If several _exceptions_ occur at once only the one with highest priority is triggered

while all remaining exceptions are ignored. If several _interrupts_ trigger at once, the one with highest priority

while all remaining exceptions are ignored. If several _interrupts_ trigger at once, the one with highest priority

is serviced first while the remaining ones stay _pending_. After completing the interrupt handler the interrupt with

is serviced first while the remaining ones stay _pending_. After completing the interrupt handler the interrupt with

the second highest priority will get serviced and so on until no further interrupts are pending.

the second highest priority will get serviced and so on until no further interrupts are pending.

.RISC-V interrupts

.Interrupt Signal Requirements

[IMPORTANT]

[IMPORTANT]

All RISC-V defined machine level interrupt request signals are high-active. A request has to stay at high-level until

All interrupt request signals (including FIRQs) are **high-active**. A request has to stay at high-level (=asserted)

it is acknowledged by the CPU (for example by writing to a specific memory-mapped register).

until it is explicitly acknowledged by the CPU software (for example by writing to a specific memory-mapped register).

.Instruction Atomicity

.Instruction Atomicity

[NOTE]

[NOTE]

All instructions execute as atomic operations – interrupts can only trigger between two instructions.

All instructions execute as atomic operations - interrupts can only trigger between two instructions.

So if there is a permanent interrupt request, exactly one instruction from the interrupt program will be executed before

So if there is a permanent interrupt request, exactly one instruction from the interrupt program will be executed before

a new interrupt handler can start.

a new interrupt handler can start.

:sectnums:

:sectnums:

Line 815...

Line 835...

:sectnums:

:sectnums:

==== Custom Fast Interrupt Request Lines

==== Custom Fast Interrupt Request Lines

As a custom extension, the NEORV32 CPU features 16 fast interrupt request (FIRQ) lines via the `firq_i` CPU top

As a custom extension, the NEORV32 CPU features 16 fast interrupt request (FIRQ) lines via the `firq_i` CPU top

entity signals. These interrupts have custom configuration and status flags in the `mie` and `mip` CSRs and also

entity signals. These interrupts have custom configuration and status flags in the `mie` and `mip` CSRs and also

provide custom trap codes in `mcause`. These FIRQs are reserved for processor-internal usage only.

provide custom trap codes in `mcause`. These FIRQs are reserved for NEORV32 processor-internal usage only.

[NOTE]

The fast interrupt request lines trigger on a **rising-edge**.

// ####################################################################################################################

// ####################################################################################################################

:sectnums!:

:sectnums!:

Line 892...

Line 910...

===== Address Space

===== Address Space

The CPU is a 32-bit architecture with separated instruction and data interfaces making it a Harvard

The CPU is a 32-bit architecture with separated instruction and data interfaces making it a Harvard

Architecture. Each of these interfaces can access an address space of up to 2^32^ bytes (4GB). The memory

Architecture. Each of these interfaces can access an address space of up to 2^32^ bytes (4GB). The memory

system is based on 32-bit words with a minimal granularity of 1 byte. Please note, that the NEORV32 CPU

system is based on 32-bit words with a minimal granularity of 1 byte. Please note, that the NEORV32 CPU

does not support unaligned memory accesses _in hardware_ – however, a software-based handling can be

does not support unaligned memory accesses _in hardware_ - however, a software-based handling can be

implemented as any unaligned memory access will trigger an according exception.

implemented as any unaligned memory access will trigger an according exception.

:sectnums:

:sectnums:

===== Interface Signals

===== Interface Signals

Subversion Repositories neorv32

[/] [neorv32/] [trunk/] [docs/] [datasheet/] [cpu.adoc] - Diff between revs 64 and 65