OpenCores
URL https://opencores.org/ocsvn/neorv32/neorv32/trunk

Subversion Repositories neorv32

Compare Revisions

  • This comparison shows the changes necessary to convert path
    /neorv32/trunk/docs
    from Rev 64 to Rev 65

Rev 64 → Rev 65

/datasheet/cpu.adoc
21,9 → 21,9
** `PMP` - physical memory protection
** `HPM` - hardware performance monitors
** `DB` - debug mode
* Compatible to the RISC-V user specifications and a subset of the RISC-V privileged architecture specifications – passes the official RISC-V Architecture Tests (v2+)
* Compatible to the RISC-V user specifications and a subset of the RISC-V privileged architecture specifications - passes the official RISC-V Architecture Tests (v2+)
* Official RISC-V open-source architecture ID
* Standard RISC-V interrupts (_external_, _timer_, _software_) plus 16 _fast_ interrupts and 1 non-maskable interrupt
* Standard RISC-V interrupts (_external_, _timer_, _software_) plus 16 _fast_ interrupts
* Supports most of the traps from the RISC-V specifications (including bus access exceptions) and traps on all unimplemented/illegal/malformed instructions
* Optional physical memory configuration (PMP), compatible to the RISC-V specifications
* Optional hardware performance monitors (HPM) for application benchmarking
31,7 → 31,7
the NEORV32 processor)
* little-endian byte order
* Configurable hardware reset
* No hardware support of unaligned data/instruction accesses – they will trigger an exception.
* No hardware support of unaligned data/instruction accesses - they will trigger an exception.
 
[NOTE]
It is recommended to use the **NEORV32 Processor** as default top instance even if you only want to use the actual
53,11 → 53,11
 
image::neorv32_cpu.png[align=center]
 
The CPU uses a pipelined architecture with basically two main stages. The first stage (IF – instruction fetch)
The CPU uses a pipelined architecture with basically two main stages. The first stage (IF - instruction fetch)
is responsible for fetching new instruction data from memory via the fetch engine. The instruction data is
stored to a FIFO – the instruction prefetch buffer. The issue engine takes this data and assembles 32-bit
instruction words for the next pipeline stage. Compressed instructions – if enabled – are also decompressed
in this stage. The second stage (EX – execution) is responsible for actually executing the fetched instructions
stored to a FIFO - the instruction prefetch buffer. The issue engine takes this data and assembles 32-bit
instruction words for the next pipeline stage. Compressed instructions - if enabled - are also decompressed
in this stage. The second stage (EX - execution) is responsible for actually executing the fetched instructions
via the execute engine.
 
These two pipeline stages are based on a multi-cycle processing engine. So the processing of each stage for a
223,6 → 223,8
[IMPORTANT]
The `misa`, `mip` and `mtval` CSRs in the NEORV32 are _read-only_.
Any write access to them (in machine mode) is ignored and will _not_ cause any exceptions or side-effects.
Pending interrupts can only be cleared by acknowledging the interrupt-causing device. However, pending interrupts
can still be ignored by clearing the according `mie` register bits.
 
.Physical memory protection
[IMPORTANT]
337,9 → 339,9
:sectnums:
=== Instruction Sets and Extensions
 
The NEORV32 is an RISC-V `rv32i` architecture that provides several optional RISC-V CPU and ISA
The basic NEORV32 is a RISC-V `rv32i` architecture that provides several _optional_ RISC-V CPU and ISA
(instruction set architecture) extensions. For more information regarding the RISC-V ISA extensions please
see the The _RISC-V Instruction Set Manual – Volume I: Unprivileged ISA_ and _The RISC-V Instruction Set Manual
see the _RISC-V Instruction Set Manual - Volume I: Unprivileged ISA_ and _The RISC-V Instruction Set Manual
Volume II: Privileged Architecture_, which are available in the project's `docs/references` folder.
 
[TIP]
348,15 → 350,16
or by executing an instruction and checking for an _illegal instruction exception_.
 
[NOTE]
Executing an instruction from an extension that is not implemented or not enabled (for example via the according
top entity generic) will raise an _illegal instruction_ exception.
Executing an instruction from an extension that is not supported yet or that is currently not enabled
(via the according top entity generic) will raise an _illegal instruction_ exception.
 
 
==== **`A`** - Atomic Memory Access
 
Atomic memory access instructions (for implementing semaphores and mutexes) are available when the
`CPU_EXTENSION_RISCV_A` configuration generic is _true_. In this case the following additional instructions
are available:
Atomic memory access instructions allow more sophisticated memory operations like implementing semaphores and mutexes.
The RISC-V specs. define a specific _atomic_ extension that provides instructions for atomic memory accesses. The `A`
ISA extension is enabled if the `CPU_EXTENSION_RISCV_A` configuration generic is _true_.
In this case the following additional instructions are available:
 
* `lr.w`: load-reservate
* `sc.w`: store-conditional
364,9 → 367,19
[NOTE]
Even though only `lr.w` and `sc.w` instructions are implemented yet, all further atomic operations
(load-modify-write instruction) can be emulated using these two instructions. Furthermore, the
instruction’s ordering flags (`aq` and `lr`) are ignored by the CPU hardware. Using any other (not yet
implemented) AMO (atomic memory operation) will trigger an illegal instruction exception.
instruction's ordering flags (`aq` and `rl`) are ignored by the CPU hardware. Using any other (not yet
implemented) AMO (atomic memory operation) will raise an illegal instruction exception.
 
The *load-reservate* instruction behaves as a "normal" load-word instruction (`lw`) but will also set a CPU-internal
_data memory access lock_. Executing a *store-conditional* behaves as a "normal" store-word instruction (`sw`) that will
only conduct an actual memory write operation if the lock is still intact. Additionally, the store-conditional instruction
will also return the lock state (returns zero if the lock is still intact or non-zero if the lock has been broken).
After the execution of the `sc` instruction, the lock is automatically removed.
 
The lock is broken if at least one of the following conditions occurs:
. executing any data memory access instruction other than `lr.w`
. raising _any_ trap (for example an interrupt or a memory access exception)
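
The lock behavior described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the type `lrsc_model_t` and the `model_*` functions are hypothetical names introduced here, the memory is a toy array indexed by word, and nothing in it is taken from the NEORV32 hardware description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MEM_WORDS 16u  /* size of the toy memory (illustrative only) */

/* Hypothetical software model of the CPU-internal lock; not NEORV32 code. */
typedef struct {
    uint32_t mem[MEM_WORDS];
    bool     lock;  /* "data memory access lock", set by lr.w */
} lrsc_model_t;

/* lr.w: behaves like a normal load-word and additionally sets the lock. */
static uint32_t model_lr_w(lrsc_model_t *m, uint32_t idx) {
    assert(idx < MEM_WORDS);
    m->lock = true;
    return m->mem[idx];
}

/* sc.w: writes only if the lock is still intact. Returns zero if the lock
   was intact and non-zero otherwise. The lock is removed in either case. */
static uint32_t model_sc_w(lrsc_model_t *m, uint32_t idx, uint32_t value) {
    assert(idx < MEM_WORDS);
    uint32_t status = m->lock ? 0u : 1u;
    if (m->lock) {
        m->mem[idx] = value;
    }
    m->lock = false;
    return status;
}

/* Any data memory access other than lr.w (here: a plain store) breaks the lock. */
static void model_sw(lrsc_model_t *m, uint32_t idx, uint32_t value) {
    assert(idx < MEM_WORDS);
    m->mem[idx] = value;
    m->lock = false;
}

/* Any trap (interrupt or exception) breaks the lock. */
static void model_trap(lrsc_model_t *m) {
    m->lock = false;
}
```

In this model a second `model_sc_w` call without a preceding `model_lr_w` always fails, which mirrors the statement that the lock is automatically removed after the `sc` instruction.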
 
[NOTE]
The atomic instructions have special requirements for memory system / bus interconnect. More
information can be found in sections <<_bus_interface>> and <<_processor_external_memory_interface_wishbone_axi4_lite>>, respectively.
374,15 → 387,16
 
==== **`C`** - Compressed Instructions
 
Compressed 16-bit instructions are available when the `CPU_EXTENSION_RISCV_C` configuration generic is
_true_. In this case the following instructions are available:
The _compressed_ ISA extension provides 16-bit encodings of commonly used instructions to reduce code space size.
The `C` extension is available when the `CPU_EXTENSION_RISCV_C` configuration generic is _true_.
In this case the following instructions are available:
 
* `c.addi4spn`, `c.lw`, `c.sw`, `c.nop`, `c.addi`, `c.jal`, `c.li`, `c.addi16sp`, `c.lui`, `c.srli`, `c.srai` `c.andi`, `c.sub`,
`c.xor`, `c.or`, `c.and`, `c.j`, `c.beqz`, `c.bnez`, `c.slli`, `c.lwsp`, `c.jr`, `c.mv`, `c.ebreak`, `c.jalr`, `c.add`, `c.swsp`
 
[NOTE]
When the compressed instructions extension is enabled, branches to an _unaligned_ and _uncompressed_ address require
an additional instruction fetch to load the required second half-word of that instruction. The performance can be increased
When the compressed instructions extension is enabled, branches to an _unaligned_ and _uncompressed_ instruction require
an additional instruction fetch to load the according second half-word of that instruction. The performance can be increased
again by forcing a 32-bit alignment of branch target addresses. By default, this is enforced via the GCC `-falign-functions=4`,
`-falign-labels=4`, `-falign-loops=4` and `-falign-jumps=4` compile flags (via the makefile).
 
389,9 → 403,10
 
==== **`E`** - Embedded CPU
 
The embedded CPU extensions reduces the size of the general purpose register file from 32 entries to 16 entries to reduce hardware
requirements. This extensions is enabled when the `CPU_EXTENSION_RISCV_E` configuration generic is _true_. Accesses to registers beyond
`x15` will raise and _illegal instruction exception_.
The embedded CPU extension reduces the size of the general purpose register file from 32 entries to 16 entries to
decrease physical hardware requirements (for example block RAM). This extension is enabled when the `CPU_EXTENSION_RISCV_E`
configuration generic is _true_. Accesses to registers beyond `x15` will raise an _illegal instruction exception_.
This extension does not add any additional instructions or features.
 
[IMPORTANT]
Due to the reduced register file size an alternate toolchain ABI (**`ilp32e`**) is required.
398,11 → 413,12
 
 
==== **`I`** - Base Integer ISA
 
The CPU always supports the complete `rv32i` base integer instruction set. This base set is always enabled
regardless of the setting of the remaining extensions. The base instruction set includes the following
instructions:
 
* immediates: `lui`, `auipc`
* immediate: `lui`, `auipc`
* jumps: `jal`, `jalr`
* branches: `beq`, `bne`, `blt`, `bge`, `bltu`, `bgeu`
* memory: `lb`, `lh`, `lw`, `lbu`, `lhu`, `sb`, `sh`, `sw`
423,7 → 439,7
 
==== **`M`** - Integer Multiplication and Division
 
Hardware-accelerated integer multiplication and division instructions are available when the
Hardware-accelerated integer multiplication and division operations are available when the
`CPU_EXTENSION_RISCV_M` configuration generic is _true_. In this case the following instructions are
available:
 
440,9 → 456,9
==== **`Zmmul`** - Integer Multiplication
 
This is a _sub-extension_ of the `M` ISA extension. It implements the multiplication-only operations
of the `M` extensions and is intended for small scale applications, that require hardware-based
of the `M` extensions and is intended for size-constrained setups that require hardware-based
integer multiplications but not hardware-based divisions, which will be computed entirely in software.
This extension requires only ~50% of the hardware utilization of the `M` extension.
This extension requires only ~50% of the hardware utilization of the "full" `M` extension.
 
* multiplication: `mul`, `mulh`, `mulhsu`, `mulhu`
 
454,14 → 470,16
[TIP]
If your RISC-V GCC toolchain does not (yet) support the `_Zmmul` ISA extensions, it can be "emulated"
using a `rv32im` machine architecture and setting the `-mno-div` compiler flag
(example `$ make MARCH=-march=rv32im USER_FLAGS+=-mno-div clean_all exe`).
(example `$ make MARCH=rv32im USER_FLAGS+=-mno-div clean_all exe`).
 
 
==== **`U`** - Less-Privileged User Mode
 
Adds the less-privileged _user mode_ if the `CPU_EXTENSION_RISCV_U` configuration generic is _true_. For
instance, use-level code cannot access machine-mode CSRs. Furthermore, access to the address space (like
peripheral/IO devices) can be limited via the physical memory protection (_PMP_) unit for code running in user mode.
In addition to the basic (and highest-privileged) machine-mode, the _user-mode_ ISA extension adds a second less-privileged
operation mode. It is implemented if the `CPU_EXTENSION_RISCV_U` configuration generic is _true_.
Code executed in user-mode cannot access machine-mode CSRs. Furthermore, user-mode access to the address space (like
peripheral/IO devices) can be constrained via the physical memory protection (_PMP_).
Any kind of privilege rights violation will raise an exception to allow full virtualization.
 
 
==== **`X`** - NEORV32-Specific (Custom) Extensions
477,19 → 495,18
 
==== **`Zfinx`** Single-Precision Floating-Point Operations
 
[WARNING]
The NEORV32 `Zfinx` extension is specification-compliant and operational but still _experimental_.
The `Zfinx` floating-point extension is an _alternative_ to the standard `F` floating-point ISA extension.
The `Zfinx` extension also uses the integer register file `x` to store and operate on floating-point data
instead of a dedicated floating-point register file (hence, `F-in-x`). Thus, the `Zfinx` extension requires
less hardware resources and features faster context changes. This also implies that there are NO dedicated `f`
register file-related load/store or move instructions.
The official RISC-V specifications can be found here: https://github.com/riscv/riscv-zfinx
 
The `Zfinx` floating-point extension is an alternative of the `F` floating-point instruction that also uses the
integer register file `x` to store and operate on floating-point data (hence, `F-in-x`). Since not dedicated floating-point `f`
register file exists, the `Zfinx` extension requires less hardware resources and features faster context changes.
This also implies that there are NO dedicated `f` register file related load/store or move instructions. The
official RISC-V specifications can be found here: https://github.com/riscv/riscv-zfinx
 
[TIP]
The NEORV32 floating-point unit used by the `Zfinx` extension is compatible to the _IEEE-754_ specifications.
 
The `Zfinx` extensions only supports single-precision (`.s` suffix) yet (so it is a direct alternative to the `F`
extension). The `Zfinx` extension is implemented when the `CPU_EXTENSION_RISCV_Zfinx` configuration
The `Zfinx` extension only supports single-precision (`.s` instruction suffix), so it is a direct alternative
to the `F` extension. The `Zfinx` extension is implemented when the `CPU_EXTENSION_RISCV_Zfinx` configuration
generic is _true_. In this case the following instructions and CSRs are available:
 
* conversion: `fcvt.s.w`, `fcvt.s.wu`, `fcvt.w.s`, `fcvt.wu.s`
505,8 → 522,8
Division `fdiv.s` and square root `fsqrt.s` instructions are not supported yet!
 
[WARNING]
Subnormal numbers (also "de-normalized" numbers) are not supported by the NEORV32 FPU.
Subnormal numbers (exponent = 0) are _flushed to zero_ (setting them to +/- 0) before entering the
Subnormal numbers ("de-normalized" numbers) are not supported by the NEORV32 FPU.
Subnormal numbers (exponent = 0) are _flushed to zero_, setting them to +/- 0, before entering the
FPU's processing core. If a computational instruction (like `fmul.s`) generates a subnormal result, the
result is also flushed to zero during normalization.
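
To make the flush-to-zero rule concrete, the following sketch applies it to an IEEE-754 single-precision bit pattern in plain C. It is an illustration only, not the FPU implementation; the function names `flush_subnormal_bits` and `flush_subnormal` are hypothetical, and the code assumes that the host `float` type is a 32-bit IEEE-754 value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the host float is a 32-bit IEEE-754 single-precision value. */
static_assert(sizeof(float) == sizeof(uint32_t), "32-bit float expected");

/* Flush a subnormal bit pattern (exponent = 0, mantissa != 0) to a zero
   that keeps the original sign. All other patterns pass through unchanged. */
static uint32_t flush_subnormal_bits(uint32_t bits) {
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t mantissa = bits & 0x007FFFFFu;
    if (exponent == 0u && mantissa != 0u) {
        bits &= 0x80000000u;  /* keep only the sign bit: +0 or -0 */
    }
    return bits;
}

/* Convenience wrapper operating on float values. */
static float flush_subnormal(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits = flush_subnormal_bits(bits);
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

A software reference model that compares its results against the FPU would apply such a step to the operands and to the result.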
 
519,9 → 536,6
 
==== **`Zbb`** Basic Bit-Manipulation Operations
 
[WARNING]
The NEORV32 `Zbb` extension is specification-compliant and operational but still _experimental_.
 
The `Zbb` extension implements the _basic_ sub-set of the RISC-V bit-manipulation extensions `B`.
The official RISC-V specifications can be found here: https://github.com/riscv/riscv-bitmanip
 
541,7 → 555,7
<<_fast_shift_en>> generic can be enabled to implement full-parallel logic (like barrel shifters) for all
shift-related `Zbb` instructions.
 
[IMPORTANT]
[WARNING]
The `Zbb` extension is frozen but not officially ratified yet. There is no
software support for this extension in the upstream GCC RISC-V port yet. However, an
intrinsic library is provided to utilize the provided `Zbb` extension from C-language
550,17 → 564,17
 
==== **`Zicsr`** Control and Status Register Access / Privileged Architecture
 
The CSR access instructions as well as the exception and interrupt system (= the privileged architecture) is implemented when the
`CPU_EXTENSION_RISCV_Zicsr` configuration generic is _true_. In this case the following instructions are
available:
The CSR access instructions as well as the exception and interrupt system (= the privileged architecture)
is implemented when the `CPU_EXTENSION_RISCV_Zicsr` configuration generic is _true_.
In this case the following instructions are available:
 
* CSR access: `csrrw`, `csrrs`, `csrrc`, `csrrwi`, `csrrsi`, `csrrci`
* environment: `mret`, `wfi`
 
[WARNING]
If the `Zicsr` extension is disabled the CPU does not provide any kind of interrupt or exception
support at all. In order to provide the full spectrum of functions and to allow a secure executions
environment, the `Zicsr` extension should always be enabled.
If the `Zicsr` extension is disabled the CPU does not provide any _privileged architecture_ features at all!
In order to provide the full set of functions and to allow a secure execution
environment the `Zicsr` extension should always be enabled.
 
[NOTE]
The "wait for interrupt instruction" `wfi` works like a sleep command. When executed, the CPU is
567,9 → 581,9
halted until a valid interrupt request occurs. To wake up again, the according interrupt source has to
be enabled via the `mie` CSR and the global interrupt enable flag in `mstatus` has to be set.
 
[IMPORTANT]
The `wfi` instruction will raise an illegal instruction exception when executed outside of machine-mode
and <<_mstatus>> bit `TW` (timeout wait) is set.
[NOTE]
The `wfi` instruction may also be executed in user-mode without causing an exception as <<_mstatus>> bit
`TW` (timeout wait) is hardwired to zero.
 
 
==== **`Zifencei`** Instruction Stream Synchronization
579,7 → 593,6
 
* `fence.i`
 
[NOTE]
The `fence.i` instruction resets the CPU's internal instruction fetch engine and flushes the prefetch buffer.
This allows a clean re-fetch of modified instructions from memory. Also, the top's `i_bus_fencei_o` signal is set
high for one cycle to inform the memory system (like the i-cache) to perform a flush/reload.
588,18 → 601,20
 
==== **`PMP`** Physical Memory Protection
 
The NEORV32 physical memory protection (PMP) is compatible to the PMP specified by the RISC-V specs.
The CPU PMP only supports _NAPOT_ mode yet and a minimal region size (granularity) of 8 bytes. Larger minimal sizes can be configured
via the top `PMP_MIN_GRANULARITY` generic to reduce hardware requirements. The physical memory protection system is implemented when the
`PMP_NUM_REGIONS` configuration generic is >0. In this case the following additional CSRs are available:
The NEORV32 physical memory protection (PMP) is compatible to the RISC-V PMP specifications. It can be used
to constrain memory read/write/execute rights for each available privilege level.
 
The NEORV32 PMP only supports _NAPOT_ mode yet and a minimal region size (granularity) of 8 bytes. Larger
minimal sizes can be configured via the top `PMP_MIN_GRANULARITY` generic to reduce hardware requirements.
The physical memory protection system is implemented when the `PMP_NUM_REGIONS` configuration generic is >0.
In this case the following additional CSRs are available:
 
* `pmpcfg*` (0..15, depending on configuration): PMP configuration registers
* `pmpaddr*` (0..63, depending on configuration): PMP address registers
 
[TIP]
See section <<_machine_physical_memory_protection>> for more information regarding the PMP CSRs.
 
**Configuration**
 
The actual number of regions and the minimal region granularity are defined via the top entity
`PMP_MIN_GRANULARITY` and `PMP_NUM_REGIONS` generics. `PMP_MIN_GRANULARITY` defines the minimal available
granularity of each region in bytes. `PMP_NUM_REGIONS` defines the total number of implemented regions and thus, the
620,20 → 635,20
 
**Operation**
 
Any memory access address (from the CPU's instruction fetch or data access interface) is tested if it is accessing any
of the specified (configured via `pmpaddr*` and enabled via `pmpcfg*`) PMP regions. If an
address accesses one of these regions, the configured access rights (attributes in `pmpcfg*`) are checked:
Any CPU memory access address (from the instruction fetch or data access interface) is tested if it is accessing _any_
of the specified PMP regions (configured via `pmpaddr*` and enabled via `pmpcfg*`). If an
address matches one of these regions, the configured access rights (attributes in `pmpcfg*`) are enforced:
 
* a write access (store) will fail if no write attribute is set
* a read access (load) will fail if no read attribute is set
* an instruction fetch access will fail if no execute attribute is set
 
If an access to a protected region does not have the according access rights (attributes) it will raise the according
_instruction/load/store access fault exception_.
If an access to a protected region does not have the according access rights it will raise the according
instruction/load/store _access fault_ exception.
 
By default, all PMP checks are enforced for user-level programs only. If you wish to enforce the physical
memory protection also for machine-level programs you need to active the _locked bit_ in the according
`pmpcfg*` configuration.
memory protection also for machine-level programs you need to set the _locked bit_ in the according
`pmpcfg*` configuration CSR.
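
The check for a single region can be sketched as a small C model. This is a hedged illustration, not the hardware logic: `pmp_region_t`, `pmp_napot_match` and `pmp_region_allows` are hypothetical names, the NAPOT decoding follows the standard RISC-V encoding (`pmpaddr` holds the address shifted right by two, with the trailing one-bits selecting the region size), and the model deliberately covers only one region, so it says nothing about priority between regions or about accesses that match no region at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* pmpcfg bit fields as defined by the RISC-V privileged specification. */
#define PMP_R       0x01u  /* read permission */
#define PMP_W       0x02u  /* write permission */
#define PMP_X       0x04u  /* execute permission */
#define PMP_A_MASK  0x18u  /* mode field, bits 4:3 */
#define PMP_A_NAPOT 0x18u  /* mode = NAPOT (11) */
#define PMP_L       0x80u  /* lock bit */

typedef enum { PMP_ACC_READ, PMP_ACC_WRITE, PMP_ACC_EXEC } pmp_access_t;

/* Hypothetical model of one PMP entry. */
typedef struct {
    uint32_t pmpaddr;  /* address register (byte address >> 2, NAPOT-encoded) */
    uint8_t  pmpcfg;   /* 8-bit configuration of this entry */
} pmp_region_t;

/* True if byte address addr lies inside the NAPOT region encoded in pmpaddr. */
static bool pmp_napot_match(uint32_t pmpaddr, uint32_t addr) {
    /* Ones in the trailing-ones positions plus the first zero bit above them. */
    uint32_t size_mask = pmpaddr ^ (pmpaddr + 1u);
    return (((addr >> 2) ^ pmpaddr) & ~size_mask) == 0u;
}

/* True if this single region does not object to the access. */
static bool pmp_region_allows(const pmp_region_t *r, uint32_t addr,
                              pmp_access_t acc, bool machine_mode) {
    assert(r != NULL);
    if ((r->pmpcfg & PMP_A_MASK) != PMP_A_NAPOT) return true;   /* region is OFF */
    if (!pmp_napot_match(r->pmpaddr, addr)) return true;        /* no match */
    if (machine_mode && !(r->pmpcfg & PMP_L)) return true;      /* not locked */
    switch (acc) {
        case PMP_ACC_READ:  return (r->pmpcfg & PMP_R) != 0u;
        case PMP_ACC_WRITE: return (r->pmpcfg & PMP_W) != 0u;
        default:            return (r->pmpcfg & PMP_X) != 0u;
    }
}
```

As an example, a 256-byte region starting at `0x1000` is encoded as `pmpaddr = (0x1000 >> 2) | ((256 >> 3) - 1) = 0x41F`.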
 
[IMPORTANT]
After updating the address configuration registers `pmpaddr*` the system requires up to 33 cycles for
641,15 → 656,15
 
[NOTE]
For more information regarding RISC-V physical memory protection see the official _The RISC-V
Instruction Set Manual – Volume II: Privileged Architecture_ specifications.
Instruction Set Manual - Volume II: Privileged Architecture_ specifications.
 
 
==== **`HPM`** Hardware Performance Monitors
 
In additions to the mandatory cycles (`[m]cycle[h]`) and instruction (`[m]instret[h]`) counters the NEORV32 CPU provides
In addition to the mandatory cycle (`[m]cycle[h]`) and instruction (`[m]instret[h]`) counters the NEORV32 CPU provides
up to 29 hardware performance monitors (HPM 3..31), which can be used to benchmark applications. Each HPM consists of an
N-bit wide counter (split in a high-word 32-bit CSR and a low-word 32-bit CSR), where N is defined via the top's
`HPM_CNT_WIDTH` generic (0..64-bit), and a corresponding event configuration CSR. The event configuration
`HPM_CNT_WIDTH` generic (0..64-bit) and a corresponding event configuration CSR. The event configuration
CSR defines the architectural events that lead to an increment of the associated HPM counter.
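
Because each counter is split into a high-word and a low-word CSR, software that needs the full value has to combine two separate 32-bit reads. The sketch below shows the common high/low/high read pattern that guards against a carry between the two reads. It is an assumption-laden illustration: CSR accesses are modeled as function pointers, `read_counter64`, `sim_read_lo`, `sim_read_hi` and `sim_counter` are hypothetical names, and the simulated counter simply advances by one on every read so the example can run on a host machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A CSR read is modeled as a function returning a 32-bit value. */
typedef uint32_t (*csr_read_fn)(void);

/* Read a counter that is split into two 32-bit halves. The high word is
   read twice; if it changed in between, the low word may have wrapped
   around, so the whole sequence is repeated. */
static uint64_t read_counter64(csr_read_fn read_lo, csr_read_fn read_hi) {
    assert(read_lo != NULL && read_hi != NULL);
    uint32_t hi, lo, hi_check;
    do {
        hi       = read_hi();
        lo       = read_lo();
        hi_check = read_hi();
    } while (hi != hi_check);
    return ((uint64_t)hi << 32) | lo;
}

/* Simulated free-running counter for host-side experiments (not a real CSR).
   Every read advances the counter by one. */
static uint64_t sim_counter;

static uint32_t sim_read_lo(void) {
    uint32_t value = (uint32_t)sim_counter;
    sim_counter += 1u;
    return value;
}

static uint32_t sim_read_hi(void) {
    uint32_t value = (uint32_t)(sim_counter >> 32);
    sim_counter += 1u;
    return value;
}
```

On the target, the two reader functions would be replaced by CSR read instructions for the low-word and high-word CSR of the selected counter.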
 
The cycle, time and instructions-retired counters (`[m]cycle[h]`, `time[h]`, `[m]instret[h]`) are
656,29 → 671,32
mandatory performance monitors on every RISC-V platform and have fixed increment events. For example,
the instructions-retired counter increments with each executed instruction. The actual hardware performance
monitors are optional and can be configured to increment on arbitrary hardware events. The number of
available HPM is configured via the top's `HPM_NUM_CNTS` generic at synthesis time. Assigning a zero will exclude
available HPM is configured via the top's `HPM_NUM_CNTS` generic at synthesis time. Assigning a zero will remove
all HPM logic from the design.
 
Depending on the configuration, the following additional CSR are available:
If `HPM_NUM_CNTS` is lower than the maximum value (=29) the remaining HPM CSRs are not implemented and the
according `mcountinhibit` CSR bits are hardwired to zero.
However, accessing their associated CSRs will not raise an illegal instruction exception (if in machine mode).
The according CSRs are read-only and will always return 0.
 
* counters: `mhpmcounter*[h]` (3..31, depending on configuration)
* event configuration: `mhpmevent*` (3..31, depending on configuration)
Depending on the configuration the following additional CSRs are available:
 
* counters: `mhpmcounter*[h]` (3..31, depending on `HPM_NUM_CNTS`)
* event configuration: `mhpmevent*` (3..31, depending on `HPM_NUM_CNTS`)
 
[IMPORTANT]
The HPM counter CSR can only be accessed in machine-mode. Hence, the according `mcounteren` CSR bits
are always zero and read-only.
are always zero and read-only. Any access from less-privileged modes will raise an illegal instruction
exception.
 
[TIP]
Auto-increment of the HPMs can be individually deactivated via the `mcountinhibit` CSR.
 
If `HPM_NUM_CNTS` is lower than the maximum value (=29) the remaining HPM CSRs are not implemented and the
according `mcountinhibit` CSR bits are hardwired to zero.
However, accessing their associated CSRs will not raise an illegal instruction exception (if in machine mode).
The according CSRs are read-only and will always return 0.
[TIP]
For a list of all HPM-related CSRs and all provided event configurations
see section <<_hardware_performance_monitors_hpm>>.
 
[NOTE]
For a list of all allocated HPM-related CSRs and all provided event configurations see section <<_hardware_performance_monitors_hpm>>.
 
 
<<<
// ####################################################################################################################
:sectnums:
735,7 → 753,7
|=======================
 
[NOTE]
The presented values of the *floating-point execution cycles* are average values – obtained from
The presented values of the *floating-point execution cycles* are average values - obtained from
4096 instruction executions using pseudo-random input values. The execution time for emulating the
instructions (using pure-software libraries) is ~17..140 times higher.
 
766,7 → 784,9
(i.e. there is no speculative execution / no out-of-order states).
* The CPU supports _all_ RISC-V bus exceptions including access exceptions that are triggered if an
accessed address does not respond or encounters an internal error during access.
* The CPU raises an illegal instruction trap for _all_ unimplemented/malformed/illegal instructions.
* The RISC-V specs. state that executing a malformed instruction results in unpredictable behavior. As an additional security feature,
the NEORV32 CPU ensures that _all_ unimplemented/malformed/illegal instructions _do raise an illegal instruction trap_ and
_do not commit any operation_ (like writing registers or triggering memory operations).
* To be continued...
 
 
791,14 → 811,14
is serviced first while the remaining ones stay _pending_. After completing the interrupt handler the interrupt with
the second highest priority will get serviced and so on until no further interrupts are pending.
 
.RISC-V interrupts
.Interrupt Signal Requirements
[IMPORTANT]
All RISC-V defined machine level interrupts request signals are high-active. A request has to stay at high-level until
it is acknowledged by the CPU (for example by writing to a specific memory-mapped register).
All interrupt request signals (including FIRQs) are **high-active**. A request has to stay at high-level (=asserted)
until it is explicitly acknowledged by the CPU software (for example by writing to a specific memory-mapped register).
 
.Instruction Atomicity
[NOTE]
All instructions execute as atomic operations – interrupts can only trigger between two instructions.
All instructions execute as atomic operations - interrupts can only trigger between two instructions.
So if there is a permanent interrupt request, exactly one instruction from the interrupt program will be executed before
a new interrupt handler can start.
 
817,10 → 837,8
 
As a custom extension, the NEORV32 CPU features 16 fast interrupt request (FIRQ) lines via the `firq_i` CPU top
entity signals. These interrupts have custom configuration and status flags in the `mie` and `mip` CSRs and also
provide custom trap codes in `mcause`. These FIRQs are reserved for processor-internal usage only.
provide custom trap codes in `mcause`. These FIRQs are reserved for NEORV32 processor-internal usage only.
 
[NOTE]
The fast interrupt request lines trigger on a **rising-edge**.
 
 
<<<
894,7 → 912,7
The CPU is a 32-bit architecture with separated instruction and data interfaces making it a Harvard
Architecture. Each of these interfaces can access an address space of up to 2^32^ bytes (4GB). The memory
system is based on 32-bit words with a minimal granularity of 1 byte. Please note, that the NEORV32 CPU
does not support unaligned memory accesses _in hardware_ – however, a software-based handling can be
does not support unaligned memory accesses _in hardware_ - however, a software-based handling can be
implemented as any unaligned memory access will trigger an according exception.
 
:sectnums:
/datasheet/cpu_csr.adoc
218,7 → 218,6
[options="header",grid="rows"]
|=======================
| Bit | Name [C] | R/W | Function
| 21 | _CSR_MSTATUS_TW_ | r/w | Timeout wait: raise illegal instruction exception if `WFI` instruction is executed outside of M-mode when set
| 12:11 | _CSR_MSTATUS_MPP_H_ : _CSR_MSTATUS_MPP_L_ | r/w | Previous machine privilege level, 11 = machine (M) level, 00 = user (U) level
| 7 | _CSR_MSTATUS_MPIE_ | r/w | Previous machine global interrupt enable flag state
| 3 | _CSR_MSTATUS_MIE_ | r/w | Machine global interrupt enable flag
446,8 → 445,9
|======
| 0x344 | **Machine interrupt Pending** | `mip`
3+| Reset value: _0x00000000_
3+| The `mip` CSR is _partly_ compatible to the RISC-V specifications and also provides custom extensions. It shows currently pending interrupts. Since this register is
read-only, pending interrupts of processor-internal modules can only be cleared by disabling and re-enabling the according `mie` CSR bit.
3+| The `mip` CSR is compatible to the RISC-V specifications and also provides custom extensions. It shows currently _pending_ interrupts.
Since this register is read-only, pending interrupts of processor-internal modules can only be cleared by acknowledging the interrupt-causing
device. However, pending interrupts can be ignored by clearing the according <<_mie>> register bits.
The following CSR bits are implemented (all remaining bits are always zero and are read-only).
|======
 
502,7 → 502,7
[options="header",grid="rows"]
|=======================
| Bit | RISC-V name | R/W | Function
| 7 | _L_ | r/w | lock bit, can be set – but not be cleared again (only via CPU reset)
| 7 | _L_ | r/w | lock bit, can be set - but not be cleared again (only via CPU reset)
| 6:5 | - | r/- | reserved, read as zero
| 4:3 | _A_ | r/w | mode configuration; only OFF (`00`) and NAPOT (`11`) are supported
| 2 | _X_ | r/w | execute permission
/datasheet/on_chip_debugger.adoc
19,7 → 19,9
.OCD Security Note
[IMPORTANT]
Access via the OCD is _always authenticated_ (`dmstatus.authenticated` == `1`). Hence, the
_whole system_ can always be accessed via the on-chip debugger.
_whole system_ can always be accessed via the on-chip debugger. Currently, there is no option
to disable the OCD via software. The OCD can only be disabled by disabling implementation
(setting _ON_CHIP_DEBUGGER_EN_ generic to _false_).
 
[NOTE]
The OCD requires additional resources for implementation and _might_ also increase the critical path resulting in less
352,15 → 354,15
[cols="^1,^2,<8"]
[options="header",grid="rows"]
|=======================
| Bit | Name [RISC-V] | Description / required value
| 31:24 | `cmdtype` | `00000000` to indicate "access register" command
| 23 | _reserved_ | reserved, has to be `0` when writing
| 22:20 | `aarsize` | `010` to indicate 32-bit accesses
| 21 | `aarpostincrement` | `0`, postincrement is not supported
| 18 | `postexec` | if set the program buffer is executed _after_ the command
| 17 | `transfer` | if set the operation in `write` is conducted
| 16 | `write` | `1`: copy `data0` to `[regno]`; `0` copy `[regno]` to `data0`
| 15:0 | `regno` | GPR-access only; has to be `0x1000` - `0x101f`
| Bit | Name [RISC-V] | R/W | Description / required value
| 31:24 | `cmdtype` | -/w | `00000000` to indicate "access register" command
| 23 | _reserved_ | -/w | reserved, has to be `0` when writing
| 22:20 | `aarsize` | -/w | `010` to indicate 32-bit accesses
| 19 | `aarpostincrement` | -/w | `0`, postincrement is not supported
| 18 | `postexec` | -/w | if set the program buffer is executed _after_ the command
| 17 | `transfer` | -/w | if set the operation in `write` is conducted
| 16 | `write` | -/w | `1`: copy `data0` to `[regno]`; `0` copy `[regno]` to `data0`
| 15:0 | `regno` | -/w | GPR-access only; has to be `0x1000` - `0x101f`
|=======================
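
As an illustration of the bit layout in the table, the following C helper assembles an "access register" command word for a GPR access. This is a sketch, not debugger code: the function name `dm_access_gpr_cmd` and its parameters are hypothetical, and it only combines the fields listed above (`cmdtype` = 0, `aarsize` = `010`, `postexec`, `transfer`, `write`, `regno`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build an "access register" abstract command for general purpose register
   x0..x31. Field positions follow the table above; the name is illustrative. */
static uint32_t dm_access_gpr_cmd(unsigned gpr, bool write, bool transfer,
                                  bool postexec) {
    assert(gpr < 32u);                          /* regno must be 0x1000..0x101f */
    uint32_t cmd = 0u;
    cmd |= (uint32_t)0x00u << 24;               /* cmdtype: access register */
    cmd |= (uint32_t)0x2u << 20;                /* aarsize = 010: 32-bit access */
    cmd |= (uint32_t)(postexec ? 1u : 0u) << 18; /* run program buffer afterwards */
    cmd |= (uint32_t)(transfer ? 1u : 0u) << 17; /* conduct the operation in write */
    cmd |= (uint32_t)(write ? 1u : 0u) << 16;    /* 1: data0 -> reg, 0: reg -> data0 */
    cmd |= 0x1000u + (uint32_t)gpr;             /* regno */
    return cmd;
}
```

For example, copying `data0` to `x5` (`write` = 1, `transfer` = 1, `postexec` = 0) yields the command word `0x00231005`.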
 
 
493,8 → 495,14
The CPU _debug mode_ requires the `Zicsr` and `Zifencei` CPU extensions to be implemented (top generics _CPU_EXTENSION_RISCV_Zicsr_
and _CPU_EXTENSION_RISCV_Zifencei_ = true).
 
The CPU debug mode is entered when one of the following events appear:
.Hardware Watchpoints and Breakpoints
[NOTE]
The NEORV32 CPU _debug mode_ does not provide a hardware "trigger module" (which is optional in the RISC-V debug spec). However, gdb
provides a native _emulation_ for code (breakpoints using `break` instruction) and data (polling data watchpoints in automated
single-stepping) triggers.
 
The CPU debug-mode is entered when one of the following events occurs:
 
[start=1]
. executing `ebreak` instruction (when `dcsr.ebreakm` is set and in machine mode OR when `dcsr.ebreaku` is set and in user mode)
. debug halt request from external DM (via CPU signal `db_halt_req_i`, high-active, triggering on rising-edge)
503,8 → 511,14
From a hardware point of view, these "entry conditions" are special synchronous (`ebreak` instruction) or asynchronous
(single-stepping "interrupt"; halt request "interrupt") traps, that are handled invisibly by the control logic.
 
Whenever the CPU **enters debug mode** it performs the following operations:
.WFI instruction
[WARNING]
The wait-for-interrupt instruction `wfi` puts the CPU into sleep mode. The CPU will resume normal operation
when at least one interrupt source becomes pending (= at least one bit in the `mip` CSR is set).
However, the CPU will _also resume_ from sleep mode if there is a halt request from the debug module (DM).
 
Whenever the CPU **enters debug-mode** it performs the following operations:
 
* move `pc` to `dpc`
* copy the hart's current privilege level to `dcsr.prv`
* set `dcsr.cause` according to the cause why debug mode is entered
511,16 → 525,17
* **no update** of `mcause`, `mtval` and `mstatus` CSRs
* load the address configured via the CPU _CPU_DEBUG_ADDR_ generic to the `pc` to jump to "debugger park loop" code in the debug module (DM)
 
When the CPU **is in debug mode** the following things are important:
When the CPU **is in debug-mode** the following things are important:
 
* while in debug mode, the CPU executes the parking loop and the program buffer provided by the DM if requested
* effective CPU privilege level is `machine` mode, PMP is not active
* if an exception occurs
* if the exception was caused by any debug-mode entry action the CPU jumps to the _normal entry point_
( = _CPU_DEBUG_ADDR_) of the park loop again (for example when executing `ebreak` in debug mode)
* for all other exception sources the CPU jumps to the _exception entry point_ ( = _CPU_DEBUG_ADDR_ + 4)
to signal an exception to the DM and restarts the park loop again afterwards
* interrupts are disabled; however, they will be remain pending and will get executed after the CPU has left debug mode
* effective CPU privilege level is `machine` mode, any PMP configuration is bypassed
* the `wfi` instruction acts as a `nop` (also during single-stepping)
* if an exception occurs:
** if the exception was caused by any debug-mode entry action the CPU jumps to the _normal entry point_
(= _CPU_DEBUG_ADDR_) of the park loop again (for example when executing `ebreak` _in_ debug-mode)
** for all other exception sources the CPU jumps to the _exception entry point_ ( = _CPU_DEBUG_ADDR_ + 4)
to signal an exception to the DM and restarts the park loop again afterwards
* interrupts are disabled; however, they will remain pending and will get executed after the CPU has left debug mode
* if the DM makes a resume request, the park loop exits and the CPU leaves debug mode (executing `dret`)
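
The two park-loop entry points described in the list above can be summarized in a small model. This is only an illustrative sketch: the function name and the boolean flag are hypothetical, and the base address is passed in as a parameter because the concrete value of _CPU_DEBUG_ADDR_ is not given here.

[source,c]
----
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: where does the CPU continue when an exception
 * occurs while it is already in debug mode? */
uint32_t debug_mode_trap_target(uint32_t cpu_debug_addr, bool caused_by_debug_entry)
{
    if (caused_by_debug_entry) {
        return cpu_debug_addr;       /* normal entry point (e.g. `ebreak` in debug mode) */
    }
    return cpu_debug_addr + 4u;      /* exception entry point, signals the error to the DM */
}
----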
 
Debug mode is left either by executing the `dret` instruction footnote:[`dret` should only be executed _inside_ the debugger
/datasheet/overview.adoc
11,7 → 11,7
compatible on-chip debugger accessible via JTAG.
 
The software framework of the processor comes with application makefiles, software libraries for all CPU
and processor features, a bootloader, a runtime environment and several example programs – including a port
and processor features, a bootloader, a runtime environment and several example programs - including a port
of the CoreMark MCU benchmark and the official RISC-V architecture test suite. RISC-V GCC is used as
default toolchain (https://github.com/stnolting/riscv-gcc-prebuilt[prebuilt toolchains are also provided]).
 
26,11 → 26,11
 
**Structure**
 
[start=1]
[start=2]
. <<_neorv32_processor_soc>>
. <<_neorv32_central_processing_unit_cpu>>
. <<_software_framework>>
. <<_on_chip_debugger_ocd>>
. <<_software_framework>>
 
[TIP]
Links in this document are <<_overview,highlighted>>.
/datasheet/soc.adoc
13,7 → 13,7
* _optional_ processor-internal data and instruction memories (<<_data_memory_dmem,**DMEM**>>/<<_instruction_memory_imem,**IMEM**>>) + cache (<<_processor_internal_instruction_cache_icache,**iCACHE**>>)
* _optional_ internal bootloader (<<_bootloader_rom_bootrom,**BOOTROM**>>) with UART console & SPI flash boot option
* _optional_ machine system timer (<<_machine_system_timer_mtime,**MTIME**>>), RISC-V-compatible
* _optional_ two independent universal asynchronous receivers and transmitters (<<_primary_universal_asynchronous_receiver_and_transmitter_uart0,**UART0**>>, <<_secondary_universal_asynchronous_receiver_and_transmitter_uart1,**UART1**>>) with optional hardware flow control (RTS/CTS)
* _optional_ two independent universal asynchronous receivers and transmitters (<<_primary_universal_asynchronous_receiver_and_transmitter_uart0,**UART0**>>, <<_secondary_universal_asynchronous_receiver_and_transmitter_uart1,**UART1**>>) with optional hardware flow control (RTS/CTS) and optional RX/TX FIFOs
* _optional_ 8/16/24/32-bit serial peripheral interface controller (<<_serial_peripheral_interface_controller_spi,**SPI**>>) with 8 dedicated CS lines
* _optional_ two wire serial interface controller (<<_two_wire_serial_interface_controller_twi,**TWI**>>), compatible to the I²C standard
* _optional_ general purpose parallel IO port (<<_general_purpose_input_and_output_port_gpio,**GPIO**>>), 64xOut, 64xIn
796,6 → 796,32
 
 
:sectnums!:
===== _IO_UART0_RX_FIFO_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **IO_UART0_RX_FIFO** | _natural_ | 1
3+| UART0 receiver FIFO depth, has to be a power of two, minimum value is 1 (implementing simple double-buffering).
See section <<_primary_universal_asynchronous_receiver_and_transmitter_uart0>> for
more information.
|======
 
 
:sectnums!:
===== _IO_UART0_TX_FIFO_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **IO_UART0_TX_FIFO** | _natural_ | 1
3+| UART0 transmitter FIFO depth, has to be a power of two, minimum value is 1 (implementing simple double-buffering).
See section <<_primary_universal_asynchronous_receiver_and_transmitter_uart0>> for
more information.
|======
 
 
:sectnums!:
===== _IO_UART1_EN_
 
[cols="4,4,2"]
808,6 → 834,32
 
 
:sectnums!:
===== _IO_UART1_RX_FIFO_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **IO_UART1_RX_FIFO** | _natural_ | 1
3+| UART1 receiver FIFO depth, has to be a power of two, minimum value is 1 (implementing simple double-buffering).
See section <<_primary_universal_asynchronous_receiver_and_transmitter_uart0>> for
more information.
|======
 
 
:sectnums!:
===== _IO_UART1_TX_FIFO_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **IO_UART1_TX_FIFO** | _natural_ | 1
3+| UART1 transmitter FIFO depth, has to be a power of two, minimum value is 1 (implementing simple double-buffering).
See section <<_primary_universal_asynchronous_receiver_and_transmitter_uart0>> for
more information.
|======
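
The constraint shared by the four FIFO generics above (power of two, at least 1) can be checked with a single expression. The function below is a hypothetical host-side helper for validating a configuration, not part of the NEORV32 software framework.

[source,c]
----
#include <stdbool.h>

/* Hypothetical check for IO_UARTx_RX_FIFO / IO_UARTx_TX_FIFO values:
 * valid depths are 1, 2, 4, 8, ... */
bool uart_fifo_depth_is_valid(unsigned depth)
{
    /* a power of two has exactly one bit set, so depth & (depth - 1) is zero */
    return depth >= 1u && (depth & (depth - 1u)) == 0u;
}
----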
 
 
:sectnums!:
===== _IO_SPI_EN_
 
[cols="4,4,2"]
964,8 → 1016,9
 
.Trigger type
[IMPORTANT]
These IRQs trigger on **high-level** and must _stay asserted_ until explicitly acknowledged by the CPU (for example
by writing to a specific memory-mapped register).
The fast interrupt request channels trigger on **high-level** and have to stay asserted until explicitly acknowledged
by the software (for example by writing to a specific memory-mapped register). Hence, pending interrupts remain pending
as long as the interrupt-causing device's state fulfills its interrupt condition(s).
 
 
:sectnums:
986,7 → 1039,8
[IMPORTANT]
The trigger for these interrupts can be defined via generics. See section
<<_external_interrupt_controller_xirq>> for more information. Depending on the trigger type, users can
implement custom acknowledge mechanisms.
implement custom acknowledge mechanisms. All _external interrupts_ are mapped to a single processor-internal
_fast interrupt request_ (see below).
 
 
:sectnums:
993,7 → 1047,7
==== NEORV32-Specific Fast Interrupt Requests
 
As part of the custom/NEORV32-specific CPU extensions, the CPU features 16 fast interrupt request signals
(`FIRQ0` – `FIRQ15`). These are used for _processor-internal_ modules only (for example for the communication
(`FIRQ0` - `FIRQ15`). These are reserved for _processor-internal_ modules only (for example for the communication
interfaces to signal "available incoming data" or "ready to send new data").
 
The mapping of the 16 FIRQ channels is shown in the following table (the channel number also corresponds to
1013,16 → 1067,17
| 6 | <<_serial_peripheral_interface_controller_spi,SPI>> | SPI transmission done interrupt
| 7 | <<_two_wire_serial_interface_controller_twi,TWI>> | TWI transmission done interrupt
| 8 | <<_external_interrupt_controller_xirq,XIRQ>> | External interrupt controller interrupt
| 9 | <<_smart_led_interface_neoled,NEOLED>> | NEOLED buffer TX empty / not full interrupt
| 10 | <<_stream_link_interface_slink,SLINK>> | RX data received
| 11 | <<_stream_link_interface_slink,SLINK>> | TX data send
| 9 | <<_smart_led_interface_neoled,NEOLED>> | NEOLED TX buffer interrupt
| 10 | <<_stream_link_interface_slink,SLINK>> | RX data buffer interrupt
| 11 | <<_stream_link_interface_slink,SLINK>> | TX data buffer interrupt
| 12:15 | - | _reserved_, will never fire
|=======================
 
.Trigger type
[IMPORTANT]
The fast interrupt request channel trigger on a single **rising-edge** and do not require any kind of explicit
acknowledgment at all.
The fast interrupt request channels trigger on **high-level** and have to stay asserted until explicitly acknowledged
by the software (for example by writing to a specific memory-mapped register). Hence, pending interrupts remain pending
as long as the interrupt-causing device's state fulfills its interrupt condition(s).
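
A practical consequence of high-level triggering is that a handler which does not remove the device's interrupt condition will be entered again immediately. The following self-contained model illustrates this; all names are hypothetical and the "device" is reduced to a single condition flag.

[source,c]
----
#include <stdbool.h>

/* Hypothetical model of one level-triggered interrupt source. */
typedef struct {
    bool condition;   /* device-side interrupt condition, e.g. "RX data available" */
} irq_source_t;

/* With high-level triggering the request simply follows the device state. */
bool firq_is_pending(const irq_source_t *src)
{
    return src->condition;
}

/* Count how often the handler is entered until the request disappears
 * (bounded by max_entries so the model always terminates). */
unsigned firq_service(irq_source_t *src, bool handler_acknowledges, unsigned max_entries)
{
    unsigned entries = 0;
    while (firq_is_pending(src) && entries < max_entries) {
        entries++;
        if (handler_acknowledges) {
            src->condition = false;   /* e.g. read the RX data or write a control register */
        }
    }
    return entries;
}
----

In this model an acknowledging handler runs exactly once, while a handler that leaves the condition untouched is re-entered until the bound is reached.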
 
 
 
1031,21 → 1086,18
:sectnums:
=== Address Space
 
The NEORV32 Processor provides 32-bit physical addresses accessing up to 4GB of address space.
By default, this address space is divided into four main regions:
The NEORV32 Processor provides a 32-bit / 4GB (physical) address space.
By default, this address space is divided into five main regions:
 
1. **Instruction address space** – for instructions (=code) and constants. A configurable section of this address space is used by
internal and/or external _instruction memory_ (IMEM).
2. **Data address space** – for application runtime data (heap, stack, etc.). A configurable section of this address space is used by
internal and/or external _data memory_ (DMEM).
3. **Bootloader address space**. A _fixed_ section of this address space is used by
1. **Instruction address space** - memory address space for instructions (=code) and constants.
A configurable section of this address space is used by the internal/external _instruction memory_ (<<_mem_int_imem_size>> for the internal IMEM).
2. **Data address space** - memory address space for application runtime data (heap, stack, etc.).
A configurable section of this address space is used by the internal/external _data memory_ (<<_mem_int_dmem_size>> for the internal DMEM).
3. **Bootloader address space**. A _fixed_ section of this address space is used by the
internal _bootloader memory_ (BOOTLDROM).
4. **IO/peripheral address space** – for the processor-internal IO/peripheral devices (e.g., UART).
4. **On-Chip Debugger address space**. This _fixed_ section is entirely used by the processor's <<_on_chip_debugger_ocd>>.
5. **IO/peripheral address space**. Also a _fixed_ section used for the processor-internal memory-mapped IO/peripheral devices (e.g., UART).
 
[TIP]
These four memory regions are handled by the linker when compiling a NEORV32 executable.
See section <<_executable_image_format>> for more information.
 
.NEORV32 processor - address space (default configuration)
image::address_space.png[900]
 
1102,7 → 1154,7
[NOTE]
The base address of the internal bootloader (at _0xFFFF0000_) and the internal IO region (at _0xFFFFFE00_) for
peripheral devices are also defined in the package and are fixed. These address regions cannot be used for other
applications – even if the bootloader or all IO devices are not implemented - without modifying the core's
applications - even if the bootloader or all IO devices are not implemented - without modifying the core's
hardware sources.
 
 
1113,32 → 1165,36
Accessing a memory region in a way that violates any of these attributes will raise a corresponding
access exception.
 
* `r` – read access (from CPU data access interface, e.g. via "load")
* `w` – write access (from CPU data access interface, e.g. via "store")
* `x` – execute access (from CPU instruction fetch interface)
* `a` – atomic access (from CPU data access interface)
* `8` – byte (8-bit)-accessible (when writing)
* `16` – half-word (16-bit)-accessible (when writing)
* `32` – word (32-bit)-accessible (when writing)
* `r` - read access (from CPU data access interface, "loads")
* `w` - write access (from CPU data access interface, "stores")
* `x` - execute access (from CPU instruction fetch interface)
* `a` - atomic access (from CPU data access interface)
* `8` - byte (8-bit)-accessible (when writing)
* `16` - half-word (16-bit)-accessible (when writing)
* `32` - word (32-bit)-accessible (when writing)
 
[NOTE]
Read accesses (i.e. loads) can always access data in word, half-word and byte quantities (requiring an accordingly aligned address).
Read accesses (loads and instruction fetches) can always access data in
word, half-word (for instruction fetch only if `C` extension is enabled)
and byte (not for instruction fetch) quantities (requiring an accordingly aligned address).
 
[TIP]
The following table shows the _default hardware-defined_ physical memory attributes of each main address space region.
Additional user-defined attributes (for example certain read/write/execute rights for specific address space regions) can be
provided using the RISC-V <<_machine_physical_memory_protection>>.
 
[cols="^1,^2,^2,^3,^2"]
[options="header",grid="rows"]
|=======================
| # | Region | Base address | Size | Attributes
| 4 | IO/peripheral devices | 0xfffffe00 | 512 bytes | `r/w/a/32`
| 3 | bootloader ROM | 0xffff0000 | up to 32kB| `r/x/a`
| 2 | DMEM | 0x80000000 | up to 2GB (-64kB) | `r/w/x/a/8/16/32`
| 1 | IMEM | 0x00000000 | up to 2GB | `r/w/x/a/8/16/32`
| # | Region | Base address | Size | Attributes
| 5 | IO/peripheral devices | 0xfffffe00 | 512 bytes | `r/w/a/32`
| 4 | On-chip debugger | 0xfffff800 | 512 bytes | `r/w/x/32`
| 3 | Bootloader ROM | 0xffff0000 | up to 32kB | `r/x/a`
| 2 | DMEM | 0x80000000 | up to "2GB" | `r/w/x/a/8/16/32`
| 1 | IMEM | 0x00000000 | up to 2GB | `r/w/x/a/8/16/32`
|=======================
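
The default layout from the table above can be expressed as a simple address classifier. This is a sketch under assumptions: the enum and function names are hypothetical, the bootloader ROM is taken at its maximum size of 32kB, and addresses in the upper 64kB that fall between the fixed regions are reported as unassigned.

[source,c]
----
#include <stdint.h>

typedef enum {
    REGION_IMEM,        /* instruction address space, base 0x00000000 */
    REGION_DMEM,        /* data address space, base 0x80000000        */
    REGION_BOOTROM,     /* bootloader ROM, base 0xffff0000            */
    REGION_OCD,         /* on-chip debugger, base 0xfffff800          */
    REGION_IO,          /* IO/peripheral devices, base 0xfffffe00     */
    REGION_UNASSIGNED   /* gaps between the fixed regions (assumption) */
} region_t;

/* Hypothetical helper: map a 32-bit physical address to its default region. */
region_t classify_address(uint32_t addr)
{
    if (addr >= 0xfffffe00u) return REGION_IO;                              /* 512 bytes   */
    if (addr >= 0xfffff800u && addr < 0xfffffa00u) return REGION_OCD;       /* 512 bytes   */
    if (addr >= 0xffff0000u && addr < 0xffff8000u) return REGION_BOOTROM;   /* up to 32kB  */
    if (addr >= 0xffff0000u) return REGION_UNASSIGNED;
    if (addr >= 0x80000000u) return REGION_DMEM;
    return REGION_IMEM;
}
----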
 
[TIP]
The following table shows the provided physical memory attributes of each region. Additional attributes (for example
controlling certain right for specific address space regions) can be provided using the RISC-V <<_machine_physical_memory_protection>> extension.
 
 
:sectnums:
==== Memory Configuration
 
1263,7 → 1319,7
 
**Internal Reset Generator**
 
Most processor-internal modules – except for the CPU and the watchdog timer – do not have a dedicated
Most processor-internal modules - except for the CPU and the watchdog timer - do not have a dedicated
reset signal. However, all devices can be reset by software by clearing the corresponding unit's control
register. The automatically included application start-up code (`crt0.S`) will perform a software-reset of all
modules to ensure a clean system reset state.
/datasheet/soc_cfs.adoc
19,18 → 19,27
 
**Theory of Operation**
 
The custom functions subsystem can be used to implement application-specific user-defined co-processors
(like encryption or arithmetic accelerators) or peripheral/communication interfaces. In contrast to connecting
custom hardware accelerators via the external memory interface, the CFS provide a convenient and low-latency
extension and customization option.
The custom functions subsystem is meant for implementing application-specific user-defined co-processors or
IP footnote:[Intellectual property; proprietary circuit blocks.] blocks. The CFS provides up to 32x 32-bit memory-mapped
registers (`REG`, see register map table below) that can be accessed by the CPU via normal load/store operations.
The actual functionality of these registers has to be defined by the hardware designer. Furthermore, the CFS
provides two IO conduits to implement custom module- or chip-external interfaces.
 
The CFS provides up to 32x 32-bit memory-mapped registers `REG` (see register map table below). The actual
functionality of these register has to be defined by the hardware designer.
In contrast to connecting custom hardware accelerators via external memory interfaces (like SPI or the processor's
external bus interface), the CFS provides a convenient, low-latency and tightly-coupled extension and
customization option.
 
Just like any other externally-connected IP, logic implemented within the custom functions subsystem can operate
_independently_ of the CPU providing true parallel processing capabilities. Potential use cases might include
dedicated hardware accelerators for en-/decryption (AES), signal processing (FFT) or AI applications
(CNNs) as well as custom IO systems like fast memory interfaces (DDR) and mass storage (SDIO), networking (CAN)
or real-time data transport (I2S).
 
[NOTE]
Take a look at the template CFS VHDL source file (`rtl/core/neorv32_cfs.vhd`). The file is highly
commented to illustrate all aspects that are relevant for implementing custom CFS-based co-processor designs.
 
 
**CFS Software Access**
 
The CFS memory-mapped registers can be accessed by software using the provided C-language aliases (see
43,26 → 52,35
uint32_t temp = NEORV32_CFS.REG[20]; // read from CFS register 20
----
 
 
**CFS Interrupt**
 
The CFS provides a single one-shot interrupt request signal mapped to the CPU's fast interrupt channel 1.
See section <<_processor_interrupts>> for more information.
The CFS provides a single high-level-triggered interrupt request signal mapped to the CPU's fast interrupt channel 1.
Once set, the interrupt has to stay asserted until explicitly acknowledged by the software (for example by
writing to a specific CFS register). See section <<_processor_interrupts>> for more information.
 
 
**CFS Configuration Generic**
 
By default, the CFS provides a single 32-bit `std_(u)logic_vector` configuration generic _IO_CFS_CONFIG_
that is available in the processor's top entity. This generic can be used to pass custom configuration options
from the top entity down to the CFS entity.
from the top entity directly down to the CFS. The actual definition of the generic and its usage inside the
CFS is left to the hardware designer.
 
 
**CFS Custom IOs**
 
By default, the CFS also provides two unidirectional input and output conduits `cfs_in_i` and `cfs_out_o`.
These signals are propagated to the processor's top entity. The actual use of these signals has to be defined
by the hardware designer. The size of the input signal conduit `cfs_in_i` is defined via the (top's) _IO_CFS_IN_SIZE_ configuration
generic (default = 32-bit). The size of the output signal conduit `cfs_out_o` is defined via the (top's)
These signals are directly propagated to the processor's top entity. These conduits can be used to implement
application-specific interfaces like memory or network connections. The actual use case of these signals
has to be defined by the hardware designer.
 
The size of the input signal conduit `cfs_in_i` is defined via the top's _IO_CFS_IN_SIZE_ configuration
generic (default = 32-bit). The size of the output signal conduit `cfs_out_o` is defined via the top's
_IO_CFS_OUT_SIZE_ configuration generic (default = 32-bit). If the custom function subsystem is not implemented
(_IO_CFS_EN_ = false) the `cfs_out_o` signal is tied to all-zero.
 
 
.CFS register map (`struct NEORV32_CFS`)
[cols="^4,<5,^2,^3,<14"]
[options="header",grid="all"]
/datasheet/soc_imem.adoc
29,8 → 29,8
 
By default, the IMEM is implemented as RAM, so the content can be modified during run time. This is
required when using a bootloader that can update the content of the IMEM at any time. If you do not need
the bootloader anymore – since your application development has completed and you want the program to
permanently reside in the internal instruction memory – the IMEM is automatically implemented as _pre-intialized_
the bootloader anymore - since your application development has completed and you want the program to
permanently reside in the internal instruction memory - the IMEM is automatically implemented as _pre-initialized_
ROM when the processor-internal bootloader is disabled (_INT_BOOTLOADER_EN_ = _false_).
 
When the IMEM is implemented as ROM, it will be initialized during synthesis with the actual application
/datasheet/soc_mtime.adoc
26,9 → 26,10
and the machine timer CPU interrupt (`MTI`) is directly connected to the top's `mtime_irq_i` input.
 
The 64-bit system time can be accessed via the `TIME_LO` and `TIME_HI` memory-mapped registers (read/write) and also via
the CPU's `time[h]` CSRs (read-only). A 64-bit time compare register – accessible via memory-mapped `TIMECMP_LO` and `TIMECMP_HI`
registers – is used to configure an interrupt to the CPU. The interrupt is triggered
the CPU's `time[h]` CSRs (read-only). A 64-bit time compare register - accessible via memory-mapped `TIMECMP_LO` and `TIMECMP_HI`
registers - is used to configure an interrupt to the CPU. The interrupt is triggered
whenever `TIME` (high & low part) >= `TIMECMP` (high & low part) and is directly forwarded to the CPU's `MTI` interrupt.
The interrupt remains active (=pending) until `TIME` < `TIMECMP` (either by modifying `TIME` or `TIMECMP`).
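
The interrupt condition is a plain unsigned 64-bit comparison of the combined high and low words. A minimal sketch of this comparison (the function names are hypothetical, and the register values are passed in as arguments rather than read from hardware):

[source,c]
----
#include <stdbool.h>
#include <stdint.h>

/* Combine a high and a low 32-bit word into one 64-bit value. */
static uint64_t join64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}

/* Hypothetical model of the MTIME interrupt condition: TIME >= TIMECMP. */
bool mtime_irq_pending(uint32_t time_hi, uint32_t time_lo,
                       uint32_t timecmp_hi, uint32_t timecmp_lo)
{
    return join64(time_hi, time_lo) >= join64(timecmp_hi, timecmp_lo);
}
----

Note that comparing only the low words would give the wrong result as soon as the high words differ, which is why both halves have to be taken into account.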
 
.MTIME register map (`struct NEORV32_MTIME`)
[cols="<3,<3,^1,^1,<6"]
/datasheet/soc_neoled.adoc
30,7 → 30,7
NEOLED module provides 24-bit and 32-bit operating modes, a mixed setup with RGB LEDs (24-bit color)
and RGBW LEDs (32-bit color including a dedicated white LED chip) is possible.
 
**Theory of Operation – NEOLED Module**
**Theory of Operation - NEOLED Module**
 
The NEOLED module provides two accessible interface registers: the control register `CTRL` and the
TX data register `DATA`. The NEOLED module is globally enabled via the control register's
55,7 → 55,7
an arbitrary setup of RGB and RGBW LEDs.
 
 
**Theory of Operation – Protocol**
**Theory of Operation - Protocol**
 
The interface of the WS2812 LEDs uses an 800kHz carrier signal. Data is transmitted in a serial manner
starting with LSB-first. The intensity for each R, G & B (& W) LED chip (= color code) is defined via an 8-bit
97,7 → 97,7
WS2811).
 
 
**Timing Configuration – Example (WS2812)**
**Timing Configuration - Example (WS2812)**
 
Generate the base clock f~TX~ for the NEOLED TX engine:
 
159,20 → 159,31
the actually written data is irrelevant) will trigger an idle phase (`neoled_o` = zero) of 127 periods (= _**T~carrier~**_).
This idle time will cause the LEDs to strobe the color data into the PWM driver registers.
 
Since the _NEOLED_CTRL_STROBE_ flag is also buffered in the TX buffer, the RESET command is treated as just another
Since the _NEOLED_CTRL_STROBE_ flag is also buffered in the TX buffer, the RESET command is treated just as another
data word being written to the TX buffer making busy wait concepts obsolete and allowing maximum refresh rates.
 
 
**Interrupt**
 
The NEOLED modules features a single interrupt that is triggered whenever the TX FIFO's fill level
falls below _half-full_ level. In this case software can write up to _IO_NEOLED_TX_FIFO_/2 new data
words to `DATA` without checking the FIFO status flags.
The NEOLED module features a single interrupt that becomes pending based on the current TX buffer fill level.
The interrupt can only become pending if the NEOLED module is enabled. The specific interrupt condition
is configured via the _NEOLED_CTRL_IRQ_CONF_ bit in the control register `NEORV32_NEOLED.CTRL`.
 
This highly relaxes time constraints for sending a continuous data stream to the LEDs
(as an idle time beyond 50μs will trigger the LED's a RESET command).
If _NEOLED_CTRL_IRQ_CONF_ is cleared, an interrupt is generated whenever the TX FIFO is _less than half-full_.
In this case software can write up to _IO_NEOLED_TX_FIFO_/2 new data words to `DATA` without checking the FIFO
status flags. The interrupt request is cleared whenever the FIFO fill level is above _half-full_ level or if
the NEOLED module is disabled.
 
If _NEOLED_CTRL_IRQ_CONF_ is set, an interrupt is generated whenever the TX FIFO is _empty_. The interrupt
request is cleared again when the FIFO contains at least one data word.
 
[NOTE]
The _NEOLED_CTRL_IRQ_CONF_ is hardwired to one if _IO_NEOLED_TX_FIFO_ = 1 (-> IRQ if FIFO is empty).
 
If the FIFO is configured to contain only a single entry (_IO_NEOLED_TX_FIFO_ = 1) the interrupt
will become pending if the FIFO (which is just a single register providing simple _double-buffering_) is empty.
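
The interrupt conditions described in this section can be condensed into a small decision function. This is a behavioral sketch only: the function name and parameters are hypothetical, and "less than half-full" is modeled as `fifo_level < fifo_depth / 2`.

[source,c]
----
#include <stdbool.h>

/* Hypothetical model of the NEOLED interrupt condition.
 * enabled    : state of NEOLED_CTRL_EN
 * irq_conf   : state of NEOLED_CTRL_IRQ_CONF
 * fifo_level : number of entries currently stored in the TX FIFO
 * fifo_depth : value of the IO_NEOLED_TX_FIFO generic */
bool neoled_irq_pending(bool enabled, bool irq_conf,
                        unsigned fifo_level, unsigned fifo_depth)
{
    if (!enabled) {
        return false;                     /* no interrupt while the module is disabled */
    }
    if (irq_conf || fifo_depth == 1u) {   /* IRQ_CONF is hardwired to one for depth 1  */
        return fifo_level == 0u;          /* IRQ if FIFO is empty                      */
    }
    return fifo_level < fifo_depth / 2u;  /* IRQ if FIFO is less than half-full        */
}
----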
 
 
<<<
.NEOLED register map (`struct NEORV32_NEOLED`)
[cols="<4,<5,<9,^2,<9"]
179,30 → 190,35
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.25+<| `0xffffffd8` .25+<| `NEORV32_NEOLED.CTRL` <|`0` _NEOLED_CTRL_EN_ ^| r/w <| NEOLED enable
<|`1` _NEOLED_CTRL_MODE_ ^| r/w <| data transfer size; `0`=24-bit; `1`=32-bit
<|`2` _NEOLED_CTRL_STROBE_ ^| r/w <| `0`=send normal color data; `1`=send RESET command on data write access
<|`3` _NEOLED_CTRL_PRSC0_ ^| r/w <| 3-bit clock prescaler, bit 0
<|`4` _NEOLED_CTRL_PRSC1_ ^| r/w <| 3-bit clock prescaler, bit 1
<|`5` _NEOLED_CTRL_PRSC2_ ^| r/w <| 3-bit clock prescaler, bit 2
<|`6` _NEOLED_CTRL_BUFS0_ ^| r/- .4+<| 4-bit log2(_IO_NEOLED_TX_FIFO_)
<|`7` _NEOLED_CTRL_BUFS1_ ^| r/-
<|`8` _NEOLED_CTRL_BUFS2_ ^| r/-
<|`9` _NEOLED_CTRL_BUFS3_ ^| r/-
<|`10` _NEOLED_CTRL_T_TOT_0_ ^| r/w .5+<| 5-bit pulse clock ticks per total single-bit period (T~total~)
<|`11` _NEOLED_CTRL_T_TOT_1_ ^| r/w
<|`12` _NEOLED_CTRL_T_TOT_2_ ^| r/w
<|`13` _NEOLED_CTRL_T_TOT_3_ ^| r/w
<|`14` _NEOLED_CTRL_T_TOT_4_ ^| r/w
<|`20` _NEOLED_CTRL_ONE_H_0_ ^| r/w .5+<| 5-bit pulse clock ticks per high-time for sending a one-bit (T~H1~)
<|`21` _NEOLED_CTRL_ONE_H_1_ ^| r/w
<|`22` _NEOLED_CTRL_ONE_H_2_ ^| r/w
<|`23` _NEOLED_CTRL_ONE_H_3_ ^| r/w
<|`24` _NEOLED_CTRL_ONE_H_4_ ^| r/w
<|`30` _NEOLED_CTRL_TX_STATUS_ ^| r/- <| transmit engine busy when `1`
<|`31` _NEOLED_CTRL_TX_EMPTY_ ^| r/- <| TX FIFO is empty
<|`31` _NEOLED_CTRL_TX_HALF_ ^| r/- <| TX FIFO is _at least_ half full
<|`31` _NEOLED_CTRL_TX_FULL_ ^| r/- <| TX FIFO is full
<|`31` _NEOLED_CTRL_TX_BUSY_ ^| r/- <| TX serial engine is busy when set
.30+<| `0xffffffd8` .30+<| `NEORV32_NEOLED.CTRL` <|`0` _NEOLED_CTRL_EN_ ^| r/w <| NEOLED enable
<|`1` _NEOLED_CTRL_MODE_ ^| r/w <| data transfer size; `0`=24-bit; `1`=32-bit
<|`2` _NEOLED_CTRL_STROBE_ ^| r/w <| `0`=send normal color data; `1`=send RESET command on data write access
<|`3` _NEOLED_CTRL_PRSC0_ ^| r/w <| 3-bit clock prescaler, bit 0
<|`4` _NEOLED_CTRL_PRSC1_ ^| r/w <| 3-bit clock prescaler, bit 1
<|`5` _NEOLED_CTRL_PRSC2_ ^| r/w <| 3-bit clock prescaler, bit 2
<|`6` _NEOLED_CTRL_BUFS0_ ^| r/- .4+<| 4-bit log2(_IO_NEOLED_TX_FIFO_)
<|`7` _NEOLED_CTRL_BUFS1_ ^| r/-
<|`8` _NEOLED_CTRL_BUFS2_ ^| r/-
<|`9` _NEOLED_CTRL_BUFS3_ ^| r/-
<|`10` _NEOLED_CTRL_T_TOT_0_ ^| r/w .5+<| 5-bit pulse clock ticks per total single-bit period (T~total~)
<|`11` _NEOLED_CTRL_T_TOT_1_ ^| r/w
<|`12` _NEOLED_CTRL_T_TOT_2_ ^| r/w
<|`13` _NEOLED_CTRL_T_TOT_3_ ^| r/w
<|`14` _NEOLED_CTRL_T_TOT_4_ ^| r/w
<|`15` _NEOLED_CTRL_T_ZERO_H_0_ ^| r/w .5+<| 5-bit pulse clock ticks per high-time for sending a zero-bit (T~0H~)
<|`16` _NEOLED_CTRL_T_ZERO_H_1_ ^| r/w
<|`17` _NEOLED_CTRL_T_ZERO_H_2_ ^| r/w
<|`18` _NEOLED_CTRL_T_ZERO_H_3_ ^| r/w
<|`19` _NEOLED_CTRL_T_ZERO_H_4_ ^| r/w
<|`20` _NEOLED_CTRL_T_ONE_H_0_ ^| r/w .5+<| 5-bit pulse clock ticks per high-time for sending a one-bit (T~1H~)
<|`21` _NEOLED_CTRL_T_ONE_H_1_ ^| r/w
<|`22` _NEOLED_CTRL_T_ONE_H_2_ ^| r/w
<|`23` _NEOLED_CTRL_T_ONE_H_3_ ^| r/w
<|`24` _NEOLED_CTRL_T_ONE_H_4_ ^| r/w
<|`27` _NEOLED_CTRL_IRQ_CONF_ ^| r/w <| TX FIFO interrupt configuration: `0`=IRQ if FIFO is less than half-full, `1`=IRQ if FIFO is empty
<|`28` _NEOLED_CTRL_TX_EMPTY_ ^| r/- <| TX FIFO is empty
<|`29` _NEOLED_CTRL_TX_HALF_ ^| r/- <| TX FIFO is _at least_ half full
<|`30` _NEOLED_CTRL_TX_FULL_ ^| r/- <| TX FIFO is full
<|`31` _NEOLED_CTRL_TX_BUSY_ ^| r/- <| TX serial engine is busy when set
| `0xffffffdc` | `NEORV32_NEOLED.DATA` <|`31:0` / `23:0` ^| -/w <| TX data (32-/24-bit)
|=======================
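
To illustrate the bit layout of the control register, the following sketch assembles a `CTRL` value from its fields. The bit positions are copied from the table above into a local enum so the example is self-contained; the function names and the argument values used below are hypothetical and do not represent a recommended timing configuration.

[source,c]
----
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the register map table (lowest bit of each field). */
enum {
    NEOLED_CTRL_EN         = 0,
    NEOLED_CTRL_MODE       = 1,
    NEOLED_CTRL_PRSC0      = 3,
    NEOLED_CTRL_T_TOT_0    = 10,
    NEOLED_CTRL_T_ZERO_H_0 = 15,
    NEOLED_CTRL_T_ONE_H_0  = 20,
    NEOLED_CTRL_IRQ_CONF   = 27,
    NEOLED_CTRL_TX_FULL    = 30
};

/* Hypothetical helper: compose a control word with the module enabled. */
uint32_t neoled_ctrl_word(unsigned prsc, unsigned t_total, unsigned t_zero_h,
                          unsigned t_one_h, bool mode32, bool irq_on_empty)
{
    uint32_t ctrl = 0;
    ctrl |= (uint32_t)1u << NEOLED_CTRL_EN;
    ctrl |= (uint32_t)(mode32 ? 1u : 0u) << NEOLED_CTRL_MODE;         /* 0=24-bit, 1=32-bit */
    ctrl |= (uint32_t)(prsc & 0x07u) << NEOLED_CTRL_PRSC0;            /* 3-bit prescaler    */
    ctrl |= (uint32_t)(t_total & 0x1fu) << NEOLED_CTRL_T_TOT_0;       /* 5-bit T_total      */
    ctrl |= (uint32_t)(t_zero_h & 0x1fu) << NEOLED_CTRL_T_ZERO_H_0;   /* 5-bit T_0H         */
    ctrl |= (uint32_t)(t_one_h & 0x1fu) << NEOLED_CTRL_T_ONE_H_0;     /* 5-bit T_1H         */
    ctrl |= (uint32_t)(irq_on_empty ? 1u : 0u) << NEOLED_CTRL_IRQ_CONF;
    return ctrl;
}

/* Decode the read-only "TX FIFO is full" status flag from a control word. */
bool neoled_tx_fifo_full(uint32_t ctrl)
{
    return ((ctrl >> NEOLED_CTRL_TX_FULL) & 1u) != 0u;
}
----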
/datasheet/soc_slink.adoc
44,6 → 44,7
via the external memory bus interface or use some of the processor's GPIO ports to implement custom data
tag signals.
 
 
**Theory of Operation**
 
The SLINK provides eight data registers (`DATA[i]`) to access the links (read accesses will access the RX links, write
82,6 → 83,7
from an _empty_ RX link. Hence, this concept should only be used when evaluating the half-full FIFO condition
(for example via the SLINK interrupts) before actual accessing links.
 
 
**Non-Blocking Link Access**
 
For a non-blocking link access concept, the FIFO status flags in `STATUS` need to be checked _before_
94,25 → 96,7
However, non-blocking accesses require additional instructions to check the according status flags prior
to the actual link access, which will reduce performance for high-bandwidth data streams.
 
**Interrupts**
 
The stream interface provides two interrupts that are _globally_ driven by the RX and TX link's
FIFO fill level status. The behavior of these interrupts differs if the FIFO depth is exactly 1 (minimal)
or if it is greater than 1.
 
When _SLINK_*X_FIFO_ is 1 a TX interrupt will fire if **any** TX link _was full_ and _becomes empty_ again.
Accordingly, if the FIFO of **any** RX link _was empty_ and a _new data word_ appears in it, the RX interrupt fires.
 
When _SLINK_*X_FIFO_ is greater than 1 the TX interrupt will fire if _any_ TX link's FIFO _falls below_ half-full fill level.
Accordingly, the RX interrupt will fire if _any_ RX link's FIFO _exceeds_ half-full fill level.
 
The interrupt service handler has to evaluate the SLINK status register is order to detect which link(s) has caused the
interrupt. No further interrupt can fire until the CPU acknowledges the last interrupt by _reading the SLINK status register_.
However, further IRQ conditions are buffered and will trigger another interrupt after the current one has been acknowledged.
 
Note that these interrupts can only fire if the SLINK module is actually enabled by setting the
_SLINK_CTRL_EN_ bit in the unit's control register.
 
**Stream Link Interface & Protocol**
 
The SLINK interface consists of three signals `dat`, `val` and `rdy` for each RX and TX link.
133,6 → 117,44
[TIP]
The SLINK handshake protocol is compatible with the https://developer.arm.com/documentation/ihi0051/a/Introduction/About-the-AXI4-Stream-protocol[AXI4-Stream] base protocol.
 
 
**Interrupts**
 
The stream interface provides two independent interrupts that are _globally_ driven by the RX and TX link's
FIFO fill level status. Each RX and TX link provides an individual interrupt enable flag and an individual
interrupt type flag, which allows configuring interrupts only for certain (or all) links and for
application-specific interrupt conditions. The interrupt configuration is done using the `NEORV32_SLINK.IRQ` register.
Any interrupt can only become pending if the SLINK module is enabled at all.
 
The current FIFO fill-level of a specific **RX link** can only raise an interrupt request if its interrupt enable flag
_SLINK_IRQ_RX_EN_ is set. Vice versa, the current FIFO fill-level of a specific **TX link** can only raise an interrupt
request if its interrupt enable flag _SLINK_IRQ_TX_EN_ is set.
 
The **RX link's** _SLINK_IRQ_RX_MODE_ flags define the FIFO fill-level condition for raising an RX interrupt request:
* If a link's interrupt mode flag is `1` an IRQ is generated when the link's FIFO is _not empty_ ("RX data available").
* If a link's interrupt mode flag is `0` an IRQ is generated when the link's FIFO is _at least half-full_ ("time to get data from RX FIFO to prevent overflow").
 
The **TX link's** _SLINK_IRQ_TX_MODE_ flags define the FIFO fill-level condition for raising a TX interrupt request:
* If a link's interrupt mode flag is `1` an IRQ is generated when the link's FIFO is _not full_ ("space left in FIFO for new TX data").
* If a link's interrupt mode flag is `0` an IRQ is generated when the link's FIFO is _less than half-full_ ("SW can send _SLINK_TX_FIFO_/2 data words without checking any flags").
 
[NOTE]
If _SLINK_RX_FIFO_ is 1 the _SLINK_IRQ_RX_MODE_ bits are hardwired to one.
If _SLINK_TX_FIFO_ is 1 the _SLINK_IRQ_TX_MODE_ bits are hardwired to one.
 
[NOTE]
There is no RX FIFO overflow mechanism available yet.
 
If _any_ configured interrupt condition is fulfilled, the according global SLINK RX / SLINK TX CPU
interrupt becomes pending.
If the interrupt enable flags of several links are set, the interrupt service handler has to evaluate the SLINK
status register in order to detect which link(s) caused the interrupt.
 
[NOTE]
If the programmed interrupt condition is fulfilled, the corresponding IRQ will remain _pending_ until
the causing interrupt condition is resolved (for example by reading data from the according RX FIFO).
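
The field layout of `NEORV32_SLINK.IRQ` described above can be sketched in C as follows. This is only an illustration: the `EX_`-prefixed constants and the helper `ex_slink_irq_word()` are hypothetical names that are not part of the NEORV32 software framework; only the bit positions are taken from the register map below.

```c
#include <stdint.h>

/* Bit offsets of the NEORV32_SLINK.IRQ fields (from the register map).
 * The EX_ names are illustrative only, not part of the official headers. */
#define EX_SLINK_IRQ_TX_MODE_LSB  0u  /* bits  7:0  : TX IRQ mode, link 7..0   */
#define EX_SLINK_IRQ_TX_EN_LSB    8u  /* bits 15:8  : TX IRQ enable, link 7..0 */
#define EX_SLINK_IRQ_RX_MODE_LSB 16u  /* bits 23:16 : RX IRQ mode, link 7..0   */
#define EX_SLINK_IRQ_RX_EN_LSB   24u  /* bits 31:24 : RX IRQ enable, link 7..0 */

/* Hypothetical helper: pack four per-link bit masks (bit i = link i)
 * into one 32-bit value suitable for writing to the IRQ register. */
static uint32_t ex_slink_irq_word(uint8_t rx_en, uint8_t rx_mode,
                                  uint8_t tx_en, uint8_t tx_mode) {
  return ((uint32_t)rx_en   << EX_SLINK_IRQ_RX_EN_LSB)   |
         ((uint32_t)rx_mode << EX_SLINK_IRQ_RX_MODE_LSB) |
         ((uint32_t)tx_en   << EX_SLINK_IRQ_TX_EN_LSB)   |
         ((uint32_t)tx_mode << EX_SLINK_IRQ_TX_MODE_LSB);
}
```

For example, `ex_slink_irq_word(0x01, 0x01, 0x00, 0x00)` enables the RX interrupt of link 0 in "RX data available" mode and leaves all TX interrupts disabled.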
 
 
.SLINK register map (`struct NEORV32_SLINK`)
[cols="^4,<5,^2,^2,<14"]
[options="header",grid="all"]
144,12 → 166,17
<| `11:8` _SLINK_CTRL_RX_FIFO_S3_ : _SLINK_CTRL_RX_FIFO_S0_ ^| r/- <| RX links FIFO depth, log2 of _SLINK_RX_FIFO_ generic
<| `7:4` _SLINK_CTRL_TX_NUM3_ : _SLINK_CTRL_TX_NUM0_ ^| r/- <| Number of implemented TX links
<| `3:0` _SLINK_CTRL_RX_NUM3_ : _SLINK_CTRL_RX_NUM0_ ^| r/- <| Number of implemented RX links
| `0xfffffec4` : `0xfffffeec` | - |`31:0` | | _reserved_
.4+<| `0xfffffed0` .4+<| `NEORV32_SLINK.STATUS` <| `31:24` _SLINK_STATUS_TX7_HALF_ : _SLINK_STATUS_TX0_HALF_ ^| r/- | TX link 7..0 FIFO fill level is > half-full
| `0xfffffec4` | - |`31:0` | r/- | _reserved_
.4+<| `0xfffffec8` .4+<| `NEORV32_SLINK.IRQ` <|`31:24` _SLINK_IRQ_RX_EN_MSB_ : _SLINK_IRQ_RX_EN_LSB_ ^| r/w <| RX interrupt enable for link 7..0
<|`23:16` _SLINK_IRQ_RX_MODE_MSB_ : _SLINK_IRQ_RX_MODE_LSB_ ^| r/w <| RX IRQ mode for link 7..0: `0` = FIFO at least half-full; `1` = FIFO not empty
<|`15:8` _SLINK_IRQ_TX_EN_MSB_ : _SLINK_IRQ_TX_EN_LSB_ ^| r/w <| TX interrupt enable for link 7..0
<|`7:0` _SLINK_IRQ_TX_MODE_MSB_ : _SLINK_IRQ_TX_MODE_LSB_ ^| r/w <| TX IRQ mode for link 7..0: `0` = FIFO less than half-full; `1` = FIFO not full
| `0xfffffeec` | - |`31:0` | r/- | _reserved_
.4+<| `0xfffffed0` .4+<| `NEORV32_SLINK.STATUS` <| `31:24` _SLINK_STATUS_TX7_HALF_ : _SLINK_STATUS_TX0_HALF_ ^| r/- <| TX link 7..0 FIFO fill level is >= half-full
<| `23:16` _SLINK_STATUS_RX7_HALF_ : _SLINK_STATUS_RX0_HALF_ ^| r/- <| RX link 7..0 FIFO fill level is >= half-full
<| `15:8` _SLINK_STATUS_TX7_FREE_ : _SLINK_STATUS_TX0_FREE_ ^| r/- <| At least one free TX FIFO entry available for link 7..0
<| `7:0` _SLINK_STATUS_RX7_AVAIL_ : _SLINK_STATUS_RX0_AVAIL_ ^| r/- <| At least one data word in RX FIFO available for link 7..0
| `0xfffffed4` : `0xfffffedc` | - |`31:0` | | _reserved_
| `0xfffffed4` : `0xfffffedc` | - |`31:0` | r/- | _reserved_
| `0xfffffee0` | `NEORV32_SLINK.DATA[0]` | `31:0` | r/w | Link 0 RX/TX data
| `0xfffffee4` | `NEORV32_SLINK.DATA[1]` | `31:0` | r/w | Link 1 RX/TX data
| `0xfffffee8` | `NEORV32_SLINK.DATA[2]` | `31:0` | r/w | Link 2 RX/TX data
/datasheet/soc_spi.adoc
9,31 → 9,77
| Software driver file(s): | neorv32_spi.c |
| | neorv32_spi.h |
| Top entity port: | `spi_sck_o` | 1-bit serial clock output
| | `spi_sdo_i` | 1-bit serial data output
| | `spi_sdi_o` | 1-bit serial data input
| | `spi_sdo_o` | 1-bit serial data output
| | `spi_sdi_i` | 1-bit serial data input
| | `spi_csn_o` | 8-bit dedicated chip select (low-active)
| Configuration generics: | _IO_SPI_EN_ | implement SPI controller when _true_
| CPU interrupts: | fast IRQ channel 6 | transmission done interrupt (see <<_processor_interrupts>>)
|=======================
 
 
**Theory of Operation**
 
SPI is a synchronous serial transmission interface. The NEORV32 SPI transceiver allows 8-, 16-, 24- and 32-
bit long transmissions. The unit provides 8 dedicated chip select signals via the top entity's `spi_csn_o`
signal.
SPI is a synchronous serial transmission interface for fast on-board communications.
The NEORV32 SPI transceiver supports 8-, 16-, 24- and 32-bit wide transmissions.
The unit provides 8 dedicated chip select signals via the top entity's `spi_csn_o` signal, which are
directly controlled by the SPI module (no additional GPIO required).
 
The SPI unit is enabled via the _SPI_CTRL_EN_ bit in the `CTRL` control register. The idle clock polarity is configured via the _SPI_CTRL_CPHA_
bit and can be low (`0`) or high (`1`) during idle. The data quantity to be transferred within a
single transmission is defined via the _SPI_CTRL_SIZEx bits_. The unit supports 8-bit (`00`), 16-bit (`01`), 24-
bit (`10`) and 32-bit (`11`) transfers. Whenever a transfer is completed, the "transmission done interrupt" is triggered.
A transmission is still in progress as long as the _SPI_CTRL_BUSY_ flag is set.
The SPI unit is enabled by setting the _SPI_CTRL_EN_ bit in the `CTRL` control register. No transfer can be initiated
and no interrupt request will be triggered if this bit is cleared. Furthermore, a transfer in progress
can be terminated at any time by clearing this bit.
 
The SPI controller features 8 dedicated chip-select lines. These lines are controlled via the control register's _SPI_CTRL_CSx_ bits. When
a specific _SPI_CTRL_CSx_ bit is **set**, the according chip select line `spi_csn_o(x)` goes **low** (low-active chip select lines).
The data quantity to be transferred within a single transmission is defined via the _SPI_CTRL_SIZEx_ bits.
The SPI module supports 8-bit (`00`), 16-bit (`01`), 24-
bit (`10`) and 32-bit (`11`) transfers.
 
The SPI clock frequency is defined via the 3-bit _SPI_CTRL_PRSCx_ clock prescaler. The following prescalers
are available:
A transmission is started when writing data to the `DATA` register. The data must be LSB-aligned. So if
the SPI transceiver is configured for transfers of less than 32 bits, the transmit data must be placed
into the lowest 8/16/24 bits of `DATA`. Vice versa, the received data is also always LSB-aligned. Application
software should only process the number of bits that were configured using _SPI_CTRL_SIZEx_ when
reading `DATA`.
 
The SPI controller features 8 dedicated chip-select lines. These lines are controlled via the control register's
_SPI_CTRL_CSx_ bits. When a specific _SPI_CTRL_CSx_ bit is **set**, the according chip-select line `spi_csn_o(x)`
goes **low** (low-active chip-select lines).
 
[IMPORTANT]
Changes to the `CTRL` control register should be made only when the SPI module is idle as they directly affect
transmissions in progress.
 
[TIP]
The actual transmission length is left to the user: after asserting chip-select an arbitrary number of
transmissions with arbitrary data quantity (_SPI_CTRL_SIZEx_) can be made before de-asserting chip-select again.
 
[NOTE]
The NEORV32 SPI module only supports _host mode_. Transmissions are initiated only by the processor's SPI module
(and not by an external SPI module).
 
[NOTE]
The NEORV32 SPI module only supports MSB-first mode. Data can be reversed before writing `DATA` (for TX) / after
reading `DATA` (for RX) to provide LSB-first transmissions.
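
The bit reversal mentioned in the note above could be done in software as sketched below. The helper `ex_spi_reverse_bits()` is a hypothetical name, not a function of the NEORV32 SPI driver. It reverses only the lowest `nbits` bits, so the result stays LSB-aligned as required for `DATA`.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the order of the lowest `nbits` bits of
 * `word` (nbits = 8, 16, 24 or 32 to match the configured transfer size).
 * Bits above `nbits` are discarded; the result is LSB-aligned. */
static uint32_t ex_spi_reverse_bits(uint32_t word, unsigned nbits) {
  uint32_t out = 0;
  for (unsigned i = 0; i < nbits; i++) {
    out = (out << 1) | (word & 1u); /* move current LSB of `word` into `out` */
    word >>= 1;
  }
  return out;
}
```

Application software would apply this helper to the transmit word before writing `DATA` and to the value read from `DATA` after the transfer has completed.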
 
 
**SPI Clock Configuration**
 
The SPI module supports all _standard SPI clock modes_ (0, 1, 2, 3), which are configured via the two control register bits
_SPI_CTRL_CPHA_ and _SPI_CTRL_CPOL_. The _SPI_CTRL_CPHA_ bit defines the _clock phase_ and the _SPI_CTRL_CPOL_
bit defines the _clock polarity_.
 
.SPI clock modes; image from https://en.wikipedia.org/wiki/File:SPI_timing_diagram2.svg (license: (Wikimedia) https://en.wikipedia.org/wiki/Creative_Commons[Creative Commons] https://creativecommons.org/licenses/by-sa/3.0/deed.en[Attribution-Share Alike 3.0 Unported])
image::SPI_timing_diagram2.wikimedia.png[]
 
.SPI standard clock modes
[cols="<2,^1,^1,^1,^1"]
[options="header",grid="rows"]
|=======================
| | Mode 0 | Mode 1 | Mode 2 | Mode 3
| _SPI_CTRL_CPOL_ | `0` | `0` | `1` | `1`
| _SPI_CTRL_CPHA_ | `0` | `1` | `0` | `1`
|=======================
 
The SPI clock frequency (`spi_sck_o`) is programmed by the 3-bit _SPI_CTRL_PRSCx_ clock prescaler.
The following prescalers are available:
 
.SPI prescaler configuration
[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"]
[options="header",grid="rows"]
42,34 → 88,43
| Resulting `clock_prescaler` | 2 | 4 | 8 | 64 | 128 | 1024 | 2048 | 4096
|=======================
 
Based on the _SPI_CTRL_PRSCx_ configuration, the actual SPI clock frequency f~SPI~ is derived from the processor's main clock f~main~ and is determined by:
Based on the _SPI_CTRL_PRSCx_ configuration, the actual SPI clock frequency f~SPI~ is derived from the processor's
main clock f~main~ and is determined by:
 
_**f~SPI~**_ = _f~main~[Hz]_ / (2 * `clock_prescaler`)
 
A transmission is started when writing data to the `DATA` register. The data must be LSB-aligned. So if
the SPI transceiver is configured for less than 32-bit transfers data quantity, the transmit data must be placed
into the lowest 8/16/24 bit of `DATA`. Vice versa, the received data is also always LSB-aligned.
Hence, the maximum SPI clock is f~main~ / 4.
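
As a worked example of the formula above, the following sketch computes f~SPI~ for a given main clock and prescaler selection. The function name `ex_spi_clock_hz()` is hypothetical, and the mapping of the 3-bit _SPI_CTRL_PRSCx_ value (0..7) to the listed prescalers in ascending order is an assumption based on the prescaler table.

```c
#include <stdint.h>

/* Resulting clock_prescaler values from the prescaler table; indexing them
 * by the 3-bit PRSC selection in ascending order is an assumption. */
static const uint16_t ex_spi_prescaler[8] = {2, 4, 8, 64, 128, 1024, 2048, 4096};

/* Hypothetical helper: f_SPI = f_main / (2 * clock_prescaler), integer division. */
static uint32_t ex_spi_clock_hz(uint32_t f_main_hz, unsigned prsc_sel) {
  return f_main_hz / (2u * ex_spi_prescaler[prsc_sel & 7u]);
}
```

With a 100 MHz main clock, the smallest prescaler (2) gives 25 MHz, which matches the f~main~ / 4 upper bound.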
 
 
**SPI Interrupt**
 
The SPI module provides a single interrupt to signal "ready for new transmission" to the CPU. Whenever the SPI
module is currently idle (and enabled), the interrupt request is active. A pending interrupt request is cleared
by triggering a new SPI transmission or by disabling the SPI module.
 
 
.SPI register map (`struct NEORV32_SPI`)
[cols="<2,<2,<4,^1,<7"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.16+<| `0xffffffa8` .16+<| `NEORV32_SPI.CTRL` <|`0` _SPI_CTRL_CS0_ ^| r/w .8+<| Direct chip-select 0..7; setting `spi_csn_o(x)` low when set
<|`1` _SPI_CTRL_CS1_ ^| r/w
<|`2` _SPI_CTRL_CS2_ ^| r/w
<|`3` _SPI_CTRL_CS3_ ^| r/w
<|`4` _SPI_CTRL_CS4_ ^| r/w
<|`5` _SPI_CTRL_CS5_ ^| r/w
<|`6` _SPI_CTRL_CS6_ ^| r/w
<|`7` _SPI_CTRL_CS7_ ^| r/w
<|`8` _SPI_CTRL_EN_ ^| r/w <| SPI enable
<|`9` _SPI_CTRL_CPHA_ ^| r/w <| polarity of `spi_sck_o` when idle
<|`10` _SPI_CTRL_PRSC0_ ^| r/w .3+| 3-bit clock prescaler select
<|`11` _SPI_CTRL_PRSC1_ ^| r/w
<|`12` _SPI_CTRL_PRSC2_ ^| r/w
<|`14` _SPI_CTRL_SIZE0_ ^| r/w .2+<| transfer size (`00`=8-bit, `01`=16-bit, `10`=24-bit, `11`=32-bit)
<|`15` _SPI_CTRL_SIZE1_ ^| r/w
<|`31` _SPI_CTRL_BUSY_ ^| r/- <| transmission in progress when set
.18+<| `0xffffffa8` .18+<| `NEORV32_SPI.CTRL` <|`0` _SPI_CTRL_CS0_ ^| r/w .8+<| Direct chip-select 0..7; setting `spi_csn_o(x)` low when set
<|`1` _SPI_CTRL_CS1_ ^| r/w
<|`2` _SPI_CTRL_CS2_ ^| r/w
<|`3` _SPI_CTRL_CS3_ ^| r/w
<|`4` _SPI_CTRL_CS4_ ^| r/w
<|`5` _SPI_CTRL_CS5_ ^| r/w
<|`6` _SPI_CTRL_CS6_ ^| r/w
<|`7` _SPI_CTRL_CS7_ ^| r/w
<|`8` _SPI_CTRL_EN_ ^| r/w <| SPI enable
<|`9` _SPI_CTRL_CPHA_ ^| r/w <| clock phase (`0`=sample RX on rising edge & update TX on falling edge; `1`=sample RX on falling edge & update TX on rising edge)
<|`10` _SPI_CTRL_PRSC0_ ^| r/w .3+| 3-bit clock prescaler select
<|`11` _SPI_CTRL_PRSC1_ ^| r/w
<|`12` _SPI_CTRL_PRSC2_ ^| r/w
<|`13` _SPI_CTRL_SIZE0_ ^| r/w .2+<| transfer size (`00`=8-bit, `01`=16-bit, `10`=24-bit, `11`=32-bit)
<|`14` _SPI_CTRL_SIZE1_ ^| r/w
<|`15` _SPI_CTRL_CPOL_ ^| r/w <| clock polarity
<|`16` .. `30` ^| r/- <| _reserved_, read as zero
<|`31` _SPI_CTRL_BUSY_ ^| r/- <| transmission in progress when set
| `0xffffffac` | `NEORV32_SPI.DATA` |`31:0` | r/w | receive/transmit data, LSB-aligned
|=======================
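
To illustrate how the `CTRL` fields of the revised register map fit together, the following sketch builds a control word from a clock mode, a prescaler selection and a transfer size. All `EX_` constants and both helpers are hypothetical and chosen for illustration only; the bit positions are the ones listed for the new revision (SIZE at bits 13..14, CPOL at bit 15).

```c
#include <stdint.h>

/* Bit positions from the SPI register map; EX_ names are illustrative only. */
enum {
  EX_SPI_CTRL_CS0   = 0,  /* chip-select 0 (CS1..CS7 follow at bits 1..7) */
  EX_SPI_CTRL_EN    = 8,  /* SPI enable */
  EX_SPI_CTRL_CPHA  = 9,  /* clock phase */
  EX_SPI_CTRL_PRSC0 = 10, /* 3-bit clock prescaler select, LSB */
  EX_SPI_CTRL_SIZE0 = 13, /* 2-bit transfer size, LSB */
  EX_SPI_CTRL_CPOL  = 15  /* clock polarity */
};

/* Hypothetical helper: build a CTRL value with the module enabled.
 * mode       : standard SPI clock mode 0..3 (bit 1 = CPOL, bit 0 = CPHA)
 * prsc_sel   : prescaler selection 0..7
 * size_bytes : transfer size in bytes, 1..4 (encoded as 00..11) */
static uint32_t ex_spi_ctrl_word(unsigned mode, unsigned prsc_sel, unsigned size_bytes) {
  uint32_t ctrl = 1u << EX_SPI_CTRL_EN;
  ctrl |= (uint32_t)(mode & 1u)               << EX_SPI_CTRL_CPHA;
  ctrl |= (uint32_t)((mode >> 1) & 1u)        << EX_SPI_CTRL_CPOL;
  ctrl |= (uint32_t)(prsc_sel & 7u)           << EX_SPI_CTRL_PRSC0;
  ctrl |= (uint32_t)((size_bytes - 1u) & 3u)  << EX_SPI_CTRL_SIZE0;
  return ctrl;
}

/* Hypothetical helper: set the CSx bit, which drives spi_csn_o(x) low. */
static uint32_t ex_spi_cs_assert(uint32_t ctrl, unsigned cs) {
  return ctrl | (1u << (EX_SPI_CTRL_CS0 + (cs & 7u)));
}
```

Since changes to `CTRL` affect transfers in progress, such a value should only be written while the module is idle.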
/datasheet/soc_sysinfo.adoc
17,7 → 17,7
The SYSINFO allows the application software to determine the setting of most of the processor's top entity
generics that are related to processor/SoC configuration. All registers of this unit are read-only.
 
This device is always implemented – regardless of the actual hardware configuration. The bootloader as well
This device is always implemented - regardless of the actual hardware configuration. The bootloader as well
as the NEORV32 software runtime environment require information from this device (like memory layout
and default clock speed) for correct operation.
 
/datasheet/soc_twi.adoc
16,9 → 16,9
 
**Theory of Operation**
 
The two wire interface – also called "I²C" – is a quite famous interface for connecting several on-board
The two wire interface - also called "I²C" - is a quite famous interface for connecting several on-board
components. Since this interface only needs two signals (the serial data line `twi_sda_io` and the serial
clock line `twi_scl_io`) – despite of the number of connected devices – it allows easy interconnections of
clock line `twi_scl_io`) - regardless of the number of connected devices - it allows easy interconnections of
several peripheral nodes.
 
The NEORV32 TWI implements a **TWI controller**. It features "clock stretching" (if enabled via the control
65,6 → 65,14
 
_**f~SCL~**_ = _f~main~[Hz]_ / (4 * `clock_prescaler`)
 
 
**Interrupt**
 
The TWI module provides a single interrupt to signal _idle state_ (= ready for new transmission) to the CPU. Whenever the TWI module
is currently idle (and enabled), the interrupt request is active. A pending interrupt request is cleared
by triggering a new TWI transmission or by disabling the device.
 
 
.TWI register map (`struct NEORV32_TWI`)
[cols="<2,<2,<4,^1,<7"]
[options="header",grid="all"]
/datasheet/soc_uart.adoc
12,30 → 12,37
| | `uart0_rxd_i` | serial receiver input UART0
| | `uart0_rts_o` | flow control: RX ready to receive
| | `uart0_cts_i` | flow control: TX allowed to send
| Configuration generics: | _IO_UART0_EN_ | implement UART0 when _true_
| CPU interrupts: | fast IRQ channel 2 | RX done interrupt
| | fast IRQ channel 3 | TX done interrupt (see <<_processor_interrupts>>)
| Configuration generics: | _IO_UART0_EN_ | implement UART0 when _true_
| | _UART0_RX_FIFO_ | RX FIFO depth (power of 2, min 1)
| | _UART0_TX_FIFO_ | TX FIFO depth (power of 2, min 1)
| CPU interrupts: | fast IRQ channel 2 | RX interrupt
| | fast IRQ channel 3 | TX interrupt (see <<_processor_interrupts>>)
|=======================
 
[IMPORTANT]
The UART is a standard serial interface mainly used to establish a communication channel between a host
computer/user and an application running on the embedded processor.
 
The NEORV32 UARTs feature independent transmitter and receiver with a fixed frame configuration of 8 data bits,
an optional parity bit (even or odd) and a fixed stop bit. The actual transmission rate - the Baudrate - is
programmable via software. Optional FIFOs with custom sizes can be configured for the transmitter and receiver
independently.
 
The UART features two memory-mapped registers `CTRL` and `DATA`, which are used for configuration, status
check and data transfer.
 
[NOTE]
Please note that ALL default example programs and software libraries of the NEORV32 software
framework (including the bootloader and the runtime environment) use the primary UART
(_UART0_) as default user console interface. For compatibility, all C-language function calls to
`neorv32_uart_*` are mapped to the according primary UART (_UART0_) `neorv32_uart0_*`
functions.
(_UART0_) as default user console interface.
 
 
**Theory of Operation**
 
In most cases, the UART is a standard interface used to establish a communication channel between the
computer/user and an application running on the processor platform. The NEORV32 UARTs features a
standard configuration frame configuration: 8 data bits, an optional parity bit (even or odd) and 1 stop bit.
The parity and the actual Baudrate are configurable by software.
UART0 is enabled by setting the _UART_CTRL_EN_ bit in the UART0 control register `CTRL`. The Baudrate
is configured via a 12-bit _UART_CTRL_BAUDxx_ baud prescaler (`baud_prsc`) and a 3-bit _UART_CTRL_PRSCx_
clock prescaler (`clock_prescaler`) that scales the processor's primary clock (_f~main~_).
 
The UART0 is enabled by setting the _UART_CTRL_EN_ bit in the UART control register `CTRL`. The actual
transmission Baudrate (like 19200) is configured via the 12-bit _UART_CTRL_BAUDxx_ baud prescaler (`baud_rate`) and the
3-bit _UART_CTRL_PRSCx_ clock prescaler.
 
.UART prescaler configuration
.UART0 prescaler configuration
[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"]
[options="header",grid="rows"]
|=======================
43,109 → 50,129
| Resulting `clock_prescaler` | 2 | 4 | 8 | 64 | 128 | 1024 | 2048 | 4096
|=======================
 
_**Baudrate**_ = (_f~main~[Hz]_ / `clock_prescaler`) / (`baud_rate` + 1)
_**Baudrate**_ = (_f~main~[Hz]_ / `clock_prescaler`) / (`baud_prsc` + 1)
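
The Baudrate equation can be turned into a small configuration search, sketched below. The three helpers are hypothetical names and not the NEORV32 UART driver API. The sketch assumes that the 3-bit _UART_CTRL_PRSCx_ value selects the listed prescalers in ascending order, and it picks the smallest prescaler for which `baud_prsc` still fits into 12 bits.

```c
#include <stdint.h>

/* Resulting clock_prescaler values from the prescaler table; indexing them
 * by the 3-bit PRSC selection in ascending order is an assumption. */
static const uint16_t ex_uart_prescaler[8] = {2, 4, 8, 64, 128, 1024, 2048, 4096};

/* Hypothetical helper: baud_prsc for a given prescaler selection, derived from
 * Baudrate = (f_main / clock_prescaler) / (baud_prsc + 1). `baud` must be > 0. */
static uint32_t ex_uart_baud_prsc(uint32_t f_main_hz, unsigned sel, uint32_t baud) {
  uint64_t div = f_main_hz / ((uint64_t)ex_uart_prescaler[sel & 7u] * baud);
  return (div == 0) ? 0u : (uint32_t)(div - 1u);
}

/* Hypothetical helper: smallest prescaler selection (0..7) whose divider
 * (baud_prsc + 1) lies in 1..4096, or -1 if no setting fits. */
static int ex_uart_pick_prescaler(uint32_t f_main_hz, uint32_t baud) {
  if (baud == 0) return -1;
  for (unsigned sel = 0; sel < 8; sel++) {
    uint64_t div = f_main_hz / ((uint64_t)ex_uart_prescaler[sel] * baud);
    if (div >= 1 && div <= 4096) return (int)sel;
  }
  return -1;
}

/* Hypothetical helper: Baudrate that results from a given configuration. */
static uint32_t ex_uart_actual_baud(uint32_t f_main_hz, unsigned sel, uint32_t baud_prsc) {
  return (f_main_hz / ex_uart_prescaler[sel & 7u]) / (baud_prsc + 1u);
}
```

For a 100 MHz main clock and a target of 19200 Baud this yields prescaler selection 0 and `baud_prsc` = 2603; the divider is truncated, so the resulting rate is 19201 Baud.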
 
A new transmission is started by writing the data byte to be sent to the lowest byte of the `DATA` register. The
transfer is completed when the _UART_CTRL_TX_BUSY_ control register flag returns to zero. A new received byte
is available when the _UART_DATA_AVAIL_ flag of the UART0_DATA register is set. A "frame error" in a received byte
(broken stop bit) is indicated via the _UART_DATA_FERR_ flag in the UART0_DATA register.
is available when the _UART_DATA_AVAIL_ flag of the `DATA` register is set. A "frame error" in a received byte
(invalid stop bit) is indicated via the _UART_DATA_FERR_ flag in the `DATA` register. The flag is cleared by
reading the `DATA` register.
 
**RX Double-Buffering**
 
The UART receive engine provides a simple data buffer with two entries. These two entries are transparent
for the user. The transmitting device can send up to 2 chars to the UART without risking data loss. If another
char is sent before at least one char has been read from the buffer data loss occurs. This situation can be
detected via the receiver overrun flag _UART_DATA_OVERR_ in the `DATA` register. The flag is
automatically cleared after reading `DATA`.
**RX and TX FIFOs**
 
**Parity Modes**
UART0 provides optional FIFO buffers for the transmitter and the receiver. The _UART0_RX_FIFO_ generic defines
the depth of the RX FIFO (for receiving data) while the _UART0_TX_FIFO_ defines the depth of the TX FIFO
(for sending data). Both generics have to be a power of two with a minimal allowed value of 1. This minimal
value will implement simple "double-buffering" instead of full-featured FIFOs.
Both FIFOs are cleared whenever UART0 is disabled (clearing _UART_CTRL_EN_ in `CTRL`).
 
The parity flag is added if the _UART_CTRL_PMODE1_ flag is set. When _UART_CTRL_PMODE0_ is zero the UART
operates in "even parity" mode. If this flag is set, the UART operates in "odd parity" mode. Parity errors in
received data are indicated via the _UART_DATA_PERR_ flag in the _UART_DATA_ registers. This flag is updated with each new
received character. A frame error in the received data (i.e. stop bit is not set) is indicated via the
_UART_DATA_FERR_ flag in the `DATA`. This flag is also updated with each new received character
The state of both FIFOs (_empty_, _at least half-full_, _full_) is available via the _UART_CTRL_?X_EMPTY_,
_UART_CTRL_?X_HALF_ and _UART_CTRL_?X_FULL_ flags in the `CTRL` register.
 
**Hardware Flow Control – RTS/CTS**
If the RX FIFO is already full and new data is received by the receiver unit, the _UART_DATA_OVERR_ flag
in the `DATA` register is set indicating an "overrun". This flag is cleared by reading the `DATA` register.
 
The UART supports hardware flow control using the standard CTS (clear to send) and/or RTS (ready to send
/ ready to receive "RTR") signals. Both hardware control flow mechanisms can be individually enabled.
 
If **RTS hardware flow control** is enabled by setting the _UART_CTRL_RTS_EN_ control register flag, the UART
will pull the `uart0_rts_o` signal low if the UART's receiver is idle and no received data is waiting to get read by
application software. As long as this signal is low the connected device can send new data. `uart0_rts_o` is always LOW if the UART is disabled.
 
The RTS line is de-asserted (going high) as soon as the start bit of a new incoming char has been
detected. The transmitting device continues sending the current char and can also send another char
(due to the RX double-buffering), which is done by most terminal programs. Any additional data send
when RTS is still asserted will override the RX input buffer causing data loss. This will set the _UART_DATA_OVERR_ flag in the
`DATA` register. Any read access to this register clears the flag again.
**Hardware Flow Control - RTS/CTS**
 
If **CTS hardware flow control** is enabled by setting the _UART_CTRL_CTS_EN_ control register flag, the UART's
transmitter will not start sending a new char until the `uart0_cts_i` signal goes low. If a new data to be
send is written to the UART data register while `uart0_cts_i` is not asserted (=low), the UART will wait for
`uart0_cts_i` to become asserted (=high) before sending starts. During this time, the UART busy flag
_UART_CTRL_TX_BUSY_ remains set.
UART0 supports optional hardware flow control using the standard CTS (clear to send) and/or RTS (ready to send
/ ready to receive "RTR") signals. Both hardware flow control mechanisms can be enabled individually.
 
If `uart0_cts_i` is asserted, no new data transmission will be started by the UART. The state of the `uart0_cts_i`
signals has no effect on a transmission being already in progress.
* If **RTS hardware flow control** is enabled by setting the _UART_CTRL_RTS_EN_ control register flag, the UART
will pull the `uart0_rts_o` signal low if the UART's receiver is ready to receive new data.
As long as this signal is low the connected device can send new data. `uart0_rts_o` is always LOW if the UART is disabled.
The RTS line is de-asserted (going high) as soon as the start bit of a new incoming char has been
detected.
 
Signal changes on `uart0_cts_i` during an active transmission are ignored. Application software can check
* If **CTS hardware flow control** is enabled by setting the _UART_CTRL_CTS_EN_ control register flag, the UART's
transmitter will not start sending new data until the `uart0_cts_i` signal goes low. During this time, the UART busy flag
_UART_CTRL_TX_BUSY_ remains set. If `uart0_cts_i` is asserted, no new data transmission will be started by the UART.
The state of the `uart0_cts_i` signal has no effect on a transmission being already in progress. Application software can check
the current state of the `uart0_cts_i` input signal via the _UART_CTRL_CTS_ control register flag.
 
[TIP]
Please note that – just like the RXD and TXD signals – the RTS and CTS signals have to be **cross**-coupled
between devices.
 
**Parity Modes**
 
An optional parity bit can be added to the data stream if the _UART_CTRL_PMODE1_ flag is set.
When _UART_CTRL_PMODE0_ is zero, the UART operates in "even parity" mode. If this flag is set, the UART operates in "odd parity" mode.
Parity errors in received data are indicated via the _UART_DATA_PERR_ flag in the `DATA` register. This flag is updated with each new
received character and is cleared by reading the `DATA` register.
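
For reference, the usual definition of even and odd parity for an 8-bit frame can be written as below. This is a generic sketch of the standard definition and not a description of the hardware implementation; the helper name `ex_uart_parity_bit()` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: parity bit that accompanies `data`.
 * Even parity (odd = 0): data bits plus parity bit contain an even number of ones.
 * Odd parity  (odd = 1): data bits plus parity bit contain an odd number of ones. */
static unsigned ex_uart_parity_bit(uint8_t data, int odd) {
  unsigned ones = 0;
  for (unsigned i = 0; i < 8; i++) {
    ones += (data >> i) & 1u; /* count set data bits */
  }
  return (ones & 1u) ^ (odd ? 1u : 0u);
}
```

A receiver that recomputes this bit and finds a mismatch would report a parity error, which corresponds to the condition signaled by _UART_DATA_PERR_.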
 
 
**Interrupts**
 
The UART features two interrupts: the "TX done interrupt" is triggered when a transmit operation (sending) has finished. The "RX
done interrupt" is triggered when a data byte has been received. If the UART0 is not implemented, the UART0 interrupts are permanently tied to zero.
UART0 features two independent interrupts for signaling certain RX and TX conditions. The behavior of these interrupts differs
based on the configured FIFO size. If the according FIFO size is greater than 1, the _UART_CTRL_RX_IRQ_ and _UART_CTRL_TX_IRQ_
`CTRL` flags allow a more fine-grained IRQ configuration.
 
[NOTE]
The UART's RX interrupt is always triggered when a new data word has arrived – regardless of the
state of the RX double-buffer.
* If _UART0_RX_FIFO_ is exactly 1, the RX interrupt becomes pending as soon as there is data available in the RX FIFO
(-> _UART_CTRL_RX_EMPTY_ clears). The _UART_CTRL_RX_IRQ_ flag is hardwired to `0` if _UART0_RX_FIFO_ = 1.
* If _UART0_TX_FIFO_ is exactly 1, the TX interrupt becomes pending as soon as there is a free entry left in the TX FIFO
(-> _UART_CTRL_TX_FULL_ clears). The _UART_CTRL_TX_IRQ_ flag is hardwired to `0` if _UART0_TX_FIFO_ = 1.
 
* If _UART0_RX_FIFO_ is greater than 1: If _UART_CTRL_RX_IRQ_ is `0` the RX interrupt becomes pending as soon as there is data
available in the RX FIFO (-> _UART_CTRL_RX_EMPTY_ clears). If _UART_CTRL_RX_IRQ_ is `1` the RX interrupt becomes pending as soon as
the RX FIFO is at least half-full (-> _UART_CTRL_RX_HALF_ sets).
* If _UART0_TX_FIFO_ is greater than 1: If _UART_CTRL_TX_IRQ_ is `0` the TX interrupt becomes pending as soon as there is a free
entry left in the TX FIFO (-> _UART_CTRL_TX_FULL_ clears). If _UART_CTRL_TX_IRQ_ is `1` the TX interrupt becomes pending as soon as
the TX FIFO is less than half-full (-> _UART_CTRL_TX_HALF_ clears).
 
An interrupt can only become pending if the according interrupt condition is fulfilled and the UART is enabled at all.
A pending interrupt is removed by resolving the interrupt-triggering conditions (for example by reading data from the
more-than-half-full RX FIFO).
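
The interrupt conditions listed above can be summarized as a small software model. This is only a sketch for reasoning about the rules, not the hardware description; the function names and the `fill` parameter (number of entries currently stored in the FIFO) are hypothetical.

```c
#include <stdbool.h>

/* Software model of the RX interrupt condition.
 * enabled  : UART_CTRL_EN is set
 * depth    : FIFO depth generic (power of two, minimum 1)
 * irq_mode : UART_CTRL_RX_IRQ flag (ignored for depth = 1)
 * fill     : number of entries currently stored in the RX FIFO */
static bool ex_uart_rx_irq_pending(bool enabled, unsigned depth, bool irq_mode, unsigned fill) {
  if (!enabled) return false;
  if (depth > 1 && irq_mode) return fill >= depth / 2; /* at least half-full */
  return fill > 0;                                     /* FIFO not empty */
}

/* Software model of the TX interrupt condition (same parameters, TX side). */
static bool ex_uart_tx_irq_pending(bool enabled, unsigned depth, bool irq_mode, unsigned fill) {
  if (!enabled) return false;
  if (depth > 1 && irq_mode) return fill < depth / 2;  /* less than half-full */
  return fill < depth;                                 /* FIFO not full */
}
```

The model also shows why a pending request disappears once the handler resolves the condition: reading from the RX FIFO lowers `fill`, and writing to the TX FIFO raises it.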
 
 
**Simulation Mode**
 
The default UART0 operation will transmit any data written to the `DATA` register via the serial TX line at
the defined baud rate. Even though the default testbench provides a simulated UART0 receiver, which
outputs any received char to the simulator console, such a transmission takes a lot of time. To accelerate
UART0 output during simulation (and also to dump large amounts of data for further processing like
verification) the UART0 features a **simulation mode**.
the defined baud rate via the physical link. To accelerate UART0 output during simulation
(and also to dump large amounts of data) the UART0 features a _simulation mode_.
 
The simulation mode is enabled by setting the _UART_CTRL_SIM_MODE_ bit in the UART0's control register
`CTRL`. Any other UART0 configuration bits are irrelevant, but the UART0 has to be enabled via the
_UART_CTRL_EN_ bit. When the simulation mode is enabled, any written char to `DATA` (bits 7:0) is
directly output as ASCII char to the simulator console. Additionally, all text is also stored to a text file
`neorv32.uart0.sim_mode.text.out` in the simulation home folder. Furthermore, the whole 32-bit word
written to `DATA` is stored as plain 8-char hexadecimal value to a second text file
`neorv32.uart0.sim_mode.data.out` also located in the simulation home folder.
Simulation mode is enabled by setting the _UART_CTRL_SIM_MODE_ bit in the UART0's control register
`CTRL`. Any other UART0 configuration bits are irrelevant for this mode but UART0 has to be enabled via the
_UART_CTRL_EN_ bit. There will be no physical UART0 transmissions via `uart0_txd_o` at all when
simulation mode is enabled. Furthermore, no interrupts (RX & TX) will be triggered.
 
If the UART is configured for simulation mode there will be **NO physical UART0 transmissions via
`uart0_txd_o`** at all. Furthermore, no interrupts (RX done or TX done) will be triggered in any situation.
When the simulation mode is enabled any data written to `DATA[7:0]` is
directly output as ASCII char to the simulator console. Additionally, all chars are also stored to a text file
`neorv32.uart0.sim_mode.text.out` in the simulation home folder.
 
Furthermore, the whole 32-bit word written to `DATA[31:0]` is stored as plain 8-char hexadecimal value to a
second text file `neorv32.uart0.sim_mode.data.out` also located in the simulation home folder.
 
[TIP]
More information regarding the simulation-mode of the UART0 can be found in the Uer Guide
More information regarding the simulation-mode of the UART0 can be found in the User Guide
section https://stnolting.github.io/neorv32/ug/#_simulating_the_processor[Simulating the Processor].
 
 
.UART0 register map (`struct NEORV32_UART0`)
[cols="<6,<7,<10,^2,<18"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.12+<| `0xffffffa0` .12+<| `NEORV32_UART0.CTRL` <|`11:0` _UART_CT_BAUDxx_ ^| r/w <| 12-bit BAUD value configuration value
<|`12` _UART_CT_SIM_MODE_ ^| r/w <| enable **simulation mode**
<|`20` _UART_CT_RTS_EN_ ^| r/w <| enable RTS hardware flow control
<|`21` _UART_CT_CTS_EN_ ^| r/w <| enable CTS hardware flow control
<|`22` _UART_CT_PMODE0_ ^| r/w .2+<| parity bit enable and configuration (`00`/`01`= no parity; `10`=even parity; `11`=odd parity)
<|`23` _UART_CT_PMODE1_ ^| r/w
<|`24` _UART_CT_PRSC0_ ^| r/w .3+<| 3-bit baudrate clock prescaler select
<|`25` _UART_CT_PRSC1_ ^| r/w
<|`26` _UART_CT_PRSC2_ ^| r/w
<|`27` _UART_CT_CTS_ ^| r/- <| current state of UART's CTS input signal
<|`28` _UART_CT_EN_ ^| r/w <| UART enable
<|`31` _UART_CT_TX_BUSY_ ^| r/- <| trasmitter busy flag
.21+<| `0xffffffa0` .21+<| `NEORV32_UART0.CTRL` <|`11:0` _UART_CTRL_BAUDxx_ ^| r/w <| 12-bit BAUD configuration value
<|`12` _UART_CTRL_SIM_MODE_ ^| r/w <| enable **simulation mode**
<|`13` _UART_CTRL_RX_EMPTY_ ^| r/- <| RX FIFO is empty
<|`14` _UART_CTRL_RX_HALF_ ^| r/- <| RX FIFO is at least half-full
<|`15` _UART_CTRL_RX_FULL_ ^| r/- <| RX FIFO is full
<|`16` _UART_CTRL_TX_EMPTY_ ^| r/- <| TX FIFO is empty
<|`17` _UART_CTRL_TX_HALF_ ^| r/- <| TX FIFO is at least half-full
<|`18` _UART_CTRL_TX_FULL_ ^| r/- <| TX FIFO is full
<|`19` - ^| r/- <| _reserved_, read as zero
<|`20` _UART_CTRL_RTS_EN_ ^| r/w <| enable RTS hardware flow control
<|`21` _UART_CTRL_CTS_EN_ ^| r/w <| enable CTS hardware flow control
<|`22` _UART_CTRL_PMODE0_ ^| r/w .2+<| parity bit enable and configuration (`00`/`01`= no parity; `10`=even parity; `11`=odd parity)
<|`23` _UART_CTRL_PMODE1_ ^| r/w
<|`24` _UART_CTRL_PRSC0_ ^| r/w .3+<| 3-bit baudrate clock prescaler select
<|`25` _UART_CTRL_PRSC1_ ^| r/w
<|`26` _UART_CTRL_PRSC2_ ^| r/w
<|`27` _UART_CTRL_CTS_ ^| r/- <| current state of UART's CTS input signal
<|`28` _UART_CTRL_EN_ ^| r/w <| UART enable
<|`29` _UART_CTRL_RX_IRQ_ ^| r/w <| RX IRQ mode: `1`=FIFO at least half-full; `0`=FIFO not empty
<|`30` _UART_CTRL_TX_IRQ_ ^| r/w <| TX IRQ mode: `1`=FIFO less than half-full; `0`=FIFO not full
<|`31` _UART_CTRL_TX_BUSY_ ^| r/- <| transmitter busy flag
.6+<| `0xffffffa4` .6+<| `NEORV32_UART0.DATA` <|`7:0` _UART_DATA_MSB_ : _UART_DATA_LSB_ ^| r/w <| receive/transmit data (8-bit)
<|`31:0` - ^| -/w <| **simulation data output**
<|`28` _UART_DATA_PERR_ ^| r/- <| RX parity error
171,43 → 198,56
| | `uart1_rxd_i` | serial receiver input UART1
| | `uart1_rts_o` | flow control: RX ready to receive
| | `uart1_cts_i` | flow control: TX allowed to send
| Configuration generics: | _IO_UART1_EN_ | implement UART1 when _true_
| CPU interrupts: | fast IRQ channel 4 | RX done interrupt
| | fast IRQ channel 5 | TX done interrupt (see <<_processor_interrupts>>)
| Configuration generics: | _IO_UART1_EN_ | implement UART1 when _true_
| | _UART1_RX_FIFO_ | RX FIFO depth (power of 2, min 1)
| | _UART1_TX_FIFO_ | TX FIFO depth (power of 2, min 1)
| CPU interrupts: | fast IRQ channel 4 | RX interrupt
| | fast IRQ channel 5 | TX interrupt (see <<_processor_interrupts>>)
|=======================
 
 
**Theory of Operation**
 
The secondary UART (UART1) is functionally identical to the primary UART (<<_primary_universal_asynchronous_receiver_and_transmitter_uart0>>).
Obviously, UART1 has different addresses for
the control register (`CTRL`) and the data register (`DATA`) – see the register map below. However, the
register bits/flags use the same bit positions and naming. Furthermore, the "RX done" and "TX done" interrupts are
mapped to different CPU fast interrupt channels.
Obviously, UART1 has different addresses for the control register (`CTRL`) and the data register (`DATA`) - see the register map below.
The register's bits/flags use the same bit positions and naming as for the primary UART. The RX and TX interrupts of UART1 are
mapped to different CPU fast interrupt (FIRQ) channels.
 
 
**Simulation Mode**
 
The secondary UART (UART1) provides the same simulation options as the primary UART. However,
output data is written to UART1-specific files: `neorv32.uart1.sim_mode.text.out` is used to store
plain ASCII text and `neorv32.uart1.sim_mode.data.out` is used to store full 32-bit hexadecimal
encoded data words.
data words.
 
 
.UART1 register map (`struct NEORV32_UART1`)
[cols="<6,<7,<10,^2,<18"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.12+<| `0xffffffd0` .12+<| `NEORV32_UART1.CTRL` <|`11:0` _UART_CT_BAUDxx_ ^| r/w <| 12-bit BAUD value configuration value
<|`12` _UART_CT_SIM_MODE_ ^| r/w <| enable **simulation mode**
<|`20` _UART_CT_RTS_EN_ ^| r/w <| enable RTS hardware flow control
<|`21` _UART_CT_CTS_EN_ ^| r/w <| enable CTS hardware flow control
<|`22` _UART_CT_PMODE0_ ^| r/w .2+<| parity bit enable and configuration (`00`/`01`= no parity; `10`=even parity; `11`=odd parity)
<|`23` _UART_CT_PMODE1_ ^| r/w
<|`24` _UART_CT_PRSC0_ ^| r/w .3+<| 3-bit baudrate clock prescaler select
<|`25` _UART_CT_PRSC1_ ^| r/w
<|`26` _UART_CT_PRSC2_ ^| r/w
<|`27` _UART_CT_CTS_ ^| r/- <| current state of UART's CTS input signal
<|`28` _UART_CT_EN_ ^| r/w <| UART enable
<|`31` _UART_CT_TX_BUSY_ ^| r/- <| trasmitter busy flag
.21+<| `0xffffffd0` .21+<| `NEORV32_UART1.CTRL` <|`11:0` _UART_CTRL_BAUDxx_ ^| r/w <| 12-bit BAUD value configuration value
<|`12` _UART_CTRL_SIM_MODE_ ^| r/w <| enable **simulation mode**
<|`13` _UART_CTRL_RX_EMPTY_ ^| r/- <| RX FIFO is empty
<|`14` _UART_CTRL_RX_HALF_ ^| r/- <| RX FIFO is at least half-full
<|`15` _UART_CTRL_RX_FULL_ ^| r/- <| RX FIFO is full
<|`16` _UART_CTRL_TX_EMPTY_ ^| r/- <| TX FIFO is empty
<|`17` _UART_CTRL_TX_HALF_ ^| r/- <| TX FIFO is at least half-full
<|`18` _UART_CTRL_TX_FULL_ ^| r/- <| TX FIFO is full
<|`19` - ^| r/- <| _reserved_, read as zero
<|`20` _UART_CTRL_RTS_EN_ ^| r/w <| enable RTS hardware flow control
<|`21` _UART_CTRL_CTS_EN_ ^| r/w <| enable CTS hardware flow control
<|`22` _UART_CTRL_PMODE0_ ^| r/w .2+<| parity bit enable and configuration (`00`/`01`= no parity; `10`=even parity; `11`=odd parity)
<|`23` _UART_CTRL_PMODE1_ ^| r/w
<|`24` _UART_CTRL_PRSC0_ ^| r/w .3+<| 3-bit baudrate clock prescaler select
<|`25` _UART_CTRL_PRSC1_ ^| r/w
<|`26` _UART_CTRL_PRSC2_ ^| r/w
<|`27` _UART_CTRL_CTS_ ^| r/- <| current state of UART's CTS input signal
<|`28` _UART_CTRL_EN_ ^| r/w <| UART enable
<|`29` _UART_CTRL_RX_IRQ_ ^| r/w <| RX IRQ mode: `1`=FIFO at least half-full; `0`=FIFO not empty; hardwired to zero if _UART1_RX_FIFO_ = 1
<|`30` _UART_CTRL_TX_IRQ_ ^| r/w <| TX IRQ mode: `1`=FIFO less than half-full; `0`=FIFO not full; hardwired to zero if _UART1_TX_FIFO_ = 1
<|`31` _UART_CTRL_TX_BUSY_ ^| r/- <| transmitter busy flag
.6+<| `0xffffffd4` .6+<| `NEORV32_UART1.DATA` <|`7:0` _UART_DATA_MSB_ : _UART_DATA_LSB_ ^| r/w <| receive/transmit data (8-bit)
<|`31:0` - ^| -/w <| **simulation data output**
<|`28` _UART_DATA_PERR_ ^| r/- <| RX parity error
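The CTRL layout in the register maps above can be exercised with a small host-side C sketch. The bit positions are copied from the table; the `SKETCH_` constants and both helper functions are hypothetical names defined only for this illustration (they are not taken from the NEORV32 software library), and the code works on a plain 32-bit value instead of the real memory-mapped register. The parity encoding is assumed to be read as PMODE1:PMODE0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the UART CTRL register map.
 * The SKETCH_ names are local to this example (assumption). */
enum {
  SKETCH_UART_CTRL_BAUD_LSB = 0,  /* 12-bit BAUD value, bits 11:0 */
  SKETCH_UART_CTRL_RX_EMPTY = 13, /* RX FIFO is empty (read-only) */
  SKETCH_UART_CTRL_TX_FULL  = 18, /* TX FIFO is full (read-only) */
  SKETCH_UART_CTRL_PMODE0   = 22, /* 2-bit parity mode, bits 23:22 */
  SKETCH_UART_CTRL_PRSC0    = 24, /* 3-bit prescaler select, bits 26:24 */
  SKETCH_UART_CTRL_EN       = 28  /* UART enable */
};

/* Compose a CTRL word that enables the UART with the given settings.
 * Out-of-range arguments are masked to their field width. */
uint32_t sketch_uart_ctrl_compose(uint32_t baud, uint32_t prsc, uint32_t pmode) {
  uint32_t ctrl = 0;
  ctrl |= (baud  & 0xFFFu) << SKETCH_UART_CTRL_BAUD_LSB;
  ctrl |= (pmode & 0x3u)   << SKETCH_UART_CTRL_PMODE0;
  ctrl |= (prsc  & 0x7u)   << SKETCH_UART_CTRL_PRSC0;
  ctrl |= 1u << SKETCH_UART_CTRL_EN;
  return ctrl;
}

/* Test a single status or configuration bit of a CTRL value. */
bool sketch_uart_ctrl_bit(uint32_t ctrl, unsigned bit) {
  return ((ctrl >> bit) & 1u) != 0u;
}
```

Since UART0 and UART1 share the same bit positions, the same helpers would apply to either `CTRL` register; only the register address differs.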
/datasheet/soc_wdt.adoc
39,10 → 39,13
 
Whenever the internal timer overflows the watchdog executes one of two possible actions: Either a hard
processor reset is triggered or an interrupt is requested at CPU's fast interrupt channel #0. The
WDT_CTRL_MODE bit defines the action to be taken on an overflow: When cleared, the Watchdog will trigger an
IRQ, when set the WDT will cause a system reset. The configured actions can also be triggered manually at
WDT_CTRL_MODE bit defines the action to be taken on an overflow: When cleared, the Watchdog will assert an
IRQ, when set the WDT will cause a system reset. The configured action can also be triggered manually at
any time by setting the _WDT_CTRL_FORCE_ bit. The watchdog is reset by setting the _WDT_CTRL_RESET_ bit.
 
A watchdog interrupt can only occur if the watchdog is enabled and interrupt mode is enabled.
A pending interrupt is cleared by either disabling the watchdog or by resetting the watchdog.
 
The cause of the last action of the watchdog can be determined via the _WDT_CTRL_RCAUSE_ flag. If this flag is
zero, the processor has been reset via the external reset signal. If this flag is set the last system reset was
initiated by the watchdog.
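The watchdog behaviour described above can be summarised as a small behavioural model in C. This is only a sketch for reasoning about the rules in the text: all `wdt_sketch_` names are hypothetical, no register bit positions are implied, and the model assumes that a disabled watchdog takes no action and does not capture what a system reset does to the remaining state.

```c
#include <stdbool.h>

/* Possible outcomes of a watchdog timeout in this model. */
typedef enum {
  WDT_SKETCH_NONE,
  WDT_SKETCH_IRQ,
  WDT_SKETCH_SYSRESET
} wdt_sketch_action_t;

/* Software model of the watchdog state (hypothetical, not the hardware layout). */
typedef struct {
  bool enabled;     /* watchdog enable */
  bool mode_reset;  /* models WDT_CTRL_MODE: false = IRQ, true = system reset */
  bool irq_pending; /* interrupt request pending at the CPU */
  bool rcause;      /* models WDT_CTRL_RCAUSE: true = last reset caused by WDT */
} wdt_sketch_t;

/* Timer overflow: execute the configured action. */
wdt_sketch_action_t wdt_sketch_timeout(wdt_sketch_t *w) {
  if (!w->enabled) {
    return WDT_SKETCH_NONE; /* assumption: a disabled watchdog does nothing */
  }
  if (w->mode_reset) {
    w->rcause = true;       /* the reset was initiated by the watchdog */
    return WDT_SKETCH_SYSRESET;
  }
  w->irq_pending = true;    /* enabled and in interrupt mode */
  return WDT_SKETCH_IRQ;
}

/* Models setting WDT_CTRL_RESET: clears a pending interrupt. */
void wdt_sketch_reset(wdt_sketch_t *w) {
  w->irq_pending = false;
}

/* Disabling the watchdog also clears a pending interrupt. */
void wdt_sketch_disable(wdt_sketch_t *w) {
  w->enabled = false;
  w->irq_pending = false;
}
```

In this model an interrupt can only become pending while the watchdog is enabled and in interrupt mode, and both resetting and disabling the watchdog clear it, mirroring the two sentences added in this revision.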
/datasheet/soc_wishbone.adoc
64,7 → 64,7
 
[TOP]
A detailed description of the implemented Wishbone bus protocol and the according interface signals
can be found in the data sheet "Wishbone B4 – WISHBONE System-on-Chip (SoC) Interconnection
can be found in the data sheet "Wishbone B4 - WISHBONE System-on-Chip (SoC) Interconnection
Architecture for Portable IP Cores". A copy of this document can be found in the docs folder of this
project.
 
/datasheet/soc_xirq.adoc
34,7 → 34,7
[NOTE]
A disabled interrupt channel can still be pending if it has been triggered before clearing the according `IER` bit.
 
The CPU can determine firing interrupt request either by checking the bits in the `IPR` register, which show all
The CPU can determine the active external interrupt requests either by checking the bits in the `IPR` register, which show all
pending interrupt channels, or by reading the interrupt source register `SCR`.
This register provides a 5-bit wide ID (0..31) that shows the interrupt request with _highest priority_.
Interrupt channel `xirq_i(0)` has highest priority and `xirq_i(_XIRQ_NUM_CH_-1)` has lowest priority.
42,8 → 42,8
The CPU can use the ID from `SCR` to service IRQ according to their priority. To acknowledge the according
interrupt the CPU can write `1 << SCR` to `IPR`.
 
In order to acknowledge the interrupt from the external interrupt controller, the CPU has to write _any_
value to interrupt source register `SRC`.
In order to clear a pending FIRQ interrupt from the external interrupt controller, the CPU has to write _any_
value to the interrupt source register `SCR`.
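The priority rule (channel 0 highest) and the `1 << SCR` acknowledge value described above can be sketched in plain C. The two functions are hypothetical helpers written for this illustration; they operate on ordinary integers and do not access the real `IPR` or `SCR` registers.

```c
#include <stdint.h>

/* Return the ID (0..31) of the highest-priority pending channel,
 * i.e. the lowest set bit of the pending mask, or -1 if none is pending. */
int xirq_sketch_highest_pending(uint32_t ipr) {
  for (int id = 0; id < 32; id++) {
    if ((ipr >> id) & 1u) {
      return id;
    }
  }
  return -1;
}

/* Value to write to IPR in order to acknowledge the channel reported by SCR. */
uint32_t xirq_sketch_ack_mask(uint32_t scr) {
  return UINT32_C(1) << (scr & 31u); /* SCR is a 5-bit ID */
}
```

For a pending mask of `0x14` (channels 2 and 4), the first helper reports channel 2, and the matching acknowledge value is `0x4`.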
 
[NOTE]
An interrupt handler should clear the interrupt pending bit that caused the interrupt first before
/datasheet/software.adoc
124,12 → 124,14
info - show makefile/toolchain configuration
exe - compile and generate <neorv32_exe.bin> executable for upload via bootloader
hex - compile and generate <neorv32_exe.hex> executable raw file
image - compile and generate VHDL IMEM boot image (for application) in local folder
install - compile, generate and install VHDL IMEM boot image (for application)
sim - in-console simulation using the default testbench and GHDL
sim - in-console simulation using default/simple testbench and GHDL
all - exe + hex + install
elf_info - show ELF layout info
clean - clean up project
clean_all - clean up project, core libraries and image generator
bl_image - compile and generate VHDL BOOTROM boot image (for bootloader only!) in local folder
bootloader - compile, generate and install VHDL BOOTROM boot image (for bootloader only!)
----
 
142,7 → 144,7
 
[TIP]
The makefile configuration variables can be (re-)defined directly when invoking the makefile. For
example via `$ make MARCH=-march=rv32ic clean_all exe`. You can also make project-specific definitions
example via `$ make MARCH=rv32ic clean_all exe`. You can also make project-specific definitions
of all variables inside the project's actual makefile (e.g., `sw/example/blink_led/makefile`).
 
[source,makefile]
162,8 → 164,8
# Compiler toolchain
RISCV_PREFIX ?= riscv32-unknown-elf-
# CPU architecture and ABI
MARCH ?= -march=rv32i
MABI ?= -mabi=ilp32
MARCH ?= rv32i
MABI ?= ilp32
# User flags for additional configuration (will be added to compiler flags)
USER_FLAGS ?=
# Relative or absolute path to the NEORV32 home folder
179,7 → 181,7
| _ASM_INC_ | Include file folders that are used only for the assembly source files (`*.S`/`*.s`).
| _EFFORT_ | Optimization level, optimize for size (`-Os`) is default; legal values: `-O0`, `-O1`, `-O2`, `-O3`, `-Os`
| _RISCV_PREFIX_ | The toolchain prefix to be used; follows the naming convention "architecture-vendor-output-"
| _MARCH_ | The targetd RISC-V architecture/ISA. Only `rv32` is supported by the NEORV32. Enable compiler support of optional CPU extension by adding the according extension letter (e.g. `rv32im` for _M_ CPU extension). See https://stnolting.github.io/neorv32/ug/#_enabling_risc_v_cpu_extensions[User Guide: Enabling RISC-V CPU Extensions] for more information.
| _MARCH_ | The targeted RISC-V architecture/ISA. Only `rv32` is supported by the NEORV32. Enable compiler support of optional CPU extension by adding the according extension letter (e.g. `rv32im` for _M_ CPU extension). See https://stnolting.github.io/neorv32/ug/#_enabling_risc_v_cpu_extensions[User Guide: Enabling RISC-V CPU Extensions] for more information.
| _MABI_ | The default 32-bit integer ABI.
| _USER_FLAGS_ | Additional flags that will be forwarded to the compiler tools
| _NEORV32_HOME_ | Relative or absolute path to the NEORV32 project home folder. Adapt this if the makefile/project is not in the project's `sw/example` folder.
374,7 → 376,7
----
int __neorv32_crt0_after_main(int32_t return_code) {
 
neorv32_uart_printf("Main returned with code: %i\n", return_code);
neorv32_uart0_printf("Main returned with code: %i\n", return_code);
return 0;
}
----
/figures/SPI_timing_diagram2.wikimedia.png
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
figures/SPI_timing_diagram2.wikimedia.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: figures/address_space.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: figures/license.md
===================================================================
--- figures/license.md (nonexistent)
+++ figures/license.md (revision 65)
@@ -0,0 +1,11 @@
+# :copyright:
+
+Figures are own work if not otherwise stated. License: https://github.com/stnolting/neorv32/blob/master/LICENSE
+
+## `SPI_timing_diagram2.wikimedia.png`
+
+source: https://en.wikipedia.org/wiki/File:SPI_timing_diagram2.svg
+
+License:
+* Creative Commons: https://en.wikipedia.org/wiki/Creative_Commons
+* Attribution-Share Alike 3.0 Unported: https://creativecommons.org/licenses/by-sa/3.0/deed.en
\ No newline at end of file
Index: userguide/content.adoc
===================================================================
--- userguide/content.adoc (revision 64)
+++ userguide/content.adoc (revision 65)
@@ -29,21 +29,20 @@
 :sectnums:
 === Building the Toolchain from Scratch

-To build the toolchain by yourself you can follow the guide from the official https://github.com/riscv/riscv-gnu-toolchain GitHub page.
+To build the toolchain by yourself you can follow the guide from the official https://github.com/riscv-collab/riscv-gnu-toolchain GitHub page.
 You need to make sure the generated toolchain fits the architecture of the NEORV32 core. To get a toolchain that even supports minimal
 ISA extension configurations, it is recommended to compile for `rv32i` only. Please note that this minimal ISA also provides further ISA
-extensions like `m` or `c`. Of course you can use a `multilib` approach to generate
-toolchains for several target ISAs.
+extensions like `m` or `c`. Of course you can use a _multilib_ approach to generate toolchains for several target ISAs at once.

 .Configuring GCC build for `rv32i` (minimal ISA)
 [source,bash]
 ----
-riscv-gnu-toolchain$ ./configure --prefix=/opt/riscv --with-arch=rv32i –-with-abi=ilp32
+riscv-gnu-toolchain$ ./configure --prefix=/opt/riscv --with-arch=rv32i --with-abi=ilp32
 riscv-gnu-toolchain$ make
 ----

 [IMPORTANT]
-Keep in mind that – for instance – a toolchain build with `--with-arch=rv32imc` only provides library code compiled with
+Keep in mind that - for instance - a toolchain build with `--with-arch=rv32imc` only provides library code compiled with
 compressed (`C`) and `mul`/`div` instructions (`M`)! Hence, this code cannot be executed (without emulation) on an architecture
 without these extensions!

@@ -196,8 +195,8 @@
 <3> Default size of internal data memory: 8kB

 [start=7]
-. If you feel like it – or if your FPGA does not provide sufficient resources – you can modify the
-_memory sizes_ (`MEM_INT_IMEM_SIZE` and `MEM_INT_DMEM_SIZE` – marked with notes "2" and "3"). But as mentioned
+. If you feel like it - or if your FPGA does not provide sufficient resources - you can modify the
+_memory sizes_ (`MEM_INT_IMEM_SIZE` and `MEM_INT_DMEM_SIZE` - marked with notes "2" and "3"). But as mentioned
 above, let's keep things simple at first and use the standard configuration for now.
 . There is one generic that _has to be set according to your FPGA board_ setup: the actual clock frequency
 of the top's clock input signal (`clk_i`). Use the `CLOCK_FREQUENCY` generic to specify your clock source's
@@ -205,7 +204,7 @@

 [NOTE]
 If you have changed the default memory configuration (`MEM_INT_IMEM_SIZE` and `MEM_INT_DMEM_SIZE` generics)
-keep those new sizes in mind – these values are required for setting
+keep those new sizes in mind - these values are required for setting
 up the software framework in the next section <<_general_software_framework_setup>>.

 [start=9]
@@ -249,7 +248,7 @@

 [start=10]
 . Attach the clock input `clk_i` to your clock source and connect the reset line `rstn_i` to a button of
-your FPGA board. Check whether it is low-active or high-active – the reset signal of the processor is
+your FPGA board. Check whether it is low-active or high-active - the reset signal of the processor is
 **low-active**, so maybe you need to invert the input signal.
 . If possible, connect _at least_ bit `0` of the GPIO output port `gpio_o` to a LED (see "Signal Polarity" note above).
 . Finally, if you are using the UART-based test setup (`neorv32_testsetup_bootloader.vhd`)
@@ -288,7 +287,7 @@
 ----
 MEMORY
 {
-  ram (rwx) : ORIGIN = 0x80000000, LENGTH = DEFINED(make_bootloader) ? 512 : 8*1024 # <1>
+  ram (rwx) : ORIGIN = 0x80000000, LENGTH = DEFINED(make_bootloader) ? 512 : 8*1024 <1>
 ...
 ----
 <1> Size of the data memory address space (right-most value) (internal/external DMEM); here 8kB
@@ -616,8 +615,8 @@
 [source,makefile]
 ----
 # CPU architecture and ABI
-MARCH = -march=rv32i # <1>
-MABI = -mabi=ilp32 # <2>
+MARCH ?= rv32i <1>
+MABI ?= ilp32 <2>
 ----
 <1> MARCH = Machine architecture ("ISA string")
 <2> MABI = Machine binary interface
@@ -642,7 +641,7 @@

 [source,bash]
 ----
-$ make MARCH=-march=rv32ic clean_all all
+$ make MARCH=rv32ic clean_all all
 ----

 [NOTE]
@@ -1037,9 +1036,13 @@
 . Save everything, let VIVADO create a HDL-Wrapper for the block-design and choose this as your _Top Level Design_.
 . Define your constraints and generate your bitstream.

+.TWI Tri-State Drivers
+[IMPORTANT]
+Set the synthesis option "global" when generating the block design to maintain the internal TWI tri-state drivers.
+
 [NOTE]
-Guide provided by GitHub user https://github.com/AWenzel83[`AWenzel83`] from
-https://github.com/stnolting/neorv32/discussions/52#discussioncomment-819013
+Guide provided by GitHub user https://github.com/AWenzel83[`AWenzel83`] (see
+https://github.com/stnolting/neorv32/discussions/52#discussioncomment-819013).
 ❤️


@@ -1181,7 +1184,7 @@

 To do a quick test of the NEORV32 make sure to have https://github.com/ghdl/ghdl[GHDL] and a
 [RISC-V gcc toolchain](https://github.com/stnolting/riscv-gcc-prebuilt) installed.
-Navigate to the project's `sw/example/hello_world` folder and run `make USER_FLAGS+=-DUART0_SIM_MODE MARCH=-march=rv32imac clean_all sim`:
+Navigate to the project's `sw/example/hello_world` folder and run `make USER_FLAGS+=-DUART0_SIM_MODE MARCH=rv32imac clean_all sim`:

 [TIP]
 The simulator will output some _sanity check_ notes (and warnings or even errors if something is ill-configured)
@@ -1189,7 +1192,7 @@

 [source, bash]
 ----
-stnolting@Einstein:/mnt/n/Projects/neorv32/sw/example/hello_world$ make USER_FLAGS+=-DUART0_SIM_MODE MARCH=-march=rv32imac clean_all sim
+stnolting@Einstein:/mnt/n/Projects/neorv32/sw/example/hello_world$ make USER_FLAGS+=-DUART0_SIM_MODE MARCH=rv32imac clean_all sim
 ../../../sw/lib/source/neorv32_uart.c: In function 'neorv32_uart0_setup':
 ../../../sw/lib/source/neorv32_uart.c:301:4: warning: #warning UART0_SIM_MODE (primary UART) enabled! Sending all UART0.TX data to text.io simulation output instead of real UART0 transmitter. Use this for simulations only! [-Wcpp]
   301 | #warning UART0_SIM_MODE (primary UART) enabled! Sending all UART0.TX data to text.io simulation output instead of real UART0 transmitter. Use this for simulations only! <1>
@@ -1310,6 +1313,23 @@
 <<<
 // ####################################################################################################################
 :sectnums:
+== Zephyr RTOS Support 🪁
+
+The NEORV32 processor is supported by upstream Zephyr RTOS: https://docs.zephyrproject.org/latest/boards/riscv/neorv32/doc/index.html
+
+[IMPORTANT]
+The absolute path to the NEORV32 executable image generator binary (`.../neorv32/sw/image_gen`) has to be added to the `PATH` variable
+so the Zephyr build system can generate executables and memory-initialization images.
+
+[NOTE]
+Zephyr OS port provided by GitHub user https://github.com/henrikbrixandersen[henrikbrixandersen]
+(see https://github.com/stnolting/neorv32/discussions/172). ❤️
+
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
 == FreeRTOS Support

 A NEORV32-specific port and a simple demo for FreeRTOS (https://github.com/FreeRTOS/FreeRTOS) are
@@ -1436,7 +1456,7 @@
 .Compile the test application
 [source, bash]
 --------------------------
-.../neorv32/sw/example/blink_led$ make MARCH=-march=rv32i USER_FLAGS+=-g clean_all all
+.../neorv32/sw/example/blink_led$ make MARCH=rv32i USER_FLAGS+=-g clean_all all
 --------------------------

 .Adding debug symbols to the executable
/attrs.adoc
1,7 → 1,8
:author: Dipl.-Ing. Stephan Nolting
:email: stnolting@gmail.com
:keywords: neorv32, risc-v, riscv, fpga, soft-core, vhdl, microcontroller, cpu, soc, processor, gcc, openocd, gdb
:description: A size-optimized, customizable and open-source full-scale 32-bit RISC-V soft-core CPU and SoC written in platform-independent VHDL.
:revnumber: v1.6.1
:revnumber: v1.6.2
:doctype: book
:sectnums:
:stem:
/legal.adoc
106,7 → 106,8
 
.DOI
[TIP]
This project also provides a _digital object identifier_ provided by https://zenodo.org[zenodo]: https://zenodo.org/record/5121427
This project also provides a _digital object identifier_ provided by https://zenodo.org[zenodo]:
https://doi.org/10.5281/zenodo.5018888[image:https://zenodo.org/badge/DOI/10.5281/zenodo.5018888.svg[title='zenodo']]
 
 
:sectnums!:
