OpenCores
URL https://opencores.org/ocsvn/neorv32/neorv32/trunk

Subversion Repositories neorv32

Compare Revisions

  • This comparison shows the changes necessary to convert path
    /neorv32/trunk/docs
    from Rev 60 to Rev 61

Rev 60 → Rev 61

/datasheet/cpu.adoc
6,7 → 6,20
**Key Features**
 
* 32-bit pipelined/multi-cycle in-order `rv32` RISC-V CPU
* Optional RISC-V extensions: `rv32[i/e][m][a][c][u]` + `[Zfinx][Zicsr][Zifencei]` + `[debug_mode]` (for on-chip debugging)
* Optional RISC-V extensions:
** `A` - atomic memory access operations
** `C` - 16-bit compressed instructions
** `I` - integer base ISA (always enabled)
** `E` - embedded CPU version (reduced register file size)
** `M` - integer multiplication and division hardware
** `U` - less-privileged _user_ mode
** `Zfinx` - single-precision floating-point unit
** `Zicsr` - control and status register access (privileged architecture)
** `Zifencei` - instruction stream synchronization
** `Zmmul` - integer multiplication hardware
** `PMP` - physical memory protection
** `HPM` - hardware performance monitors
** `DB` - debug mode
* Compatible to the RISC-V user specifications and a subset of the RISC-V privileged architecture specifications – passes the official RISC-V Architecture Tests (v2+)
* Official RISC-V open-source architecture ID
* Standard RISC-V interrupts (_external_, _timer_, _software_) plus 16 _fast_ interrupts and 1 non-maskable interrupt
296,7 → 309,7
| **CPU_BOOT_ADDR** | _std_ulogic_vector(31 downto 0)_ | 0x00000000
3+| This address defines the reset address at which the CPU starts fetching instructions after reset. In terms of the NEORV32 processor, this
generic is configured with the base address of the bootloader ROM (default) or with the base address of the processor-internal instruction
memory (IMEM) if the bootloader is disabled (_BOOTLOADER_EN_ = _false_). See section <<_address_space>> for more information.
memory (IMEM) if the bootloader is disabled (_INT_BOOTLOADER_EN_ = _false_). See section <<_address_space>> for more information.
|======
 
[cols="4,4,2"]
387,12 → 400,10
* environment: `ecall`, `ebreak`, `fence`
 
[NOTE]
In order to keep the hardware footprint low, the CPU's shift unit uses a hybrid parallel/serial approach. Shift
operations are split into coarse shifts (multiples of 4) and a final fine shift (0 to 3). The total execution
time depends on the shift amount. Alternatively, the shift operations can be processed completely in parallel by a fast
(but large) barrel shifter when the `FAST_SHIFT_EN` generic is _true_. In that case, shift operations
complete within 2 cycles regardless of the shift amount. Shift operations can also be executed in a pure serial manner when
the `TINY_SHIFT_EN` generic is _true_. In that case, shift operations take up to 32 cycles depending on the shift amount.
In order to keep the hardware footprint low, the CPU's shift unit uses a bit-serial approach. Hence, shift operations
take up to 32 cycles (plus overhead) depending on the actual shift amount. Alternatively, the shift operations can be processed
completely in parallel by a fast (but large) barrel shifter when the `FAST_SHIFT_EN` generic is _true_. In that case, shift operations
complete within 2 cycles (plus overhead) regardless of the actual shift amount.
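
The data dependence of the serial shifter can be illustrated with a small host-side C model. This is only a sketch: it assumes one loop iteration per shifted bit position, and the function name `serial_sll` is hypothetical and not part of the NEORV32 sources. It does not model the hardware's cycle overhead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a bit-serial logical left shift.
 * Assumption: the shifter moves the operand by one bit per iteration,
 * so the iteration count equals the shift amount. */
static uint32_t serial_sll(uint32_t value, unsigned shamt, unsigned *iterations)
{
    assert(shamt < 32); /* rv32 shift amounts are 0..31 */

    unsigned n = 0;
    while (n < shamt) {
        value <<= 1; /* one bit position per iteration */
        n++;
    }
    *iterations = n;
    return value;
}
```

A barrel shifter, in contrast, would produce the same result in a fixed number of steps independent of `shamt`.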
 
[NOTE]
Internally, the `fence` instruction does not perform any operation inside the CPU. It only sets the
406,8 → 417,8
`CPU_EXTENSION_RISCV_M` configuration generic is _true_. In this case the following instructions are
available:
 
• multiplication: `mul`, `mulh`, `mulhsu`, `mulhu`
• division: `div`, `divu`, `rem`, `remu`
* multiplication: `mul`, `mulh`, `mulhsu`, `mulhu`
* division: `div`, `divu`, `rem`, `remu`
 
[NOTE]
By default, multiplication and division operations are executed in a bit-serial approach.
416,6 → 427,26
always require a fixed amount of cycles to complete - regardless of the input operands.
 
 
==== **`Zmmul`** - Integer Multiplication
 
This is a _sub-extension_ of the `M` ISA extension. It implements the multiplication-only operations
of the `M` extension and is intended for small-scale applications that require hardware-based
integer multiplications but not hardware-based divisions, which will be computed entirely in software.
This extension requires only ~50% of the hardware utilization of the `M` extension.
 
* multiplication: `mul`, `mulh`, `mulhsu`, `mulhu`
 
If `Zmmul` is enabled, executing any division instruction from the `M` ISA (`div`, `divu`, `rem`, `remu`)
will raise an illegal instruction exception.
 
Note that `M` and `Zmmul` extensions _cannot_ be enabled in parallel.
 
[TIP]
If your RISC-V GCC toolchain does not (yet) support the `_Zmmul` ISA extension, it can be "emulated"
using a `rv32im` machine architecture and setting the `-mno-div` compiler flag
(example: `$ make MARCH=-march=rv32im USER_FLAGS+=-mno-div clean_all exe`).
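
To give an idea of what "computed entirely in software" means for divisions, the following C sketch shows a classic shift-and-subtract (restoring) unsigned division. It is an illustration only and not the routine the toolchain actually links in; the name `soft_udiv` is hypothetical. The division-by-zero result mirrors the RISC-V `divu`/`remu` convention (quotient all ones, remainder equal to the dividend).

```c
#include <stdint.h>

/* Hypothetical software replacement for divu/remu using
 * restoring (shift-and-subtract) division. */
static uint32_t soft_udiv(uint32_t dividend, uint32_t divisor, uint32_t *remainder)
{
    if (divisor == 0) {
        /* RISC-V convention: quotient = all ones, remainder = dividend */
        *remainder = dividend;
        return UINT32_MAX;
    }

    uint32_t quotient = 0;
    uint32_t rem = 0;
    for (int i = 31; i >= 0; i--) {
        /* shift in the next dividend bit, MSB first */
        rem = (rem << 1) | ((dividend >> i) & 1u);
        if (rem >= divisor) {
            rem -= divisor;
            quotient |= UINT32_C(1) << i;
        }
    }
    *remainder = rem;
    return quotient;
}
```

The loop only uses shifts, comparisons and subtractions, so it runs on a core that provides `Zmmul` but no hardware divider.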
 
 
==== **`U`** - Less-Privileged User Mode
 
Adds the less-privileged _user mode_ when the `CPU_EXTENSION_RISCV_U` configuration generic is _true_. For
631,7 → 662,7
| ALU | `I/E` | `addi` `slti` `sltiu` `xori` `ori` `andi` `add` `sub` `slt` `sltu` `xor` `or` `and` `lui` `auipc` | 2
| ALU | `C` | `c.addi4spn` `c.nop` `c.addi` `c.li` `c.addi16sp` `c.lui` `c.andi` `c.sub` `c.xor` `c.or` `c.and` `c.add` `c.mv` | 2
| ALU | `I/E` | `slli` `srli` `srai` `sll` `srl` `sra` | 3 + SAfootnote:[Shift amount.]/4 + SA%4; FAST_SHIFTfootnote:[Barrel shift when `FAST_SHIFT_EN` is enabled.]: 4; TINY_SHIFTfootnote:[Serial shift when `TINY_SHIFT_EN` is enabled.]: 2..32
| ALU | `C` | `c.srli` `c.srai` `c.slli` | 3 + SAfootnote:[Shift amount.]/4 + SA%4; FAST_SHIFTfootnote:[Barrel shift when `FAST_SHIFT_EN` is enabled.]: 4; TINY_SHIFTfootnote:[Serial shift when `TINY_SHIFT_EN` is enabled.]: 2..32
| ALU | `C` | `c.srli` `c.srai` `c.slli` | 3 + SAfootnote:[Shift amount (0..31).]; FAST_SHIFTfootnote:[Barrel shifter when `FAST_SHIFT_EN` is enabled.]:
| Branches | `I/E` | `beq` `bne` `blt` `bge` `bltu` `bgeu` | Taken: 5 + MLfootnote:[Memory latency.]; Not taken: 3
| Branches | `C` | `c.beqz` `c.bnez` | Taken: 5 + MLfootnote:[Memory latency.]; Not taken: 3
| Jumps / Calls | `I/E` | `jal` `jalr` | 4 + ML
709,53 → 740,60
:sectnums:
==== Traps, Exceptions and Interrupts
 
In this document a (maybe) special nomenclature regarding traps is used:
In this document the following nomenclature regarding traps is used:
 
* _interrupt_ = asynchronous exceptions
* _exceptions_ = synchronous exceptions
* _traps_ = exceptions + interrupts (synchronous or asynchronous exceptions)
 
Whenever an exception or interrupt is triggered, the CPU transfers control to the address stored in the `mtvec`
CSR. The cause of the according interrupt or exception can be determined via the content of the `mcause`
CSR. The address that reflected the current program counter when a trap was taken is stored to `mepc`.
Additional information regarding the cause of the trap can be retrieved from `mtval`.
Whenever an exception or interrupt is triggered, the CPU transfers control to the address stored in the `mtvec`
CSR. The cause of the according interrupt or exception can be determined via the content of the `mcause`
CSR. The address that reflects the current program counter when a trap was taken is stored to the `mepc` CSR.
Additional information regarding the cause of the trap can be retrieved from the `mtval` CSR.
 
The traps are prioritized. If several exceptions occur at once only the one with highest priority is triggered. If
several interrupts trigger at once, the one with highest priority is triggered while the remaining ones are
queued. After completing the interrupt handler the interrupt with the second highest priority will be issued and
so on.
The traps are prioritized. If several _exceptions_ occur at once only the one with highest priority is triggered
while all remaining exceptions are ignored. If several _interrupts_ trigger at once, the one with highest priority
is serviced first while the remaining ones are queued. After completing the interrupt handler the interrupt with
the second highest priority will get serviced and so on until no further interrupts are pending.
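
The queueing behavior for simultaneous interrupts can be sketched as a small C model. The priority order used here (lower bit index means higher priority) and the function names are assumptions for illustration only; the actual priority list is defined elsewhere in the datasheet.

```c
#include <stdint.h>

/* Hypothetical model: return the index of the highest-priority pending
 * interrupt, or -1 if nothing is pending.
 * Assumption: a lower bit index means a higher priority. */
static int next_interrupt(uint32_t pending)
{
    for (int i = 0; i < 32; i++) {
        if (pending & (UINT32_C(1) << i)) {
            return i;
        }
    }
    return -1;
}

/* Service all pending interrupts one after another and record the order
 * in which they were taken. Returns the number of serviced interrupts. */
static int drain_interrupts(uint32_t pending, int order[32])
{
    int count = 0;
    int source;
    while ((source = next_interrupt(pending)) >= 0) {
        order[count++] = source;              /* "run" the handler */
        pending &= ~(UINT32_C(1) << source);  /* request is now served */
    }
    return count;
}
```

Each pass picks only one source, so the remaining requests stay queued until the previous handler has completed.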
 
.Trigger Type
[IMPORTANT]
All CPU interrupt request signals are high-level triggered. So an interrupt request will be generated if the
according signal is _high_ for exactly one cycle (being high for several cycles might cause multiple
triggering of the same interrupt).
 
**Memory Access Exceptions**
.Instruction Atomicity
[NOTE]
All instructions execute as atomic operations – interrupts can only trigger between two instructions.
 
If a load operation causes any exception, the destination register is not written at all. Exceptions caused by a
misalignment or a physical memory protection fault do not trigger a bus read-operation at all.
Exceptions caused by a store address misalignment or a store physical memory protection fault do not trigger
a bus write-operation at all.
 
:sectnums:
==== Memory Access Exceptions
 
**Instruction Atomicity**
If a load operation causes any exception, the instruction's destination register is
_not written_ at all. Load exceptions caused by a misalignment or a physical memory protection fault do not
trigger a bus read-operation at all. Exceptions caused by a store address misalignment or a store physical
memory protection fault do not trigger
a bus write-operation at all.
 
All instructions execute as atomic operations – interrupts can only trigger between two instructions.
 
:sectnums:
==== Custom Fast Interrupt Request Lines
 
**Custom Fast Interrupt Request Lines**
 
As a custom extension, the NEORV32 CPU features 16 fast interrupt request lines via the `firq_i` CPU (/Processor) top
As a custom extension, the NEORV32 CPU features 16 fast interrupt request (FIRQ) lines via the `firq_i` CPU top
entity signals. These interrupts have custom configuration and status flags in the `mie` and `mip` CSRs and also
provide custom trap codes in `mcause`.
provide custom trap codes in `mcause`. These FIRQs are reserved for processor-internal usage only.
 
 
**Non-Maskable Interrupt**
:sectnums:
==== Non-Maskable Interrupt
 
The NEORV32 CPU features a single non-maskable interrupt source via the `nm_irq_i` CPU (/Processor) top
entity signal that can be used to signal critical system conditions. This interrupt source _cannot_ be disabled at all (not even in interrupt service routines).
entity signal. This interrupt can be used to signal _critical_ system conditions that need immediate handling.
The non-maskable interrupt _cannot_ be masked/disabled at all (not even in interrupt service routines).
Hence, it does _not_ provide configuration/status flags in the `mie` and `mip` CSRs. The RISC-V-compatible
`mcause` value `0x80000000` is used to indicate the non-maskable interrupt.
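
A trap handler can tell interrupts from exceptions by the most significant bit of `mcause`, which is the RISC-V interrupt flag. The helper names below are hypothetical; only the value `0x80000000` for the non-maskable interrupt is taken from the text above. A minimal sketch:

```c
#include <stdbool.h>
#include <stdint.h>

#define MCAUSE_INTERRUPT_FLAG UINT32_C(0x80000000) /* MSB: 1 = interrupt */
#define MCAUSE_NMI            UINT32_C(0x80000000) /* non-maskable interrupt */

/* True if the trap was caused by an interrupt (asynchronous). */
static bool mcause_is_interrupt(uint32_t mcause)
{
    return (mcause & MCAUSE_INTERRUPT_FLAG) != 0;
}

/* Trap code without the interrupt flag. */
static uint32_t mcause_code(uint32_t mcause)
{
    return mcause & ~MCAUSE_INTERRUPT_FLAG;
}

/* True if the trap was caused by the non-maskable interrupt. */
static bool mcause_is_nmi(uint32_t mcause)
{
    return mcause == MCAUSE_NMI;
}
```

On the target, the `mcause` value would be obtained with a CSR read instruction; here it is passed in as a plain argument so the logic can be checked on any host.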
 
[IMPORTANT]
All CPU/Processor interrupt request signals are triggered when the signal is _high_ for exactly one cycle (being high for several cycles might
cause multiple triggering of the interrupt).
 
 
<<<
/datasheet/cpu_csr.adoc
55,51 → 55,49
 
CSRs with the following notes ...
 
* `C` - have or are a custom CPU extension (that is allowed by the RISC-V specs)
* `R` - are read-only (in contrast to the originally specified r/w capability)
* `S` - have a constrained compatibility; for example not all specified bits are available
* `X`: _custom_ - have or are a custom CPU-specific extension (that is allowed by the RISC-V specs)
* `R`: _read-only_ - are read-only (in contrast to the originally specified r/w capability)
* `C`: _constrained_ - have a constrained compatibility, not all specified bits are implemented
 
.NEORV32 Control and Status Registers (CSRs)
[cols="<4,<6,<11,^3,<11,^3"]
[cols="<4,<7,<10,^3,<11,^3"]
[options="header"]
|=======================
| Address | Name [ASM] | Name [C] | R/W | Function | Note
| Address | Name [ASM] | Name [C] | R/W | Function | Note
6+^| **<<_floating_point_csrs>>**
| 0x001 | <<_fflags>> | _CSR_FFLAGS_ | r/w | Floating-point accrued exceptions |
| 0x002 | <<_frm>> | _CSR_FRM_ | r/w | Floating-point dynamic rounding mode |
| 0x003 | <<_fcsr>> | _CSR_FCSR_ | r/w | Floating-point control and status (`frm` + `fflags`) |
6+^| **<<_machine_trap_setup>>**
| 0x300 | <<_mstatus>> | _CSR_MSTATUS_ | r/w | Machine status register | `S`
| 0x300 | <<_mstatus>> | _CSR_MSTATUS_ | r/w | Machine status register | `C`
| 0x301 | <<_misa>> | _CSR_MISA_ | r/- | Machine CPU ISA and extensions | `R`
| 0x304 | <<_mie>> | _CSR_MIE_ | r/w | Machine interrupt enable register | `C`
| 0x304 | <<_mie>> | _CSR_MIE_ | r/w | Machine interrupt enable register | `X`
| 0x305 | <<_mtvec>> | _CSR_MTVEC_ | r/w | Machine trap-handler base address (for ALL traps) |
| 0x306 | <<_mcounteren>> | _CSR_MCOUNTEREN_ | r/w | Machine counter-enable register | `S`
| 0x306 | <<_mcounteren>> | _CSR_MCOUNTEREN_ | r/w | Machine counter-enable register | `C`
6+^| **<<_machine_trap_handling>>**
| 0x340 | <<_mscratch>> | _CSR_MSCRATCH_ | r/w | Machine scratch register |
| 0x341 | <<_mepc>> | _CSR_MEPC_ | r/w | Machine exception program counter |
| 0x342 | <<_mcause>> | _CSR_MCAUSE_ | r/w | Machine trap cause | `C`
| 0x342 | <<_mcause>> | _CSR_MCAUSE_ | r/w | Machine trap cause | `X`
| 0x343 | <<_mtval>> | _CSR_MTVAL_ | r/- | Machine bad address or instruction | `R`
| 0x344 | <<_mip>> | _CSR_MIP_ | r/- | Machine interrupt pending register | `CR`
| 0x344 | <<_mip>> | _CSR_MIP_ | r/- | Machine interrupt pending register | `XR`
6+^| **<<_machine_physical_memory_protection>>**
| 0x3a0 .. 0x3af | <<_pmpcfg, `pmpcfg0`>> .. <<_pmpcfg, , `pmpcfg15`>> | _CSR_PMPCFG0_ .. _CSR_PMPCFG15_ | r/w | Physical memory protection config. for region 0..63 | `S`
| 0x3a0 .. 0x3af | <<_pmpcfg, `pmpcfg0`>> .. <<_pmpcfg, `pmpcfg15`>> | _CSR_PMPCFG0_ .. _CSR_PMPCFG15_ | r/w | Physical memory protection config. for region 0..63 | `C`
| 0x3b0 .. 0x3ef | <<_pmpaddr, `pmpaddr0`>> .. <<_pmpaddr, `pmpaddr63`>> | _CSR_PMPADDR0_ .. _CSR_PMPADDR63_ | r/w | Physical memory protection addr. register region 0..63 |
6+^| **<<_machine_counters_and_timers>>**
| 0xb00 | <<_mcycleh, `mcycle`>> | _CSR_MCYCLE_ | r/w | Machine cycle counter low word |
| 0xb02 | <<_minstreth, `minstret`>> | _CSR_MINSTRET_ | r/w | Machine instruction-retired counter low word |
| 0xb80 | <<_mcycleh>> | _CSR_MCYCLEH_ | r/w | Machine cycle counter high word |
| 0xb82 | <<_minstreth>> | _CSR_MINSTRETH_ | r/w | Machine instruction-retired counter high word |
| 0xc00 | <<_cycleh, `cycle`>> | _CSR_CYCLE_ | r/- | Cycle counter low word |
| 0xc01 | <<_timeh, `time`>> | _CSR_TIME_ | r/- | System time (from MTIME) low word |
| 0xb00 | <<_mcycleh, `mcycle`>> | _CSR_MCYCLE_ | r/w | Machine cycle counter low word |
| 0xb02 | <<_minstreth, `minstret`>> | _CSR_MINSTRET_ | r/w | Machine instruction-retired counter low word |
| 0xb80 | <<_mcycleh>> | _CSR_MCYCLEH_ | r/w | Machine cycle counter high word |
| 0xb82 | <<_minstreth>> | _CSR_MINSTRETH_ | r/w | Machine instruction-retired counter high word |
| 0xc00 | <<_cycleh, `cycle`>> | _CSR_CYCLE_ | r/- | Cycle counter low word |
| 0xc01 | <<_timeh, `time`>> | _CSR_TIME_ | r/- | System time (from MTIME) low word |
| 0xc02 | <<_instreth, `instret`>> | _CSR_INSTRET_ | r/- | Instruction-retired counter low word |
| 0xc80 | <<_cycleh>> | _CSR_CYCLEH_ | r/- | Cycle counter high word |
| 0xc81 | <<_timeh>> | _CSR_TIMEH_ | r/- | System time (from MTIME) high word |
| 0xc82 | <<_instreth>> | _CSR_INSTRETH_ | r/- | Instruction-retired counter high word |
| 0xc80 | <<_cycleh>> | _CSR_CYCLEH_ | r/- | Cycle counter high word |
| 0xc81 | <<_timeh>> | _CSR_TIMEH_ | r/- | System time (from MTIME) high word |
| 0xc82 | <<_instreth>> | _CSR_INSTRETH_ | r/- | Instruction-retired counter high word |
6+^| **<<_hardware_performance_monitors_hpm>>**
| 0x323 .. 0x33f | <<_mhpmevent, `mhpmevent3`>> .. <<_mhpmevent, `mhpmevent31`>> | _CSR_MHPMEVENT3_ .. _CSR_MHPMEVENT31_ | r/w | Machine performance-monitoring event selector 3..31 | `C`
| 0x323 .. 0x33f | <<_mhpmevent, `mhpmevent3`>> .. <<_mhpmevent, `mhpmevent31`>> | _CSR_MHPMEVENT3_ .. _CSR_MHPMEVENT31_ | r/w | Machine performance-monitoring event selector 3..31 | `X`
| 0xb03 .. 0xb1f | <<_mhpmcounterh, `mhpmcounter3`>> .. <<_mhpmcounterh, `mhpmcounter31`>> | _CSR_MHPMCOUNTER3_ .. _CSR_MHPMCOUNTER31_ | r/w | Machine performance-monitoring counter 3..31 low word |
| 0xb83 .. 0xb9f | <<_mhpmcounterh, `mhpmcounter3h`>> .. <<_mhpmcounterh, `mhpmcounter31h`>> | _CSR_MHPMCOUNTER3H_ .. _CSR_MHPMCOUNTER31H_ | r/w | Machine performance-monitoring counter 3..31 high word |
| 0xc03 .. 0xc1f | <<_hpmcounterh, `hpmcounter3`>> .. <<_hpmcounterh, `hpmcounter31`>> | _CSR_HPMCOUNTER3_ .. _CSR_HPMCOUNTER31_ | r/- | Performance-monitoring counter 3..31 low word |
| 0xc83 .. 0xc9f | <<_hpmcounterh, `hpmcounter3h`>> .. <<_hpmcounter31h, `hpmcounter31h`>> | _CSR_HPMCOUNTER3H_ .. _CSR_HPMCOUNTER31H_ | r/- | Performance-monitoring counter 3..31 high word |
6+^| **<<_machine_counter_setup>>**
| 0x320 | <<_mcountinhibit>> | _CSR_MCOUNTINHIBIT_ | r/w | Machine counter-inhibit register |
6+^| **<<_machine_information_registers>>**
175,8 → 173,8
[cols="4,27,>7"]
[frame="topbot",grid="none"]
|======
| 0x300 | **Machine status register - low word** | `mstatus`
3+| Reset value: _0x00000020.00000000_
| 0x300 | **Machine status register** | `mstatus`
3+| Reset value: _0x00000000_
3+| The `mstatus` CSR is compatible to the RISC-V specifications. It shows the CPU's current execution state.
The following bits are implemented (all remaining bits are always zero and are read-only).
|======
188,7 → 186,7
| Bit | Name [C] | R/W | Function
| 12:11 | _CSR_MSTATUS_MPP_H_ : _CSR_MSTATUS_MPP_L_ | r/w | Previous machine privilege level, 11 = machine (M) level, 00 = user (U) level
| 7 | _CSR_MSTATUS_MPIE_ | r/w | Previous machine global interrupt enable flag state
| 3 | _CSR_MSTATUS_MIE_ | r/w | Machine global interrupt enable flag
| 3 | _CSR_MSTATUS_MIE_ | r/w | Machine global interrupt enable flag
|=======================
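
Using the bit positions from the table above, the effect of a trap entry on `mstatus` can be modeled in C. This is a sketch of the standard RISC-V behavior (`MIE` is saved to `MPIE` and then cleared, the previous privilege level is saved to `MPP`); the macro and function names are hypothetical and do not come from the NEORV32 headers.

```c
#include <stdint.h>

#define MSTATUS_MIE_BIT   3u   /* machine global interrupt enable */
#define MSTATUS_MPIE_BIT  7u   /* previous MIE state */
#define MSTATUS_MPP_LSB   11u  /* previous privilege level, bits 12:11 */
#define PRIV_USER         0u   /* 00 = user (U) level */
#define PRIV_MACHINE      3u   /* 11 = machine (M) level */

/* Hypothetical model of the mstatus update performed on trap entry. */
static uint32_t mstatus_trap_enter(uint32_t mstatus, uint32_t current_priv)
{
    uint32_t mie = (mstatus >> MSTATUS_MIE_BIT) & 1u;

    /* clear MIE, MPIE and MPP before writing the new values */
    mstatus &= ~((UINT32_C(1) << MSTATUS_MIE_BIT) |
                 (UINT32_C(1) << MSTATUS_MPIE_BIT) |
                 (UINT32_C(3) << MSTATUS_MPP_LSB));

    mstatus |= mie << MSTATUS_MPIE_BIT;                 /* MPIE <= MIE */
    mstatus |= (current_priv & 3u) << MSTATUS_MPP_LSB;  /* MPP <= privilege */
    return mstatus;                                     /* MIE stays cleared */
}
```

Clearing `MIE` on entry is what prevents a second interrupt from preempting the handler unless software re-enables interrupts explicitly.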
 
When entering an exception/interrupt, the `MIE` flag is copied to `MPIE` and cleared afterwards. When leaving
228,7 → 226,7
|=======================
 
[TIP]
Information regarding the available RISC-V Z* _sub-extensions_ (like `Zicsr` or `Zfinx`) can be found in the <<_mzext>> CSR.
Information regarding the implemented RISC-V `Z*` _sub-extensions_ (like `Zicsr` or `Zfinx`) can be found in the <<_mzext>> CSR.
 
 
:sectnums!:
290,9 → 288,8
3+| Reset value: _UNDEFINED_
3+| The `mcounteren` CSR is compatible to the RISC-V specifications. The bits of this CSR define which
counter/timer CSR can be accessed (read) from code running in less-privileged modes. For example,
if user-level code tries to read from a counter/timer CSR without having access, the illegal instruction
exception is raised. The following table shows all implemented bits (all remaining bits are always zero and
are read-only). If user mode is not implemented (_CPU_EXTENSION_RISCV_U_ = _false_) all bits of the
if user-level code tries to read from a counter/timer CSR without enabled access, an illegal instruction
exception is raised. If user mode is not implemented (_CPU_EXTENSION_RISCV_U_ = _false_) all bits of the
`mcounteren` CSR are tied to zero.
|======
 
301,7 → 298,7
[options="header",grid="rows"]
|=======================
| Bit | Name [C] | R/W | Function
| 31:16 | _CSR_MCOUNTEREN_HPM31_ : _CSR_MCOUNTEREN_HPM3_ | r/w | User-level code is allowed to read `hpmcounter*[h]` CSRs when set
| 31:16 | - | r/- | User-level code is **not** allowed to read HPM counter
| 2 | _CSR_MCOUNTEREN_IR_ | r/w | User-level code is allowed to read `instret[h]` CSRs when set
| 1 | _CSR_MCOUNTEREN_TM_ | r/w | User-level code is allowed to read `time[h]` CSRs when set
| 0 | _CSR_MCOUNTEREN_CY_ | r/w | User-level code is allowed to read `cycle[h]` CSRs when set
481,25 → 478,24
==== (Machine) Counters and Timers
 
[IMPORTANT]
The _CPU_CNT_WIDTH_ generic defines the total size of the CPU's `[m]cycle` and `[m]instret`
The <<_cpu_cnt_width>> generic defines the total size of the CPU's <<_cycleh>> and <<_instreth>>
/ <<_mcycleh>> and <<_minstreth>>
counter CSRs (low and high words combined); the time CSRs are not affected by this generic. Any
configuration with _CPU_CNT_WIDTH_ less than 64 is not RISC-V compliant.
configuration with <<_cpu_cnt_width>> less than 64 is not RISC-V compliant.
 
[IMPORTANT]
If _CPU_CNT_WIDTH_ is less than 64 (the default value) and greater than or equal to 32, the according
MSBs of `[m]cycleh` and `[m]instreth` are read-only and always read as zero. This configuration
will also set the _ZXSCNT_ flag in the `mzext` CSR.
 
[IMPORTANT]
will also set the _ZXSCNT_ flag in the <<_mzext>> CSR. +
+
If _CPU_CNT_WIDTH_ is less than 32 and greater than 0, the `[m]cycleh` and `[m]instreth` do not
exist and any access will raise an illegal instruction exception. Furthermore, the according MSBs of
`[m]cycle` and `[m]instret` are read-only and always read as zero. This configuration will also
set the _ZXSCNT_ flag in the `mzext` CSR.
 
[IMPORTANT]
If _CPU_CNT_WIDTH_ is 0, the `[m]cycleh`, `[m]cycle`, `[m]instreth` and `[m]instret` do not
set the _ZXSCNT_ flag in the <<_mzext>> CSR. +
+
If _CPU_CNT_WIDTH_ is 0, <<_cycleh>> and <<_instreth>> / <<_mcycleh>> and <<_minstreth>> do not
exist and any access will raise an illegal instruction exception. This configuration will also set the
_ZXNOCNT_ flag in the `mzext` CSR.
_ZXNOCNT_ flag in the <<_mzext>> CSR.
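
The three width cases can be summarized as bit masks over the low and high counter words. The following C sketch is a host-side illustration; the function names are hypothetical and the `width` argument stands for the value of the _CPU_CNT_WIDTH_ generic.

```c
#include <assert.h>
#include <stdint.h>

/* Mask of the implemented bits of a counter that is `width` bits wide
 * (low and high word combined). */
static uint64_t counter_mask(unsigned width)
{
    assert(width <= 64);
    if (width == 0) {
        return 0;           /* no counter bits at all */
    }
    if (width == 64) {
        return UINT64_MAX;  /* avoid an undefined 64-bit shift */
    }
    return (UINT64_C(1) << width) - 1;
}

/* Implemented bits of the low word CSR; all other bits read as zero. */
static uint32_t counter_low_mask(unsigned width)
{
    return (uint32_t)counter_mask(width);
}

/* Implemented bits of the high word CSR; zero for widths up to 32. */
static uint32_t counter_high_mask(unsigned width)
{
    return (uint32_t)(counter_mask(width) >> 32);
}
```

For example, a width of 40 leaves the low word fully implemented and only the 8 least significant bits of the high word. The masks only describe which bits carry counter state; whether an access to a missing CSR raises an exception is determined by the rules above.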
 
 
:sectnums!:
580,19 → 576,24
:sectnums:
==== Hardware Performance Monitors (HPM)
 
The available hardware performance logic is configured via the _HPM_NUM_CNTS_ top entity generic.
_HPM_NUM_CNTS_ defines the number of implemented performance monitors and thus, the availability of the
according `[m]hpmcounter*[h]` and `mhpmevent*` CSRs.
The available hardware performance logic is configured via the <<_hpm_num_cnts>> top entity generic,
which defines the number of implemented performance monitors and thus, the availability of the
according `mhpmcounter*[h]` and `mhpmevent*` CSRs.
 
The total size of the HPMs can be configured before synthesis via the _HPM_CNT_WIDTH_ generic (0..64-bit).
[IMPORTANT]
The HPM system only implements machine-mode access. Hence, `hpmcounter*[h]` CSRs are not implemented
and any access (even) from machine mode will raise an exception. Furthermore, the according bits of <<_mcounteren>>
used to configure user-mode access to `hpmcounter*[h]` are hard-wired to zero.
 
The total counter size of the HPMs can be configured before synthesis via the <<_hpm_cnt_width>> generic (0..64-bit).
 
[TIP]
If trying to access an HPM-related CSR beyond _HPM_NUM_CNTS_ **no illegal instruction exception is
If trying to access an HPM-related CSR beyond <<_hpm_num_cnts>> **no illegal instruction exception is
triggered**. The according CSRs are read-only (writes are ignored) and always return zero.
 
[NOTE]
The total LSB-aligned HPM counter size (low word CSR + high word CSR) is defined via the
_HPM_CNT_WIDTH_ generic (0..64-bit). If _HPM_CNT_WIDTH_ is less than 64, all unused MSB-aligned
<<_hpm_cnt_width>> generic (0..64-bit). If <<_hpm_cnt_width>> is less than 64, all unused MSB-aligned
bits are hardwired to zero.
 
 
620,42 → 621,26
[cols="^1,<3,^1,<5"]
[options="header",grid="rows"]
|=======================
| Bit | Name [C] | R/W | Event
| 0 | _HPMCNT_EVENT_CY_ | r/w | active clock cycle (not in sleep)
| 1 | - | r/- | _not implemented, always read as zero_
| 2 | _HPMCNT_EVENT_IR_ | r/w | retired instruction
| 3 | _HPMCNT_EVENT_CIR_ | r/w | retired cmpressed instruction
| Bit | Name [C] | R/W | Event
| 0 | _HPMCNT_EVENT_CY_ | r/w | active clock cycle (not in sleep)
| 1 | - | r/- | _not implemented, always read as zero_
| 2 | _HPMCNT_EVENT_IR_ | r/w | retired instruction
| 3 | _HPMCNT_EVENT_CIR_ | r/w | retired compressed instruction
| 4 | _HPMCNT_EVENT_WAIT_IF_ | r/w | instruction fetch memory wait cycle (if more than 1 cycle memory latency)
| 5 | _HPMCNT_EVENT_WAIT_II_ | r/w | instruction issue pipeline wait cycle (if more than 1 cycle latency), caused by pipeline flushes (like taken branches)
| 6 | _HPMCNT_EVENT_WAIT_MC_ | r/w | multi-cycle ALU operation wait cycle
| 7 | _HPMCNT_EVENT_LOAD_ | r/w | load operation
| 8 | _HPMCNT_EVENT_STORE_ | r/w | store operation
| 7 | _HPMCNT_EVENT_LOAD_ | r/w | load operation
| 8 | _HPMCNT_EVENT_STORE_ | r/w | store operation
| 9 | _HPMCNT_EVENT_WAIT_LS_ | r/w | load/store memory wait cycle (if more than 1 cycle memory latency)
| 10 | _HPMCNT_EVENT_JUMP_ | r/w | unconditional jump
| 11 | _HPMCNT_EVENT_BRANCH_ | r/w | conditional branch (taken or not taken)
| 10 | _HPMCNT_EVENT_JUMP_ | r/w | unconditional jump
| 11 | _HPMCNT_EVENT_BRANCH_ | r/w | conditional branch (taken or not taken)
| 12 | _HPMCNT_EVENT_TBRANCH_ | r/w | taken conditional branch
| 13 | _HPMCNT_EVENT_TRAP_ | r/w | entered trap
| 13 | _HPMCNT_EVENT_TRAP_ | r/w | entered trap
| 14 | _HPMCNT_EVENT_ILLEGAL_ | r/w | illegal instruction exception
|=======================
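
Since each event is selected by its own bit, several events can be combined in one `mhpmevent*` value. The sketch below defines the bit numbers locally (mirroring the names in the table above rather than including the NEORV32 headers) and uses a hypothetical helper `hpm_event_bit` to build a selector value.

```c
#include <assert.h>
#include <stdint.h>

/* Event bit numbers, copied from the table above (local definitions). */
enum hpm_event {
    HPMCNT_EVENT_CY      = 0,
    HPMCNT_EVENT_IR      = 2,
    HPMCNT_EVENT_CIR     = 3,
    HPMCNT_EVENT_WAIT_IF = 4,
    HPMCNT_EVENT_WAIT_II = 5,
    HPMCNT_EVENT_WAIT_MC = 6,
    HPMCNT_EVENT_LOAD    = 7,
    HPMCNT_EVENT_STORE   = 8,
    HPMCNT_EVENT_WAIT_LS = 9,
    HPMCNT_EVENT_JUMP    = 10,
    HPMCNT_EVENT_BRANCH  = 11,
    HPMCNT_EVENT_TBRANCH = 12,
    HPMCNT_EVENT_TRAP    = 13,
    HPMCNT_EVENT_ILLEGAL = 14
};

/* Hypothetical helper: turn an event number into its selector bit. */
static uint32_t hpm_event_bit(enum hpm_event event)
{
    assert((unsigned)event <= 14u);
    return UINT32_C(1) << (unsigned)event;
}
```

A counter that should count all memory accesses could, for instance, use `hpm_event_bit(HPMCNT_EVENT_LOAD) | hpm_event_bit(HPMCNT_EVENT_STORE)` as its event selector value.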
 
 
:sectnums!:
===== **`hpmcounter[h]`**
 
[cols="4,27,>7"]
[frame="topbot",grid="none"]
|======
| 0xc03 - 0xc1f | **Hardware performance monitor - counter low** | `hpmcounter3` - `hpmcounter31`
| 0xc83 - 0xc9f | **Hardware performance monitor - counter high** | `hpmcounter3h` - `hpmcounter31h`
3+| Reset value: _UNDEFINED_
3+| The `hpmcounter*[h]` CSRs are compatible to the RISC-V specifications. These CSRs provide the lower/upper 32-bit
of arbitrary event counters (64-bit). These CSRs are read-only and provide a shadowed copy of the according
`mhpmcounter*[h]` CSRs. The event(s) that trigger an increment of these counters are selected via the according
`mhpmevent*` CSRs.
|======
 
 
:sectnums!:
===== **`mhpmcounter[h]`**
 
[cols="4,27,>7"]
665,9 → 650,8
| 0xb83 - 0xb9f | **Machine hardware performance monitor - counter high** | `mhpmcounter3h` - `mhpmcounter31h`
3+| Reset value: _UNDEFINED_
3+| The `mhpmcounter*[h]` CSRs are compatible to the RISC-V specifications. These CSRs provide the lower/upper 32-
bit of arbitrary event counters (64-bit). The `mhpmcounter*[h]` CSRs can also be written and are copied to the
`hpmcounter*[h]` CSRs. The event(s) that trigger an increment of these counters are selected via the according
`mhpmevent*` CSRs.
bit of arbitrary event counters. The event(s) that trigger an increment of these counters are selected via the according
`mhpmevent*` CSR bits.
|======
 
 
782,12 → 766,13
[options="header",grid="rows"]
|=======================
| Bit | Name [C] | R/W | Event
| 0 | _CPU_MZEXT_ZICSR_ | r/- | `Zicsr` extensions available (enabled via _CPU_EXTENSION_RISCV_Zicsr_ generic)
| 1 | _CPU_MZEXT_ZIFENCEI_ | r/- | `Zifencei` extensions available (enabled via _CPU_EXTENSION_RISCV_Zifencei_ generic)
| 5 | _CPU_MZEXT_ZFINX_ | r/- | `Zfinx` extensions available (enabled via _CPU_EXTENSION_RISCV_Zfinx_ generic)
| 6 | _CPU_MZEXT_ZXSCNT_ | r/- | custom extension: "Small CPU counters": `cycle[h]` & `instret[h]` CSRs have less than 64-bit when set (when _CPU_CNT_WIDTH_ generic is less than 64)
| 7 | _CPU_MZEXT_ZXNOCNT_ | r/- | custom extension: "NO CPU counters": `cycle[h]` & `instret[h]` CSRs are not available at all when set (when _CPU_CNT_WIDTH_ generic is 0)
| 8 | _CSR_MZEXT_PMP_ | r/- | PMP (physical memory protection) extension available (_PMP_NUM_REGIONS_ generic > 0)
| 9 | _CSR_MZEXT_HPM_ | r/- | HPM (hardware performance monitors) extension available (_HPM_NUM_CNTS_ generic > 0)
| 10 | _CSR_MZEXT_DEBUGMODE_ | r/- | RISC-V "CPU debug mode" extension available (enabled via _CPU_EXTENSION_RISCV_DEBUG_ generic)
| 0 | _CPU_MZEXT_ZICSR_ | r/- | `Zicsr` extensions available (enabled via <<_cpu_extension_riscv_zicsr>> generic)
| 1 | _CPU_MZEXT_ZIFENCEI_ | r/- | `Zifencei` extensions available (enabled via <<_cpu_extension_riscv_zifencei>> generic)
| 2 | _CPU_MZEXT_ZMMUL_ | r/- | `Zmmul` extensions available (enabled via <<_cpu_extension_riscv_zmmul>> generic)
| 5 | _CPU_MZEXT_ZFINX_ | r/- | `Zfinx` extensions available (enabled via <<_cpu_extension_riscv_zfinx>> generic)
| 6 | _CPU_MZEXT_ZXSCNT_ | r/- | custom extension: "Small CPU counters": `cycle[h]` & `instret[h]` CSRs have less than 64-bit when set (when <<_cpu_cnt_width>> generic is less than 64)
| 7 | _CPU_MZEXT_ZXNOCNT_ | r/- | custom extension: "NO CPU counters": `cycle[h]` & `instret[h]` CSRs are not available at all when set (when <<_cpu_cnt_width>> generic is 0)
| 8 | _CSR_MZEXT_PMP_ | r/- | PMP (physical memory protection) extension available (<<_pmp_num_regions>> generic > 0)
| 9 | _CSR_MZEXT_HPM_ | r/- | HPM (hardware performance monitors) extension available (<<_hpm_num_cnts>> generic > 0)
| 10 | _CSR_MZEXT_DEBUGMODE_ | r/- | RISC-V "CPU debug mode" extension available (enabled via <<_cpu_top_entity_generics,_CPU_EXTENSION_RISCV_DEBUG_>> generic)
|=======================
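
Software can use the flags listed above to adapt to the actual hardware configuration at run time. The following C sketch decodes a raw `mzext` value; the bit numbers are taken from the table, while the enum and function names are local, hypothetical definitions rather than the NEORV32 header names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit numbers of the mzext flags, copied from the table above. */
enum mzext_flag {
    MZEXT_ZICSR     = 0,
    MZEXT_ZIFENCEI  = 1,
    MZEXT_ZMMUL     = 2,
    MZEXT_ZFINX     = 5,
    MZEXT_ZXSCNT    = 6,
    MZEXT_ZXNOCNT   = 7,
    MZEXT_PMP       = 8,
    MZEXT_HPM       = 9,
    MZEXT_DEBUGMODE = 10
};

/* Hypothetical helper: true if the given flag is set in a raw mzext value. */
static bool mzext_has(uint32_t mzext, enum mzext_flag flag)
{
    return ((mzext >> (unsigned)flag) & 1u) != 0;
}
```

On the target, the raw value would come from a CSR read of `mzext`; a program could then, for example, skip its HPM setup when `mzext_has(value, MZEXT_HPM)` is false.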
/datasheet/index.adoc
1,18 → 1,9
= The NEORV32 RISC-V Processor: Datasheet
include::../attrs.adoc[]
:title: [Datasheet] The NEORV32 RISC-V Processor
:author: Dipl.-Ing. Stephan Nolting
:email: stnolting@gmail.com
:description: A size-optimized, customizable and open-source full-scale 32-bit RISC-V soft-core CPU and SoC written in platform-independent VHDL.
:revnumber: v1.5.6.0
:doctype: book
:sectnums:
:icons: font
:imagesdir: img
:stem:
:reproducible:
:listing-caption: Listing
:toc: left
:toclevels: 4
:title-logo-image: neorv32_logo_dark.png[pdfwidth=6.25in,align=center]
:favicon: img/icon.png
 
/datasheet/main.adoc
1,21 → 1,6
= The NEORV32 RISC-V Processor: Datasheet
:author: Dipl.-Ing. Stephan Nolting
:email: stnolting@gmail.com
:description: A size-optimized, customizable and open-source full-scale 32-bit RISC-V soft-core CPU and SoC written in platform-independent VHDL.
:revnumber: v1.5.6.0
:doctype: book
:sectnums:
:icons: image
:iconsdir: ../icons
:imagesdir: ../figures
:stem:
:reproducible:
:listing-caption: Listing
:toc: macro
:toclevels: 4
:title-logo-image: image:neorv32_logo_dark.png[pdfwidth=6.25in,align=center]
// Uncomment next line to set page size (default is A4)
//:pdf-page-size: Letter
include::../attrs.adoc[]
include::../attrs.main.adoc[]
 
 
<<<
/datasheet/on_chip_debugger.adoc
16,6 → 16,11
* compatible to the https://github.com/riscv/riscv-openocd[RISC-V port of OpenOCD];
pre-built binaries can be obtained for example from https://www.sifive.com/software[SiFive]
 
.OCD Security Note
[IMPORTANT]
Access via the OCD is _always authenticated_ (`dmstatus.authenticated` == `1`). Hence, the
_whole system_ can always be accessed via the on-chip debugger.
 
[NOTE]
The OCD requires additional resources for implementation and _might_ also increase the critical path resulting in less
performance. If the OCD is not really required for the _final_ implementation, it can be disabled and thus,
65,7 → 70,7
The debug transport module (VHDL module: `rtl/core/neorv32_debug_dtm.vhd`) provides a JTAG test access port (TAP).
The DTM is the first entity in the debug system, which connects an external debugger via JTAG to the next debugging
entity: the debug module (DM).
External access is provided by the following top-level ports.
External JTAG access is provided by the following top-level ports.
 
.JTAG top level signals
[cols="^2,^2,^2,<8"]
/datasheet/overview.adoc
1,11 → 1,6
:sectnums:
== Overview
 
[quote]
____
RISC-V - Instruction Sets Want To Be Free!
____
 
The NEORV32footnote:[Pronounced "neo-R-V-thirty-two" or "neo-risc-five-thirty-two" in its long form.] is an open-source
RISC-V compatible processor system that is intended as a *ready-to-go* auxiliary processor within larger SoC
designs or as a stand-alone custom / customizable microcontroller.
21,61 → 16,114
default toolchain (https://github.com/stnolting/riscv-gcc-prebuilt[prebuilt toolchains are also provided]).
 
[TIP]
Check out the processor's **https://stnolting.github.io/neorv32/ug[online User Guide]**
that provides hands-on tutorials to get you started.
 
[TIP]
The project's change log is available in https://github.com/stnolting/neorv32/blob/master/CHANGELOG.md[CHANGELOG.md]
in the root directory of the NEORV32 repository. Please also check out the <<_legal>> section.
 
 
**Structure**
 
:sectnums!:
=== Structure
* <<_neorv32_processor_soc>>
* <<_neorv32_central_processing_unit_cpu>>
* <<_on_chip_debugger_ocd>>
* <<_software_framework>>
 
Chapter <<_neorv32_processor_soc>>
[TIP]
Links in this document are <<_overview,highlighted>>.
 
* top entity signals and configuration generics, address space layout, internal peripheral devices and interrupts, internal
memories and caches, internal bus architecture, external bus interface
 
Chapter <<_neorv32_central_processing_unit_cpu>>
 
* instruction set(s) and extensions, instruction timing, control ans status registers, traps, exceptions and interrupts,
hardware execution safety, native bus interface
<<<
// ####################################################################################################################
:sectnums:
=== Rationale
 
Chapter <<_on_chip_debugger_ocd>>
**Why did you make this?**
 
* on-chip debugging compatible to the "Minimal RISC-V Debug Specification Version 0.13.2".
I am fascinated by processor and CPU architecture design: it is the magic frontier where software meets hardware.
This project has started as something like a _journey_ into this magic realm to understand how things actually work
down on this very low level.
 
Chapter <<_software_framework>>
But there is more! When I started to dive into the emerging RISC-V ecosystem I felt overwhelmed by the complexity.
As a beginner it is hard to get an overview - especially when you want to set up a minimal platform to tinker with:
Which core to use? How to get the right toolchain? What features do I need? How does the booting work? How do I
create an actual executable? How to get that into the hardware? How to customize things? **_Where to start???_**
 
* core libraries, bootloader, makefiles, runtime environment
So this project aims to provide a _simple to understand_ and _easy to use_ yet _powerful_ and _flexible_ platform
that targets FPGA and RISC-V beginners as well as advanced users. Join me and us on this journey! 🙃
 
Chapter <<_lets_get_it_started>>
 
* toolchain installation and setup, hardware setup, software setup, application compilation, simulating the processor
debugging using the on-chip debugger
**Why a _soft_-core processor?**
 
[TIP]
Links in this document are <<_structure,highlighted>>.
As a matter of fact, soft-core processors _cannot_ compete with discrete or FPGA hard-macro processors in terms
of performance, energy and size. But they do fill a niche in FPGA design space. For example, soft-core processors
allow implementing the _control flow part_ of certain applications (like communication protocol handling) using
software like plain C. This provides high flexibility as software can be easily changed, re-compiled and
re-uploaded again.
 
Furthermore, the concept of flexibility applies to all aspects of a soft-core processor. The user can add
_exactly_ the features that are required by the application: additional memories, custom interfaces, specialized
IP and even user-defined instructions.
 
 
<<<
**Why RISC-V?**
 
[quote, RISC-V International, https://riscv.org/about/]
____
RISC-V is a free and open ISA enabling a new era of processor innovation through open standard collaboration.
____
 
I love the idea of open-source. **Knowledge can help best if it is freely available.**
While open-source has already become quite popular in _software_, hardware projects still need to catch up.
Admittedly, there has been quite a development, but mainly in terms of _platforms_ and _applications_ (so
schematics, PCBs, etc.). Although processors and CPUs are the heart of almost every digital system, having a true
open-source silicon is still a rarity. RISC-V aims to change that. Even if it is _just one approach_, it helps pave
the road for future development.
 
Furthermore, I welcome the community aspect of RISC-V. The ISA and everything beyond is developed with direct
contact to the community: this includes businesses and professionals but also hobbyists, amateurs and people
that are just curious. Everyone can join discussions and contribute to RISC-V in their very own way.
 
Finally, I really like the RISC-V ISA itself. It aims to be a clean, orthogonal and "intuitive" ISA that
reflects the basic concepts of _RISC_: simple yet effective.
 
 
**Yet another RISC-V core? What makes it special?**
 
The NEORV32 is not based on another RISC-V core. It was built entirely from the ground up (just following the official
ISA specs) having a different design goal in mind. The project does not intend to replace certain RISC-V cores or
just beat existing ones like https://github.com/SpinalHDL/VexRiscv[VexRiscv] in terms of performance or
https://github.com/olofk/serv[SERV] in terms of size.
 
The project aims to provide _another option_ in the RISC-V / soft-core design space with a different performance
vs. size trade-off and a different focus: _embrace_ concepts like documentation, platform-independence / portability,
RISC-V compatibility, _customization_ and _ease of use_. See the <<_project_key_features>> below.
 
 
// ####################################################################################################################
:sectnums:
=== Project Key Features
 
* **NEORV32 CPU**: 32-bit `rv32i` RISC-V CPU - passes the official RISC-V architecture tests
* official https://github.com/riscv/riscv-isa-manual/blob/master/marchid.md[RISC-V open source architecture ID]
* optional RISC-V CPU extensions:
** `A` - atomic memory access operations
** `B` - bit-manipulation instructions
** `C` - 16-bit compressed instructions
** `E` - embedded CPU version (reduced register file size)
** `M` - integer multiplication and division hardware
** `U` - less-privileged _user_ mode
** `Zfinx` - single-precision floating-point unit
** `Zicsr` - control and status register access (privileged architecture)
** `Zifencei` - instruction stream synchronization
** `PMP` - physical memory protection
** `HPM` - hardware performance monitors
* open-source and documented; including user guides to get started
* completely described in behavioral, platform-independent VHDL (yet platform-optimized modules are provided)
* fully synchronous design, no latches, no gated clocks
* small hardware footprint and high operating frequency for easy integration
* **NEORV32 CPU**: 32-bit `rv32i` RISC-V CPU
** RISC-V compatibility: passes the official architecture tests
** base architecture + privileged architecture (optional) + ISA extensions (optional)
** rich set of customization options (ISA extensions, design goal: performance / area (/ energy), ...)
** official https://github.com/riscv/riscv-isa-manual/blob/master/marchid.md[RISC-V open source architecture ID]
* **NEORV32 Processor (SoC)**: highly-configurable full-scale microcontroller-like processor system
** based on the NEORV32 CPU
** optional serial interfaces (UARTs, TWI, SPI)
** optional timers and counters (WDT, MTIME)
** optional general purpose IO and PWM and native NeoPixel (c) compatible smart LED interface
** optional embedded memories / caches for data, instructions and bootloader
** optional external memory interface (Wishbone / AXI4-Lite) and stream link interface (AXI4-Stream) for custom connectivity
** on-chip debugger compatible with OpenOCD and gdb
* **Software framework**
** GCC-based toolchain - prebuilt toolchains available; application compilation based on GNU makefiles
** internal bootloader with serial user interface
83,18 → 131,12
** runtime environment and several example programs
** doxygen-based documentation of the software framework; a deployed version is available at https://stnolting.github.io/neorv32/sw/files.html
** FreeRTOS port + demos available
* **NEORV32 Processor**: highly-configurable full-scale microcontroller-like processor system / SoC based on the NEORV32 CPU with optional standard peripherals:
** serial interfaces (UARTs, TWI, SPI)
** timers and counters (WDT, MTIME, NCO)
** general purpose IO and PWM and native NeoPixel (c) compatible smart LED interface
** embedded memories / caches for data, instructions and bootloader
** external memory interface (Wishbone or AXI4-Lite)
* on-chip debugger compatible with OpenOCD and gdb
* fully synchronous design, no latches, no gated clocks
* completely described in behavioral, platform-independent VHDL
* small hardware footprint and high operating frequency
 
[TIP]
For more in-depth details regarding the features provided by the hardware, see the corresponding sections:
<<_neorv32_central_processing_unit_cpu>> and <<_neorv32_processor_soc>>.
 
 
<<<
// ####################################################################################################################
:sectnums:
101,23 → 143,24
=== Project Folder Structure
 
...................................
neorv32 - Project home folder
neorv32 - Project home folder
├.ci - Scripts for continuous integration
├boards - Example setups for various FPGA boards
├setups - Example setups for various FPGA boards and toolchains
│└...
├CHANGELOG.md - Project change log
├docs - Project documentation
│├doxygen_build - Software framework documentation (generated by doxygen)
│├src_adoc - AsciiDoc sources for this document
│├references - Data sheets and RISC-V specs.
│└figures - Figures and logos
│├doxygen_build - Software framework documentation (generated by doxygen)
│├src_adoc - AsciiDoc sources for this document
│├references - Data sheets and RISC-V specs.
│└figures - Figures and logos
├riscv-arch-test - Port files for the official RISC-V architecture tests
├rtl - VHDL sources
│├core - Sources of the CPU & SoC
│└top_templates - Alternate/additional top entities/wrappers
│├core - Sources of the CPU & SoC
│└templates - Alternate/additional top entities/wrappers
│ ├processor - Processor wrappers
│ └system - System wrappers for advanced connectivity
├sim - Simulation files
│├ghdl - Simulation scripts for GHDL
│├rtl_modules - Processor modules for simulation-only
│└vivado - Pre-configured Xilinx ISIM waveform
│└rtl_modules - Processor modules for simulation-only
└sw - Software framework
├bootloader - Sources and scripts for the NEORV32 internal bootloader
├common - Linker script and crt0.S start-up code
149,39 → 192,44
files, like alternative top entities, can be assigned to any library.
 
...................................
neorv32_top.vhd - NEORV32 Processor top entity
├neorv32_boot_rom.vhd - Bootloader ROM
│└neorv32_bootloader_image.vhd - Bootloader boot ROM memory image
├neorv32_busswitch.vhd - Processor bus switch for CPU buses (I&D)
├neorv32_bus_keeper.vhd - Processor-internal bus monitor
├neorv32_icache.vhd - Processor-internal instruction cache
├neorv32_cfs.vhd - Custom functions subsystem
├neorv32_cpu.vhd - NEORV32 CPU top entity
│├neorv32_package.vhd - Processor/CPU main VHDL package file
│├neorv32_cpu_alu.vhd - Arithmetic/logic unit
│├neorv32_cpu_bus.vhd - Bus interface unit + physical memory protection
│├neorv32_cpu_control.vhd - CPU control, exception/IRQ system and CSRs
││└neorv32_cpu_decompressor.vhd - Compressed instructions decoder
│├neorv32_cpu_cp_fpu.vhd - Floating-point co-processor (Zfinx extension)
│├neorv32_cpu_cp_muldiv.vhd - Mul/Div co-processor (M extension)
│└neorv32_cpu_regfile.vhd - Data register file
├neorv32_debug_dm.vhd - on-chip debugger: debug module
├neorv32_debug_dtm.vhd - on-chip debugger: debug transfer module
├neorv32_dmem.vhd - Processor-internal data memory
├neorv32_gpio.vhd - General purpose input/output port unit
├neorv32_imem.vhd - Processor-internal instruction memory
│└neor32_application_image.vhd - IMEM application initialization image
├neorv32_mtime.vhd - Machine system timer
├neorv32_nco.vhd - Numerically-controlled oscillator
├neorv32_neoled.vhd - NeoPixel (TM) compatible smart LED interface
├neorv32_pwm.vhd - Pulse-width modulation controller
├neorv32_spi.vhd - Serial peripheral interface controller
├neorv32_sysinfo.vhd - System configuration information memory
├neorv32_trng.vhd - True random number generator
├neorv32_twi.vhd - Two wire serial interface controller
├neorv32_uart.vhd - Universal async. receiver/transmitter
├neorv32_wdt.vhd - Watchdog timer
└neorv32_wb_interface.vhd - External (Wishbone) bus interface
neorv32_top.vhd - NEORV32 Processor top entity
├neorv32_fifo.vhd - General purpose FIFO component
├neorv32_package.vhd - Processor/CPU main VHDL package file
├neorv32_cpu.vhd - NEORV32 CPU top entity
│├neorv32_cpu_alu.vhd - Arithmetic/logic unit
││├neorv32_cpu_cp_fpu.vhd - Floating-point co-processor (Zfinx ext.)
││├neorv32_cpu_cp_muldiv.vhd - Mul/Div co-processor (M extension)
││└neorv32_cpu_cp_shifter.vhd - Bit-shift co-processor
│├neorv32_cpu_bus.vhd - Bus interface + physical memory protection
│├neorv32_cpu_control.vhd - CPU control, exception/IRQ system and CSRs
││└neorv32_cpu_decompressor.vhd - Compressed instructions decoder
│└neorv32_cpu_regfile.vhd - Data register file
├neorv32_boot_rom.vhd - Bootloader ROM
│└neorv32_bootloader_image.vhd - Bootloader boot ROM memory image
├neorv32_busswitch.vhd - Processor bus switch for CPU buses (I&D)
├neorv32_bus_keeper.vhd - Processor-internal bus monitor
├neorv32_icache.vhd - Processor-internal instruction cache
├neorv32_cfs.vhd - Custom functions subsystem
├neorv32_debug_dm.vhd - on-chip debugger: debug module
├neorv32_debug_dtm.vhd - on-chip debugger: debug transfer module
├neorv32_dmem.vhd - Processor-internal data memory
├neorv32_gpio.vhd - General purpose input/output port unit
├neorv32_imem.vhd - Processor-internal instruction memory
│└neorv32_application_image.vhd - IMEM application initialization image
├neorv32_mtime.vhd - Machine system timer
├neorv32_neoled.vhd - NeoPixel (TM) compatible smart LED interface
├neorv32_pwm.vhd - Pulse-width modulation controller
├neorv32_spi.vhd - Serial peripheral interface controller
├neorv32_sysinfo.vhd - System configuration information memory
├neorv32_trng.vhd - True random number generator
├neorv32_twi.vhd - Two wire serial interface controller
├neorv32_uart.vhd - Universal async. receiver/transmitter
├neorv32_wdt.vhd - Watchdog timer
├neorv32_wishbone.vhd - External (Wishbone) bus interface
└neorv32_xirq.vhd - External interrupt controller
...................................
 
 
208,15 → 256,15
[options="header",grid="rows"]
|=======================
| CPU | LEs | FFs | MEM bits | DSPs | _f~max~_
| `rv32i` | 980 | 409 | 1024 | 0 | 123 MHz
| `rv32i_Zicsr` | 1835 | 856 | 1024 | 0 | 124 MHz
| `rv32im_Zicsr` | 2443 | 1134 | 1024 | 0 | 124 MHz
| `rv32i` | 980 | 409 | 1024 | 0 | 125 MHz
| `rv32i_Zicsr` | 1835 | 856 | 1024 | 0 | 125 MHz
| `rv32im_Zicsr` | 2443 | 1134 | 1024 | 0 | 125 MHz
| `rv32imc_Zicsr` | 2669 | 1149 | 1024 | 0 | 125 MHz
| `rv32imac_Zicsr` | 2685 | 1156 | 1024 | 0 | 124 MHz
| `rv32imac_Zicsr` + `debug_mode` | 3058 | 1225 | 1024 | 0 | 120 MHz
| `rv32imac_Zicsr` + `u` | 2698 | 1162 | 1024 | 0 | 124 MHz
| `rv32imac_Zicsr_Zifencei` + `u` | 2715 | 1162 | 1024 | 0 | 122 MHz
| `rv32imac_Zicsr_Zifencei_Zfinx` + `u` | 4004 | 1812 | 1024 | 7 | 121 MHz
| `rv32imac_Zicsr` | 2685 | 1156 | 1024 | 0 | 125 MHz
| `rv32imac_Zicsr` + `debug_mode` | 3058 | 1225 | 1024 | 0 | 125 MHz
| `rv32imac_Zicsr` + `u` | 2698 | 1162 | 1024 | 0 | 125 MHz
| `rv32imac_Zicsr_Zifencei` + `u` | 2715 | 1162 | 1024 | 0 | 125 MHz
| `rv32imac_Zicsr_Zifencei_Zfinx` + `u` | 4004 | 1812 | 1024 | 7 | 118 MHz
|=======================
 
 
226,7 → 274,7
[cols="<2,<8"]
[grid="topbot"]
|=======================
| Hardware version: | `1.5.5.9`
| Hardware version: | `1.5.7.8`
| Top entity: | `rtl/core/neorv32_top.vhd`
|=======================
 
235,27 → 283,28
[options="header",grid="rows"]
|=======================
| Module | Description | LEs | FFs | MEM bits | DSPs
| Boot ROM | Bootloader ROM (4kB) | 3 | 1 | 32768 | 0
| **BUSKEEPER** | Processor-internal bus monitor | 11 | 6 | 0 | 0
| **BUSSWITCH** | Bus mux for CPU instr. and data interface | 49 | 8 | 0 | 0
| CFS | Custom functions subsystem | - | - | - | -
| DMEM | Processor-internal data memory (8kB) | 18 | 2 | 65536 | 0
| Boot ROM | Bootloader ROM (4kB) | 2 | 1 | 32768 | 0
| **BUSKEEPER** | Processor-internal bus monitor | 9 | 6 | 0 | 0
| **BUSSWITCH** | Bus mux for CPU instr. and data interface | 63 | 8 | 0 | 0
| CFS | Custom functions subsystemfootnote:[Resource utilization depends on actually implemented custom functionality.] | - | - | - | -
| DMEM | Processor-internal data memory (8kB) | 19 | 2 | 65536 | 0
| DM | On-chip debugger - debug module | 493 | 240 | 0 | 0
| DTM | On-chip debugger - debug transfer module (JTAG) | 254 | 218 | 0 | 0
| GPIO | General purpose input/output ports | 67 | 65 | 0 | 0
| iCACHE | Instruction cache (1x4 blocks, 256 bytes per block) | 220 | 154 | 8192 | 0
| IMEM | Processor-internal instruction memory (16kB) | 6 | 2 | 131072 | 0
| MTIME | Machine system timer | 289 | 200 | 0 | 0
| NCO | Numerically-controlled oscillator | 254 | 226 | 0 | 0
| NEOLED | Smart LED Interface (NeoPixel/WS28128) [4xFIFO] | 347 | 309 | 0 | 0
| GPIO | General purpose input/output ports | 134 | 161 | 0 | 0
| iCACHE | Instruction cache (1x4 blocks, 256 bytes per block) | 221 | 156 | 8192 | 0
| IMEM | Processor-internal instruction memory (16kB) | 13 | 2 | 131072 | 0
| MTIME | Machine system timer | 319 | 167 | 0 | 0
| NEOLED | Smart LED Interface (NeoPixel/WS2812) [4xFIFO] | 342 | 307 | 0 | 0
| SLINK | Stream link interface (4 links, FIFO_depth=1) | 345 | 313 | 0 | 0
| PWM | Pulse-width modulation controller (4 channels) | 71 | 69 | 0 | 0
| SPI | Serial peripheral interface | 138 | 124 | 0 | 0
| **SYSINFO** | System configuration information memory | 10 | 10 | 0 | 0
| TRNG | True random number generator | 132 | 105 | 0 | 0
| TWI | Two-wire interface | 77 | 44 | 0 | 0
| UART0/1 | Universal asynchronous receiver/transmitter 0/1 | 176 | 132 | 0 | 0
| WDT | Watchdog timer | 60 | 45 | 0 | 0
| WISHBONE | External memory interface | 129 | 104 | 0 | 0
| SPI | Serial peripheral interface | 148 | 127 | 0 | 0
| **SYSINFO** | System configuration information memory | 14 | 11 | 0 | 0
| TRNG | True random number generator | 89 | 76 | 0 | 0
| TWI | Two-wire interface | 77 | 43 | 0 | 0
| UART0/1 | Universal asynchronous receiver/transmitter 0/1 | 183 | 132 | 0 | 0
| WDT | Watchdog timer | 53 | 43 | 0 | 0
| WISHBONE | External memory interface | 114 | 110 | 0 | 0
| XIRQ | External interrupt controller (32 channels) | 241 | 201 | 0 | 0
|=======================
 
 
264,38 → 313,10
==== Exemplary Setups
 
[TIP]
Exemplary setups for different technologies and various FPGA boards can be found in the `boards` folder
(https://github.com/stnolting/neorv32/tree/master/boards).
Check out the `setups` folder (@GitHub: https://github.com/stnolting/neorv32/tree/master/setups),
which provides several demo setups for various FPGA boards and toolchains.
 
The following table shows exemplary NEORV32 processor implementation results for different FPGA
platforms. Most setups use the default peripheral configuration (like no CFS, no caches and no
TRNG), no external memory interface and only internal instruction and data memories (IMEM uses 16kB
and DMEM uses 8kB memory space).
 
[cols="<2,<8"]
[grid="topbot"]
|=======================
| Hardware version: | `1.4.9.0`
|=======================
 
.Hardware utilization for exemplary NEORV32 setups
[cols="<4,<5,<4,<4,<3,<3,<3,<4,<4,<3"]
[options="header",grid="rows"]
|=======================
| Vendor | FPGA | Board | Toolchain | CPU | LUT | FF | DSP | Memory | _f_
| Intel | Cyclone IV `EP4CE22F17-C6N` | Terasic DE0-Nano | Quartus Prime Lite 20.1 | `rv32imcu_Zicsr_Zifencei` + `PMP` | 3813 (17%) | 1890 (8%) | 0 (0%) | Memory bits: 231424 (38%) | 119 MHz
| Lattice | iCE40 UltraPlus `iCE40UP5KSG48I` | Upduino v3.0 | Radiant 2.1 | `rv32icu_Zicsr_Zifencei` | 5123 (97%) | 1972 (37%) | 0 (0%) | EBR: 12 (40%) SPRAM: 4 (100%) | 24 MHz
| Xilinx | Artix-7 `XC7A35TICSG324-1L` | Arty A7-35T | Vivado 2019.2 | `rv32imcu_Zicsr_Zifencei` + `PMP` | 2465 (12%) | 1912 (5%) | 0 (0%) | BRAM: 8 (16%) | 100 MHz
|=======================
 
**Notes**
 
* The Lattice iCE40 UltraPlus setup uses the FPGA's SPRAM memory primitives for the internal IMEM and DEMEM (each 64kB).
* The Upduino and the Arty board have on-board SPI flash memories for storing the FPGA configuration. These device can also be used by the default NEORV32 bootloader to store and automatically boot an application program after reset (both tested successfully).
* The setups with PMP implement 2 regions with a minimal granularity of 64kB.
* No HPM counters are used.
 
 
<<<
// ####################################################################################################################
:sectnums:
361,7 → 382,7
 
The average CPI is computed by dividing the total number of required clock cycles (only the timed core to
avoid distortion due to IO wait cycles) by the number of executed instructions (`[m]instret[h]` CSRs). The
executables were generated using optimization -O3.
executables were generated using optimization `-O3`.
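
The CPI computation described above can be sketched in a few lines of Python. This is only an illustration: the helper names and the counter values are hypothetical and not measured results, and it assumes that each 64-bit counter is read as a low word and a high word (as the `[h]` suffix of the CSR names suggests).

```python
def combine_counter(low_word, high_word):
    """Combine the low and high 32-bit CSR words into one 64-bit counter value."""
    return ((high_word & 0xFFFFFFFF) << 32) | (low_word & 0xFFFFFFFF)


def average_cpi(cycles, instructions):
    """Average cycles per instruction: total clock cycles / executed instructions."""
    if instructions == 0:
        raise ValueError("no instructions executed, CPI is undefined")
    return cycles / instructions


# Hypothetical counter snapshots for the timed core of a benchmark (not measured data)
cycles = combine_counter(low_word=4000, high_word=0)
instructions = combine_counter(low_word=1000, high_word=0)
cpi = average_cpi(cycles, instructions)
```

Restricting the measurement to the timed core keeps IO wait cycles out of the cycle count, so the quotient reflects the CPU rather than the peripherals.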
 
[cols="<2,<8"]
[grid="topbot"]
/datasheet/soc.adoc
16,14 → 16,15
* _optional_ two independent universal asynchronous receivers and transmitters (<<_primary_universal_asynchronous_receiver_and_transmitter_uart0,**UART0**>>, <<_secondary_universal_asynchronous_receiver_and_transmitter_uart1,**UART1**>>) with optional hardware flow control (RTS/CTS)
* _optional_ 8/16/24/32-bit serial peripheral interface controller (<<_serial_peripheral_interface_controller_spi,**SPI**>>) with 8 dedicated CS lines
* _optional_ two wire serial interface controller (<<_two_wire_serial_interface_controller_twi,**TWI**>>), compatible to the I²C standard
* _optional_ general purpose parallel IO port (<<_general_purpose_input_and_output_port_gpio,**GPIO**>>), 32xOut, 32xIn
* _optional_ general purpose parallel IO port (<<_general_purpose_input_and_output_port_gpio,**GPIO**>>), 64xOut, 64xIn
* _optional_ 32-bit external bus interface, Wishbone b4 / AXI4-Lite compatible (<<_processor_external_memory_interface_wishbone_axi4_lite,**WISHBONE**>>)
* _optional_ 32-bit stream link interface with up to 8 independent links, AXI4-Stream compatible (<<_stream_link_interface_slink,**SLINK**>>)
* _optional_ watchdog timer (<<_watchdog_timer_wdt,**WDT**>>)
* _optional_ PWM controller with up to 60 channels & 8-bit duty cycle resolution (<<_pulse_width_modulation_controller_pwm,**PWM**>>)
* _optional_ ring-oscillator-based true random number generator (<<_true_random_number_generator_trng,**TRNG**>>)
* _optional_ custom functions subsystem for custom co-processor extensions (<<_custom_functions_subsystem_cfs,**CFS**>>)
* _optional_ numerically-controlled oscillator (<<_numerically_controlled_oscillator_nco,**NCO**>>) with 3 independent channels
* _optional_ NeoPixel(TM)/WS2812-compatible smart LED interface (<<_smart_led_interface_neoled,**NEOLED**>>)
* _optional_ external interrupt controller with up to 32 channels (<<_external_interrupt_controller_xirq,**XIRQ**>>)
* _optional_ on-chip debugger with JTAG TAP (<<_on_chip_debugger_ocd,**OCD**>>)
* system configuration information memory to check HW configuration via software (<<_system_configuration_information_memory_sysinfo,**SYSINFO**>>)
 
38,7 → 39,7
 
[TIP]
A wrapper for the NEORV32 Processor setup providing resolved port signals can be found in
`rtl/top_templates/neorv32_top_stdlogic.vhd`.
`rtl/templates/processor/neorv32_ProcessorTop_stdlogic.vhd`.
 
[cols="<3,^2,^2,<11"]
[options="header",grid="rows"]
49,10 → 50,10
| `rstn_i` | 1 | in | global reset, asynchronous, **low-active**
4+^| **JTAG Access Port for <<_on_chip_debugger_ocd>>**
| `jtag_trst_i` | 1 | in | TAP reset, low-active (optionalfootnote:[Pull high if not used.])
| `jtag_tck_i ` | 1 | in | serial clock
| `jtag_tdi_i ` | 1 | in | serial data input
| `jtag_tdo_o ` | 1 | out | serial data outputfootnote:[If the on-chip debugger is not implemented (_ON_CHIP_DEBUGGER_EN_ = false) `jtag_tdi_i` is directly forwarded to `jtag_tdo_o` to maintain the JTAG chain.]
| `jtag_tms_i ` | 1 | in | mode select
| `jtag_tck_i` | 1 | in | serial clock
| `jtag_tdi_i` | 1 | in | serial data input
| `jtag_tdo_o` | 1 | out | serial data outputfootnote:[If the on-chip debugger is not implemented (_ON_CHIP_DEBUGGER_EN_ = false) `jtag_tdi_i` is directly forwarded to `jtag_tdo_o` to maintain the JTAG chain.]
| `jtag_tms_i` | 1 | in | mode select
4+^| **External Bus Interface (<<_processor_external_memory_interface_wishbone_axi4_lite,WISHBONE>>)**
| `wb_tag_o` | 3 | out | tag (access type identifier)
| `wb_adr_o` | 32 | out | destination address
68,9 → 69,16
4+^| **Advanced Memory Control Signals**
| `fence_o` | 1 | out | indicates an executed _fence_ instruction
| `fencei_o` | 1 | out | indicates an executed _fencei_ instruction
4+^| **Stream Link Interface (<<_stream_link_interface_slink,SLINK>>)**
| `slink_tx_dat_o` | 8x32 | out | TX link _n_ data
| `slink_tx_val_o` | 8 | out | TX link _n_ data valid
| `slink_tx_rdy_i` | 8 | in | TX link _n_ allowed to send
| `slink_rx_dat_i` | 8x32 | in | RX link _n_ data
| `slink_rx_val_i` | 8 | in | RX link _n_ data valid
| `slink_rx_rdy_o` | 8 | out | RX link _n_ ready to receive
4+^| **General Purpose Inputs & Outputs (<<_general_purpose_input_and_output_port_gpio,GPIO>>)**
| `gpio_o` | 32 | out | general purpose parallel output
| `gpio_i` | 32 | in | general purpose parallel input
| `gpio_o` | 64 | out | general purpose parallel output
| `gpio_i` | 64 | in | general purpose parallel input
4+^| **Primary Universal Asynchronous Receiver/Transmitter (<<_primary_universal_asynchronous_receiver_and_transmitter_uart0,UART0>>)**
| `uart0_txd_o` | 1 | out | UART0 serial transmitter
| `uart0_rxd_i` | 1 | in | UART0 serial receiver
94,16 → 102,15
| `cfs_out_o` | 32 | out | custom CFS output signal conduit
4+^| **Pulse-Width Modulation Channels (<<_pulse_width_modulation_controller_pwm,PWM>>)**
| `pwm_o` | 4 | out | pulse-width modulated channels
4+^| **Numerically-Controller Oscillator (<<_numerically_controlled_oscillator_nco,NCO>>)**
| `nco_o` | 3 | out | NCO output channels
4+^| **Smart LED Interface - NeoPixel(TM) compatible (<<_smart_led_interface_neoled,NEOLED>>)**
| `neoled_o` | 1 | out | asynchronous serial data output
4+^| **System time (<<_machine_system_timer_mtime,MTIME>>)**
| `mtime_i` | 64 | in | machine timer time (to `time[h]` CSRs) from _external MTIME_ unit if the processor-internal _MTIME_ unit is NOT implemented
| `mtime_o` | 64 | out | machine timer time from _internal MTIME_ unit if processor-internal _MTIME_ unit IS implemented
4+^| **<<_processor_interrupts>>**
4+^| **<<_processor_interrupts, External Interrupts>>**
| `xirq_i` | 32 | in | external interrupt requests (up to 32 channels)
4+^| **<<_processor_interrupts, CPU Interrupts>>**
| `nm_irq_i` | 1 | in | non-maskable interrupt
| `soc_firq_i` | 6 | in | platform fast interrupt channels (custom)
| `mtime_irq_i` | 1 | in | machine timer interrupt (RISC-V)
| `msw_irq_i` | 1 | in | machine software interrupt (RISC-V)
| `mext_irq_i` | 1 | in | machine external interrupt (RISC-V)
133,15 → 140,19
If optional modules (like CPU extensions or peripheral devices) are *not enabled* the according circuitry **will not be synthesized at all**.
Hence, the disabled modules do not increase area and power requirements and do not impact the timing.
 
**CSR Description**
[TIP]
Not all configuration combinations are valid. The processor RTL code provides sanity checks to inform the user
during synthesis/simulation if an invalid combination has been detected.
 
The description of each CSR provides the following summary:
**Generic Description**
 
The description of each generic provides the following summary:
 
.Generic description
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| _Generic_ | _type_ | _default value_
| _Generic name_ | _type_ | _default value_
3+| _Description_
|======
 
164,15 → 175,15
 
 
:sectnums!:
===== _BOOTLOADER_EN_
===== _INT_BOOTLOADER_EN_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **BOOTLOADER_EN** | _boolean_ | true
3+| Implement the boot ROM, pre-initialized with the bootloader image when true. This will also change the
processor's boot address from the beginning of the instruction memory address space (default =
0x00000000) to the base address of the boot ROM. See section <<_bootloader>> for more information.
| **INT_BOOTLOADER_EN** | _boolean_ | true
3+| Implement the processor-internal boot ROM, pre-initialized with the default bootloader image when _true_.
This will also change the processor's boot address from the beginning of the instruction memory address space (default =
0x00000000) to the base address of the boot ROM. See section <<_boot_configuration>> for more information.
|======
 
 
224,6 → 235,7
|======
| **CPU_EXTENSION_RISCV_A** | _boolean_ | false
3+| Implement atomic memory access operations when _true_.
See section <<_a_atomic_memory_access>>.
|======
 
 
235,6 → 247,7
|======
| **CPU_EXTENSION_RISCV_C** | _boolean_ | false
3+| Implement compressed instructions (16-bit) when _true_.
See section <<_c_compressed_instructions>>.
|======
 
 
246,6 → 259,7
|======
| **CPU_EXTENSION_RISCV_E** | _boolean_ | false
3+| Implement the embedded CPU extension (only implement the first 16 data registers) when _true_.
See section <<_e_embedded_cpu>>.
|======
 
 
257,6 → 271,7
|======
| **CPU_EXTENSION_RISCV_M** | _boolean_ | false
3+| Implement integer multiplication and division instructions when _true_.
See section <<_m_integer_multiplication_and_division>>.
|======
 
 
268,6 → 283,7
|======
| **CPU_EXTENSION_RISCV_U** | _boolean_ | false
3+| Implement less-privileged user mode when _true_.
See section <<_u_less_privileged_user_mode>>.
|======
 
 
278,8 → 294,8
[frame="all",grid="none"]
|======
| **CPU_EXTENSION_RISCV_Zfinx** | _boolean_ | false
3+| Implement the 32-bit single-precision floating-point extension (using integer registers) when _true_. For
more information see section <<_zfinx_single_precision_floating_point_operations>>.
3+| Implement the 32-bit single-precision floating-point extension (using integer registers) when _true_.
See section <<_zfinx_single_precision_floating_point_operations>>.
|======
 
 
293,6 → 309,7
3+| Implement the control and status register (CSR) access instructions when true. Note: When this option is
disabled, the complete privileged architecture / trap system will be excluded from synthesis. Hence, no interrupts, no exceptions and
no machine information will be available.
See section <<_zicsr_control_and_status_register_access_privileged_architecture>>.
|======
 
 
303,11 → 320,24
[frame="all",grid="none"]
|======
| **CPU_EXTENSION_RISCV_Zifencei** | _boolean_ | false
3+| Implement the instruction fetch synchronization instruction _fence.i_. For example, this option is required
3+| Implement the instruction fetch synchronization instruction `fence.i`. For example, this option is required
for self-modifying code (and/or for i-cache flushes).
See section <<_zifencei_instruction_stream_synchronization>>.
|======
 
 
:sectnums!:
===== _CPU_EXTENSION_RISCV_Zmmul_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **CPU_EXTENSION_RISCV_Zmmul** | _boolean_ | false
3+| Implement integer multiplication-only instructions when _true_. This is a sub-extension of the `M` extension.
See section <<_zmmul_integer_multiplication>>.
|======
 
 
// ####################################################################################################################
:sectnums:
==== Extension Options
335,25 → 365,13
[frame="all",grid="none"]
|======
| **FAST_SHIFT_EN** | _boolean_ | false
3+| When this generic is enabled the shifter unit of the CPU's ALU is implement as fast barrel shifter (requiring
more hardware resources).
3+| When this generic is set _true_ the shifter unit of the CPU's ALU is implemented as a fast barrel shifter (requiring
more hardware resources). If it is set _false_ the CPU uses a serial shifter that only performs a single bit shift per cycle
(small but slow).
|======
 
 
:sectnums!:
===== _TINY_SHIFT_EN_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **TINY_SHIFT_EN** | _boolean_ | false
3+| If this generic is enabled the shifter unit of the CPU's ALU is implemented as (slow but tiny) single-bit iterative shifter
(requires up to 32 clock cycles for a shift operation, but reduces the hardware footprint). The configuration of
this generic is ignored if <<_fast_shift_en>> is _true_.
|======
 
 
:sectnums!:
===== _CPU_CNT_WIDTH_
 
[cols="4,4,2"]
460,18 → 478,6
|======
 
 
:sectnums!:
===== _MEM_INT_IMEM_ROM_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **MEM_INT_IMEM_ROM** | _boolean_ | false
3+| Implement processor-internal instruction memory as read-only memory, which will be initialized with the
application image at synthesis time. Has no effect when _MEM_INT_IMEM_EN_ is _false_.
|======
 
 
// ####################################################################################################################
:sectnums:
==== Internal Data Memory
580,12 → 586,105
[frame="all",grid="none"]
|======
| **MEM_EXT_TIMEOUT** | _natural_ | 255
3+| Clock cycles after which a pending external bus access will auto-terminate and raise a bus fault exception. Set to 0 to disable auto-timeout.
3+| Clock cycles after which a pending external bus access will auto-terminate and raise a bus fault exception. Set to 0 to disable auto-timeout.
|======
 
 
// ####################################################################################################################
:sectnums:
==== Stream Link Interface
 
See section <<_stream_link_interface_slink>> for more information.
 
 
:sectnums!:
===== _SLINK_NUM_TX_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **SLINK_NUM_TX** | _natural_ | 0
3+| Number of TX (send) links to implement. Valid values are 0..8.
|======
 
 
:sectnums!:
===== _SLINK_NUM_RX_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **SLINK_NUM_RX** | _natural_ | 0
3+| Number of RX (receive) links to implement. Valid values are 0..8.
|======
 
 
:sectnums!:
===== _SLINK_TX_FIFO_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **SLINK_TX_FIFO** | _natural_ | 1
3+| Internal FIFO depth for _all_ implemented TX links. Valid values are 1..32k and have to be a power of two.
|======
 
 
:sectnums!:
===== _SLINK_RX_FIFO_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **SLINK_RX_FIFO** | _natural_ | 1
3+| Internal FIFO depth for _all_ implemented RX links. Valid values are 1..32k and have to be a power of two.
|======
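
The FIFO depth constraint above (1..32k, power of two) can be checked with a few lines of C. This is only a sketch: `slink_fifo_depth_is_valid` is a hypothetical helper that is not part of the NEORV32 sources, and it assumes that "32k" means 32768 entries.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of the NEORV32 sources): checks whether a
 * value is acceptable for SLINK_TX_FIFO / SLINK_RX_FIFO.
 * Assumption: "32k" in the text means 32768 entries. */
static bool slink_fifo_depth_is_valid(uint32_t depth)
{
    const uint32_t max_depth = 32768u;

    if (depth < 1u || depth > max_depth) {
        return false;
    }
    /* A power of two has exactly one bit set. */
    return (depth & (depth - 1u)) == 0u;
}
```

For example, a depth of 1 (the default) or 64 would be accepted, while 0 or 48 would be rejected.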
 
 
// ####################################################################################################################
:sectnums:
==== External Interrupt Controller
 
See section <<_external_interrupt_controller_xirq>> for more information.
 
 
:sectnums!:
===== _XIRQ_NUM_CH_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **XIRQ_NUM_CH** | _natural_ | 0
3+| Number of external interrupt channels to implement. Valid values are 0..32.
|======
 
 
:sectnums!:
===== _XIRQ_TRIGGER_TYPE_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **XIRQ_TRIGGER_TYPE** | _std_ulogic_vector(31 downto 0)_ | 0xFFFFFFFF
3+| Interrupt trigger type configuration (one bit for each IRQ channel): `0` = level-triggered, `1` = edge-triggered.
The _XIRQ_TRIGGER_POLARITY_ generic is used to specify the actual level (high/low) or edge (falling/rising).
|======
 
 
:sectnums!:
===== _XIRQ_TRIGGER_POLARITY_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **XIRQ_TRIGGER_POLARITY** | _std_ulogic_vector(31 downto 0)_ | 0xFFFFFFFF
3+| Interrupt trigger polarity configuration (one bit for each IRQ channel): `0` = low-level/falling-edge,
`1` = high-level/rising-edge. The _XIRQ_TRIGGER_TYPE_ generic is used to specify the actual type (level or edge).
|======
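
The way the two generics combine per channel can be illustrated in C. The following is a minimal sketch, not code from the NEORV32 sources: the type and function names are hypothetical, and the two 32-bit arguments simply mirror the values of _XIRQ_TRIGGER_TYPE_ and _XIRQ_TRIGGER_POLARITY_.

```c
#include <stdint.h>

/* Hypothetical names, used for illustration only. */
typedef enum {
    XIRQ_LOW_LEVEL,
    XIRQ_HIGH_LEVEL,
    XIRQ_FALLING_EDGE,
    XIRQ_RISING_EDGE
} xirq_trigger_t;

/* Decode the effective trigger of one channel from the two generic values.
 * Precondition: channel is in the range 0..31. */
static xirq_trigger_t xirq_decode_trigger(uint32_t trigger_type,
                                          uint32_t trigger_polarity,
                                          unsigned channel)
{
    unsigned is_edge = (trigger_type >> channel) & 1u;     /* 0 = level, 1 = edge */
    unsigned is_high = (trigger_polarity >> channel) & 1u; /* 0 = low/falling, 1 = high/rising */

    if (is_edge) {
        return is_high ? XIRQ_RISING_EDGE : XIRQ_FALLING_EDGE;
    }
    return is_high ? XIRQ_HIGH_LEVEL : XIRQ_LOW_LEVEL;
}
```

With the default values (`0xFFFFFFFF` for both generics) every channel decodes to a rising-edge trigger.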
 
 
// ####################################################################################################################
:sectnums:
==== Processor Peripheral/IO Modules
 
See section <<_processor_internal_modules>> for more information.
635,7 → 734,7
[frame="all",grid="none"]
|======
| **IO_UART1_EN** | _boolean_ | true
3+| Implement secondary universal asynchronous receiver/transmitter (UART1) when _true_.
3+| Implement secondary universal asynchronous receiver/transmitter (UART1) when _true_.
See section <<_secondary_universal_asynchronous_receiver_and_transmitter_uart1>> for more information.
|======
 
746,18 → 845,6
 
 
:sectnums!:
===== _IO_NCO_EN_
 
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **IO_NCO_EN** | _boolean_ | true
3+| Implement numerically-controlled oscillator (NCO) when _true_.
See section <<_numerically_controlled_oscillator_nco>> for more information.
|======
 
 
:sectnums!:
===== _IO_NEOLED_EN_
 
[cols="4,4,2"]
774,53 → 861,98
:sectnums:
=== Processor Interrupts
 
[TIP]
The interrupt request signals have specific `mip` CSR bits (see <<_machine_trap_setup>>), specific
`mie` CSR bits (see <<_machine_trap_handling>>) and specific `mcause` CSR trap codes and trap
priorities. For more information (also regarding the signaling protocol) see section <<_traps_exceptions_and_interrupts>>.
The NEORV32 Processor provides several interrupt request signals (IRQs) for custom platform use.
 
**RISC-V Standard Interrupts**
 
:sectnums:
==== RISC-V Standard Interrupts
 
The processor setup features the standard RISC-V interrupt lines for "machine timer interrupt", "machine
software interrupt" and "machine external interrupt". The software and external interrupt lines are available
via the processor's top entity. By default, the timer interrupt is connected to the internal machine timer
MTIME timer unit (<<_machine_system_timer_mtime>>). If this module has not been enabled for
synthesis, the machine timer interrupt is also available via the processor's top entity.
software interrupt" and "machine external interrupt". Their usage is defined by the RISC-V privileged architecture
specifications. However, bare-metal systems can also repurpose these interrupts. See CPU section
<<_traps_exceptions_and_interrupts>> for more information.
 
**NEORV32-Specific Fast Interrupt Requests**
[cols="<3,^2,<11"]
[options="header",grid="rows"]
|=======================
| Top signal | Width | Description
| `mtime_irq_i` | 1 | Machine timer interrupt from _processor-external_ MTIME unit. This IRQ is only available if the processor-internal MTIME unit is not used (<<_io_mtime_en>> = false).
| `msw_irq_i` | 1 | Machine software interrupt. This interrupt is used for inter-processor interrupts in multi-core systems. However, it can also be used for any custom purpose.
| `mext_irq_i` | 1 | Machine external interrupt. This interrupt is used for any processor-external interrupt source (like a platform interrupt controller).
|=======================
 
[NOTE]
These IRQs trigger on high-level.
 
 
:sectnums:
==== Non-Maskable Interrupt
 
[cols="<3,^2,<11"]
[options="header",grid="rows"]
|=======================
| Top signal | Width | Description
| `nm_irq_i` | 1 | Non-maskable interrupt.
|=======================
 
The processor features a single non-maskable interrupt source via the `nm_irq_i` top
entity signal that can be used to signal _critical system conditions_. This interrupt source _cannot_ be masked/disabled.
See CPU section <<_traps_exceptions_and_interrupts>> for more information.
 
[NOTE]
This IRQ triggers on high-level.
 
 
:sectnums:
==== Platform External Interrupts
 
[cols="<3,^2,<11"]
[options="header",grid="rows"]
|=======================
| Top signal | Width | Description
| `xirq_i` | up to 32 | External platform interrupts (user-defined).
|=======================
 
The processor provides an optional interrupt controller for up to 32 user-defined external interrupts
(see section <<_external_interrupt_controller_xirq>>). These external IRQs are mapped to a _single_ CPU
fast interrupt request so a software handler is required to differentiate / prioritize these interrupts.
 
[NOTE]
The trigger for these interrupts can be defined via generics. See section
<<_external_interrupt_controller_xirq>> for more information.
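
Since all external channels share a single CPU fast interrupt, the software handler has to decide which channel to serve. The following C sketch shows one possible dispatch loop. It is an illustration only: the function names are hypothetical, the 32-bit `pending` mask is assumed to have been read from the controller beforehand (the register interface is described in the XIRQ section), and serving the lowest channel number first is an assumed priority scheme, not one defined by this text.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*xirq_handler_t)(void);

/* Return the lowest pending channel (0..31), or -1 if none is pending.
 * Assumption: lower channel numbers are treated as higher priority. */
static int xirq_next_channel(uint32_t pending)
{
    for (int ch = 0; ch < 32; ch++) {
        if ((pending >> ch) & 1u) {
            return ch;
        }
    }
    return -1;
}

/* Call the handler of every pending channel in priority order.
 * Channels without an installed handler (NULL) are skipped. */
static void xirq_dispatch(uint32_t pending, xirq_handler_t const handlers[32])
{
    int ch;

    while ((ch = xirq_next_channel(pending)) >= 0) {
        if (handlers[ch] != NULL) {
            handlers[ch]();
        }
        pending &= ~(UINT32_C(1) << ch); /* mark this channel as served */
    }
}
```

A different priority scheme only requires replacing `xirq_next_channel`.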
 
 
:sectnums:
==== NEORV32-Specific Fast Interrupt Requests
 
As part of the custom/NEORV32-specific CPU extensions, the CPU features 16 fast interrupt request signals
(`FIRQ0` – `FIRQ15`).
(`FIRQ0` – `FIRQ15`). These are used for _processor-internal_ modules only (for example for the communication
interfaces to signal "available incoming data" or "ready to send new data").
 
The fast interrupt request signals are divided into two groups. The FIRQs with higher priority (FIRQ0 –
FIRQ9) are dedicated for processor-internal usage. The FIRQs with lower priority (FIRQ10 – FIRQ15) are
available for custom usage via the processor's top entity signal `soc_firq_i`.
The mapping of the 16 FIRQ channels is shown in the following table (the channel number also corresponds to
the according FIRQ priority; 0 = highest, 15 = lowest):
 
The mapping of the 16 FIRQ channels is shown in the following table (the channel number corresponds to the FIRQ priority):
 
.NEORV32 fast interrupt channel mapping
[cols="^1,<2,<7"]
[options="header",grid="rows"]
|=======================
| Channel | Source | Description
| 0 | _WDT_ | watchdog timeout interrupt
| 1 | _CFS_ | custom functions subsystem (CFS) interrupt (user-defined)
| 2 | _UART0_ (RXD) | UART0 data received interrupt (RX complete)
| 3 | _UART0_ (TXD) | UART0 sending done interrupt (TX complete)
| 4 | _UART1_ (RXD) | UART1 data received interrupt (RX complete)
| 5 | _UART1_ (TXD) | UART1 sending done interrupt (TX complete)
| 6 | _SPI_ | SPI transmission done interrupt
| 7 | _TWI_ | TWI transmission done interrupt
| 8 | _GPIO_ | GPIO input pin-change interrupt
| 9 | _NEOLED_ | NEOLED buffer TX empty / not full interrupt
| 10:15 | `soc_firq_i(5:0)` | Custom platform use; available via processor's top signal
| 0 | <<_watchdog_timer_wdt,WDT>> | watchdog timeout interrupt
| 1 | <<_custom_functions_subsystem_cfs,CFS>> | custom functions subsystem (CFS) interrupt (user-defined)
| 2 | <<_primary_universal_asynchronous_receiver_and_transmitter_uart0,UART0>> | UART0 data received interrupt (RX complete)
| 3 | <<_primary_universal_asynchronous_receiver_and_transmitter_uart0,UART0>> | UART0 sending done interrupt (TX complete)
| 4 | <<_secondary_universal_asynchronous_receiver_and_transmitter_uart1,UART1>> | UART1 data received interrupt (RX complete)
| 5 | <<_secondary_universal_asynchronous_receiver_and_transmitter_uart1,UART1>> | UART1 sending done interrupt (TX complete)
| 6 | <<_serial_peripheral_interface_controller_spi,SPI>> | SPI transmission done interrupt
| 7 | <<_two_wire_serial_interface_controller_twi,TWI>> | TWI transmission done interrupt
| 8 | <<_external_interrupt_controller_xirq,XIRQ>> | External interrupt controller interrupt
| 9 | <<_smart_led_interface_neoled,NEOLED>> | NEOLED buffer TX empty / not full interrupt
| 10 | <<_stream_link_interface_slink,SLINK>> | RX data received
| 11 | <<_stream_link_interface_slink,SLINK>> | TX data sent
| 12:15 | - | _reserved_, cannot fire
|=======================
 
**Non-Maskable Interrupt**
 
The NEORV32 features a single non-maskable interrupt source via the `nm_irq_i` top
entity signal that can be used to signal critical system conditions. This interrupt source _cannot_ be disabled. Hence, it does _not_ provide
configuration/status flags in the `mie` and `mip` CSRs. The RISC-V-compatible `mcause` value `0x80000000` is used to indicate the non-maskable interrupt.
 
<<<
// ####################################################################################################################
827,50 → 959,25
:sectnums:
=== Address Space
 
By default, the total 32-bit (4GB) address space of the NEORV32 Processor is divided into four main regions:
The NEORV32 Processor provides 32-bit physical addresses accessing up to 4GB of address space.
By default, this address space is divided into four main regions:
 
1. Instruction memory (IMEM) space – for instructions and constants.
2. Data memory (DMEM) space – for application runtime data (heap, stack, etc.).
3. Bootloader ROM address space – for the processor-internal bootloader.
4. IO/peripheral address space – for the processor-internal IO/peripheral devices (e.g., UART).
1. **Instruction address space** – for instructions (=code) and constants. A configurable section of this address space is used by
internal and/or external _instruction memory_ (IMEM).
2. **Data address space** – for application runtime data (heap, stack, etc.). A configurable section of this address space is used by
internal and/or external _data memory_ (DMEM).
3. **Bootloader address space**. A _fixed_ section of this address space is used by
internal _bootloader memory_ (BOOTLDROM).
4. **IO/peripheral address space** – for the processor-internal IO/peripheral devices (e.g., UART).
 
.NEORV32 processor - address space (default configuration)
image::address_space.png[900]
 
[TIP]
These four memory regions are handled by the linker when compiling a NEORV32 executable.
See section <<_executable_image_format>> for more information.
 
**Address Space Layout**
.NEORV32 processor - address space (default configuration)
image::address_space.png[900]
 
The general address space layout consists of two main configuration constants: `ispace_base_c` defining
the base address of the instruction memory address space and `dspace_base_c` defining the base address of
the data memory address space. Both constants are defined in the NEORV32 VHDL package file
`rtl/core/neorv32_package.vhd`:
 
[source,vhdl]
----
-- Architecture Configuration ----------------------------------------------------
-- ----------------------------------------------------------------------------------
constant ispace_base_c : std_ulogic_vector(31 downto 0) := x"00000000";
constant dspace_base_c : std_ulogic_vector(31 downto 0) := x"80000000";
----
 
The default configuration assumes the instruction memory address space starting at address _0x00000000_
and the data memory address space starting at _0x80000000_. Both values can be modified for a specific
setup and the address space may overlap or can be completely identical.
 
The base address of the bootloader (at _0xFFFF0000_) and the IO region (at _0xFFFFFF00_) for the peripheral
devices are also defined in the package and are fixed. These address regions cannot be used for other
applications – even if the bootloader or all IO devices are not implemented.
 
[WARNING]
When using the processor-internal data and/or instruction memories (DMEM/IMEM) and using a non-default
configuration for the `dspace_base_c` and/or `ispace_base_c` base addresses, the
following requirements have to be fulfilled:
**1.** Both base addresses have to be aligned to a 4-byte boundary.
**2.** Both base addresses have to be aligned to the according internal memory sizes.
 
:sectnums:
==== CPU Data and Instruction Access
 
898,10 → 1005,41
configuration corrupting this interrupt. This kind of security issues can be compensated using the
PMP system (see <<_machine_physical_memory_protection>>).
 
 
:sectnums:
==== Address Space Layout
 
The general address space layout consists of two main configuration constants: `ispace_base_c` defining
the base address of the _instruction memory address space_ and `dspace_base_c` defining the base address of
the _data memory address space_. Both constants are defined in the NEORV32 VHDL package file
`rtl/core/neorv32_package.vhd`:
 
[source,vhdl]
----
-- Architecture Configuration ----------------------------------------------------
-- ----------------------------------------------------------------------------------
constant ispace_base_c : std_ulogic_vector(31 downto 0) := x"00000000";
constant dspace_base_c : std_ulogic_vector(31 downto 0) := x"80000000";
----
 
The default configuration assumes the _instruction memory address space_ starting at address _0x00000000_
and the _data memory address space_ starting at _0x80000000_. Both values can be modified for a specific
setup and the address space may overlap or can be completely identical. Make sure that both base addresses
are _aligned_ to a 4-byte boundary.
 
[NOTE]
The base address of the internal bootloader (at _0xFFFF0000_) and the internal IO region (at _0xFFFFFE00_) for
peripheral devices are also defined in the package and are fixed. These address regions cannot be used for other
applications – even if the bootloader or all IO devices are not implemented – without modifying the core's
hardware sources.
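
The default layout can be mirrored in software, for example to classify an address in a test bench or debug tool. The following C sketch uses the base addresses stated above (including the Rev 61 IO base `0xFFFFFE00`). The macro, type and function names are hypothetical, and the sketch assumes that each region extends up to the base address of the next one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Default base addresses as stated in the text (names are hypothetical). */
#define ISPACE_BASE  UINT32_C(0x00000000) /* ispace_base_c */
#define DSPACE_BASE  UINT32_C(0x80000000) /* dspace_base_c */
#define BOOTROM_BASE UINT32_C(0xFFFF0000) /* fixed */
#define IO_BASE      UINT32_C(0xFFFFFE00) /* fixed */

typedef enum {
    REGION_INSTRUCTION,
    REGION_DATA,
    REGION_BOOTLOADER,
    REGION_IO
} region_t;

/* Map an address to one of the four main regions (default configuration).
 * Assumption: each region extends up to the base of the next one. */
static region_t classify_address(uint32_t addr)
{
    if (addr >= IO_BASE)      { return REGION_IO; }
    if (addr >= BOOTROM_BASE) { return REGION_BOOTLOADER; }
    if (addr >= DSPACE_BASE)  { return REGION_DATA; }
    return REGION_INSTRUCTION;
}

/* Both base addresses have to be aligned to a 4-byte boundary. */
static bool base_is_word_aligned(uint32_t base)
{
    return (base & 3u) == 0u;
}
```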
 
 
:sectnums:
==== Physical Memory Attributes
 
The processor setup defines four simple attributes for the four processor-internal address space regions:
The processor setup defines fixed attributes for the four processor-internal address space regions.
Accessing a memory region in a way that violates any of these attributes will raise the corresponding
access exception.
 
* `r` – read access (from CPU data access interface, e.g. via "load")
* `w` – write access (from CPU data access interface, e.g. via "store")
911,8 → 1049,8
* `16` – half-word (16-bit)-accessible (when writing)
* `32` – word (32-bit)-accessible (when writing)
 
The following table shows the provided physical memory attributes of each region. Additional attributes (like
denying execute right for certain region of the IMEM) can be provided using the RISC-V <<_machine_physical_memory_protection>> extension.
[NOTE]
Read accesses (i.e. loads) can always access data in word, half-word and byte quantities (requiring an accordingly aligned address).
 
[cols="^1,^2,^2,^3,^2"]
[options="header",grid="rows"]
924,47 → 1062,124
| 1 | IMEM | 0x00000000 | up to 2GB | `r/w/x/a/8/16/32`
|=======================
 
Only the CPU of the processor has access to the internal memories and IO devices, hence all accesses are
always exclusive. Accessing a memory region in a way that violates the provided attributes will trigger a
load/store/instruction fetch access exception or will return a failed atomic access result, respectively.
[TIP]
The following table shows the provided physical memory attributes of each region. Additional attributes (for example
controlling certain rights for specific address space regions) can be provided using the RISC-V <<_machine_physical_memory_protection>> extension.
 
The physical memory attributes of memories and/or devices connected via the external bus interface have to
be defined by those components or the interconnection fabric.
 
:sectnums:
==== Internal Memories
==== Memory Configuration
 
The processor can implement internal memories for instructions (IMEM) and data (DMEM), which will be
mapped to FPGA block RAMs. The implementation of these memories is controlled via the boolean
<<_mem_int_imem_en>> and <<_mem_int_dmem_en>> generics.
The NEORV32 Processor was designed to provide maximum flexibility for the memory configuration.
The processor can populate the _instruction address space_ and/or the _data address space_ with **internal memories**
for instructions (IMEM) and data (DMEM). Processor **external memories** can be used as an _alternative_ or even _in combination_ with
the internal ones. The figure below shows some exemplary memory configurations.
 
The size of these memories are configured via the _MEM_INT_IMEM_SIZE_ and _MEM_INT_DMEM_SIZE_
generics (in bytes), respectively. The processor-internal instruction memory (IMEM) can optionally be
implemented as true ROM (<<_mem_int_imem_rom>>), which is initialized with the application code during
synthesis.
.Exemplary memory configurations
image::neorv32_memory_configurations.png[800]
 
:sectnums!:
===== Internal Memories
 
The processor-internal memories (<<_instruction_memory_imem>> and <<_data_memory_dmem>>) are enabled (=implemented)
via the <<_mem_int_imem_en>> and <<_mem_int_dmem_en>> generics. Their sizes are configured via the according
<<_mem_int_imem_size>> and <<_mem_int_dmem_size>> generics.
 
If the processor-internal IMEM is implemented, it is located right at the base address of the instruction
address space (default `ispace_base_c` = _0x00000000_). Vice versa, the processor-internal data memory is
located right at the beginning of the data address space (default `dspace_base_c` = _0x80000000_) when
implemented.
 
:sectnums:
==== External Memory/Bus Interface
[TIP]
The default processor setup uses only _internal_ memories.
 
Any CPU access (data or instructions), which does not fulfill one of the following conditions, is forwarded
to the <<_processor_external_memory_interface_wishbone_axi4_lite>>:
[NOTE]
If the IMEM (internal or external) is less than the (default) maximum size (2GB), there is
a "dead address space" between it and the DMEM. This provides an additional safety feature
since data corrupting scenarios like stack overflow cannot directly corrupt the content of the IMEM:
any access to the "dead address space" in between will raise an exception that can be caught
by the runtime environment.
 
:sectnums!:
===== External Memories
 
If external memories (or further IP modules) shall be connected via the _processor's external bus interface_,
the interface has to be enabled via <<_mem_ext_en>> generic (=_true_). More information regarding this interface can be
found in section <<_processor_external_memory_interface_wishbone_axi4_lite>>.
 
Any CPU access (data or instructions), which does not fulfill _at least one_ of the following conditions, is forwarded
via the processor's bus interface to external components:
 
* access to the processor-internal IMEM and processor-internal IMEM is implemented
* access to the processor-internal DMEM and processor-internal DMEM is implemented
* access to the bootloader ROM and beyond → addresses >= _BOOTROM_BASE_ (default 0xFFFF0000) will never be forwarded to the external memory interface
 
The external bus interface is available when the <<_mem_ext_en>> generic is _true_. If this interface is
deactivated, any access exceeding the internal memories or peripheral devices will trigger a bus access fault
exception. If <<_mem_ext_timeout>> is greater than zero any external bus access that is not acknowledged or terminated
within <<_mem_ext_timeout>> clock cycles will auto-timeout and raise the according bus fault exception.
If no (or not all) processor-internal memories are implemented, the according base addresses are mapped to external memories.
For example, if the processor-internal IMEM is not implemented (<<_mem_int_imem_en>> = _false_), the processor will forward
any access to the instruction address space (starting at `ispace_base_c`) via the external bus interface to the external
memory system.
 
[NOTE]
If the external interface is deactivated, any access exceeding the internal memory address space (instruction, data, bootloader) or
the internal peripheral address space will trigger a bus access fault exception.
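
The forwarding rule described in this section can be expressed as a small decision function. The C sketch below is a behavioral model only, not the actual bus switch logic: all names are hypothetical and the default base addresses are assumed. It only answers whether an access _would_ be forwarded; whether the external interface is actually implemented (<<_mem_ext_en>>) is a separate question.

```c
#include <stdbool.h>
#include <stdint.h>

#define ISPACE_BASE  UINT32_C(0x00000000) /* default ispace_base_c */
#define DSPACE_BASE  UINT32_C(0x80000000) /* default dspace_base_c */
#define BOOTROM_BASE UINT32_C(0xFFFF0000) /* fixed */

/* Hypothetical mirror of the relevant generics. */
typedef struct {
    bool     imem_en;   /* MEM_INT_IMEM_EN */
    uint32_t imem_size; /* MEM_INT_IMEM_SIZE in bytes */
    bool     dmem_en;   /* MEM_INT_DMEM_EN */
    uint32_t dmem_size; /* MEM_INT_DMEM_SIZE in bytes */
} mem_cfg_t;

static bool in_range(uint32_t addr, uint32_t base, uint32_t size)
{
    return (addr >= base) && ((addr - base) < size);
}

/* True if the access matches none of the three conditions listed above
 * and is therefore forwarded to the external bus interface. */
static bool access_is_forwarded(const mem_cfg_t *cfg, uint32_t addr)
{
    if (cfg->imem_en && in_range(addr, ISPACE_BASE, cfg->imem_size)) {
        return false; /* served by internal IMEM */
    }
    if (cfg->dmem_en && in_range(addr, DSPACE_BASE, cfg->dmem_size)) {
        return false; /* served by internal DMEM */
    }
    if (addr >= BOOTROM_BASE) {
        return false; /* bootloader ROM and beyond is never forwarded */
    }
    return true;
}
```

For instance, with a 16kB internal IMEM an instruction fetch from `0x00004000` would be forwarded, and with the internal IMEM disabled even a fetch from `0x00000000` would be.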
 
 
:sectnums:
==== Boot Configuration
 
Due to the flexible memory configuration concept, the NEORV32 Processor provides several different boot concepts.
The figure below shows the exemplary concepts for the two most common boot scenarios.
 
.NEORV32 boot configurations
image::neorv32_boot_configurations.png[800]
 
[NOTE]
The configuration of internal or external data memory (DMEM; <<_mem_int_dmem_en>> = _true_ / _false_) is not further
relevant for the boot configuration itself. Hence, it is not further illustrated here.
 
There are two general boot scenarios: _Indirect Boot_ (1a and 1b) and _Direct Boot_ (2a and 2b) configured via the
<<_int_bootloader_en>> generic. If this generic is set **true** the _indirect_ boot scenario is used. This is also the
default boot configuration of the processor. If <<_int_bootloader_en>> is set **false** the _direct_ boot scenario is used.
 
[NOTE]
Please note that the provided boot scenarios are just exemplary setups that (should) fit most common requirements.
Much more sophisticated boot scenarios are possible by combining internal and external memories. For example, the default
internal bootloader could be used as first-level bootloader that loads (from external SPI flash) a second-level bootloader
that is placed and executed in internal IMEM. This second-level bootloader could then fetch the actual application,
store it to external _data_ memory and transfer CPU control to it.
 
:sectnums!:
===== Indirect Boot
 
The _indirect_ boot scenarios **1a** and **1b** use the processor-internal <<_bootloader>>. This general setup is enabled
by setting the <<_int_bootloader_en>> generic to true, which will implement the processor-internal <<_bootloader_rom_bootrom>>.
This read-only memory is pre-initialized during synthesis with the default bootloader firmware.
 
The bootloader provides several options to upload an executable (via UART or from external SPI flash) and store it to
the _instruction address space_ so the CPU can execute it. Boot scenario **1a** uses the processor-internal IMEM
(<<_mem_int_imem_en>> = _true_). This scenario implements the internal <<_instruction_memory_imem>> as non-initialized
RAM so the bootloader can write the actual executable to it.
 
Boot scenario **1b** uses a processor-external IMEM (<<_mem_int_imem_en>> = _false_) that is connected via the processor's
bus interface. In this scenario the internal <<_instruction_memory_imem>> is not implemented at all and the bootloader will
write the executable to the processor-external memory.
 
:sectnums!:
===== Direct Boot
 
The _direct_ boot scenarios **2a** and **2b** do not use the processor-internal bootloader. Hence, the <<_int_bootloader_en>>
generic is set _false_. In this configuration the <<_bootloader_rom_bootrom>> is not implemented at all and the CPU will
directly begin executing code from the instruction address space after reset. A "pre-initialization" mechanism is required
in order to provide an executable _in_ memory.
 
Boot scenario **2a** uses the processor-internal IMEM (<<_mem_int_imem_en>> = _true_) that is implemented as _read-only memory_
in this scenario. It is pre-initialized (by the bitstream) with the actual application executable.
 
In contrast, boot scenario **2b** uses a processor-external IMEM (<<_mem_int_imem_en>> = _false_). In this scenario the
system designer is responsible for providing an initialized external memory that contains the actual application to be executed.
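
The four boot scenarios follow directly from two boolean generics. The C sketch below summarizes this mapping. It is an illustration only: the names are hypothetical, and the reset addresses assume the fixed bootloader ROM base `0xFFFF0000` and the default `ispace_base_c` of `0x00000000`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the scenarios shown in the figure. */
typedef enum {
    BOOT_1A, /* indirect boot, internal IMEM (RAM)              */
    BOOT_1B, /* indirect boot, external IMEM                    */
    BOOT_2A, /* direct boot, internal IMEM (pre-initialized ROM) */
    BOOT_2B  /* direct boot, external IMEM                      */
} boot_scenario_t;

/* Select the scenario from INT_BOOTLOADER_EN and MEM_INT_IMEM_EN. */
static boot_scenario_t boot_scenario(bool int_bootloader_en, bool mem_int_imem_en)
{
    if (int_bootloader_en) {
        return mem_int_imem_en ? BOOT_1A : BOOT_1B;
    }
    return mem_int_imem_en ? BOOT_2A : BOOT_2B;
}

/* Address of the first instruction fetched after reset.
 * Assumption: default ispace_base_c (0x00000000). */
static uint32_t boot_address(bool int_bootloader_en)
{
    return int_bootloader_en ? UINT32_C(0xFFFF0000) : UINT32_C(0x00000000);
}
```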
 
 
 
<<<
// ####################################################################################################################
:sectnums:
1070,6 → 1285,8
 
include::soc_wishbone.adoc[]
 
include::soc_slink.adoc[]
 
include::soc_gpio.adoc[]
 
include::soc_wdt.adoc[]
1088,10 → 1305,10
 
include::soc_cfs.adoc[]
 
include::soc_nco.adoc[]
 
include::soc_neoled.adoc[]
 
include::soc_xirq.adoc[]
 
include::soc_sysinfo.adoc[]
 
 
/datasheet/soc_bootrom.adoc
8,29 → 8,32
| Hardware source file(s): | neorv32_boot_rom.vhd |
| Software driver file(s): | none | _implicitly used_
| Top entity port: | none |
| Configuration generics: | _BOOTLOADER_EN_ | implement processor-internal bootloader when _true_
| Configuration generics: | _INT_BOOTLOADER_EN_ | implement processor-internal bootloader when _true_
| CPU interrupts: | none |
|=======================
 
As the name already suggests, the boot ROM contains the read-only bootloader image. When the bootloader
is enabled via the _BOOTLOADER_EN_ generic it is directly executed after system reset.
[NOTE]
The default `neorv32_boot_rom.vhd` HDL source file provides a _generic_ memory design that infers embedded
memory for _larger_ memory configurations. You might need to replace/modify the source file in order to use
platform-specific features (like advanced memory resources) or to improve technology mapping and/or timing.
 
The bootloader ROM is located at address 0xFFFF0000. This location is fixed and the bootloader ROM size
must not exceed 32kB. The bootloader read-only memory is automatically initialized during synthesis via the
`rtl/core/neorv32_bootloader_image.vhd` file, which is generated when compiling and installing the
bootloader sources.
This HDL module provides a read-only memory that contains the executable code image of the bootloader.
If the <<_int_bootloader_en>> generic is _true_ this module will be implemented and the CPU boot address
is modified to directly execute the code from the bootloader ROM after reset.
 
The bootloader ROM address space cannot be used for other applications even when the bootloader is not
implemented.
The bootloader ROM is located at address `0xFFFF0000` and can occupy an address space of up to 32kB. The base
address as well as the maximum address space size are fixed and cannot (should not!) be modified as this
might cause address collisions with other processor modules.
 
**Boot Configuration**
The bootloader memory is _read-only_ and is automatically initialized with the bootloader executable image
`rtl/core/neorv32_bootloader_image.vhd` during synthesis. The actual _physical_ size of the ROM is also
determined via synthesis and expanded to the next power of two. For example, if the bootloader code requires
10kB of storage, a ROM with 16kB will be generated. The maximum size must not exceed 32kB.
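
The size rule can be reproduced with a short calculation. The following C sketch rounds the image size up to the next power of two and rejects images above the 32kB limit. The function name is hypothetical and the sketch only models the sizing rule, not the actual synthesis flow.

```c
#include <stdint.h>

/* Physical ROM size (in bytes) for a bootloader image of the given size.
 * Returns 0 if the image exceeds the 32kB maximum. */
static uint32_t bootrom_physical_size(uint32_t image_bytes)
{
    const uint32_t max_size = 32u * 1024u;
    uint32_t size = 1u;

    if (image_bytes > max_size) {
        return 0u; /* image does not fit into the bootloader address space */
    }
    while (size < image_bytes) {
        size <<= 1; /* expand to the next power of two */
    }
    return size;
}
```

For the example from the text, `bootrom_physical_size(10 * 1024)` yields 16384 bytes (16kB).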
 
If the bootloader is implemented, the CPU starts execution after reset right at the beginning of the boot
ROM. If the bootloader is not implemented, the CPU starts execution at the beginning of the instruction
memory space (defined via `ispace_base_c` constant in the `neorv32_package.vhd` VHDL package file,
default `ispace_base_c` = 0x00000000). In this case, the instruction memory has to contain a valid
executable – either by using the internal IMEM with an initialization during synthesis or by a user-defined
initialization process.
.Bootloader - Software
[TIP]
See section <<_bootloader>> for more information regarding the actual bootloader software/executable itself.
 
.Boot Configuration
[TIP]
See section <<_bootloader>> for more information regarding the bootloader's boot process and configuration options.
See section <<_boot_configuration>> for more information regarding the processor's different boot scenarios.
/datasheet/soc_dmem.adoc
13,6 → 13,11
| CPU interrupts: | none |
|=======================
 
[NOTE]
The default `neorv32_dmem.vhd` HDL source file provides a _generic_ memory design that infers embedded
memory for _larger_ memory configurations. You might need to replace/modify the source file in order to use
platform-specific features (like advanced memory resources) or to improve technology mapping and/or timing.
 
Implementation of the processor-internal data memory is enabled via the processor's _MEM_INT_DMEM_EN_
generic. The size in bytes is defined via the _MEM_INT_DMEM_SIZE_ generic. If the DMEM is implemented,
the memory is mapped into the data memory space and located right at the beginning of the data memory
/datasheet/soc_gpio.adoc
8,35 → 8,32
| Hardware source file(s): | neorv32_gpio.vhd |
| Software driver file(s): | neorv32_gpio.c |
| | neorv32_gpio.h |
| Top entity port: | `gpio_o` | 32-bit parallel output port
| | `gpio_i` | 32-bit parallel input port
| Top entity port: | `gpio_o` | 64-bit parallel output port
| | `gpio_i` | 64-bit parallel input port
| Configuration generics: | _IO_GPIO_EN_ | implement GPIO port when _true_
| CPU interrupts: | FIRQ channel 8 | pin-change interrupt (see <<_processor_interrupts>>)
| CPU interrupts: | none |
|=======================
 
**Theory of Operation**
 
The general purpose parallel IO port unit provides a simple 32-bit parallel input port and a 32-bit parallel
The general purpose parallel IO port unit provides a simple 64-bit parallel input port and a 64-bit parallel
output port. These ports can be used chip-externally (for example to drive status LEDs, connect buttons, etc.)
or system-internally to provide control signals for other IP modules. When the module is disabled for
implementation the GPIO output port is tied to zero.
or system-internally to provide control signals for other IP modules. The component is disabled for
implementation when the _IO_GPIO_EN_ generic is set _false_. In this case the GPIO output port is tied to all-zero.
 
**Pin-Change Interrupt**
.Access atomicity
[NOTE]
The GPIO module uses two memory-mapped registers (each 32-bit) each for accessing the input and
output signals. Since the CPU can only process 32 bits "at once", updating the entire output cannot
be performed within a single clock cycle.
 
The parallel input port `gpio_i` features a single pin-change interrupt. Whenever an input pin has a low-to-high
or high-to-low transition, the interrupt is triggered. By default, the pin-change interrupt is disabled and
can be enabled using a bit mask that has to be written to the _GPIO_INPUT_ register. Each set bit in this mask
enables the pin-change interrupt for the corresponding input pin. If more than one input pin is enabled for
triggering the pin-change interrupt, any transition on one of the enabled input pins will trigger the CPU's pin-change
interrupt. If the module is disabled for implementation, the pin-change interrupt is also permanently
disabled.
 
.GPIO unit register map
[cols="<2,<2,^1,^1,<6"]
[options="header",grid="rows"]
|=======================
| Address | Name [C] | Bit(s) | R/W | Function
.2+<| `0xffffff80` .2+<| _GPIO_INPUT_ ^| 31:0 ^| r/- <| parallel input port
^| 31:0 ^| -/w <| parallel input pin-change IRQ enable mask
| `0xffffff84` | _GPIO_OUTPUT_ | 31:0 | r/w | parallel output port
| Address | Name [C] | Bit(s) | R/W | Function
| `0xffffffc0` | _GPIO_INPUT_LO_ | 31:0 | r/- | parallel input port pins 31:0 (write accesses are ignored)
| `0xffffffc4` | _GPIO_INPUT_HI_ | 31:0 | r/- | parallel input port pins 63:32 (write accesses are ignored)
| `0xffffffc8` | _GPIO_OUTPUT_LO_ | 31:0 | r/w | parallel output port pins 31:0
| `0xffffffcc` | _GPIO_OUTPUT_HI_ | 31:0 | r/w | parallel output port pins 63:32
|=======================
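
The access-atomicity note can be illustrated with a minimal C sketch. It assumes a hypothetical register view `gpio_out_regs_t` that mirrors the _GPIO_OUTPUT_LO_ / _GPIO_OUTPUT_HI_ pair from the register map; the struct and function names are illustrative only and are not taken from the `neorv32_gpio.c` driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the two 32-bit output registers
 * (GPIO_OUTPUT_LO and GPIO_OUTPUT_HI in the register map). */
typedef struct {
    volatile uint32_t output_lo; /* output pins 31:0  */
    volatile uint32_t output_hi; /* output pins 63:32 */
} gpio_out_regs_t;

/* Update all 64 output pins using two separate 32-bit stores.
 * Between the two stores the port shows a mix of old and new bits,
 * so the update is not atomic. */
static void gpio_port_set64(gpio_out_regs_t *regs, uint64_t value)
{
    assert(regs != NULL);
    regs->output_lo = (uint32_t)(value & 0xFFFFFFFFu);
    regs->output_hi = (uint32_t)(value >> 32);
}

/* Read back the 64-bit output value from the two registers. */
static uint64_t gpio_port_get64(const gpio_out_regs_t *regs)
{
    assert(regs != NULL);
    return ((uint64_t)regs->output_hi << 32) | (uint64_t)regs->output_lo;
}
```

If external logic must never observe an intermediate state, one option is to keep signals that change together within the same 32-bit half of the port.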
/datasheet/soc_icache.adoc
15,6 → 15,11
| CPU interrupts: | none |
|=======================
 
[NOTE]
The default `neorv32_icache.vhd` HDL source file provides a _generic_ memory design that infers embedded
memory. You might need to replace/modify the source file in order to use platform-specific features
(like advanced memory resources) or to improve technology mapping and/or timing.
 
The processor features an optional cache for instructions to compensate for memories with high latency. The
cache is directly connected to the CPU's instruction fetch interface and provides fully transparent buffering
of instruction fetch accesses to the entire 4GB address space.
/datasheet/soc_imem.adoc
10,10 → 10,15
| Top entity port: | none |
| Configuration generics: | _MEM_INT_IMEM_EN_ | implement processor-internal IMEM when _true_
| | _MEM_INT_IMEM_SIZE_ | IMEM size in bytes
| | _MEM_INT_IMEM_ROM_ | implement IMEM as ROM when _true_
| | _INT_BOOTLOADER_EN_ | use internal bootloader when _true_ (implements IMEM as ROM)
| CPU interrupts: | none |
|=======================
 
[NOTE]
The default `neorv32_imem.vhd` HDL source file provides a _generic_ memory design that infers embedded
memory for _larger_ memory configurations. You might need to replace/modify the source file in order to use
platform-specific features (like advanced memory resources) or to improve technology mapping and/or timing.
 
Implementation of the processor-internal instruction memory is enabled via the processor's
_MEM_INT_IMEM_EN_ generic. The size in bytes is defined via the _MEM_INT_IMEM_SIZE_ generic. If the
IMEM is implemented, the memory is mapped into the instruction memory space and located right at the
22,8 → 27,8
By default, the IMEM is implemented as RAM, so the content can be modified during run time. This is
required when using a bootloader that can update the content of the IMEM at any time. If you do not need
the bootloader anymore – since your application development has completed and you want the program to
permanently reside in the internal instruction memory – the IMEM can also be implemented as true _read-only_
memory. In this case set the _MEM_INT_IMEM_ROM_ generic of the processor's top entity to _true_.
permanently reside in the internal instruction memory – the IMEM is automatically implemented as _pre-initialized_
ROM when the processor-internal bootloader is disabled (_INT_BOOTLOADER_EN_ = _false_).
 
When the IMEM is implemented as ROM, it will be initialized during synthesis with the actual application
program image. The compiler toolchain will generate a VHDL initialization
/datasheet/soc_neoled.adoc
167,7 → 167,7
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.22+<| `0xffffffd8` .22+<| _NEOLED_CT_ <|`0` _NEOLED_CT_EN_ ^| r/w <| NCO enable
.22+<| `0xffffffd8` .22+<| _NEOLED_CT_ <|`0` _NEOLED_CT_EN_ ^| r/w <| NEOLED enable
<|`1` _NEOLED_CT_MODE_ ^| r/w <| data transfer size; `0`=24-bit; `1`=32-bit
<|`2` _NEOLED_CT_BSCON_ ^| r/w <| busy flag / IRQ trigger configuration (see table above)
<|`3` _NEOLED_CT_PRSC0_ ^| r/w <| 3-bit clock prescaler, bit 0
/datasheet/soc_slink.adoc
0,0 → 1,142
<<<
:sectnums:
==== Stream Link Interface (SLINK)
 
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_slink.vhd |
| Software driver file(s): | neorv32_slink.c |
| | neorv32_slink.h |
| Top entity port: | `slink_tx_dat_o` | TX link data (8x32-bit)
| | `slink_tx_val_o` | TX link data valid (8-bit)
| | `slink_tx_rdy_i` | TX link allowed to send (8-bit)
| | `slink_rx_dat_i` | RX link data (8x32-bit)
| | `slink_rx_val_i` | RX link data valid (8-bit)
| | `slink_rx_rdy_o` | RX link ready to receive (8-bit)
| Configuration generics: | _SLINK_NUM_TX_ | Number of TX links to implement (0..8)
| | _SLINK_NUM_RX_ | Number of RX links to implement (0..8)
| | _SLINK_TX_FIFO_ | FIFO depth (1..32k) of TX links, has to be a power of two
| | _SLINK_RX_FIFO_ | FIFO depth (1..32k) of RX links, has to be a power of two
| CPU interrupts: | fast IRQ channel 10 | RX data available (see <<_processor_interrupts>>)
| | fast IRQ channel 11 | TX data sent (see <<_processor_interrupts>>)
|=======================
 
The SLINK component provides up to 8 independent RX (receiving) and TX (sending) links for transmitting
stream data. The interface provides higher bandwidth (and less latency) than the external memory bus
interface, which makes it ideally suited to couple custom stream processing units (like CORDIC, FFTs or
cryptographic accelerators).
 
Each individual link provides an internal FIFO for data buffering. The FIFO depth is globally defined
for all TX links via the _SLINK_TX_FIFO_ generic and for all RX links via the _SLINK_RX_FIFO_ generic.
The FIFO depth has to be at least 1, which will implement a simple input/output register. The maximum
value is limited to 32768 entries. Note that the FIFO depth has to be a power of two (for optimal
logic mapping).
 
The actual number of implemented RX/TX links is configured by the _SLINK_NUM_RX_ and _SLINK_NUM_TX_
generics. The SLINK module will be synthesized only if at least one of these generics is greater than
zero. All unimplemented links are internally terminated and their according output signals are pulled
to low level.
 
[NOTE]
The SLINK interface does not provide any additional tag signals (for example to define a "stream destination
address" or to indicate the last data word of a "package"). Use a custom controller connected
via the external memory bus interface or the processor's GPIO ports to implement custom data tags.
 
**Theory of Operation**
 
The SLINK is activated by setting the control register's (_SLINK_CT_) enable bit _SLINK_CT_EN_.
The actual data links are accessed by reading or writing the according link data registers _SLINK_CH0_
to _SLINK_CH7_. For example, writing to _SLINK_CH0_ will put the according data into the FIFO of TX link 0.
Accordingly, reading from _SLINK_CH0_ will return one data word from the FIFO of RX link 0.
 
The FIFO status of each RX and TX link is available via read-only bits in the device's control register.
Bits _SLINK_CT_TX0_FREE_ to _SLINK_CT_TX7_FREE_ indicate if the FIFO of the according TX link can take another
data word. Bits _SLINK_CT_RX0_AVAIL_ to _SLINK_CT_RX7_AVAIL_ indicate if the FIFO of the according RX link
contains another data word.
 
The _SLINK_CT_TX_FIFO_Sx_ and _SLINK_CT_RX_FIFO_Sx_ bits allow software to determine the total TX & RX FIFO sizes.
The _SLINK_CT_TX_NUMx_ and _SLINK_CT_RX_NUMx_ bits represent the absolute number of implemented TX and RX links
with an offset of "-1" (`0b000` = 1 link implemented, ..., `0b111` = 8 links implemented).
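
As a rough illustration of how software might interpret these read-only fields, the following C sketch decodes a raw _SLINK_CT_ value using the bit positions listed in the SLINK register map (FIFO depth fields hold log2 of the depth, link count fields hold the count minus 1). The type `slink_config_t` and the function `slink_decode_ct` are hypothetical names, not part of the `neorv32_slink.c` driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the decoded configuration. */
typedef struct {
    unsigned tx_fifo_depth; /* entries per TX link FIFO */
    unsigned rx_fifo_depth; /* entries per RX link FIFO */
    unsigned tx_links;      /* number of TX links as encoded in the field */
    unsigned rx_links;      /* number of RX links as encoded in the field */
} slink_config_t;

/* Decode the read-only configuration fields of a raw SLINK_CT value.
 * Bit positions follow the register map: 29:26 and 25:22 hold log2 of
 * the FIFO depths, 21:19 and 18:16 hold the link counts minus 1. */
static void slink_decode_ct(uint32_t ct, slink_config_t *cfg)
{
    assert(cfg != NULL);
    cfg->tx_fifo_depth = 1u << ((ct >> 26) & 0xFu);
    cfg->rx_fifo_depth = 1u << ((ct >> 22) & 0xFu);
    cfg->tx_links      = ((ct >> 19) & 0x7u) + 1u;
    cfg->rx_links      = ((ct >> 16) & 0x7u) + 1u;
}
```

Because of the "-1" offset, the count fields alone cannot express zero implemented links; the _SLINK_CT_TXx_FREE_ and _SLINK_CT_RXx_AVAIL_ flags of an unimplemented link would presumably never be set, but that is an assumption to verify against the hardware source.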
 
**Blocking Link Access**
 
When directly accessing the link data registers (without checking the according FIFO status flags) the access
might be executed as _blocking_. That means the CPU access will stall until the accessed link responds. For
example, when reading RX link 0 (via _SLINK_CH0_ register) the CPU access will stall if there is no data
available in the according FIFO. The CPU access will complete as soon as RX link 0 receives new data.
 
Vice versa, writing data to TX link 0 (via _SLINK_CH0_ register) might stall the CPU access until there is
at least one free entry in the link's FIFO.
 
[WARNING]
The NEORV32 processor ensures that _any_ CPU access to memory-mapped devices (including the SLINK module)
will **time out** after a certain number of cycles (see section <<_bus_interface>>).
Hence, blocking access to a stream link that does not complete within a certain amount of cycles will
raise a _store bus access exception_ when writing a TX link or a _load bus access exception_ when reading
an RX link.
 
**Non-Blocking Link Access**
 
For a non-blocking link access concept, the FIFO status signal in _SLINK_CT_ needs to be checked before
reading/writing the actual link data register. For example, a non-blocking write access to TX link 0 has
to check _SLINK_CT_TX0_FREE_ first. If the bit is set, the FIFO of TX link 0 can take another data word
and the actual data can be written to _SLINK_CH0_. If the bit is cleared, the link's FIFO is full
and the status flag can be polled until it indicates free space in the FIFO.
 
This concept will not raise any exception as there is no "direct" access to the link data registers.
However, non-blocking accesses require additional instructions to check the according status flags prior
to the actual link access, which will reduce performance for high-bandwidth data streams.
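
The non-blocking concept can be sketched in C as follows. The register view `slink_regs_t`, the two `*_LSB` macros and the `slink_try_*` functions are hypothetical and only model the layout described in the SLINK register map (RX "available" flags in bits 7:0, TX "free" flags in bits 15:8); they are not the functions provided by `neorv32_slink.c`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register view: control word plus eight link data words. */
typedef struct {
    volatile uint32_t ct;    /* SLINK_CT */
    volatile uint32_t ch[8]; /* SLINK_CH0 .. SLINK_CH7 */
} slink_regs_t;

#define SLINK_CT_RX_AVAIL_LSB 0u /* bits 7:0:  RX FIFO holds data   */
#define SLINK_CT_TX_FREE_LSB  8u /* bits 15:8: TX FIFO has free space */

/* Try to send one word; returns false (without touching the data
 * register) if the TX FIFO of the selected link is full. */
static bool slink_try_put(slink_regs_t *regs, unsigned link, uint32_t data)
{
    assert(regs != NULL && link < 8u);
    if (((regs->ct >> (SLINK_CT_TX_FREE_LSB + link)) & 1u) == 0u) {
        return false;
    }
    regs->ch[link] = data;
    return true;
}

/* Try to receive one word; returns false (without touching the data
 * register) if the RX FIFO of the selected link is empty. */
static bool slink_try_get(slink_regs_t *regs, unsigned link, uint32_t *data)
{
    assert(regs != NULL && data != NULL && link < 8u);
    if (((regs->ct >> (SLINK_CT_RX_AVAIL_LSB + link)) & 1u) == 0u) {
        return false;
    }
    *data = regs->ch[link];
    return true;
}
```

A caller would typically poll, for example `while (!slink_try_put(regs, 0, word)) { }`, accepting the extra status check per word in exchange for never triggering a bus access timeout.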
 
**Interrupts**
 
The stream interface provides two interrupts that are _globally_ driven by the RX and TX link's
FIFO level status. If the FIFO of **any** TX link _was full_ and _becomes empty_ again, the TX interrupt fires.
Accordingly, if the FIFO of **any** RX link _was empty_ and a _new data word_ appears in it, the RX interrupt fires.
 
Note that these interrupts can only fire if the SLINK module is actually enabled by setting the
_SLINK_CT_EN_ bit in the unit's control register.
 
**Stream Link Interface & Protocol**
 
The SLINK interface consists of three signals `dat`, `val` and `rdy` for each RX and TX link.
Each signal is an "array" with eight entries (one for each link). Note that an entry in `slink_*x_dat` is 32-bit
wide while entries in `slink_*x_val` and `slink_*x_rdy` are just 1-bit wide.
 
The stream link protocol is based on a simple FIFO-like interface between a source (sender) and a sink (receiver).
Each link provides two signals for implementing a simple FIFO-style handshake. The `slink_*x_val` signal is set
if the according `slink_*x_dat` contains valid data. The stream source has to ensure that both signals remain
stable until the according `slink_*x_rdy` signal is set. This signal is set by the stream sink to indicate it
can accept another data word.
 
In summary, a data word is transferred if both `slink_*x_val` and `slink_*x_rdy` are high.
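
The transfer rule can be modelled per clock cycle with a small C sketch. This is a behavioral illustration only, not the VHDL implementation; the type `slink_source_t` and the function `slink_cycle` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of what the stream source drives on one link. */
typedef struct {
    uint32_t dat; /* slink_*x_dat entry */
    bool     val; /* slink_*x_val entry */
} slink_source_t;

/* Model one clock cycle of a single link. A word is transferred only
 * if val and rdy are both high; otherwise the sink data is unchanged
 * and the source has to keep dat/val stable for the next cycle. */
static bool slink_cycle(const slink_source_t *src, bool sink_rdy, uint32_t *sink_dat)
{
    assert(src != NULL && sink_dat != NULL);
    if (src->val && sink_rdy) {
        *sink_dat = src->dat;
        return true;
    }
    return false;
}
```

Calling `slink_cycle` repeatedly with `sink_rdy` low models a stalled link: nothing is transferred until the sink raises `rdy`.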
 
.Exemplary stream link transfer
image::stream_link_interface.png[width=560,align=center]
 
[TIP]
The SLINK handshake protocol is compatible to the AXI4-Stream base protocol.
 
.SLINK register map
[cols="^4,<5,^2,^2,<14"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s) | R/W | Function
.8+<| `0xfffffec0` .8+<| _SLINK_CT_ <| `31` _SLINK_CT_EN_ ^| r/w | SLINK global enable
<| `30` _reserved_ ^| r/- <| reserved, read as zero
<| `29:26` _SLINK_CT_TX_FIFO_S3_ : _SLINK_CT_TX_FIFO_S0_ ^| r/- <| TX links FIFO depth, log2 of _SLINK_TX_FIFO_ generic
<| `25:22` _SLINK_CT_RX_FIFO_S3_ : _SLINK_CT_RX_FIFO_S0_ ^| r/- <| RX links FIFO depth, log2 of _SLINK_RX_FIFO_ generic
<| `21:19` _SLINK_CT_TX_NUM2_ : _SLINK_CT_TX_NUM0_ ^| r/- <| Number of implemented TX links minus 1
<| `18:16` _SLINK_CT_RX_NUM2_ : _SLINK_CT_RX_NUM0_ ^| r/- <| Number of implemented RX links minus 1
<| `15:8` _SLINK_CT_TX7_FREE_ : _SLINK_CT_TX0_FREE_ ^| r/- <| At least one free TX FIFO entry available for link 0..7
<| `7:0` _SLINK_CT_RX7_AVAIL_ : _SLINK_CT_RX0_AVAIL_ ^| r/- <| At least one data word in RX FIFO available for link 0..7
| `0xfffffec4` : `0xfffffedc` | _SLINK_CT_ |`31:0` | | _mirrored control register_
| `0xfffffee0` | _SLINK_CH0_ | `31:0` | r/w | Link 0 RX/TX data
| `0xfffffee4` | _SLINK_CH1_ | `31:0` | r/w | Link 1 RX/TX data
| `0xfffffee8` | _SLINK_CH2_ | `31:0` | r/w | Link 2 RX/TX data
| `0xfffffeec` | _SLINK_CH3_ | `31:0` | r/w | Link 3 RX/TX data
| `0xfffffef0` | _SLINK_CH4_ | `31:0` | r/w | Link 4 RX/TX data
| `0xfffffef4` | _SLINK_CH5_ | `31:0` | r/w | Link 5 RX/TX data
| `0xfffffef8` | _SLINK_CH6_ | `31:0` | r/w | Link 6 RX/TX data
| `0xfffffefc` | _SLINK_CH7_ | `31:0` | r/w | Link 7 RX/TX data
|=======================
/datasheet/soc_sysinfo.adoc
6,7 → 6,7
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_sysinfo.vhd |
| Software driver file(s): | (neorv32.h) |
| Software driver file(s): | neorv32.h |
| Top entity port: | none |
| Configuration generics: | * | most of the top's configuration generics
| CPU interrupts: | none |
42,13 → 42,12
[options="header",grid="all"]
|=======================
| Bit | Name [C] | Function
| `0` | _SYSINFO_FEATURES_BOOTLOADER_ | set if the processor-internal bootloader is implemented (via top's _BOOTLOADER_EN_ generic)
| `0` | _SYSINFO_FEATURES_BOOTLOADER_ | set if the processor-internal bootloader is implemented (via top's _INT_BOOTLOADER_EN_ generic)
| `1` | _SYSINFO_FEATURES_MEM_EXT_ | set if the external Wishbone bus interface is implemented (via top's _MEM_EXT_EN_ generic)
| `2` | _SYSINFO_FEATURES_MEM_INT_IMEM_ | set if the processor-internal IMEM is implemented (via top's _MEM_INT_IMEM_EN_ generic)
| `3` | _SYSINFO_FEATURES_MEM_INT_IMEM_ROM_ | set if the processor-internal IMEM is read-only (via top's _MEM_INT_IMEM_ROM_ generic)
| `4` | _SYSINFO_FEATURES_MEM_INT_DMEM_ | set if the processor-internal IMEM is implemented (via top's _MEM_INT_IMEM_EN_ generic)
| `5` | _SYSINFO_FEATURES_MEM_EXT_ENDIAN_ | set if external bus interface uses BIG-endian byte-order (via package's `xbus_big_endian_c` constant)
| `6` | _SYSINFO_FEATURES_ICACHE_ | set if processor-internal instruction cache is implemented (via _ICACHE_EN_ generic)
| `3` | _SYSINFO_FEATURES_MEM_INT_DMEM_ | set if the processor-internal DMEM is implemented (via top's _MEM_INT_DMEM_EN_ generic)
| `4` | _SYSINFO_FEATURES_MEM_EXT_ENDIAN_ | set if external bus interface uses BIG-endian byte-order (via package's `wb_big_endian_c` constant)
| `5` | _SYSINFO_FEATURES_ICACHE_ | set if processor-internal instruction cache is implemented (via _ICACHE_EN_ generic)
| `14` | _SYSINFO_FEATURES_HW_RESET_ | set if on-chip debugger implemented (via _ON_CHIP_DEBUGGER_EN_ generic)
| `15` | _SYSINFO_FEATURES_HW_RST_ | set if a dedicated hardware reset of all core registers is implemented (via package's _dedicated_reset_c_ constant)
| `15` | _SYSINFO_FEATURES_HW_RST_ | set if a dedicated hardware reset of all core registers is implemented (via package's _dedicated_reset_c_ constant)
61,7 → 60,7
| `22` | _SYSINFO_FEATURES_IO_WDT_ | set if the WDT is implemented (via top's _IO_WDT_EN_ generic)
| `23` | _SYSINFO_FEATURES_IO_CFS_ | set if the custom functions subsystem is implemented (via top's _IO_CFS_EN_ generic)
| `24` | _SYSINFO_FEATURES_IO_TRNG_ | set if the TRNG is implemented (via top's _IO_TRNG_EN_ generic)
| `25` | _SYSINFO_FEATURES_IO_NCO_ | set if the NCO is implemented (via top's _IO_NCO_EN_ generic)
| `25` | _SYSINFO_FEATURES_IO_SLINK_ | set if the SLINK is implemented (via top's _SLINK_NUM_TX_ / _SLINK_NUM_RX_ generics)
| `26` | _SYSINFO_FEATURES_IO_UART1_ | set if the secondary UART1 is implemented (via top's _IO_UART1_EN_ generic)
| `27` | _SYSINFO_FEATURES_IO_NEOLED_ | set if the NEOLED is implemented (via top's _IO_NEOLED_EN_ generic)
|=======================
/datasheet/soc_trng.adoc
78,7 → 78,7
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.3+<| `0xffffff88` .3+<| _TRNG_CT_ <|`7:0` _TRNG_CT_DATA_MSB_ : _TRNG_CT_DATA_MSB_ ^| r/- <| 8-bit random data output
.3+<| `0xffffffb8` .3+<| _TRNG_CT_ <|`7:0` _TRNG_CT_DATA_MSB_ : _TRNG_CT_DATA_LSB_ ^| r/- <| 8-bit random data output
<|`30` _TRNG_CT_EN_ ^| r/w <| TRNG enable
<|`31` _TRNG_CT_VALID_ ^| r/- <| random data output is valid when set
|=======================
/datasheet/soc_wdt.adoc
10,7 → 10,7
| | neorv32_wdt.h |
| Top entity port: | none |
| Configuration generics: | _IO_WDT_EN_ | implement watchdog timer when _true_
| CPU interrupts: | FIRQ channel 0 | watchdog timer overflow (see <<_processor_interrupts>>)
| CPU interrupts: | fast IRQ channel 0 | watchdog timer overflow (see <<_processor_interrupts>>)
|=======================
 
**Theory of Operation**
57,7 → 57,7
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Writable if locked | Function
.9+<| `0xffffff8c` .9+<| _WDT_CT_ <|`0` _WDT_CT_EN_ ^| r/w ^| no <| watchdog enable
.9+<| `0xffffffbc` .9+<| _WDT_CT_ <|`0` _WDT_CT_EN_ ^| r/w ^| no <| watchdog enable
<|`1` _WDT_CT_CLK_SEL0_ ^| r/w ^| no .3+<| 3-bit clock prescaler select
<|`2` _WDT_CT_CLK_SEL1_ ^| r/w ^| no
<|`3` _WDT_CT_CLK_SEL2_ ^| r/w ^| no
/datasheet/soc_wishbone.adoc
5,7 → 5,7
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_wishbone.vhd |
| Hardware source file(s): | neorv32_wishbone.vhd |
| Software driver file(s): | none | _implicitly used_
| Top entity port: | `wb_tag_o` | request tag output (3-bit)
| | `wb_adr_o` | address output (32-bit)
23,8 → 23,9
| Configuration generics: | _MEM_EXT_EN_ | enable external memory interface when _true_
| | _MEM_EXT_TIMEOUT_ | number of clock cycles after which an unacknowledged external bus access will auto-terminate (0 = disabled)
| Configuration constants in VHDL package file `neorv32_package.vhd`: | `wb_pipe_mode_c` | when _false_ (default): classic/standard Wishbone protocol; when _true_: pipelined Wishbone protocol
| | `xbus_big_endian_c` | byte-order (Endianness) of external memory interface; true=BIG, false=little (default)
| CPU interrupts: | none |
| | `wb_big_endian_c` | byte-order (Endianness) of external memory interface; true=BIG, false=little (default)
| | `wb_rx_buffer_c` | enable register buffer for RX path (default)
| CPU interrupts: | none |
|=======================
 
The external memory interface uses the Wishbone interface protocol. The external interface port is available
49,7 → 50,7
 
[source,vhdl]
----
-- (external) bus interface --
-- external bus interface --
constant wb_pipe_mode_c : boolean := false;
----
 
76,9 → 77,20
 
**Interface Latency**
 
The Wishbone gateway introduces two additional latency cycles: Processor-outgoing and -incoming signals
are fully registered. Thus, any access from the CPU to a processor-external devices requires +2 clock cycles.
By default, the Wishbone gateway introduces two additional latency cycles: processor-outgoing ("TX") and
processor-incoming ("RX") signals are fully registered. Thus, any access from the CPU to a processor-external device
via Wishbone requires at least 2 additional clock cycles (depending on the device's latency).
 
If the attached Wishbone network / peripheral already provides output registers or if the Wishbone network is not relevant
for timing closure, the default buffering of incoming ("RX") data within the gateway can be disabled.
The configuration is done via the `wb_rx_buffer_c` constant in the main VHDL package file (`rtl/neorv32_package.vhd`):
 
[source,vhdl]
----
-- external bus interface --
constant wb_rx_buffer_c : boolean := false; -- false to implement "async" RX (non-default)
----
 
**Bus Access Timeout**
 
The Wishbone bus interface provides an option to configure a bus access timeout counter. The _MEM_EXT_TIMEOUT_
125,14 → 137,14
 
The NEORV32 CPU and the Processor setup are *little-endian* architectures. To allow direct connection
to a big-endian memory system the external bus interface provides an _Endianness configuration_. The
Endianness (of the external memory interface) can be configured via the global `xbus_big_endian_c`
Endianness (of the external memory interface) can be configured via the global `wb_big_endian_c`
constant in the main VHDL package file (`rtl/neorv32_package.vhd`). By default, the external memory
interface uses little-endian byte-order.
 
[source,vhdl]
----
-- (external) bus interface --
constant xbus_big_endian_c : boolean := true;
-- external bus interface --
constant wb_big_endian_c : boolean := true;
----
 
Application software can check the Endianness configuration of the external bus interface via the
141,7 → 153,7
 
**AXI4-Lite Connectivity**
 
The AXI4-Lite wrapper (`rtl/top_templates/neorv32_top_axi4lite.vhd`) provides a Wishbone-to-
The AXI4-Lite wrapper (`rtl/templates/system/neorv32_SystemTop_axi4lite.vhd`) provides a Wishbone-to-
AXI4-Lite bridge, compatible with Xilinx Vivado (IP packager and block design editor). All entity signals of
this wrapper are of type _std_logic_ or _std_logic_vector_, respectively.
 
153,6 → 165,4
 
[WARNING]
Using the auto-termination timeout feature (_MEM_EXT_TIMEOUT_ greater than zero) is **not AXI4 compliant** as the AXI protocol does not support canceling of
bus transactions. Therefore, the NEORV32 top wrapper with AXI4-Lite interface (`rtl/top_templates/neorv32_top_axi4lite`) configures _MEM_EXT_TIMEOUT_ = 0 by default.
 
 
bus transactions. Therefore, the NEORV32 top wrapper with AXI4-Lite interface (`rtl/templates/system/neorv32_SystemTop_axi4lite`) configures _MEM_EXT_TIMEOUT_ = 0 by default.
/datasheet/soc_xirq.adoc
0,0 → 1,69
<<<
:sectnums:
==== External Interrupt Controller (XIRQ)
 
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_xirq.vhd |
| Software driver file(s): | neorv32_xirq.c |
| | neorv32_xirq.h |
| Top entity port: | `xirq_i` | IRQ input (up to 32-bit)
| Configuration generics: | _XIRQ_NUM_CH_ | Number of IRQs to implement (0..32)
| | _XIRQ_TRIGGER_TYPE_ | IRQ trigger type configuration
| | _XIRQ_TRIGGER_POLARITY_ | IRQ trigger polarity configuration
| CPU interrupts: | fast IRQ channel 8 | XIRQ (see <<_processor_interrupts>>)
|=======================
 
The eXternal interrupt controller provides a simple mechanism to implement up to 32 processor-external interrupt
request signals. The external IRQ requests are prioritized, queued and signaled to the CPU via a
single _CPU fast interrupt request_.
 
**Theory of Operation**
 
The XIRQ provides up to 32 interrupt _channels_ (configured via the _XIRQ_NUM_CH_ generic). Each bit in `xirq_i`
represents one interrupt channel. An interrupt channel is enabled by setting the according bit in the
interrupt enable register _XIRQ_IER_.
 
If the configured trigger (see below) of an enabled channel fires, the request is stored into an internal buffer.
This buffer is available via the interrupt pending register _XIRQ_IPR_. A `1` in this register indicates that the
corresponding interrupt channel has fired but has not yet been serviced (so it is pending). Pending IRQs can be
cleared by writing `1` to the according pending bit. As soon as there is at least one pending interrupt in the
buffer, an interrupt request is sent to the CPU.
 
The CPU can determine the firing interrupt request either by checking the bits in the _XIRQ_IPR_ register, which shows all
pending interrupts without prioritization, or by reading the interrupt source register _XIRQ_SCR_.
This register provides a 5-bit wide ID (0..31) that shows the interrupt request with _highest priority_.
Interrupt channel `xirq_i(0)` has highest priority and `xirq_i(_XIRQ_NUM_CH_-1)` has lowest priority.
This priority assignment is fixed and cannot be altered by software.
The CPU can use the ID from _XIRQ_SCR_ to service IRQs according to their priority. To acknowledge the according
interrupt the CPU can write `1 << XIRQ_SCR` to _XIRQ_IPR_.
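
The fixed priority scheme and the acknowledge value can be modelled in C as shown below. This is a software model of the behavior described above, assuming channel 0 has highest priority; `xirq_highest_pending` and `xirq_ack_mask` are hypothetical helper names and not part of `neorv32_xirq.c`.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the prioritization reflected by XIRQ_SCR: return the lowest
 * set bit index of the pending mask (channel 0 = highest priority),
 * or -1 if no interrupt is pending. */
static int xirq_highest_pending(uint32_t pending)
{
    for (int ch = 0; ch < 32; ch++) {
        if ((pending >> ch) & 1u) {
            return ch;
        }
    }
    return -1;
}

/* Value to write to XIRQ_IPR in order to acknowledge channel `id`
 * (the `1 << XIRQ_SCR` expression from the text). */
static uint32_t xirq_ack_mask(unsigned id)
{
    assert(id < 32u);
    return (uint32_t)1u << id;
}
```

In a real handler the ID would be read from _XIRQ_SCR_ instead of being computed in software; the model is mainly useful for reasoning about which channel is serviced first when several bits in _XIRQ_IPR_ are set.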
 
**IRQ Trigger Configuration**
 
The controller does not provide a configuration option to define the IRQ triggers _during runtime_. Instead, two
generics are provided to configure the trigger of each interrupt channel before synthesis: the _XIRQ_TRIGGER_TYPE_
and _XIRQ_TRIGGER_POLARITY_ generic. Both generics are 32 bit wide representing one bit per interrupt channel. If
less than 32 interrupt channels are implemented the remaining configuration bits are ignored.
 
_XIRQ_TRIGGER_TYPE_ is used to define the general trigger type. This can either be _level-triggered_ (`0`) or
_edge-triggered_ (`1`). _XIRQ_TRIGGER_POLARITY_ is used to configure the polarity of the trigger: a `0` defines
low-level or falling-edge and a `1` defines high-level or rising-edge.
 
.Example trigger configuration: channel 0 for rising-edge, IRQ channels 1 to 31 for high-level
[source, vhdl]
----
XIRQ_TRIGGER_TYPE => x"00000001";
XIRQ_TRIGGER_POLARITY => x"ffffffff";
----
 
.XIRQ register map
[cols="^4,<5,^2,^2,<14"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s) | R/W | Function
| `0xffffff80` | _XIRQ_IER_ | `31:0` | r/w | Interrupt enable register (one bit per channel, LSB-aligned)
| `0xffffff84` | _XIRQ_IPR_ | `31:0` | r/w | Interrupt pending register (one bit per channel, LSB-aligned); writing 1 to a bit clears according interrupt; writing _any_ value acknowledges the _current_ CPU interrupt
| `0xffffff88` | _XIRQ_SCR_ | `4:0` | r/- | Channel id (0..31) of firing IRQ (prioritized!)
| `0xffffff8c` | - | `31:0` | r/- | _reserved_, read as zero
|=======================
/datasheet/software.adoc
34,7 → 34,7
 
[TIP]
More information regarding the toolchain (building from scratch or downloading the prebuilt ones)
can be found in section <<_toolchain_setup>>.
can be found in the user guide's section https://stnolting.github.io/neorv32/ug/#_software_toolchain_setup[Software Toolchain Setup].
 
 
 
66,7 → 66,6
| `neorv32_gpio.c` | `neorv32_gpio.h` | HW driver functions for the **GPIO**
| - | `neorv32_intrinsics.h` | macros for custom intrinsics/instructions
| `neorv32_mtime.c` | `neorv32_mtime.h` | HW driver functions for the **MTIME**
| `neorv32_nco.c` | `neorv32_nco.h` | HW driver functions for the **NCO**
| `neorv32_neoled.c` | `neorv32_neoled.h` | HW driver functions for the **NEOLED**
| `neorv32_pwm.c` | `neorv32_pwm.h` | HW driver functions for the **PWM**
| `neorv32_rte.c` | `neorv32_rte.h` | NEORV32 **runtime environment** and helpers
118,7 → 117,7
|=======================
| `help` | Show a short help text explaining all available targets.
| `check` | Check the compiler toolchain. You should run this target at least once after installing the toolchain.
| `info` | Show the makefile configuration (see next chapter).
| `info` | Show the makefile configuration (see section <<_configuration>>).
| `exe` | Compile all sources and generate application executable for upload via bootloader.
| `install` | Compile all sources, generate executable (via exe target) for upload via bootloader and generate and install IMEM VHDL initialization image file `rtl/core/neorv32_application_image.vhd`.
| `all` | Execute `exe` and `install`.
213,31 → 212,73
:sectnums:
=== Executable Image Format
 
When all the application sources have been compiled and linked, a final executable file has to be generated.
For this purpose, the makefile uses the NEORV32-specific linker script `sw/common/neorv32.ld`. This linker script defines three memory sections:
`rom`, `ram` and `iodev`. These sections have specific access attributes: Read access (`r`), write access (`w`) and executable (`x`).
In order to generate a file, which can be executed by the processor, all source files have to be compiled, linked
and packed into a final _executable_.
 
.Linker memory sections
:sectnums:
==== Linker Script
 
When all the application sources have been compiled, they need to be _linked_ in order to generate a unified
program file. For this purpose the makefile uses the NEORV32-specific linker script `sw/common/neorv32.ld` for
linking all object files that were generated during compilation.
 
The linker script defines three memory _sections_: `rom`, `ram` and `iodev`. Each section provides specific
access _attributes_: read access (`r`), write access (`w`) and executable (`x`).
 
.Linker memory sections - general
[cols="<2,^1,<7"]
[options="header",grid="rows"]
|=======================
| Memory section | Attributes | Description
| `rom` | `rx` | Instruction memory (IMEM) **OR** bootloader ROM
| `ram` | `rwx` | Data memory (DMEM)
| `iodev` | `rw` | Memory-mapped IO/peripheral devices
| `ram` | `rwx` | Data memory address space (processor-internal/external DMEM)
| `rom` | `rx` | Instruction memory address space (processor-internal/external IMEM) _or_ internal bootloader ROM
| `iodev` | `rw` | Processor-internal memory-mapped IO/peripheral devices address space
|=======================
 
The `iodev` section is reserved for processor-internal memory-mapped IO and peripheral devices. The linker does not use this section at all
and just passes the start and end addresses of this section to the start-up code `crt0.S` (see next section).
These sections are defined right at the beginning of the linker script:
 
[NOTE]
The `rom` region is used to place the instructions of "normal" applications. If the bootloader is being compiled, the makefile defines the `make_bootloader`
symbol, which changes the _ORIGIN_ (base address) and _LENGTH_ (size) attributes of the `rom` region according to the BOOTROM definitions.
.Linker memory sections - cut-out from linker script `neorv32.ld`
[source,c]
----
MEMORY
{
ram (rwx) : ORIGIN = 0x80000000, LENGTH = DEFINED(make_bootloader) ? 512 : 8*1024
rom (rx) : ORIGIN = DEFINED(make_bootloader) ? 0xFFFF0000 : 0x00000000, LENGTH = DEFINED(make_bootloader) ? 32K : 2048M
iodev (rw) : ORIGIN = 0xFFFFFE00, LENGTH = 512
}
----
 
The linker maps all the regions from the compiled object files into only four final sections: `.text`, `.rodata`, `.data` and `.bss`
using the specified memory section. These four regions contain everything required for the application to run:
Each memory section provides a _base address_ `ORIGIN` and a _size_ `LENGTH`. The base address and size of the `iodev` section are
fixed and must not be altered. The base addresses and sizes of the `ram` and `rom` regions correspond to the total available instruction
and data memory address space (see section <<_address_space_layout>>).
 
.Executable regions
[IMPORTANT]
`ORIGIN` of the `ram` section has to be always identical to the processor's `dspace_base_c` hardware configuration. Additionally,
`ORIGIN` of the `rom` section has to be always identical to the processor's `ispace_base_c` hardware configuration.
 
The size of the `ram` section has to be equal to the size of the **physically available data memory**. For example, if the processor
setup only uses processor-internal DMEM (<<_mem_int_dmem_en>> = _true_ and no external data memory attached) the `LENGTH` parameter of
this memory section has to be equal to the size configured by the <<_mem_int_dmem_size>> generic.
 
The sizes of `rom` section is a little bit more complicated. The default linker script configuration assumes a _maximum_ of 2GB _logical_
memory space, which is also the default configuration of the processor's hardware instruction memory address space. This size _does not_ have
to reflect the _actual_ physical size of the instruction memory (internal IMEM and/or processor-external memory). It just provides a maximum
limit. When uploading a new executable via the bootloader, the bootloader itself checks if sufficient _physical_ instruction memory is available.
If a new executable is embedded right into the internal IMEM, the synthesis tool will check if the configured instruction memory size
is sufficient (e.g., via the <<_mem_int_imem_size>> generic).
 
[IMPORTANT]
The `rom` region uses a conditional assignment (via the `make_bootloader` symbol) for `ORIGIN` and `LENGTH` that places either a
"normal executable" (i.e. for the IMEM) or "the bootloader image" into its corresponding memory. +
+
The `ram` region also uses a conditional assignment (via the `make_bootloader` symbol) for `LENGTH`. When compiling the bootloader
(`make_bootloader` symbol is set) the generated bootloader will only use the _first_ 512 bytes of the data address space. This is
a fall-back to ensure the bootloader can operate independently of the actual _physical_ data memory size.
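
The conditional assignments described above can be mirrored in plain C to see which values the linker ends up with. This is only an illustrative sketch: the type `mem_region_t` and the functions `resolve_ram` and `resolve_rom` are hypothetical names and not part of the NEORV32 software framework; the numeric values are taken from the `neorv32.ld` cut-out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of one entry of the linker script's MEMORY block. */
typedef struct {
    uint32_t origin; /* ORIGIN: base address */
    uint32_t length; /* LENGTH: size in bytes */
} mem_region_t;

/* Mirrors: ram : ORIGIN = 0x80000000,
 *                LENGTH = DEFINED(make_bootloader) ? 512 : 8*1024 */
static mem_region_t resolve_ram(bool make_bootloader) {
    mem_region_t r;
    r.origin = 0x80000000u;
    r.length = make_bootloader ? 512u : 8u * 1024u;
    return r;
}

/* Mirrors: rom : ORIGIN = DEFINED(make_bootloader) ? 0xFFFF0000 : 0x00000000,
 *                LENGTH = DEFINED(make_bootloader) ? 32K : 2048M */
static mem_region_t resolve_rom(bool make_bootloader) {
    mem_region_t r;
    r.origin = make_bootloader ? 0xFFFF0000u : 0x00000000u;
    r.length = make_bootloader ? 32u * 1024u : 0x80000000u; /* 2048M = 2^31 bytes */
    return r;
}
```

Calling `resolve_rom(false)` yields the 2GB logical instruction space of a normal application, while `resolve_rom(true)` yields the BOOTROM window used when the bootloader is compiled.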
 
The linker maps all the regions from the compiled object files into four final sections: `.text`, `.rodata`, `.data` and `.bss`.
These four regions contain everything required for the application to run:
 
.Linker memory regions
[cols="<1,<9"]
[options="header",grid="rows"]
|=======================
249,15 → 290,20
|=======================
 
The `.text` and `.rodata` sections are mapped to processor's instruction memory space and the `.data` and
`.bss` sections are mapped to the processor's data memory space. Finally, the `.text`, `.rodata` and `.data` sections are extracted and concatenated into a single file
**`main.bin`**.
`.bss` sections are mapped to the processor's data memory space. Finally, the `.text`, `.rodata` and `.data`
sections are extracted and concatenated into a single file `main.bin`.
 
**Executable Image Generator**
 
The **`main.bin`** file is processed by the NEORV32 image generator (`sw/image_gen`) to generate the final
executable. It is automatically compiled when invoking the makefile. The image generator can generate three
types of executables, selected by a flag when calling the generator:
:sectnums:
==== Executable Image Generator
 
The `main.bin` file is packed by the NEORV32 image generator (`sw/image_gen`) to generate the final executable file.
 
[NOTE]
The sources of the image generator are automatically compiled when invoking the makefile.
 
The image generator can generate three types of executables, selected by a flag when calling the generator:
 
[cols="<1,<9"]
[grid="none"]
|=======================
266,62 → 312,105
| `-bld_img` | Generates an executable VHDL memory initialization image for the processor-internal BOOT ROM. This option generates the `rtl/core/neorv32_bootloader_image.vhd` file.
|=======================
 
All these options are managed by the makefile – so you don't actually have to think about them. The normal
application compilation flow will generate the `neorv32_exe.bin` file in the current software project folder
ready for upload via UART to the NEORV32 bootloader.
All these options are managed by the makefile. The _normal application_ compilation flow will generate the `neorv32_exe.bin`
executable to be uploaded via UART to the NEORV32 bootloader.
 
The actual executable provides a very small header consisting of three 32-bit words located right at the
beginning of the file. This header is generated by the image generator. The first word of the executable is the signature
word and is always `0x4788cafe`. Based on this word, the bootloader can identify a valid image file. The next word represents the size in bytes of the actual program
The image generator adds a small header to the `neorv32_exe.bin` executable, which consists of three 32-bit words located right at the
beginning of the file. The first word of the executable is the signature word and is always `0x4788cafe`. Based on this word the bootloader
can identify a valid image file. The next word represents the size in bytes of the actual program
image. A simple "complement" checksum of the actual program image is given by the third word. This
provides a simple protection against data transmission or storage errors.
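
The three-word header can be sketched in C as follows. This is a hedged illustration, not the image generator's actual source: the names `exe_header_t`, `exe_checksum`, `exe_make_header` and `exe_check` are hypothetical, and the checksum is *assumed* to be the two's complement of the 32-bit sum of all image words (so that sum plus checksum wraps to zero). Only the signature value `0x4788cafe` and the word order are taken from the text.

```c
#include <stddef.h>
#include <stdint.h>

#define EXE_SIGNATURE 0x4788cafeu /* signature word given in the text */

/* Hypothetical in-memory view of the three header words. */
typedef struct {
    uint32_t signature; /* word 0: identifies a valid image */
    uint32_t size;      /* word 1: size of the program image in bytes */
    uint32_t checksum;  /* word 2: "complement" checksum of the image */
} exe_header_t;

/* ASSUMPTION: checksum = two's complement of the wrapping 32-bit word sum. */
static uint32_t exe_checksum(const uint32_t *image, size_t num_words) {
    uint32_t sum = 0;
    for (size_t i = 0; i < num_words; i++) {
        sum += image[i];
    }
    return (~sum) + 1u;
}

/* Build a header for an image of num_words 32-bit words. */
static exe_header_t exe_make_header(const uint32_t *image, size_t num_words) {
    exe_header_t h;
    h.signature = EXE_SIGNATURE;
    h.size      = (uint32_t)(num_words * sizeof(uint32_t));
    h.checksum  = exe_checksum(image, num_words);
    return h;
}

/* Return 1 if signature, size and checksum all match the image, else 0. */
static int exe_check(const exe_header_t *h, const uint32_t *image, size_t num_words) {
    uint32_t sum = 0;
    if (h->signature != EXE_SIGNATURE) {
        return 0;
    }
    if (h->size != (uint32_t)(num_words * sizeof(uint32_t))) {
        return 0;
    }
    for (size_t i = 0; i < num_words; i++) {
        sum += image[i];
    }
    return (uint32_t)(sum + h->checksum) == 0u;
}
```

Under this assumption a single flipped bit in the image changes the word sum, so `exe_check` rejects the image, which matches the "simple protection against data transmission or storage errors" described above.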
 
 
=== Start-Up Code (crt0)
:sectnums:
==== Start-Up Code (crt0)
 
The CPU (and also the processor) requires a minimal start-up and initialization code o bring the CPU (and the SoC) into a stable and initialized state before the
acutal application can be executed. This start-up code is located in `sw/common/crt0.S` and is automatically linked with _every_ application program.
The `crt0.S` is directly executed right after a reset and performs the following operations:
The CPU and also the processor require a minimal start-up and initialization code to bring the CPU (and the SoC)
into a stable and initialized state and to initialize the C runtime environment before the actual application can be executed.
This start-up code is located in `sw/common/crt0.S` and is automatically linked with _every_ application program
and placed right before the actual application code so it gets executed right after reset.
 
* Initialize integer registers `x1 - x31` (or `x1 - x15` when using the `E` CPU extension) to a defined value.
* Initialize all CPU core CSRs and also install a default "dummy" trap handler for _all_ traps.
* Initialize the global pointer `gp` and the stack pointer `sp` according to the `.data` segment layout provided by the linker script.
* Clear IO area: Write zero to all memory-mapped registers within the IO region (`iodev` section). If certain devices have not been implemented, a bus access fault exception will occur. This exception is captured by the dummy trap handler.
* Clear the `.bss` section defined by the linker script.
* Copy read-only data from the `.text` section to the `.data` section to set initialized variables.
* Call the application's `main` function (with no arguments: `argc` = `argv` = 0).
* If the `main` function returns, the processor goes to an endless sleep mode (using a simple loop or via the `wfi` instruction if available).
The `crt0.S` start-up performs the following operations:
 
[start=1]
. Initialize all integer registers `x1 - x31` (or just `x1 - x15` when using the `E` CPU extension) to a defined value.
. Initialize the global pointer `gp` and the stack pointer `sp` according to the `.data` segment layout provided by the linker script.
. Initialize all CPU core CSRs and also install a default "dummy" trap handler for _all_ traps. This handler catches all traps during the early boot phase.
. Clear IO area: Write zero to all memory-mapped registers within the IO region (`iodev` section). If certain devices have not been implemented, a bus access fault exception will occur. This exception is captured by the dummy trap handler.
. Clear the `.bss` section defined by the linker script.
. Copy read-only data from the `.text` section to the `.data` section to set initialized variables.
. Call the application's `main` function (with _no_ arguments: `argc` = `argv` = 0).
. If the `main` function returns, `crt0` can call an "after-main handler" (see below).
. If there is no after-main handler, or after returning from the after-main handler, the processor goes to an endless sleep mode (using a simple loop or via the `wfi` instruction if available).
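
The two memory-related steps of the list above (clearing `.bss` and copying the initial values into `.data`) can be sketched in C. The real start-up code is written in assembly (`sw/common/crt0.S`) and obtains the section boundaries from linker script symbols; the function names and pointer parameters below are hypothetical stand-ins for those symbols.

```c
#include <stdint.h>

/* Clear the .bss section: write zero to every word in [bss_start, bss_end). */
static void crt0_clear_bss(uint32_t *bss_start, uint32_t *bss_end) {
    for (uint32_t *p = bss_start; p < bss_end; p++) {
        *p = 0u;
    }
}

/* Copy the initial values of initialized variables from their read-only
 * load location (src) to the .data section [dst_start, dst_end). */
static void crt0_copy_data(const uint32_t *src, uint32_t *dst_start, uint32_t *dst_end) {
    while (dst_start < dst_end) {
        *dst_start++ = *src++;
    }
}
```

Both loops work on whole 32-bit words, which assumes that the section boundaries are word-aligned.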
 
:sectnums:
===== After-Main Handler
 
If the application's `main()` function actually returns, an _after main handler_ can be executed. This handler can be a normal function
since the C runtime is still available when executed. If this handler uses any kind of peripheral/IO modules make sure these are
already initialized within the application or you have to initialize them _inside_ the handler.
 
.After-main handler - function prototype
[source,c]
----
int __neorv32_crt0_after_main(int32_t return_code);
----
 
The function has exactly one argument (`return_code`) that provides the _return value_ of the application's main function.
For instance, this variable contains _-1_ if the main function returned with `return -1;`. The return value of the
`__neorv32_crt0_after_main` function is irrelevant as there is no further "software instance" executed afterwards that can check this.
However, the on-chip debugger could still evaluate the return value of the after-main handler.
 
A simple `printf` can be used to inform the user when the application main function returns
(this example assumes that UART0 has been already properly configured in the actual application):
 
.After-main handler - example
[source,c]
----
int __neorv32_crt0_after_main(int32_t return_code) {

  neorv32_uart_printf("Main returned with code: %i\n", return_code);
  return 0;
}
----
 
 
<<<
// ####################################################################################################################
:sectnums:
=== Bootloader
 
The default bootloader (sw/bootloader/bootloader.c) of the NEORV32 processor allows to upload
new program executables at every time. If there is an external SPI flash connected to the processor (like the
FPGA's configuration memory), the bootloader can store the program executable to it. After reset, the
bootloader can directly boot from the flash without any user interaction.
[NOTE]
This section illustrates the **default** bootloader from the repository. The bootloader can be customized
to target application-specific scenarios. See User Guide section
https://stnolting.github.io/neorv32/ug/#_customizing_the_internal_bootloader[Customizing the Internal Bootloader]
for more information.
 
[WARNING]
The bootloader is only implemented when the BOOTLOADER_EN generic is true and requires the
CSR access CPU extension (CPU_EXTENSION_RISCV_Zicsr generic is true).
The default NEORV32 bootloader (source code `sw/bootloader/bootloader.c`) provides a built-in firmware that
allows uploading new application executables via UART at any time and optionally storing them to, and booting
them from, an external SPI flash. It features a simple "automatic boot" feature that will try to fetch an executable
from SPI flash if there is _no_ UART user interaction. This allows building processor setups with
non-volatile application storage, which can be updated at any time.
 
[IMPORTANT]
The bootloader requires the primary UART (UART0) for user interaction (_IO_UART0_EN_ generic is _true_).
The bootloader is only implemented if the <<_int_bootloader_en>> generic is _true_. This will
select the <<_indirect_boot>> boot configuration.
 
.Hardware requirements of the _default_ NEORV32 bootloader
[IMPORTANT]
For the automatic boot from an SPI flash, the SPI controller has to be implemented (_IO_SPI_EN_
generic is _true_) and the machine system timer MTIME has to be implemented (_IO_MTIME_EN_
generic is _true_), too, to allow an auto-boot timeout counter.
**REQUIRED**: The bootloader requires the CSR access CPU extension (<<_cpu_extension_riscv_zicsr>> generic is _true_)
and at least 512 bytes of data memory (processor-internal DMEM or external DMEM). +
+
_RECOMMENDED_: For user interaction via UART (like uploading executables) the primary UART (UART0) has to be
implemented (<<_io_uart0_en>> generic is _true_). Without UART the bootloader does not make much sense. However, auto-boot
via SPI is still supported but the bootloader should be customized (see User Guide) for this purpose. +
+
_OPTIONAL_: The default bootloader uses bit 0 of the GPIO output port as "heart beat" and status LED if the
GPIO controller is implemented (<<_io_gpio_en>> generic is _true_). +
+
_OPTIONAL_: The MTIME machine timer (<<_io_mtime_en>> generic is _true_) and the SPI controller
(<<_io_spi_en>> generic is _true_) are required in order to use the bootloader's auto-boot feature
(automatic boot from external SPI flash if there is no user interaction via UART).
 
[WARNING]
The bootloader is intended to work independent of the actual hardware (-configuration). Hence, it
should be compiled with the minimal base ISA only. The current version of the bootloader uses the
`rv32i` ISA – so it will not work on `rv32e` architectures. To make the bootloader work on an embedded
CPU configuration or on any other more sophisticated configuration, recompile it using the according ISA
(see section <<_customizing_the_internal_bootloader>>).
 
To interact with the bootloader, connect the primary UART (UART0) signals (`uart0_txd_o` and
`uart0_rxd_i`) of the processor's top entity via a serial port (-adapter) to your computer (hardware flow control is
not used, so the corresponding interface signals can be ignored), configure your
347,7 → 436,6
BLDV: Mar 23 2021
HWV: 0x01050208
CLK: 0x05F5E100
USER: 0x10000DE0
MISA: 0x40901105
ZEXT: 0x00000023
PROC: 0x0EFF0037
364,7 → 452,6
|=======================
| `BLDV` | Bootloader version (built date).
| `HWV` | Processor hardware version (from the `mimpid` CSR) in BCD format (example: `0x01040606` = v1.4.6.6).
| `USER` | Custom user code (from the _USER_CODE_ generic).
| `CLK` | Processor clock speed in Hz (via the SYSINFO module, from the _CLOCK_FREQUENCY_ generic).
| `MISA` | CPU extensions (from the `misa` CSR).
| `ZEXT` | CPU sub-extensions (from the `mzext` CSR).
412,7 → 499,6
* `s`: Store executable to SPI flash at `spi_csn_o(0)`
* `l`: Load executable from SPI flash at `spi_csn_o(0)`
* `e`: Start the application, which is currently stored in the instruction memory (IMEM)
* `#`: Shortcut for executing u and e afterwards (not shown in help menu)
 
A new executable can be uploaded via UART by executing the `u` command. After that, the executable can be directly
executed via the `e` command. To store the recently uploaded executable to an attached SPI flash press `s`. To
423,58 → 509,18
The CPU is in machine level privilege mode after reset. When the bootloader boots an application,
this application is also started in machine level privilege mode.
 
:sectnums:
==== External SPI Flash for Booting
[TIP]
For detailed information on using an SPI flash for application storage see User Guide section
https://stnolting.github.io/neorv32/ug/#_programming_an_external_spi_flash_via_the_bootloader[Programming an External SPI Flash via the Bootloader].
 
If you want the NEORV32 bootloader to automatically fetch and execute an application at system start, you
can store it to an external SPI flash. The advantage of the external memory is to have a non-volatile program
storage, which can be re-programmed at any time just by executing some bootloader commands. Thus, no
FPGA bitstream recompilation is required at all.
 
**SPI Flash Requirements**
 
The bootloader can access an SPI compatible flash via the processor top entity's SPI port and connected to
chip select `spi_csn_o(0)`. The flash must be capable of operating at least at 1/8 of the processor's main
clock. Only single read and write byte operations are used. The address has to be 24 bit long. Furthermore,
the SPI flash has to support at least the following commands:
 
* READ (`0x03`)
* READ STATUS (`0x05`)
* WRITE ENABLE (`0x06`)
* PAGE PROGRAM (`0x02`)
* SECTOR ERASE (`0xD8`)
* READ ID (`0x9E`)
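
The command set above can be captured as C constants together with a helper that assembles the 4-byte frame of an addressed command. This is a sketch only: the identifiers are hypothetical (the bootloader source may use different names), and the frame layout (opcode first, then the 24-bit address with the most significant byte first) is the common convention of such SPI flash devices and is stated here as an assumption.

```c
#include <stdint.h>

/* Opcodes as listed in the text; the enumerator names are illustrative. */
enum spi_flash_cmd {
    SPI_FLASH_CMD_PAGE_PROGRAM = 0x02,
    SPI_FLASH_CMD_READ         = 0x03,
    SPI_FLASH_CMD_READ_STATUS  = 0x05,
    SPI_FLASH_CMD_WRITE_ENABLE = 0x06,
    SPI_FLASH_CMD_READ_ID      = 0x9E,
    SPI_FLASH_CMD_SECTOR_ERASE = 0xD8
};

/* Fill frame[0..3] with the opcode followed by the 24-bit address.
 * ASSUMPTION: address bytes are transmitted MSB first.
 * Address bits above bit 23 are ignored. */
static void spi_flash_addr_frame(uint8_t opcode, uint32_t addr, uint8_t frame[4]) {
    frame[0] = opcode;
    frame[1] = (uint8_t)(addr >> 16);
    frame[2] = (uint8_t)(addr >> 8);
    frame[3] = (uint8_t)addr;
}
```

For example, a READ at address `0x00800000` gives the byte sequence `0x03 0x80 0x00 0x00`, after which the flash shifts out the data bytes.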
 
Compatible (FPGA configuration) SPI flash memories are for example the "Winbond W25Q64FV" or the "Micron N25Q032A".
 
**SPI Flash Configuration**
 
The base address `SPI_FLASH_BOOT_ADR` for the executable image inside the SPI flash is defined in the
"user configuration" section of the bootloader source code (`sw/bootloader/bootloader.c`). Most
FPGAs that use an external configuration flash, store the golden configuration bitstream at base address 0.
Make sure there is no address collision between the FPGA bitstream and the application image. You need to
change the default sector size if your flash has a sector size greater or less than 64kB:
 
[source,c]
----
/** SPI flash boot image base address */
#define SPI_FLASH_BOOT_ADR 0x00800000
/** SPI flash sector size in bytes */
#define SPI_FLASH_SECTOR_SIZE (64*1024)
----
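
A quick way to reason about the two definitions is to compute how many sectors an image occupies and whether an FPGA bitstream stored at base address 0 stays below the boot image. The helpers below are hypothetical and only illustrate the arithmetic; they assume that the bitstream starts at address 0 and that the image area is handled in whole sectors.

```c
#include <stdint.h>

/* Values as shown in the "user configuration" cut-out. */
#define SPI_FLASH_BOOT_ADR    0x00800000u
#define SPI_FLASH_SECTOR_SIZE (64u * 1024u)

/* Number of flash sectors covered by an image of image_size bytes (rounded up). */
static uint32_t flash_sectors_needed(uint32_t image_size) {
    return (image_size + SPI_FLASH_SECTOR_SIZE - 1u) / SPI_FLASH_SECTOR_SIZE;
}

/* Return 1 if a bitstream occupying [0, bitstream_size) does not reach
 * the boot image base address, else 0. */
static int flash_no_collision(uint32_t bitstream_size) {
    return bitstream_size <= SPI_FLASH_BOOT_ADR;
}
```

With the default values, a 4MB bitstream leaves the image base at 8MB untouched, and any image of up to 64kB fits into a single sector.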
 
[IMPORTANT]
For any change you make inside the bootloader, you have to recompile the bootloader (see section
<<_customizing_the_internal_bootloader>>) and do a new synthesis of the processor.
 
 
:sectnums:
==== Auto Boot Sequence
When you reset the NEORV32 processor, the bootloader waits 8 seconds for a user console input before it
When you reset the NEORV32 processor, the bootloader waits 8 seconds for a UART console input before it
starts the automatic boot sequence. This sequence tries to fetch a valid boot image from the external SPI
flash, connected to SPI chip select `spi_csn_o(0)`. If a valid boot image is found and can be successfully
transferred into the instruction memory, it is automatically started. If no SPI flash was detected or if there
was no valid boot image found, the bootloader stalls and the status LED is permanently activated.
flash, connected to SPI chip select `spi_csn_o(0)`. If a valid boot image is found that can be successfully
transferred into the instruction memory, it is automatically started. If no SPI flash is detected or if there
is no valid boot image found, an error code will be shown.
 
 
:sectnums:
482,7 → 528,7
 
If something goes wrong during bootloader operation, an error code is shown. In this case the processor
stalls, a bell command and one of the following error codes are sent to the terminal, the bootloader status
LED is permanently activated and the system must be reset manually.
LED is permanently activated and the system must be manually reset.
 
[cols="<2,<13"]
[grid="rows"]
491,9 → 537,6
| **`ERROR_1`** | Your program is way too big for the processor's internal instruction memory. Increase the memory size or reduce (optimize!) your application code.
| **`ERROR_2`** | This indicates a checksum error. Something went wrong during the transfer of the program image (upload via UART or loading from the external SPI flash). If the error was caused by a UART upload, just try it again. When the error was generated during a flash access, the stored image might be corrupted.
| **`ERROR_3`** | This error occurs if the attached SPI flash cannot be accessed. Make sure you have the right type of flash and that it is properly connected to the NEORV32 SPI port using chip select #0.
| **`ERROR_4`** | The instruction memory is marked as read-only. Set the _MEM_INT_IMEM_ROM_ generic to _false_ to allow write accesses.
| **`ERROR_5`** | This error pops up when an unexpected exception or interrupt was triggered. The cause of the trap (`mcause` CSR) is displayed for further investigation. This might be caused if an ISA extension is used that has not been synthesized.
| **`ERROR_?`** | Something really bad happened when there is no specific error code available :(
|=======================
 
 
/figures/neorv32_boot_configurations.png Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream
figures/neorv32_boot_configurations.png Property changes : Added: svn:mime-type ## -0,0 +1 ## +application/octet-stream \ No newline at end of property Index: figures/neorv32_bus.png =================================================================== Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream Index: figures/neorv32_cpu.png =================================================================== Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream Index: figures/neorv32_memory_configurations.png =================================================================== Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream Index: figures/neorv32_memory_configurations.png =================================================================== --- figures/neorv32_memory_configurations.png (nonexistent) +++ figures/neorv32_memory_configurations.png (revision 61)
figures/neorv32_memory_configurations.png Property changes : Added: svn:mime-type ## -0,0 +1 ## +application/octet-stream \ No newline at end of property Index: figures/neorv32_ocd_complex.png =================================================================== Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream Index: figures/neorv32_processor.png =================================================================== Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream Index: figures/stream_link_interface.png =================================================================== Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream Index: figures/stream_link_interface.png =================================================================== --- figures/stream_link_interface.png (nonexistent) +++ figures/stream_link_interface.png (revision 61)
figures/stream_link_interface.png Property changes : Added: svn:mime-type ## -0,0 +1 ## +application/octet-stream \ No newline at end of property Index: references/riscv-spec.pdf =================================================================== Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream Index: userguide/content.adoc =================================================================== --- userguide/content.adoc (revision 60) +++ userguide/content.adoc (revision 61) @@ -4,14 +4,15 @@ step by step and in the presented order. :sectnums: -== Toolchain Setup +== Software Toolchain Setup -There are two possibilities to get the actual RISC-V GCC toolchain: +To compile (and debug) executables for the NEORV32 a RISC-V toolchain is required. +There are two possibilities to get this: 1. Download and _build_ the official RISC-V GNU toolchain yourself -2. Download and install a prebuilt version of the toolchain +2. Download and install a prebuilt version of the toolchain; this might also done via the package manager / app store of your OS -[NOTE] +[TIP] The default toolchain prefix for this project is **`riscv32-unknown-elf`**. Of course you can use any other RISC-V toolchain (like `riscv64-unknown-elf`) that is capable to emit code for a `rv32` architecture. Just change the _RISCV_TOOLCHAIN_ variable in the application makefile(s) according to your needs or define this variable when invoking the makefile. @@ -25,62 +26,42 @@ :sectnums: === Building the Toolchain from Scratch -To build the toolchain by yourself you can follow the guide from the official https://github.com/riscv/riscvgnu-toolchain GitHub page. +To build the toolchain by yourself you can follow the guide from the official https://github.com/riscv/riscv-gnu-toolchain GitHub page. +You need to make sure the generated toolchain fits the architecture of the NEORV32 core. 
To get a toolchain that even supports minimal +ISA extension configurations, it is recommend to compile for `rv32i` only. Please note that this minimal ISA also provides further ISA +extensions like `m` or `c`. Of course you can use a `multilib` approach to generate +toolchains for several target ISAs. -The official RISC-V repository uses submodules. You need the `--recursive` option to fetch the submodules -automatically: - +.Configuring GCC build for `rv32i` (minimal ISA) [source,bash] ---- -$ git clone --recursive https://github.com/riscv/riscv-gnu-toolchain ----- - -Download and install the prerequisite standard packages: - -[source,bash] ----- -$ sudo apt-get install autoconf automake autotools-dev curl python3 libmpc-dev libmpfrdev libgmp-dev gawk build-essential bison flex texinfo gperf libtool patchutils bc zlib1g-dev libexpat-dev ----- - -To build the Linux cross-compiler, pick an install path. If you choose, say, `/opt/riscv`, then add -`/opt/riscv/bin` to your `PATH` variable. - -[source,bash] ----- -$ export PATH=$PATH:/opt/riscv/bin ----- - -Then, simply run the following commands and configuration in the RISC-V GNU toolchain source folder to compile a -`rv32i` toolchain: - -[source,bash] ----- riscv-gnu-toolchain$ ./configure --prefix=/opt/riscv --with-arch=rv32i –-with-abi=ilp32 riscv-gnu-toolchain$ make ---- -After a while you will get `riscv32-unknown-elf-gcc` and all of its friends in your `/opt/riscv/bin` folder. - :sectnums: === Downloading and Installing a Prebuilt Toolchain Alternatively, you can download a prebuilt toolchain. -**Use The Toolchain I have Build** +:sectnums: +==== Use The Toolchain I have Build -I have compiled the toolchain on a 64-bit x86 Ubuntu (Ubuntu on Windows, actually) and uploaded it to +I have compiled a GCC toolchain on a 64-bit x86 Ubuntu (Ubuntu on Windows, actually) and uploaded it to GitHub. 
You can directly download the according toolchain archive as single _zip-file_ within a packed -release from github.com/stnolting/riscv-gcc-prebuilt. +release from https://github.com/stnolting/riscv-gcc-prebuilt. Unpack the downloaded toolchain archive and copy the content to a location in your file system (e.g. `/opt/riscv`). More information about downloading and installing my prebuilt toolchains can be found in the repository's README. -**Use a Third Party Toolchain** +:sectnums: +==== Use a Third Party Toolchain + Of course you can also use any other prebuilt version of the toolchain. There are a lot RISC-V GCC packages out there - -even for Windows. +even for Windows. On Linux system you might even be able to fetch a toolchain via your distribution's package manager. [IMPORTANT] Make sure the toolchain can (also) emit code for a `rv32i` architecture, uses the `ilp32` or `ilp32e` ABI and **was not build** using @@ -90,8 +71,8 @@ :sectnums: === Installation -Now you have the binaries. The last step is to add them to your `PATH` environment variable (if you have not -already done so). Make sure to add the binaries folder (`bin`) of your toolchain. +Now you have the toolchain binaries. The last step is to add them to your `PATH` environment variable (if you have not +already done so): make sure to add the _binaries_ folder (`bin`) of your toolchain. [source,bash] ---- @@ -112,7 +93,8 @@ neorv32/sw/example/blink_led$ make check ---- -This will test all the tools required for the NEORV32. Everything is working fine if "Toolchain check OK" appears at the end. +This will test all the tools required for the generating NEORV32 executables. +Everything is working fine if `Toolchain check OK` appears at the end. @@ -121,30 +103,35 @@ :sectnums: == General Hardware Setup -The following steps are required to generate a bitstream for your FPGA board. If you want to run the -NEORV32 processor in simulation only, the following steps might also apply. 
+This guide will setup a NEORV32 project for FPGA implementation (or simulation only) _from scratch_ [TIP] -Check out the example setups in the `boards` folder (@GitHub: https://github.com/stnolting/neorv32/tree/master/boards), which provides script-based -demo projects for various FPGA boars. +If you want to use a complete pre-defined setup to start with, check out the +project's `setups` folder (https://github.com/stnolting/neorv32/tree/master/setups), +which provides (script-based) demo setups for various FPGA boards and toolchains. -In this tutorial we will use a test implementation of the processor – using many of the processor's optional -modules but just propagating the minimal signals to the outer world. Hence, this guide is intended as -evaluation or "hello world" project to check out the NEORV32. A little note: The order of the following -steps might be a little different for your specific EDA tool. +This tutorial uses a _simplified_ test setup of the processor +to keeps things simple at the beginning as this setup is intended as +evaluation or "hello world" project to check out the NEORV32. -[start=0] +[start=1] . Create a new project with your FPGA EDA tool of choice. . Add all VHDL files from the project's `rtl/core` folder to your project. Make sure to _reference_ the files only – do not copy them. -. Make sure to add all the rtl files to a new library called **`neorv32`**. If your FPGA tools does not -provide a field to enter the library name, check out the "properties" menu of the rtl files. +. Make sure to add all the rtl files to a new library called `neorv32`. If your FPGA tools does not +provide a field to enter the library name, check out the "properties" menu of the added rtl files. . The `rtl/core/neorv32_top.vhd` VHDL file is the top entity of the NEORV32 processor. If you already have a design, instantiate this unit into your design and proceed. -. If you do not have a design yet and just want to check out the NEORV32 – no problem! 
In this guide -we will use a simplified top entity, that encapsulated the actual processor top entity: add the -`rtl/core/top_templates/neorv32_test_setup.vhd` VHDL file to your project too, and -select it as top entity. + +[IMPORTANT] +Make sure to include the `neorv32` package into your design when instantiating the processor: add +`library neorv32;` and `use neorv32.neorv32_package.all;` to your design unit. + +[start=5] +. If you do not have a design yet and just want to check out the NEORV32 – no problem! This guide +uses a simplified top entity, that encapsulates the actual processor top entity: add the +`rtl/templates/processor/neorv32_ProcessorTop_Test.vhd` VHDL file to your project, too, and +select it as _top entity_. . This test setup provides a minimal test hardware setup: .NEORV32 "hello world" test setup @@ -151,13 +138,13 @@ image::neorv32_test_setup.png[align=center] [start=7] -. This test setup only implements some very basic processor and CPU features. Also, only the -minimum number of signals is propagated to the outer world. Please note that the reset input signal -`rstn_i` is **low-active**. -. The configuration of the NEORV32 processor is done using the generics of the instantiated processor -top entity. Let's keep things simple at first and use the default configuration: +. It only implements some very basic processor and CPU features. Also, only the +minimum number of signals is propagated to the outer world. +. However, a minimal setup-specific configuration of the NEORV32 processor is required to make it run +on your FPGA board of choice. 
Only the absolutely required modifications will be made while +keeping the default configuration for the remaining configuration options: -.Cut-out of `neorv32_test_setup.vhd` showing the processor instance and its configuration +.Cut-out of `neorv32_ProcessorTop_Test.vhd` showing the processor instance and its configuration [source,vhdl] ---- neorv32_top_inst: neorv32_top @@ -164,33 +151,32 @@ generic map ( -- General -- CLOCK_FREQUENCY => 100000000, -- in Hz # <1> - BOOTLOADER_EN => true, - USER_CODE => x"00000000", + INT_BOOTLOADER_EN => true, ... -- Internal instruction memory -- MEM_INT_IMEM_EN => true, MEM_INT_IMEM_SIZE => 16*1024, # <2> - MEM_INT_IMEM_ROM => false, -- Internal data memory -- MEM_INT_DMEM_EN => true, MEM_INT_DMEM_SIZE => 8*1024, # <3> ... ---- -<1> Clock frequency of `clk_i` in Hertz -<2> Default size of internal instruction memory: 16kB (no need to change that _now_) -<3> Default size of internal data memory: 8kB (no need to change that _now_) +<1> Clock frequency of `clk_i` signal in Hertz +<2> Default size of internal instruction memory: 16kB +<3> Default size of internal data memory: 8kB [start=9] -. There is one generic that has to be set according to your FPGA / board: The clock frequency of the -top's clock input signal (`clk_i`). Use the _CLOCK_FREQUENC_Y generic to specify your clock source's +. There is one generic that has to be set according to your FPGA board setup: the actual clock frequency +of the top's clock input signal (`clk_i`). Use the _CLOCK_FREQUENC_Y generic to specify your clock source's frequency in Hertz (Hz) (note "1"). -. If you feel like it – or if your FPGA does not provide so many resources – you can modify the +. 
If you feel like it – or if your FPGA does not provide many resources – you can modify the **memory sizes** (_MEM_INT_IMEM_SIZE_ and _MEM_INT_DMEM_SIZE_ – marked with notes "2" and "3") or even -exclude certain ISa extensions and peripheral modules from implementation - but as mentioned above, let's keep things +exclude certain ISA extensions and peripheral modules from implementation - but as mentioned above, let's keep things simple at first and use the standard configuration for now. [NOTE] -Keep the internal instruction and data memory sizes in mind – these values are required for setting +If you have changed the default memory configuration (_MEM_INT_IMEM_SIZE_ and _MEM_INT_DMEM_SIZE_ generics) +keep those new sizes in mind – these values are required for setting up the software framework in the next section <<_general_software_framework_setup>>. [start=11] @@ -219,14 +205,12 @@ your FPGA board. Check whether it is low-active or high-active – the reset signal of the processor is **low-active**, so maybe you need to invert the input signal. . If possible, connected at least bit `0` of the GPIO output port `gpio_o` to a high-active LED (invert -the signal when your LEDs are low-active) - this LED will be used as status LED by the bootloader. -. Finally, connect the primary UART's (UART0) communication signals `uart0_txd_o` and -`uart0_rxd_i` to your serial host interface (USB-to-serial converter). +the signal when your LEDs are low-active). This LED will be used as status LED for the setup. +. Finally, if your FPGA board provides a serial host interface (USB-to-serial converter) interface, +connect the UART communication signals `uart0_txd_o` and `uart0_rxd_i`. . Perform the project HDL compilation (synthesis, mapping, bitstream generation). -. Download the generated bitstream into your FPGA ("program" it) and press the reset button (just to -make sure everything is sync). -. Done! 
If you have assigned the bootloader status LED , it should be -flashing now and you should receive the bootloader start prompt in your UART console (check the baudrate!). +. Program the generated bitstream into your FPGA and press the button connected to the reset signal. +. Done! The assigned status LED should be flashing now for some seconds before permanently lighting up. @@ -235,92 +219,95 @@ :sectnums: == General Software Framework Setup -While your synthesis tool is crunching the NEORV32 HDL files, it is time to configure the project's software -framework for your processor hardware setup. +To allow executables to be _actually executed_ on the NEORV32 Processor the configuration of the software framework +has to be aware of the hardware configuration. This guide focuses on the memory configuration. To enable +certain CPU ISA features refer to the <<_enabling_risc_v_cpu_extensions>> section. +[TIP] +If you have **not** changed the _default_ memory configuration in section <<_general_hardware_setup>> +you are already done and you can skip the rest of this guide. + [start=1] -. You need to tell the linker the actual size of the processor's instruction and data memories. This has to be always sync -to the *hardware memory configuration* (done in section <<_general_hardware_setup>>). . Open the NEORV32 linker script `sw/common/neorv32.ld` with a text editor. Right at the -beginning of the linker script you will find the **MEMORY** configuration showing two regions: `rom` and `ram` +beginning of this script you will find the `MEMORY` configuration listing the different memory sections: -.Cut-out of the linker script `neorv32.ld`: Memory configuration +.Cut-out of the linker script `neorv32.ld`: `ram` memory section configuration [source,c] ---- MEMORY { - rom (rx) : ORIGIN = DEFINED(make_bootloader) ? 0xFFFF0000 : 0x00000000, LENGTH = DEFINED(make_bootloader) ? 
4*1024 : 16*1024 # <1> - ram (rwx) : ORIGIN = 0x80000000, LENGTH = 8*1024 # <2> -} + ram (rwx) : ORIGIN = 0x80000000, LENGTH = DEFINED(make_bootloader) ? 512 : 8*1024 # <1> +... ---- -<1> Size of internal instruction memory (IMEM): 16kB -<2> Size of internal data memory (DMEM): 8kB +<1> Size of the data memory address space (right-most value) (internal/external DMEM); here 8kB -[WARNING] -The `rom` region provides conditional assignments (via the _make_bootloader_ symbol) for the _origin_ -and the _length_ configuration depending on whether the executable is built as normal application (for the IMEM) or -as bootloader code (for the BOOTROM). To modify the IMEM configuration of the `rom` region, -make sure to **only edit the most right values** for `ORIGIN` and `LENGTH` (marked with notes "1" and "2"). +[start=2] +. We only need to change the `ram` section, which represents the available data address space. +If you have changed the DMEM (_MEM_INT_DMEM_SIZE_ generic) size, adapt the `LENGTH` parameter of the `ram` +section (here: `8*1024`) so it is equal to your DMEM hardware configuration. +[IMPORTANT] +Make sure you only modify the _right-most_ value (here: 8*1024)! + +The "`512`" value is not relevant for the application. + [start=3] -. There are four parameters that are relevant here (only the right-most value for the `rom` section): The _origin_ -and the _length_ of the instruction memory (region name `rom`) and the _origin_ and the _length_ of the data -memory (region name `ram`). These four parameters have to be always sync to your hardware memory -configuration as described in section <<_general_hardware_setup>>. +. Done! Save your changes and close the linker script. +.Advanced: Section base address and size [IMPORTANT] -The `rom` _ORIGIN_ parameter has to be equal to the configuration of the NEORV32 ispace_base_c -(default: 0x00000000) VHDL package (`rtl/core/neorv32_package.vhd`) configuration constant. 
The `ram` _ORIGIN_ parameter has to -be equal to the configuration of the NEORV32 `dspace_base_c` (default: 0x80000000) VHDL -package (`rtl/core/neorv32_package.vhd`) configuration constant. +More information can be found in the datasheet section https://stnolting.github.io/neorv32/#_address_space[Address Space]. -[IMPORTANT] -The `rom` _LENGTH_ and the `ram` _LENGTH_ parameters have to match the configured memory sizes. For -instance, if the system does not have any external memories connected, the `rom` _LENGTH_ parameter -has to be equal to the processor-internal IMEM size (defined via top's _MEM_INT_IMEM_SIZE_ generic) -and the `ram` _LENGTH_ parameter has to be equal to the processor-internal DMEM size (defined via top's -_MEM_INT_DMEM_SIZE_ generic). - <<< // #################################################################################################################### :sectnums: == Application Program Compilation +This guide shows how to compile an example C-code application into a NEORV32 executable that +can be uploaded via the bootloader or the on-chip debugger. + +[IMPORTANT] +If your FPGA board does not provide such an interface - don't worry! +Section <<_installing_an_executable_directly_into_memory>> shows how to +run custom programs on your FPGA setup without having a UART. + [start=1] -. Open a terminal console and navigate to one of the project's example programs. For instance navigate to the -simple `sw/example_blink_led` example program. This program uses the NEORV32 GPIO unit to display +. Open a terminal console and navigate to one of the project's example programs. For instance, navigate to the +simple `sw/example_blink_led` example program. This program uses the NEORV32 GPIO module to display an 8-bit counter on the lowest eight bit of the `gpio_o` output port. . 
To compile the project and generate an executable simply execute: [source,bash] ---- -neorv32/sw/example/blink_led$ make exe +neorv32/sw/example/blink_led$ make clean_all exe ---- [start=3] +. We are using the `clean_all` target to make sure everything is rebuilt. . This will compile and link the application sources together with all the included libraries. At the end, -your application is transformed into an ELF file (`main.elf`). The *NEORV32 image generator* (in `sw/image_gen`) takes this file and creates a -final executable. The makefile will show the resulting memory utilization and the executable size: +your application is transformed into an ELF file (`main.elf`). The _NEORV32 image generator_ (in `sw/image_gen`) +takes this file and creates a final executable. The makefile will show the resulting memory utilization and +the executable size: [source,bash] ---- -neorv32/sw/example/blink_led$ make exe +neorv32/sw/example/blink_led$ make clean_all exe Memory utilization: - text data bss dec hex filename - 852 0 0 852 354 main.elf + text data bss dec hex filename + 3176 0 120 3296 ce0 main.elf +Compiling ../../../sw/image_gen/image_gen Executable (neorv32_exe.bin) size in bytes: -864 +3188 ---- -[start=4] -. That's it. The `exe` target has created the actual executable `neorv32_exe.bin` in the current -folder, which is ready to be uploaded to the processor via the bootloader's UART interface. +[start=5] +. That's it. The `exe` target has created the actual executable `neorv32_exe.bin` in the current folder +that is ready to be uploaded to the processor. [TIP] -The compilation process will also create a `main.asm` assembly listing file in the project directory, which -shows the actual assembly code of the complete application. +The compilation process will also create a `main.asm` assembly listing file in the current folder, which +shows the actual assembly code of the application. 
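The "Memory utilization" figures printed by the makefile can be compared against the memory sizes configured in hardware. The following is only a minimal sketch and not part of the NEORV32 software framework: the type and function names are hypothetical, and it assumes that `text` plus `data` has to fit into the IMEM while `data` plus `bss` has to fit into the DMEM (stack and heap are not counted).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Section sizes in bytes, as printed in the "Memory utilization" table. */
typedef struct {
    uint32_t text;
    uint32_t data;
    uint32_t bss;
} section_sizes_t;

/* Assumption: code plus the initial image of .data is stored in the IMEM. */
static bool fits_imem(const section_sizes_t *s, uint32_t imem_size) {
    assert(s != NULL);
    return (s->text + s->data) <= imem_size;
}

/* Assumption: .data and .bss occupy the DMEM; stack and heap need extra room. */
static bool fits_dmem(const section_sizes_t *s, uint32_t dmem_size) {
    assert(s != NULL);
    return (s->data + s->bss) <= dmem_size;
}
```

With the numbers from the output above (`text` = 3176, `data` = 0, `bss` = 120) both checks pass for the default 16kB IMEM and 8kB DMEM.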
@@ -329,43 +316,28 @@ :sectnums: == Uploading and Starting of a Binary Executable Image via UART -You have just created the executable. Now it is time to upload it to the processor. There are basically two -options to do so. +Follow this guide to use the bootloader to upload an executable via UART. -[TIP] -Executables can also be uploaded via the **on-chip debugger**. -See section <<_debugging_with_gdb>> for more information. +[NOTE] +This concept uses the default "Indirect Boot" scenario that uses the bootloader to upload new executables. +See datasheet section https://stnolting.github.io/neorv32/#_indirect_boot[Indirect Boot] for more information. -**Option 1** +[IMPORTANT] +If your FPGA board does not provide such an interface - don't worry! +Section <<_installing_an_executable_directly_into_memory>> shows how to +run custom programs on your FPGA setup without having a UART. -The NEORV32 makefiles provide an upload target that allows to directly upload an executable from the -command line. Reset the processor and execute: - -[source,bash] ----- -sw/example/blink_led$ make COM_PORT=/dev/ttyUSB1 upload ----- - -Replace `/dev/ttyUSB1` with the actual serial port you are using to communicate with the processor. You -might have to use `sudo make ...` if the targeted device requires elevated access rights. - - -**Option 2** - -The "better" option is to use a standard terminal program to upload an executable. This provides a more -comfortable way as you can directly interact with the bootloader console. Additionally, using a terminal program -also allows to directly communicate with the uploaded application. - [start=1] -. Connect the primary UART (UART0) interface of your FPGA board to a serial port of your -computer or use an USB-to-serial adapter. -. Start a terminal program. In this tutorial, I am using TeraTerm for Windows. You can download it from https://ttssh2.osdn.jp/index.html.en +. 
Connect the primary UART (UART0) interface of your FPGA board to a serial port of your host computer. +. Start a terminal program. In this tutorial, I am using TeraTerm for Windows. You can download it for free +from https://ttssh2.osdn.jp/index.html.en -[WARNING] -Make sure your terminal program can transfer the executable in raw byte mode without any protocol stuff around it. +[NOTE] +_Any_ terminal program that can connect to a serial port should work. However, make sure the program +can transfer data in _raw_ byte mode without any protocol overhead around it. [start=3] -. Open a connection to the corresponding srial port. Configure the terminal according to the +. Open a connection to the serial port your UART is connected to. Configure the terminal settings according to the following parameters: * 19200 Baud @@ -372,11 +344,11 @@ * 8 data bits * 1 stop bit * no parity bits -* no transmission/flow control protocol! (just raw byte mode) -* newline on `\r\n` (carriage return & newline) +* _no_ transmission/flow control protocol +* receiver (host computer) newline on `\r\n` (carriage return & newline) [start=4] -. Also make sure, that single chars are transmitted without any consecutive "new line" or "carriage +. Also make sure that single chars are sent from your computer _without_ any consecutive "new line" or "carriage return" commands (this is highly dependent on your terminal application of choice, TeraTerm only sends the raw chars by default). . Press the NEORV32 reset button to restart the bootloader. The status LED starts blinking and the @@ -391,7 +363,6 @@ BLDV: Mar 23 2021 HWV: 0x01050208 CLK: 0x05F5E100 -USER: 0x10000DE0 MISA: 0x40901105 ZEXT: 0x00000023 PROC: 0x0EFF0037 @@ -412,8 +383,7 @@ ---- [start=6] -. Execute the "Upload" command by typing `u`. Now the bootloader is waiting for a binary executable -to be send. +. Execute the "Upload" command by typing `u`. Now the bootloader is waiting for a binary executable to be sent. 
[source,bash] ---- @@ -422,9 +392,9 @@ ---- [start=7] -. Use the "send file" option of your terminal program to transmit the previously generated binary executable `neorv32_exe.bin`. -. Again, make sure to transmit the executable in raw binary mode (no transfer protocol, no additional -header stuff). When using TeraTerm, select the "binary" option in the send file dialog. +. Use the "send file" option of your terminal program to send a NEORV32 executable (`neorv32_exe.bin`). +. Again, make sure to transmit the executable in raw binary mode (no transfer protocol). +When using TeraTerm, select the "binary" option in the send file dialog. . If everything went fine, OK will appear in your terminal: [source,bash] @@ -434,7 +404,7 @@ ---- [start=10] -. The executable now resides in the instruction memory of the processor. To execute the program right +. The executable is now in the instruction memory of the processor. To execute the program right now run the "Execute" command by typing `e`: [source,bash] @@ -447,24 +417,108 @@ ---- [start=11] -. Now you should see the LEDs counting. +. If everything went fine, you should see the LEDs blinking. +[NOTE] +The bootloader will print error codes if something went wrong. +See section https://stnolting.github.io/neorv32/#_bootloader[Bootloader] of the NEORV32 datasheet for more information. +[TIP] +See section <<_programming_an_external_spi_flash_via_the_bootloader>> to learn how to use an external SPI +flash for nonvolatile program storage. +[TIP] +Executables can also be uploaded via the **on-chip debugger**. +See section <<_debugging_with_gdb>> for more information. + + + <<< // #################################################################################################################### :sectnums: -== Setup of a New Application Program Project +== Installing an Executable Directly Into Memory -Done with all the introduction tutorials and those example programs? Then it is time to start your own -application project! 
+If you do not want to use the bootloader (or the on-chip debugger) for executable upload or if your setup does not provide +a serial interface for that, you can also directly install an application into embedded memory. +This concept uses the "Direct Boot" scenario that implements the processor-internal IMEM as ROM, which is +pre-initialized with the application's executable during synthesis. Hence, it provides _non-volatile_ storage of the +executable inside the processor. This storage cannot be altered during runtime and any source code modification of +the application requires re-programming the FPGA via the bitstream. + +[TIP] +See datasheet section https://stnolting.github.io/neorv32/#_direct_boot[Direct Boot] for more information. + + + +Using the IMEM as ROM: + +* for this boot concept the bootloader is no longer required +* this concept only works for the internal IMEM (but can be extended to work with external memories coupled via the processor's bus interface) +* make sure that the memory components (like block RAM) the IMEM is mapped to support an initialization via the bitstream + [start=1] -. The easiest way of creating a *new* project is to make a copy of an *existing* project (like the -`blink_led` project) inside the `sw/example` folder. By this, all file dependencies are kept and you can -start coding and compiling. -. If you want to place the project folder somewhere else you need to adapt the project's makefile. In -the makefile you will find a variable that keeps the relative or absolute path to the NEORV32 home +. At first, make sure your processor setup actually implements the internal IMEM: the `MEM_INT_IMEM_EN` generic has to be set to `true`: + +.Processor top entity configuration - enable internal IMEM +[source,vhdl] +---- + -- Internal Instruction memory -- + MEM_INT_IMEM_EN => true, -- implement processor-internal instruction memory +---- + +[start=2] +. For this setup we do not want the bootloader to be implemented at all. 
Disable implementation of the bootloader by setting the +`INT_BOOTLOADER_EN` generic to `false`. This will also modify the processor-internal IMEM so it is initialized with the executable during synthesis. + +.Processor top entity configuration - disable internal bootloader +[source,vhdl] +---- + -- General -- + INT_BOOTLOADER_EN => false, -- boot configuration: false = boot from int/ext (I)MEM +---- + +[start=3] +. To generate an "initialization image" for the IMEM that contains the actual application, run the `install` target when compiling your application: + +[source,bash] +---- +neorv32/sw/example/blink_led$ make clean_all install +Memory utilization: + text data bss dec hex filename + 3176 0 120 3296 ce0 main.elf +Compiling ../../../sw/image_gen/image_gen +Installing application image to ../../../rtl/core/neorv32_application_image.vhd +---- + +[start=4] +. The `install` target has compiled all the application sources but instead of creating an executable (`neorv32_exe.bin`) that can be uploaded via the +bootloader, it has created a VHDL memory initialization image `core/neorv32_application_image.vhd`. +. This VHDL file is automatically copied to the core's rtl folder (`rtl/core`) so it will be included for the next synthesis. +. Perform a new synthesis. The IMEM will be built as pre-initialized ROM (inferring embedded memories if possible). +. Upload your bitstream. Your application code now resides unchangeable in the processor's IMEM and is directly executed after reset. + + +The synthesis tool / simulator will print asserts to inform about the (IMEM) memory / boot configuration: + +[source] +---- +NEORV32 PROCESSOR CONFIG NOTE: Boot configuration: Direct boot from memory (processor-internal IMEM). +NEORV32 PROCESSOR CONFIG NOTE: Implementing processor-internal IMEM as ROM (3176 bytes), pre-initialized with application. 
+---- + + + +<<< +// #################################################################################################################### +:sectnums: +== Setup of a New Application Program Project + +[start=1] +. The easiest way of creating a _new_ software application project is to copy an _existing_ one. This will keep all +file dependencies. For example you can copy `sw/example/blink_led` to `sw/example/flux_capacitor`. +. If you want to place your application somewhere outside `sw/example` you need to adapt the application's makefile. +In the makefile you will find a variable that keeps the relative or absolute path to the NEORV32 repo home folder. Just modify this variable according to your new project's home location: [source,makefile] @@ -474,7 +528,8 @@ ---- [start=3] -. If your project contains additional source files outside of the project folder, you can add them to the _APP_SRC_ variable: +. If your project contains additional source files outside of the project folder, you can add them to +the `APP_SRC` variable: [source,makefile] ---- @@ -483,7 +538,8 @@ ---- [start=4] -. You also need to add the folder containing the include files of your new project to the _APP_INC variable_ (do not forget the `-I` prefix): +. You also can add a folder containing your application's include files to the +`APP_INC` variable (do not forget the `-I` prefix): [source,makefile] ---- @@ -491,32 +547,19 @@ APP_INC = -I . -I ../somewhere/include_stuff_folder ---- -[start=5] -. If you feel like it, you can change the default optimization level: -[source,makefile] ----- -# Compiler effort -EFFORT = -Os ----- -[TIP] -All the assignments made to the makefile variable can also be done "inline" when invoking the makefile. 
For example: `$make EFFORT=-Os clean_all exe` - - - - <<< // #################################################################################################################### :sectnums: == Enabling RISC-V CPU Extensions -Whenever you enable/disable a RISC-V CPU extensions via the according _CPU_EXTENSION_RISCV_x_ generic, you need to +Whenever you enable/disable a RISC-V CPU extension via the corresponding `CPU_EXTENSION_RISCV_x` generic, you need to adapt the toolchain configuration so the compiler can actually generate according code for it. To do so, open the makefile of your project (for example `sw/example/blink_led/makefile`) and scroll to the -"USER CONFIGURATION" section right at the beginning of the file. You need to modify the _MARCH_ variable and eventually -the _MABI_ variable according to your CPU hardware configuration. +"USER CONFIGURATION" section right at the beginning of the file. You need to modify the `MARCH` variable and possibly +the `MABI` variable according to your CPU hardware configuration. [source,makefile] ---- @@ -527,11 +570,17 @@ <1> MARCH = Machine architecture ("ISA string") <2> MABI = Machine binary interface -For example when you enable the RISC-V `C` extension (16-bit compressed instructions) via the _CPU_EXTENSION_RISCV_C_ generic (set _true_) you need -to add the 'c' extension also to the _MARCH_ ISA string. +For example, if you enable the RISC-V `C` extension (16-bit compressed instructions) via the `CPU_EXTENSION_RISCV_C` +generic (set `true`) you need to add the 'c' extension also to the `MARCH` ISA string in order to make the compiler +emit compressed instructions. -You can also override the default _MARCH_ and _MABI_ configurations from the makefile when invoking the makefile: +[WARNING] +ISA extensions enabled in hardware can be a superset of the extensions enabled in software, but not the other way +around. 
For example, generating compressed instructions for a CPU configuration that has the `c` extension disabled +will cause _illegal instruction exceptions_ at runtime. +You can also override the default `MARCH` and `MABI` configurations from the makefile when invoking the makefile: + [source,bash] ---- $ make MARCH=-march=rv32ic clean_all all @@ -538,191 +587,154 @@ ---- [NOTE] -The RISC-V ISA string (for _MARCH_) follows a certain canonical structure: +The RISC-V ISA string (for _MARCH_) follows a certain _canonical_ structure: `rv32[i/e][m][a][f][d][g][q][c][b][v][n]...` For example `rv32imac` is valid while `rv32icma` is not valid. - <<< // #################################################################################################################### :sectnums: -== Building a Non-Volatile Application without External Boot Memory +== Customizing the Internal Bootloader -The primary purpose of the bootloader is to allow an easy and fast update of the current application. In particular, this is very handy -during the development stage of a project as you can upload modified programs at any time via the UART. -Maybe at some time your project has become mature and you want to actually _embed_ your processor -including the application. +The NEORV32 bootloader provides several options to configure and customize it for a certain application setup. +This configuration is done by passing _defines_ when compiling the bootloader. Of course you can also +modify the bootloader source code to provide a setup that perfectly fits your needs. -There are two options to provide _non-volatile_ storage of your application. The simplest (but also most constrained) one is to implement the IMEM -as true ROM to contain your program. The second option is to use an external boot memory - this concept is shown in a different section: -<<_programming_an_external_spi_flash_via_the_bootloader>>. 
+[IMPORTANT] +Each time the bootloader sources are modified, the bootloader has to be re-compiled (and re-installed to the +bootloader ROM) and the processor has to be re-synthesized. -Using the IMEM as ROM: +[NOTE] +Keep in mind that the maximum size of the bootloader is limited to 32kB and that it should be compiled using the +base ISA `rv32i` only to ensure it can work independently of the actual CPU configuration. -* for this boot concept the bootloader is no longer required -* this concept only works for the internal IMEM (but can be extended to work with external memories coupled via the processor's bus interface) -* make sure that the memory components (like block RAM) the IMEM is mapped to support an initialization via the bitstream +.Bootloader configuration parameters +[cols="<2,^1,^2,<6"] +[options="header", grid="rows"] +|======================= +| Parameter | Default | Legal values | Description +4+^| Serial console interface +| `UART_EN` | `1` | `0`, `1` | Set to `0` to disable UART0 (no serial console at all) +| `UART_BAUD` | `19200` | _any_ | Baud rate of UART0 +4+^| Status LED +| `STATUS_LED_EN` | `1` | `0`, `1` | Enable bootloader status led ("heart beat") at `GPIO` output port pin #`STATUS_LED_PIN` when `1` +| `STATUS_LED_PIN` | `0` | `0` ... `31` | `GPIO` output pin used for the high-active status LED +4+^| Boot configuration +| `AUTO_BOOT_SPI_EN` | `0` | `0`, `1` | Set `1` to enable immediate boot from external SPI flash +| `AUTO_BOOT_OCD_EN` | `0` | `0`, `1` | Set `1` to enable boot via on-chip debugger (OCD) +| `AUTO_BOOT_TIMEOUT` | `8` | _any_ | Time in seconds after which the auto-boot sequence starts (if there is no UART input by user); set to 0 to disable the auto-boot sequence +4+^| SPI configuration +| `SPI_FLASH_CS` | `0` | `0` ... 
`7` | SPI chip select output (`spi_csn_o`) for selecting flash +| `SPI_FLASH_SECTOR_SIZE` | `65536` | _any_ | SPI flash sector size in bytes +| `SPI_FLASH_CLK_PRSC` | `CLK_PRSC_8` | `CLK_PRSC_2` `CLK_PRSC_4` `CLK_PRSC_8` `CLK_PRSC_64` `CLK_PRSC_128` `CLK_PRSC_1024` `CLK_PRSC_2048` `CLK_PRSC_4096` | SPI clock pre-scaler (dividing main processor clock) +| `SPI_BOOT_BASE_ADDR` | `0x08000000` | _any_ 32-bit value | Defines the _base_ address of the executable in external flash +|======================= -[start=1] -. At first, compile your application code by running the `make install` command: +Each configuration parameter is implemented as C-language `define` that can be manually overridden (_redefined_) when +invoking the bootloader's makefile. The corresponding parameter and its new value have to be _appended_ +(using `+=`) to the makefile's `USER_FLAGS` variable. Make sure to use the `-D` prefix here. -[source,bash] ----- -neorv32/sw/example/blink_led$ make compile -Memory utilization: - text data bss dec hex filename - 852 0 0 852 354 main.elf -Executable (neorv32_exe.bin) size in bytes: -864 -Installing application image to ../../../rtl/core/neorv32_application_image.vhd ----- +For example, to configure a UART Baud rate of 57600 and redirect the status LED to output pin 20 +use the following command (_in_ the bootloader's source folder `sw/bootloader`): -[start=2] -. The `install` target has created an executable, too, but this time also in the form of a VHDL memory -initialization file. during synthesis, this initialization will become part of the final FPGA bitstream, which -in terms initializes the IMEM's memory primitives. -. 
To allow a direct boot of this image without interference of the bootloader you _can_ deactivate the implementation of -the bootloader via the according top entity's generic: - -[source,vhdl] +.Example: customizing, re-compiling and re-installing the bootloader +[source,console] ---- -BOOTLOADER_EN => false, -- implement processor-internal bootloader? # <1> +$ make USER_FLAGS+=-DUART_BAUD=57600 USER_FLAGS+=-DSTATUS_LED_PIN=20 clean_all bootloader ---- -<1> Set to _false_ to make the CPU directly boot from the IMEM. In this case the BOOTROM is discarded from the design. -[start=4] -. When the bootloader is deactivated, the according module (BOOTROM) is removed from the design and the CPU will start booting -at the base address of the instruction memory space (IMEM base address) making the CPU directly executing your -application after reset. -. The IMEM could be still modified, since it is implemented as RAM by default, which might corrupt your -executable. To prevent this and to implement the IMEM as true ROM (and eventually saving some -more hardware resources), active the "IMEM as ROM" feature using the processor's according top entity -generic: +[NOTE] +The `clean_all` target ensures that all libraries are re-compiled. The `bootloader` target will automatically +compile and install the bootloader to the HDL boot ROM (updating `rtl/core/neorv32_bootloader_image.vhd`). -[source,vhdl] ----- -MEM_INT_IMEM_ROM => true, -- implement processor-internal instruction memory as ROM ----- +:sectnums: +=== Bootloader Boot Configuration -[start=6] -. Perform a new synthesis and upload your bitstream. Your application code now resides unchangeable -in the processor's IMEM and is directly executed after reset. +The bootloader provides several _boot configurations_ that define where the actual application's executable +shall be fetched from. Note that the non-default boot configurations provide a smaller memory footprint +reducing boot ROM implementation costs. 
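One common way to get such compile-time boot configurations is conditional compilation: each parameter gets a default that a `-D` flag can override, and only the selected branch ends up in the binary. The following is a minimal sketch of that pattern and is not taken from the bootloader sources. The helper `boot_config_name` is hypothetical, and giving `AUTO_BOOT_OCD_EN` priority over `AUTO_BOOT_SPI_EN` when both are set is an assumption.

```c
/* Defaults, used only if no -D<NAME>=<value> flag was passed to the compiler. */
#ifndef AUTO_BOOT_SPI_EN
#define AUTO_BOOT_SPI_EN 0
#endif
#ifndef AUTO_BOOT_OCD_EN
#define AUTO_BOOT_OCD_EN 0
#endif

/* Hypothetical helper: name of the boot configuration selected at compile time. */
static const char *boot_config_name(void) {
#if AUTO_BOOT_OCD_EN
    return "ocd";      /* halt loop only, waits for the on-chip debugger */
#elif AUTO_BOOT_SPI_EN
    return "spi";      /* fetch the executable from SPI flash right after reset */
#else
    return "default";  /* UART console with optional auto-boot */
#endif
}
```

Compiled without any `-D` flags this selects the default branch. Because the preprocessor discards the unselected branches, code that is only needed by another configuration does not contribute to the image size.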
+:sectnums!: +==== Default Boot Configuration +The _default_ bootloader configuration provides a UART-based user interface that allows uploading new executables +at any time. Optionally, the executable can also be programmed to an external SPI flash by the bootloader (see +section <<_programming_an_external_spi_flash_via_the_bootloader>>). +This configuration also provides an _automatic boot sequence_ (auto-boot) which will start fetching an executable +from external SPI flash using the default SPI configuration. By this, the default bootloader configuration +provides a "non volatile program storage" mechanism that automatically boots from external SPI flash +(after `AUTO_BOOT_TIMEOUT`) while still providing the option to re-program SPI flash at any time +via the UART interface. -<<< -// #################################################################################################################### -:sectnums: -== Customizing the Internal Bootloader +:sectnums!: +==== `AUTO_BOOT_SPI_EN` -The bootloader provides several configuration options to customize it for your specific applications. The -most important user-defined configuration options are available as C `#defines` right at the beginning of the -bootloader source code `sw/bootloader/bootloader.c`): +The automatic boot from SPI flash (enabled when `AUTO_BOOT_SPI_EN` is `1`) will fetch an executable from an external +SPI flash (using the according _SPI configuration_) right after reset. The bootloader will start fetching +the image at SPI flash base address `SPI_BOOT_BASE_ADDR`. 
-.Cut-out from the bootloader source code `bootloader.c`: configuration parameters -[source,c] ----- -/** UART BAUD rate */ -#define BAUD_RATE (19200) -/** Enable auto-boot sequence if != 0 */ -#define AUTOBOOT_EN (1) -/** Time until the auto-boot sequence starts (in seconds) */ -#define AUTOBOOT_TIMEOUT 8 -/** Set to 0 to disable bootloader status LED */ -#define STATUS_LED_EN (1) -/** SPI_DIRECT_BOOT_EN: Define/uncomment to enable SPI direct boot */ -//#define SPI_DIRECT_BOOT_EN -/** Bootloader status LED at GPIO output port */ -#define STATUS_LED (0) -/** SPI flash boot image base address (warning! address might wrap-around!) */ -#define SPI_FLASH_BOOT_ADR (0x00800000) -/** SPI flash chip select line at spi_csn_o */ -#define SPI_FLASH_CS (0) -/** Default SPI flash clock prescaler */ -#define SPI_FLASH_CLK_PRSC (CLK_PRSC_8) -/** SPI flash sector size in bytes (default = 64kb) */ -#define SPI_FLASH_SECTOR_SIZE (64*1024) -/** ASCII char to start fast executable upload process */ -#define FAST_UPLOAD_CMD '#' ----- +Note that there is _no_ UART console to interact with the bootloader. However, this boot configuration will +output minimal status messages via UART (if `UART_EN` is `1`). -**Changing the Default Size of the Bootloader ROM** +:sectnums!: +==== `AUTO_BOOT_OCD_EN` -The NEORV32 default bootloader uses 4kB of storage. This is also the default size of the BOOTROM memory component. -If your new/modified bootloader exceeds this size, you need to modify the boot ROM configurations. +If `AUTO_BOOT_OCD_EN` is `1` the bootloader is implemented as minimal "halt loop" to be used with the on-chip debugger. +After initializing the hardware, the CPU waits in this endless loop until the on-chip debugger takes control over +the core (to upload and run the actual executable). See section <<_debugging_using_the_on_chip_debugger>> +for more information on how to use the on-chip debugger to upload and run executables. -[start=1] -. 
Open the processor's main package file `rtl/core/neorv32_package.vhd` and edit the -`boot_size_c` constant according to your requirements. The boot ROM size must not exceed 32kB -and should be a power of two (for optimal hardware mapping). +[NOTE] +All bootloader boot configurations support uploading new executables via the on-chip debugger. -[source,vhdl] ----- --- Bootloader ROM -- -constant boot_size_c : natural := 4*1024; -- bytes ----- +[WARNING] +Note that this boot configuration does not load any executable at all! Hence, +this boot configuration is intended to be used with the on-chip debugger only. -[start=2] -. Now open the NEORV32 linker script `sw/common/neorv32.ld` and adapt the _LENGTH_ parameter -of the `rom` according to your new memory size. `boot_size_c` and the `rom` _LENGTH_ attribute have to be always -identical. Do **not modify** the _ORIGIN_ of the `rom` section. -[source,c] ----- -MEMORY -{ - rom (rx) : ORIGIN = DEFINED(make_bootloader) ? 0xFFFF0000 : 0x00000000, LENGTH = DEFINED(make_bootloader) ? 4*1024 : 16*1024 # <1> - ram (rwx) : ORIGIN = 0x80000000, LENGTH = 8*1024 -} ----- -<1> Bootloader ROM default size = 4*1024 bytes (**left** value) -[IMPORTANT] -The `rom` region provides conditional assignments (via symbol `make_bootloader`) for the origin -and the length depending on whether the executable is built as normal application (for the IMEM) or -as bootloader code (for the BOOTROM). To modify the BOOTLOADER memory size, make -sure to edit the first value for the origin (note "1"). +<<< +// #################################################################################################################### +:sectnums: +== Programming an External SPI Flash via the Bootloader -**Re-Compiling and Re-Installing the Bootloader** +The default processor-internal NEORV32 bootloader supports automatic booting from an external SPI flash. 
+This guide shows how to write an executable to the SPI flash via the bootloader so it can be automatically +fetched and executed after processor reset. For example, you can use a section of the FPGA bitstream configuration +memory to store an application executable. -Whenever you have modified the bootloader you need to recompile and re-install it and re-synthesize your design. +[NOTE] +This section assumes the _default_ configuration of the NEORV32 bootloader. +See section <<_customizing_the_internal_bootloader>> on how to customize the bootloader and its setting +(for example the SPI chip-select port, the SPI clock speed or the flash base address for storing the executable). -[start=1] -. Compile and install the bootloader using the explicit `bootloader` makefile target. -[source,bash] ----- -neorv32/sw/bootloader$ make bootloader ----- +:sectnums: +=== SPI Flash -[start=1] -. Now perform a new synthesis / HDL compilation to update the bitstream with the new bootloader -image (some synthesis tools also allow to only update the BRAM initialization without re-running -the entire synthesis process). +The bootloader can access an SPI compatible flash via the processor top entity's SPI port. By default, the flash +chip-select line is to `spi_csn_o(0)` and uses 1/8 of the processor's main clock as clock frequency. +The SPI flash has to support single-byte read and write, 24-bit addresses and at least the following standard commands: -[NOTE] -The bootloader is intended to work regardless of the actual NEORV32 hardware configuration – -especially when it comes to CPU extensions. Hence, the bootloader should be build using the -minimal `rv32i` ISA only (`rv32e` would be even better). +* READ `0x03` +* READ STATUS `0x05` +* WRITE ENABLE `0x06` +* PAGE PROGRAM `0x02` +* SECTOR ERASE `0xD8` +* READ ID `0x9E` +Compatible (FGPA configuration) SPI flash memories are for example the "Winbond W25Q64FV2 or the "Micron N25Q032A". 
- -<<< -// #################################################################################################################### :sectnums: -== Programming an External SPI Flash via the Bootloader +=== Programming an Executable -As described in section https://stnolting.github.io/neorv32/#_external_spi_flash_for_booting[Documentation: External SPI Flash for Booting] -the bootloader provides an option to store an application image to an external SPI flash -and to read this image back for booting. These steps show how to store a - [start=1] . At first, reset the NEORV32 processor and wait until the bootloader start screen appears in your terminal program. . Abort the auto boot sequence and start the user console by pressing any key. -. Press u to upload the program image, that you want to store to the external flash: +. Press u to upload the executable that you want to store to the external flash: [source] ---- @@ -731,7 +743,7 @@ ---- [start=4] -. Send the binary in raw binary via your terminal program. When the uploaded is completed and "OK" +. Send the binary in raw binary via your terminal program. When the upload is completed and "OK" appears, press `p` to trigger the programming of the flash (do not execute the image via the `e` command as this might corrupt the image): @@ -746,14 +758,18 @@ [start=5] . The bootloader shows the size of the executable and the base address inside the SPI flash where the executable is going to be stored. A prompt appears: Type `y` to start the programming or type `n` to -abort. See section <<_external_spi_flash_for_booting> for more information on how to configure the base address. +abort. +[TIP] +Section <<_customizing_the_internal_bootloader>> show the according C-language `define` that can be modified +to specify the base address of the executable inside the SPI flash. + [source] ---- CMD:> u Awaiting neorv32_exe.bin... OK CMD:> p -Write 0x000013FC bytes to SPI flash @ 0x00800000? 
(y/n) y +Write 0x000013FC bytes to SPI flash @ 0x08000000? (y/n) y Flashing... OK CMD:> ---- @@ -768,11 +784,31 @@ <<< // #################################################################################################################### :sectnums: +== Packaging the Processor as IP block for Xilinx Vivado Block Designer + +.WORK IN PROGRESS +[WARNING] +This Section Is Under Construction! + + + +FIXME! + + + +<<< +// #################################################################################################################### +:sectnums: == Simulating the Processor -**Testbench** +.WORK IN PROGRESS +[WARNING] +This Section Is Under Construction! + + + +FIXME! -The NEORV32 project features a simple default testbench (`sim/neorv32_tb.vhd`) that can be used to simulate +:sectnums: +=== Testbench + +The NEORV32 project features a simple default testbench (`sim/neorv32_tb.simple.vhd`) that can be used to simulate and test the processor setup. This testbench features a 100MHz clock and enables all optional peripheral and CPU extensions except for the `E` extension and the TRNG IO module (that CANNOT be simulated due to its combinatorial (looped) oscillator architecture). @@ -824,8 +860,10 @@ as part of the testbench. Received chars are send to the simulator console and are also stored to a log file (`neorv32.testbench_uart0.out` for UART0, `neorv32.testbench_uart1.out` for UART1) inside the simulator home folder. -**Faster Simulation Console Output** +:sectnums: +=== Faster Simulation Console Output + When printing data via the UART the communication speed will always be based on the configured BAUD rate. For a simulation this might take some time. To have faster output you can enable the **simulation mode** or UART0/UART1 (see section https://stnolting.github.io/neorv32/#_primary_universal_asynchronous_receiver_and_transmitter_uart0[Documentation: Primary Universal Asynchronous Receiver and Transmitter (UART0)]). 
@@ -856,18 +894,17 @@ The UART simulation output (to file and to screen) outputs "complete lines" at once. A line is completed with a line feed (newline, ASCII `\n` = 10). -**Simulation with Xilinx Vivado** -The project features default a Vivado simulation waveform configuration in `sim/vivado`. +:sectnums: +=== Simulation using GHDL -**Simulation with GHDL** +To simulate the processor using _GHDL_ navigate to the `sim` folder and run the provided shell script. +Any arguments that are provided while executing this script are passed to GHDL. +For example the simulation time can be set to 20ms using `--stop-time=20ms` as argument. -To simulate the processor using _GHDL_ navigate to the `sim` folder and run the provided shell script. All arguments are passed to GHDL. -For example the simulation time can be configured using `--stop-time=4ms` as argument. - [source, bash] ---- -neorv32/sim$ sh ghdl_sim.sh --stop-time=4ms +neorv32/sim$ sh ghdl_sim.sh --stop-time=20ms ---- @@ -877,14 +914,14 @@ :sectnums: == Building the Documentation -The documentation is written using `asciidoc`. The according source files can be found in `docs/...`. -The documentation of the software framework is written _in-code_ using `doxygen`. +The documentation (datasheet + user guide) is written using `asciidoc`. The according source files +can be found in `docs/...`. The documentation of the software framework is written _in-code_ using `doxygen`. -A makefiles in the project's root directory is provided to either build all of the documentation as HTML pages +A makefiles in the project's root directory is provided to build all of the documentation as HTML pages or as PDF documents. [TIP] -Pre-rendered PDFs are available online as nightly pre-releases: https://github.com/stnolting/neorv32/releases. +Pre-rendered PDFs are available online as _nightly pre-releases_: https://github.com/stnolting/neorv32/releases. 
The HTML-based documentation is also available online at the project's https://stnolting.github.io/neorv32/[GitHub Pages]. The makefile provides a help target to show all available build options and their according outputs. @@ -906,12 +943,6 @@ -// #################################################################################################################### -:sectnums: -== Building the Project Documentation - - - <<< // #################################################################################################################### :sectnums: @@ -918,12 +949,11 @@ == FreeRTOS Support A NEORV32-specific port and a simple demo for FreeRTOS (https://github.com/FreeRTOS/FreeRTOS) are -available in the `sw/example/demo_freeRTOS` folder. +available in the `sw/example/demo_freeRTOS` folder. See the according documentation (`sw/example/demo_freeRTOS/README.md`) +for more information. -See the according documentation (`sw/example/demo_freeRTOS/README.md`) for more information. - // #################################################################################################################### :sectnums: == RISC-V Architecture Test Framework @@ -933,7 +963,7 @@ All files required for executing the test framework on a simulated instance of the processor (including port files) are located in the `riscv-arch-test` folder in the root directory of the NEORV32 repository. Take a -look at the provided `riscv-arch-test/README.md` (https://github.com/stnolting/neorv32/blob/master/riscv-arch-test/README.md[online at GitHunb]) +look at the provided `riscv-arch-test/README.md` (https://github.com/stnolting/neorv32/blob/master/riscv-arch-test/README.md[online at GitHub]) file for more information on how to run the tests and how testing is conducted in detail. 
@@ -943,15 +973,18 @@ :sectnums: == Debugging using the On-Chip Debugger -The NEORV32 https://stnolting.github.io/neorv32/#_on_chip_debugger_ocd[Documentation: On-Chip Debugger] -allows _online_ in-system debugging via an external JTAG access port from a +The NEORV32 on-chip debugger allows _online_ in-system debugging via an external JTAG access port from a host machine. The general flow is independent of the host machine's operating system. However, this tutorial uses Windows and Linux (Ubuntu on Windows) in parallel. +[TIP] +See datasheet section https://stnolting.github.io/neorv32/#_on_chip_debugger_ocd[On Chip Debugger (OCD)] +for more information. + [NOTE] This tutorial uses `gdb` to **directly upload an executable** to the processor. If you are using the default processor setup _with_ internal instruction memory (IMEM) make sure it is implemented as RAM -(_MEM_INT_IMEM_ROM_ generic = false). +(_INT_BOOTLOADER_EN_ generic = true). :sectnums: @@ -1040,7 +1073,7 @@ Furthermore, an assembly listing file `main.asm` is generated that we will use to define breakpoints. Open another terminal in `sw/example/blink_led` and start `gdb`. -The GNU debugger is part of the toolchain (see <<_toolchain_setup>>). +The GNU debugger is part of the toolchain (see <<_software_toolchain_setup>>). .Starting GDB (on Linux (Ubuntu on Windows)) [source, bash]
/userguide/index.adoc
1,18 → 1,9
= The NEORV32 RISC-V Processor: User Guide
include::../attrs.adoc[]
:title: [User Guide] The NEORV32 RISC-V Processor
:author: Dipl.-Ing. Stephan Nolting
:email: stnolting@gmail.com
:description: A size-optimized, customizable and open-source full-scale 32-bit RISC-V soft-core CPU and SoC written in platform-independent VHDL.
:revnumber: v1.5.6.0
:doctype: book
:sectnums:
:icons: font
:imagesdir: ../img
:stem:
:reproducible:
:listing-caption: Listing
:toc: left
:toclevels: 4
:title-logo-image: neorv32_logo_dark.png[pdfwidth=6.25in,align=center]
:favicon: ../img/icon.png
 
/userguide/main.adoc
1,21 → 1,6
= The NEORV32 RISC-V Processor: User Guide
:author: Dipl.-Ing. Stephan Nolting
:email: stnolting@gmail.com
:description: A size-optimized, customizable and open-source full-scale 32-bit RISC-V soft-core CPU and SoC written in platform-independent VHDL.
:revnumber: v1.5.6.0
:doctype: book
:sectnums:
:icons: image
:iconsdir: ../icons
:imagesdir: ../figures
:stem:
:reproducible:
:listing-caption: Listing
:toc: macro
:toclevels: 4
:title-logo-image: image:neorv32_logo_dark.png[pdfwidth=6.25in,align=center]
// Uncomment next line to set page size (default is A4)
//:pdf-page-size: Letter
include::../attrs.adoc[]
include::../attrs.main.adoc[]
 
 
<<<
/attrs.adoc
0,0 → 1,12
:author: Dipl.-Ing. Stephan Nolting
:email: stnolting@gmail.com
:description: A size-optimized, customizable and open-source full-scale 32-bit RISC-V soft-core CPU and SoC written in platform-independent VHDL.
:revnumber: v1.5.7
:doctype: book
:sectnums:
:stem:
:reproducible:
:listing-caption: Listing
:toclevels: 4
:title-logo-image: neorv32_logo_dark.png[pdfwidth=6.25in,align=center]
:favicon: img/icon.png
/attrs.main.adoc
0,0 → 1,7
:icons: image
:iconsdir: ../icons
:imagesdir: ../figures
:toc: macro
:title-logo-image: image:neorv32_logo_dark.png[pdfwidth=6.25in,align=center]
// Uncomment next line to set page size (default is A4)
//:pdf-page-size: Letter
/impressum.md
0,0 → 1,60
## Impressum (Imprint)
 
### Information pursuant to § 5 TMG
 
Dipl.-Ing. Stephan Nolting
 
Hannover, Germany
 
Email: stnolting@gmail.com
 
### Disclaimer
 
#### Liability for Content
 
All content on our website has been created with the greatest care and to the best of our knowledge. However, we cannot
guarantee the accuracy, completeness, or timeliness of the content. As a service provider, we are responsible for our own
content on these pages under the general laws pursuant to § 7 (1) TMG. Under §§ 8 to 10 TMG, however, we as a service
provider are not obliged to monitor transmitted or stored third-party information or to investigate circumstances that
indicate illegal activity. Obligations to remove or block the use of information under the general laws remain
unaffected.
 
Liability in this respect is, however, only possible from the point in time at which we become aware of a specific infringement.
As soon as we become aware of such infringements, we will remove the content concerned immediately.
 
#### Limitation of Liability for External Links
 
Our website contains links to external third-party websites. We have no influence on the content of these directly or
indirectly linked websites. We therefore cannot guarantee the accuracy of the content behind these "external links".
The respective providers or operators (authors) of the linked pages are responsible for the content of the
external links.
 
The external links were checked for possible legal violations at the time they were set and were free of illegal content
at that time. Continuous review of the content of external links is not possible without concrete indications of an
infringement. For direct or indirect links to third-party websites that lie outside our area of responsibility,
liability would arise only if we became aware of the content and it were technically possible and reasonable for us
to prevent its use in the case of illegal content.
 
This disclaimer also applies to links and references set within our own website https://github.com/stnolting/neorv32
by people asking questions, blog authors, and guests of the discussion forum. For illegal, incorrect, or incomplete
content, and in particular for damages arising from the use or non-use of information presented in this way,
the service provider of the page referred to is solely liable, not the party that merely refers to the respective
publication via links.
 
If we become aware of any infringements, we will remove the external links immediately.
 
#### Copyright
 
The content and works published on our website are subject to German copyright law
(http://www.gesetze-im-internet.de/bundesrecht/urhg/gesamt.pdf). Reproduction, editing,
distribution, and any kind of exploitation of the author's intellectual property, in an ideal or material sense,
beyond the limits of copyright law require the prior written consent of the respective
author within the meaning of the German Copyright Act (http://www.gesetze-im-internet.de/bundesrecht/urhg/gesamt.pdf).
Downloads and copies of this page are permitted for private, non-commercial use only.
Where content on our website was not created by us, the copyrights of third parties
must be observed. Third-party content is marked as such. Should you nevertheless become aware of a
copyright infringement, please notify us accordingly. If we become aware of
any infringements, we will remove such content immediately.
 
This Impressum was kindly provided by jurarat.de.
/legal.adoc
48,7 → 48,7
 
* "GitHub" is a subsidiary of Microsoft Corporation.
* "Vivado" and "Artix" are trademarks of Xilinx Inc.
* "AXI" and "AXI4-Lite" are trademarks of Arm Holdings plc.
* "AXI", "AXI4-Lite" and "AXI4-Stream" are trademarks of Arm Holdings plc.
* "ModelSim" is a trademark of Mentor Graphics – A Siemens Business.
* "Quartus Prime" and "Cyclone" are trademarks of Intel Corporation.
* "iCE40", "UltraPlus" and "Radiant" are trademarks of Lattice Semiconductor Corporation.
98,6 → 98,11
}
----
 
[TIP]
Each official release of the project (see https://github.com/stnolting/neorv32/releases[releases page]) comes with
a _digital object identifier_ (**DOI**) issued by https://zenodo.org/.
 
 
:sectnums!:
=== Acknowledgments
 
107,6 → 112,9
https://riscv.org[RISC-V] - instruction sets want to be free!
 
 
=== Impressum (Imprint)
 
See https://github.com/stnolting/neorv32/blob/master/docs/impressum.md[`docs/impressum.md`].
 
 
 
