OpenCores
URL https://opencores.org/ocsvn/neorv32/neorv32/trunk

Subversion Repositories neorv32

Compare Revisions

  • This comparison shows the changes necessary to convert path
    /neorv32/trunk
    from Rev 56 to Rev 57

Rev 56 → Rev 57

/.ci/hw_check.sh
8,7 → 8,7
homedir=$homedir/..
 
# Run simulation
sh $homedir/sim/ghdl/ghdl_sim.sh --stop-time=7ms
sh $homedir/sim/ghdl/ghdl_sim.sh
 
# Check if reference can be found in output (UART0 primary UART simulation output)
grep -qf $homedir/check_reference.out neorv32.uart0.sim_mode.text.out && echo "Hardware test completed successfully!"
/docs/figures/address_space.png Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream
docs/figures/address_space.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: docs/figures/cpu_interface_read_long.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: docs/figures/cpu_interface_read_long.png
===================================================================
--- docs/figures/cpu_interface_read_long.png (nonexistent)
+++ docs/figures/cpu_interface_read_long.png (revision 57)
docs/figures/cpu_interface_read_long.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: docs/figures/cpu_interface_write_long.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: docs/figures/cpu_interface_write_long.png
===================================================================
--- docs/figures/cpu_interface_write_long.png (nonexistent)
+++ docs/figures/cpu_interface_write_long.png (revision 57)
docs/figures/cpu_interface_write_long.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: docs/figures/neopixel.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: docs/figures/neopixel.png
===================================================================
--- docs/figures/neopixel.png (nonexistent)
+++ docs/figures/neopixel.png (revision 57)
docs/figures/neopixel.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: docs/figures/neorv32_bus.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: docs/figures/neorv32_cpu.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: docs/figures/neorv32_cpu.png
===================================================================
--- docs/figures/neorv32_cpu.png (nonexistent)
+++ docs/figures/neorv32_cpu.png (revision 57)
docs/figures/neorv32_cpu.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: docs/figures/wishbone_classic_read.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: docs/figures/wishbone_classic_read.png
===================================================================
--- docs/figures/wishbone_classic_read.png (nonexistent)
+++ docs/figures/wishbone_classic_read.png (revision 57)
docs/figures/wishbone_classic_read.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: docs/figures/wishbone_pipelined_write.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: docs/figures/wishbone_pipelined_write.png
===================================================================
--- docs/figures/wishbone_pipelined_write.png (nonexistent)
+++ docs/figures/wishbone_pipelined_write.png (revision 57)
docs/figures/wishbone_pipelined_write.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: docs/src_adoc/icons/important.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: docs/src_adoc/icons/important.png
===================================================================
--- docs/src_adoc/icons/important.png (nonexistent)
+++ docs/src_adoc/icons/important.png (revision 57)
docs/src_adoc/icons/important.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: docs/src_adoc/icons/note.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: docs/src_adoc/icons/note.png
===================================================================
--- docs/src_adoc/icons/note.png (nonexistent)
+++ docs/src_adoc/icons/note.png (revision 57)
docs/src_adoc/icons/note.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: docs/src_adoc/icons/tip.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: docs/src_adoc/icons/tip.png
===================================================================
--- docs/src_adoc/icons/tip.png (nonexistent)
+++ docs/src_adoc/icons/tip.png (revision 57)
docs/src_adoc/icons/tip.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: docs/src_adoc/icons/warning.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: docs/src_adoc/icons/warning.png
===================================================================
--- docs/src_adoc/icons/warning.png (nonexistent)
+++ docs/src_adoc/icons/warning.png (revision 57)
docs/src_adoc/icons/warning.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: docs/src_adoc/cpu.adoc
===================================================================
--- docs/src_adoc/cpu.adoc (nonexistent)
+++ docs/src_adoc/cpu.adoc (revision 57)
@@ -0,0 +1,973 @@
+:sectnums:
+== NEORV32 Central Processing Unit (CPU)
+
+image:../figures/riscv_logo.png[width=350,align=center]
+
+**Key Features**
+
+* 32-bit pipelined/multi-cycle in-order `rv32` RISC-V CPU
+* Optional RISC-V extensions: `rv32[i/e][m][a][c][b][Zfinx]` + `[u][Zicsr][Zifencei]`
+* Compatible with the RISC-V user specifications and a subset of the RISC-V privileged architecture specifications – passes the official RISC-V Architecture Tests (v2+)
+* Official RISC-V open-source architecture ID
+* Safe execution hardware (see section 2.7. Execution Safety); among other things, the CPU supports all traps from the RISC-V specifications
+(including bus access exceptions) and traps on all unimplemented/illegal/malformed instructions
+* Optional physical memory protection (PMP), compatible with the RISC-V specifications
+* Optional hardware performance monitors (HPM) for application benchmarking
+* Separate interfaces for instruction fetch and data access (merged into a single bus via a bus switch for
+the NEORV32 processor)
+* BIG-endian byte order
+* Configurable hardware reset
+* No hardware support for unaligned data/instruction accesses – they will trigger an exception. If the C extension is enabled, instructions
+can also be 16-bit aligned and a misaligned instruction address exception is no longer possible
+
+[NOTE]
+It is recommended to use the **NEORV32 Processor** as the default top instance even if you only want to use the actual
+CPU. Simply disable all the processor-internal modules via the generics and you will get a "CPU
+wrapper" that provides a minimal CPU environment and an external bus interface (like AXI4). 
This
+setup also allows further use of the default bootloader and software framework. From this base you
+can start building your own SoC. Of course you can also use the CPU in its true stand-alone mode.
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== Architecture
+
+The NEORV32 CPU was designed from scratch based only on the official ISA and privileged architecture
+specifications. The following figure shows the simplified architecture of the CPU.
+
+image:../figures/neorv32_cpu.png[align=center]
+
+The CPU uses a pipelined architecture with basically two main stages. The first stage (IF – instruction fetch)
+is responsible for fetching new instruction data from memory via the fetch engine. The instruction data is
+stored in a FIFO – the instruction prefetch buffer. The issue engine takes this data and assembles 32-bit
+instruction words for the next pipeline stage. Compressed instructions – if enabled – are also decompressed
+in this stage. The second stage (EX – execution) is responsible for actually executing the fetched instructions
+via the execute engine.
+
+These two pipeline stages are based on a multi-cycle processing engine, so the processing of a
+certain operation in each stage can take several cycles. Since the IF and EX stages are decoupled via the instruction
+prefetch buffer, both stages can operate in parallel and with overlapping operations. Hence, the optimal CPI
+(cycles per instruction) is 2, but it can be significantly higher: for instance when executing loads/stores,
+multi-cycle operations like divisions, or when the instruction fetch engine has to reload the prefetch buffers
+due to a taken branch. 
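As a rough illustration of the CPI figure mentioned above, the following Python sketch computes CPI from a cycle count and a retired-instruction count. The helper names `combine_counter` and `cpi` are hypothetical and not part of the NEORV32 software framework; the sketch only assumes that each 64-bit count is available as two 32-bit words (high and low), as is usual for RV32 counter CSRs.

```python
def combine_counter(high, low):
    """Merge the high and low 32-bit words of a counter into one 64-bit value."""
    return ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)


def cpi(cycles, instructions):
    """Return cycles per instruction for one measurement window."""
    if instructions <= 0:
        raise ValueError("at least one retired instruction is required")
    return cycles / instructions


# Hypothetical measurement: 200 cycles spent on 100 retired instructions.
cycles = combine_counter(0, 200)
instructions = combine_counter(0, 100)
print(cpi(cycles, instructions))
```

With these made-up numbers the result matches the optimal value of 2; real programs with loads/stores, divisions or taken branches would yield a higher figure.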
+
+Basically, the NEORV32 CPU is somewhere between a classical pipelined architecture, where each stage
+requires exactly one processing cycle (if not stalled), and a classical multi-cycle architecture, which executes
+every single instruction in a series of consecutive micro-operations. The combination of these two classical
+design paradigms allows an increased instruction throughput compared to a pure multi-cycle approach (due to
+the pipelined approach) at a reduced hardware footprint (due to the multi-cycle approach).
+
+The CPU provides independent interfaces for instruction fetch and data access. These two bus interfaces are
+merged into a single processor-internal bus via a bus switch. Hence, memory locations including peripheral
+devices are mapped to a single 32-bit address space making the architecture a modified Von-Neumann
+Architecture.
+
+
+// ####################################################################################################################
+:sectnums:
+=== RISC-V Compliance
+
+The NEORV32 CPU passes the rv32_m/I, rv32_m/M, rv32_m/C, rv32_m/privilege, and
+rv32_m/Zifencei tests of the official RISC-V Architecture Tests (GitHub). The port files for the
+NEORV32 processor are located in the riscv-arch-test folder. See section <<_risc_v_architecture_test_framework>> for information on how to run
+the tests on the NEORV32.
+
+.**RISC-V `rv32_m/C` Tests**
+...................................
+Check cadd-01 ... OK
+Check caddi-01 ... OK
+Check caddi16sp-01 ... OK
+Check caddi4spn-01 ... OK
+Check cand-01 ... OK
+Check candi-01 ... OK
+Check cbeqz-01 ... OK
+Check cbnez-01 ... OK
+Check cebreak-01 ... OK
+Check cj-01 ... OK
+Check cjal-01 ... OK
+Check cjalr-01 ... OK
+Check cjr-01 ... OK
+Check cli-01 ... OK
+Check clui-01 ... OK
+Check clw-01 ... OK
+Check clwsp-01 ... OK
+Check cmv-01 ... OK
+Check cnop-01 ... OK
+Check cor-01 ... OK
+Check cslli-01 ... OK
+Check csrai-01 ... OK
+Check csrli-01 ... OK
+Check csub-01 ... OK
+Check csw-01 ... 
OK
+Check cswsp-01 ... OK
+Check cxor-01 ... OK
+--------------------------------
+OK: 27/27 RISCV_TARGET=neorv32 RISCV_DEVICE=C XLEN=32
+...................................
+
+.**RISC-V `rv32_m/I` Tests**
+...................................
+Check add-01 ... OK
+Check addi-01 ... OK
+Check and-01 ... OK
+Check andi-01 ... OK
+Check auipc-01 ... OK
+Check beq-01 ... OK
+Check bge-01 ... OK
+Check bgeu-01 ... OK
+Check blt-01 ... OK
+Check bltu-01 ... OK
+Check bne-01 ... OK
+Check fence-01 ... OK
+Check jal-01 ... OK
+Check jalr-01 ... OK
+Check lb-align-01 ... OK
+Check lbu-align-01 ... OK
+Check lh-align-01 ... OK
+Check lhu-align-01 ... OK
+Check lui-01 ... OK
+Check lw-align-01 ... OK
+Check or-01 ... OK
+Check ori-01 ... OK
+Check sb-align-01 ... OK
+Check sh-align-01 ... OK
+Check sll-01 ... OK
+Check slli-01 ... OK
+Check slt-01 ... OK
+Check slti-01 ... OK
+Check sltiu-01 ... OK
+Check sltu-01 ... OK
+Check sra-01 ... OK
+Check srai-01 ... OK
+Check srl-01 ... OK
+Check srli-01 ... OK
+Check sub-01 ... OK
+Check sw-align-01 ... OK
+Check xor-01 ... OK
+Check xori-01 ... OK
+--------------------------------
+OK: 38/38 RISCV_TARGET=neorv32 RISCV_DEVICE=I XLEN=32
+...................................
+
+.**RISC-V `rv32_m/M` Tests**
+...................................
+Check div-01 ... OK
+Check divu-01 ... OK
+Check mul-01 ... OK
+Check mulh-01 ... OK
+Check mulhsu-01 ... OK
+Check mulhu-01 ... OK
+Check rem-01 ... OK
+Check remu-01 ... OK
+--------------------------------
+OK: 8/8 RISCV_TARGET=neorv32 RISCV_DEVICE=M XLEN=32
+...................................
+
+.**RISC-V `rv32_m/privilege` Tests**
+...................................
+Check ebreak ... OK
+Check ecall ... OK
+Check misalign-beq-01 ... OK
+Check misalign-bge-01 ... OK
+Check misalign-bgeu-01 ... OK
+Check misalign-blt-01 ... OK
+Check misalign-bltu-01 ... OK
+Check misalign-bne-01 ... OK
+Check misalign-jal-01 ... OK
+Check misalign-lh-01 ... OK
+Check misalign-lhu-01 ... 
OK
+Check misalign-lw-01 ... OK
+Check misalign-sh-01 ... OK
+Check misalign-sw-01 ... OK
+Check misalign1-jalr-01 ... OK
+Check misalign2-jalr-01 ... OK
+--------------------------------
+OK: 16/16 RISCV_TARGET=neorv32 RISCV_DEVICE=privilege XLEN=32
+...................................
+
+.**RISC-V `rv32_m/Zifencei` Tests**
+...................................
+Check Fencei ... OK
+--------------------------------
+OK: 1/1 RISCV_TARGET=neorv32 RISCV_DEVICE=Zifencei XLEN=32
+...................................
+
+
+<<<
+:sectnums:
+==== RISC-V Incompatibility Issues and Limitations
+
+This list shows the currently known issues regarding full RISC-V compatibility. More specific information
+can be found in section <<_instruction_sets_and_extensions>>.
+
+[IMPORTANT]
+CPU and Processor are BIG-ENDIAN, but this should be no problem as the external memory bus
+interface provides big- and little-endian configurations. See section <<_processor_external_memory_interface_wishbone_axi4_lite>> for more information.
+
+[IMPORTANT]
+The `misa` CSR is read-only. It reflects the synthesized CPU extensions. Hence, all implemented
+CPU extensions are always active and cannot be enabled/disabled dynamically during runtime. Any
+write access to it (in machine mode) is ignored and will not cause any exception or side effects.
+
+[IMPORTANT]
+The physical memory protection (see section <<_machine_physical_memory_protection>>)
+currently supports only the modes _OFF_ and _NAPOT_ and a minimal granularity of 8 bytes per region.
+
+[IMPORTANT]
+The `A` CPU extension (atomic memory access) currently implements only the `lr.w` and `sc.w` instructions.
+However, these instructions are sufficient to emulate all further AMO operations.
+
+
+==== NEORV32-Specific (Custom) Extensions
+
+The NEORV32-specific extensions are always enabled and are indicated by the set `X` bit in the `misa` CSR. 
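Software can test for the `X` bit by decoding the extension field of `misa`. The following is a minimal Python sketch, assuming the standard RISC-V encoding in which bit 0 stands for `A` and bit 25 for `Z`. The function `has_extension` is a hypothetical helper and the `misa` value used below is made up for illustration, not read from real hardware.

```python
def has_extension(misa, letter):
    """Return True if the single-letter extension is flagged in a misa value."""
    letter = letter.upper()
    if len(letter) != 1 or not "A" <= letter <= "Z":
        raise ValueError("extension must be a single letter A..Z")
    bit = ord(letter) - ord("A")  # 'A' -> bit 0, ..., 'X' -> bit 23
    return ((misa >> bit) & 1) == 1


# Made-up example value: MXL = 32-bit, with the I, M and X bits set.
example_misa = 0x40000000 | (1 << 8) | (1 << 12) | (1 << 23)
print(has_extension(example_misa, "X"))
```

Because `misa` is read-only on this CPU, such a check reflects the synthesized configuration and does not change at runtime.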
+
+[NOTE]
+The CPU provides eight _fast interrupts_, which are controlled via custom bits in the `mie`
+and `mip` CSRs. This extension is mapped to bits that are available for custom use (according to the
+RISC-V specs). Also, custom trap codes for `mcause` are implemented.
+
+[NOTE]
+A custom CSR `mzext` is available that can be used to check for implemented `Z*` CPU extensions
+(for example `Zifencei`). This CSR is mapped to the official "custom CSR address region".
+
+[NOTE]
+All undefined/unimplemented/malformed/illegal instructions raise an illegal instruction exception
+(see <<_execution_safety>>).
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== CPU Top Entity - Signals
+
+The following table shows all interface signals of the CPU top entity `rtl/core/neorv32_cpu.vhd`. The
+type of all signals is _std_ulogic_ or _std_ulogic_vector_, respectively. The "Dir." column shows the signal
+direction as seen from the CPU.
+
+.NEORV32 CPU top entity signals
+[cols="<2,^1,^1,<6"]
+[options="header", grid="rows"]
+|=======================
+| Signal | Width | Dir. 
| Function
+4+^| **Global Signals**
+| `clk_i` | 1 | in | global clock line, all registers triggering on rising edge
+| `rstn_i` | 1 | in | global reset, low-active
+| `sleep_o` | 1 | out | CPU is in sleep mode when set
+4+^| **Instruction Bus Interface**
+| `i_bus_addr_o` | 32 | out | destination address
+| `i_bus_rdata_i` | 32 | in | read data
+| `i_bus_wdata_o` | 32 | out | write data (always zero)
+| `i_bus_ben_o` | 4 | out | byte enable
+| `i_bus_we_o` | 1 | out | write transaction (always zero)
+| `i_bus_re_o` | 1 | out | read transaction
+| `i_bus_lock_o` | 1 | out | exclusive access request (always zero)
+| `i_bus_ack_i` | 1 | in | bus transfer acknowledge from accessed peripheral
+| `i_bus_err_i` | 1 | in | bus transfer terminate from accessed peripheral
+| `i_bus_fence_o` | 1 | out | indicates an executed _fence.i_ instruction
+| `i_bus_priv_o` | 2 | out | current CPU privilege level
+4+^| **Data Bus Interface**
+| `d_bus_addr_o` | 32 | out | destination address
+| `d_bus_rdata_i` | 32 | in | read data
+| `d_bus_wdata_o` | 32 | out | write data
+| `d_bus_ben_o` | 4 | out | byte enable
+| `d_bus_we_o` | 1 | out | write transaction
+| `d_bus_re_o` | 1 | out | read transaction
+| `d_bus_lock_o` | 1 | out | exclusive access request
+| `d_bus_ack_i` | 1 | in | bus transfer acknowledge from accessed peripheral
+| `d_bus_err_i` | 1 | in | bus transfer terminate from accessed peripheral
+| `d_bus_fence_o` | 1 | out | indicates an executed _fence_ instruction
+| `d_bus_priv_o` | 2 | out | current CPU privilege level
+4+^| **System Time**
+| `time_i` | 64 | in | system time input (from MTIME)
+4+^| **Interrupts (RISC-V-compatible)**
+| `msw_irq_i` | 1 | in | RISC-V machine software interrupt
+| `mext_irq_i` | 1 | in | RISC-V machine external interrupt
+| `mtime_irq_i` | 1 | in | RISC-V machine timer interrupt
+4+^| **Fast Interrupts (NEORV32-specific)**
+| `firq_i` | 16 | in | fast interrupt request signals
+| `firq_ack_o` | 16 | out | fast interrupt acknowledge 
signals
+|=======================
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== CPU Top Entity - Generics
+
+The CPU generics are not listed here because they are a subset of the processor's generics.
+See section <<_processor_top_entity_generics>> for more information.
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== Instruction Sets and Extensions
+
+The NEORV32 is a RISC-V `rv32i` architecture that provides several optional RISC-V CPU and ISA
+(instruction set architecture) extensions. For more information regarding the RISC-V ISA extensions please
+see _The RISC-V Instruction Set Manual – Volume I: Unprivileged ISA_ and _The RISC-V Instruction Set Manual
+Volume II: Privileged Architecture_, which are available in the project's `docs/` folder.
+
+
+==== **`A`** - Atomic Memory Access
+
+Atomic memory access instructions (for implementing semaphores and mutexes) are available when the
+`CPU_EXTENSION_RISCV_A` configuration generic is _true_. In this case the following additional instructions
+are available:
+
+* `lr.w`: load-reserved
+* `sc.w`: store-conditional
+
+[NOTE]
+Even though only the `lr.w` and `sc.w` instructions are implemented so far, all further atomic operations
+(load-modify-write instructions) can be emulated using these two instructions. Furthermore, the
+instruction’s ordering flags (`aq` and `rl`) are ignored by the CPU hardware. Using any other (not yet
+implemented) AMO (atomic memory operation) will trigger an illegal instruction exception.
+
+[NOTE]
+The atomic instructions have special requirements for the memory system / bus interconnect. More
+information can be found in sections <<_bus_interface>> and <<_processor_external_memory_interface_wishbone_axi4_lite>>, respectively. 
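The claim that further atomic operations can be emulated with `lr.w` and `sc.w` can be sketched with a small behavioural model in Python. This is not NEORV32 code: `ReservationMemory` and `amoadd_w` are hypothetical names, and the reservation handling is a simplified assumption (one reservation, cleared by any `sc.w` or by a plain store to the reserved address). The only RISC-V fact relied upon is that `sc.w` returns zero on success and a non-zero value on failure.

```python
class ReservationMemory:
    """Toy word-addressed memory with a single load-reserved reservation."""

    def __init__(self):
        self.words = {}
        self.reservation = None

    def lr_w(self, addr):
        # Load the word and register a reservation for its address.
        self.reservation = addr
        return self.words.get(addr, 0)

    def sc_w(self, addr, value):
        # Store only if the reservation is still valid; 0 means success.
        success = self.reservation == addr
        self.reservation = None
        if success:
            self.words[addr] = value & 0xFFFFFFFF
        return 0 if success else 1

    def store(self, addr, value):
        # Plain store (e.g. from another bus master); breaks a matching reservation.
        self.words[addr] = value & 0xFFFFFFFF
        if self.reservation == addr:
            self.reservation = None


def amoadd_w(mem, addr, operand):
    """Emulate an atomic add: retry the lr.w/sc.w pair until the store succeeds."""
    while True:
        old = mem.lr_w(addr)
        if mem.sc_w(addr, old + operand) == 0:
            return old


mem = ReservationMemory()
mem.store(0x100, 5)
previous = amoadd_w(mem, 0x100, 3)
print(previous, mem.words[0x100])
```

Other read-modify-write operations (swap, and, or, min, max) would follow the same retry loop with a different computation between the load and the conditional store.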
+
+
+
+==== **`B`** - Bit-Manipulation
+
+The bit-manipulation instruction extension is available when the `CPU_EXTENSION_RISCV_B` configuration generic
+is _true_. Note that not all sub-extensions are implemented yet. When the bit-manipulation extension is enabled
+the following instructions are available:
+
+* base subset **`Zbb`**: `clz`, `ctz`, `cpop`, `sext.b`, `sext.h`, `min[u]`, `max[u]`, `andn`, `orn`, `xnor`, `rol`, `ror`,
+`rori`, `c.xor`, `zext` (_pseudo instruction_ for `pack rd, rs, zero`), `rev8` (_pseudo instruction_ for `grevi rd, rs, -8`),
+`orc.b` (_pseudo instruction_ for `gorci rd, rs, 7`)
+* single-bit operations **`Zbs`**: `sbset[i]`, `sbclr[i]`, `sbinv[i]`, `sbext[i]`
+* shifted-add operations **`Zba`**: `sh1add`, `sh2add`, `sh3add`
+
+[WARNING]
+The bit manipulation extension is not yet officially ratified and the NEORV32 implementation is still
+_work-in-progress_. There is no software support in the upstream GCC RISC-V port yet. However, an intrinsic library
+is provided to utilize the provided bit manipulation extension from C-language code (see
+`sw/example/bit_manipulation`).
+
+[NOTE]
+The current version of the bit manipulation specs that is supported by the NEORV32 can be found
+in `docs/bitmanip-draft.pdf`.
+
+
+==== **`C`** - Compressed Instructions
+
+Compressed 16-bit instructions are available when the `CPU_EXTENSION_RISCV_C` configuration generic is
+_true_. In this case the following instructions are available:
+
+* `c.addi4spn`, `c.lw`, `c.sw`, `c.nop`, `c.addi`, `c.jal`, `c.li`, `c.addi16sp`, `c.lui`, `c.srli`, `c.srai`, `c.andi`, `c.sub`,
+`c.xor`, `c.or`, `c.and`, `c.j`, `c.beqz`, `c.bnez`, `c.slli`, `c.lwsp`, `c.jr`, `c.mv`, `c.ebreak`, `c.jalr`, `c.add`, `c.swsp`
+
+[NOTE]
+When the compressed instructions extension is enabled, branches to an _unaligned_ and _uncompressed_ address require
+an additional instruction fetch to load the required second half-word of that instruction. 
The performance can be increased
+again by forcing a 32-bit alignment of branch target addresses. By default, this is enforced via the GCC `-falign-functions=4`,
+`-falign-labels=4`, `-falign-loops=4` and `-falign-jumps=4` compile flags (via the makefile).
+
+
+==== **`E`** - Embedded CPU
+
+The embedded CPU extension reduces the size of the general purpose register file from 32 entries to 16 entries to reduce hardware
+requirements. This extension is enabled when the `CPU_EXTENSION_RISCV_E` configuration generic is _true_. Accesses to registers beyond
+`x15` will raise an _illegal instruction exception_.
+
+Due to the reduced register file an alternate ABI (**`ilp32e`**) is required for the toolchain.
+
+
+==== **`I`** - Base Integer ISA
+The CPU always supports the complete `rv32i` base integer instruction set. This base set is always enabled
+regardless of the setting of the remaining extensions. The base instruction set includes the following
+instructions:
+
+* immediates: `lui`, `auipc`
+* jumps: `jal`, `jalr`
+* branches: `beq`, `bne`, `blt`, `bge`, `bltu`, `bgeu`
+* memory: `lb`, `lh`, `lw`, `lbu`, `lhu`, `sb`, `sh`, `sw`
+* alu: `addi`, `slti`, `sltiu`, `xori`, `ori`, `andi`, `slli`, `srli`, `srai`, `add`, `sub`, `sll`, `slt`, `sltu`, `xor`, `srl`, `sra`, `or`, `and`
+* environment: `ecall`, `ebreak`, `fence`
+
+[NOTE]
+In order to keep the hardware footprint low, the CPU's shift unit uses a hybrid parallel/serial approach. Shift
+operations are split into coarse shifts (multiples of 4) and a final fine shift (0 to 3). The total execution
+time depends on the shift amount. Alternatively, the shift operations can be processed completely in parallel by a fast
+(but large) barrel shifter when the `FAST_SHIFT_EN` generic is _true_. In that case, shift operations
+complete within 2 cycles regardless of the shift amount. Shift operations can also be executed in a purely serial manner when
+the `TINY_SHIFT_EN` generic is _true_. 
In that case, shift operations take up to 32 cycles depending on the shift amount.
+
+[NOTE]
+Internally, the `fence` instruction does not perform any operation inside the CPU. It only sets the
+top’s `d_bus_fence_o` signal high for one cycle to inform the memory system that a `fence` instruction has been
+executed. Any flags within the `fence` instruction word are ignored by the hardware.
+
+
+==== **`M`** - Integer Multiplication and Division
+
+Hardware-accelerated integer multiplication and division instructions are available when the
+`CPU_EXTENSION_RISCV_M` configuration generic is _true_. In this case the following instructions are
+available:
+
+* multiplication: `mul`, `mulh`, `mulhsu`, `mulhu`
+* division: `div`, `divu`, `rem`, `remu`
+
+[NOTE]
+By default, multiplication and division operations are executed in a bit-serial approach.
+Alternatively, the multiplier core can be implemented using DSP blocks if the `FAST_MUL_EN`
+generic is _true_, allowing faster execution. Multiplications and divisions
+always require a fixed number of cycles to complete - regardless of the input operands.
+
+
+==== **`U`** - Less-Privileged User Mode
+
+Adds the less-privileged _user mode_ when the `CPU_EXTENSION_RISCV_U` configuration generic is _true_. For
+instance, user-level code cannot access machine-mode CSRs. Furthermore, access to the address space (like
+peripheral/IO devices) can be limited via the physical memory protection (_PMP_) unit for code running in user mode.
+
+
+==== **`Zfinx`** Single-Precision Floating-Point Operations
+
+The `Zfinx` floating-point extension is an alternative to the `F` floating-point extension that uses the
+integer register file `x` to store and operate on floating-point data (hence, `F-in-x`). Since no dedicated floating-point `f`
+register file exists, the `Zfinx` extension requires fewer hardware resources and features faster context changes. 
+This also implies that there are NO dedicated `f` register file related load/store or move instructions. The
+official RISC-V specifications can be found here: https://github.com/riscv/riscv-zfinx
+
+The NEORV32 floating-point unit used by the `Zfinx` extension is compatible with the _IEEE-754_ specifications.
+
+The `Zfinx` extension currently supports single-precision (`.s` suffix) only (so it is a direct alternative to the `F`
+extension). The `Zfinx` extension is implemented when the `CPU_EXTENSION_RISCV_Zfinx` configuration
+generic is _true_. In this case the following instructions and CSRs are available:
+
+* conversion: `fcvt.s.w`, `fcvt.s.wu`, `fcvt.w.s`, `fcvt.wu.s`
+* comparison: `fmin.s`, `fmax.s`, `feq.s`, `flt.s`, `fle.s`
+* computational: `fadd.s`, `fsub.s`, `fmul.s`
+* sign-injection: `fsgnj.s`, `fsgnjn.s`, `fsgnjx.s`
+* number classification: `fclass.s`
+
+* additional CSRs: `fcsr`, `frm`, `fflags`
+
+[WARNING]
+Fused multiply-add instructions `f[n]m[add/sub].s` are not supported!
+Division `fdiv.s` and square root `fsqrt.s` instructions are not supported yet!
+
+[WARNING]
+Subnormal numbers (also "de-normalized" numbers) are not supported by the NEORV32 FPU.
+Subnormal numbers (exponent = 0) are _flushed to zero_ (setting them to +/- 0) before entering the
+FPU's processing core. If a computational instruction (like `fmul.s`) generates a subnormal result, the
+result is also flushed to zero during normalization.
+
+[WARNING]
+The `Zfinx` extension is not yet officially ratified, but is expected to stay unchanged. There is no
+software support for the `Zfinx` extension in the upstream GCC RISC-V port yet. However, an
+intrinsic library is provided to utilize the provided `Zfinx` floating-point extension from C-language
+code (see `sw/example/floating_point_test`). 
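The flush-to-zero handling of subnormal numbers described in the warning above can be modelled on raw single-precision bit patterns. The Python sketch below is a behavioural illustration under that description, not the FPU's actual implementation; `float_to_bits` and `flush_subnormal` are hypothetical helper names.

```python
import struct


def float_to_bits(value):
    """Return the 32-bit IEEE-754 single-precision pattern of a Python float."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def flush_subnormal(bits):
    """Replace a pattern with exponent field 0 by a zero of the same sign."""
    exponent = (bits >> 23) & 0xFF
    if exponent == 0:
        return bits & 0x80000000  # keep only the sign bit: +0 or -0
    return bits


print(hex(flush_subnormal(0x00000001)))          # smallest positive subnormal
print(hex(flush_subnormal(0x80000001)))          # smallest negative subnormal
print(hex(flush_subnormal(float_to_bits(1.0))))  # normal number, unchanged
```

Working on bit patterns rather than Python floats mirrors the `Zfinx` idea that floating-point data lives in the integer registers.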
+
+
+==== **`Zicsr`** Control and Status Register Access / Privileged Architecture
+
+The CSR access instructions as well as the exception and interrupt system (= the privileged architecture) are implemented when the
+`CPU_EXTENSION_RISCV_Zicsr` configuration generic is _true_. In this case the following instructions are
+available:
+
+* CSR access: `csrrw`, `csrrs`, `csrrc`, `csrrwi`, `csrrsi`, `csrrci`
+* environment: `mret`, `wfi`
+
+[WARNING]
+If the `Zicsr` extension is disabled the CPU does not provide any kind of interrupt or exception
+support at all. In order to provide the full spectrum of functions and to allow a secure execution
+environment, the `Zicsr` extension should always be enabled.
+
+[NOTE]
+The "wait for interrupt" instruction `wfi` works like a sleep command. When executed, the CPU is
+halted until a valid interrupt request occurs. To wake up again, the corresponding interrupt source has to
+be enabled via the `mie` CSR and the global interrupt enable flag in `mstatus` has to be set.
+
+
+==== **`Zifencei`** Instruction Stream Synchronization
+
+The `Zifencei` CPU extension is implemented if the `CPU_EXTENSION_RISCV_Zifencei` configuration
+generic is _true_. It allows manual synchronization of the instruction stream via the following instruction:
+
+* `fence.i`
+
+[NOTE]
+The `fence.i` instruction resets the CPU's internal instruction fetch engine and flushes the prefetch buffer.
+This allows a clean re-fetch of modified data from memory. Also, the top's `i_bus_fence_o` signal is set
+high for one cycle to inform the memory system. Any additional flags within the `fence.i` instruction word are ignored by the hardware.
+
+
+==== **`PMP`** Physical Memory Protection
+
+The NEORV32 physical memory protection (PMP) is compatible with the PMP specified by the RISC-V specs.
+The CPU PMP currently supports only _NAPOT_ mode and a minimal region size (granularity) of 8 bytes. 
Larger minimal sizes can be configured
+via the top `PMP_MIN_GRANULARITY` generic to reduce hardware requirements. The physical memory protection system is implemented when the
+`PMP_NUM_REGIONS` configuration generic is >0. In this case the following additional CSRs are available:
+
+* `pmpcfg*` (0..15, depending on configuration): PMP configuration registers
+* `pmpaddr*` (0..63, depending on configuration): PMP address registers
+
+See section <<_machine_physical_memory_protection>> for more information regarding the PMP CSRs.
+
+**Configuration**
+
+The actual number of regions and the minimal region granularity are defined via the top entity
+`PMP_MIN_GRANULARITY` and `PMP_NUM_REGIONS` generics. `PMP_MIN_GRANULARITY` defines the minimal available
+granularity of each region in bytes. `PMP_NUM_REGIONS` defines the total number of implemented regions and thus the
+number of available `pmpcfg*` and `pmpaddr*` CSRs.
+
+When implementing more PMP regions than a _certain critical limit_ *an additional register stage
+is automatically inserted* into the CPU's memory interfaces to reduce critical path length. Unfortunately, this will also
+increase the latency of instruction fetches and data accesses by +1 cycle.
+
+The critical limit can be adapted for custom use by a constant from the main VHDL package file
+(`rtl/core/neorv32_package.vhd`). The default value is 8:
+
+[source,vhdl]
+----
+-- "critical" number of PMP regions --
+constant pmp_num_regions_critical_c : natural := 8;
+----
+
+**Operation**
+
+Any memory access address (from the CPU's instruction fetch or data access interface) is tested if it is accessing any
+of the specified (configured via `pmpaddr*` and enabled via `pmpcfg*`) PMP regions. 
If an +address accesses one of these regions, the configured access rights (attributes in `pmpcfg*`) are checked: + +* a write access (store) will fail if no write attribute is set +* a read access (load) will fail if no read attribute is set +* an instruction fetch access will fail if no execute attribute is set + +If an access to a protected region does not have the required access rights (attributes) it will raise the corresponding +_instruction/load/store access fault exception_. + +By default, all PMP checks are enforced for user-level programs only. If you wish to enforce the physical +memory protection also for machine-level programs you need to activate the _locked bit_ in the corresponding +`pmpcfg*` configuration. + +[IMPORTANT] +After updating the address configuration registers `pmpaddr*` the system requires up to 33 cycles for +internal (iterative) computations before the configuration becomes valid. + +[NOTE] +For more information regarding RISC-V physical memory protection see the official _The RISC-V +Instruction Set Manual – Volume II: Privileged Architecture_ specifications. + + +==== **`HPM`** Hardware Performance Monitors + +In addition to the mandatory cycle (`[m]cycle[h]`) and instruction (`[m]instret[h]`) counters the NEORV32 CPU provides +up to 29 hardware performance monitors (HPM 3..31), which can be used to benchmark applications. Each HPM consists of an +N-bit wide counter (split in a high-word 32-bit CSR and a low-word 32-bit CSR), where N is defined via the top's +`HPM_CNT_WIDTH` generic (1..64-bit), and a corresponding event configuration CSR. The event configuration +CSR defines the architectural events that lead to an increment of the associated HPM counter. + +The cycle, time and instructions-retired counters (`[m]cycle[h]`, `time[h]`, `[m]instret[h]`) are +mandatory performance monitors on every RISC-V platform and have fixed increment events. For example, +the instructions-retired counter increments with each executed instruction. 
The actual hardware performance +monitors are optional and can be configured to increment on arbitrary hardware events. The number of +available HPMs is configured via the top's `HPM_NUM_CNTS` generic at synthesis time. Assigning a zero will exclude +all HPM logic from the design. + +Depending on the configuration, the following additional CSRs are available: + +* counters: `[m]hpmcounter*[h]` (3..31, depending on configuration) +* event configuration: `mhpmevent*` (3..31, depending on configuration) + +User-level access to the counter registers `hpmcounter*[h]` can be individually restricted via the `mcounteren` CSR. +Auto-increment of the HPMs can be individually deactivated via the `mcountinhibit` CSR. + +If `HPM_NUM_CNTS` is lower than the maximum value (=29) the remaining HPMs are not implemented. +However, accessing their associated CSRs will not raise an illegal instruction exception. These CSRs are +read-only and will always return 0. + +[NOTE] +For a list of all allocated HPM-related CSRs and all provided event configurations see section <<_hardware_performance_monitors_hpm>>. + + +<<< +// #################################################################################################################### +:sectnums: +=== Instruction Timing + +The instruction timing listed in the table below shows the required clock cycles for executing a certain +instruction. These instruction cycles assume a bus access without additional wait states and a filled +pipeline. + +Average CPI (cycles per instruction) values for "real applications", such as the CoreMark benchmark, for different CPU +configurations are presented in <<_cpu_performance>>. 
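As a worked reading of the default shifter entry in the instruction timing table of this section (`3 + SA/4 + SA%4`), the following C sketch computes the cycle count for a given shift amount. It assumes that `SA/4` denotes an integer division and that `SA` is limited to 0..31 as on RV32. The function name `shift_cycles_default` is hypothetical and not part of the NEORV32 software library.

```c
#include <stdint.h>

/* Hypothetical helper (illustration only): execution cycles of a shift
 * instruction for the default shifter, following the table entry
 * "3 + SA/4 + SA%4". Assumes integer division and a 5-bit shift amount. */
static inline uint32_t shift_cycles_default(uint32_t sa) {
    sa &= 31u;                          /* RV32 shift amounts use 5 bits */
    return 3u + (sa / 4u) + (sa % 4u);  /* base cost + coarse steps + fine steps */
}
```

Under these assumptions a shift by 0 takes 3 cycles and a shift by 31 takes 3 + 7 + 3 = 13 cycles.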
+ +.Clock cycles per instruction +[cols="<2,^1,^4,<3"] +[options="header", grid="rows"] +|======================= +| Class | ISA | Instruction(s) | Execution cycles +| ALU | `I/E` | `addi` `slti` `sltiu` `xori` `ori` `andi` `add` `sub` `slt` `sltu` `xor` `or` `and` `lui` `auipc` | 2 +| ALU | `C` | `c.addi4spn` `c.nop` `c.addi` `c.li` `c.addi16sp` `c.lui` `c.andi` `c.sub` `c.xor` `c.or` `c.and` `c.add` `c.mv` | 2 +| ALU | `I/E` | `slli` `srli` `srai` `sll` `srl` `sra` | 3 + SAfootnote:[Shift amount.]/4 + SA%4; FAST_SHIFTfootnote:[Barrel shift when `FAST_SHIFT_EN` is enabled.]: 4; TINY_SHIFTfootnote:[Serial shift when `TINY_SHIFT_EN` is enabled.]: 2..32 +| ALU | `C` | `c.srli` `c.srai` `c.slli` | 3 + SAfootnote:[Shift amount.]/4 + SA%4; FAST_SHIFTfootnote:[Barrel shift when `FAST_SHIFT_EN` is enabled.]: 4; TINY_SHIFTfootnote:[Serial shift when `TINY_SHIFT_EN` is enabled.]: 2..32 +| Branches | `I/E` | `beq` `bne` `blt` `bge` `bltu` `bgeu` | Taken: 5 + MLfootnote:[Memory latency.]; Not taken: 3 +| Branches | `C` | `c.beqz` `c.bnez` | Taken: 5 + MLfootnote:[Memory latency.]; Not taken: 3 +| Jumps / Calls | `I/E` | `jal` `jalr` | 4 + ML +| Jumps / Calls | `C` | `c.jal` `c.j` `c.jr` `c.jalr` | 4 + ML +| Memory access | `I/E` | `lb` `lh` `lw` `lbu` `lhu` `sb` `sh` `sw` | 4 + ML +| Memory access | `C` | `c.lw` `c.sw` `c.lwsp` `c.swsp` | 4 + ML +| Memory access | `A` | `lr.w` `sc.w` | 4 + ML +| Multiplication | `M` | `mul` `mulh` `mulhsu` `mulhu` | 2+31+3; FAST_MULfootnote:[DSP-based multiplication; enabled via `FAST_MUL_EN`.]: 5 +| Division | `M` | `div` `divu` `rem` `remu` | 22+32+4 +| Bit-manipulation - arithmetic/logic | `B(Zbb)` | `sext.b` `sext.h` `min` `minu` `max` `maxu` `andn` `orn` `xnor` `zext`(pack) `rev8`(grevi) `orc.b`(gorci) | 3 +| Bit-manipulation - shifts | `B(Zbb)` | `clz` `ctz` | 3 + 0..32 +| Bit-manipulation - shifts | `B(Zbb)` | `cpop` | 3 + 32 +| Bit-manipulation - shifts | `B(Zbb)` | `rol` `ror` `rori` | 3 + SA +| Bit-manipulation - single-bit | 
`B(Zbs)` | `sbset[i]` `sbclr[i]` `sbinv[i]` `sbext[i]` | 3 +| Bit-manipulation - shifted-add | `B(Zba)` | `sh1add` `sh2add` `sh3add` | 3 +| CSR access | `Zicsr` | `csrrw` `csrrs` `csrrc` `csrrwi` `csrrsi` `csrrci` | 4 +| System | `I/E`+`Zicsr` | `ecall` `ebreak` | 4 +| System | `I/E` | `fence` | 3 +| System | `C`+`Zicsr` | `c.ebreak` | 4 +| System | `Zicsr` | `mret` `wfi` | 5 +| System | `Zifencei` | `fence.i` | 5 +| Floating-point - arithmetic | `Zfinx` | `fadd.s` | 110 +| Floating-point - arithmetic | `Zfinx` | `fsub.s` | 112 +| Floating-point - arithmetic | `Zfinx` | `fmul.s` | 22 +| Floating-point - compare | `Zfinx` | `fmin.s` `fmax.s` `feq.s` `flt.s` `fle.s` | 13 +| Floating-point - misc | `Zfinx` | `fsgnj.s` `fsgnjn.s` `fsgnjx.s` `fclass.s` | 12 +| Floating-point - conversion | `Zfinx` | `fcvt.w.s` `fcvt.wu.s` | 47 +| Floating-point - conversion | `Zfinx` | `fcvt.s.w` `fcvt.s.wu` | 48 +|======================= + +[NOTE] +The presented values of the *floating-point execution cycles* are average values obtained from +4096 instruction executions using pseudo-random input values. The execution time for emulating the +instructions (using pure-software libraries) is ~17..140 times higher. + + + +// #################################################################################################################### +include::cpu_csr.adoc[] + + + +<<< +// #################################################################################################################### +:sectnums: +==== Execution Safety + +The hardware of the NEORV32 CPU was designed for a maximum of execution safety. If the `Zicsr` CPU +extension is enabled, the core supports all traps specified by the official RISC-V specifications (obviously, +not the ones that are related to yet unimplemented extensions/features). Thus, the CPU provides well-defined +hardware fall-backs for (nearly) everything that can go wrong. 
Even if any kind of trap is triggered, the core +is always in a precise and fully synchronized state throughout the whole architecture (i.e. no need to undo +out-of-order operations) that allows predictable execution behavior at any time. + +Additional and highlighted safety features: + +* The CPU supports all bus exceptions including bus access exceptions that are triggered if an +accessed address does not respond or encounters an internal error during access (which is rare +among open-source RISC-V cores). +* The CPU raises an illegal instruction trap for all unimplemented/malformed/illegal instruction words +(which is also rare among open-source RISC-V cores). +* If user-level code tries to read from a machine-level-only CSR (like `mstatus`) an illegal instruction +exception is raised (→ illegal access). The result of this operation is always zero (though machine-level +code handling this exception can modify the target register of the illegal access-causing +instruction to allow full virtualization). Illegal write accesses to machine CSRs will not be conducted +at all and will only result in raising an illegal instruction exception. +* Illegal user-level memory accesses to protected addresses or address regions (via physical memory +protection) will not be conducted at all (no actual write and no actual read; prevents triggering of +memory-mapped devices). Illegal load operations will not return any data (the instruction’s +destination register will not be written at all). 
+ + + +<<< +// #################################################################################################################### +:sectnums: +==== Traps, Exceptions and Interrupts + +In this document a (maybe) special nomenclature regarding traps is used: + +* _interrupts_ = asynchronous exceptions +* _exceptions_ = synchronous exceptions +* _traps_ = exceptions + interrupts (synchronous or asynchronous exceptions) + +Whenever an exception or interrupt is triggered, the CPU transfers control to the address stored in the `mtvec` +CSR. The cause of the corresponding interrupt or exception can be determined via the content of the `mcause` +CSR. The address that reflected the current program counter when a trap was taken is stored to `mepc`. +Additional information regarding the cause of the trap can be retrieved from `mtval`. + +The traps are prioritized. If several exceptions occur at once only the one with highest priority is triggered. If +several interrupts trigger at once, the one with highest priority is triggered while the remaining ones are +queued. After completing the interrupt handler the interrupt with the second highest priority will be issued and +so on. + + +**Memory Access Exceptions** + +If a load operation causes any exception, the destination register is not written at all. Exceptions caused by a +load address misalignment or a load physical memory protection fault do not trigger a bus read-operation at all. +Exceptions caused by a store address misalignment or a store physical memory protection fault do not trigger +a bus write-operation at all. + + +**Instruction Atomicity** + +All instructions execute as atomic operations: interrupts can only trigger between two instructions. + + +**Custom Fast Interrupt Request Lines** + +As a custom extension, the NEORV32 CPU features 16 fast interrupt request lines via the `firq_i` CPU top +entity signals. These interrupts have custom configuration and status flags in the `mie` and `mip` CSRs and also +provide custom trap codes. 
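To illustrate how a trap handler might classify a trap from the value read out of `mcause`, here is a minimal C sketch. It relies only on the encoding shown in the NEORV32 trap listing of this section: the most significant bit is set for interrupts, and the fast interrupt request channels 0..15 use the interrupt codes 16..31. The helper names (`trap_is_interrupt`, `trap_code`, `trap_firq_channel`) are hypothetical and are not part of the NEORV32 core library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not taken from neorv32.h. */

/* Bit 31 of mcause is set for interrupts (asynchronous) and
 * cleared for exceptions (synchronous). */
static inline bool trap_is_interrupt(uint32_t mcause) {
    return (mcause >> 31) != 0u;
}

/* The remaining 31 bits hold the interrupt/exception code. */
static inline uint32_t trap_code(uint32_t mcause) {
    return mcause & 0x7FFFFFFFu;
}

/* Returns the fast interrupt channel (0..15) if mcause encodes a FIRQ
 * (interrupt codes 16..31), or -1 for any other trap. */
static inline int trap_firq_channel(uint32_t mcause) {
    uint32_t code = trap_code(mcause);
    if (trap_is_interrupt(mcause) && (code >= 16u) && (code <= 31u)) {
        return (int)(code - 16u);
    }
    return -1;
}
```

For example, the `mcause` value `0x80000010` would be classified as an interrupt with FIRQ channel 0, while `0x00000002` (illegal instruction) is an exception and yields -1.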
+ + +<<< +// #################################################################################################################### +:sectnums!: +===== NEORV32 Trap Listing + +.NEORV32 trap listing +[cols="3,6,5,14,11,4,4"] +[options="header",grid="rows"] +|======================= +| Prio. | `mcause` | [RISC-V] | ID [C] | Cause | `mepc` | `mtval` +| 1 | `0x8000000B` | 1.11 | _TRAP_CODE_MEI_ | machine external interrupt | _I-PC_ | _0_ +| 2 | `0x80000003` | 1.3 | _TRAP_CODE_MSI_ | machine software interrupt | _I-PC_ | _0_ +| 3 | `0x80000007` | 1.7 | _TRAP_CODE_MTI_ | machine timer interrupt (from mtime) | _I-PC_ | _0_ +| 4 | `0x80000010` | 1.16 | _TRAP_CODE_FIRQ_0_ | fast interrupt request channel | _I-PC_ | _0_ +| 5 | `0x80000011` | 1.17 | _TRAP_CODE_FIRQ_1_ | fast interrupt request channel | _I-PC_ | _0_ +| 6 | `0x80000012` | 1.18 | _TRAP_CODE_FIRQ_2_ | fast interrupt request channel | _I-PC_ | _0_ +| 7 | `0x80000013` | 1.19 | _TRAP_CODE_FIRQ_3_ | fast interrupt request channel | _I-PC_ | _0_ +| 8 | `0x80000014` | 1.20 | _TRAP_CODE_FIRQ_4_ | fast interrupt request channel | _I-PC_ | _0_ +| 9 | `0x80000015` | 1.21 | _TRAP_CODE_FIRQ_5_ | fast interrupt request channel | _I-PC_ | _0_ +| 10 | `0x80000016` | 1.22 | _TRAP_CODE_FIRQ_6_ | fast interrupt request channel | _I-PC_ | _0_ +| 11 | `0x80000017` | 1.23 | _TRAP_CODE_FIRQ_7_ | fast interrupt request channel | _I-PC_ | _0_ +| 12 | `0x80000018` | 1.24 | _TRAP_CODE_FIRQ_8_ | fast interrupt request channel | _I-PC_ | _0_ +| 13 | `0x80000019` | 1.25 | _TRAP_CODE_FIRQ_9_ | fast interrupt request channel | _I-PC_ | _0_ +| 14 | `0x8000001a` | 1.26 | _TRAP_CODE_FIRQ_10_ | fast interrupt request channel | _I-PC_ | _0_ +| 15 | `0x8000001b` | 1.27 | _TRAP_CODE_FIRQ_11_ | fast interrupt request channel | _I-PC_ | _0_ +| 16 | `0x8000001c` | 1.28 | _TRAP_CODE_FIRQ_12_ | fast interrupt request channel | _I-PC_ | _0_ +| 17 | `0x8000001d` | 1.29 | 
_TRAP_CODE_FIRQ_13_ | fast interrupt request channel | _I-PC_ | _0_ +| 18 | `0x8000001e` | 1.30 | _TRAP_CODE_FIRQ_14_ | fast interrupt request channel | _I-PC_ | _0_ +| 19 | `0x8000001f` | 1.31 | _TRAP_CODE_FIRQ_15_ | fast interrupt request channel | _I-PC_ | _0_ +| 20 | `0x00000001` | 0.1 | _TRAP_CODE_I_ACCESS_ | instruction access fault | _B-ADR_ | _PC_ +| 21 | `0x00000002` | 0.2 | _TRAP_CODE_I_ILLEGAL_ | illegal instruction | _PC_ | _Inst_ +| 22 | `0x00000000` | 0.0 | _TRAP_CODE_I_MISALIGNED_ | instruction address misaligned | _B-ADR_ | _PC_ +| 23 | `0x0000000B` | 0.11 | _TRAP_CODE_MENV_CALL_ | environment call from M-mode (ECALL in machine-mode) | _PC_ | _PC_ +| 24 | `0x00000008` | 0.8 | _TRAP_CODE_UENV_CALL_ | environment call from U-mode (ECALL in user-mode) | _PC_ | _PC_ +| 25 | `0x00000003` | 0.3 | _TRAP_CODE_BREAKPOINT_ | breakpoint (EBREAK) | _PC_ | _PC_ +| 26 | `0x00000006` | 0.6 | _TRAP_CODE_S_MISALIGNED_ | store address misaligned | _B-ADR_ | _B-ADR_ +| 27 | `0x00000004` | 0.4 | _TRAP_CODE_L_MISALIGNED_ | load address misaligned | _B-ADR_ | _B-ADR_ +| 28 | `0x00000007` | 0.7 | _TRAP_CODE_S_ACCESS_ | store access fault | _B-ADR_ | _B-ADR_ +| 29 | `0x00000005` | 0.5 | _TRAP_CODE_L_ACCESS_ | load access fault | _B-ADR_ | _B-ADR_ +|======================= + +**Notes** + +The "Prio." column shows the priority of each trap. The highest priority is 1. The "`mcause`" column shows the +cause ID of the corresponding trap that is written to the `mcause` CSR. The "[RISC-V]" column shows the interrupt/exception code value from the +official RISC-V privileged architecture manual. The "[C]" names are defined by the NEORV32 core library (`sw/lib/include/neorv32.h`) and can +be used in plain C code. 
The "`mepc`" and "`mtval`" columns show the values written to +the `mepc` and `mtval` CSRs when a trap is triggered: + +* _I-PC_ - address of interrupted instruction (instruction has not been executed/completed yet) +* _B-ADR_ - bad memory access address that caused the trap +* _PC_ - address of instruction that caused the trap +* _0_ - zero +* _Inst_ - the faulting instruction itself + + + +<<< +// #################################################################################################################### +:sectnums: +==== Bus Interface + +The CPU provides two independent bus interfaces: one for fetching instructions (`i_bus_*`) and one for +accessing data (`d_bus_*`) via load and store operations. Both interfaces use the same interface protocol. + +:sectnums: +===== Address Space + +The CPU is a 32-bit architecture with separated instruction and data interfaces making it a Harvard +Architecture. Each of these interfaces can access an address space of up to 2^32^ bytes (4GB). The memory +system is based on 32-bit words with a minimal granularity of 1 byte. Please note that the NEORV32 CPU +does not support unaligned memory accesses _in hardware_ – however, a software-based handling can be +implemented as any unaligned memory access will trigger a corresponding exception. + +:sectnums: +===== Interface Signals + +The following table shows the signals of the data and instruction interfaces seen from the CPU +(`*_o` signals are driven by the CPU / outputs, `*_i` signals are read by the CPU / inputs). 
+ +.CPU bus interface +[cols="<2,^1,<7"] +[options="header",grid="rows"] +|======================= +| Signal | Size | Function +| `bus_addr_o` | 32 | access address +| `bus_rdata_i` | 32 | data input for read operations +| `bus_wdata_o` | 32 | data output for write operations +| `bus_ben_o` | 4 | byte enable signal for write operations +| `bus_we_o` | 1 | bus write access +| `bus_re_o` | 1 | bus read access +| `bus_lock_o` | 1 | exclusive access request +| `bus_ack_i` | 1 | accessed peripheral indicates a successful completion of the bus transaction +| `bus_err_i` | 1 | accessed peripheral indicates an error during the bus transaction +| `bus_fence_o` | 1 | this signal is set for one cycle when the CPU executes a data/instruction fence operation +| `bus_priv_o` | 2 | current CPU privilege level +|======================= + +[NOTE] +Currently, there are no pipelined or overlapping operations implemented within the same bus interface. +So only a single transfer request can be "on the fly". + +:sectnums: +===== Protocol + +A bus request is triggered either by the `bus_re_o` signal (for reading data) or by the `bus_we_o` signal (for +writing data). These signals are active for exactly one cycle and initiate either a read or a write transaction. The transaction is +completed when the accessed peripheral sets either the `bus_ack_i` signal (-> successful completion) or the +`bus_err_i` signal (-> failed completion). All these control signals are only active (= high) for one +single cycle. An error indicated via the `bus_err_i` signal during a transfer will trigger the corresponding instruction bus +access fault or load/store bus access fault exception. + +[NOTE] +The transfer can be completed directly in the same cycle as it was initiated (via the `bus_re_o` or `bus_we_o` +signal) if the peripheral sets `bus_ack_i` or `bus_err_i` high for one cycle. However, in order to shorten the critical path such "asynchronous" +completion should be avoided. 
The default processor-internal modules provide exactly **one cycle delay** between initiation and completion of transfers. + +.Bus Keeper: Processor-internal memories and memory-mapped devices with variable / high latency +[IMPORTANT] +Processor-internal peripherals or memories do not have to respond within one cycle after the transfer initiation (= latency > 1 cycle). +However, the bus transaction has to be completed (= acknowledged) within a certain **response time window**. This time window is defined +by the global `max_proc_int_response_time_c` constant (default = 15 cycles) from the processor's VHDL package file (`rtl/core/neorv32_package.vhd`). +It defines the maximum number of cycles after which an _unacknowledged_ processor-internal bus transfer will time out and raise a **bus fault exception**. +The _BUSKEEPER_ hardware module (`rtl/core/neorv32_bus_keeper.vhd`) keeps track of all _internal_ bus transactions. If any bus operation times out +(for example when accessing "address space holes") this unit will issue a bus error to the CPU that will raise the corresponding instruction fetch or data access bus exception. +Note that **the bus keeper does not track external accesses via the external memory bus interface**. However, the external memory bus interface also provides +an _optional_ bus timeout (see section <<_processor_external_memory_interface_wishbone_axi4_lite>>). + +**Exemplary Bus Accesses** + +.Example bus accesses: see read/write access description below +[cols="^2,^2"] +[grid="none"] +|======================= +| image:../figures/cpu_interface_read_long.png[read,300,150] + +Read access +| image:../figures/cpu_interface_write_long.png[write,300,150] + +Write access +|======================= + + +**Write Access** + +For a write access, the accessed address (`bus_addr_o`), the data to be written (`bus_wdata_o`) and the byte +enable signals (`bus_ben_o`) are set when `bus_we_o` goes high. These three signals are kept stable until the +transaction is completed. 
In the example the accessed peripheral cannot answer directly in the next +cycle after issuing. Here, the transaction is successful and the peripheral sets the `bus_ack_i` signal several +cycles after issuing. + +**Read Access** + +For a read access, the accessed address (`bus_addr_o`) is set when `bus_re_o` goes high. The address is kept +stable until the transaction is completed. In the example the accessed peripheral cannot answer +directly in the next cycle after issuing. The peripheral has to apply the read data right in the same cycle as +the bus transaction is completed (here, the transaction is successful and the peripheral sets the `bus_ack_i` +signal). + +**Access Boundaries** + +The instruction interface will always access memory on word (= 32-bit) boundaries even if fetching +compressed (16-bit) instructions. The data interface can access memory on byte (= 8-bit), half-word (= 16-bit) +and word (= 32-bit) boundaries. + +**Exclusive (Atomic) Access** + +The CPU can access memory in an exclusive manner by generating a load-reserved and store-conditional +combination. Normally, these combinations should target the same memory address. + +The CPU starts an exclusive access to memory via the _load-reserved instruction_ (`lr.w`). This instruction +will set the CPU-internal _exclusive access lock_, which directly drives the `d_bus_lock_o` signal. It is the task of +the memory system to manage this exclusive access reservation by storing the corresponding access address and +the source of the access itself (for example via the CPU ID in a multi-core system). + +When the CPU executes a _store-conditional instruction_ (`sc.w`) the _CPU-internal exclusive access lock_ is +evaluated to check if the exclusive access was successful. If the lock is still OK, the instruction will write back +zero and will allow the corresponding store operation to the memory system. 
If the lock is broken, the +instruction will write back non-zero and will not generate an actual memory store operation. + +The CPU-internal exclusive access lock is broken if at least one of the following situations occurs: + +* when executing any other memory-access operation than `lr.w` +* when any trap (sync. or async.) is triggered (for example to force a context switch) +* when the memory system signals a bus error (via the `bus_err_i` signal) + +[TIP] +For more information regarding the SoC-level behavior and requirements of atomic operations see +section <<_processor_external_memory_interface_wishbone_axi4_lite>>. + +**Memory Barriers** + +Whenever the CPU executes a fence instruction, the corresponding interface signal is set high for one cycle +(`d_bus_fence_o` for a _fence_ instruction; `i_bus_fence_o` for a _fencei_ instruction). It is the task of the +memory system to perform the necessary operations (like a cache flush and refill). + + + +<<< +// #################################################################################################################### +:sectnums: +==== CPU Hardware Reset + +In order to reduce routing constraints (and by this the actual hardware requirements), most uncritical +registers of the NEORV32 CPU as well as most registers of the whole NEORV32 Processor do not use **a +dedicated hardware reset**. "Uncritical registers" in this context means that the initial value of these registers +after power-up is not relevant for a defined CPU boot process. + +**Rationale** + +A good example to illustrate the concept of uncritical registers is a pipelined processing engine. Each stage +of the engine features an N-bit _data register_ and a 1-bit _status register_. The status register is set when the +data in the corresponding data register is valid. At the end of the pipeline the status register might trigger a write-back +of the processing result to some kind of memory. 
The initial status of the data registers after power-up is +irrelevant as long as the status registers are all reset to a defined value that indicates there is no valid data in +the pipeline’s data registers. Therefore, the pipeline data registers do not require a dedicated reset as they do not +control the actual operation (in contrast to the status registers). This makes the pipeline data registers from +this example "uncritical registers". + +**NEORV32 CPU Reset** + +In terms of the NEORV32 CPU, there are several pipeline registers, state machine registers and even control +and status registers (CSRs) that do not require a defined initial state to ensure a correct boot process. The +pipeline registers will get initialized by the CPU’s internal state machines, which are initialized from the main +control engine that actually features a defined reset. The initialization of most of the CPU's core CSRs (like +interrupt control) is done by the software (to be more specific, this is done by the `crt0.S` start-up code). + +During the very early boot process (where `crt0.S` is running) there is no chance for undefined behavior due to +the lack of dedicated hardware resets of certain CSRs. For example the machine interrupt-enable CSR (`mie`) +does not provide a dedicated reset. The value after reset of this register is uncritical as interrupts cannot fire +because the global interrupt enable flag in the status register (`mstatus(mie)`) provides a dedicated +hardware reset setting it to low (globally disabling interrupts). + +**Reset Configuration** + +Most CPU-internal registers do feature an asynchronous reset in the VHDL code, but the "don't care" value +(VHDL `'-'`) is used for initialization of the uncritical registers, effectively generating flip-flops without a +reset. However, certain applications or situations (like advanced gate-level / timing simulations) might +require a more deterministic reset state. 
For this case, a defined reset level (reset-to-low) of all registers can +be enabled via a constant in the main VHDL package file (`rtl/core/neorv32_package.vhd`): + +[source,vhdl] +---- +-- dedicated hardware reset for UNCRITICAL registers -- +constant dedicated_reset_c : boolean := false; -- use dedicated hardware reset value +-- for UNCRITICAL registers (FALSE=reset value is irrelevant (might simplify HW), +-- default; TRUE=defined LOW reset value) +---- Index: docs/src_adoc/cpu_csr.adoc =================================================================== --- docs/src_adoc/cpu_csr.adoc (nonexistent) +++ docs/src_adoc/cpu_csr.adoc (revision 57) @@ -0,0 +1,777 @@ +<<< +:sectnums: +=== Control and Status Registers (CSRs) + +The following table shows a summary of all available CSRs. The address field defines the CSR address for +the CSR access instructions. The *[ASM]* name can be used for (inline) assembly code and is directly +understood by the assembler/compiler. The *[C]* names are defined by the NEORV32 core library and can be +used as immediates in plain C code. The *R/W* column shows whether the CSR can be read and/or written. +The NEORV32-specific CSRs are mapped to the official "custom CSRs" CSR address space. + +[IMPORTANT] +The CSRs, the CSR-related instructions as well as the complete exception/interrupt processing +system are only available when the `CPU_EXTENSION_RISCV_Zicsr` generic is _true_. + +[IMPORTANT] +When trying to write to a read-only CSR (like the `time` CSR) or when trying to access a nonexistent +CSR or when trying to access a machine-mode CSR from less-privileged user-mode an +illegal instruction exception is raised. + +[NOTE] +CSR reset value: Please note that most of the CSRs do *NOT* provide a dedicated reset. Hence, +these CSRs are not initialized by a hardware reset and keep an *UNDEFINED* value until they are +explicitly initialized by the software (normally, this is already done by the NEORV32-specific +`crt0.S` start-up code). 
For more information see section <<_cpu_hardware_reset>>. + +**CSR Listing** + +The description of each single CSR provides the following summary: + +.CSR description +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| _Address_ | _Description_ | _ASM alias_ +3+| Reset value: _CSR content after hardware reset_ (also see <<_cpu_hardware_reset>>) +3+| _Detailed description_ +|====== + +.Not Implemented CSRs / CSR Bits +[IMPORTANT] +All CSR bits that are unused / not implemented / not shown are _hardwired to zero_. All CSRs that are not +implemented at all (and are not "disabled" using certain configuration generics) will trigger an exception on +access. The CSRs that are implemented within the NEORV32 might cause an exception if they are disabled. +See the corresponding CSR description for more information. + + +<<< +// #################################################################################################################### +**CSR Listing Notes** + +CSRs with the following notes ... 
+ +* `C` - have or are a custom CPU extension (that is allowed by the RISC-V specs) +* `R` - are read-only (in contrast to the originally specified r/w capability) +* `S` - have a constrained compatibility; for example not all specified bits are available + +.NEORV32 Control and Status Registers (CSRs) +[cols="<1,<2,<2,^1,<3,^1"] +[options="header"] +|======================= +| Address | Name [ASM] | Name [C] | R/W | Function | Note +6+^| **<<_floating_point_csrs>>** +| 0x001 | `fflags` | _CSR_FFLAGS_ | r/w | Floating-point accrued exceptions | +| 0x002 | `frm` | _CSR_FRM_ | r/w | Floating-point dynamic rounding mode | +| 0x003 | `fcsr` | _CSR_FCSR_ | r/w | Floating-point control and status (`frm` + `fflags`) | +6+^| **<<_machine_trap_setup>>** +| 0x300 | `mstatus` | _CSR_MSTATUS_ | r/w | Machine status register | `S` +| 0x301 | `misa` | _CSR_MISA_ | r/- | Machine CPU ISA and extensions | `R` +| 0x304 | `mie` | _CSR_MIE_ | r/w | Machine interrupt enable register | `C` +| 0x305 | `mtvec` | _CSR_MTVEC_ | r/w | Machine trap-handler base address (for ALL traps) | +| 0x306 | `mcounteren` | _CSR_MCOUNTEREN_ | r/w | Machine counter-enable register | `S` +| 0x310 | `mstatush` | _CSR_MSTATUSH_ | r/- | Machine status register – high word | `SR` +6+^| **<<_machine_trap_handling>>** +| 0x340 | `mscratch` | _CSR_MSCRATCH_ | r/w | Machine scratch register | +| 0x341 | `mepc` | _CSR_MEPC_ | r/w | Machine exception program counter | +| 0x342 | `mcause` | _CSR_MCAUSE_ | r/w | Machine trap cause | `C` +| 0x343 | `mtval` | _CSR_MTVAL_ | r/w | Machine bad address or instruction | +| 0x344 | `mip` | _CSR_MIP_ | r/w | Machine interrupt pending register | `C` +6+^| **<<_machine_physical_memory_protection>>** +| 0x3a0 .. 0x3af | `pmpcfg0` .. `pmpcfg15` | _CSR_PMPCFG0_ .. _CSR_PMPCFG15_ | r/w | Physical memory protection config. for region 0..63 | `S` +| 0x3b0 .. 0x3ef | `pmpaddr0` .. `pmpaddr63`| _CSR_PMPADDR0_ .. _CSR_PMPADDR63_ | r/w | Physical memory protection addr. 
register region 0..63 | +6+^| **<<_machine_counters_and_timers>>** +| 0xb00 | `mcycle` | _CSR_MCYCLE_ | r/w | Machine cycle counter low word | +| 0xb02 | `minstret` | _CSR_MINSTRET_ | r/w | Machine instruction-retired counter low word | +| 0xb80 | `mcycleh` | _CSR_MCYCLEH_ | r/w | Machine cycle counter high word | +| 0xb82 | `minstreth` | _CSR_MINSTRETH_ | r/w | Machine instruction-retired counter high word | +| 0xc00 | `cycle` | _CSR_CYCLE_ | r/- | Cycle counter low word | +| 0xc01 | `time` | _CSR_TIME_ | r/- | System time (from MTIME) low word | +| 0xc02 | `instret` | _CSR_INSTRET_ | r/- | Instruction-retired counter low word | +| 0xc80 | `cycleh` | _CSR_CYCLEH_ | r/- | Cycle counter high word | +| 0xc81 | `timeh` | _CSR_TIMEH_ | r/- | System time (from MTIME) high word | +| 0xc82 | `instreth` | _CSR_INSTRETH_ | r/- | Instruction-retired counter high word | +6+^| **<<_hardware_performance_monitors_hpm>>** +| 0x323 .. 0x33f | `mhpmevent3` .. `mhpmevent31` | _CSR_MHPMEVENT3_ .. _CSR_MHPMEVENT31_ | r/w | Machine performance-monitoring event selector 3..31 | `C` +| 0xb03 .. 0xb1f | `mhpmcounter3` .. `mhpmcounter31` | _CSR_MHPMCOUNTER3_ .. _CSR_MHPMCOUNTER31_ | r/w | Machine performance-monitoring counter 3..31 low word | +| 0xb83 .. 0xb9f | `mhpmcounter3h` .. `mhpmcounter31h` | _CSR_MHPMCOUNTER3H_ .. _CSR_MHPMCOUNTER31H_ | r/w | Machine performance-monitoring counter 3..31 high word | +| 0xc03 .. 0xc1f | `hpmcounter3` .. `hpmcounter31` | _CSR_HPMCOUNTER3_ .. _CSR_HPMCOUNTER31_ | r/- | Performance-monitoring counter 3..31 low word | +| 0xc83 .. 0xc9f | `hpmcounter3h` .. `hpmcounter31h` | _CSR_HPMCOUNTER3H_ .. 
_CSR_HPMCOUNTER31H_ | r/- | Performance-monitoring counter 3..31 high word | +6+^| **<<_machine_counter_setup>>** +| 0x320 | `mcountinhibit` | _CSR_MCOUNTINHIBIT_ | r/w | Machine counter-inhibit register | +6+^| **<<_machine_information_registers>>** +| 0xf11 | `mvendorid` | _CSR_MVENDORID_ | r/- | Vendor ID | +| 0xf12 | `marchid` | _CSR_MARCHID_ | r/- | Architecture ID | +| 0xf13 | `mimpid` | _CSR_MIMPID_ | r/- | Machine implementation ID / version | +| 0xf14 | `mhartid` | _CSR_MHARTID_ | r/- | Machine thread ID | +6+^| **<<_neorv32_specific_custom_csrs>>** +| 0xfc0 | `mzext` | _CSR_MZEXT_ | r/- | Available `Z*` CPU extensions | +|======================= + + + +<<< +// #################################################################################################################### +:sectnums: +==== Floating-Point CSRs + +These CSRs are available if the `Zfinx` extension is enabled (`CPU_EXTENSION_RISCV_Zfinx` is _true_). +Otherwise any access to the floating-point CSRs will raise an illegal instruction exception. + + +:sectnums!: +===== **`fflags`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x001 | **Floating-point accrued exceptions** | `fflags` +3+| Reset value: _UNDEFINED_ +3+| The `fflags` CSR is compatible to the RISC-V specifications. It shows the accrued ("accumulated") +exception flags in the lowest 5 bits. This CSR is only available if a floating-point CPU extension is enabled. +See the RISC-V ISA spec for more information. +|====== + + +:sectnums!: +===== **`frm`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x002 | **Floating-point dynamic rounding mode** | `frm` +3+| Reset value: _UNDEFINED_ +3+| The `frm` CSR is compatible to the RISC-V specifications and is used to configure the rounding mode using +the lowest 3 bits. This CSR is only available if a floating-point CPU extension is enabled. See the RISC-V +ISA spec for more information. 
+|====== + + +:sectnums!: +===== **`fcsr`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x003 | **Floating-point control and status register** | `fcsr` +3+| Reset value: _UNDEFINED_ +3+| The `fcsr` CSR is compatible to the RISC-V specifications. It provides combined read/write access to the +`fflags` and `frm` CSRs. This CSR is only available if a floating-point CPU extension is enabled. See the +RISC-V ISA spec for more information. +|====== + + +<<< +// #################################################################################################################### +:sectnums: +==== Machine Trap Setup + +:sectnums!: +===== **`mstatus[h]`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x300 | **Machine status register - low word** | `mstatus` +| 0x310 | **Machine status register - high word** | `mstatush` +3+| Reset value: _0x00000020.00000000_ +3+| The `mstatus` CSR is compatible to the RISC-V specifications. It shows the CPU's current execution state. +The following bits are implemented (all remaining bits are always zero and are read-only). 
+|====== + +.Machine status register +[cols="^1,<3,^1,<5"] +[options="header",grid="rows"] +|======================= +| Bit | Name [C] | R/W | Function +| 12:11 | _CSR_MSTATUS_MPP_H_ : _CSR_MSTATUS_MPP_L_ | r/w | Previous machine privilege level, 11 = machine (M) level, 00 = user (U) level +| 7 | _CSR_MSTATUS_MPIE_ | r/w | Previous machine global interrupt enable flag state +| 6 | _CSR_MSTATUS_UBE_ | r/- | User-mode byte-order (endianness) for load/store operations, always set indicating BIG-endian byte-order (copy of `CSR_MSTATUSH_MBE`); bit is always zero if user-mode is not implemented +| 3 | _CSR_MSTATUS_MIE_ | r/w | Machine global interrupt enable flag +|======================= + +.Machine status register - high word +[cols="^1,<3,^1,<5"] +[options="header",grid="rows"] +|======================= +| Bit | Name [C] | R/W | Function +| 5 | _CSR_MSTATUSH_MBE_ | r/- | Machine-mode byte-order (endianness) for load/store operations, always set indicating BIG-endian byte-order +|======================= + +When entering an exception/interrupt, the `MIE` flag is copied to `MPIE` and cleared afterwards. When leaving +the exception/interrupt (via the `mret` instruction), `MPIE` is copied back to `MIE`. + + +:sectnums!: +===== **`misa`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x301 | **ISA and extensions** | `misa` +3+| Reset value: _configuration dependent_ +3+| The `misa` CSR gives information about the actual CPU features. The lowest 26 bits show the implemented +CPU extensions. The following bits are implemented (all remaining bits are always zero and are read-only). +|====== + +[IMPORTANT] +The `misa` CSR is not fully RISC-V-compatible as it is read-only. Hence, implemented CPU +extensions cannot be switched on/off during runtime. For compatibility reasons any write access to this +CSR is simply ignored and will NOT cause an illegal instruction exception. 
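
Since `misa` can only be read, software typically just queries it. The following is a minimal C sketch (not taken from the NEORV32 software framework) of how a raw `misa` value could be decoded. The helper names `misa_has_ext` and `misa_mxl` are hypothetical, and the register value is passed in as a plain argument instead of being fetched with a CSR read, so the snippet also compiles on a host machine. It relies on the RISC-V convention that extension letter `A` maps to bit 0, `B` to bit 1, and so on up to `Z` at bit 25.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: true if the single-letter extension is flagged in a
   raw misa value. RISC-V maps 'A' to bit 0 ... 'Z' to bit 25. */
static bool misa_has_ext(uint32_t misa, char letter)
{
    if (letter >= 'a' && letter <= 'z') {
        letter = (char)(letter - 'a' + 'A'); /* accept lower-case letters too */
    }
    if (letter < 'A' || letter > 'Z') {
        return false; /* not a valid extension letter */
    }
    return ((misa >> (letter - 'A')) & 1u) != 0u;
}

/* Hypothetical helper: extract the MXL field (bits 31:30); 1 means 32-bit. */
static unsigned misa_mxl(uint32_t misa)
{
    return (unsigned)(misa >> 30) & 3u;
}
```

On the target, the argument would come from whatever CSR read primitive the software framework offers; that call is intentionally left out here because its exact name is not given in this section.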
+ +.Machine ISA and extension register +[cols="^1,<3,^1,<5"] +[options="header",grid="rows"] +|======================= +| Bit | Name [C] | R/W | Function +| 31:30 | _CSR_MISA_MXL_HI_EXT_ : _CSR_MISA_MXL_LO_EXT_ | r/- | 32-bit architecture indicator (always _01_) +| 23 | _CSR_MISA_X_EXT_ | r/- | `X` extension bit is always set to indicate custom non-standard extensions +| 20 | _CSR_MISA_U_EXT_ | r/- | `U` CPU extension (user mode) available, set when _CPU_EXTENSION_RISCV_U_ enabled +| 12 | _CSR_MISA_M_EXT_ | r/- | `M` CPU extension (mul/div) available, set when _CPU_EXTENSION_RISCV_M_ enabled +| 8 | _CSR_MISA_I_EXT_ | r/- | `I` CPU base ISA, cleared when _CPU_EXTENSION_RISCV_E_ enabled +| 4 | _CSR_MISA_E_EXT_ | r/- | `E` CPU extension (embedded) available, set when _CPU_EXTENSION_RISCV_E_ enabled +| 2 | _CSR_MISA_C_EXT_ | r/- | `C` CPU extension (compressed instructions) available, set when _CPU_EXTENSION_RISCV_C_ enabled +| 1 | _CSR_MISA_B_EXT_ | r/- | `B` CPU extension (bit-manipulation) available, set when _CPU_EXTENSION_RISCV_B_ enabled +| 0 | _CSR_MISA_A_EXT_ | r/- | `A` CPU extension (atomic memory access) available, set when _CPU_EXTENSION_RISCV_A_ enabled +|======================= + +[TIP] +Information regarding the available RISC-V Z* _sub-extensions_ (like `Zicsr` or `Zfinx`) can be found in the <<_mzext>> CSR. + + +:sectnums!: +===== **`mie`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x304 | **Machine interrupt-enable register** | `mie` +3+| Reset value: _UNDEFINED_ +3+| The `mie` CSR is compatible to the RISC-V specifications and features custom extensions for the fast +interrupt channels. It is used to enable specific interrupt sources. Please note that interrupts also have to be +globally enabled via the `CSR_MSTATUS_MIE` flag of the `mstatus` CSR. 
The following bits are implemented +(all remaining bits are always zero and are read-only): +|====== + +.Machine interrupt-enable register +[cols="^1,<3,^1,<5"] +[options="header",grid="rows"] +|======================= +| Bit | Name [C] | R/W | Function +| 31:16 | _CSR_MIE_FIRQ15E_ : _CSR_MIE_FIRQ0E_ | r/w | Fast interrupt channel 15..0 enable +| 11 | _CSR_MIE_MEIE_ | r/w | Machine _external_ interrupt enable +| 7 | _CSR_MIE_MTIE_ | r/w | Machine _timer_ interrupt enable (from _MTIME_) +| 3 | _CSR_MIE_MSIE_ | r/w | Machine _software_ interrupt enable +|======================= + + +:sectnums!: +===== **`mtvec`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x305 | **Machine trap-handler base address** | `mtvec` +3+| Reset value: _UNDEFINED_ +3+| The `mtvec` CSR is compatible to the RISC-V specifications. It stores the base address for ALL machine +traps. Thus, it defines the main entry point for exception/interrupt handling regardless of the actual trap +source. The lowest two bits of this register are always zero and cannot be modified (= fixed address mode). +|====== + +.Machine trap-handler base address +[cols="^1,^1,<8"] +[options="header",grid="rows"] +|======================= +| Bit | R/W | Function +| 31:2 | r/w | 4-byte aligned base address of trap base handler +| 1:0 | r/- | Always zero +|======================= + + +:sectnums!: +===== **`mcounteren`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x306 | **Machine counter enable** | `mcounteren` +3+| Reset value: _UNDEFINED_ +3+| The `mcounteren` CSR is compatible to the RISC-V specifications. The bits of this CSR define which +counter/timer CSRs can be accessed (read) from code running in a less-privileged mode. For example, +if user-level code tries to read from a counter/timer CSR without having access, the illegal instruction +exception is raised. The following table shows all implemented bits (all remaining bits are always zero and +are read-only). 
If user mode is not implemented (_CPU_EXTENSION_RISCV_U_ = _false_) all bits of the +`mcounteren` CSR are tied to zero. +|====== + +.Machine counter enable register +[cols="^1,<3,^1,<5"] +[options="header",grid="rows"] +|======================= +| Bit | Name [C] | R/W | Function +| 31:3 | _CSR_MCOUNTEREN_HPM31_ : _CSR_MCOUNTEREN_HPM3_ | r/w | User-level code is allowed to read `hpmcounter*[h]` CSRs when set +| 2 | _CSR_MCOUNTEREN_IR_ | r/w | User-level code is allowed to read `instret[h]` CSRs when set +| 1 | _CSR_MCOUNTEREN_TM_ | r/w | User-level code is allowed to read `time[h]` CSRs when set +| 0 | _CSR_MCOUNTEREN_CY_ | r/w | User-level code is allowed to read `cycle[h]` CSRs when set +|======================= + + + +<<< +// #################################################################################################################### +:sectnums: +==== Machine Trap Handling + +:sectnums!: +===== **`mscratch`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x340 | **Scratch register for machine trap handlers** | `mscratch` +3+| Reset value: _UNDEFINED_ +3+| The `mscratch` CSR is compatible to the RISC-V specifications. It is a general purpose scratch register that +can be used by the exception/interrupt handler. The content of this register after reset is undefined. +|====== + +:sectnums!: +===== **`mepc`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x341 | **Machine exception program counter** | `mepc` +3+| Reset value: _UNDEFINED_ +3+| The `mepc` CSR is compatible to the RISC-V specifications. For exceptions (like an illegal instruction) this +register provides the address of the exception-causing instruction. For interrupts (like a machine timer +interrupt) this register provides the address of the next not-yet-executed instruction. 
+|====== + +:sectnums!: +===== **`mtval`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x343 | **Machine bad address or instruction** | `mtval` +3+| Reset value: _UNDEFINED_ +3+| The `mtval` CSR is compatible to the RISC-V specifications. When a trap is triggered, the CSR shows either +the faulting address (for misaligned/faulting load/stores/fetch) or the faulting instruction itself (for illegal +instructions). For interrupts the CSR is set to zero. +|====== + +.Machine bad address or instruction register +[cols="^5,^5"] +[options="header",grid="rows"] +|======================= +| Trap cause | `mtval` content +| misaligned instruction fetch address or instruction fetch access fault | address of faulting instruction fetch +| breakpoint | program counter (= address) of faulting instruction itself +| misaligned load address, load access fault, misaligned store address or store access fault | program counter (= address) of faulting instruction itself +| illegal instruction | actual instruction word of faulting instruction +| anything else including interrupts | _0x00000000_ (always zero) +|======================= + + +:sectnums!: +===== **`mip`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x344 | **Machine interrupt Pending** | `mip` +3+| Reset value: _UNDEFINED_ +3+| The `mip` CSR is compatible to the RISC-V specifications and provides custom extensions. It shows pending interrupts. Any pending interrupt can +be cleared by writing zero to the according bit(s). The following CSR bits are implemented (all remaining bits are always zero and are read-only). 
+|====== + +.Machine interrupt pending register +[cols="^1,<3,^1,<5"] +[options="header",grid="rows"] +|======================= +| Bit | Name [C] | R/W | Function +| 31:16 | _CSR_MIP_FIRQ15P_ : _CSR_MIP_FIRQ0P_ | r/w | fast interrupt channel 15..0 pending +| 11 | _CSR_MIP_MEIP_ | r/w | machine _external_ interrupt pending +| 7 | _CSR_MIP_MTIP_ | r/w | machine _timer_ interrupt pending +| 3 | _CSR_MIP_MSIP_ | r/w | machine _software_ interrupt pending +|======================= + + +<<< +// #################################################################################################################### +:sectnums: +==== Machine Physical Memory Protection + +The available physical memory protection logic is configured via the _PMP_NUM_REGIONS_ and +_PMP_MIN_GRANULARITY_ top entity generics. _PMP_NUM_REGIONS_ defines the number of implemented +protection regions and thus, the availability of the according `pmpcfg*` and `pmpaddr*` CSRs. + +[TIP] +When accessing a PMP-related CSR beyond _PMP_NUM_REGIONS_, **no illegal instruction +exception** is triggered. The according CSRs are read-only and always return zero. + +[IMPORTANT] +The RISC-V-compatible NEORV32 physical memory protection only implements the _NAPOT_ +(naturally aligned power-of-two region) mode with a minimal region granularity of 8 bytes. + + +:sectnums!: +===== **`pmpcfg`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x3a0 - 0x3af| **Physical memory protection configuration registers** | `pmpcfg0` - `pmpcfg15` +3+| Reset value: _0x00000000_ +3+| The `pmpcfg*` CSRs are compatible to the RISC-V specifications. They are used to configure the protected +regions, where each `pmpcfg*` CSR provides configuration bits for four regions. 
The following bits (for the +first PMP configuration entry) are implemented (all remaining bits are always zero and are read-only): +|====== + +.Physical memory protection configuration register entry +[cols="^1,^3,^1,<11"] +[options="header",grid="rows"] +|======================= +| Bit | RISC-V name | R/W | Function +| 7 | _L_ | r/w | lock bit, can be set – but not be cleared again (only via CPU reset) +| 6:5 | - | r/- | reserved, read as zero +| 4:3 | _A_ | r/w | mode configuration; only OFF (`00`) and NAPOT (`11`) are supported +| 2 | _X_ | r/w | execute permission +| 1 | _W_ | r/w | write permission +| 0 | _R_ | r/w | read permission +|======================= + + +:sectnums!: +===== **`pmpaddr`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x3b0 - 0x3ef| **Physical memory protection address registers** | `pmpaddr0` - `pmpaddr63` +3+| Reset value: _UNDEFINED_ +3+| The `pmpaddr*` CSRs are compatible to the RISC-V specifications. They are used to configure the base +address and the region size. +|====== + +[NOTE] +When configuring PMP make sure to set `pmpaddr*` before activating the according region via +`pmpcfg*`. When changing the PMP configuration, deactivate the according region via `pmpcfg*` +before modifying `pmpaddr*`. + + +<<< +// #################################################################################################################### +:sectnums: +==== (Machine) Counters and Timers + +[IMPORTANT] +The _CPU_CNT_WIDTH_ generic defines the total size of the CPU's `[m]cycle` and `[m]instret` +counter CSRs (low and high words combined); the time CSRs are not affected by this generic. Any +configuration with _CPU_CNT_WIDTH_ less than 64 is not RISC-V compliant. + +[IMPORTANT] +If _CPU_CNT_WIDTH_ is less than 64 (the default value) and greater than or equal to 32, the according +MSBs of `[m]cycleh` and `[m]instreth` are read-only and always read as zero. This configuration +will also set the _ZXSCNT_ flag in the `mzext` CSR. 
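
A practical consequence of a reduced _CPU_CNT_WIDTH_ is that the counters wrap around earlier than a full 64-bit counter would. The following C sketch illustrates how software might account for that when computing an elapsed count. The helper names `cnt_mask` and `cnt_elapsed` are hypothetical, and the counter width is passed in as a plain argument because this section does not describe a way to query it at runtime. The result is only meaningful if the counter wrapped at most once between the two samples.

```c
#include <stdint.h>

/* Hypothetical helper: mask covering the implemented counter bits for a
   total counter width of 0..64 (low and high word combined). */
static uint64_t cnt_mask(unsigned width)
{
    if (width >= 64u) {
        return UINT64_MAX;              /* full 64-bit counter */
    }
    return (UINT64_C(1) << width) - 1u; /* width 0 yields an empty mask */
}

/* Hypothetical helper: elapsed count between two samples of a counter that
   is only 'width' bits wide; tolerates a single wrap-around. */
static uint64_t cnt_elapsed(uint64_t start, uint64_t end, unsigned width)
{
    return (end - start) & cnt_mask(width);
}
```

For example, with a 40-bit counter a start sample of `0xFFFFFFFFF0` and an end sample of `0x10` give an elapsed count of `0x20`.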
+ +[IMPORTANT] +If _CPU_CNT_WIDTH_ is less than 32 and greater than 0, the `[m]cycleh` and `[m]instreth` do not +exist and any access will raise an illegal instruction exception. Furthermore, the according MSBs of +`[m]cycle` and `[m]instret` are read-only and always read as zero. This configuration will also +set the _ZXSCNT_ flag in the `mzext` CSR. + +[IMPORTANT] +If _CPU_CNT_WIDTH_ is 0, the `[m]cycleh`, `[m]cycle`, `[m]instreth` and `[m]instret` do not +exist and any access will raise an illegal instruction exception. This configuration will also set the +_ZXNOCNT_ flag in the `mzext` CSR. + + +:sectnums!: +===== **`cycle[h]`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0xc00 | **Cycle counter - low word** | `cycle` +| 0xc80 | **Cycle counter - high word** | `cycleh` +3+| Reset value: _UNDEFINED_ +3+| The `cycle[h]` CSR is compatible to the RISC-V specifications. It shows the lower/upper 32-bit of the 64-bit cycle +counter. The `cycle[h]` CSR is a read-only shadowed copy of the `mcycle[h]` CSR. +|====== + + +:sectnums!: +===== **`time[h]`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0xc01 | **System time - low word** | `time` +| 0xc81 | **System time - high word** | `timeh` +3+| Reset value: _UNDEFINED_ +3+| The `time[h]` CSR is compatible to the RISC-V specifications. It shows the lower/upper 32-bit of the 64-bit system +time. The system time is generated by the _MTIME_ system timer unit via the CPU `mtime_i` signal. The `time[h]` +CSR is read-only. Change the system time via the _MTIME_ unit. If the processor-internal machine timer _MTIME_ is not implemented (via _IO_MTIME_EN_ = _false_), the +processor's `mtime_i` top entity signal is accessible via the `time[h]` CSRs. 
+|====== + + +:sectnums!: +===== **`instret[h]`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0xc02 | **Instructions-retired counter - low word** | `instret` +| 0xc82 | **Instructions-retired counter - high word** | `instreth` +3+| Reset value: _UNDEFINED_ +3+| The `instret[h]` CSR is compatible to the RISC-V specifications. It shows the lower/upper 32-bit of the 64-bit retired +instructions counter. The `instret[h]` CSR is a read-only shadowed copy of the `minstret[h]` CSR. +|====== + + +:sectnums!: +===== **`mcycle[h]`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0xb00 | **Machine cycle counter - low word** | `mcycle` +| 0xb80 | **Machine cycle counter - high word** | `mcycleh` +3+| Reset value: _UNDEFINED_ +3+| The `mcycle[h]` CSR is compatible to the RISC-V specifications. It shows the lower/upper 32-bit of the 64-bit cycle +counter. The `mcycle[h]` CSR can also be written when in machine mode and is copied to the `cycle[h]` CSR. +|====== + + +:sectnums!: +===== **`minstret[h]`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0xb02 | **Machine instructions-retired counter - low word** | `minstret` +| 0xb82 | **Machine instructions-retired counter - high word** | `minstreth` +3+| Reset value: _UNDEFINED_ +3+| The `minstret[h]` CSR is compatible to the RISC-V specifications. It shows the lower/upper 32-bit of the 64-bit retired +instructions counter. The `minstret[h]` CSR can also be written when in machine mode and is copied to the `instret[h]` CSR. +|====== + + + +<<< +// #################################################################################################################### +:sectnums: +==== Hardware Performance Monitors (HPM) + +The available hardware performance logic is configured via the _HPM_NUM_CNTS_ top entity generic. +_HPM_NUM_CNTS_ defines the number of implemented performance monitors and thus, the availability of the +according `[m]hpmcounter*[h]` and `mhpmevent*` CSRs. 
+ +The total size of the HPMs can be configured before synthesis via the _HPM_CNT_WIDTH_ generic (1..64-bit). + +[TIP] +When accessing an HPM-related CSR beyond _HPM_NUM_CNTS_, **no illegal instruction exception is +triggered**. The according CSRs are read-only and always return zero. + +[NOTE] +The total LSB-aligned HPM counter size (low word CSR + high word CSR) is defined via the +_HPM_CNT_WIDTH_ generic (1..64-bit). If _HPM_CNT_WIDTH_ is less than 64, all unused MSB-aligned +bits are hardwired to zero. + + +:sectnums!: +===== **`mhpmevent`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x323 - 0x33f | **Machine hardware performance monitor event selector** | `mhpmevent3` - `mhpmevent31` +3+| Reset value: _UNDEFINED_ +3+| The `mhpmevent*` CSRs are compatible to the RISC-V specifications. The configuration of these CSRs defines +the architectural events that cause the according `[m]hpmcounter*[h]` counters to increment. All available events are +listed in the table below. If more than one event is selected, the according counter will increment if any of +the enabled events is observed (logical OR). Note that the counter will only increment by 1 step per clock +cycle even if more than one event is observed. If the CPU is in sleep mode, no HPM counter will increment +at all. +|====== 
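
The OR-combination of events can be sketched in a few lines of C. This is an illustrative model only: the bit positions are copied from the HPM event selector table of this section, but the enum is defined locally for the example (it is an assumption that the software framework exposes equivalent names), and the helpers `hpm_event_sel` and `hpm_step` are hypothetical. `hpm_step` models the rule that a counter advances by at most one per clock cycle, no matter how many of the selected events are observed in that cycle.

```c
#include <stdint.h>

/* Bit positions as listed in the HPM event selector table; the enum is
   local to this sketch and only covers a few of the events. */
enum hpm_event_bit {
    HPMCNT_EVENT_CY      = 0,  /* active clock cycle */
    HPMCNT_EVENT_IR      = 2,  /* retired instruction */
    HPMCNT_EVENT_LOAD    = 7,  /* load operation */
    HPMCNT_EVENT_STORE   = 8,  /* store operation */
    HPMCNT_EVENT_BRANCH  = 11, /* conditional branch */
    HPMCNT_EVENT_TBRANCH = 12  /* taken conditional branch */
};

/* Hypothetical helper: build an mhpmevent value selecting two events. */
static uint32_t hpm_event_sel(enum hpm_event_bit a, enum hpm_event_bit b)
{
    return (UINT32_C(1) << a) | (UINT32_C(1) << b);
}

/* Hypothetical model of one clock cycle: the counter increment is 1 if any
   selected event was observed (logical OR), otherwise 0. */
static uint32_t hpm_step(uint32_t selector, uint32_t observed)
{
    return (selector & observed) != 0u ? 1u : 0u;
}
```

Selecting loads and stores together, for instance, yields the selector value `0x180`, which could then be written to one of the implemented `mhpmevent*` CSRs to count memory accesses.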
+ +.HPM event selector +[cols="^1,<3,^1,<5"] +[options="header",grid="rows"] +|======================= +| Bit | Name [C] | R/W | Event +| 0 | _HPMCNT_EVENT_CY_ | r/w | active clock cycle (not in sleep) +| 1 | - | r/- | _not implemented, always read as zero_ +| 2 | _HPMCNT_EVENT_IR_ | r/w | retired instruction +| 3 | _HPMCNT_EVENT_CIR_ | r/w | retired compressed instruction +| 4 | _HPMCNT_EVENT_WAIT_IF_ | r/w | instruction fetch memory wait cycle (if more than 1 cycle memory latency) +| 5 | _HPMCNT_EVENT_WAIT_II_ | r/w | instruction issue pipeline wait cycle (if more than 1 cycle latency), caused by pipeline flushes (like taken branches) +| 6 | _HPMCNT_EVENT_WAIT_MC_ | r/w | multi-cycle ALU operation wait cycle +| 7 | _HPMCNT_EVENT_LOAD_ | r/w | load operation +| 8 | _HPMCNT_EVENT_STORE_ | r/w | store operation +| 9 | _HPMCNT_EVENT_WAIT_LS_ | r/w | load/store memory wait cycle (if more than 1 cycle memory latency) +| 10 | _HPMCNT_EVENT_JUMP_ | r/w | unconditional jump +| 11 | _HPMCNT_EVENT_BRANCH_ | r/w | conditional branch (taken or not taken) +| 12 | _HPMCNT_EVENT_TBRANCH_ | r/w | taken conditional branch +| 13 | _HPMCNT_EVENT_TRAP_ | r/w | entered trap +| 14 | _HPMCNT_EVENT_ILLEGAL_ | r/w | illegal instruction exception +|======================= + + +:sectnums!: +===== **`hpmcounter[h]`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0xc03 - 0xc1f | **Hardware performance monitor - counter low** | `hpmcounter3` - `hpmcounter31` +| 0xc83 - 0xc9f | **Hardware performance monitor - counter high** | `hpmcounter3h` - `hpmcounter31h` +3+| Reset value: _UNDEFINED_ +3+| The `hpmcounter*[h]` CSRs are compatible to the RISC-V specifications. These CSRs provide the lower/upper 32-bit +of arbitrary event counters (64-bit). These CSRs are read-only and provide a shadowed copy of the according +`mhpmcounter*[h]` CSRs. The event(s) that trigger an increment of these counters are selected via the according +`mhpmevent*` CSRs. 
+|====== + + +:sectnums!: +===== **`mhpmcounter[h]`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0xb03 - 0xb1f | **Machine hardware performance monitor - counter low** | `mhpmcounter3` - `mhpmcounter31` +| 0xb83 - 0xb9f | **Machine hardware performance monitor - counter high** | `mhpmcounter3h` - `mhpmcounter31h` +3+| Reset value: _UNDEFINED_ +3+| The `mhpmcounter*[h]` CSRs are compatible to the RISC-V specifications. These CSRs provide the lower/upper +32-bit of arbitrary event counters (64-bit). The `mhpmcounter*[h]` CSRs can also be written and are copied to the +`hpmcounter*[h]` CSRs. The event(s) that trigger an increment of these counters are selected via the according +`mhpmevent*` CSRs. +|====== + + +<<< +// #################################################################################################################### +:sectnums: +==== Machine Counter Setup + +:sectnums!: +===== **`mcountinhibit`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0x320 | **Machine counter-inhibit register** | `mcountinhibit` +3+| Reset value: _UNDEFINED_ +3+| The `mcountinhibit` CSR is compatible to the RISC-V specifications. The bits in this register define which +counter/timer CSRs are allowed to perform an automatic increment. Automatic update is enabled if the +according bit in `mcountinhibit` is cleared. The following bits are implemented (all remaining bits are +always zero and are read-only). 
+|====== + +.Machine counter-inhibit register +[cols="^1,<3,^1,<5"] +[options="header",grid="rows"] +|======================= +| Bit | Name [C] | R/W | Function +| 0 | _CSR_MCOUNTINHIBIT_CY_ | r/w | the `[m]cycle[h]` CSRs will auto-increment with each clock cycle (if CPU is not in sleep state) when cleared +| 2 | _CSR_MCOUNTINHIBIT_IR_ | r/w | the `[m]instret[h]` CSRs will auto-increment with each committed instruction when cleared +| 3:31 | _CSR_MCOUNTINHIBIT_HPM3_ : _CSR_MCOUNTINHIBIT_HPM31_ | r/w | the `[m]hpmcounter*[h]` CSRs will auto-increment according to the configured `mhpmevent*` selector when cleared +|======================= + + +<<< +// #################################################################################################################### +:sectnums: +==== Machine Information Registers + + +:sectnums!: +===== **`mvendorid`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0xf11 | **Machine vendor ID** | `mvendorid` +3+| Reset value: _0x00000000_ +3+| The `mvendorid` CSR is compatible to the RISC-V specifications. It is read-only and always reads zero. +|====== + + +:sectnums!: +===== **`marchid`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0xf12 | **Machine architecture ID** | `marchid` +3+| Reset value: _0x00000013_ +3+| The `marchid` CSR is compatible to the RISC-V specifications. It is read-only and shows the NEORV32 +official _RISC-V open-source architecture ID_ (decimal: 19, 32-bit hexadecimal: 0x00000013). +|====== + + +:sectnums!: +===== **`mimpid`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0xf13 | **Machine implementation ID** | `mimpid` +3+| Reset value: _HW version number_ +3+| The `mimpid` CSR is compatible to the RISC-V specifications. It is read-only and shows the version of the +NEORV32 as BCD-coded number (example: `mimpid` = _0x01020312_ → 01.02.03.12 → version 1.2.3.12). 
+|====== + + +:sectnums!: +===== **`mhartid`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0xf14 | **Machine hardware thread ID** | `mhartid` +3+| Reset value: _HW_THREAD_ID_ generic +3+| The `mhartid` CSR is compatible to the RISC-V specifications. It is read-only and shows the core's hart ID, +which is assigned via the CPU's _HW_THREAD_ID_ generic. +|====== + + + +<<< +// #################################################################################################################### +:sectnums: +==== NEORV32-Specific Custom CSRs + + +:sectnums!: +===== **`mzext`** + +[cols="4,27,>7"] +[frame="topbot",grid="none"] +|====== +| 0xfc0 | **Available Z* extensions** | `mzext` +3+| Reset value: _0x00000000_ +3+| The `mzext` CSR is a custom read-only CSR that shows the implemented Z* extensions. The following bits +are implemented (all remaining bits are always zero). +|====== + +.Available Z* extensions register +[cols="^1,<3,^1,<5"] +[options="header",grid="rows"] +|======================= +| Bit | Name [C] | R/W | Function +| 0 | _CPU_MZEXT_ZICSR_ | r/- | `Zicsr` extension available (enabled via _CPU_EXTENSION_RISCV_Zicsr_ generic) +| 1 | _CPU_MZEXT_ZIFENCEI_ | r/- | `Zifencei` extension available (enabled via _CPU_EXTENSION_RISCV_Zifencei_ generic) +| 2 | _CPU_MZEXT_ZBB_ | r/- | `Zbb` extension available (enabled via _CPU_EXTENSION_RISCV_B_ generic) +| 3 | _CPU_MZEXT_ZBS_ | r/- | `Zbs` extension available (enabled via _CPU_EXTENSION_RISCV_B_ generic) +| 4 | _CPU_MZEXT_ZBA_ | r/- | `Zba` extension available (enabled via _CPU_EXTENSION_RISCV_B_ generic) +| 5 | _CPU_MZEXT_ZFINX_ | r/- | `Zfinx` extension available (enabled via _CPU_EXTENSION_RISCV_Zfinx_ generic) +| 6 | _CPU_MZEXT_ZXSCNT_ | r/- | custom extension: "Small CPU counters": `cycle[h]` & `instret[h]` CSRs have less than 64-bit when set (when _CPU_CNT_WIDTH_ generic is less than 64). 
+| 7 | _CPU_MZEXT_ZXNOCNT_ | r/- | custom extension: "NO CPU counters": `cycle[h]` & `instret[h]` CSRs are not available at all when set (when _CPU_CNT_WIDTH_ generic is 0). +|======================= Index: docs/src_adoc/getting_started.adoc =================================================================== --- docs/src_adoc/getting_started.adoc (nonexistent) +++ docs/src_adoc/getting_started.adoc (revision 57) @@ -0,0 +1,930 @@ +:sectnums: +== Let's Get It Started! + +To make your NEORV32 project run, follow the guides from the upcoming sections. Follow these guides +step by step and in the presented order. + +:sectnums: +=== Toolchain Setup + +There are two possibilities to get the actual RISC-V GCC toolchain: + +1. Download and _build_ the official RISC-V GNU toolchain yourself +2. Download and install a prebuilt version of the toolchain + +[NOTE] +The default toolchain prefix for this project is **`riscv32-unknown-elf`**. Of course you can use any other RISC-V +toolchain (like `riscv64-unknown-elf`) that is capable of emitting code for a `rv32` architecture. Just change the _RISCV_TOOLCHAIN_ variable in the application +makefile(s) according to your needs or define this variable when invoking the makefile. + +[IMPORTANT] +Keep in mind that – for instance – a rv32imc toolchain only provides library code compiled with +compressed (_C_) and `mul`/`div` instructions (_M_)! Hence, this code cannot be executed (without +emulation) on an architecture without these extensions! + + +:sectnums: +==== Building the Toolchain from Scratch + +To build the toolchain yourself you can follow the guide from the official https://github.com/riscv/riscv-gnu-toolchain GitHub page. + +The official RISC-V repository uses submodules. 
You need the `--recursive` option to fetch the submodules +automatically: + +[source,bash] +---- +$ git clone --recursive https://github.com/riscv/riscv-gnu-toolchain +---- + +Download and install the prerequisite standard packages: + +[source,bash] +---- +$ sudo apt-get install autoconf automake autotools-dev curl python3 libmpc-dev libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf libtool patchutils bc zlib1g-dev libexpat-dev +---- + +To build the (Newlib) cross-compiler, pick an install path. If you choose, say, `/opt/riscv`, then add +`/opt/riscv/bin` to your `PATH` variable. + +[source,bash] +---- +$ export PATH=$PATH:/opt/riscv/bin +---- + +Then, simply run the following commands and configuration in the RISC-V GNU toolchain source folder to compile a +`rv32i` toolchain: + +[source,bash] +---- +riscv-gnu-toolchain$ ./configure --prefix=/opt/riscv --with-arch=rv32i --with-abi=ilp32 +riscv-gnu-toolchain$ make +---- + +After a while you will get `riscv32-unknown-elf-gcc` and all of its friends in your `/opt/riscv/bin` folder. + + +:sectnums: +==== Downloading and Installing a Prebuilt Toolchain + +Alternatively, you can download a prebuilt toolchain. + +**Use the Toolchain I Have Built** + +I have compiled the toolchain on a 64-bit x86 Ubuntu (Ubuntu on Windows, actually) and uploaded it to +GitHub. You can directly download the according toolchain archive as a single _zip-file_ within a packed +release from github.com/stnolting/riscv-gcc-prebuilt. + +Unpack the downloaded toolchain archive and copy the content to a location in your file system (e.g. +`/opt/riscv`). More information about downloading and installing my prebuilt toolchains can be found in +the repository's README. + +**Use a Third Party Toolchain** + +Of course you can also use any other prebuilt version of the toolchain. There are a lot of RISC-V GCC packages out there - +even for Windows. 
+
+[IMPORTANT]
+Make sure the toolchain can (also) emit code for a `rv32i` architecture, uses the `ilp32` or `ilp32e` ABI and **was not built** using
+CPU extensions that are not supported by the NEORV32 (like `D`).
+
+
+:sectnums:
+==== Installation
+
+Now you have the binaries. The last step is to add them to your `PATH` environment variable (if you have not
+already done so). Make sure to add the binaries folder (`bin`) of your toolchain.
+
+[source,bash]
+----
+$ export PATH=$PATH:/opt/riscv/bin
+----
+
+You should add this command to your `.bashrc` (if you are using bash) to automatically add the RISC-V
+toolchain at every console start.
+
+:sectnums:
+==== Testing the Installation
+
+To make sure everything works fine, navigate to an example project in the NEORV32 example folder and
+execute the following command:
+
+[source,bash]
+----
+neorv32/sw/example/blink_led$ make check
+----
+
+This will test all the tools required for the NEORV32. Everything is working fine if "Toolchain check OK" appears at the end.
+
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== General Hardware Setup
+
+The following steps are required to generate a bitstream for your FPGA board. If you want to run the
+NEORV32 processor in simulation only, the following steps might also apply.
+
+[TIP]
+Check out the example setups in the `boards` folder (on GitHub: https://github.com/stnolting/neorv32/tree/master/boards), which provides script-based
+demo projects for various FPGA boards.
+
+In this tutorial we will use a test implementation of the processor – using many of the processor's optional
+modules but just propagating the minimal signals to the outer world. Hence, this guide is intended as an
+evaluation or "hello world" project to check out the NEORV32. A little note: The order of the following
+steps might be a little different for your specific EDA tool.
+
+[start=0]
+. 
Create a new project with your FPGA EDA tool of choice.
+. Add all VHDL files from the project's `rtl/core` folder to your project. Make sure to _reference_ the
+files only – do not copy them.
+. Make sure to add all the rtl files to a new library called **`neorv32`**. If your FPGA tool does not
+provide a field to enter the library name, check out the "properties" menu of the rtl files.
+. The `rtl/core/neorv32_top.vhd` VHDL file is the top entity of the NEORV32 processor. If you
+already have a design, instantiate this unit into your design and proceed.
+. If you do not have a design yet and just want to check out the NEORV32 – no problem! In this guide
+we will use a simplified top entity that encapsulates the actual processor top entity: add the
+`rtl/core/top_templates/neorv32_test_setup.vhd` VHDL file to your project too, and
+select it as top entity.
+. This test setup provides a minimal test hardware setup:
+
+.NEORV32 "hello world" test setup
+image::../figures/neorv32_test_setup.png[align=center]
+
+[start=7]
+. This test setup only implements some very basic processor and CPU features. Also, only the
+minimum number of signals is propagated to the outer world. Please note that the reset input signal
+`rstn_i` is **low-active**.
+. The configuration of the NEORV32 processor is done using the generics of the instantiated processor
+top entity. Let's keep things simple at first and use the default configuration:
+
+.Cut-out of `neorv32_test_setup.vhd` showing the processor instance and its configuration
+[source,vhdl]
+----
+neorv32_top_inst: neorv32_top
+generic map (
+  -- General --
+  CLOCK_FREQUENCY   => 100000000, -- in Hz # <1>
+  BOOTLOADER_EN     => true,
+  USER_CODE         => x"00000000",
+  ...
+  -- Internal instruction memory --
+  MEM_INT_IMEM_EN   => true,
+  MEM_INT_IMEM_SIZE => 16*1024, # <2>
+  MEM_INT_IMEM_ROM  => false,
+  -- Internal data memory --
+  MEM_INT_DMEM_EN   => true,
+  MEM_INT_DMEM_SIZE => 8*1024, # <3>
+  ...
+----
+<1> Clock frequency of `clk_i` in Hertz
+<2> Default size of internal instruction memory: 16kB (no need to change that _now_)
+<3> Default size of internal data memory: 8kB (no need to change that _now_)
+
+[start=9]
+. There is one generic that has to be set according to your FPGA / board: The clock frequency of the
+top's clock input signal (`clk_i`). Use the _CLOCK_FREQUENCY_ generic to specify your clock source's
+frequency in Hertz (Hz) (note "1").
+. If you feel like it – or if your FPGA does not provide so many resources – you can modify the
+**memory sizes** (_MEM_INT_IMEM_SIZE_ and _MEM_INT_DMEM_SIZE_ – marked with notes "2" and "3") or even
+exclude certain ISA extensions and peripheral modules from implementation - but as mentioned above, let's keep things
+simple at first and use the standard configuration for now.
+
+[NOTE]
+Keep the internal instruction and data memory sizes in mind – these values are required for setting
+up the software framework in the next section <<_general_software_framework_setup>>.
+
+[start=11]
+. Depending on your FPGA tool of choice, it is time to assign the signals of the test setup top entity to
+the corresponding pins of your FPGA board. All the signals can be found in the entity declaration:
+
+.Entity signals of `neorv32_test_setup.vhd`
+[source,vhdl]
+----
+entity neorv32_test_setup is
+  port (
+    -- Global control --
+    clk_i       : in  std_ulogic := '0'; -- global clock, rising edge
+    rstn_i      : in  std_ulogic := '0'; -- global reset, low-active, async
+    -- GPIO --
+    gpio_o      : out std_ulogic_vector(7 downto 0); -- parallel output
+    -- UART0 --
+    uart0_txd_o : out std_ulogic; -- UART0 send data
+    uart0_rxd_i : in  std_ulogic := '0' -- UART0 receive data
+  );
+end neorv32_test_setup;
+----
+
+[start=12]
+. Attach the clock input `clk_i` to your clock source and connect the reset line `rstn_i` to a button of
+your FPGA board. 
Check whether it is low-active or high-active – the reset signal of the processor is
+**low-active**, so maybe you need to invert the input signal.
+. If possible, connect at least bit `0` of the GPIO output port `gpio_o` to a high-active LED (invert
+the signal when your LEDs are low-active) - this LED will be used as status LED by the bootloader.
+. Finally, connect the primary UART's (UART0) communication signals `uart0_txd_o` and
+`uart0_rxd_i` to your serial host interface (USB-to-serial converter).
+. Perform the project HDL compilation (synthesis, mapping, bitstream generation).
+. Download the generated bitstream into your FPGA ("program" it) and press the reset button (just to
+make sure everything is in sync).
+. Done! If you have assigned the bootloader status LED, it should be
+flashing now and you should receive the bootloader start prompt in your UART console (check the baud rate!).
+
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== General Software Framework Setup
+
+While your synthesis tool is crunching the NEORV32 HDL files, it is time to configure the project's software
+framework for your processor hardware setup.
+
+[start=1]
+. You need to tell the linker the actual size of the processor's instruction and data memories. This always has to be in sync
+with the *hardware memory configuration* (done in section <<_general_hardware_setup>>).
+. Open the NEORV32 linker script `sw/common/neorv32.ld` with a text editor. Right at the
+beginning of the linker script you will find the **MEMORY** configuration showing two regions: `rom` and `ram`
+
+.Cut-out of the linker script `neorv32.ld`: Memory configuration
+[source,c]
+----
+MEMORY
+{
+  rom (rx) : ORIGIN = DEFINED(make_bootloader) ? 0xFFFF0000 : 0x00000000, LENGTH = DEFINED(make_bootloader) ? 
4*1024 : 16*1024 # <1>
+  ram (rwx) : ORIGIN = 0x80000000, LENGTH = 8*1024 # <2>
+}
+----
+<1> Size of internal instruction memory (IMEM): 16kB
+<2> Size of internal data memory (DMEM): 8kB
+
+[WARNING]
+The `rom` region provides conditional assignments (via the _make_bootloader_ symbol) for the _origin_
+and the _length_ configuration depending on whether the executable is built as normal application (for the IMEM) or
+as bootloader code (for the BOOTROM). To modify the IMEM configuration of the `rom` region,
+make sure to **only edit the right-most values** for `ORIGIN` and `LENGTH` (marked with note "1").
+
+[start=3]
+. There are four parameters that are relevant here (only the right-most value for the `rom` section): The _origin_
+and the _length_ of the instruction memory (region name `rom`) and the _origin_ and the _length_ of the data
+memory (region name `ram`). These four parameters always have to be in sync with your hardware memory
+configuration as described in section <<_general_hardware_setup>>.
+
+[IMPORTANT]
+The `rom` _ORIGIN_ parameter has to be equal to the configuration of the NEORV32 `ispace_base_c`
+(default: 0x00000000) VHDL package (`rtl/core/neorv32_package.vhd`) configuration constant. The `ram` _ORIGIN_ parameter has to
+be equal to the configuration of the NEORV32 `dspace_base_c` (default: 0x80000000) VHDL
+package (`rtl/core/neorv32_package.vhd`) configuration constant.
+
+[IMPORTANT]
+The `rom` _LENGTH_ and the `ram` _LENGTH_ parameters have to match the configured memory sizes. For
+instance, if the system does not have any external memories connected, the `rom` _LENGTH_ parameter
+has to be equal to the processor-internal IMEM size (defined via top's _MEM_INT_IMEM_SIZE_ generic)
+and the `ram` _LENGTH_ parameter has to be equal to the processor-internal DMEM size (defined via top's
+_MEM_INT_DMEM_SIZE_ generic).
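
As a sanity check, the sizes from the hardware generics and from the linker script can be compared numerically. The sketch below is only an illustration under assumptions: the helper names `size_bytes` and `check_sync` are made up, and the values are typed in by hand rather than parsed from `neorv32_test_setup.vhd` or `neorv32.ld`.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the NEORV32 sources).

# Evaluate a size expression such as "16*1024" to a byte count.
size_bytes() {
  printf '%d\n' "$(( $1 ))"
}

# Compare a hardware generic value with a linker script LENGTH value.
# Arguments: <label> <generic expression> <linker expression>
check_sync() {
  hw=$(size_bytes "$2")
  ld=$(size_bytes "$3")
  if [ "$hw" -eq "$ld" ]; then
    echo "$1: OK ($hw bytes)"
  else
    echo "$1: MISMATCH (hardware $hw bytes, linker $ld bytes)"
    return 1
  fi
}

# Default configuration from this guide (values entered manually)
check_sync "IMEM/rom" "16*1024" "16*1024"
check_sync "DMEM/ram" "8*1024" "8*1024"
```

With the default values both lines report `OK`. A mismatch makes `check_sync` return a non-zero status, so the check could also be used as a guard in a build script.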
+
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== Application Program Compilation
+
+[start=1]
+. Open a terminal console and navigate to one of the project's example programs. For instance navigate to the
+simple `sw/example/blink_led` example program. This program uses the NEORV32 GPIO unit to display
+an 8-bit counter on the lowest eight bits of the `gpio_o` output port.
+. To compile the project and generate an executable simply execute:
+
+[source,bash]
+----
+neorv32/sw/example/blink_led$ make exe
+----
+
+[start=3]
+. This will compile and link the application sources together with all the included libraries. At the end,
+your application is transformed into an ELF file (`main.elf`). The *NEORV32 image generator* (in `sw/image_gen`) takes this file and creates a
+final executable. The makefile will show the resulting memory utilization and the executable size:
+
+[source,bash]
+----
+neorv32/sw/example/blink_led$ make exe
+Memory utilization:
+   text    data     bss     dec     hex filename
+    852       0       0     852     354 main.elf
+Executable (neorv32_exe.bin) size in bytes:
+864
+----
+
+[start=4]
+. That's it. The `exe` target has created the actual executable `neorv32_exe.bin` in the current
+folder, which is ready to be uploaded to the processor via the bootloader's UART interface.
+
+[TIP]
+The compilation process will also create a `main.asm` assembly listing file in the project directory, which
+shows the actual assembly code of the complete application.
+
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== Uploading and Starting of a Binary Executable Image via UART
+
+You have just created the executable. Now it is time to upload it to the processor. There are basically two
+options to do so.
+
+**Option 1**
+
+The NEORV32 makefiles provide an upload target that allows you to directly upload an executable from the
+command line. Reset the processor and execute:
+
+[source,bash]
+----
+sw/example/blink_led$ make COM_PORT=/dev/ttyUSB1 upload
+----
+
+Replace `/dev/ttyUSB1` with the actual serial port you are using to communicate with the processor. You
+might have to use `sudo make ...` if the targeted device requires elevated access rights.
+
+**Option 2**
+
+The "better" option is to use a standard terminal program to upload an executable. This provides a more
+comfortable way as you can directly interact with the bootloader console. Additionally, using a terminal program
+also allows you to directly communicate with the uploaded application.
+
+[start=1]
+. Connect the primary UART (UART0) interface of your FPGA board to a serial port of your
+computer or use a USB-to-serial adapter.
+. Start a terminal program. In this tutorial, I am using TeraTerm for Windows. You can download it from https://ttssh2.osdn.jp/index.html.en
+
+[WARNING]
+Make sure your terminal program can transfer the executable in raw byte mode without any additional protocol overhead.
+
+[start=3]
+. Open a connection to the corresponding serial port. Configure the terminal according to the
+following parameters:
+
+* 19200 Baud
+* 8 data bits
+* 1 stop bit
+* no parity bits
+* no transmission/flow control protocol! (just raw byte mode)
+* newline on `\r\n` (carriage return & newline)
+
+[start=4]
+. Also make sure that single chars are transmitted without any consecutive "new line" or "carriage
+return" commands (this is highly dependent on your terminal application of choice, TeraTerm only
+sends the raw chars by default).
+. Press the NEORV32 reset button to restart the bootloader. The status LED starts blinking and the
+bootloader intro screen appears in your console. Hurry up and press any key (hit space!) 
to abort the
+automatic boot sequence and to start the actual bootloader user interface console.
+
+.Bootloader console; aborted auto-boot sequence
+[source,bash]
+----
+<< NEORV32 Bootloader >>
+
+BLDV: Mar 23 2021
+HWV:  0x01050208
+CLK:  0x05F5E100
+USER: 0x10000DE0
+MISA: 0x40901105
+ZEXT: 0x00000023
+PROC: 0x0EFF0037
+IMEM: 0x00004000 bytes @ 0x00000000
+DMEM: 0x00002000 bytes @ 0x80000000
+
+Autoboot in 8s. Press key to abort.
+Aborted.
+
+Available commands:
+h: Help
+r: Restart
+u: Upload
+s: Store to flash
+l: Load from flash
+e: Execute
+CMD:>
+----
+
+[start=6]
+. Execute the "Upload" command by typing `u`. Now the bootloader is waiting for a binary executable
+to be sent.
+
+[source,bash]
+----
+CMD:> u
+Awaiting neorv32_exe.bin...
+----
+
+[start=7]
+. Use the "send file" option of your terminal program to transmit the previously generated binary executable `neorv32_exe.bin`.
+. Again, make sure to transmit the executable in raw binary mode (no transfer protocol, no additional
+header data). When using TeraTerm, select the "binary" option in the send file dialog.
+. If everything went fine, OK will appear in your terminal:
+
+[source,bash]
+----
+CMD:> u
+Awaiting neorv32_exe.bin... OK
+----
+
+[start=10]
+. The executable now resides in the instruction memory of the processor. To execute the program right
+now run the "Execute" command by typing `e`:
+
+[source,bash]
+----
+CMD:> u
+Awaiting neorv32_exe.bin... OK
+CMD:> e
+Booting...
+Blinking LED demo program
+----
+
+[start=11]
+. Now you should see the LEDs counting.
+
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== Setup of a New Application Program Project
+
+Done with all the introduction tutorials and those example programs? Then it is time to start your own
+application project!
+
+[start=1]
+. 
The easiest way of creating a *new* project is to make a copy of an *existing* project (like the
+`blink_led` project) inside the `sw/example` folder. This way, all file dependencies are kept and you can
+start coding and compiling.
+. If you want to place the project folder somewhere else you need to adapt the project's makefile. In
+the makefile you will find a variable that keeps the relative or absolute path to the NEORV32 home
+folder. Just modify this variable according to your new project's home location:
+
+[source,makefile]
+----
+# Relative or absolute path to the NEORV32 home folder (use default if not set by user)
+NEORV32_HOME ?= ../../..
+----
+
+[start=3]
+. If your project contains additional source files outside of the project folder, you can add them to the _APP_SRC_ variable:
+
+[source,makefile]
+----
+# User's application sources (add additional files here)
+APP_SRC = $(wildcard *.c) ../somewhere/some_file.c
+----
+
+[start=4]
+. You also need to add the folder containing the include files of your new project to the _APP_INC_ variable (do not forget the `-I` prefix):
+
+[source,makefile]
+----
+# User's application include folders (don't forget the '-I' before each entry)
+APP_INC = -I . -I ../somewhere/include_stuff_folder
+----
+
+[start=5]
+. If you feel like it, you can change the default optimization level:
+
+[source,makefile]
+----
+# Compiler effort
+EFFORT = -Os
+----
+
+[TIP]
+All the assignments made to the makefile variables can also be done "inline" when invoking the makefile. For example: `$ make EFFORT=-Os clean_all exe`
+
+
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== Enabling RISC-V CPU Extensions
+
+Whenever you enable/disable a RISC-V CPU extension via the corresponding _CPU_EXTENSION_RISCV_x_ generic, you need to
+adapt the toolchain configuration so the compiler can actually generate matching code for it.
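
To illustrate the relation between the enabled extensions and the compiler setting, the following sketch assembles an ISA string for the compiler's `-march` option from a list of extension letters. It is an illustration under assumptions: `march_string` is a hypothetical helper and not part of the NEORV32 makefiles, and it only knows single-letter extensions in the order `m a f d q c b v n`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the NEORV32 makefiles):
# build an rv32 ISA string with the extension letters in canonical order.
# Arguments: <base: i or e> [extension letters in any order...]
march_string() {
  base=$1
  shift
  isa="rv32${base}"
  # Walk through the canonical order and append each requested letter once
  for ext in m a f d q c b v n; do
    for req in "$@"; do
      if [ "$req" = "$ext" ]; then
        isa="${isa}${ext}"
        break
      fi
    done
  done
  printf '%s\n' "$isa"
}

# Letters given out of order still yield a canonical string
march_string i c m a   # prints rv32imac
```

A string built this way is what the toolchain configuration has to be adapted to.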
+
+To do so, open the makefile of your project (for example `sw/example/blink_led/makefile`) and scroll to the
+"USER CONFIGURATION" section right at the beginning of the file. You need to modify the _MARCH_ variable and possibly
+the _MABI_ variable according to your CPU hardware configuration.
+
+[source,makefile]
+----
+# CPU architecture and ABI
+MARCH = -march=rv32i # <1>
+MABI = -mabi=ilp32 # <2>
+----
+<1> MARCH = Machine architecture ("ISA string")
+<2> MABI = Machine binary interface
+
+For example, when you enable the RISC-V `C` extension (16-bit compressed instructions) via the _CPU_EXTENSION_RISCV_C_ generic (set _true_) you need
+to add the 'c' extension also to the _MARCH_ ISA string.
+
+You can also override the default _MARCH_ and _MABI_ configurations from the makefile when invoking the makefile:
+
+[source,bash]
+----
+$ make MARCH=-march=rv32ic clean_all all
+----
+
+[NOTE]
+The RISC-V ISA string (for _MARCH_) follows a certain canonical structure:
+`rv32[i/e][m][a][f][d][g][q][c][b][v][n]...` For example `rv32imac` is valid while `rv32icma` is not valid.
+
+
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== Building a Non-Volatile Application without External Boot Memory
+
+The primary purpose of the bootloader is to allow an easy and fast update of the current application. In particular, this is very handy
+during the development stage of a project as you can upload modified programs at any time via the UART.
+Maybe at some time your project has become mature and you want to actually _embed_ your processor
+including the application.
+
+There are two options to provide _non-volatile_ storage of your application. The simplest (but also most constrained) one is to implement the IMEM
+as true ROM to contain your program. 
The second option is to use an external boot memory - this concept is shown in a different section:
+<<_programming_an_external_spi_flash_via_the_bootloader>>.
+
+Using the IMEM as ROM:
+
+* for this boot concept the bootloader is no longer required
+* this concept only works for the internal IMEM (but can be extended to work with external memories coupled via the processor's bus interface)
+* make sure that the memory components (like block RAM) the IMEM is mapped to support an initialization via the bitstream
+
+[start=1]
+. At first, compile your application code by running the `make install` command:
+
+[source,bash]
+----
+neorv32/sw/example/blink_led$ make install
+Memory utilization:
+   text    data     bss     dec     hex filename
+    852       0       0     852     354 main.elf
+Executable (neorv32_exe.bin) size in bytes:
+864
+Installing application image to ../../../rtl/core/neorv32_application_image.vhd
+----
+
+[start=2]
+. The `install` target has created an executable, too, but this time also in the form of a VHDL memory
+initialization file. During synthesis, this initialization will become part of the final FPGA bitstream, which
+in turn initializes the IMEM's memory primitives.
+. To allow a direct boot of this image without interference of the bootloader you _can_ deactivate the implementation of
+the bootloader via the corresponding top entity's generic:
+
+[source,vhdl]
+----
+BOOTLOADER_EN => false, -- implement processor-internal bootloader? # <1>
+----
+<1> Set to _false_ to make the CPU directly boot from the IMEM. In this case the BOOTROM is discarded from the design.
+
+[start=4]
+. When the bootloader is deactivated, the corresponding module (BOOTROM) is removed from the design and the CPU will start booting
+at the base address of the instruction memory space (IMEM base address), making the CPU directly execute your
+application after reset.
+. The IMEM could still be modified, since it is implemented as RAM by default, which might corrupt your
+executable. 
To prevent this and to implement the IMEM as true ROM (and possibly save some
+more hardware resources), activate the "IMEM as ROM" feature using the processor's corresponding top entity
+generic:
+
+[source,vhdl]
+----
+MEM_INT_IMEM_ROM => true, -- implement processor-internal instruction memory as ROM
+----
+
+[start=6]
+. Perform a new synthesis and upload your bitstream. Your application code now resides unchangeable
+in the processor's IMEM and is directly executed after reset.
+
+
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== Customizing the Internal Bootloader
+
+The bootloader provides several configuration options to customize it for your specific applications. The
+most important user-defined configuration options are available as C `#defines` right at the beginning of the
+bootloader source code (`sw/bootloader/bootloader.c`):
+
+.Cut-out from the bootloader source code `bootloader.c`: configuration parameters
+[source,c]
+----
+/** UART BAUD rate */
+#define BAUD_RATE (19200)
+/** Enable auto-boot sequence if != 0 */
+#define AUTOBOOT_EN (1)
+/** Time until the auto-boot sequence starts (in seconds) */
+#define AUTOBOOT_TIMEOUT 8
+/** Set to 0 to disable bootloader status LED */
+#define STATUS_LED_EN (1)
+/** SPI_DIRECT_BOOT_EN: Define/uncomment to enable SPI direct boot */
+//#define SPI_DIRECT_BOOT_EN
+/** Bootloader status LED at GPIO output port */
+#define STATUS_LED (0)
+/** SPI flash boot image base address (warning! address might wrap-around!) 
*/
+#define SPI_FLASH_BOOT_ADR (0x00800000)
+/** SPI flash chip select line at spi_csn_o */
+#define SPI_FLASH_CS (0)
+/** Default SPI flash clock prescaler */
+#define SPI_FLASH_CLK_PRSC (CLK_PRSC_8)
+/** SPI flash sector size in bytes (default = 64kB) */
+#define SPI_FLASH_SECTOR_SIZE (64*1024)
+/** ASCII char to start fast executable upload process */
+#define FAST_UPLOAD_CMD '#'
+----
+
+**Changing the Default Size of the Bootloader ROM**
+
+The NEORV32 default bootloader uses 4kB of storage. This is also the default size of the BOOTROM memory component.
+If your new/modified bootloader exceeds this size, you need to modify the boot ROM configurations.
+
+[start=1]
+. Open the processor's main package file `rtl/core/neorv32_package.vhd` and edit the
+`boot_size_c` constant according to your requirements. The boot ROM size must not exceed 32kB
+and should be a power of two (for optimal hardware mapping).
+
+[source,vhdl]
+----
+-- Bootloader ROM --
+constant boot_size_c : natural := 4*1024; -- bytes
+----
+
+[start=2]
+. Now open the NEORV32 linker script `sw/common/neorv32.ld` and adapt the _LENGTH_ parameter
+of the `rom` region according to your new memory size. `boot_size_c` and the `rom` _LENGTH_ attribute always have to be
+identical. Do **not modify** the _ORIGIN_ of the `rom` section.
+
+[source,c]
+----
+MEMORY
+{
+  rom (rx) : ORIGIN = DEFINED(make_bootloader) ? 0xFFFF0000 : 0x00000000, LENGTH = DEFINED(make_bootloader) ? 4*1024 : 16*1024 # <1>
+  ram (rwx) : ORIGIN = 0x80000000, LENGTH = 8*1024
+}
+----
+<1> Bootloader ROM default size = 4*1024 bytes (**left** value)
+
+[IMPORTANT]
+The `rom` region provides conditional assignments (via symbol `make_bootloader`) for the origin
+and the length depending on whether the executable is built as normal application (for the IMEM) or
+as bootloader code (for the BOOTROM). To modify the BOOTLOADER memory size, make
+sure to edit the first (left) value for the length (note "1").
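
The two size constraints for the boot ROM named above (at most 32kB, preferably a power of two) can be checked before touching `boot_size_c`. This is a minimal sketch under assumptions: `check_boot_size` is a hypothetical helper and not part of the NEORV32 sources, and the sizes are entered by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of the NEORV32 sources):
# check a planned boot ROM size against the constraints from this guide.
# Argument: <size expression in bytes, e.g. "4*1024">
check_boot_size() {
  size=$(( $1 ))
  # Hard limit: the boot ROM must not exceed 32kB
  if [ "$size" -le 0 ] || [ "$size" -gt $(( 32 * 1024 )) ]; then
    echo "$size bytes: out of range (max. 32kB)"
    return 1
  fi
  # Recommendation only: a power of two has exactly one bit set
  if [ $(( size & (size - 1) )) -ne 0 ]; then
    echo "$size bytes: allowed, but not a power of two"
    return 0
  fi
  echo "$size bytes: OK"
}

check_boot_size "4*1024"   # default boot ROM size
check_boot_size "8*1024"   # doubled boot ROM size
```

The same value then has to be entered in both places, the `boot_size_c` constant and the left `LENGTH` value of the `rom` region.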
+
+**Re-Compiling and Re-Installing the Bootloader**
+
+Whenever you have modified the bootloader you need to recompile and re-install it and re-synthesize your design.
+
+[start=1]
+. Compile and install the bootloader using the explicit `bootloader` makefile target.
+
+[source,bash]
+----
+neorv32/sw/bootloader$ make bootloader
+----
+
+[start=2]
+. Now perform a new synthesis / HDL compilation to update the bitstream with the new bootloader
+image (some synthesis tools also allow updating only the BRAM initialization without re-running
+the entire synthesis process).
+
+[NOTE]
+The bootloader is intended to work regardless of the actual NEORV32 hardware configuration –
+especially when it comes to CPU extensions. Hence, the bootloader should be built using the
+minimal `rv32i` ISA only (`rv32e` would be even better).
+
+
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== Programming an External SPI Flash via the Bootloader
+
+As described in section <<_external_spi_flash_for_booting>> the bootloader provides an option to store an application image to an external SPI flash
+and to read this image back for booting. These steps show how to store a
+
+[start=1]
+. At first, reset the NEORV32 processor and wait until the bootloader start screen appears in your terminal program.
+. Abort the auto boot sequence and start the user console by pressing any key.
+. Press `u` to upload the program image that you want to store to the external flash:
+
+[source]
+----
+CMD:> u
+Awaiting neorv32_exe.bin...
+----
+
+[start=4]
+. Send the binary in raw binary mode via your terminal program. When the upload is completed and "OK"
+appears, press `p` to trigger the programming of the flash (do not execute the image via the `e`
+command as this might corrupt the image):
+
+[source]
+----
+CMD:> u
+Awaiting neorv32_exe.bin... OK
+CMD:> p
+Write 0x000013FC bytes to SPI flash @ 0x00800000? 
(y/n)
+----
+
+[start=5]
+. The bootloader shows the size of the executable and the base address inside the SPI flash where the
+executable is going to be stored. A prompt appears: Type `y` to start the programming or type `n` to
+abort. See section <<_external_spi_flash_for_booting>> for more information on how to configure the base address.
+
+[source]
+----
+CMD:> u
+Awaiting neorv32_exe.bin... OK
+CMD:> p
+Write 0x000013FC bytes to SPI flash @ 0x00800000? (y/n) y
+Flashing... OK
+CMD:>
+----
+
+[start=6]
+. If "OK" appears in the terminal line, the programming process was successful. Now you can use the
+auto boot sequence to automatically boot your application from the flash at system start-up without
+any user interaction.
+
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== Simulating the Processor
+
+**Testbench**
+
+The NEORV32 project features a simple default testbench (`sim/neorv32_tb.vhd`) that can be used to simulate
+and test the processor setup. This testbench features a 100MHz clock and enables all optional peripheral and
+CPU extensions except for the `E` extension and the TRNG IO module (that CANNOT be simulated due to its
+combinatorial (looped) oscillator architecture).
+
+The simulation setup is configured via the "User Configuration" section located right at the beginning of
+the testbench's architecture. Each configuration constant provides comments to explain the functionality.
+
+Besides the actual NEORV32 Processor, the testbench also simulates "external" components that are connected
+to the processor's external bus/memory interface. 
These components are:
+
+* an external instruction memory (that also allows booting from it)
+* an external data memory
+* an external memory to simulate "external IO devices"
+* a memory-mapped register to trigger the processor's interrupt signals
+
+The following table shows the base addresses of these four components and their default configuration and
+properties (attributes: `r` = read, `w` = write, `e` = execute, `a` = atomic accesses possible, `8` = byte-accessible, `16` =
+half-word-accessible, `32` = word-accessible).
+
+.Testbench: processor-external memories
+[cols="^4,>3,^5,<11"]
+[options="header",grid="rows"]
+|=======================
+| Base address | Size | Attributes | Description
+| `0x00000000` | `imem_size_c` | `r/w/e, a, 8/16/32` | external IMEM (initialized with application image)
+| `0x80000000` | `dmem_size_c` | `r/w/e, a, 8/16/32` | external DMEM
+| `0xf0000000` | 64 bytes | `r/w/e, !a, 8/16/32` | external "IO" memory, atomic accesses will fail
+| `0xff000000` | 4 bytes | `-/w/-, a, -/-/32` | memory-mapped register to trigger "machine external", "machine software" and "SoC Fast Interrupt" interrupts
+|=======================
+
+The simulated NEORV32 does not use the bootloader and directly boots the current application image (from
+the `rtl/core/neorv32_application_image.vhd` image file). Make sure to use the `all` target of the
+makefile to install your application as VHDL image after compilation:
+
+[source, bash]
+----
+sw/example/blink_led$ make clean_all all
+----
+
+.Simulation-Optimized CPU/Processors Modules
+[NOTE]
+The `sim/rtl_modules` folder provides simulation-optimized versions of certain CPU/processor modules.
+These alternatives can be used to replace the default CPU/processor HDL files to allow faster/easier/more
+efficient simulation. 
**These files are not intended for synthesis!**
+
+**Simulation Console Output**
+
+Data written to the NEORV32 UART0 / UART1 transmitter is sent to a virtual UART receiver implemented
+as part of the testbench. Received chars are sent to the simulator console and are also stored to a log file
+(`neorv32.testbench_uart0.out` for UART0, `neorv32.testbench_uart1.out` for UART1) inside the simulator home folder.
+
+**Faster Simulation Console Output**
+
+When printing data via the UART the communication speed will always be based on the configured BAUD
+rate. For a simulation this might take some time. To have faster output you can enable the **simulation mode**
+of UART0/UART1 (see section <<_primary_universal_asynchronous_receiver_and_transmitter_uart0>>).
+
+ASCII data sent to UART0 will be immediately printed to the simulator console. Additionally, the
+ASCII data is logged in a file (`neorv32.uart0.sim_mode.text.out`) in the simulator home folder. All
+written 32-bit data is also dumped as 8-char hexadecimal value into a file
+(`neorv32.uart0.sim_mode.data.out`) also in the simulator home folder.
+
+ASCII data sent to UART1 will be immediately printed to the simulator console. Additionally, the
+ASCII data is logged in a file (`neorv32.uart1.sim_mode.text.out`) in the simulator home folder. All
+written 32-bit data is also dumped as 8-char hexadecimal value into a file
+(`neorv32.uart1.sim_mode.data.out`) also in the simulator home folder.
+
+You can "automatically" enable the simulation mode of UART0/UART1 when compiling an application. In this case the
+"real" UART0/UART1 transmitter unit is permanently disabled. 
To enable the simulation mode just compile
+and install your application and add _UART0_SIM_MODE_ for UART0 and/or _UART1_SIM_MODE_ for UART1 to
+the compiler's _USER_FLAGS_ variable (do not forget the `-D` prefix flag):
+
+[source, bash]
+----
+sw/example/blink_led$ make USER_FLAGS+=-DUART0_SIM_MODE clean_all all
+----
+
+The provided define will change the default UART0/UART1 setup function in order to set the simulation mode flag in the corresponding UART's control register.
+
+[NOTE]
+The UART simulation output (to file and to screen) outputs "complete lines" at once. A line is
+completed with a line feed (newline, ASCII `\n` = 10).
+
+**Simulation with Xilinx Vivado**
+
+The project features a default Vivado simulation waveform configuration in `sim/vivado`.
+
+**Simulation with GHDL**
+
+To simulate the processor using _GHDL_ navigate to the `sim` folder and run the provided shell script. All arguments are passed to GHDL.
+For example the simulation time can be configured using `--stop-time=4ms` as argument.
+
+[source, bash]
+----
+neorv32/sim$ sh ghdl_sim.sh --stop-time=4ms
+----
+
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== Building the Software Framework Documentation
+
+All core library software sources (libraries `sw/lib`, example programs `sw/example`, ...) are highly documented using _doxygen_.
+To build the documentation by yourself navigate to the project's `docs` folder and run _doxygen_:
+
+[source,bash]
+----
+neorv32/docs$ doxygen Doxyfile
+----
+
+This will generate the `docs/doxygen_build` folder. To view the documentation, open the
+`docs/doxygen_build/html/index.html` file with your browser of choice. Click on the "files" tab to
+see a list of all documented files.
+
+[TIP]
+The documentation is automatically built and deployed to GitHub pages by the CI workflow (https://stnolting.github.io/neorv32/files.html).
+ + + +// #################################################################################################################### +:sectnums: +=== Building this Data Sheet + +This data sheet is written using `asciidoc` and rendered by `asciidoc-pdf`. To build the pdf by yourself navigate +to the project's `docs` folder and execute the data sheet generator script: + +[source,bash] +---- +neorv32/docs$ sh make_datasheet.sh +---- + +This will render all `asciidoc` files from `docs/src_adoc` to generate this document (`docs/NEORV32.pdf`). + + +<<< +// #################################################################################################################### +:sectnums: +=== FreeRTOS Support + +A NEORV32-specific port and a simple demo for FreeRTOS (https://github.com/FreeRTOS/FreeRTOS) are +available in the `sw/example/demo_freeRTOS` folder. + +See the according documentation (`sw/example/demo_freeRTOS/README.md`) for more information. + + + +// #################################################################################################################### +:sectnums: +=== RISC-V Architecture Test Framework + +The NEORV32 Processor passes the according tests provided by the official RISC-V Architecture Test Suite +(V2.0+), which is available online at GitHub: https://github.com/riscv/riscv-arch-test + +All files required for executing the test framework on a simulated instance of the processor (including port +files) are located in the `riscv-arch-test` folder in the root directory of the NEORV32 repository. Take a +look at the provided `riscv-arch-test/README.md` (https://github.com/stnolting/neorv32/blob/master/riscv-arch-test/README.md[online at GitHub]) +file for more information on how to run the tests and how testing is conducted in detail. 
+ Index: docs/src_adoc/neorv32-theme.yml =================================================================== --- docs/src_adoc/neorv32-theme.yml (nonexistent) +++ docs/src_adoc/neorv32-theme.yml (revision 57) @@ -0,0 +1,48 @@ +extends: default +page: + margin: [0.8in, 0.67in, 0.75in, 0.67in] +link: + font-color: #edac00 +image: + align: center +caption: + align: center +running-content: + start-at: toc +header: + height: 0.65in + vertical-align: bottom + image-vertical-align: bottom + font-size: 11 + border-color: #000000 + border-width: 1 + recto: + left: + content: '*The NEORV32 Processor*' + right: + content: '*Visit on https://github.com/stnolting/neorv32[GitHub]*' + verso: + left: + content: '*The NEORV32 Processor*' + right: + content: '*Visit on https://github.com/stnolting/neorv32[GitHub]*' +footer: + start-at: toc + height: 0.75in + font-size: 10 + border-color: #000000 + border-width: 1 + recto: + left: + content: '{page-number} / {page-count}' + center: + content: 'Copyright (c) 2021, Stephan Nolting. All rights reserved.' + right: + content: '{docdate}' + verso: + left: + content: '{page-number} / {page-count}' + center: + content: 'NEORV32 Version: {revnumber}' + right: + content: '{docdate}' Index: docs/src_adoc/neorv32.adoc =================================================================== --- docs/src_adoc/neorv32.adoc (nonexistent) +++ docs/src_adoc/neorv32.adoc (revision 57) @@ -0,0 +1,104 @@ += The NEORV32 RISC-V Processor +:author: Dipl.-Ing. Stephan Nolting +:email: stnolting@gmail.com +:description: A size-optimized, customizable and open-source full-scale 32-bit RISC-V soft-core CPU and SoC written in platform-independent VHDL. 
+:revnumber: v1.5.4.8 +:doctype: book +:sectnums: +:icons: image +:iconsdir: icons +:stem: +:reproducible: +:listing-caption: Listing +:toc: +:toclevels: 4 +:title-logo-image: image:../figures/neorv32_logo_dark.png[pdfwidth=6.25in,align=center] +// Uncomment next line to add a title page (or set doctype to book) +//:title-page: +// Uncomment next line to set page size (default is A4) +//:pdf-page-size: Letter + + +// ------------------------------------------------------------------------------------------------ +// ------------------------------------------------------------------------------------------------ +:sectnums!: +== Proprietary and Legal Notice + +* "GitHub" is a Subsidiary of Microsoft Corporation. +* "Vivado" and "Artix" are trademarks of Xilinx Inc. +* "AXI" and "AXI4-Lite" are trademarks of Arm Holdings plc. +* "ModelSim" is a trademark of Mentor Graphics – A Siemens Business. +* "Quartus Prime" and "Cyclone" are trademarks of Intel Corporation. +* "iCE40", "UltraPlus" and "Radiant" are trademarks of Lattice Semiconductor Corporation. +* "Windows" is a trademark of Microsoft Corporation. +* "Tera Term" copyright by T. Teranishi. +* Timing diagrams made with WaveDrom Editor. +* "NeoPixel" is a trademark of Adafruit Industries. + +Icons from https://www.flaticon.com and made by +link:https://www.freepik.com[Freepik], link:https://www.flaticon.com/authors/good-ware[Good Ware], +link:https://www.flaticon.com/authors/pixel-perfect[Pixel perfect], link:https://www.flaticon.com/authors/vectors-market[Vectors Market] + + +**Limitation of Liability for External Links** + +This document contains links to the websites of third parties ("external links"). As the content of these websites +is not under our control, we cannot assume any liability for such external content. In all cases, the provider of +information of the linked websites is liable for the content and accuracy of the information provided. 
At the +point in time when the links were placed, no infringements of the law were recognizable to us. As soon as an +infringement of the law becomes known to us, we will immediately remove the link in question. + +**Disclaimer** + +This project is released under the BSD 3-Clause license. No copyright infringement +intended. Other implied or used projects might have different licensing – see their documentation to get more information. + + +<<< +:sectnums!: +== BSD 3-Clause License +Copyright (c) 2021, Stephan Nolting. All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, are permitted provided that +the following conditions are met: + +. Redistributions of source code must retain the above copyright notice, this list of conditions and the +following disclaimer. +. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and +the following disclaimer in the documentation and/or other materials provided with the distribution. +. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or +promote products derived from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF + + +========================== +**The NEORV32 Processor Project** + +Copyright (c) 2021, by Dipl.-Ing. Stephan Nolting. All rights reserved. + +HQ: https://github.com/stnolting/neorv32 + +Contact: stnolting@gmail.com + +_made in Hanover, Germany_ +========================== + + +// ------------------------------------------------------------------------------------------------ +// ------------------------------------------------------------------------------------------------ + +include::overview.adoc[] + +include::cpu.adoc[] + +include::soc.adoc[] + +include::software.adoc[] + +include::getting_started.adoc[] Index: docs/src_adoc/overview.adoc =================================================================== --- docs/src_adoc/overview.adoc (nonexistent) +++ docs/src_adoc/overview.adoc (revision 57) @@ -0,0 +1,375 @@ +:sectnums: +== Overview + +The NEORV32footnote:[Pronounced "neo-R-V-thirty-two" or "neo-risc-five-thirty-two" in its long form.] Processor +is a customizable microcontroller-like system on chip (SoC) that is based on the +RISC-V NEORV32 CPU. The processor is intended as a ready-to-go auxiliary processor within larger SoC +designs or as a stand-alone custom microcontroller. Its top entity can be directly synthesized for any target +technology without modifications. + +The system is highly configurable and provides optional common peripherals like embedded memories, +timers, serial interfaces, general purpose IO ports and an external bus interface to connect custom IP like +memories, NoCs and peripherals. 
+ +The software framework of the processor comes with application makefiles, software libraries for all CPU +and processor features, a bootloader, a runtime environment and several example programs – including a port +of the CoreMark MCU benchmark and the official RISC-V architecture test suite. RISC-V GCC is used as +default toolchain (https://github.com/stnolting/riscv-gcc-prebuilt[a prebuilt toolchain is also available on GitHub]). + +The project's change log is available in the https://github.com/stnolting/neorv32/blob/master/CHANGELOG.md[CHANGELOG.md] +file in the root directory of the NEORV32 repository. + + +:sectnums!: +=== Structure + +Chapter <<_neorv32_central_processing_unit_cpu>> + +* instruction set(s) and extensions, instruction timing, control and status registers, traps, exceptions and interrupts, +hardware execution safety, native bus interface + +Chapter <<_neorv32_processor_soc>> + +* top entity signals and configuration generics, address space layout, internal peripheral devices and interrupts, internal +memories and caches, internal bus architecture, external bus interface + +Chapter <<_software_framework>> + +* core libraries, bootloader, makefiles, runtime environment + +Chapter <<_lets_get_it_started>> + +* toolchain installation and setup, hardware setup, software setup, application compilation, simulating the processor + +[TIP] +Links in this document are <<_structure,highlighted>>. 
+ + + +<<< +// #################################################################################################################### +:sectnums: +=== Project Key Features + +* **NEORV32 CPU**: 32-bit `rv32i` RISC-V CPU - passes the official RISC-V architecture tests +* official https://github.com/riscv/riscv-isa-manual/blob/master/marchid.md[RISC-V open source architecture ID] +* optional RISC-V CPU extensions: +** `A` - atomic memory access operations +** `B` - bit-manipulation instructions +** `C` - 16-bit compressed instructions +** `E` - embedded CPU version (reduced register file size) +** `M` - integer multiplication and division hardware +** `U` - less-privileged _user_ mode +** `Zfinx` - single-precision floating-point unit +** `Zicsr` - control and status register access (privileged architecture) +** `Zifencei` - instruction stream synchronization +** `PMP` - physical memory protection +** `HPM` - hardware performance monitors +* **Software framework** +** GCC-based toolchain - prebuilt toolchains available; application compilation based on GNU makefiles +** internal bootloader with serial user interface +** core libraries for high-level usage of the provided functions and peripherals +** runtime environment and several example programs +** doxygen-based documentation of the software framework; a deployed version is available at https://stnolting.github.io/neorv32/files.html +** FreeRTOS port + demos available +* **NEORV32 Processor**: highly-configurable full-scale microcontroller-like processor system / SoC based on the NEORV32 CPU with optional standard peripherals: +** serial interfaces (UARTs, TWI, SPI) +** timers and counters (WDT, MTIME, NCO) +** general purpose IO and PWM and native NeoPixel (c) compatible smart LED interface +** embedded memories / caches for data, instructions and bootloader +** external memory interface (Wishbone or AXI4-Lite) +* fully synchronous design, no latches, no gated clocks +* completely described in behavioral, 
platform-independent VHDL +* small hardware footprint and high operating frequency + + +<<< +// #################################################################################################################### +:sectnums: +=== Project Folder Structure + +................................... +neorv32 - Project home folder +├.ci - Scripts for continuous integration +├boards - Example setups for various FPGA boards +├CHANGELOG.md - Project change log +├docs - Project documentation +│├doxygen_build - Software framework documentation (generated by doxygen) +│├src_adoc - AsciiDoc sources for this document +│└figures - Figures and logos +├riscv-arch-test - Port files for the official RISC-V architecture tests +├rtl - VHDL sources +│├core - Sources of the CPU & SoC +│└top_templates - Alternate/additional top entities/wrappers +├sim - Simulation files +│├ghdl - Simulation scripts for GHDL +│├rtl_modules - Processor modules for simulation-only +│└vivado - Pre-configured Xilinx ISIM waveform +└sw - Software framework + ├bootloader - Sources and scripts for the NEORV32 internal bootloader + ├common - Linker script and crt0.S start-up code + ├example - Various example programs + │└... + ├image_gen - Helper program to generate NEORV32 executables + └lib - Processor core library + ├include - Header files (*.h) + └source - Source files (*.c) +................................... + +[NOTE] +There are further files and folders starting with a dot which – for example – contain +data/configurations only relevant for git or for the continuous integration framework (`.ci`). + + +<<< +// #################################################################################################################### +:sectnums: +=== VHDL File Hierarchy + +All necessary VHDL hardware description files are located in the project's `rtl/core` folder. The top entity +of the entire processor including all the required configuration generics is **`neorv32_top.vhd`**. 
+ +[IMPORTANT] +All core VHDL files from the list below have to be assigned to a new design library named **`neorv32`**. Additional +files, like alternative top entities, can be assigned to any library. + +................................... +neorv32_top.vhd - NEORV32 Processor top entity +├neorv32_boot_rom.vhd - Bootloader ROM +│└neorv32_bootloader_image.vhd - Bootloader boot ROM memory image +├neorv32_busswitch.vhd - Processor bus switch for CPU buses (I&D) +├neorv32_bus_keeper.vhd - Processor-internal bus monitor +├neorv32_icache.vhd - Processor-internal instruction cache +├neorv32_cfs.vhd - Custom functions subsystem +├neorv32_cpu.vhd - NEORV32 CPU top entity +│├neorv32_package.vhd - Processor/CPU main VHDL package file +│├neorv32_cpu_alu.vhd - Arithmetic/logic unit +│├neorv32_cpu_bus.vhd - Bus interface unit + physical memory protection +│├neorv32_cpu_control.vhd - CPU control, exception/IRQ system and CSRs +││└neorv32_cpu_decompressor.vhd - Compressed instructions decoder +│├neorv32_cpu_cp_bitmanip.vhd - Bit manipulation co-processor (B extension) +│├neorv32_cpu_cp_fpu.vhd - Floating-point co-processor (Zfinx extension) +│├neorv32_cpu_cp_muldiv.vhd - Mul/Div co-processor (M extension) +│└neorv32_cpu_regfile.vhd - Data register file +├neorv32_dmem.vhd - Processor-internal data memory +├neorv32_gpio.vhd - General purpose input/output port unit +├neorv32_imem.vhd - Processor-internal instruction memory +│└neorv32_application_image.vhd - IMEM application initialization image +├neorv32_mtime.vhd - Machine system timer +├neorv32_nco.vhd - Numerically-controlled oscillator +├neorv32_neoled.vhd - NeoPixel (TM) compatible smart LED interface +├neorv32_pwm.vhd - Pulse-width modulation controller +├neorv32_spi.vhd - Serial peripheral interface controller +├neorv32_sysinfo.vhd - System configuration information memory +├neorv32_trng.vhd - True random number generator +├neorv32_twi.vhd - Two wire serial interface controller +├neorv32_uart.vhd - Universal async. 
receiver/transmitter +├neorv32_wdt.vhd - Watchdog timer +└neorv32_wb_interface.vhd - External (Wishbone) bus interface +................................... + + +<<< +// #################################################################################################################### +:sectnums: +=== FPGA Implementation Results + +This chapter shows exemplary implementation results of the NEORV32 CPU and Processor. Please note, that +the provided results are just a relative measure as logic functions of different modules might be merged +between entity boundaries, so the actual utilization results might vary a bit. + +:sectnums: +==== CPU + +[cols="<2,<8"] +[grid="topbot"] +|======================= +| Hardware version: | `1.5.3.2` +| Top entity: | `rtl/core/neorv32_cpu.vhd` +|======================= + +[cols="<5,>1,>1,>1,>1,>1"] +[options="header",grid="rows"] +|======================= +| CPU | LEs | FFs | MEM bits | DSPs | _f~max~_ +| `rv32i` | 980 | 409 | 1024 | 0 | 123 MHz +| `rv32i_Zicsr` | 1835 | 856 | 1024 | 0 | 124 MHz +| `rv32im_Zicsr` | 2443 | 1134 | 1024 | 0 | 124 MHz +| `rv32imc_Zicsr` | 2669 | 1149 | 1024 | 0 | 125 MHz +| `rv32imac_Zicsr` | 2685 | 1156 | 1024 | 0 | 124 MHz +| `rv32imac_Zicsr` + `u` | 2698 | 1162 | 1024 | 0 | 124 MHz +| `rv32imac_Zicsr_Zifencei` + `u` | 2715 | 1162 | 1024 | 0 | 122 MHz +| `rv32imac_Zicsr_Zifencei_Zfinx` + `u` | 4004 | 1812 | 1024 | 7 | 121 MHz +|======================= + + +:sectnums: +==== Processor Modules + +[cols="<2,<8"] +[grid="topbot"] +|======================= +| Hardware version: | `1.5.2.4` +| Top entity: | `rtl/core/neorv32_top.vhd` +|======================= + +.Hardware utilization by the processor modules +[cols="<2,<8,>1,>1,>2,>1"] +[options="header",grid="rows"] +|======================= +| Module | Description | LEs | FFs | MEM bits | DSPs +| Boot ROM | Bootloader ROM (4kB) | 3 | 1 | 32768 | 0 +| BUSSWITCH | Bus mux for CPU instr. 
and data interfaces | 65 | 8 | 0 | 0 +| iCACHE | Instruction cache (4 blocks, 256 bytes per block) | 234 | 156 | 8192 | 0 +| CFS | Custom functions subsystem | - | - | - | - +| DMEM | Processor-internal data memory (8kB) | 6 | 2 | 65536 | 0 +| GPIO | General purpose input/output ports | 67 | 65 | 0 | 0 +| IMEM | Processor-internal instruction memory (16kB) | 6 | 2 | 131072 | 0 +| MTIME | Machine system timer | 274 | 166 | 0 | 0 +| NCO | Numerically-controlled oscillator | 254 | 226 | 0 | 0 +| NEOLED | Smart LED Interface (NeoPixel/WS2812) [4xFIFO] | 347 | 309 | 0 | 0 +| PWM | Pulse-width modulation controller | 71 | 69 | 0 | 0 +| SPI | Serial peripheral interface | 138 | 124 | 0 | 0 +| SYSINFO | System configuration information memory | 10 | 10 | 0 | 0 +| TRNG | True random number generator | 132 | 105 | 0 | 0 +| TWI | Two-wire interface | 77 | 44 | 0 | 0 +| UART0/1 | Universal asynchronous receiver/transmitter | 176 | 132 | 0 | 0 +| WDT | Watchdog timer | 60 | 45 | 0 | 0 +| WISHBONE | External memory interface | 129 | 104 | 0 | 0 +|======================= + + +<<< +:sectnums: +==== Exemplary Setups + +[TIP] +Exemplary setups for different technologies and various FPGA boards can be found in the `boards` folder +(https://github.com/stnolting/neorv32/tree/master/boards). + +The following table shows exemplary NEORV32 processor implementation results for different FPGA +platforms. The processor setup uses the default peripheral configuration (like no CFS, no caches and no +TRNG), no external memory interface and only internal instruction and data memories. IMEM uses 16kB +and DMEM uses 8kB memory space. 
+ +[cols="<2,<8"] +[grid="topbot"] +|======================= +| Hardware version: | `1.4.9.0` +|======================= + +.Hardware utilization for exemplary NEORV32 setups +[cols="<4,<5,<4,<4,<3,<3,<3,<4,<4,<3"] +[options="header",grid="rows"] +|======================= +| Vendor | FPGA | Board | Toolchain | CPU | LUT | FF | DSP | Memory | _f_ +| Intel | Cyclone IV `EP4CE22F17-C6N` | Terasic DE0-Nano | Quartus Prime Lite 20.1 | `rv32imc_Zicsr_Zifencei` + `u` + `PMP` | 3813 (17%) | 1890 (8%) | 0 (0%) | Memory bits: 231424 (38%) | 119 MHz +| Lattice | iCE40 UltraPlus `iCE40UP5KSG48I` | Upduino v2.0 | Radiant 2.1 | `rv32ic_Zicsr_Zifencei` + `u` | 4397 (83%) | 1679 (31%) | 0 (0%) | EBR: 12 (40%) SPRAM: 4 (100%) | 22.15 MHz +| Xilinx | Artix-7 `XC7A35TICSG324-1L` | Arty A7-35T | Vivado 2019.2 | `rv32imc_Zicsr_Zifencei` + `u` + `PMP` | 2465 (12%) | 1912 (5%) | 0 (0%) | BRAM: 8 (16%) | 100 MHz +|======================= + +**Notes** + +* The Lattice iCE40 UltraPlus setup uses the FPGA's SPRAM memory primitives for the internal IMEM and DMEM (each 64kB). +* The Upduino and the Arty board have on-board SPI flash memories for storing the FPGA configuration. These devices can also be used by the default NEORV32 bootloader to store and automatically boot an application program after reset (both tested successfully). +* The setups with PMP implement 2 regions with a minimal granularity of 64kB. +* No HPM counters are used. 
+ + +<<< +// #################################################################################################################### +:sectnums: +=== CPU Performance + +:sectnums: +==== CoreMark Benchmark + +.Configuration +[cols="<2,<8"] +[grid="topbot"] +|======================= +| Hardware: | 32kB IMEM, 16kB DMEM, no caches, 100MHz clock +| CoreMark: | 2000 iterations, MEM_METHOD is MEM_STACK +| Compiler: | RISCV32-GCC 10.1.0 +| Peripherals: | UART for printing the results +| Compiler flags: | default, see makefile +|======================= + +The performance of the NEORV32 was tested and evaluated using the https://www.eembc.org/coremark/[CoreMark CPU benchmark]. This +benchmark focuses on testing the capabilities of the CPU core itself rather than the performance of the whole +system. The according source code and the SW project can be found in the `sw/example/coremark` folder. + +The resulting CoreMark score is defined as CoreMark iterations per second. +The execution time is determined via the RISC-V `[m]cycle[h]` CSRs. The relative CoreMark score is +defined as CoreMark score divided by the CPU's clock frequency in MHz. + +:sectnums!: +===== Results + +[cols="<2,<8"] +[grid="topbot"] +|======================= +| Hardware version: | `1.4.9.8` +|======================= + +.CoreMark results +[cols="<4,>1,>1,>1"] +[options="header",grid="rows"] +|======================= +| CPU (incl. `Zicsr`) | Executable size | CoreMark Score | CoreMarks/MHz +| `rv32i` | 28756 bytes | 36.36 | **0.3636** +| `rv32im` | 27516 bytes | 68.97 | **0.6897** +| `rv32imc` | 22008 bytes | 68.97 | **0.6897** +| `rv32imc` + _FAST_MUL_EN_ | 22008 bytes | 86.96 | **0.8696** +| `rv32imc` + _FAST_MUL_EN_ + _FAST_SHIFT_EN_ | 22008 bytes | 90.91 | **0.9091** +|======================= + +[NOTE] +All executables were generated using maximum optimization `-O3`. +The _FAST_MUL_EN_ configuration uses DSPs for the multiplier of the _M_ extension (enabled via the +_FAST_MUL_EN_ generic). 
The _FAST_SHIFT_EN_ configuration uses a barrel shifter for CPU shift +operations (enabled via the _FAST_SHIFT_EN_ generic). + + +<<< +:sectnums: +==== Instruction Timing + +The NEORV32 CPU is based on a multi-cycle architecture. Each instruction is executed in a sequence of +several consecutive micro operations. Hence, each instruction requires several clock cycles to execute. + +The average CPI (cycles per instruction) depends on the instruction mix of a specific application and also on +the available CPU extensions. The following table shows the performance results for successfully (!) running +2000 CoreMark iterations. + +The average CPI is computed by dividing the total number of required clock cycles (only the timed core to +avoid distortion due to IO wait cycles) by the number of executed instructions (`[m]instret[h]` CSRs). The +executables were generated using optimization -O3. + +[cols="<2,<8"] +[grid="topbot"] +|======================= +| Hardware version: | `1.4.9.8` +|======================= + +.CoreMark instruction timing +[cols="<4,>2,>2,>2"] +[options="header",grid="rows"] +|======================= +| CPU (incl. `Zicsr`) | Required clock cycles | Executed instructions | Average CPI +| `rv32i` | 5595750503 | 1466028607 | **3.82** +| `rv32im` | 2966086503 | 598651143 | **4.95** +| `rv32imc` | 2981786734 | 611814918 | **4.87** +| `rv32imc` + _FAST_MUL_EN_ | 2399234734 | 611814918 | **3.92** +| `rv32imc` + _FAST_MUL_EN_ + _FAST_SHIFT_EN_ | 2265135174 | 611814948 | **3.70** +|======================= + +[TIP] +The _FAST_MUL_EN_ configuration uses DSPs for the multiplier of the M extension (enabled via the +_FAST_MUL_EN_ generic). The _FAST_SHIFT_EN_ configuration uses a barrel shifter for CPU shift +operations (enabled via the _FAST_SHIFT_EN_ generic). + +[TIP] +More information regarding the execution time of each implemented instruction can be found in +chapter <<_instruction_timing>>. 
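
As a worked example of the average-CPI definition in the instruction timing section, the figures of the `rv32i` and `rv32imc` + _FAST_MUL_EN_ rows of the CoreMark instruction timing table can be plugged in directly. The numbers are copied from that table; rounding to two decimal places is assumed here.

[latexmath]
++++
\begin{aligned}
\text{CPI}_{\texttt{rv32i}} &= \frac{\text{required clock cycles}}{\text{executed instructions}} = \frac{5595750503}{1466028607} \approx 3.82 \\
\text{CPI}_{\texttt{rv32imc}+\text{FAST\_MUL\_EN}} &= \frac{2399234734}{611814918} \approx 3.92
\end{aligned}
++++

Comparing the plain `rv32imc` row with the _FAST_MUL_EN_ row, the instruction count is identical (611814918) while the cycle count drops from 2981786734 to 2399234734, which is why the average CPI falls from 4.87 to 3.92.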
+ Index: docs/src_adoc/soc.adoc =================================================================== --- docs/src_adoc/soc.adoc (nonexistent) +++ docs/src_adoc/soc.adoc (revision 57) @@ -0,0 +1,925 @@ + +// #################################################################################################################### +:sectnums: +== NEORV32 Processor (SoC) + +The NEORV32 Processor is based on the NEORV32 CPU. Together with common peripheral +interfaces and embedded memories it provides a RISC-V-based full-scale microcontroller-like SoC platform. + +image::../figures/neorv32_processor.png[align=center] + +**Key Features** + +* _optional_ processor-internal data and instruction memories (<<_data_memory_dmem,**DMEM**>>/<<_instruction_memory_imem,**IMEM**>>) + cache (<<_processor_internal_instruction_cache_icache,**iCACHE**>>) +* _optional_ internal bootloader (<<_bootloader_rom_bootrom,**BOOTROM**>>) with UART console & SPI flash boot option +* _optional_ machine system timer (<<_machine_system_timer_mtime,**MTIME**>>), RISC-V-compatible +* _optional_ two independent universal asynchronous receivers and transmitters (<<_primary_universal_asynchronous_receiver_and_transmitter_uart0,**UART0**>>, <<_secondary_universal_asynchronous_receiver_and_transmitter_uart1,**UART1**>>) with optional hardware flow control (RTS/CTS) +* _optional_ 8/16/24/32-bit serial peripheral interface controller (<<_serial_peripheral_interface_controller_spi,**SPI**>>) with 8 dedicated CS lines +* _optional_ two wire serial interface controller (<<_two_wire_serial_interface_controller_twi,**TWI**>>), compatible to the I²C standard +* _optional_ general purpose parallel IO port (<<_general_purpose_input_and_output_port_gpio,**GPIO**>>), 32xOut, 32xIn +* _optional_ 32-bit external bus interface, Wishbone b4 / AXI4-Lite compatible (<<_processor_external_memory_interface_wishbone_axi4_lite,**WISHBONE**>>) +* _optional_ watchdog timer (<<_watchdog_timer_wdt,**WDT**>>) +* _optional_ PWM controller 
with 4 channels and 8-bit duty cycle resolution (<<_pulse_width_modulation_controller_pwm,**PWM**>>) +* _optional_ ring-oscillator-based true random number generator (<<_true_random_number_generator_trng,**TRNG**>>) +* _optional_ custom functions subsystem for custom co-processor extensions (<<_custom_functions_subsystem_cfs,**CFS**>>) +* _optional_ numerically-controlled oscillator (<<_numerically_controlled_oscillator_nco,**NCO**>>) with 3 independent channels +* _optional_ NeoPixel(TM)/WS2812-compatible smart LED interface (<<_smart_led_interface_neoled,**NEOLED**>>) +* system configuration information memory to check HW configuration via software (<<_system_configuration_information_memory_sysinfo,**SYSINFO**>>) + + +<<< +// #################################################################################################################### +:sectnums: +=== Processor Top Entity - Signals + +The following table shows all interface ports of the processor top entity (`rtl/core/neorv32_top.vhd`). +The type of all signals is _std_ulogic_ or _std_ulogic_vector_, respectively. + +[TIP] +A wrapper for the NEORV32 Processor setup providing resolved port signals can be found in +`rtl/top_templates/neorv32_top_stdlogic.vhd`. + +[cols="<3,^2,^2,<11"] +[options="header",grid="rows"] +|======================= +| Signal | Width | Dir. 
| Function +4+^| **Global Control** +| `clk_i` | 1 | in | global clock line, all registers triggering on rising edge +| `rstn_i` | 1 | in | global reset, asynchronous, **low-active** +4+^| **External bus interface (<<_processor_external_memory_interface_wishbone_axi4_lite,WISHBONE>>)** +| `wb_tag_o` | 3 | out | tag (access type identifier) +| `wb_adr_o` | 32 | out | destination address +| `wb_dat_i` | 32 | in | read data +| `wb_dat_o` | 32 | out | write data +| `wb_we_o` | 1 | out | write enable ('0' = read transfer) +| `wb_sel_o` | 4 | out | byte enable +| `wb_stb_o` | 1 | out | strobe +| `wb_cyc_o` | 1 | out | valid cycle +| `wb_lock_o`| 1 | out | exclusive access request +| `wb_ack_i` | 1 | in | transfer acknowledge +| `wb_err_i` | 1 | in | transfer error +4+^| **Advanced memory control signals** +| `fence_o` | 1 | out | indicates an executed _fence_ instruction +| `fencei_o` | 1 | out | indicates an executed _fencei_ instruction +4+^| **General Purpose Inputs & Outputs (<<_general_purpose_input_and_output_port_gpio,GPIO>>)** +| `gpio_o` | 32 | out | general purpose parallel output +| `gpio_i` | 32 | in | general purpose parallel input +4+^| **Primary Universal Asynchronous Receiver/Transmitter (<<_primary_universal_asynchronous_receiver_and_transmitter_uart0,UART0>>)** +| `uart0_txd_o` | 1 | out | UART0 serial transmitter +| `uart0_rxd_i` | 1 | in | UART0 serial receiver +| `uart0_rts_o` | 1 | out | UART0 RX ready to receive new char +| `uart0_cts_i` | 1 | in | UART0 TX allowed to start sending +4+^| **Secondary Universal Asynchronous Receiver/Transmitter (<<_secondary_universal_asynchronous_receiver_and_transmitter_uart1,UART1>>)** +| `uart1_txd_o` | 1 | out | UART1 serial transmitter +| `uart1_rxd_i` | 1 | in | UART1 serial receiver +| `uart1_rts_o` | 1 | out | UART1 RX ready to receive new char +| `uart1_cts_i` | 1 | in | UART1 TX allowed to start sending +4+^| **Serial Peripheral Interface Controller (<<_serial_peripheral_interface_controller_spi,SPI>>)** +| 
`spi_sck_o` | 1 | out | SPI controller clock line +| `spi_sdo_o` | 1 | out | SPI serial data output +| `spi_sdi_i` | 1 | in | SPI serial data input +| `spi_csn_o` | 8 | out | SPI dedicated chip select (low-active) +4+^| **Two-Wire Interface Controller (<<_two_wire_serial_interface_controller_twi,TWI>>)** +| `twi_sda_io` | 1 | inout | TWI serial data line +| `twi_scl_io` | 1 | inout | TWI serial clock line +4+^| **Custom Functions Subsystem (<<_custom_functions_subsystem_cfs,CFS>>)** +| `cfs_in_i` | 32 | in | custom CFS input signal conduit +| `cfs_out_o` | 32 | out | custom CFS output signal conduit +4+^| **Pulse-Width Modulation Channels (<<_pulse_width_modulation_controller_pwm,PWM>>)** +| `pwm_o` | 4 | out | pulse-width modulated channels +4+^| **Numerically-Controlled Oscillator (<<_numerically_controlled_oscillator_nco,NCO>>)** +| `nco_o` | 3 | out | NCO output channels +4+^| **Smart LED Interface - NeoPixel(TM) compatible (<<_smart_led_interface_neoled,NEOLED>>)** +| `neoled_o` | 1 | out | asynchronous serial data output +4+^| **System time input from external MTIME unit** +| `mtime_i` | 32 | in | machine timer time (to `time[h]` CSRs) from external _MTIME_ unit if the processor-internal <<_machine_system_timer_mtime,**MTIME**>> unit is NOT used +4+^| **Interrupts** +| `soc_firq_i` | 6 | in | platform fast interrupt channels (custom) +| `mtime_irq_i` | 1 | in | machine timer interrupt (RISC-V) +| `msw_irq_i` | 1 | in | machine software interrupt (RISC-V) +| `mext_irq_i` | 1 | in | machine external interrupt (RISC-V) +|======================= + + +<<< +// #################################################################################################################### +:sectnums: +=== Processor Top Entity - Generics + +This is a list of all configuration generics of the NEORV32 processor top entity `rtl/core/neorv32_top.vhd`. +The generic name is shown in orange, followed by the type printed in black and concluded by the default +value printed in light gray. 
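
To illustrate how the generics and ports of the top entity fit together, the following is a minimal instantiation sketch, not a definitive setup. It uses only generic and port names taken from this chapter and assumes that the core files are compiled into the `neorv32` library and that every input port left unconnected here has a default value in `neorv32_top.vhd` (an assumption that should be checked against the actual source). The wrapper name `neorv32_minimal_wrapper` and the 100 MHz clock value are hypothetical.

[source,vhdl]
----
library ieee;
use ieee.std_logic_1164.all;

library neorv32; -- assumption: all rtl/core files are compiled into this library

-- hypothetical wrapper entity, for illustration only
entity neorv32_minimal_wrapper is
  port (
    clk_i       : in  std_ulogic; -- global clock, rising edge
    rstn_i      : in  std_ulogic; -- global reset, asynchronous, low-active
    gpio_o      : out std_ulogic_vector(31 downto 0); -- parallel output
    uart0_txd_o : out std_ulogic; -- UART0 serial transmitter
    uart0_rxd_i : in  std_ulogic  -- UART0 serial receiver
  );
end entity;

architecture rtl of neorv32_minimal_wrapper is
begin

  neorv32_top_inst: entity neorv32.neorv32_top
  generic map (
    CLOCK_FREQUENCY       => 100000000, -- clk_i frequency in Hz (example value)
    BOOTLOADER_EN         => true,      -- implement boot ROM with bootloader
    CPU_EXTENSION_RISCV_C => true,      -- compressed instructions
    CPU_EXTENSION_RISCV_M => true       -- integer multiplication and division
  )
  port map (
    clk_i       => clk_i,
    rstn_i      => rstn_i,
    gpio_o      => gpio_o,
    uart0_txd_o => uart0_txd_o,
    uart0_rxd_i => uart0_rxd_i
    -- all remaining ports are left unconnected; this only elaborates if the
    -- unconnected inputs have default values in the top entity (assumption)
  );

end architecture;
----

Generics that are not listed in the generic map keep their default values as documented in the following sections.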
+ +[TIP] +The NEORV32 generics let you configure the system according to your needs. The generics +control the implementation of certain CPU extensions and peripheral modules and can even be used to +optimize the system for certain design goals like minimal area or maximum performance. + +[TIP] +Privileged software can determine the actual CPU and processor configuration via the `misa` and +`mzext` (see <<_machine_trap_setup>> and <<_neorv32_specific_custom_csrs>>) CSRs and via the memory-mapped _SYSINFO_ module (see <<_system_configuration_information_memory_sysinfo>>), +respectively. + +**Generic Description** + +The description of each generic provides the following summary: + +.Generic description +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| _Generic_ | _type_ | _default value_ +3+| _Description_ +|====== + + +// #################################################################################################################### +:sectnums: +==== General + +See section <<_system_configuration_information_memory_sysinfo>> for more information. + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **CLOCK_FREQUENCY** | _natural_ | 0 +3+| The clock frequency of the processor's `clk_i` input port in Hertz (Hz). +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **BOOTLOADER_EN** | _boolean_ | true +3+| Implement the boot ROM, pre-initialized with the bootloader image, when _true_. This will also change the +processor's boot address from the beginning of the instruction memory address space (default = +0x00000000) to the base address of the boot ROM. See section <<_bootloader>> for more information. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **USER_CODE** | _std_ulogic_vector(31 downto 0)_ | x"00000000" +3+| Custom user code that can be read by software via the _SYSINFO_ module. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **HW_THREAD_ID** | _natural_ | 0 +3+| The hart ID of the CPU. 
Can be read via the `mhartid` CSR. Hart IDs must be unique within a system. +|====== + + +// #################################################################################################################### +:sectnums: +==== RISC-V CPU Extensions + +See section <<_instruction_sets_and_extensions>> for more information. + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **CPU_EXTENSION_RISCV_A** | _boolean_ | false +3+| Implement atomic memory access operations when _true_. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **CPU_EXTENSION_RISCV_B** | _boolean_ | false +3+| Implement bit manipulation instructions when _true_. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **CPU_EXTENSION_RISCV_C** | _boolean_ | false +3+| Implement compressed instructions (16-bit) when _true_. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **CPU_EXTENSION_RISCV_E** | _boolean_ | false +3+| Implement the embedded CPU extension (only implement the first 16 data registers) when _true_. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **CPU_EXTENSION_RISCV_M** | _boolean_ | false +3+| Implement integer multiplication and division instructions when _true_. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **CPU_EXTENSION_RISCV_U** | _boolean_ | false +3+| Implement less-privileged user mode when _true_. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **CPU_EXTENSION_RISCV_Zfinx** | _boolean_ | false +3+| Implement the 32-bit single-precision floating-point extension (using integer registers) when _true_. For +more information see section <<_zfinx_single_precision_floating_point_operations>>. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **CPU_EXTENSION_RISCV_Zicsr** | _boolean_ | true +3+| Implement the control and status register (CSR) access instructions when true. 
Note: When this option is +disabled, the complete privileged architecture / trap system will be excluded from synthesis. Hence, no interrupts, no exceptions and +no machine information will be available. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **CPU_EXTENSION_RISCV_Zifencei** | _boolean_ | false +3+| Implement the instruction fetch synchronization instruction _fence.i_. For example, this option is required +for self-modifying code (and/or for i-cache flushes). +|====== + + +// #################################################################################################################### +:sectnums: +==== Extension Options + +See section <<_instruction_sets_and_extensions>> for more information. + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **FAST_MUL_EN** | _boolean_ | false +3+| When this generic is enabled, the multiplier of the `M` extension is realized using DSP blocks instead of an +iterative bit-serial approach. This generic is only relevant when the multiplier and divider CPU extension is +enabled (_CPU_EXTENSION_RISCV_M_ is _true_). +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **FAST_SHIFT_EN** | _boolean_ | false +3+| When this generic is enabled, the shifter unit of the CPU's ALU is implemented as a fast barrel shifter (requiring +more hardware resources). +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **TINY_SHIFT_EN** | _boolean_ | false +3+| If this generic is enabled, the shifter unit of the CPU's ALU is implemented as a (slow but tiny) single-bit iterative shifter +(requires up to 32 clock cycles for a shift operation, but reduces the hardware footprint). The configuration of +this generic is ignored if _FAST_SHIFT_EN_ is _true_. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **CPU_CNT_WIDTH** | _natural_ | 0 +3+| This generic configures the total size of the CPU's `cycle` and `instret` CSRs (low word + high word). 
See +section <<_machine_counters_and_timers>> for more information. Note: Configurations with _CPU_CNT_WIDTH_ +less than 64 are not RISC-V compliant. +|====== + + +// #################################################################################################################### +:sectnums: +==== Physical Memory Protection (PMP) + +See section <<_pmp_physical_memory_protection>> for more information. + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **PMP_NUM_REGIONS** | _natural_ | 0 +3+| Total number of implemented protection regions (0..64). If this generic is zero, no physical memory +protection logic will be implemented at all. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **PMP_MIN_GRANULARITY** | _natural_ | 64*1024 +3+| Minimal region granularity in bytes. Has to be a power of two. Has to be at least 8 bytes. +|====== + + +// #################################################################################################################### +:sectnums: +==== Hardware Performance Monitors (HPM) + +See section <<_hpm_hardware_performance_monitors>> for more information. + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **HPM_NUM_CNTS** | _natural_ | 0 +3+| Total number of implemented hardware performance monitor counters (0..29). If this generic is zero, no +hardware performance monitor logic will be implemented at all. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **HPM_CNT_WIDTH** | _natural_ | 40 +3+| This generic defines the total LSB-aligned size of each HPM counter (size(`[m]hpmcounter*h`) + +size(`[m]hpmcounter*`)). The maximum value is 64, the minimum is 1. If the size is less than 64 bits, the +unused MSB-aligned counter bits are hardwired to zero. 
+|====== + + +// #################################################################################################################### +:sectnums: +==== Internal Instruction Memory + +See sections <<_address_space>> and <<_instruction_memory_imem>> for more information. + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **MEM_INT_IMEM_EN** | _boolean_ | true +3+| Implement processor internal instruction memory (IMEM) when _true_. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **MEM_INT_IMEM_SIZE** | _natural_ | 16*1024 +3+| Size in bytes of the processor internal instruction memory (IMEM). Has no effect when _MEM_INT_IMEM_EN_ is _false_. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **MEM_INT_IMEM_ROM** | _boolean_ | false +3+| Implement processor-internal instruction memory as read-only memory, which will be initialized with the +application image at synthesis time. Has no effect when _MEM_INT_IMEM_EN_ is _false_. +|====== + + +// #################################################################################################################### +:sectnums: +==== Internal Data Memory + +See sections <<_address_space>> and <<_data_memory_dmem>> for more information. + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **MEM_INT_DMEM_EN** | _boolean_ | true +3+| Implement processor internal data memory (DMEM) when _true_. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **MEM_INT_DMEM_SIZE** | _natural_ | 8*1024 +3+| Size in bytes of the processor-internal data memory (DMEM). Has no effect when _MEM_INT_DMEM_EN_ is _false_. +|====== + + +// #################################################################################################################### +:sectnums: +==== Internal Cache Memory + +See section <<_processor_internal_instruction_cache_icache>> for more information. 
+ +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **ICACHE_EN** | _boolean_ | false +3+| Implement processor internal instruction cache when _true_. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **ICACHE_NUM_BLOCK** | _natural_ | 4 +3+| Number of blocks (cache "pages" or "lines") in the instruction cache. Has to be a power of two. Has no +effect when _ICACHE_EN_ is _false_. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **ICACHE_BLOCK_SIZE** | _natural_ | 64 +3+| Size in bytes of each block in the instruction cache. Has to be a power of two. Has no effect when +_ICACHE_EN_ is _false_. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **ICACHE_ASSOCIATIVITY** | _natural_ | 1 +3+| Associativity (= number of sets) of the instruction cache. Has to be a power of two. Allowed configurations: +`1` = 1 set, direct mapped; `2` = 2-way set-associative. Has no effect when _ICACHE_EN_ is _false_. +|====== + + +// #################################################################################################################### +:sectnums: +==== External Memory Interface + +See sections <<_address_space>> and <<_processor_external_memory_interface_wishbone_axi4_lite>> for more information. + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **MEM_EXT_EN** | _boolean_ | false +3+| Implement external bus interface (WISHBONE) when _true_. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **MEM_EXT_TIMEOUT** | _natural_ | 255 +3+| Number of clock cycles after which a pending external bus access will auto-terminate and raise a bus fault exception. Set to 0 to disable auto-timeout. +|====== + + +// #################################################################################################################### +:sectnums: +==== Processor Peripheral/IO Modules + +See section <<_processor_internal_modules>> for more information. 
+ +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **IO_GPIO_EN** | _boolean_ | true +3+| Implement general purpose input/output port unit (GPIO) when _true_. +See section <<_general_purpose_input_and_output_port_gpio>> for more information. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **IO_MTIME_EN** | _boolean_ | true +3+| Implement machine system timer (MTIME) when _true_. +See section <<_machine_system_timer_mtime>> for more information. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **IO_UART0_EN** | _boolean_ | true +3+| Implement primary universal asynchronous receiver/transmitter (UART0) when _true_. +See section <<_primary_universal_asynchronous_receiver_and_transmitter_uart0>> for +more information. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **IO_UART1_EN** | _boolean_ | true +3+| Implement secondary universal asynchronous receiver/transmitter (UART1) when _true_. +See section <<_secondary_universal_asynchronous_receiver_and_transmitter_uart1>> for more information. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **IO_SPI_EN** | _boolean_ | true +3+| Implement serial peripheral interface controller (SPI) when _true_. +See section <<_serial_peripheral_interface_controller_spi>> for more information. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **IO_TWI_EN** | _boolean_ | true +3+| Implement two-wire interface controller (TWI) when _true_. +See section <<_two_wire_serial_interface_controller_twi>> for +more information. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **IO_PWM_EN** | _boolean_ | true +3+| Implement pulse-width modulation controller (PWM) when _true_. +See section <<_pulse_width_modulation_controller_pwm>> for more information. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **IO_WDT_EN** | _boolean_ | true +3+| Implement watchdog timer (WDT) when _true_. 
See section <<_watchdog_timer_wdt>> for more +information. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **IO_TRNG_EN** | _boolean_ | false +3+| Implement true-random number generator (TRNG) when _true_. See section <<_true_random_number_generator_trng>> for more information. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **IO_CFS_EN** | _boolean_ | false +3+| Implement custom functions subsystem (CFS) when _true_. See section <<_custom_functions_subsystem_cfs>> for more information. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **IO_CFS_CONFIG** | _std_ulogic_vector(31 downto 0)_ | x"00000000" +3+| This is a "conduit" generic that can be used to pass user-defined CFS implementation flags to the custom +functions subsystem entity. See section <<_custom_functions_subsystem_cfs>> for more information. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **IO_CFS_IN_SIZE** | _positive_ | 32 +3+| Defines the size of the CFS input signal conduit (`cfs_in_i`). See section <<_custom_functions_subsystem_cfs>> for more information. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **IO_CFS_OUT_SIZE** | _positive_ | 32 +3+| Defines the size of the CFS output signal conduit (`cfs_out_o`). See section <<_custom_functions_subsystem_cfs>> for more information. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **IO_NCO_EN** | _boolean_ | true +3+| Implement numerically-controlled oscillator (NCO) when _true_. +See section <<_numerically_controlled_oscillator_nco>> for more information. +|====== + + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **IO_NEOLED_EN** | _boolean_ | true +3+| Implement smart LED interface (WS2812 / NeoPixel(TM)-compatible) (NEOLED) when _true_. +See section <<_smart_led_interface_neoled>> for more information. 
+|====== + + +<<< +// #################################################################################################################### +:sectnums: +=== Processor Interrupts + +**RISC-V Standard Interrupts** + +The processor setup features the standard RISC-V interrupt lines for "machine timer interrupt", "machine +software interrupt" and "machine external interrupt". The software and external interrupt lines are available +via the processor's top entity. By default, the timer interrupt is connected to the internal machine system +timer unit MTIME (<<_machine_system_timer_mtime>>). If this module has not been enabled for +synthesis, the machine timer interrupt is also available via the processor's top entity. + +**NEORV32-Specific Fast Interrupt Requests** + +As part of the custom/NEORV32-specific CPU extensions, the CPU features 16 fast interrupt request signals +(`FIRQ0` – `FIRQ15`). + +[TIP] +The fast interrupt request signals have custom `mie` CSR bits (see <<_machine_trap_setup>>), custom +`mip` CSR bits (see <<_machine_trap_handling>>) and custom `mcause` CSR trap codes and trap +priorities (see <<_traps_exceptions_and_interrupts>>). + +The fast interrupt request signals are divided into two groups. The FIRQs with higher priority (FIRQ0 – +FIRQ9) are dedicated for processor-internal usage. The FIRQs with lower priority (FIRQ10 – FIRQ15) are +available for custom usage via the processor's top entity signal `soc_firq_i`. 
+ +The mapping of the 16 FIRQ channels is shown in the following table (the channel number corresponds to the FIRQ priority): + +.NEORV32 fast interrupt channel mapping +[cols="^1,<2,<7"] +[options="header",grid="rows"] +|======================= +| Channel | Source | Description +| 0 | _WDT_ | watchdog timeout interrupt +| 1 | _CFS_ | custom functions subsystem (CFS) interrupt (user-defined) +| 2 | _UART0_ (RXD) | UART0 data received interrupt (RX complete) +| 3 | _UART0_ (TXD) | UART0 sending done interrupt (TX complete) +| 4 | _UART1_ (RXD) | UART1 data received interrupt (RX complete) +| 5 | _UART1_ (TXD) | UART1 sending done interrupt (TX complete) +| 6 | _SPI_ | SPI transmission done interrupt +| 7 | _TWI_ | TWI transmission done interrupt +| 8 | _GPIO_ | GPIO input pin-change interrupt +| 9 | _NEOLED_ | NEOLED buffer TX empty / not full interrupt +| 10:15 | `soc_firq_i(5:0)` | Custom platform use; available via processor's top signal +|======================= + + +<<< +// #################################################################################################################### +:sectnums: +=== Address Space + +By default, the total 32-bit (4GB) address space of the NEORV32 Processor is divided into four main regions: + +1. Instruction memory (IMEM) space – for instructions and constants. +2. Data memory (DMEM) space – for application runtime data (heap, stack, etc.). +3. Bootloader ROM address space – for the processor-internal bootloader. +4. IO/peripheral address space – for the processor-internal IO/peripheral devices (e.g., UART). + +.NEORV32 processor - address space (default configuration) +image::../figures/address_space.png[900] + +**Address Space Layout** + +The general address space layout consists of two main configuration constants: `ispace_base_c` defining +the base address of the instruction memory address space and `dspace_base_c` defining the base address of +the data memory address space. 
Both constants are defined in the NEORV32 VHDL package file +`rtl/core/neorv32_package.vhd`: + +[source,vhdl] +---- +-- Architecture Configuration ---------------------------------------------------- +-- ---------------------------------------------------------------------------------- +constant ispace_base_c : std_ulogic_vector(31 downto 0) := x"00000000"; +constant dspace_base_c : std_ulogic_vector(31 downto 0) := x"80000000"; +---- + +The default configuration assumes the instruction memory address space starting at address _0x00000000_ +and the data memory address space starting at _0x80000000_. Both values can be modified for a specific +setup and the address space may overlap or can be completely identical. + +The base address of the bootloader (at _0xFFFF0000_) and the IO region (at _0xFFFFFF00_) for the peripheral +devices are also defined in the package and are fixed. These address regions cannot be used for other +applications – even if the bootloader or all IO devices are not implemented. + +[WARNING] +When using the processor-internal data and/or instruction memories (DMEM/IMEM) and using a non-default +configuration for the `dspace_base_c` and/or `ispace_base_c` base addresses, the +following requirements have to be fulfilled: +**1.** Both base addresses have to be aligned to a 4-byte boundary. +**2.** Both base addresses have to be aligned to the according internal memory sizes. + +:sectnums: +==== CPU Data and Instruction Access + +The CPU can access all of the 4GB address space from the instruction fetch interface (**I**) and also from the +data access interface (**D**). These two CPU interfaces are multiplexed by a simple bus switch +(`rtl/core/neorv32_busswitch.vhd`) into a _single_ processor-internal bus. All processor-internal +memories, peripherals and also the external memory interface are connected to this bus. 
Hence, both CPU +interfaces (instruction fetch & data access) have access to the same (**identical**) address space making the +setup a modified von-Neumann architecture. + +.Processor-internal bus architecture +image::../figures/neorv32_bus.png[1300] + +[NOTE] +The internal processor bus might appear to be a bottleneck. To reduce congestion on this bus +(when instruction fetch and data interface access the bus at the same time), the instruction fetch of +the CPU is equipped with a prefetch buffer. Instruction fetches can be further buffered using the i-cache. +Furthermore, data accesses (loads and stores) have higher priority than instruction fetch +accesses. + +[IMPORTANT] +Please note that all processor-internal components including the peripheral/IO devices can also be +accessed from programs running in less-privileged user mode. For example, if the system relies on +a periodic interrupt from the _MTIME_ timer unit, user-level programs could alter the _MTIME_ +configuration corrupting this interrupt. This kind of security issue can be mitigated using the +PMP system (see <<_machine_physical_memory_protection>>). + +:sectnums: +==== Physical Memory Attributes + +The processor setup defines the following simple attributes for the four processor-internal address space regions: + +* `r` – read access (from CPU data access interface, e.g. via "load") +* `w` – write access (from CPU data access interface, e.g. via "store") +* `x` – execute access (from CPU instruction fetch interface) +* `a` – atomic access (from CPU data access interface) +* `8` – byte (8-bit)-accessible (when writing) +* `16` – half-word (16-bit)-accessible (when writing) +* `32` – word (32-bit)-accessible (when writing) + +The following table shows the provided physical memory attributes of each region. Additional attributes (like +denying execute right for a certain region of the IMEM) can be provided using the RISC-V <<_machine_physical_memory_protection>> extension. 
+ +[cols="^1,^2,^2,^3,^2"] +[options="header",grid="rows"] +|======================= +| # | Region | Base address | Size | Attributes +| 4 | IO/peripheral devices | 0xffffff00 | 256 bytes | `r/w/a/32` +| 3 | bootloader ROM | 0xffff0000 | up to 32kB| `r/x/a` +| 2 | DMEM | 0x80000000 | up to 2GB (-64kB) | `r/w/x/a/8/16/32` +| 1 | IMEM | 0x00000000 | up to 2GB | `r/w/x/a/8/16/32` +|======================= + +Only the CPU of the processor has access to the internal memories and IO devices, hence all accesses are +always exclusive. Accessing a memory region in a way that violates the provided attributes will trigger a +load/store/instruction fetch access exception or will return a failed atomic access result, respectively. + +The physical memory attributes of memories and/or devices connected via the external bus interface have to +be defined by those components or the interconnection fabric. + +:sectnums: +==== Internal Memories + +The processor can implement internal memories for instructions (IMEM) and data (DMEM), which will be +mapped to FPGA block RAMs. The implementation of these memories is controlled via the boolean +_MEM_INT_IMEM_EN_ and _MEM_INT_DMEM_EN_ generics. + +The sizes of these memories are configured via the _MEM_INT_IMEM_SIZE_ and _MEM_INT_DMEM_SIZE_ +generics (in bytes), respectively. The processor-internal instruction memory (IMEM) can optionally be +implemented as true ROM (_MEM_INT_IMEM_ROM_), which is initialized with the application code during +synthesis. + +If the processor-internal IMEM is implemented, it is located right at the base address of the instruction +address space (default `ispace_base_c` = _0x00000000_). Vice versa, the processor-internal data memory is +located right at the beginning of the data address space (default `dspace_base_c` = _0x80000000_) when +implemented. 
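The default address map described above can be illustrated in software. The following C sketch classifies a 32-bit address using the default base addresses given in the text. The type `region_t` and the function `classify_address` are hypothetical names introduced only for this illustration; they are not part of the NEORV32 software library. A size of zero stands for an internal memory that is not implemented, and any address outside the internal regions is reported as a candidate for the external bus interface.

```c
#include <stdint.h>

/* Default base addresses as given in the text. */
#define ISPACE_BASE  0x00000000u /* default ispace_base_c */
#define DSPACE_BASE  0x80000000u /* default dspace_base_c */
#define BOOTROM_BASE 0xFFFF0000u /* fixed bootloader ROM base */
#define IO_BASE      0xFFFFFF00u /* fixed IO/peripheral base (256 bytes) */

/* Hypothetical region identifiers (illustration only). */
typedef enum {
    REGION_IMEM,
    REGION_DMEM,
    REGION_BOOTROM,
    REGION_IO,
    REGION_EXTERNAL
} region_t;

/* Hypothetical helper: classify an address for the default memory layout.
 * imem_size / dmem_size are the internal memory sizes in bytes
 * (0 = memory not implemented). */
static region_t classify_address(uint32_t addr, uint32_t imem_size, uint32_t dmem_size)
{
    if (addr >= IO_BASE) {
        return REGION_IO;       /* last 256 bytes of the address space */
    }
    if (addr >= BOOTROM_BASE) {
        return REGION_BOOTROM;  /* reserved even if no bootloader is implemented */
    }
    if (addr >= DSPACE_BASE && (addr - DSPACE_BASE) < dmem_size) {
        return REGION_DMEM;
    }
    if ((addr - ISPACE_BASE) < imem_size) {
        return REGION_IMEM;     /* ISPACE_BASE is 0 in the default layout */
    }
    return REGION_EXTERNAL;     /* not claimed by any internal region */
}
```

With the default generics (16kB IMEM, 8kB DMEM), address `0x00004000` is the first one beyond the internal IMEM and is therefore classified as external in this sketch.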
+ +:sectnums: +==== External Memory/Bus Interface + +Any CPU access (data or instructions), which does not fulfill one of the following conditions, is forwarded +to the <<_processor_external_memory_interface_wishbone_axi4_lite>>: + +* access to the processor-internal IMEM and processor-internal IMEM is implemented +* access to the processor-internal DMEM and processor-internal DMEM is implemented +* access to the bootloader ROM and beyond → addresses >= _BOOTROM_BASE_ (default 0xFFFF0000) will never be forwarded to the external memory interface + +The external bus interface is available when the _MEM_EXT_EN_ generic is _true_. If this interface is +deactivated, any access exceeding the internal memories or peripheral devices will trigger a bus access fault +exception. If _MEM_EXT_TIMEOUT_ is greater than zero any external bus access that is not acknowledged or terminated +within _MEM_EXT_TIMEOUT_ clock cycles will auto-timeout and raise the according bus fault exception. + + + +<<< +// #################################################################################################################### +:sectnums: +=== Processor-Internal Modules + +Basically, the processor is a SoC consisting of the NEORV32 CPU, peripheral/IO devices, embedded +memories, an external memory interface and a bus infrastructure to interconnect all units. Additionally, the +system implements an internal reset generator and a global clock generator/divider. + +**Internal Reset Generator** + +Most processor-internal modules – except for the CPU and the watchdog timer – do not have a dedicated +reset signal. However, all devices can be reset by software by clearing the corresponding unit's control +register. The automatically included application start-up code will perform such a software-reset of all +modules to ensure a clean system reset state. 
The hardware reset signal of the processor can either be +triggered via the external reset pin (`rstn_i`, low-active) or by the internal watchdog timer (if implemented). +Before the external reset signal is applied to the system, it is filtered (so no spike can generate a reset, a +minimum active reset period of one clock cycle is required) and extended to have a minimal duration of four +clock cycles. + +**Internal Clock Divider** + +An internal clock divider generates 8 clock signals derived from the processor's main clock input `clk_i`. +These derived clock signals are not actual _clock signals_. Instead, they are derived from a simple counter and +are used as "clock enable" signal by the different processor modules. Thus, the whole design operates using +only the main clock signal (single clock domain). Some of the processor peripherals like the Watchdog or the +UARTs can select one of the derived clock enable signals for their internal operation. If none of the +connected modules require a clock signal from the divider, it is automatically deactivated to reduce dynamic +power. + +The peripheral devices, which feature a time-based configuration, provide a three-bit prescaler select in their +respective control register to select one out of the eight available clocks. The mapping of the prescaler select +bits to the actually obtained clock is shown in the table below. Here, f represents the processor main clock +from the top entity's `clk_i` signal. + +[cols="<3,^1,^1,^1,^1,^1,^1,^1,^1"] +[grid="rows"] +|======================= +| Prescaler bits: | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111` +| Resulting clock: | _f/2_ | _f/4_ | _f/8_ | _f/64_ | _f/128_ | _f/1024_| _f/2048_| _f/4096_ +|======================= + +**Peripheral / IO Devices** + +The processor-internal peripheral/IO devices are located at the end of the 32-bit address space at base +address _0xFFFFFF00_. A region of 256 bytes is reserved for these devices. 
Hence, all peripheral/IO devices are +accessed using a memory-mapped scheme. A special linker script as well as the NEORV32 core software +library abstract the specific memory layout for the user. + +[IMPORTANT] +When accessing an IO device that has not been implemented (via the according _IO_x_EN_ generic), a +load/store access fault exception is triggered. + +[IMPORTANT] +The peripheral/IO devices can only be written in full-word mode (i.e. 32-bit). Byte or half-word +(8/16-bit) writes will trigger a store access fault exception. Read accesses are not size constrained. +Processor-internal memories as well as modules connected to the external memory interface can still +be written with a byte-wide granularity. + +[TIP] +You should use the provided core software library to interact with the peripheral devices. This +prevents incompatibilities with future versions, since the hardware driver functions handle all the +register and register bit accesses. + +[TIP] +Most of the IO devices do not have a hardware reset. Instead, the devices are reset via software by +writing zero to the unit's control register. A general software-based reset of all devices is done by the +application start-up code `crt0.S`. + +**Nomenclature for the Peripheral / IO Devices Listing** + +Each peripheral device chapter features a register map showing accessible control and data registers of the +according device including the implemented control and status bits. You can directly interact with these +registers/bits via the provided _C-code defines_. These defines are set in the main processor core library +include file `sw/lib/include/neorv32.h`. The registers and/or register bits, which can be accessed +directly using plain C-code, are marked with a "[C]". + +Not all registers or register bits can be arbitrarily read/written. 
The following read/write access types are +available: + +* `r/w` registers / bits can be read and written +* `r/-` registers / bits are read-only; any write access to them has no effect +* `-/w` these registers / bits are write-only; they auto-clear in the next cycle and are always read as zero + +[TIP] +Bits / registers that are not listed in the register map tables are not (yet) implemented. These registers +/ bits are always read as zero. A write access to them has no effect, but user programs should only +write zero to them to remain compatible with future extensions. + +[TIP] +When writing to read-only registers, the access is nevertheless acknowledged, but no actual data is +written. When reading data from a write-only register the result is undefined. + + +include::soc_imem.adoc[] + +include::soc_dmem.adoc[] + +include::soc_bootrom.adoc[] + +include::soc_icache.adoc[] + +include::soc_wishbone.adoc[] + +include::soc_gpio.adoc[] + +include::soc_wdt.adoc[] + +include::soc_mtime.adoc[] + +include::soc_uart.adoc[] + +include::soc_spi.adoc[] + +include::soc_twi.adoc[] + +include::soc_pwm.adoc[] + +include::soc_trng.adoc[] + +include::soc_cfs.adoc[] + +include::soc_nco.adoc[] + +include::soc_neoled.adoc[] + +include::soc_sysinfo.adoc[] + + Index: docs/src_adoc/soc_bootrom.adoc =================================================================== --- docs/src_adoc/soc_bootrom.adoc (nonexistent) +++ docs/src_adoc/soc_bootrom.adoc (revision 57) @@ -0,0 +1,36 @@ +<<< +:sectnums: +==== Bootloader ROM (BOOTROM) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_boot_rom.vhd | +| Software driver file(s): | none | _implicitly used_ +| Top entity port: | none | +| Configuration generics: | _BOOTLOADER_EN_ | implement processor-internal bootloader when _true_ +| CPU interrupts: | none | +|======================= + +As the name already suggests, the boot ROM contains the read-only bootloader image. 
When the bootloader +is enabled via the _BOOTLOADER_EN_ generic it is directly executed after system reset. + +The bootloader ROM is located at address 0xFFFF0000. This location is fixed and the bootloader ROM size +must not exceed 32kB. The bootloader read-only memory is automatically initialized during synthesis via the +`rtl/core/neorv32_bootloader_image.vhd` file, which is generated when compiling and installing the +bootloader sources. + +The bootloader ROM address space cannot be used for other applications even when the bootloader is not +implemented. + +**Boot Configuration** + +If the bootloader is implemented, the CPU starts execution after reset right at the beginning of the boot +ROM. If the bootloader is not implemented, the CPU starts execution at the beginning of the instruction +memory space (defined via `ispace_base_c` constant in the `neorv32_package.vhd` VHDL package file, +default `ispace_base_c` = 0x00000000). In this case, the instruction memory has to contain a valid +executable – either by using the internal IMEM with an initialization during synthesis or by a user-defined +initialization process. + +[TIP] +See section <<_bootloader>> for more information regarding the bootloader's boot process and configuration options. 
Index: docs/src_adoc/soc_cfs.adoc =================================================================== --- docs/src_adoc/soc_cfs.adoc (nonexistent) +++ docs/src_adoc/soc_cfs.adoc (revision 57) @@ -0,0 +1,103 @@ +<<< +:sectnums: +==== Custom Functions Subsystem (CFS) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_cfs.vhd | +| Software driver file(s): | neorv32_cfs.c | +| | neorv32_cfs.h | +| Top entity port: | `cfs_in_i` | custom input conduit +| | `cfs_out_o` | custom output conduit +| Configuration generics: | _IO_CFS_EN_ | implement CFS when _true_ +| | _IO_CFS_CONFIG_ | custom generic conduit +| | _IO_CFS_IN_SIZE_ | size of `cfs_in_i` +| | _IO_CFS_OUT_SIZE_ | size of `cfs_out_o` +| CPU interrupts: | fast IRQ channel 1 | CFS interrupt (see <<_processor_interrupts>>) +|======================= + +**Theory of Operation** + +The custom functions subsystem can be used to implement application-specific user-defined co-processors +(like encryption or arithmetic accelerators) or peripheral/communication interfaces. In contrast to connecting +custom hardware accelerators via the external memory interface, the CFS provides a convenient and low-latency +extension and customization option. + +The CFS provides up to 32x 32-bit memory-mapped registers (see register map table below). The actual +functionality of these registers has to be defined by the hardware designer. + +[INFO] +Take a look at the template CFS VHDL source file (`rtl/core/neorv32_cfs.vhd`). The file is highly +commented to illustrate all aspects that are relevant for implementing custom CFS-based co-processor designs. + +**CFS Software Access** + +The CFS memory-mapped registers can be accessed by software using the provided C-language aliases (see +register map table below). Note that all interface registers provide 32-bit access data of type `uint32_t`. 
+ +[source,c] +---- +// C-code CFS usage example +CFS_REG_0 = (uint32_t)some_data_array[i]; // write array element i to CFS register 0 +uint32_t temp = CFS_REG_20; // read from CFS register 20 +---- + +**CFS Interrupt** + +The CFS provides a single one-shot interrupt request signal mapped to the CPU's fast interrupt channel 1. +See section <<_processor_interrupts>> for more information. + +**CFS Configuration Generic** + +By default, the CFS provides a single 32-bit `std_(u)logic_vector` configuration generic _IO_CFS_CONFIG_ +that is available in the processor's top entity. This generic can be used to pass custom configuration options +from the top entity down to the CFS entity. + +**CFS Custom IOs** + +By default, the CFS also provides two unidirectional input and output conduits `cfs_in_i` and `cfs_out_o`. +These signals are propagated to the processor's top entity. The actual use of these signals has to be defined +by the hardware designer. The size of the input signal conduit `cfs_in_i` is defined via the (top's) _IO_CFS_IN_SIZE_ configuration +generic (default = 32-bit). The size of the output signal conduit `cfs_out_o` is defined via the (top's) +_IO_CFS_OUT_SIZE_ configuration generic (default = 32-bit). If the custom functions subsystem is not implemented +(_IO_CFS_EN_ = false), the `cfs_out_o` signal is tied to all-zero. 
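The CFS interface registers are laid out as consecutive 32-bit words, so the address of register _n_ follows directly from the base address of _CFS_REG_0_ in the register map. The following is a minimal sketch of that address arithmetic; `CFS_BASE_ADDR`, `CFS_NUM_REGS` and `cfs_reg_addr()` are hypothetical names used for illustration only and are not part of the NEORV32 software library.

```c
#include <stdint.h>

// Address of CFS_REG_0 as listed in the CFS register map.
#define CFS_BASE_ADDR 0xFFFFFF00u
// The CFS provides up to 32 interface registers of 32 bit each.
#define CFS_NUM_REGS  32u

// Hypothetical helper (assumption, not a library function): returns the
// memory-mapped address of CFS register `index`, or 0 if out of range.
static uint32_t cfs_reg_addr(uint32_t index) {
    if (index >= CFS_NUM_REGS) {
        return 0u; // no such register
    }
    return CFS_BASE_ADDR + 4u * index; // one 32-bit word per register
}
```

In application code the provided _CFS_REG_x_ aliases should be preferred; the helper only illustrates how the addresses in the register map are derived.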
+ +.CFS register map +[cols="^4,<5,^2,^3,<14"] +[options="header",grid="all"] +|======================= +| Address | Name [C] | Bit(s) | R/W | Function +| `0xffffff00` | _CFS_REG_0_ |`31:0` | (r)/(w) | custom CFS interface register 0 +| `0xffffff04` | _CFS_REG_1_ |`31:0` | (r)/(w) | custom CFS interface register 1 +| `0xffffff08` | _CFS_REG_2_ |`31:0` | (r)/(w) | custom CFS interface register 2 +| `0xffffff0c` | _CFS_REG_3_ |`31:0` | (r)/(w) | custom CFS interface register 3 +| `0xffffff10` | _CFS_REG_4_ |`31:0` | (r)/(w) | custom CFS interface register 4 +| `0xffffff14` | _CFS_REG_5_ |`31:0` | (r)/(w) | custom CFS interface register 5 +| `0xffffff18` | _CFS_REG_6_ |`31:0` | (r)/(w) | custom CFS interface register 6 +| `0xffffff1c` | _CFS_REG_7_ |`31:0` | (r)/(w) | custom CFS interface register 7 +| `0xffffff20` | _CFS_REG_8_ |`31:0` | (r)/(w) | custom CFS interface register 8 +| `0xffffff24` | _CFS_REG_9_ |`31:0` | (r)/(w) | custom CFS interface register 9 +| `0xffffff28` | _CFS_REG_10_ |`31:0` | (r)/(w) | custom CFS interface register 10 +| `0xffffff2c` | _CFS_REG_11_ |`31:0` | (r)/(w) | custom CFS interface register 11 +| `0xffffff30` | _CFS_REG_12_ |`31:0` | (r)/(w) | custom CFS interface register 12 +| `0xffffff34` | _CFS_REG_13_ |`31:0` | (r)/(w) | custom CFS interface register 13 +| `0xffffff38` | _CFS_REG_14_ |`31:0` | (r)/(w) | custom CFS interface register 14 +| `0xffffff3c` | _CFS_REG_15_ |`31:0` | (r)/(w) | custom CFS interface register 15 +| `0xffffff40` | _CFS_REG_16_ |`31:0` | (r)/(w) | custom CFS interface register 16 +| `0xffffff44` | _CFS_REG_17_ |`31:0` | (r)/(w) | custom CFS interface register 17 +| `0xffffff48` | _CFS_REG_18_ |`31:0` | (r)/(w) | custom CFS interface register 18 +| `0xffffff4c` | _CFS_REG_19_ |`31:0` | (r)/(w) | custom CFS interface register 19 +| `0xffffff50` | _CFS_REG_20_ |`31:0` | (r)/(w) | custom CFS interface register 20 +| `0xffffff54` | _CFS_REG_21_ |`31:0` | (r)/(w) | custom CFS interface register 21 +| `0xffffff58` | 
_CFS_REG_22_ |`31:0` | (r)/(w) | custom CFS interface register 22 +| `0xffffff5c` | _CFS_REG_23_ |`31:0` | (r)/(w) | custom CFS interface register 23 +| `0xffffff60` | _CFS_REG_24_ |`31:0` | (r)/(w) | custom CFS interface register 24 +| `0xffffff64` | _CFS_REG_25_ |`31:0` | (r)/(w) | custom CFS interface register 25 +| `0xffffff68` | _CFS_REG_26_ |`31:0` | (r)/(w) | custom CFS interface register 26 +| `0xffffff6c` | _CFS_REG_27_ |`31:0` | (r)/(w) | custom CFS interface register 27 +| `0xffffff70` | _CFS_REG_28_ |`31:0` | (r)/(w) | custom CFS interface register 28 +| `0xffffff74` | _CFS_REG_29_ |`31:0` | (r)/(w) | custom CFS interface register 29 +| `0xffffff78` | _CFS_REG_30_ |`31:0` | (r)/(w) | custom CFS interface register 30 +| `0xffffff7c` | _CFS_REG_31_ |`31:0` | (r)/(w) | custom CFS interface register 31 +|======================= Index: docs/src_adoc/soc_dmem.adoc =================================================================== --- docs/src_adoc/soc_dmem.adoc (nonexistent) +++ docs/src_adoc/soc_dmem.adoc (revision 57) @@ -0,0 +1,19 @@ +<<< +:sectnums: +==== Data Memory (DMEM) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_dmem.vhd | +| Software driver file(s): | none | _implicitly used_ +| Top entity port: | none | +| Configuration generics: | _MEM_INT_DMEM_EN_ | implement processor-internal DMEM when _true_ +| | _MEM_INT_DMEM_SIZE_ | DMEM size in bytes +| CPU interrupts: | none | +|======================= + +Implementation of the processor-internal data memory is enabled via the processor's _MEM_INT_DMEM_EN_ +generic. The size in bytes is defined via the _MEM_INT_DMEM_SIZE_ generic. If the DMEM is implemented, +the memory is mapped into the data memory space and located right at the beginning of the data memory +space (default `dspace_base_c` = 0x80000000). The DMEM is always implemented as RAM. 
Index: docs/src_adoc/soc_gpio.adoc =================================================================== --- docs/src_adoc/soc_gpio.adoc (nonexistent) +++ docs/src_adoc/soc_gpio.adoc (revision 57) @@ -0,0 +1,42 @@ +<<< +:sectnums: +==== General Purpose Input and Output Port (GPIO) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_gpio.vhd | +| Software driver file(s): | neorv32_gpio.c | +| | neorv32_gpio.h | +| Top entity port: | `gpio_o` | 32-bit parallel output port +| | `gpio_i` | 32-bit parallel input port +| Configuration generics: | _IO_GPIO_EN_ | implement GPIO port when _true_ +| CPU interrupts: | FIRQ channel 8 | pin-change interrupt (see <<_processor_interrupts>>) +|======================= + +**Theory of Operation** + +The general purpose parallel IO port unit provides a simple 32-bit parallel input port and a 32-bit parallel +output port. These ports can be used chip-externally (for example to drive status LEDs, connect buttons, etc.) +or system-internally to provide control signals for other IP modules. When the module is disabled for +implementation, the GPIO output port is tied to zero. + +**Pin-Change Interrupt** + +The parallel input port `gpio_i` features a single pin-change interrupt. Whenever an input pin has a low-to-high +or high-to-low transition, the interrupt is triggered. By default, the pin-change interrupt is disabled and +can be enabled using a bit mask that has to be written to the _GPIO_INPUT_ register. Each set bit in this mask +enables the pin-change interrupt for the corresponding input pin. If more than one input pin is enabled for +triggering the pin-change interrupt, any transition on one of the enabled input pins will trigger the CPU's pin-change +interrupt. If the module is disabled for implementation, the pin-change interrupt is also permanently +disabled. 
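The pin-change condition described above can be expressed as a small software model. This is only a sketch of the described behavior under the assumption that the input is compared between two consecutive samples; it is not derived from the VHDL source, and `gpio_pin_change()` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

// Software model (assumption): returns true if at least one input pin that
// is enabled in `irq_mask` changed its level between two input samples.
static bool gpio_pin_change(uint32_t prev_in, uint32_t curr_in, uint32_t irq_mask) {
    uint32_t changed = prev_in ^ curr_in; // 1 = low-to-high or high-to-low transition
    return (changed & irq_mask) != 0u;    // only enabled pins can trigger
}
```

With an all-zero mask (the default) the function never reports a change, which matches the statement that the interrupt is disabled by default.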
+ +.GPIO unit register map +[cols="<2,<2,^1,^1,<6"] +[options="header",grid="rows"] +|======================= +| Address | Name [C] | Bit(s) | R/W | Function +.2+<| `0xffffff80` .2+<| _GPIO_INPUT_ ^| 31:0 ^| r/- <| parallel input port + ^| 31:0 ^| -/w <| parallel input pin-change IRQ enable mask +| `0xffffff84` | _GPIO_OUTPUT_ | 31:0 | r/w | parallel output port +|======================= Index: docs/src_adoc/soc_icache.adoc =================================================================== --- docs/src_adoc/soc_icache.adoc (nonexistent) +++ docs/src_adoc/soc_icache.adoc (revision 57) @@ -0,0 +1,50 @@ +<<< +:sectnums: +==== Processor-Internal Instruction Cache (iCACHE) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_icache.vhd | +| Software driver file(s): | none | _implicitly used_ +| Top entity port: | none | +| Configuration generics: | _ICACHE_EN_ | implement processor-internal instruction cache when _true_ +| | _ICACHE_NUM_BLOCKS_ | number of cache blocks (pages/lines) +| | _ICACHE_BLOCK_SIZE_ | size of a cache block in bytes +| | _ICACHE_ASSOCIATIVITY_ | associativity / number of sets +| CPU interrupts: | none | +|======================= + +The processor features an optional cache for instructions to compensate memories with high latency. The +cache is directly connected to the CPU's instruction fetch interface and provides a full-transparent buffering +of instruction fetch accesses to the entire 4GB address space. + +[IMPORTANT] +The instruction cache is intended to accelerate instruction fetch via the external memory interface. +Since all processor-internal memories provide an access latency of one cycle (by default), caching +internal memories does not bring any performance gain. However, it _might_ reduce traffic on the +processor-internal bus. + +The cache is implemented if the _ICACHE_EN_ generic is true. 
The size of the cache memory is defined via +_ICACHE_BLOCK_SIZE_ (the size of a single cache block/page/line in bytes; has to be a power of two and >= +4 bytes), _ICACHE_NUM_BLOCKS_ (the total amount of cache blocks; has to be a power of two and >= 1) and +the actual cache associativity _ICACHE_ASSOCIATIVITY_ (number of sets; 1 = direct-mapped, 2 = 2-way set-associative, +has to be a power of two and >= 1). + +If the cache associativity (_ICACHE_ASSOCIATIVITY_) is > 1 the LRU replacement policy (least recently +used) is used. + +[TIP] +Keep the features of the targeted FPGA's memory resources (block RAM) in mind when configuring +the cache size/layout to maximize and optimize resource utilization. + +By executing the `fence.i` instruction (`Zifencei` CPU extension) the cache is cleared and a reload from +main memory is forced. Among other things, this allows implementing self-modifying code. + +**Bus Access Fault Handling** + +The cache always loads a complete cache block (_ICACHE_BLOCK_SIZE_ bytes) aligned to the size of a cache +block if a miss is detected. If any of the accessed addresses within a single block do not successfully +acknowledge (i.e. issuing an error signal or timing out) the whole cache block is invalidated and any access to +an address within this cache block will also raise an instruction fetch bus error fault exception. 
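The constraints on the three cache generics stated in this section (power of two, minimum block size of 4 bytes) can be checked before synthesis. The following sketch mirrors only the constraints named in the text; `is_pow2()` and `icache_config_valid()` are hypothetical helper names and not part of the project.

```c
#include <stdbool.h>
#include <stdint.h>

// True if x is a power of two (1, 2, 4, ...); zero is rejected.
static bool is_pow2(uint32_t x) {
    return (x != 0u) && ((x & (x - 1u)) == 0u);
}

// Hypothetical checker (assumption): validates the ICACHE_NUM_BLOCKS,
// ICACHE_BLOCK_SIZE and ICACHE_ASSOCIATIVITY values against the
// constraints listed in the text.
static bool icache_config_valid(uint32_t num_blocks, uint32_t block_size, uint32_t associativity) {
    return is_pow2(block_size) && (block_size >= 4u) // block size: power of two, >= 4 bytes
        && is_pow2(num_blocks)                       // number of blocks: power of two, >= 1
        && is_pow2(associativity);                   // associativity: power of two, >= 1
}
```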
+ Index: docs/src_adoc/soc_imem.adoc =================================================================== --- docs/src_adoc/soc_imem.adoc (nonexistent) +++ docs/src_adoc/soc_imem.adoc (revision 57) @@ -0,0 +1,31 @@ +<<< +:sectnums: +==== Instruction Memory (IMEM) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_imem.vhd | +| Software driver file(s): | none | _implicitly used_ +| Top entity port: | none | +| Configuration generics: | _MEM_INT_IMEM_EN_ | implement processor-internal IMEM when _true_ +| | _MEM_INT_IMEM_SIZE_ | IMEM size in bytes +| | _MEM_INT_IMEM_ROM_ | implement IMEM as ROM when _true_ +| CPU interrupts: | none | +|======================= + +Implementation of the processor-internal instruction memory is enabled via the processor's +_MEM_INT_IMEM_EN_ generic. The size in bytes is defined via the _MEM_INT_IMEM_SIZE_ generic. If the +IMEM is implemented, the memory is mapped into the instruction memory space and located right at the +beginning of the instruction memory space (default `ispace_base_c` = 0x00000000). + +By default, the IMEM is implemented as RAM, so the content can be modified during run time. This is +required when using a bootloader that can update the content of the IMEM at any time. If you do not need +the bootloader anymore – since your application development has completed and you want the program to +permanently reside in the internal instruction memory – the IMEM can also be implemented as true _read-only_ +memory. In this case set the _MEM_INT_IMEM_ROM_ generic of the processor's top entity to _true_. + +When the IMEM is implemented as ROM, it will be initialized during synthesis with the actual application +program image. The compiler toolchain will generate a VHDL initialization +file `rtl/core/neorv32_application_image.vhd`, which is automatically inserted into the IMEM. If +the IMEM is implemented as RAM (default), the memory will **not be initialized** at all. 
Index: docs/src_adoc/soc_mtime.adoc =================================================================== --- docs/src_adoc/soc_mtime.adoc (nonexistent) +++ docs/src_adoc/soc_mtime.adoc (revision 57) @@ -0,0 +1,47 @@ +<<< +:sectnums: +==== Machine System Timer (MTIME) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_mtime.vhd | +| Software driver file(s): | neorv32_mtime.c | +| | neorv32_mtime.h | +| Top entity port: | none | +| Configuration generics: | _IO_MTIME_EN_ | implement MTIME when _true_ +| CPU interrupts: | `MTI` | machine timer interrupt (see <<_processor_interrupts>>) +|======================= + +**Theory of Operation** + +The MTIME machine system timer implements the memory-mapped MTIME timer from the official RISC-V +specifications. This unit features a 64-bit system timer incremented with the primary processor clock. + +The 64-bit system time can be accessed via the `MTIME_LO` and `MTIME_HI` memory-mapped registers (read/write) and also via +the CPU's `time[h]` CSRs (read-only). A 64-bit time compare register – accessible via memory-mapped `MTIMECMP_LO` and `MTIMECMP_HI` +registers – is used to configure an interrupt to the CPU. The interrupt is triggered +whenever `MTIME` (high & low part) >= `MTIMECMP` (high & low part) and is directly forwarded to the CPU's `MTI` interrupt. + +[NOTE] +If the processor-internal **MTIME unit is NOT implemented**, the top's `mtime_i` input signal is used to update the `time[h]` CSRs +and the `MTI` (machine timer interrupt) CPU interrupt is directly connected to the top's `mtime_irq_i` input. + +[TIP] +The interrupt request is a single-shot signal, +so the CPU is triggered once if the system time is greater than or equal to the compare time. Hence, +another MTIME IRQ is only possible when updating `MTIMECMP`. 
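The interrupt condition `MTIME` >= `MTIMECMP` operates on the full 64-bit values, while software sees each value as two 32-bit words. A minimal sketch of that comparison is shown below; `join64()` and `mtime_irq_condition()` are hypothetical names for illustration and model only the condition stated above, not the hardware implementation.

```c
#include <stdbool.h>
#include <stdint.h>

// Combine a high and a low 32-bit word into one 64-bit value.
static uint64_t join64(uint32_t hi, uint32_t lo) {
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}

// Model of the stated interrupt condition (assumption: plain unsigned
// 64-bit comparison of MTIME against MTIMECMP).
static bool mtime_irq_condition(uint32_t mtime_hi, uint32_t mtime_lo,
                                uint32_t cmp_hi, uint32_t cmp_lo) {
    return join64(mtime_hi, mtime_lo) >= join64(cmp_hi, cmp_lo);
}
```

Note that comparing only the low words would give a wrong result once the high words differ, which is why both halves have to be considered together.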
+ +The 64-bit counter and the 64-bit comparator are implemented as 2×32-bit counters and comparators with a +registered carry to prevent a 64-bit carry chain and thus, to simplify timing closure. + +.MTIME register map +[cols="<3,<3,^1,^1,<6"] +[options="header",grid="all"] +|======================= +| Address | Name [C] | Bits | R/W | Function +| `0xffffff90` | _MTIME_LO_ | 31:0 | r/w | machine system time, low word +| `0xffffff94` | _MTIME_HI_ | 31:0 | r/w | machine system time, high word +| `0xffffff98` | _MTIMECMP_LO_ | 31:0 | r/w | time compare, low word +| `0xffffff9c` | _MTIMECMP_HI_ | 31:0 | r/w | time compare, high word +|======================= Index: docs/src_adoc/soc_nco.adoc =================================================================== --- docs/src_adoc/soc_nco.adoc (nonexistent) +++ docs/src_adoc/soc_nco.adoc (revision 57) @@ -0,0 +1,129 @@ +<<< +:sectnums: +==== Numerically-Controlled Oscillator (NCO) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_nco.vhd | +| Software driver file(s): | neorv32_nco.c | +| | neorv32_nco.h | +| Top entity port: | `nco_o` | NCO output (3x 1-bit channels) +| Configuration generics: | _IO_NCO_EN_ | implement NCO when _true_ +| CPU interrupts: | none | +|======================= + +**Theory of Operation** + +The numerically-controlled oscillator (NCO) provides a precise arbitrary linear frequency generator with +three independent channels. Based on a **direct digital synthesis** core, the NCO features a 20-bit wide +accumulator that is incremented with a programmable "tuning word". Whenever the accumulator overflows, a +flip flop is toggled that provides the actual frequency output. The accumulator increment is driven by one of +eight configurable clock sources, which are derived from the processor's main clock. 
+ +The NCO features four accessible registers: the control register _NCO_CT_ and three _NCO_TUNE_CHi_ registers for +the tuning word of each channel i. The NCO is globally enabled by setting the _NCO_CT_EN_ bit in the control +register. If this bit is cleared, the accumulators of all channels are reset. The clock source for each channel i is +selected via the three bits _NCO_CT_CHi_PRSCx_ prescaler. The resulting clock is generated from the main +processor clock (f~main~) divided by the selected prescaler. + +.NCO prescaler configuration +[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"] +[options="header",grid="rows"] +|======================= +| **`NCO_CT_CHi_PRSCx`** | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111` +| Resulting `clock_prescaler` | 2 | 4 | 8 | 64 | 128 | 1024 | 2048 | 4096 +|======================= + +The resulting output frequency of each channel i is defined by the following equation: + +_**f~NCO~(i)**_ = ( _f~main~[Hz]_ / `clock_prescaler`(i) ) * ( `tuning_word`(i) / (2 * 2^20+1^) ) + +The maximum NCO frequency f~NCOmax~ is configured when using the minimal clock prescaler and a maximum all-one +tuning word (= 2^20^ - 1): + +_**f~NCOmax~**_ = ( _f~main~[Hz]_ / 2 ) * ( (2^20^ - 1) / (2 * 2^20+1^) ) + +The minimum "frequency" is always 0 Hz when the tuning word is zero. The frequency resolution f~NCOres~ is +defined using the maximum clock prescaler and a minimal non-zero tuning word (= 1): + +_**f~NCOres~**_ = ( _f~main~[Hz]_ / 4096 ) * ( 1 / (2 * 2^20+1^) ) + +Assuming a processor frequency of f~main~ = 100 MHz the maximum NCO output frequency is f~NCOmax~ = 12.499 +MHz with an NCO frequency resolution of f~NCOres~ = 0.00582 Hz. + +**Advanced Configuration** + +The idle polarity of each channel is configured via the _NCO_CT_CHi_IDLE_POL_ flag and can be either `0` +(idle low) or `1` (idle high), which basically allows inverting the NCO output. 
If the NCO is globally disabled +by clearing the _NCO_CT_EN_ flag, `nco_o(i)` output bit i is set to the corresponding _NCO_CT_CHi_IDLE_POL_. + +The current state of each NCO channel output can be read by software via the _NCO_CT_CHi_OUTPUT_ bit. +The NCO frequency output is normally available via the top `nco_o` output signal. The corresponding channel +output can be permanently set to zero by clearing the corresponding _NCO_CT_CHi_OE_ bit. + +Each NCO channel can operate either in standard mode or in pulse mode. The mode is configured via the +corresponding channel's _NCO_CT_CHi_MODE_ control register bit. + +**_Standard_ Operation Mode** + +If the _NCO_CT_CHi_MODE_ bit of channel i is cleared, the channel operates in standard mode providing a +frequency with **exactly 50% duty cycle** (T~high~ = T~low~). + +**_Pulse_ Operation Mode** + +If the _NCO_CT_CHi_MODE_ bit of channel i is set, the channel operates in pulse mode. In this mode, the duty +cycle can be modified to generate active pulses with variable length. Note that the "active" pulse polarity is defined +by the inverted _NCO_CT_CHi_IDLE_POL_ bit. + +Eight different pulse lengths are available. The active pulse length is defined as number of NCO clock +cycles, where the NCO clock is defined via the clock prescaler bits _NCO_CT_CHi_PRSCx_. The pulse length +of channel i is programmed by the 3-bit _NCO_CT_CHi_PULSEx_ configuration: + +.NCO pulse length configuration +[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"] +[options="header",grid="rows"] +|======================= +| **`NCO_CT_CHi_PULSEx`** | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111` +| Pulse length (in NCO clock cycles) | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 +|======================= + +If _NCO_CT_CHi_IDLE_POL_ is cleared, T~high~ is defined by the _NCO_CT_CHi_PULSEx_ configuration and T~low~ = +T – T~high~. If _NCO_CT_CHi_IDLE_POL_ is set, T~low~ is defined by the _NCO_CT_CHi_PULSEx_ configuration and +T~high~ = T – T~low~. 
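The output frequency equation from the theory of operation can be evaluated in software to sanity-check a prescaler and tuning word selection. The following is a sketch under the assumption that the divisor 2 * 2^(20+1) from that equation applies unchanged; `nco_freq_hz()` is a hypothetical helper and not a function of the NCO driver library.

```c
#include <stdint.h>

// Hypothetical helper (assumption): evaluates
//   f_NCO = (f_main / clock_prescaler) * (tuning_word / (2 * 2^(20+1)))
// for a single channel. `prescaler` is the resulting clock_prescaler value
// (2, 4, 8, 64, 128, 1024, 2048 or 4096), not the 3-bit select code.
static double nco_freq_hz(double f_main_hz, uint32_t prescaler, uint32_t tuning_word) {
    const double divisor = 2.0 * (double)(1u << 21); // 2 * 2^(20+1) = 2^22
    return (f_main_hz / (double)prescaler) * ((double)tuning_word / divisor);
}
```

For f~main~ = 100 MHz, a prescaler of 2 with an all-one 20-bit tuning word gives roughly 12.499 MHz, and a prescaler of 4096 with a tuning word of 1 gives roughly 0.00582 Hz, which matches the figures quoted in the text.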
+ +The actual output frequency of the channel (defined via the clock prescaler and the tuning word) is not +affected by the pulse configuration. + +For simple PWM applications that do not require a precise frequency but a more flexible duty cycle +configuration, see section <<_pulse_width_modulation_controller_pwm>>. + +<<< +.NCO register map +[cols="<4,<3,<9,^2,<11"] +[options="header",grid="all"] +|======================= +| Address | Name [C] | Bit(s), Name [C] | R/W | Function +.22+<| `0xffffffc0` .22+<| _NCO_CT_ ^|`0` _NCO_CT_EN_ ^| r/w <| NCO enable + 3+^| Channel 0 `nco_o(0)` + ^|`1` _NCO_CT_CH0_MODE_ ^| r/w <| output mode (`0`=fixed 50% duty cycle; `1`=pulse mode) + ^|`2` _NCO_CT_CH0_IDLE_POL_ ^| r/w <| output idle polarity + ^|`3` _NCO_CT_CH0_OE_ ^| r/w <| enable output to `nco_o(0)` + ^|`4` _NCO_CT_CH0_OUTPUT_ ^| r/- <| current state of `nco_o(0)` + ^|`7:5` _NCO_CT_CH0_PRSC2_ : _NCO_CT_CH0_PRSC0_ ^| r/w <| 3-bit clock prescaler select + ^|`10:8` _NCO_CT_CH0_PULSE2_ : _NCO_CT_CH0_PULSE0_ ^| r/w <| 3-bit pulse length select + 3+^| Channel 1 `nco_o(1)` + ^|`11` _NCO_CT_CH1_MODE_ ^| r/w <| output mode (`0`=fixed 50% duty cycle; `1`=pulse mode) + ^|`12` _NCO_CT_CH1_IDLE_POL_ ^| r/w <| output idle polarity + ^|`13` _NCO_CT_CH1_OE_ ^| r/w <| enable output to `nco_o(1)` + ^|`14` _NCO_CT_CH1_OUTPUT_ ^| r/- <| current state of `nco_o(1)` + ^|`17:15` _NCO_CT_CH1_PRSC2_ : _NCO_CT_CH1_PRSC0_ ^| r/w <| 3-bit clock prescaler select + ^|`20:18` _NCO_CT_CH1_PULSE2_ : _NCO_CT_CH1_PULSE0_ ^| r/w <| 3-bit pulse length select + 3+^| Channel 2 `nco_o(2)` + ^|`21` _NCO_CT_CH2_MODE_ ^| r/w <| output mode (`0`=fixed 50% duty cycle; `1`=pulse mode) + ^|`22` _NCO_CT_CH2_IDLE_POL_ ^| r/w <| output idle polarity + ^|`23` _NCO_CT_CH2_OE_ ^| r/w <| enable output to `nco_o(2)` + ^|`24` _NCO_CT_CH2_OUTPUT_ ^| r/- <| current state of `nco_o(2)` + ^|`27:25` _NCO_CT_CH2_PRSC2_ : _NCO_CT_CH2_PRSC0_ ^| r/w <| 3-bit clock prescaler select + ^|`30:28` _NCO_CT_CH2_PULSE2_ : 
_NCO_CT_CH2_PULSE0_ ^| r/w <| 3-bit pulse length select +|======================= Index: docs/src_adoc/soc_neoled.adoc =================================================================== --- docs/src_adoc/soc_neoled.adoc (nonexistent) +++ docs/src_adoc/soc_neoled.adoc (revision 57) @@ -0,0 +1,193 @@ +<<< +:sectnums: +==== Smart LED Interface (NEOLED) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_neoled.vhd | +| Software driver file(s): | neorv32_neoled.c | +| | neorv32_neoled.h | +| Top entity port: | `neoled_o` | 1-bit serial data +| Configuration generics: | _IO_NEOLED_EN_ | implement NEOLED when _true_ +| CPU interrupts: | fast IRQ channel 9 | NEOLED interrupt (see <<_processor_interrupts>>) +|======================= + +**Theory of Operation** + +The NEOLED module provides a dedicated interface for "smart RGB LEDs" like the WS2812 or WS2811. +These LEDs provide a single interface wire that uses an asynchronous serial protocol for transmitting color +data. Basically, data is transferred via LED-internal shift registers, which allows to cascade an unlimited +number of smart LEDs. The protocol provides a RESET command to strobe the transmitted data into the +LED PWM driver registers after data has shifted throughout all LEDs in a chain. + +[NOTE] +The NEOLED interface is compatible to the "Adafruit Industries NeoPixel" products, which feature +WS2812 (or older WS2811) smart LEDs (see link:https://learn.adafruit.com/adafruit-neopixel-uberguide). + +The interface provides a single 1-bit output `neoled_o` to drive an arbitrary number of LEDs. Since the +NEOLED module provides 24-bit and 32-bit operating modes, a mixed setup with RGB LEDs (24-bit color) +and RGBW LEDs (32-bit color including a dedicated white LED chip) is also possible. + +**Theory of Operation – Protocol** + +The interface of the WS2812 LEDs uses an 800kHz carrier signal. 
Data is transmitted in a serial manner +starting with LSB-first. The intensity for each R, G & B LED chip (= color code) is defined via an 8-bit +value. The actual data bits are transferred by modifying the duty cycle of the signal (the timings for the +WS2812 are shown below). A RESET command is sent by pulling the data line LOW for at least 50μs. + +.WS2812 bit-level protocol - taken from the "Adafruit NeoPixel Überguide" +image::../figures/neopixel.png[align=center] + +.WS2812 interface timing +[cols="<2,<2,<6"] +[grid="all"] +|======================= +| T~total~ (T~carrier~) | 1.25μs +/- 300ns | period for a single bit +| T~0H~ | 0.4μs +/- 150ns | high-time for sending a `0` +| T~0L~ | 0.8μs +/- 150ns | low-time for sending a `0` +| T~1H~ | 0.85μs +/- 150ns | high-time for sending a `1` +| T~1L~ | 0.45μs +/- 150 ns | low-time for sending a `1` +| RESET | Above 50μs | low-time for sending a RESET command +|======================= + +**Theory of Operation – NEOLED Module** + +The NEOLED module provides two accessible interface registers: the control register _NEOLED_CT_ and the +TX data register _NEOLED_DATA_. The NEOLED module is globally enabled via the control register's +_NEOLED_CT_EN_ bit. Clearing this bit will terminate any current operation, reset the module and +set the `neoled_o` output to zero. The precise timing (implementing the **WS2812** protocol) and transmission +mode are fully programmable via the _NEOLED_CT_ register to provide maximum flexibility. + +**Timing Configuration** + +The basic carrier frequency (800kHz for the WS2812 LEDs) is configured via a 3-bit main clock prescaler (_NEOLED_CT_PRSCx_, see table below) +that scales the main processor clock f~main~ and a 5-bit cycle multiplier _NEOLED_CT_T_TOT_x_. 
+ +.NEOLED prescaler configuration +[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"] +[options="header",grid="rows"] +|======================= +| **`NEOLED_CT_PRSCx`** | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111` +| Resulting `clock_prescaler` | 2 | 4 | 8 | 64 | 128 | 1024 | 2048 | 4096 +|======================= + +The duty-cycles (or more precisely: the high- and low-times for sending either a '1' bit or a '0' bit) are +defined via the 5-bit _NEOLED_CT_T_ONE_H_x_ and _NEOLED_CT_T_ZERO_H_x_ values, respectively. These programmable +timing constants allow adapting the interface to a wide variety of smart LED protocols (for example WS2812 vs. +WS2811). + +**Timing Configuration – Example (WS2812)** + +Generate the base clock f~TX~ for the NEOLED TX engine: + +* processor clock f~main~ = 100 MHz +* _NEOLED_CT_PRSCx_ = `0b001` = f~main~ / 4 + +_**f~TX~**_ = _f~main~[Hz]_ / `clock_prescaler` = 100MHz / 4 = 25MHz + +_**T~TX~**_ = 1 / _**f~TX~**_ = 40ns + +Generate carrier period (T~carrier~) and *high-times* (duty cycle) for sending `0` (T~0H~) and `1` (T~1H~) bits: + +* _NEOLED_CT_T_TOT_ = `0b11110` (= decimal 30) +* _NEOLED_CT_T_ZERO_H_ = `0b01010` (= decimal 10) +* _NEOLED_CT_T_ONE_H_ = `0b10100` (= decimal 20) + +_**T~carrier~**_ = _**T~TX~**_ * _NEOLED_CT_T_TOT_ = 40ns * 30 = 1.2µs + +_**T~0H~**_ = _**T~TX~**_ * _NEOLED_CT_T_ZERO_H_ = 40ns * 10 = 0.4µs + +_**T~1H~**_ = _**T~TX~**_ * _NEOLED_CT_T_ONE_H_ = 40ns * 20 = 0.8µs + +[TIP] +The NEOLED SW driver library (`neorv32_neoled.h`) provides a simplified configuration +function that configures all timing parameters for driving WS2812 LEDs based on the processor +clock configuration. + +**RGB / RGBW Configuration** + +NeoPixels are available in two "color" versions: LEDs with three chips providing RGB color and LEDs with +four chips providing RGB color plus a dedicated white LED chip (= RGBW). 
Since the intensity of every +LED chip is defined via an 8-bit value, the RGB LEDs require a frame of 24 bits per module and the RGBW +LEDs require a frame of 32 bits per module. + +The data transfer quantity of the NEOLED module can be configured via the _NEOLED_CT_MODE_ control +register bit. If this bit is cleared, the NEOLED interface operates in 24-bit mode and will transmit bits `23:0` of +the data written to _NEOLED_DATA_. If _NEOLED_CT_MODE_ is set, the NEOLED interface operates in 32-bit +mode and will transmit bits `31:0` of the data written to _NEOLED_DATA_. + +**TX Data FIFO** + +The interface features a TX data buffer (a FIFO) to allow CPU-independent operation. The buffer depth +is configured via the `tx_buffer_entries_c` constant (default = 4 entries) in the module's VHDL source +file `rtl/core/neorv32_neoled.vhd`. The current configuration can be read via the _NEOLED_CT_BUFS_x_ +control register bits, which return log2(`tx_buffer_entries_c`). + +When writing data to the _NEOLED_DATA_ register the data is automatically written to the TX buffer. Whenever +data is available in the buffer the serial transmission engine will take it and transmit it to the LEDs. + +The data transfer size (_NEOLED_CT_MODE_) can be modified at any time since this control register bit is also buffered +in the FIFO. This allows arbitrary mixing of RGB and RGBW LEDs in the chain. + +[WARNING] +Please note that the timing configurations (_NEOLED_CT_PRSCx_, _NEOLED_CT_T_TOT_x_, +_NEOLED_CT_T_ONE_H_x_ and _NEOLED_CT_T_ZERO_H_x_) are NOT stored to the buffer. Changing +these values while the buffer is not empty or the TX engine is still sending will cause data corruption. + +**Status Configuration** + +The NEOLED module features two read-only status bits in the control register: _NEOLED_CT_BUSY_ and +_NEOLED_CT_TX_STATUS_. + +If _NEOLED_CT_TX_STATUS_ is set, the serial TX engine is still busy sending serial data to the LED strips. 
+If the flag is cleared, the TX engine is idle and the serial data output `neoled_o` is set LOW. + +The _NEOLED_CT_BUSY_ flag provides a programmable option to check for the TX buffer state. The control +register's _NEOLED_CT_BSCON_ bit is used to configure the "meaning" of the _NEOLED_CT_BUSY_ flag. The +condition for sending an interrupt request (IRQ) to the CPU is also configured via the _NEOLED_CT_BSCON_ +bit. + +[cols="^5,^8,^8"] +[options="header",grid="all"] +|======================= +| _NEOLED_CT_BSCON_ | _NEOLED_CT_BUSY_ | Sending an IRQ when ... +| 0 | the busy flag will clear if there **IS at least one free entry** in the TX buffer | the IRQ will fire if **at least one entry GETS free** in the TX buffer +| 1 | the busy flag will clear if the **whole TX buffer IS empty** | the IRQ will fire if the **whole TX buffer GETS empty** +|======================= + +When _NEOLED_CT_BSCON_ is set, the CPU can write up to `tx_buffer_entries_c` new data words to +_NEOLED_DATA_ without checking the busy flag _NEOLED_CT_BUSY_. This considerably relaxes the timing constraints for +sending a continuous data stream to the LEDs (as an idle time beyond 50μs will trigger the LEDs' RESET +command). 
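The WS2812 timing example earlier in this section can be reproduced with a few lines of integer arithmetic. The following is only a minimal sketch: the function names are hypothetical and not part of the NEORV32 driver library, and the prescaler values are taken from the prescaler table of this section.

```c
#include <assert.h>
#include <stdint.h>

/* Prescaler values selected by NEOLED_CT_PRSCx (0b000 .. 0b111),
 * copied from the "NEOLED prescaler configuration" table. */
static const uint32_t neoled_clock_prescaler[8] = {2, 4, 8, 64, 128, 1024, 2048, 4096};

/* Hypothetical helper: duration of `ticks` TX clock ticks in nanoseconds.
 * One tick lasts T_TX = clock_prescaler / f_main; `ticks` is a 5-bit value. */
uint64_t neoled_ticks_to_ns(uint32_t f_main_hz, unsigned prsc_sel, unsigned ticks)
{
    assert(f_main_hz > 0 && prsc_sel < 8 && ticks < 32);
    /* Multiply first, divide last, to avoid losing precision. */
    return ((uint64_t)neoled_clock_prescaler[prsc_sel] * ticks * 1000000000ull) / f_main_hz;
}

/* Hypothetical helper: period of a single TX clock tick (T_TX) in nanoseconds. */
uint64_t neoled_t_tx_ns(uint32_t f_main_hz, unsigned prsc_sel)
{
    return neoled_ticks_to_ns(f_main_hz, prsc_sel, 1);
}
```

With f~main~ = 100 MHz and prescaler select `0b001`, `neoled_ticks_to_ns()` yields 1200 ns for 30 ticks, 400 ns for 10 ticks and 800 ns for 20 ticks. Results are truncated to whole nanoseconds.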
+ +<<< +.NEOLED register map +[cols="<4,<5,<9,^2,<9"] +[options="header",grid="all"] +|======================= +| Address | Name [C] | Bit(s), Name [C] | R/W | Function +.22+<| `0xffffffd8` .22+<| _NEOLED_CT_ <|`0` _NEOLED_CT_EN_ ^| r/w <| NEOLED enable + <|`1` _NEOLED_CT_MODE_ ^| r/w <| data transfer size; `0`=24-bit; `1`=32-bit + <|`2` _NEOLED_CT_BSCON_ ^| r/w <| busy flag / IRQ trigger configuration (see table above) + <|`3` _NEOLED_CT_PRSC0_ ^| r/w <| 3-bit clock prescaler, bit 0 + <|`4` _NEOLED_CT_PRSC1_ ^| r/w <| 3-bit clock prescaler, bit 1 + <|`5` _NEOLED_CT_PRSC2_ ^| r/w <| 3-bit clock prescaler, bit 2 + <|`6` _NEOLED_CT_BUFS0_ ^| r/- .4+<| 4-bit log2(`tx_buffer_entries_c`) + <|`7` _NEOLED_CT_BUFS1_ ^| r/- + <|`8` _NEOLED_CT_BUFS2_ ^| r/- + <|`9` _NEOLED_CT_BUFS3_ ^| r/- + <|`10` _NEOLED_CT_T_TOT_0_ ^| r/w .5+| 5-bit pulse clock ticks per total single-bit period (T~total~) + <|`11` _NEOLED_CT_T_TOT_1_ ^| r/w + <|`12` _NEOLED_CT_T_TOT_2_ ^| r/w + <|`13` _NEOLED_CT_T_TOT_3_ ^| r/w + <|`14` _NEOLED_CT_T_TOT_4_ ^| r/w + <|`20` _NEOLED_CT_T_ONE_H_0_ ^| r/w .5+<| 5-bit pulse clock ticks per high-time for sending a one-bit (T~1H~) + <|`21` _NEOLED_CT_T_ONE_H_1_ ^| r/w + <|`22` _NEOLED_CT_T_ONE_H_2_ ^| r/w + <|`23` _NEOLED_CT_T_ONE_H_3_ ^| r/w + <|`24` _NEOLED_CT_T_ONE_H_4_ ^| r/w + <|`30` _NEOLED_CT_TX_STATUS_ ^| r/- <| transmit engine busy when `1` + <|`31` _NEOLED_CT_BUSY_ ^| r/- <| busy / buffer status flag; configured via _NEOLED_CT_BSCON_ (see table above) +| `0xffffffdc` | _NEOLED_DATA_ <|`31:0` / `23:0` ^| -/w <| TX data (32-/24-bit) +|======================= Index: docs/src_adoc/soc_pwm.adoc =================================================================== --- docs/src_adoc/soc_pwm.adoc (nonexistent) +++ docs/src_adoc/soc_pwm.adoc (revision 57) @@ -0,0 +1,65 @@ +<<< +:sectnums: +==== Pulse-Width Modulation Controller (PWM) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_pwm.vhd | +| Software driver 
file(s): | neorv32_pwm.c | +| | neorv32_pwm.h | +| Top entity port: | `pwm_o` | 4-channel PWM output (1-bit per channel) +| Configuration generics: | _IO_PWM_EN_ | implement PWM controller when _true_ +| CPU interrupts: | none | +|======================= + +**Theory of Operation** + +The PWM module implements a pulse-width modulation controller with four independent channels and +8-bit resolution per channel. It is based on an 8-bit counter with four programmable threshold comparators that +control the actual duty cycle of each channel. The controller can be used to drive a fancy RGB-LED with +24-bit true color, to dim LCD back-lights or even for "analog" control. An external integrator (RC low-pass filter) +can be used to smooth the generated "analog" signals. + +The PWM controller is activated by setting the _PWM_CT_EN_ bit in the module's control register _PWM_CT_. When this +bit is cleared, the unit is reset and all PWM output channels are set to zero. +The 8-bit duty cycle for each channel, which represents the channel's "intensity", is defined via the corresponding 8-bit _PWM_DUTY_CHx_ byte in the _PWM_DUTY_ register. +Based on the duty cycle _PWM_DUTY_CHx_, the corresponding intensity of each channel can be computed by the following formula: + +_**Intensity~x~**_ = _PWM_DUTY_CHx_ / (2^8^) + +The frequency of the generated PWM signals is defined by the PWM operating clock. This clock is derived +from the main processor clock and divided by a prescaler selected via the 3-bit _PWM_CT_PRSCx_ field in the unit's control +register. 
The following prescalers are available: + +.PWM prescaler configuration +[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"] +[options="header",grid="rows"] +|======================= +| **`PWM_CT_PRSCx`** | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111` +| Resulting `clock_prescaler` | 2 | 4 | 8 | 64 | 128 | 1024 | 2048 | 4096 +|======================= + +The resulting PWM frequency is defined by: + +_**f~PWM~**_ = _f~main~[Hz]_ / (2^8^ * `clock_prescaler`) + +[TIP] +A more sophisticated frequency generation option is provided by the numerically-controlled oscillator +module (see section <<_numerically_controller-oscillator_nco>>). + +<<< +.PWM register map +[cols="<4,<5,<10,^2,<11"] +[options="header",grid="all"] +|======================= +| Address | Name [C] | Bit(s), Name [C] | R/W | Function +.4+<| `0xffffffb8` .4+<| _PWM_CT_ <|`0` _PWM_CT_EN_ ^| r/w <| PWM enable + <|`1` _PWM_CT_PRSC0_ ^| r/w .3+<| 3-bit clock prescaler select + <|`2` _PWM_CT_PRSC1_ ^| r/w + <|`3` _PWM_CT_PRSC2_ ^| r/w +.4+<| `0xffffffbc` .4+<| _PWM_DUTY_ <|`7:0` _PWM_DUTY_CH0_MSB_ : _PWM_DUTY_CH0_LSB_ ^| r/w <| 8-bit duty cycle for channel 0 + <|`15:8` _PWM_DUTY_CH1_MSB_ : _PWM_DUTY_CH1_LSB_ ^| r/w <| 8-bit duty cycle for channel 1 + <|`23:16` _PWM_DUTY_CH2_MSB_ : _PWM_DUTY_CH2_LSB_ ^| r/w <| 8-bit duty cycle for channel 2 + <|`31:24` _PWM_DUTY_CH3_MSB_ : _PWM_DUTY_CH3_LSB_ ^| r/w <| 8-bit duty cycle for channel 3 +|======================= Index: docs/src_adoc/soc_spi.adoc =================================================================== --- docs/src_adoc/soc_spi.adoc (nonexistent) +++ docs/src_adoc/soc_spi.adoc (revision 57) @@ -0,0 +1,75 @@ +<<< +:sectnums: +==== Serial Peripheral Interface Controller (SPI) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_spi.vhd | +| Software driver file(s): | neorv32_spi.c | +| | neorv32_spi.h | +| Top entity port: | `spi_sck_o` | 1-bit serial clock output +| | 
`spi_sdo_o` | 1-bit serial data output +| | `spi_sdi_i` | 1-bit serial data input +| | `spi_csn_o` | 8-bit dedicated chip select (low-active) +| Configuration generics: | _IO_SPI_EN_ | implement SPI controller when _true_ +| CPU interrupts: | fast IRQ channel 6 | transmission done interrupt (see <<_processor_interrupts>>) +|======================= + +**Theory of Operation** + +SPI is a synchronous serial transmission interface. The NEORV32 SPI transceiver allows 8-, 16-, 24- and +32-bit transmissions. The unit provides 8 dedicated chip select signals via the top entity's `spi_csn_o` +signal. + +The SPI unit is enabled via the _SPI_CT_EN_ bit in the _SPI_CT_ control register. The idle clock polarity is configured via the _SPI_CT_CPHA_ +bit and can be low (`0`) or high (`1`) during idle. The data quantity to be transferred within a +single transmission is defined via the _SPI_CT_SIZEx_ bits. The unit supports 8-bit (`00`), 16-bit (`01`), +24-bit (`10`) and 32-bit (`11`) transfers. Whenever a transfer is completed, the "transmission done interrupt" is triggered. +A transmission is still in progress as long as the _SPI_CT_BUSY_ flag is set. + +The SPI controller features 8 dedicated chip-select lines. These lines are controlled via the control register's _SPI_CT_CSx_ bits. When +a specific _SPI_CT_CSx_ bit is **set**, the corresponding chip select line `spi_csn_o(x)` goes **low** (low-active chip select lines). + +The SPI clock frequency is defined via the 3-bit _SPI_CT_PRSCx_ clock prescaler. 
The following prescalers +are available: + +.SPI prescaler configuration +[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"] +[options="header",grid="rows"] +|======================= +| **`SPI_CT_PRSCx`** | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111` +| Resulting `clock_prescaler` | 2 | 4 | 8 | 64 | 128 | 1024 | 2048 | 4096 +|======================= + +Based on the _SPI_CT_PRSCx_ configuration, the actual SPI clock frequency f~SPI~ is derived from the processor's main clock f~main~ and is determined by: + +_**f~SPI~**_ = _f~main~[Hz]_ / (2 * `clock_prescaler`) + +A transmission is started when writing data to the _SPI_DATA_ register. The data must be LSB-aligned. So if +the SPI transceiver is configured for less than 32-bit transfers data quantity, the transmit data must be placed +into the lowest 8/16/24 bit of _SPI_DATA_. Vice versa, the received data is also always LSB-aligned. + +.SPI register map +[cols="<2,<2,<4,^1,<7"] +[options="header",grid="all"] +|======================= +| Address | Name [C] | Bit(s), Name [C] | R/W | Function +.16+<| `0xffffffa8` .16+<| _SPI_CT_ <|`0` _SPI_CT_CS0_ ^| r/w .8+<| Direct chip-select 0..7; setting `spi_csn_o(x)` low when set + <|`1` _SPI_CT_CS1_ ^| r/w + <|`2` _SPI_CT_CS2_ ^| r/w + <|`3` _SPI_CT_CS3_ ^| r/w + <|`4` _SPI_CT_CS4_ ^| r/w + <|`5` _SPI_CT_CS5_ ^| r/w + <|`6` _SPI_CT_CS6_ ^| r/w + <|`7` _SPI_CT_CS7_ ^| r/w + <|`8` _SPI_CT_EN_ ^| r/w <| SPI enable + <|`9` _SPI_CT_CPHA_ ^| r/w <| polarity of `spi_sck_o` when idle + <|`10` _SPI_CT_PRSC0_ ^| r/w .3+| 3-bit clock prescaler select + <|`11` _SPI_CT_PRSC1_ ^| r/w + <|`12` _SPI_CT_PRSC2_ ^| r/w + <|`14` _SPI_CT_SIZE0_ ^| r/w .2+<| transfer size (`00`=8-bit, `01`=16-bit, `10`=24-bit, `11`=32-bit) + <|`15` _SPI_CT_SIZE1_ ^| r/w + <|`31` _SPI_CT_BUSY_ ^| r/- <| transmission in progress when set +| `0xffffffac` | _SPI_DATA_ |`31:0` | r/w | receive/transmit data, LSB-aligned +|======================= Index: docs/src_adoc/soc_sysinfo.adoc 
=================================================================== --- docs/src_adoc/soc_sysinfo.adoc (nonexistent) +++ docs/src_adoc/soc_sysinfo.adoc (revision 57) @@ -0,0 +1,64 @@ +<<< +:sectnums: +==== System Configuration Information Memory (SYSINFO) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_sysinfo.vhd | +| Software driver file(s): | (neorv32.h) | +| Top entity port: | none | +| Configuration generics: | * | most of the top's configuration generics +| CPU interrupts: | none | +|======================= + +**Theory of Operation** + +The SYSINFO allows the application software to determine the setting of most of the processor's top entity +generics that are related to processor/SoC configuration. All registers of this unit are read-only. + +This device is always implemented – regardless of the actual hardware configuration. The bootloader as well +as the NEORV32 software runtime environment require information from this device (like memory layout +and default clock speed) for correct operation. 
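The feature flags listed in the _SYSINFO_FEATURES_ table of this section can be tested with a simple shift-and-mask. The following is only a sketch: the enum and function names are hypothetical (not taken from `neorv32.h`), and only a few of the documented bit positions are included.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A few bit positions copied from the _SYSINFO_FEATURES_ table.
 * The enum names are illustrative, not the names used in neorv32.h. */
enum sysinfo_feature_bit {
    FEATURE_BIT_BOOTLOADER = 0,
    FEATURE_BIT_IO_GPIO    = 16,
    FEATURE_BIT_IO_UART0   = 18,
    FEATURE_BIT_IO_SPI     = 19,
    FEATURE_BIT_IO_TWI     = 20,
    FEATURE_BIT_IO_NEOLED  = 27
};

/* Hypothetical helper: returns true if the given feature bit is set in a
 * value previously read from the SYSINFO_FEATURES register. */
bool sysinfo_feature_present(uint32_t features, enum sysinfo_feature_bit bit)
{
    assert((unsigned)bit < 32u);
    return ((features >> (unsigned)bit) & 1u) != 0u;
}
```

On the target, `features` would typically be obtained by a 32-bit read from address `0xffffffe8`; the helper itself has no hardware dependency and can be exercised on a host machine.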
+ +.SYSINFO register map +[cols="<2,<4,<7"] +[options="header",grid="all"] +|======================= +| Address | Name [C] | Function +| `0xffffffe0` | _SYSINFO_CLK_ | clock speed in Hz (via top's _CLOCK_FREQUENCY_ generic) +| `0xffffffe4` | _SYSINFO_USER_CODE_ | custom user code, assigned via top's _USER_CODE_ generic +| `0xffffffe8` | _SYSINFO_FEATURES_ | specific hardware configuration (see next table) +| `0xffffffec` | _SYSINFO_CACHE_ | cache configuration information (see next table) +| `0xfffffff0` | _SYSINFO_ISPACE_BASE_ | instruction address space base (defined via `ispace_base_c` constant in the `neorv32_package.vhd` file) +| `0xfffffff4` | _SYSINFO_IMEM_SIZE_ | internal IMEM size in bytes (defined via top's _MEM_INT_IMEM_SIZE_ generic) +| `0xfffffff8` | _SYSINFO_DSPACE_BASE_ | data address space base (defined via `dspace_base_c` constant in the `neorv32_package.vhd` file) +| `0xfffffffc` | _SYSINFO_DMEM_SIZE_ | internal DMEM size in bytes (defined via top's _MEM_INT_DMEM_SIZE_ generic) +|======================= + + +._SYSINFO_FEATURES_ bits +[cols="^1,<10,<11"] +[options="header",grid="all"] +|======================= +| Bit | Name [C] | Function +| `0` | _SYSINFO_FEATURES_BOOTLOADER_ | set if the processor-internal bootloader is implemented (via top's _BOOTLOADER_EN_ generic) +| `1` | _SYSINFO_FEATURES_MEM_EXT_ | set if the external Wishbone bus interface is implemented (via top's _MEM_EXT_EN_ generic) +| `2` | _SYSINFO_FEATURES_MEM_INT_IMEM_ | set if the processor-internal IMEM is implemented (via top's _MEM_INT_IMEM_EN_ generic) +| `3` | _SYSINFO_FEATURES_MEM_INT_IMEM_ROM_ | set if the processor-internal IMEM is read-only (via top's _MEM_INT_IMEM_ROM_ generic) +| `4` | _SYSINFO_FEATURES_MEM_INT_DMEM_ | set if the processor-internal DMEM is implemented (via top's _MEM_INT_DMEM_EN_ generic) +| `5` | _SYSINFO_FEATURES_MEM_EXT_ENDIAN_ | set if external bus interface uses BIG-endian byte-order (via package's `xbus_big_endian_c` constant) +| `15` | 
_SYSINFO_FEATURES_HW_RST_ | set if a dedicated hardware reset of all core registers is implemented (via package's _dedicated_reset_c_ constant) +| `16` | _SYSINFO_FEATURES_IO_GPIO_ | set if the GPIO is implemented (via top's _IO_GPIO_EN_ generic) +| `17` | _SYSINFO_FEATURES_IO_MTIME_ | set if the MTIME is implemented (via top's _IO_MTIME_EN_ generic) +| `18` | _SYSINFO_FEATURES_IO_UART0_ | set if the primary UART0 is implemented (via top's _IO_UART0_EN_ generic) +| `19` | _SYSINFO_FEATURES_IO_SPI_ | set if the SPI is implemented (via top's _IO_SPI_EN_ generic) +| `20` | _SYSINFO_FEATURES_IO_TWI_ | set if the TWI is implemented (via top's _IO_TWI_EN_ generic) +| `21` | _SYSINFO_FEATURES_IO_PWM_ | set if the PWM is implemented (via top's _IO_PWM_EN_ generic) +| `22` | _SYSINFO_FEATURES_IO_WDT_ | set if the WDT is implemented (via top's _IO_WDT_EN_ generic) +| `23` | _SYSINFO_FEATURES_IO_CFS_ | set if the custom functions subsystem is implemented (via top's _IO_CFS_EN_ generic) +| `24` | _SYSINFO_FEATURES_IO_TRNG_ | set if the TRNG is implemented (via top's _IO_TRNG_EN_ generic) +| `25` | _SYSINFO_FEATURES_IO_NCO_ | set if the NCO is implemented (via top's _IO_NCO_EN_ generic) +| `26` | _SYSINFO_FEATURES_IO_UART1_ | set if the secondary UART1 is implemented (via top's _IO_UART1_EN_ generic) +| `27` | _SYSINFO_FEATURES_IO_NEOLED_ | set if the NEOLED is implemented (via top's _IO_NEOLED_EN_ generic) +|======================= Index: docs/src_adoc/soc_trng.adoc =================================================================== --- docs/src_adoc/soc_trng.adoc (nonexistent) +++ docs/src_adoc/soc_trng.adoc (revision 57) @@ -0,0 +1,84 @@ +<<< +:sectnums: +==== True Random-Number Generator (TRNG) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_trng.vhd | +| Software driver file(s): | neorv32_trng.c | +| | neorv32_trng.h | +| Top entity port: | none | +| Configuration generics: | _IO_TRNG_EN_ | implement TRNG 
when _true_ +| CPU interrupts: | none | +|======================= + +**Theory of Operation** + +The NEORV32 true random number generator provides _physical true random numbers_ for your application. +Instead of using a pseudo RNG like an LFSR, the TRNG of the processor uses a simple, straightforward ring +oscillator as physical entropy source. Hence, voltage and thermal fluctuations are used to provide true +physical random data. + +[NOTE] +The TRNG features a platform-independent architecture without FPGA-specific primitives, macros or +attributes. + +**Architecture** + +The NEORV32 TRNG is based on simple ring oscillators, which are implemented as an inverter chain with +an odd number of inverters. A **latch** is used to decouple each individual inverter. Basically, this architecture +is some kind of asynchronous LFSR. + +The outputs of several ring oscillators are synchronized using two registers and are XORed together. The +resulting output is de-biased using a von-Neumann randomness extractor. This de-biased output is further +processed by a simple 8-bit Fibonacci LFSR to improve whitening. After at least 8 clock cycles the state of +the LFSR is sampled and provided as final data output. + +To prevent the synthesis tool from doing logic optimization and thus removing all but one inverter, the +TRNG uses simple latches to decouple an inverter and its actual output. The latches are reset when the +TRNG is disabled and are enabled one by one by a "real" shift register when the TRNG is activated. This +construct can be synthesized for any FPGA platform. Thus, the NEORV32 TRNG provides a platform-independent +architecture. + +**TRNG Configuration** + +The TRNG uses several ring oscillators, where the next oscillator provides a slightly longer chain (more +inverters) than the one before. This increment is constant for all implemented oscillators. 
This setup can be +customized by modifying the "Advanced Configuration" constants in the TRNG's VHDL file: + +* The `num_roscs_c` constant defines the total number of ring oscillators in the system. `num_inv_start_c` +defines the number of inverters used by the first ring oscillator (has to be an odd number). Each additional +ring oscillator provides `num_inv_inc_c` more inverters than the one before (has to be an even number). +* The LFSR-based post-processing can be deactivated using the `lfsr_en_c` constant. The polynomial tap +mask of the LFSR can be customized using `lfsr_taps_c`. + +**Using the TRNG** + +The TRNG features a single register for status and data access. When the _TRNG_CT_EN_ control register bit is +set, the TRNG is enabled and starts operation. As soon as the _TRNG_CT_VALID_ bit is set, the currently +sampled 8-bit random data byte can be obtained from the lowest 8 bits of the _TRNG_CT_ register +(_TRNG_CT_DATA_MSB_ : _TRNG_CT_DATA_LSB_). The _TRNG_CT_VALID_ bit is automatically cleared +when reading the control register. + +[IMPORTANT] +The TRNG needs at least 8 clock cycles to generate a new random byte. During this sampling time +the current output random data is kept stable in the output register until a valid sampling of the new byte has +completed. + +**Randomness "Quality"** + +I have not verified the quality of the generated random numbers (for example using NIST test suites). The +quality is strongly affected by the actual configuration of the TRNG and the resulting FPGA mapping/routing. +However, generating larger histograms of the generated random numbers shows an equal distribution (binary +average of the random numbers = 127). A simple evaluation test/demo program can be found in +`sw/example/demo_trng`. 
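The access sequence described under "Using the TRNG" can be sketched as a small polling routine. This is an illustration only: the macro and function names are hypothetical and are not the API of `neorv32_trng.h`; the bit positions are taken from the TRNG register map of this section. Because _TRNG_CT_VALID_ is cleared by reading the control register, status and data are taken from one single read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions from the TRNG register map; macro names are illustrative. */
#define SKETCH_TRNG_CT_DATA_MASK 0xffu /* bits 7:0: random data byte */
#define SKETCH_TRNG_CT_EN_BIT    30    /* TRNG enable */
#define SKETCH_TRNG_CT_VALID_BIT 31    /* random data valid */

/* Hypothetical helper: polls the control register at most `max_polls` times.
 * Returns true and stores the random byte in *out once VALID is set,
 * returns false if no valid byte showed up within `max_polls` reads. */
bool trng_get_byte(volatile uint32_t *trng_ct, uint32_t max_polls, uint8_t *out)
{
    assert(trng_ct != 0 && out != 0);
    for (uint32_t i = 0; i < max_polls; i++) {
        /* Exactly one read per iteration: VALID and data come from the same word. */
        uint32_t ct = *trng_ct;
        if ((ct >> SKETCH_TRNG_CT_VALID_BIT) & 1u) {
            *out = (uint8_t)(ct & SKETCH_TRNG_CT_DATA_MASK);
            return true;
        }
    }
    return false;
}
```

On the target, `trng_ct` would point to address `0xffffff88` and the enable bit would have to be set beforehand. On a host machine the routine can be tried against an ordinary variable standing in for the register.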
+ +.TRNG register map +[cols="<2,<2,<4,^1,<7"] +[options="header",grid="all"] +|======================= +| Address | Name [C] | Bit(s), Name [C] | R/W | Function +.3+<| `0xffffff88` .3+<| _TRNG_CT_ <|`7:0` _TRNG_CT_DATA_MSB_ : _TRNG_CT_DATA_LSB_ ^| r/- <| 8-bit random data output + <|`30` _TRNG_CT_EN_ ^| r/w <| TRNG enable + <|`31` _TRNG_CT_VALID_ ^| r/- <| random data output is valid when set +|======================= Index: docs/src_adoc/soc_twi.adoc =================================================================== --- docs/src_adoc/soc_twi.adoc (nonexistent) +++ docs/src_adoc/soc_twi.adoc (revision 57) @@ -0,0 +1,84 @@ +<<< +:sectnums: +==== Two-Wire Serial Interface Controller (TWI) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_twi.vhd | +| Software driver file(s): | neorv32_twi.c | +| | neorv32_twi.h | +| Top entity port: | `twi_sda_io` | 1-bit bi-directional serial data +| | `twi_scl_io` | 1-bit bi-directional serial clock +| Configuration generics: | _IO_TWI_EN_ | implement TWI controller when _true_ +| CPU interrupts: | fast IRQ channel 7 | transmission done interrupt (see <<_processor_interrupts>>) +|======================= + +**Theory of Operation** + +The two-wire interface – also called "I²C" – is a widely used interface for connecting several on-board +components. Since this interface only needs two signals (the serial data line `twi_sda_io` and the serial +clock line `twi_scl_io`) – regardless of the number of connected devices – it allows easy interconnection of +several peripheral nodes. + +The NEORV32 TWI implements a **TWI controller**. It features "clock stretching" (if enabled via the control +register), so a slow peripheral can halt the transmission by pulling the SCL line low. Currently, **no multi-controller +support** is available. Also, the NEORV32 TWI unit cannot operate in peripheral mode. + +The TWI is enabled via the _TWI_CT_EN_ bit in the _TWI_CT_ control register. 
The user program can start / stop a +transmission by issuing a START or STOP condition. These conditions are generated by setting the +corresponding bits (_TWI_CT_START_ or _TWI_CT_STOP_) in the control register. + +Data is sent by writing a byte to the _TWI_DATA_ register. Received data can also be read from this +register. The TWI controller is busy (transmitting data or performing a START or STOP condition) as long as the +_TWI_CT_BUSY_ bit in the control register is set. + +An accessed peripheral has to acknowledge each transferred byte. When the _TWI_CT_ACK_ bit is set after a +completed transmission, the accessed peripheral has sent an acknowledge (ACK). If it is cleared after a +transmission, the peripheral has sent a not-acknowledge (NACK). The NEORV32 TWI controller can also +send an ACK by itself ("controller acknowledge _MACK_") after a transmission by pulling SDA low during the +ACK time slot. Set the _TWI_CT_MACK_ bit to activate this feature. If this bit is cleared, the ACK/NACK of the +peripheral is sampled in this time slot instead (normal mode). + +In summary, the following independent TWI operations can be triggered by the application program: + +* send START condition (also as REPEATED START condition) +* send STOP condition +* send (at least) one byte while also sampling one byte from the bus + +[IMPORTANT] +The serial clock (SCL) and the serial data (SDA) lines can only be actively driven low by the +controller. Hence, external pull-up resistors are required for these lines. + +The TWI clock frequency is defined via the 3-bit _TWI_CT_PRSCx_ clock prescaler. 
The following prescalers +are available: + +.TWI prescaler configuration +[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"] +[options="header",grid="rows"] +|======================= +| **`TWI_CT_PRSCx`** | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111` +| Resulting `clock_prescaler` | 2 | 4 | 8 | 64 | 128 | 1024 | 2048 | 4096 +|======================= + +Based on the _TWI_CT_PRSCx_ configuration, the actual TWI clock frequency f~SCL~ is derived from the processor main clock f~main~ and is determined by: + +_**f~SCL~**_ = _f~main~[Hz]_ / (4 * `clock_prescaler`) + +.TWI register map +[cols="<2,<2,<4,^1,<7"] +[options="header",grid="all"] +|======================= +| Address | Name [C] | Bit(s), Name [C] | R/W | Function +.10+<| `0xffffffb0` .10+<| _TWI_CT_ <|`0` _TWI_CT_EN_ ^| r/w <| TWI enable + <|`1` _TWI_CT_START_ ^| r/w <| generate START condition + <|`2` _TWI_CT_STOP_ ^| r/w <| generate STOP condition + <|`3` _TWI_CT_PRSC0_ ^| r/w .3+<| 3-bit clock prescaler select + <|`4` _TWI_CT_PRSC1_ ^| r/w + <|`5` _TWI_CT_PRSC2_ ^| r/w + <|`6` _TWI_CT_MACK_ ^| r/w <| generate controller ACK for each transmission ("MACK") + <|`7` _TWI_CT_CKSTEN_ ^| r/w <| allow clock-stretching by peripherals when set + <|`30` _TWI_CT_ACK_ ^| r/- <| ACK received when set + <|`31` _TWI_CT_BUSY_ ^| r/- <| transfer/START/STOP in progress when set +| `0xffffffb4` | _TWI_DATA_ |`7:0` _TWI_DATA_MSB_ : _TWI_DATA_LSB_ | r/w | receive/transmit data +|======================= Index: docs/src_adoc/soc_uart.adoc =================================================================== --- docs/src_adoc/soc_uart.adoc (nonexistent) +++ docs/src_adoc/soc_uart.adoc (revision 57) @@ -0,0 +1,216 @@ +<<< +:sectnums: +==== Primary Universal Asynchronous Receiver and Transmitter (UART0) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_uart.vhd | +| Software driver file(s): | neorv32_uart.c | +| | neorv32_uart.h | +| Top entity port: | 
`uart0_txd_o` | serial transmitter output UART0 +| | `uart0_rxd_i` | serial receiver input UART0 +| | `uart0_rts_o` | flow control: RX ready to receive +| | `uart0_cts_i` | flow control: TX allowed to send +| Configuration generics: | _IO_UART0_EN_ | implement UART0 when _true_ +| CPU interrupts: | fast IRQ channel 2 | RX done interrupt +| | fast IRQ channel 3 | TX done interrupt (see <<_processor_interrupts>>) +|======================= + +[IMPORTANT] +Please note that ALL default example programs and software libraries of the NEORV32 software +framework (including the bootloader and the runtime environment) use the primary UART +(_UART0_) as default user console interface. For compatibility, all C-language function calls to +`neorv32_uart_*` are mapped to the corresponding primary UART (_UART0_) `neorv32_uart0_*` +functions. + +**Theory of Operation** + +In most cases, the UART is a standard interface used to establish a communication channel between the +computer/user and an application running on the processor platform. The NEORV32 UARTs feature a +standard frame configuration: 8 data bits, an optional parity bit (even or odd) and 1 stop bit. +The parity and the actual baud rate are configurable by software. + +The UART0 is enabled by setting the _UART_CT_EN_ bit in the UART control register _UART0_CT_. The actual +transmission baud rate (like 19200) is configured via the 12-bit _UART_CT_BAUDxx_ baud prescaler (`baud_rate`) and the +3-bit _UART_CT_PRSCx_ clock prescaler. 
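Choosing the two prescaler values for a target baud rate is a small search problem. The sketch below applies the baud rate formula given in this section, Baudrate = (f~main~ / `clock_prescaler`) / (`baud_rate` + 1), and picks the smallest clock prescaler for which the `baud_rate` value still fits into 12 bits. The struct and function names are hypothetical and the search strategy is an assumption; the actual driver in `neorv32_uart.c` may work differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Prescaler values selected by UART_CT_PRSCx (0b000 .. 0b111),
 * as listed in the UART prescaler configuration table. */
static const uint32_t uart_clock_prescaler[8] = {2, 4, 8, 64, 128, 1024, 2048, 4096};

/* Illustrative result type (not part of the NEORV32 software framework). */
struct uart_baud_cfg {
    unsigned prsc_sel;  /* 3-bit UART_CT_PRSCx value */
    uint32_t baud_val;  /* 12-bit UART_CT_BAUDxx value (`baud_rate`) */
};

/* Hypothetical helper: finds the smallest prescaler whose baud value fits
 * into 12 bits. Returns false if the requested rate cannot be configured. */
bool uart_find_baud_cfg(uint32_t f_main_hz, uint32_t baudrate, struct uart_baud_cfg *cfg)
{
    assert(cfg != 0);
    if (baudrate == 0) {
        return false;
    }
    for (unsigned sel = 0; sel < 8; sel++) {
        /* divider = baud_rate + 1, rounded down */
        uint64_t div = (uint64_t)f_main_hz / ((uint64_t)uart_clock_prescaler[sel] * baudrate);
        if (div >= 1 && (div - 1) <= 0xfffu) {
            cfg->prsc_sel = sel;
            cfg->baud_val = (uint32_t)(div - 1);
            return true;
        }
    }
    return false;
}
```

For f~main~ = 100 MHz and 19200 baud this selects prescaler `0b000` with a baud value of 2603. Since the divider is rounded down, the resulting rate is slightly above the requested one.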
+ +.UART prescaler configuration +[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"] +[options="header",grid="rows"] +|======================= +| **`UART_CT_PRSCx`** | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111` +| Resulting `clock_prescaler` | 2 | 4 | 8 | 64 | 128 | 1024 | 2048 | 4096 +|======================= + +_**Baudrate**_ = (_f~main~[Hz]_ / `clock_prescaler`) / (`baud_rate` + 1) + +A new transmission is started by writing the data byte to be sent to the lowest byte of the _UART0_DATA_ register. The +transfer is completed when the _UART_CT_TX_BUSY_ control register flag returns to zero. A newly received byte +is available when the _UART_DATA_AVAIL_ flag of the _UART0_DATA_ register is set. A "frame error" in a received byte +(broken stop bit) is indicated via the _UART_DATA_FERR_ flag in the _UART0_DATA_ register. + +**RX Double-Buffering** + +The UART receive engine provides a simple data buffer with two entries. These two entries are transparent +for the user. The transmitting device can send up to 2 chars to the UART without risking data loss. If another +char is sent before at least one char has been read from the buffer, data loss occurs. This situation can be +detected via the receiver overrun flag _UART_DATA_OVERR_ in the _UART0_DATA_ register. The flag is +automatically cleared after reading _UART0_DATA_. + +**Parity Modes** + +A parity bit is added if the _UART_CT_PMODE1_ flag is set. When _UART_CT_PMODE0_ is zero the UART +operates in "even parity" mode. If this flag is set, the UART operates in "odd parity" mode. Parity errors in +received data are indicated via the _UART_DATA_PERR_ flag in the _UART0_DATA_ register. This flag is updated with each new +received character. A frame error in the received data (i.e. stop bit is not set) is indicated via the +_UART_DATA_FERR_ flag in the _UART0_DATA_ register. 
This flag is also updated with each new received character. + +**Hardware Flow Control – RTS/CTS** + +The UART supports hardware flow control using the standard CTS (clear to send) and/or RTS (ready to send +/ ready to receive "RTR") signals. Both hardware flow control mechanisms can be individually enabled. + +If **RTS hardware flow control** is enabled by setting the _UART_CT_RTS_EN_ control register flag, the UART +will pull the `uart0_rts_o` signal low if the UART's receiver is idle and no received data is waiting to be read by +application software. As long as this signal is low the connected device can send new data. `uart0_rts_o` is always LOW if the UART is disabled. + +The RTS line is de-asserted (going high) as soon as the start bit of a new incoming char has been +detected. The transmitting device continues sending the current char and can also send another char +(due to the RX double-buffering), which is done by most terminal programs. Any additional data sent +while RTS is still de-asserted (high) will overwrite the RX input buffer, causing data loss. This will set the _UART_DATA_OVERR_ flag in the +_UART0_DATA_ register. Any read access to this register clears the flag again. + +If **CTS hardware flow control** is enabled by setting the _UART_CT_CTS_EN_ control register flag, the UART's +transmitter will not start sending a new char until the `uart0_cts_i` signal goes low. If new data to be +sent is written to the UART data register while `uart0_cts_i` is high, the UART will wait for +`uart0_cts_i` to go low before sending starts. During this time, the UART busy flag +_UART_CT_TX_BUSY_ remains set. + +If `uart0_cts_i` is high, no new data transmission will be started by the UART. The state of the `uart0_cts_i` +signal has no effect on a transmission that is already in progress. + +Signal changes on `uart0_cts_i` during an active transmission are ignored. 
Application software can check +the current state of the `uart0_cts_i` input signal via the _UART_CT_CTS_ control register flag. + +[TIP] +Please note that – just like the RXD and TXD signals – the RTS and CTS signals have to be **cross**-coupled +between devices. + +**Interrupts** + +The UART features two interrupts: the "TX done interrupt" is triggered when a transmit operation (sending) has finished. The "RX +done interrupt" is triggered when a data byte has been received. If the UART0 is not implemented, the UART0 interrupts are permanently tied to zero. + +[NOTE] +The UART's RX interrupt is always triggered when a new data word has arrived – regardless of the +state of the RX double-buffer. + +**Simulation Mode** + +The default UART0 operation will transmit any data written to the _UART0_DATA_ register via the serial TX line at +the defined baud rate. Even though the default testbench provides a simulated UART0 receiver, which +outputs any received char to the simulator console, such a transmission takes a lot of time. To accelerate +UART0 output during simulation (and also to dump large amounts of data for further processing like +verification) the UART0 features a **simulation mode**. + +The simulation mode is enabled by setting the _UART_CT_SIM_MODE_ bit in the UART0's control register +_UART0_CT_. Any other UART0 configuration bits are irrelevant, but the UART0 has to be enabled via the +_UART_CT_EN_ bit. When the simulation mode is enabled, any char written to _UART0_DATA_ (bits 7:0) is +directly output as ASCII char to the simulator console. Additionally, all text is also stored to a text file +`neorv32.uart0.sim_mode.text.out` in the simulation home folder. Furthermore, the whole 32-bit word +written to _UART0_DATA_ is stored as plain 8-char hexadecimal value to a second text file +`neorv32.uart0.sim_mode.data.out` also located in the simulation home folder. 
+ +If the UART is configured for simulation mode there will be **NO physical UART0 transmissions via +`uart0_txd_o`** at all. Furthermore, no interrupts (RX done or TX done) will be triggered in any situation. + +[TIP] +More information regarding the simulation mode of the UART0 can be found in section <<_simulating_the_processor>>. + +.UART0 register map +[cols="<6,<7,<10,^2,<18"] +[options="header",grid="all"] +|======================= +| Address | Name [C] | Bit(s), Name [C] | R/W | Function +.12+<| `0xffffffa0` .12+<| _UART0_CT_ <|`11:0` _UART_CT_BAUDxx_ ^| r/w <| 12-bit BAUD configuration value + <|`12` _UART_CT_SIM_MODE_ ^| r/w <| enable **simulation mode** + <|`20` _UART_CT_RTS_EN_ ^| r/w <| enable RTS hardware flow control + <|`21` _UART_CT_CTS_EN_ ^| r/w <| enable CTS hardware flow control + <|`22` _UART_CT_PMODE0_ ^| r/w .2+<| parity bit enable and configuration (`00`/`01`= no parity; `10`=even parity; `11`=odd parity) + <|`23` _UART_CT_PMODE1_ ^| r/w + <|`24` _UART_CT_PRSC0_ ^| r/w .3+<| 3-bit baudrate clock prescaler select + <|`25` _UART_CT_PRSC1_ ^| r/w + <|`26` _UART_CT_PRSC2_ ^| r/w + <|`27` _UART_CT_CTS_ ^| r/- <| current state of UART's CTS input signal + <|`28` _UART_CT_EN_ ^| r/w <| UART enable + <|`31` _UART_CT_TX_BUSY_ ^| r/- <| transmitter busy flag +.6+<| `0xffffffa4` .6+<| _UART0_DATA_ <|`7:0` _UART_DATA_MSB_ : _UART_DATA_LSB_ ^| r/w <| receive/transmit data (8-bit) + <|`31:0` - ^| -/w <| **simulation data output** + <|`28` _UART_DATA_PERR_ ^| r/- <| RX parity error + <|`29` _UART_DATA_FERR_ ^| r/- <| RX data frame error (stop bit not set) + <|`30` _UART_DATA_OVERR_ ^| r/- <| RX data overrun + <|`31` _UART_DATA_AVAIL_ ^| r/- <| RX data available when set +|======================= + + + +<<< +// #################################################################################################################### +:sectnums: +==== Secondary Universal Asynchronous Receiver and Transmitter (UART1) + +[cols="<3,<3,<4"] 
+[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_uart.vhd | +| Software driver file(s): | neorv32_uart.c | +| | neorv32_uart.h | +| Top entity port: | `uart1_txd_o` | serial transmitter output UART1 +| | `uart1_rxd_i` | serial receiver input UART1 +| | `uart1_rts_o` | flow control: RX ready to receive +| | `uart1_cts_i` | flow control: TX allowed to send +| Configuration generics: | _IO_UART1_EN_ | implement UART1 when _true_ +| CPU interrupts: | fast IRQ channel 4 | RX done interrupt +| | fast IRQ channel 5 | TX done interrupt (see <<_processor_interrupts>>) +|======================= + +**Theory of Operation** + +The secondary UART (UART1) is functionally identical to the primary UART (<<_primary_universal_asynchronous_receiver_and_transmitter_uart0>>). +UART1 uses different addresses for +the control register (_UART1_CT_) and the data register (_UART1_DATA_) – see the register map below. However, the +register bits/flags use the same bit positions and naming. Furthermore, the "RX done" and "TX done" interrupts are +mapped to different CPU fast interrupt channels. + +**Simulation Mode** + +The secondary UART (UART1) provides the same simulation options as the primary UART. However, +output data is written to UART1-specific files: `neorv32.uart1.sim_mode.text.out` is used to store +plain ASCII text and `neorv32.uart1.sim_mode.data.out` is used to store full 32-bit hexadecimal +encoded data words. 
+ +.UART1 register map +[cols="<6,<7,<10,^2,<18"] +[options="header",grid="all"] +|======================= +| Address | Name [C] | Bit(s), Name [C] | R/W | Function +.12+<| `0xffffffd0` .12+<| _UART1_CT_ <|`11:0` _UART_CT_BAUDxx_ ^| r/w <| 12-bit BAUD configuration value + <|`12` _UART_CT_SIM_MODE_ ^| r/w <| enable **simulation mode** + <|`20` _UART_CT_RTS_EN_ ^| r/w <| enable RTS hardware flow control + <|`21` _UART_CT_CTS_EN_ ^| r/w <| enable CTS hardware flow control + <|`22` _UART_CT_PMODE0_ ^| r/w .2+<| parity bit enable and configuration (`00`/`01`= no parity; `10`=even parity; `11`=odd parity) + <|`23` _UART_CT_PMODE1_ ^| r/w + <|`24` _UART_CT_PRSC0_ ^| r/w .3+<| 3-bit baudrate clock prescaler select + <|`25` _UART_CT_PRSC1_ ^| r/w + <|`26` _UART_CT_PRSC2_ ^| r/w + <|`27` _UART_CT_CTS_ ^| r/- <| current state of UART's CTS input signal + <|`28` _UART_CT_EN_ ^| r/w <| UART enable + <|`31` _UART_CT_TX_BUSY_ ^| r/- <| transmitter busy flag +.6+<| `0xffffffd4` .6+<| _UART1_DATA_ <|`7:0` _UART_DATA_MSB_ : _UART_DATA_LSB_ ^| r/w <| receive/transmit data (8-bit) + <|`31:0` - ^| -/w <| **simulation data output** + <|`28` _UART_DATA_PERR_ ^| r/- <| RX parity error + <|`29` _UART_DATA_FERR_ ^| r/- <| RX data frame error (stop bit not set) + <|`30` _UART_DATA_OVERR_ ^| r/- <| RX data overrun + <|`31` _UART_DATA_AVAIL_ ^| r/- <| RX data available when set +|======================= Index: docs/src_adoc/soc_wdt.adoc =================================================================== --- docs/src_adoc/soc_wdt.adoc (nonexistent) +++ docs/src_adoc/soc_wdt.adoc (revision 57) @@ -0,0 +1,69 @@ +<<< +:sectnums: +==== Watchdog Timer (WDT) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_wdt.vhd | +| Software driver file(s): | neorv32_wdt.c | +| | neorv32_wdt.h | +| Top entity port: | none | +| Configuration generics: | _IO_WDT_EN_ | implement watchdog timer when _true_ +| CPU interrupts: | FIRQ channel 0 | watchdog 
timer overflow (see <<_processor_interrupts>>) +|======================= + +**Theory of Operation** + +The watchdog (WDT) provides a last resort for safety-critical applications. The WDT has an internal 20-bit +wide counter that needs to be reset every now and then by the user program. If the counter overflows, either +a system reset or an interrupt is generated (depending on the configured operation mode). + +Configuration of the watchdog is done by a single control register _WDT_CT_. The watchdog is enabled by +setting the _WDT_CT_EN_ bit. The clock used to increment the internal counter is selected via the 3-bit +_WDT_CT_CLK_SELx_ prescaler: + +[cols="^3,^3,>4"] +[options="header",grid="rows"] +|======================= +| **`WDT_CT_CLK_SELx`** | Main clock prescaler | Timeout period in clock cycles +| `0b000` | 2 | 2 097 152 +| `0b001` | 4 | 4 194 304 +| `0b010` | 8 | 8 388 608 +| `0b011` | 64 | 67 108 864 +| `0b100` | 128 | 134 217 728 +| `0b101` | 1024 | 1 073 741 824 +| `0b110` | 2048 | 2 147 483 648 +| `0b111` | 4096 | 4 294 967 296 +|======================= + +Whenever the internal timer overflows the watchdog executes one of two possible actions: Either a hard +processor reset is triggered or an interrupt is requested at CPU's fast interrupt channel #0. The +WDT_CT_MODE bit defines the action to be taken on an overflow: When cleared, the Watchdog will trigger an +IRQ, when set the WDT will cause a system reset. The configured actions can also be triggered manually at +any time by setting the _WDT_CT_FORCE_ bit. The watchdog is reset by setting the _WDT_CT_RESET_ bit. + +The cause of the last action of the watchdog can be determined via the _WDT_CT_RCAUSE_ flag. If this flag is +zero, the processor has been reset via the external reset signal. If this flag is set the last system reset was +initiated by the watchdog. + +The Watchdog control register can be locked in order to protect the current configuration. 
The lock is +activated by setting bit _WDT_CT_LOCK_. In the locked state any write access to the configuration flags is +ignored (see table below, "Writable if locked"). Read accesses to the control register are not affected. The +lock can only be removed by a system reset (via external reset signal or via a watchdog reset action). + +.WDT register map +[cols="<2,<2,<4,^1,^2,<4"] +[options="header",grid="all"] +|======================= +| Address | Name [C] | Bit(s), Name [C] | R/W | Writable if locked | Function +.9+<| `0xffffff8c` .9+<| _WDT_CT_ <|`0` _WDT_CT_EN_ ^| r/w ^| no <| watchdog enable + <|`1` _WDT_CT_CLK_SEL0_ ^| r/w ^| no .3+<| 3-bit clock prescaler select + <|`2` _WDT_CT_CLK_SEL1_ ^| r/w ^| no + <|`3` _WDT_CT_CLK_SEL2_ ^| r/w ^| no + <|`4` _WDT_CT_MODE_ ^| r/w ^| no <| overflow action: `1`=reset, `0`=IRQ + <|`5` _WDT_CT_RCAUSE_ ^| r/- ^| - <| cause of last system reset: `0`=caused by external reset signal, `1`=caused by watchdog + <|`6` _WDT_CT_RESET_ ^| -/w ^| yes <| watchdog reset when set, auto-clears + <|`7` _WDT_CT_FORCE_ ^| -/w ^| yes <| force configured watchdog action when set, auto-clears + <|`8` _WDT_CT_LOCK_ ^| r/w ^| no <| lock access to configuration when set, clears only on system reset (via external reset signal OR watchdog reset action = reset) +|======================= Index: docs/src_adoc/soc_wishbone.adoc =================================================================== --- docs/src_adoc/soc_wishbone.adoc (nonexistent) +++ docs/src_adoc/soc_wishbone.adoc (revision 57) @@ -0,0 +1,158 @@ +<<< +:sectnums: +==== Processor-External Memory Interface (WISHBONE) (AXI4-Lite) + +[cols="<3,<3,<4"] +[frame="topbot",grid="none"] +|======================= +| Hardware source file(s): | neorv32_wishbone.vhd | +| Software driver file(s): | none | _implicitly used_ +| Top entity port: | `wb_tag_o` | request tag output (3-bit) +| | `wb_adr_o` | address output (32-bit) +| | `wb_dat_i` | data input (32-bit) +| | `wb_dat_o` | data output (32-bit) +| 
| `wb_we_o` | write enable (1-bit) +| | `wb_sel_o` | byte enable (4-bit) +| | `wb_stb_o` | strobe (1-bit) +| | `wb_cyc_o` | valid cycle (1-bit) +| | `wb_lock_o` | exclusive access request (1-bit) +| | `wb_ack_i` | acknowledge (1-bit) +| | `wb_err_i` | bus error (1-bit) +| | `fence_o` | an executed `fence` instruction +| | `fencei_o` | an executed `fence.i` instruction +| Configuration generics: | _MEM_EXT_EN_ | enable external memory interface when _true_ +| | _MEM_EXT_TIMEOUT_ | number of clock cycles after which an unacknowledged external bus access will auto-terminate (0 = disabled) +| Configuration constants in VHDL package file `neorv32_package.vhd`: | `wb_pipe_mode_c` | when _false_ (default): classic/standard Wishbone protocol; when _true_: pipelined Wishbone protocol +| | `xbus_big_endian_c` | byte-order (Endianness) of external memory interface (true=BIG (default), false=little) +| CPU interrupts: | none | +|======================= + +The external memory interface uses the Wishbone interface protocol. The external interface port is available +when the _MEM_EXT_EN_ generic is _true_. This interface can be used to attach external memories, custom +hardware accelerators additional IO devices or all other kinds of IP blocks. All memory accesses from the +CPU, that do not target the internal bootloader ROM, the internal IO region or the internal data/instruction +memories (if implemented at all) are forwarded to the Wishbone gateway and thus to the external memory +interface. + +[TIP] +When using the default processor setup, all access addresses between 0x00000000 and +0xffff0000 (= beginning of processor-internal BOOT ROM) are delegated to the external memory +/ bus interface if they are not targeting the (actually enabled/implemented) processor-internal +instruction memory (IMEM) or the (actually enabled/implemented) processor-internal data memory +(DMEM). See section <<_address_space>> for more information. 
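The address decoding rule from the tip above can be sketched in C as follows. This is only an illustration: the IMEM/DMEM base addresses and sizes are example values assumed for this sketch (the real ones depend on the processor configuration), everything from `0xffff0000` upwards is treated as processor-internal, and all macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Example memory layout; base addresses and sizes are ASSUMPTIONS for this sketch. */
#define IMEM_BASE    0x00000000u
#define IMEM_SIZE    0x00004000u /* 16 kB */
#define DMEM_BASE    0x80000000u
#define DMEM_SIZE    0x00002000u /* 8 kB */
#define BOOTROM_BASE 0xFFFF0000u /* beginning of processor-internal BOOT ROM */

/* True if addr lies inside [base, base + size). */
static bool in_region(uint32_t addr, uint32_t base, uint32_t size) {
  return (addr >= base) && ((addr - base) < size);
}

/* True if an access to addr would be delegated to the external memory interface. */
static bool is_external_access(uint32_t addr) {
  if (addr >= BOOTROM_BASE) {
    return false; /* BOOT ROM and internal IO region (simplified) */
  }
  if (in_region(addr, IMEM_BASE, IMEM_SIZE)) {
    return false; /* processor-internal instruction memory */
  }
  if (in_region(addr, DMEM_BASE, DMEM_SIZE)) {
    return false; /* processor-internal data memory */
  }
  return true;
}
```

With this example layout, `is_external_access(0x00004000)` is true because the address is the first one behind the assumed IMEM, while `is_external_access(0x80001FFF)` is false because it still hits the assumed DMEM.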
+ +**Wishbone Bus Protocol** + +The external memory interface either uses **standard** ("classic") Wishbone transactions (default) or +**pipelined** Wishbone transactions. The transaction protocol is configured via the `wb_pipe_mode_c` constant +in the main VHDL package file (`rtl/neorv32_package.vhd`): + +[source,vhdl] +---- +-- (external) bus interface -- +constant wb_pipe_mode_c : boolean := false; +---- + +When `wb_pipe_mode_c` is disabled, all bus control signals including _STB_ are active (and stable) until the +transfer is acknowledged/terminated. If `wb_pipe_mode_c` is enabled, all bus control signals except _STB_ are active +(and stable) until the transfer is acknowledged/terminated. In this case, _STB_ is active only during the very +first bus clock cycle. + +.Exemplary Wishbone bus accesses using "classic" and "pipelined" protocol +[cols="^2,^2"] +[grid="none"] +|======================= +| image:../figures/wishbone_classic_read.png[700,300] + +**Classic** Wishbone read access +| image:../figures/wishbone_pipelined_write.png[700,300] + +**Pipelined** Wishbone write access +|======================= + + +[TIP] +A detailed description of the implemented Wishbone bus protocol and the according interface signals +can be found in the data sheet "Wishbone B4 – WISHBONE System-on-Chip (SoC) Interconnection +Architecture for Portable IP Cores". A copy of this document can be found in the docs folder of this +project. + +**Interface Latency** + +The Wishbone gateway introduces two additional latency cycles: Processor-outgoing and -incoming signals +are fully registered. Thus, any access from the CPU to a processor-external device requires +2 clock cycles. + +**Bus Access Timeout** + +The Wishbone bus interface provides an option to configure a bus access timeout counter. The _MEM_EXT_TIMEOUT_ +top generic is used to specify the _maximum_ time (in clock cycles) a bus access can be pending before it is automatically +terminated. 
If _MEM_EXT_TIMEOUT_ is set to zero, the timeout is disabled and a bus access can take an arbitrary number of cycles to complete. + +When _MEM_EXT_TIMEOUT_ is greater than zero, the Wishbone adapter starts an internal countdown whenever the CPU +accesses a memory address via the external memory interface. If the accessed memory / device does not acknowledge (via `wb_ack_i`) +or terminate (via `wb_err_i`) the transfer within _MEM_EXT_TIMEOUT_ clock cycles, the bus access is automatically canceled +(setting `wb_cyc_o` low again) and a load/store/instruction fetch bus access fault exception is raised. + +[TIP] +This feature can be used as **safety guard** if the external memory system does not check for "address space holes". That means that addresses, which +do not belong to a certain memory or device, do not permanently stall the processor due to an unacknowledged/unterminated bus access. If the external +memory system can guarantee to acknowledge or terminate **any** bus access (even if it targets an unimplemented address) the timeout feature should be disabled +(_MEM_EXT_TIMEOUT_ = 0). + +**Wishbone Tag** + +The 3-bit Wishbone `wb_tag_o` signal provides additional information regarding the access type. This signal +is compatible with the AXI4 _AxPROT_ signal. + +* `wb_tag_o(0)` 1: privileged access (CPU is in machine mode); 0: unprivileged access +* `wb_tag_o(1)` always zero (indicating "secure access") +* `wb_tag_o(2)` 1: instruction fetch access, 0: data access + +**Exclusive / Atomic Bus Access** + +If the atomic memory access CPU extension (via _CPU_EXTENSION_RISCV_A_) is enabled, the CPU can +request an atomic/exclusive bus access via the external memory interface. + +The load-reserved instruction (`lr.w`) will set the `wb_lock_o` signal telling the bus interconnect to establish a +reservation for the currently accessed address (start of an exclusive access). This signal will stay asserted until +another memory access instruction is executed (for example a `sc.w`). 
+ +The memory system has to make sure that no other entity can access the reserved address until `wb_lock_o` +is released again. If this attempt fails, the memory system has to assert `wb_err_i` in order to indicate that the +reservation was broken. + +[TIP] +See section <<_bus_interface>> for the CPU bus interface protocol. + +**Endianness** + +The NEORV32 CPU and the Processor setup are BIG-endian architectures. However, to allow a connection +to a little-endian memory system the external bus interface provides an Endianness configuration. The +Endianness can be configured via the global `xbus_big_endian_c` constant in the main VHDL package file +(`rtl/neorv32_package.vhd`). By default, the external memory interface uses BIG-endian byte-order. + +[source,vhdl] +---- +-- (external) bus interface -- +constant xbus_big_endian_c : boolean := true; +---- + +Application software can check the Endianness configuration of the external bus interface via the +_SYSINFO_FEATURES_MEM_EXT_ENDIAN_ flag in the processor's SYSINFO module (see section +<<_system_configuration_information_memory_sysinfo>> for more information). + +**AXI4-Lite Connectivity** + +The AXI4-Lite wrapper (`rtl/top_templates/neorv32_top_axi4lite.vhd`) provides a Wishbone-to- +AXI4-Lite bridge, compatible with Xilinx Vivado (IP packager and block design editor). All entity signals of +this wrapper are of type _std_logic_ or _std_logic_vector_, respectively. + +The AXI Interface has been verified using Xilinx Vivado IP Packager and Block Designer. The AXI +interface port signals are automatically detected when packaging the core. + +.Example AXI SoC using Xilinx Vivado +image:../figures/neorv32_axi_soc.png[] + +[WARNING] +Using the auto-termination timeout feature (_MEM_EXT_TIMEOUT_ greater than zero) is **not AXI4 compliant** as the AXI protocol does not support canceling of +bus transactions. 
Therefore, the NEORV32 top wrapper with AXI4-Lite interface (`rtl/top_templates/neorv32_top_axi4lite`) configures _MEM_EXT_TIMEOUT_ = 0 by default. + + Index: docs/src_adoc/software.adoc =================================================================== --- docs/src_adoc/software.adoc (nonexistent) +++ docs/src_adoc/software.adoc (revision 57) @@ -0,0 +1,597 @@ +:sectnums: +== Software Framework + +To make actual use of the NEORV32 processor, the project comes with a complete software eco-system. This +ecosystem is based on the RISC-V port of the GCC GNU Compiler Collection and consists of the following elementary parts: + +[cols="<6,<4"] +[grid="none"] +|======================= +| Application/bootloader start-up code | `sw/common/crt0.S` +| Application/bootloader linker script | `sw/common/neorv32.ld` +| Core hardware driver libraries | `sw/lib/include/` & `sw/lib/source/` +| Makefiles | e.g. `sw/example/blink_led/makefile` +| Auxiliary tool for generating NEORV32 executables | `sw/image_gen/` +| Default bootloader | `sw/bootloader/bootloader.c` +|======================= + +Last but not least, the NEORV32 ecosystem provides some example programs for testing the hardware, for +illustrating the usage of peripherals and for general getting in touch with the project (`sw/example`). + +// #################################################################################################################### +:sectnums: +=== Compiler Toolchain + +The toolchain for this project is based on the free RISC-V GCC-port. You can find the compiler sources and +build instructions on the official RISC-V GNU toolchain GitHub page: https://github.com/riscv/riscv-gnutoolchain. + +The NEORV32 implements a 32-bit base integer architecture (`rv32i`) and a 32-bit integer and soft-float ABI +(ilp32), so make sure you build an according toolchain. 
+ +Alternatively, you can download my prebuilt `rv32i/e` toolchains for 64-bit x86 Linux from: https://github.com/stnolting/riscv-gcc-prebuilt + +The default toolchain prefix used by the project's makefiles is (can be changed in the makefiles): **`riscv32-unknown-elf`** + +[TIP] +More information regarding the toolchain (building from scratch or downloading the prebuilt ones) +can be found in section <<_toolchain_setup>>. + + + +<<< +// #################################################################################################################### +:sectnums: +=== Core Libraries + +The NEORV32 project provides a set of C libraries that allows an easy usage of the processor/CPU features. +Just include the main NEORV32 library file in your application's source file(s): + +[source,c] +---- +#include <neorv32.h> +---- + +Together with the makefile, this will automatically include all the processor's header files located in +`sw/lib/include` into your application. The actual source files of the core libraries are located in +`sw/lib/source` and are automatically included into the source list of your software project. The following +files are currently part of the NEORV32 core library: + +[cols="<3,<4,<8"] +[options="header",grid="rows"] +|======================= +| C source file | C header file | Description +| - | `neorv32.h` | main NEORV32 definitions and library file +| `neorv32_cfs.c` | `neorv32_cfs.h` | HW driver (stub)footnote:[This driver file only represents a stub, since the real CFS drivers are defined by the actual CFS implementation.] 
functions for the custom functions subsystem +| `neorv32_cpu.c` | `neorv32_cpu.h` | HW driver functions for the NEORV32 **CPU** +| `neorv32_gpio.c` | `neorv32_gpio.h` | HW driver functions for the **GPIO** +| - | `neorv32_intrinsics.h` | macros for custom intrinsics/instructions +| `neorv32_mtime.c` | `neorv32_mtime.h` | HW driver functions for the **MTIME** +| `neorv32_nco.c` | `neorv32_nco.h` | HW driver functions for the **NCO** +| `neorv32_neoled.c` | `neorv32_neoled.h` | HW driver functions for the **NEOLED** +| `neorv32_pwm.c` | `neorv32_pwm.h` | HW driver functions for the **PWM** +| `neorv32_rte.c` | `neorv32_rte.h` | NEORV32 **runtime environment** and helpers +| `neorv32_spi.c` | `neorv32_spi.h` | HW driver functions for the **SPI** +| `neorv32_trng.c` | `neorv32_trng.h` | HW driver functions for the **TRNG** +| `neorv32_twi.c` | `neorv32_twi.h` | HW driver functions for the **TWI** +| `neorv32_uart.c` | `neorv32_uart.h` | HW driver functions for the **UART0** and **UART1** +| `neorv32_wdt.c` | `neorv32_wdt.h` | HW driver functions for the **WDT** +|======================= + +.Documentation +[TIP] +All core library software sources are highly documented using _doxygen_. See section <>. +The documentation is automatically built and deployed to GitHub pages by the CI workflow (:https://stnolting.github.io/neorv32/files.html). + + + + +<<< +// #################################################################################################################### +:sectnums: +=== Application Makefile + +Application compilation is based on **GNU makefiles**. Each project in the `sw/example` folder features +a makefile. All these makefiles are identical. When creating a new project, copy an existing project folder or +at least the makefile to your new project folder. I suggest to create new projects also in `sw/example` to keep +the file dependencies. 
Of course, these dependencies can be manually configured via makefiles variables +when your project is located somewhere else. + +Before you can use the makefiles, you need to install the RISC-V GCC toolchain. Also, you have to add the +installation folder of the compiler to your system's `PATH` variable. More information can be found in chapter +<<_lets_get_it_started>>. + +The makefile is invoked by simply executing make in your console: + +[source,bash] +---- +neorv32/sw/example/blink_led$ make +---- + +:sectnums: +==== Targets + +Just executing `make` will show the help menu showing all available targets. The following targets are +available: + +[cols="<3,<15"] +[grid="none"] +|======================= +| `help` | Show a short help text explaining all available targets. +| `check` | Check the compiler toolchain. You should run this target at least once after installing the toolchain. +| `info` | Show the makefile configuration (see next chapter). +| `exe` | Compile all sources and generate application executable for upload via bootloader. +| `install` | Compile all sources, generate executable (via exe target) for upload via bootloader and generate and install IMEM VHDL initialization image file `rtl/core/neorv32_application_image.vhd`. +| `all` | Execute `exe` and `install`. +| `clean` | Remove all generated files in the current folder. +| `clean_all` | Remove all generated files in the current folder and also removes the compiled core libraries and the compiled image generator tool. +| `bootloader` | Compile all sources, generate executable and generate and install BOOTROM VHDL initialization image file `rtl/core/neorv32_bootloader_image.vhd`. This target modifies the ROM origin and length in the linker script by setting the `make_bootloader` define. 
+| `upload` | Upload NEORV32 executable to the bootloader via serial port +|======================= + +[TIP] +An assembly listing file (`main.asm`) is created by the compilation flow for further analysis or debugging purpose. + +:sectnums: +==== Configuration + +The compilation flow is configured via variables right at the beginning of the makefile: + +[source,makefile] +---- +# ***************************************************************************** +# USER CONFIGURATION +# ***************************************************************************** +# User's application sources (*.c, *.cpp, *.s, *.S); add additional files here +APP_SRC ?= $(wildcard ./*.c) $(wildcard ./*.s) $(wildcard ./*.cpp) $(wildcard ./*.S) +# User's application include folders (don't forget the '-I' before each entry) +APP_INC ?= -I . +# User's application include folders - for assembly files only (don't forget the '-I' before each entry) +ASM_INC ?= -I . +# Optimization +EFFORT ?= -Os +# Compiler toolchain +RISCV_TOOLCHAIN ?= riscv32-unknown-elf +# CPU architecture and ABI +MARCH ?= -march=rv32i +MABI ?= -mabi=ilp32 +# User flags for additional configuration (will be added to compiler flags) +USER_FLAGS ?= +# Serial port for executable upload via bootloader +COM_PORT ?= /dev/ttyUSB0 +# Relative or absolute path to the NEORV32 home folder +NEORV32_HOME ?= ../../.. +# ***************************************************************************** +---- + +[cols="<3,<10"] +[grid="none"] +|======================= +| _APP_SRC_ | The source files of the application (`*.c`, `*.cpp`, `*.S` and `*.s` files are allowed; files of these types in the project folder are automatically added via wildcards). Additional files can be added; separated by white spaces +| _APP_INC_ | Include file folders; separated by white spaces; must be defined with `-I` prefix +| _ASM_INC_ | Include file folders that are used only for the assembly source files (`*.S`/`*.s`). 
+| _EFFORT_ | Optimization level, optimize for size (`-Os`) is default; legal values: `-O0`, `-O1`, `-O2`, `-O3`, `-Os` +| _RISCV_TOOLCHAIN_ | The toolchain prefix to be used; follows the naming convention "architecture-vendor-output" +| _MARCH_ | The targeted RISC-V architecture/ISA. Only `rv32` is supported by the NEORV32. Enable compiler support of optional CPU extension by adding the according extension letter (e.g. `rv32im` for _M_ CPU extension). See section <<_enabling_risc_v_cpu_extensions>>. +| _MABI_ | The default 32-bit integer ABI. +| _USER_FLAGS_ | Additional flags that will be forwarded to the compiler tools +| _NEORV32_HOME_ | Relative or absolute path to the NEORV32 project home folder. Adapt this if the makefile/project is not in the project's `sw/example` folder. +| _COM_PORT_ | Default serial port for executable upload to bootloader. +|======================= + +:sectnums: +==== Default Compiler Flags + +The following default compiler flags are used for compiling an application. These flags are defined via the +`CC_OPTS` variable. Custom flags can be appended via the `USER_FLAGS` variable to the `CC_OPTS` variable. + +[cols="<3,<9"] +[grid="none"] +|======================= +| `-Wall` | Enable all compiler warnings. +| `-ffunction-sections` | Put functions and data segment in independent sections. This allows a code optimization as dead code and unused data can be easily removed. +| `-nostartfiles` | Do not use the default start code. The makefiles use the NEORV32-specific start-up code instead (`sw/common/crt0.S`). +| `-Wl,--gc-sections` | Make the linker perform dead code elimination. +| `-lm` | Include/link with `math.h`. +| `-lc` | Search for the standard C library when linking. +| `-lgcc` | Make sure we have no unresolved references to internal GCC library subroutines. +| `-mno-fdiv` | Use builtin software functions for floating-point divisions and square roots (since the according instructions are not supported yet). 
+| `-falign-functions=4` .4+| Force a 32-bit alignment of functions and labels (branch/jump/call targets). This increases performance as it simplifies instruction fetch when using the C extension. As a drawback this will also slightly increase the program code size. +| `-falign-labels=4` +| `-falign-loops=4` +| `-falign-jumps=4` +|======================= + +[TIP] +The makefile configuration variables can be (re-)defined directly when invoking the makefile. For +example: `$ make MARCH=-march=rv32ic clean_all exe` + + +<<< +// #################################################################################################################### +:sectnums: +=== Executable Image Format + +When all the application sources have been compiled and linked, a final executable file has to be generated. +For this purpose, the makefile uses the NEORV32-specific linker script `sw/common/neorv32.ld` to link +all the sections into only four final sections: `.text`, `.rodata`, `.data` and `.bss`. These four sections contain +everything required for the application to run: + +[cols="<1,<9"] +[grid="none"] +|======================= +| `.text` | Executable instructions generated from the start-up code and all application sources. +| `.rodata` | Constants (like strings) from the application; also the initial data for initialized variables. +| `.data` | This section is required for the address generation of fixed (= global) variables only. +| `.bss` | This section is required for the address generation of dynamic memory constructs only. +|======================= + +The `.text` and `.rodata` sections are mapped to the processor's instruction memory space and the `.data` and +`.bss` sections are mapped to the processor's data memory space. Finally, the `.text`, `.rodata` and `.data` sections are extracted and concatenated into a single file +**`main.bin`**. 
+ +**Executable Image Generator** + +The **`main.bin`** file is processed by the NEORV32 image generator (`sw/image_gen`) to generate the final +executable. It is automatically compiled when invoking the makefile. The image generator can generate three +types of executables, selected by a flag when calling the generator: + +[cols="<1,<9"] +[grid="none"] +|======================= +| `-app_bin` | Generates an executable binary file `neorv32_exe.bin` (for UART uploading via the bootloader). +| `-app_img` | Generates an executable VHDL memory initialization image for the processor-internal IMEM. This option generates the `rtl/core/neorv32_application_image.vhd` file. +| `-bld_img` | Generates an executable VHDL memory initialization image for the processor-internal BOOT ROM. This option generates the `rtl/core/neorv32_bootloader_image.vhd` file. +|======================= + +All these options are managed by the makefile – so you don't actually have to think about them. The normal +application compilation flow will generate the `neorv32_exe.bin` file in the current software project folder +ready for upload via UART to the NEORV32 bootloader. + +The actual executable provides a very small header consisting of three 32-bit words located right at the +beginning of the file. This header is generated by the image generator. The first word of the executable is the signature +word and is always `0x4788cafe`. Based on this word, the bootloader can identify a valid image file. The next word represents the size of the actual program +image in bytes. A simple "complement" checksum of the actual program image is given by the third word. This +provides a simple protection against data transmission or storage errors. 
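The three-word header described above can be illustrated with a small C sketch. Note the assumptions: the text only speaks of a simple "complement" checksum, so this sketch assumes the two's complement of the 32-bit sum of all image words (so that sum plus checksum wraps to zero); it also works on already decoded 32-bit words and ignores the byte order inside the file. The type and function names are hypothetical and are not taken from the image generator or the bootloader sources.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXE_SIGNATURE 0x4788CAFEu /* signature word stated in the text */

/* Three-word header as described above (type name is hypothetical). */
typedef struct {
  uint32_t signature; /* always 0x4788cafe */
  uint32_t size;      /* size of the program image in bytes */
  uint32_t checksum;  /* "complement" checksum of the program image */
} exe_header_t;

/* 32-bit wrap-around sum of all image words. */
static uint32_t exe_word_sum(const uint32_t *image, size_t num_words) {
  uint32_t sum = 0;
  for (size_t i = 0; i < num_words; i++) {
    sum += image[i];
  }
  return sum;
}

/* ASSUMPTION: checksum = two's complement of the word sum. */
static exe_header_t exe_make_header(const uint32_t *image, size_t num_words) {
  exe_header_t h;
  h.signature = EXE_SIGNATURE;
  h.size = (uint32_t)(num_words * sizeof(uint32_t));
  h.checksum = (uint32_t)(~exe_word_sum(image, num_words) + 1u);
  return h;
}

/* Checks signature, size and checksum of a received image. */
static bool exe_header_valid(const exe_header_t *h, const uint32_t *image, size_t num_words) {
  if (h->signature != EXE_SIGNATURE) {
    return false;
  }
  if (h->size != num_words * sizeof(uint32_t)) {
    return false;
  }
  return (uint32_t)(exe_word_sum(image, num_words) + h->checksum) == 0u;
}
```

Under this assumption a single flipped bit in the image changes the word sum, so the final comparison fails and the image is rejected, which matches the "simple protection" mentioned in the text.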
+ + + +<<< +// #################################################################################################################### +:sectnums: +=== Bootloader + +The default bootloader (`sw/bootloader/bootloader.c`) of the NEORV32 processor allows uploading +new program executables at any time. If there is an external SPI flash connected to the processor (like the +FPGA's configuration memory), the bootloader can store the program executable to it. After reset, the +bootloader can directly boot from the flash without any user interaction. + +[WARNING] +The bootloader is only implemented when the BOOTLOADER_EN generic is true and requires the +CSR access CPU extension (CPU_EXTENSION_RISCV_Zicsr generic is true). + +[IMPORTANT] +The bootloader requires the primary UART (UART0) for user interaction (_IO_UART0_EN_ generic is _true_). + +[IMPORTANT] +For the automatic boot from an SPI flash, the SPI controller has to be implemented (_IO_SPI_EN_ +generic is _true_) and the machine system timer MTIME has to be implemented (_IO_MTIME_EN_ +generic is _true_), too, to allow an auto-boot timeout counter. + +[WARNING] +The bootloader is intended to work independent of the actual hardware (-configuration). Hence, it +should be compiled with the minimal base ISA only. The current version of the bootloader uses the +`rv32i` ISA – so it will not work on `rv32e` architectures. To make the bootloader work on an embedded +CPU configuration or on any other more sophisticated configuration, recompile it using the according ISA +(see section <<_customizing_the_internal_bootloader>>). + +To interact with the bootloader, connect the primary UART (UART0) signals (`uart0_txd_o` and +`uart0_rxd_i`) of the processor's top entity via a serial port (-adapter) to your computer (hardware flow control is +not used, so the according interface signals can be ignored), configure your +terminal program using the following settings and perform a reset of the processor. 
+
+Terminal console settings (`19200-8-N-1`):
+
+* 19200 Baud
+* 8 data bits
+* no parity bit
+* 1 stop bit
+* newline on `\r\n` (carriage return, newline)
+* no transfer protocol / control flow protocol - just the raw byte stuff
+
+The bootloader uses the LSB of the top entity's `gpio_o` output port as high-active status LED (all other
+output pins are set to low level by the bootloader). After reset, this LED will start blinking at ~2Hz and the
+following intro screen should show up in your terminal:
+
+[source]
+----
+<< NEORV32 Bootloader >>
+
+BLDV: Mar 23 2021
+HWV: 0x01050208
+CLK: 0x05F5E100
+USER: 0x10000DE0
+MISA: 0x40901105
+ZEXT: 0x00000023
+PROC: 0x0EFF0037
+IMEM: 0x00004000 bytes @ 0x00000000
+DMEM: 0x00002000 bytes @ 0x80000000
+
+Autoboot in 8s. Press key to abort.
+----
+
+This start-up screen also gives some brief information about the bootloader and several system configuration parameters:
+
+[cols="<2,<15"]
+[grid="none"]
+|=======================
+| `BLDV` | Bootloader version (build date).
+| `HWV` | Processor hardware version (from the `mimpid` CSR) in BCD format (example: `0x01040606` = v1.4.6.6).
+| `USER` | Custom user code (from the _USER_CODE_ generic).
+| `CLK` | Processor clock speed in Hz (via the SYSINFO module, from the _CLOCK_FREQUENCY_ generic).
+| `MISA` | CPU extensions (from the `misa` CSR).
+| `ZEXT` | CPU sub-extensions (from the `mzext` CSR).
+| `PROC` | Processor configuration (via the SYSINFO module, from the IO_* and MEM_* configuration generics).
+| `IMEM` | IMEM memory base address and size in bytes (from the _MEM_INT_IMEM_SIZE_ generic).
+| `DMEM` | DMEM memory base address and size in bytes (from the _MEM_INT_DMEM_SIZE_ generic).
+|=======================
+
+Now you have 8 seconds to press any key. Otherwise, the bootloader starts the auto boot sequence. When
+you press any key within the 8 seconds, the actual bootloader user console starts:
+
+[source]
+----
+<< NEORV32 Bootloader >>
+
+BLDV: Mar 23 2021
+HWV: 0x01050208
+CLK: 0x05F5E100
+USER: 0x10000DE0
+MISA: 0x40901105
+ZEXT: 0x00000023
+PROC: 0x0EFF0037
+IMEM: 0x00004000 bytes @ 0x00000000
+DMEM: 0x00002000 bytes @ 0x80000000
+
+Autoboot in 8s. Press key to abort.
+Aborted.
+
+Available commands:
+h: Help
+r: Restart
+u: Upload
+s: Store to flash
+l: Load from flash
+e: Execute
+CMD:>
+----
+
+The auto-boot countdown is stopped and now you can enter a command from the list to perform the
+corresponding operation:
+
+* `h`: Show the help text (again)
+* `r`: Restart the bootloader and the auto-boot sequence
+* `u`: Upload new program executable (`neorv32_exe.bin`) via UART into the instruction memory
+* `s`: Store executable to SPI flash at `spi_csn_o(0)`
+* `l`: Load executable from SPI flash at `spi_csn_o(0)`
+* `e`: Start the application, which is currently stored in the instruction memory (IMEM)
+* `#`: Shortcut for executing `u` followed by `e` (not shown in help menu)
+
+A new executable can be uploaded via UART by executing the `u` command. After that, the executable can be directly
+executed via the `e` command. To store the recently uploaded executable to an attached SPI flash press `s`. To
+directly load an executable from the SPI flash press `l`. The bootloader and the auto-boot sequence can be
+manually restarted via the `r` command.
+
+[TIP]
+The CPU is in machine level privilege mode after reset. When the bootloader boots an application,
+this application is also started in machine level privilege mode.
+
+:sectnums:
+==== External SPI Flash for Booting
+
+If you want the NEORV32 bootloader to automatically fetch and execute an application at system start, you
+can store it to an external SPI flash. The advantage of the external memory is to have a non-volatile program
+storage, which can be re-programmed at any time just by executing some bootloader commands. Thus, no
+FPGA bitstream recompilation is required at all.
+
+**SPI Flash Requirements**
+
+The bootloader can access an SPI compatible flash via the processor top entity's SPI port, connected to
+chip select `spi_csn_o(0)`. The flash must be capable of operating at least at 1/8 of the processor's main
+clock. Only single read and write byte operations are used. The address has to be 24 bits long. Furthermore,
+the SPI flash has to support at least the following commands:
+
+* READ (`0x03`)
+* READ STATUS (`0x05`)
+* WRITE ENABLE (`0x06`)
+* PAGE PROGRAM (`0x02`)
+* SECTOR ERASE (`0xD8`)
+* READ ID (`0x9E`)
+
+Compatible (FPGA configuration) SPI flash memories are for example the "Winbond W25Q64FV2" or the "Micron N25Q032A".
+
+**SPI Flash Configuration**
+
+The base address `SPI_FLASH_BOOT_ADR` for the executable image inside the SPI flash is defined in the
+"user configuration" section of the bootloader source code (`sw/bootloader/bootloader.c`). Most
+FPGAs that use an external configuration flash store the golden configuration bitstream at base address 0.
+Make sure there is no address collision between the FPGA bitstream and the application image. You need to
+change the default sector size if your flash has a sector size greater or less than 64kB:
+
+[source,c]
+----
+/** SPI flash boot image base address */
+#define SPI_FLASH_BOOT_ADR 0x00800000
+/** SPI flash sector size in bytes */
+#define SPI_FLASH_SECTOR_SIZE (64*1024)
+----
+
+[IMPORTANT]
+For any change you make to the bootloader, you have to recompile the bootloader (see section
+<<_customizing_the_internal_bootloader>>) and do a new synthesis of the processor.
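The address constraints described above can be sketched as plain host-side C. This is an illustration only and is not taken from `bootloader.c`: the two macros are copied from the excerpt, while the helper names, the `BITSTREAM_REGION_SIZE` value and the assumption that the image base should sit on a sector boundary (because SECTOR ERASE always clears a whole sector) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the "user configuration" excerpt. */
#define SPI_FLASH_BOOT_ADR    0x00800000u
#define SPI_FLASH_SECTOR_SIZE (64u * 1024u)

/* Hypothetical: flash region reserved for the FPGA bitstream, starting at address 0. */
#define BITSTREAM_REGION_SIZE 0x00400000u

/* Returns 1 if adr lies on a sector boundary, else 0. */
static int boot_adr_is_sector_aligned(uint32_t adr) {
  return (adr % SPI_FLASH_SECTOR_SIZE) == 0u;
}

/* Returns 1 if the whole image [adr, adr + size) is reachable with 24-bit addresses. */
static int image_fits_24bit(uint32_t adr, uint32_t size) {
  return (size <= 0x01000000u) && (adr <= 0x01000000u - size);
}

/* Returns 1 if the image starts at or behind the end of the assumed bitstream region. */
static int image_clear_of_bitstream(uint32_t adr) {
  return adr >= BITSTREAM_REGION_SIZE;
}

/* Number of sectors needed to hold size bytes (rounded up). */
static uint32_t sectors_needed(uint32_t size) {
  return (size + SPI_FLASH_SECTOR_SIZE - 1u) / SPI_FLASH_SECTOR_SIZE;
}

/* Asserts all layout assumptions for an image of image_size bytes. */
static void check_boot_image_layout(uint32_t image_size) {
  assert(boot_adr_is_sector_aligned(SPI_FLASH_BOOT_ADR));
  assert(image_fits_24bit(SPI_FLASH_BOOT_ADR, image_size));
  assert(image_clear_of_bitstream(SPI_FLASH_BOOT_ADR));
}
```

With the defaults shown, a 16 kB image (`0x4000` bytes) occupies a single 64 kB sector starting at `0x00800000`, and `check_boot_image_layout(0x4000)` passes under these assumptions. Adjust `BITSTREAM_REGION_SIZE` to the real bitstream size of your FPGA before relying on the collision check.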
+
+
+:sectnums:
+==== Auto Boot Sequence
+When you reset the NEORV32 processor, the bootloader waits 8 seconds for a user console input before it
+starts the automatic boot sequence. This sequence tries to fetch a valid boot image from the external SPI
+flash, connected to SPI chip select `spi_csn_o(0)`. If a valid boot image is found and can be successfully
+transferred into the instruction memory, it is automatically started. If no SPI flash was detected or if there
+was no valid boot image found, the bootloader stalls and the status LED is permanently activated.
+
+
+:sectnums:
+==== Bootloader Error Codes
+
+If something goes wrong during bootloader operation, an error code is shown. In this case the processor
+stalls, a bell command and one of the following error codes are sent to the terminal, the bootloader status
+LED is permanently activated and the system must be reset manually.
+
+[cols="<2,<13"]
+[grid="rows"]
+|=======================
+| **`ERROR_0`** | If you try to transfer an invalid executable (via UART or from the external SPI flash), this error message shows up. There might be a transfer protocol configuration error in the terminal program. See section <<_uploading_and_starting_of_a_binary_executable_image_via_uart>> for more information. Also, if no SPI flash was found during an auto-boot attempt, this message will be displayed.
+| **`ERROR_1`** | Your program is too big for the processor-internal instruction memory. Increase the memory size or reduce (optimize!) your application code.
+| **`ERROR_2`** | This indicates a checksum error. Something went wrong during the transfer of the program image (upload via UART or loading from the external SPI flash). If the error was caused by a UART upload, just try it again. When the error was generated during a flash access, the stored image might be corrupted.
+| **`ERROR_3`** | This error occurs if the attached SPI flash cannot be accessed. Make sure you have the right type of flash and that it is properly connected to the NEORV32 SPI port using chip select #0.
+| **`ERROR_4`** | The instruction memory is marked as read-only. Set the _MEM_INT_IMEM_ROM_ generic to _false_ to allow write accesses.
+| **`ERROR_5`** | This error pops up when an unexpected exception or interrupt was triggered. The cause of the trap (`mcause` CSR) is displayed for further investigation.
+| **`ERROR_?`** | Something really bad happened and there is no specific error code available...
+|=======================
+
+
+
+<<<
+// ####################################################################################################################
+:sectnums:
+=== NEORV32 Runtime Environment
+
+The NEORV32 provides a minimal runtime environment (RTE) that mainly takes care of two things:
+
+* clean application start
+* stable and _safe_ execution environment (e.g. handling of exceptions/interrupts)
+
+[NOTE]
+Performance or latency-optimized applications or embedded operating systems should use a custom
+trap management.
+
+
+:sectnums:
+==== CRT0 Start-Up Code
+
+The initial part of the runtime environment is the `sw/common/crt0.S` application start-up code. This piece
+of code is automatically linked with every application program and represents the starting point for every
+application - regardless of whether you are using the actual RTE in your application or not. The start-up code is directly executed after a reset.
+It performs the following operations to bring the CPU (and the SoC) into a stable and initialized state:
+
+* Initialize integer registers `x1` – `x15`/`x31`.
+* Initialize all CPU core CSRs.
+* Initialize the global pointer `gp` and the stack pointer `sp` according to the `.data` segment layout provided by the linker script.
+* Clear IO area: Write zero to all memory-mapped registers within the IO region. If certain devices have not been implemented, a bus access fault exception will occur. This exception is captured by a simple dummy handler in the start-up code.
+* Clear the `.bss` section defined by the linker script.
+* Copy read-only data from the `.text` section to the `.data` section to set initialized variables.
+* Call the application's `main` function (with no arguments).
+* If the `main` function returns, the processor goes to an endless sleep mode (using a simple loop or via the `wfi` instruction if available).
+
+
+:sectnums:
+==== Using the NEORV32 Runtime Environment (RTE) in Your Application
+
+When execution enters the application's `main` function, the actual runtime environment is responsible for catching all implemented exceptions
+and interrupts. To activate the NEORV32 RTE execute the following function:
+
+[source,c]
+----
+void neorv32_rte_setup(void);
+----
+
+This setup initializes the `mtvec` CSR, which provides the base entry point for all trap
+handlers. The address stored to this register reflects the first-level exception handler provided by the
+NEORV32 RTE. Whenever an exception or interrupt is triggered, this first-level handler is called.
+
+The first-level handler performs a complete context save, analyzes the source of the exception/interrupt and
+calls the corresponding second-level exception handler, which actually takes care of the exception/interrupt
+handling. For this, the RTE manages a private look-up table to store the addresses of the corresponding trap
+handlers.
+
+After the initial setup of the RTE, each entry in the trap handler's look-up table is initialized with a debug
+handler, that outputs detailed hardware information via the **primary UART (UART0)** when triggered. This
+is intended as a fall-back for debugging or for accidentally-triggered exceptions/interrupts.
+For instance, an illegal instruction exception caught by the RTE debug handler might look like this in the UART0 output:
+
+[source]
+----
+ Illegal instruction @0x000002d6, MTVAL=0x00001537
+----
+
+To install the **actual application's trap handlers** the NEORV32 RTE provides functions for installing and
+un-installing trap handlers for each implemented exception/interrupt source.
+
+[source,c]
+----
+int neorv32_rte_exception_install(uint8_t id, void (*handler)(void));
+----
+
+[cols="<5,<12"]
+[options="header",grid="rows"]
+|=======================
+| ID name [C] | Description / trap causing entry
+| `RTE_TRAP_I_MISALIGNED` | instruction address misaligned
+| `RTE_TRAP_I_ACCESS` | instruction (bus) access fault
+| `RTE_TRAP_I_ILLEGAL` | illegal instruction
+| `RTE_TRAP_BREAKPOINT` | breakpoint (`ebreak` instruction)
+| `RTE_TRAP_L_MISALIGNED` | load address misaligned
+| `RTE_TRAP_L_ACCESS` | load (bus) access fault
+| `RTE_TRAP_S_MISALIGNED` | store address misaligned
+| `RTE_TRAP_S_ACCESS` | store (bus) access fault
+| `RTE_TRAP_MENV_CALL` | environment call from machine mode (`ecall` instruction)
+| `RTE_TRAP_UENV_CALL` | environment call from user mode (`ecall` instruction)
+| `RTE_TRAP_MTI` | machine timer interrupt
+| `RTE_TRAP_MEI` | machine external interrupt
+| `RTE_TRAP_MSI` | machine software interrupt
+| `RTE_TRAP_FIRQ_0` : `RTE_TRAP_FIRQ_15` | fast interrupt channel 0..15
+|=======================
+
+When installing a custom handler function for any of these exceptions/interrupts, make sure the function uses
+**no attributes** (especially no interrupt attribute!), has no arguments and no return value like in the following
+example:
+
+[source,c]
+----
+void handler_xyz(void) {
+
+  // handle exception/interrupt...
+}
+----
+
+[WARNING]
+Do NOT use the `((interrupt))` attribute for the application exception handler functions! This
+will place an `mret` instruction at the end of the function, making it impossible to return to the first-level
+exception handler of the RTE, which will cause stack corruption.
+
+Example: Installation of the MTIME interrupt handler:
+
+[source,c]
+----
+neorv32_rte_exception_install(RTE_TRAP_MTI, handler_xyz);
+----
+
+To remove a previously installed exception handler call the corresponding un-install function from the NEORV32
+runtime environment. This will replace the previously installed handler by the initial debug handler, so even
+un-installed exceptions and interrupts are still captured.
+
+[source,c]
+----
+int neorv32_rte_exception_uninstall(uint8_t id);
+----
+
+Example: Removing the MTIME interrupt handler:
+
+[source,c]
+----
+neorv32_rte_exception_uninstall(RTE_TRAP_MTI);
+----
+
+[TIP]
+More information regarding the NEORV32 runtime environment can be found in the doxygen
+software documentation (also available online at GitHub pages: https://stnolting.github.io/neorv32/files.html).
Index: docs/Doxyfile
===================================================================
--- docs/Doxyfile (revision 56)
+++ docs/Doxyfile (revision 57)
@@ -32,7 +32,7 @@
 # title of most generated pages and in a few other places.
 # The default value is: My Project.

-PROJECT_NAME = "The NEORV32 RISC-V Processor - Software Framework Documentation"
+PROJECT_NAME = "NEORV32 - Software Framework Documentation"

 # The PROJECT_NUMBER tag can be used to enter a project or revision number. This
 # could be handy for archiving the generated documentation or if some version
/docs/NEORV32.legacy.pdf Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream
docs/NEORV32.legacy.pdf Property changes : Added: svn:mime-type ## -0,0 +1 ## +application/octet-stream \ No newline at end of property Index: docs/NEORV32.pdf =================================================================== Cannot display: file marked as a binary type. svn:mime-type = application/octet-stream Index: docs/make_datasheet.sh =================================================================== --- docs/make_datasheet.sh (nonexistent) +++ docs/make_datasheet.sh (revision 57) @@ -0,0 +1,9 @@ +# Generate data sheet NEORV32.pdf from adoc sources using asciidoctor-pdf + +#!/bin/bash + +# Abort if any command returns != 0 +set -e + +thisdir="$( cd "$(dirname "$0")" >/dev/null 2>&1 ; pwd -P )" +asciidoctor-pdf -a pdf-theme=$thisdir/src_adoc/neorv32-theme.yml $thisdir/src_adoc/neorv32.adoc --out-file $thisdir/NEORV32.pdf Index: rtl/fpga_specific/README.md =================================================================== --- rtl/fpga_specific/README.md (revision 56) +++ rtl/fpga_specific/README.md (nonexistent) @@ -1,12 +0,0 @@ -## FPGA Platform-Specific Components - -This folder contains FPGA vendor-specific CPU/processor components (mostly memory components). -These alternative files allow a more efficient usage of the special (FPGA-specific) hard macros -and thus, might result in higher performance and/or less area utilization. - -Please note, that these FPGA-specific components are **optional**. The Processor/CPU uses an FPGA-independent -VHDL description. - -For example, if you want to use the Lattice iCE40up FPGA optimized versions of the DMEM and DMEM please use the files -from [`rtl/fpga_specific/lattice_ice40up`](https://github.com/stnolting/neorv32/tree/master/rtl/fpga_specific/lattice_ice40up) -folder **instead** of the original files from the [`rtl/core folder`](https://github.com/stnolting/neorv32/tree/master/rtl/core). 
\ No newline at end of file Index: rtl/fpga_specific/lattice_ice40up/neorv32_imem.ice40up_spram.vhd =================================================================== --- rtl/fpga_specific/lattice_ice40up/neorv32_imem.ice40up_spram.vhd (revision 56) +++ rtl/fpga_specific/lattice_ice40up/neorv32_imem.ice40up_spram.vhd (nonexistent) @@ -1,165 +0,0 @@ --- ################################################################################################# --- # << NEORV32 - Processor-Internal IMEM for Lattice iCE40 UltraPlus >> # --- # ********************************************************************************************* # --- # Memory has a logical size of 64kb (2 x SPRAMs). Logical size IMEM_SIZE must be less or equal. # --- # ********************************************************************************************* # --- # BSD 3-Clause License # --- # # --- # Copyright (c) 2021, Stephan Nolting. All rights reserved. # --- # # --- # Redistribution and use in source and binary forms, with or without modification, are # --- # permitted provided that the following conditions are met: # --- # # --- # 1. Redistributions of source code must retain the above copyright notice, this list of # --- # conditions and the following disclaimer. # --- # # --- # 2. Redistributions in binary form must reproduce the above copyright notice, this list of # --- # conditions and the following disclaimer in the documentation and/or other materials # --- # provided with the distribution. # --- # # --- # 3. Neither the name of the copyright holder nor the names of its contributors may be used to # --- # endorse or promote products derived from this software without specific prior written # --- # permission. # --- # # --- # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS # --- # OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF # --- # MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE # --- # COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # --- # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE # --- # GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED # --- # AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING # --- # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED # --- # OF THE POSSIBILITY OF SUCH DAMAGE. # --- # ********************************************************************************************* # --- # The NEORV32 Processor - https://github.com/stnolting/neorv32 (c) Stephan Nolting # --- ################################################################################################# - -library ieee; -use ieee.std_logic_1164.all; -use ieee.numeric_std.all; - -library neorv32; -use neorv32.neorv32_package.all; - -library iCE40UP; -use iCE40UP.components.all; - -entity neorv32_imem is - generic ( - IMEM_BASE : std_ulogic_vector(31 downto 0) := x"00000000"; -- memory base address - IMEM_SIZE : natural := 64*1024; -- processor-internal instruction memory size in bytes - IMEM_AS_ROM : boolean := false; -- implement IMEM as read-only memory? - BOOTLOADER_EN : boolean := true -- implement and use bootloader? 
- ); - port ( - clk_i : in std_ulogic; -- global clock line - rden_i : in std_ulogic; -- read enable - wren_i : in std_ulogic; -- write enable - ben_i : in std_ulogic_vector(03 downto 0); -- byte write enable - addr_i : in std_ulogic_vector(31 downto 0); -- address - data_i : in std_ulogic_vector(31 downto 0); -- data in - data_o : out std_ulogic_vector(31 downto 0); -- data out - ack_o : out std_ulogic -- transfer acknowledge - ); -end neorv32_imem; - -architecture neorv32_imem_rtl of neorv32_imem is - - -- advanced configuration -------------------------------------------------------------------------------- - constant spram_sleep_mode_en_c : boolean := false; -- put IMEM into sleep mode when idle (for low power) - -- ------------------------------------------------------------------------------------------------------- - - -- IO space: module base address -- - constant hi_abb_c : natural := 31; -- high address boundary bit - constant lo_abb_c : natural := index_size_f(64*1024); -- low address boundary bit - - -- local signals -- - signal acc_en : std_ulogic; - signal mem_cs : std_ulogic; - signal rdata : std_ulogic_vector(31 downto 0); - signal rden : std_ulogic; - - -- SPRAM signals -- - signal spram_clk : std_logic; - signal spram_addr : std_logic_vector(13 downto 0); - signal spram_di_lo : std_logic_vector(15 downto 0); - signal spram_di_hi : std_logic_vector(15 downto 0); - signal spram_do_lo : std_logic_vector(15 downto 0); - signal spram_do_hi : std_logic_vector(15 downto 0); - signal spram_be_lo : std_logic_vector(03 downto 0); - signal spram_be_hi : std_logic_vector(03 downto 0); - signal spram_we : std_logic; - signal spram_pwr_n : std_logic; - signal spram_cs : std_logic; - -begin - - -- Access Control ------------------------------------------------------------------------- - -- ------------------------------------------------------------------------------------------- - acc_en <= '1' when (addr_i(hi_abb_c downto lo_abb_c) = IMEM_BASE(hi_abb_c downto 
lo_abb_c)) else '0'; - mem_cs <= acc_en and (rden_i or wren_i); - - - -- Memory Access -------------------------------------------------------------------------- - -- ------------------------------------------------------------------------------------------- - imem_spram_lo_inst : SP256K - port map ( - AD => spram_addr, -- I - DI => spram_di_lo, -- I - MASKWE => spram_be_lo, -- I - WE => spram_we, -- I - CS => spram_cs, -- I - CK => spram_clk, -- I - STDBY => '0', -- I - SLEEP => spram_pwr_n, -- I - PWROFF_N => '1', -- I - DO => spram_do_lo -- O - ); - - imem_spram_hi_inst : SP256K - port map ( - AD => spram_addr, -- I - DI => spram_di_hi, -- I - MASKWE => spram_be_hi, -- I - WE => spram_we, -- I - CS => spram_cs, -- I - CK => spram_clk, -- I - STDBY => '0', -- I - SLEEP => spram_pwr_n, -- I - PWROFF_N => '1', -- I - DO => spram_do_hi -- O - ); - - -- access logic and signal type conversion -- - spram_clk <= std_logic(clk_i); - spram_addr <= std_logic_vector(addr_i(13+2 downto 0+2)); - spram_di_lo <= std_logic_vector(data_i(15 downto 00)); - spram_di_hi <= std_logic_vector(data_i(31 downto 16)); - spram_we <= '1' when ((acc_en and wren_i) = '1') else '0'; -- global write enable - spram_cs <= std_logic(mem_cs); - spram_be_lo <= std_logic(ben_i(1)) & std_logic(ben_i(1)) & std_logic(ben_i(0)) & std_logic(ben_i(0)); -- low byte write enable - spram_be_hi <= std_logic(ben_i(3)) & std_logic(ben_i(3)) & std_logic(ben_i(2)) & std_logic(ben_i(2)); -- high byte write enable - spram_pwr_n <= '0' when ((spram_sleep_mode_en_c = false) or (mem_cs = '1')) else '1'; -- LP mode disabled or IMEM selected - rdata <= std_ulogic_vector(spram_do_hi) & std_ulogic_vector(spram_do_lo); - - buffer_ff: process(clk_i) - begin - -- sanity check -- - if (IMEM_AS_ROM = true) or (BOOTLOADER_EN = false) then - assert false report "ICE40 Ultra Plus SPRAM cannot be initialized by bitstream!" 
severity error; - end if; - if (IMEM_SIZE > 64*1024) then - assert false report "IMEM has a physical size of 64kB. Logical size must be less or equal." severity error; - end if; - -- buffer -- - if rising_edge(clk_i) then - ack_o <= mem_cs; - rden <= acc_en and rden_i; - end if; - end process buffer_ff; - - -- output gate -- - data_o <= rdata when (rden = '1') else (others => '0'); - - -end neorv32_imem_rtl; Index: rtl/fpga_specific/lattice_ice40up/neorv32_dmem.ice40up_spram.vhd =================================================================== --- rtl/fpga_specific/lattice_ice40up/neorv32_dmem.ice40up_spram.vhd (revision 56) +++ rtl/fpga_specific/lattice_ice40up/neorv32_dmem.ice40up_spram.vhd (nonexistent) @@ -1,160 +0,0 @@ --- ################################################################################################# --- # << NEORV32 - Processor-Internal DMEM for Lattice iCE40 UltraPlus >> # --- # ********************************************************************************************* # --- # Memory has a logical size of 64kb (2 x SPRAMs). Logical size DMEM_SIZE must be less or equal. # --- # ********************************************************************************************* # --- # BSD 3-Clause License # --- # # --- # Copyright (c) 2020, Stephan Nolting. All rights reserved. # --- # # --- # Redistribution and use in source and binary forms, with or without modification, are # --- # permitted provided that the following conditions are met: # --- # # --- # 1. Redistributions of source code must retain the above copyright notice, this list of # --- # conditions and the following disclaimer. # --- # # --- # 2. Redistributions in binary form must reproduce the above copyright notice, this list of # --- # conditions and the following disclaimer in the documentation and/or other materials # --- # provided with the distribution. # --- # # --- # 3. 
Neither the name of the copyright holder nor the names of its contributors may be used to # --- # endorse or promote products derived from this software without specific prior written # --- # permission. # --- # # --- # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS # --- # OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF # --- # MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE # --- # COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # --- # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE # --- # GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED # --- # AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING # --- # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED # --- # OF THE POSSIBILITY OF SUCH DAMAGE. 
# --- # ********************************************************************************************* # --- # The NEORV32 Processor - https://github.com/stnolting/neorv32 (c) Stephan Nolting # --- ################################################################################################# - -library ieee; -use ieee.std_logic_1164.all; -use ieee.numeric_std.all; - -library neorv32; -use neorv32.neorv32_package.all; - -library iCE40UP; -use iCE40UP.components.all; - -entity neorv32_dmem is - generic ( - DMEM_BASE : std_ulogic_vector(31 downto 0) := x"80000000"; -- memory base address - DMEM_SIZE : natural := 32*1024 -- processor-internal instruction memory size in bytes - ); - port ( - clk_i : in std_ulogic; -- global clock line - rden_i : in std_ulogic; -- read enable - wren_i : in std_ulogic; -- write enable - ben_i : in std_ulogic_vector(03 downto 0); -- byte write enable - addr_i : in std_ulogic_vector(31 downto 0); -- address - data_i : in std_ulogic_vector(31 downto 0); -- data in - data_o : out std_ulogic_vector(31 downto 0); -- data out - ack_o : out std_ulogic -- transfer acknowledge - ); -end neorv32_dmem; - -architecture neorv32_dmem_rtl of neorv32_dmem is - - -- advanced configuration -------------------------------------------------------------------------------- - constant spram_sleep_mode_en_c : boolean := false; -- put DMEM into sleep mode when idle (for low power) - -- ------------------------------------------------------------------------------------------------------- - - -- IO space: module base address -- - constant hi_abb_c : natural := 31; -- high address boundary bit - constant lo_abb_c : natural := index_size_f(64*1024); -- low address boundary bit - - -- local signals -- - signal acc_en : std_ulogic; - signal mem_cs : std_ulogic; - signal rdata : std_ulogic_vector(31 downto 0); - signal rden : std_ulogic; - - -- SPRAM signals -- - signal spram_clk : std_logic; - signal spram_addr : std_logic_vector(13 downto 0); - signal spram_di_lo : 
std_logic_vector(15 downto 0); - signal spram_di_hi : std_logic_vector(15 downto 0); - signal spram_do_lo : std_logic_vector(15 downto 0); - signal spram_do_hi : std_logic_vector(15 downto 0); - signal spram_be_lo : std_logic_vector(03 downto 0); - signal spram_be_hi : std_logic_vector(03 downto 0); - signal spram_we : std_logic; - signal spram_pwr_n : std_logic; - signal spram_cs : std_logic; - -begin - - -- Access Control ------------------------------------------------------------------------- - -- ------------------------------------------------------------------------------------------- - acc_en <= '1' when (addr_i(hi_abb_c downto lo_abb_c) = DMEM_BASE(hi_abb_c downto lo_abb_c)) else '0'; - mem_cs <= acc_en and (rden_i or wren_i); - - - -- Memory Access -------------------------------------------------------------------------- - -- ------------------------------------------------------------------------------------------- - dmem_spram_lo_inst : SP256K - port map ( - AD => spram_addr, -- I - DI => spram_di_lo, -- I - MASKWE => spram_be_lo, -- I - WE => spram_we, -- I - CS => spram_cs, -- I - CK => spram_clk, -- I - STDBY => '0', -- I - SLEEP => spram_pwr_n, -- I - PWROFF_N => '1', -- I - DO => spram_do_lo -- O - ); - - dmem_spram_hi_inst : SP256K - port map ( - AD => spram_addr, -- I - DI => spram_di_hi, -- I - MASKWE => spram_be_hi, -- I - WE => spram_we, -- I - CS => spram_cs, -- I - CK => spram_clk, -- I - STDBY => '0', -- I - SLEEP => spram_pwr_n, -- I - PWROFF_N => '1', -- I - DO => spram_do_hi -- O - ); - - -- access logic and signal type conversion -- - spram_clk <= std_logic(clk_i); - spram_addr <= std_logic_vector(addr_i(13+2 downto 0+2)); - spram_di_lo <= std_logic_vector(data_i(15 downto 00)); - spram_di_hi <= std_logic_vector(data_i(31 downto 16)); - spram_we <= '1' when ((acc_en and wren_i) = '1') else '0'; -- global write enable - spram_cs <= std_logic(mem_cs); - spram_be_lo <= std_logic(ben_i(1)) & std_logic(ben_i(1)) & std_logic(ben_i(0)) & 
std_logic(ben_i(0)); -- low byte write enable - spram_be_hi <= std_logic(ben_i(3)) & std_logic(ben_i(3)) & std_logic(ben_i(2)) & std_logic(ben_i(2)); -- high byte write enable - spram_pwr_n <= '0' when ((spram_sleep_mode_en_c = false) or (mem_cs = '1')) else '1'; -- LP mode disabled or IMEM selected - rdata <= std_ulogic_vector(spram_do_hi) & std_ulogic_vector(spram_do_lo); - - buffer_ff: process(clk_i) - begin - -- sanity check -- - if (DMEM_SIZE > 64*1024) then - assert false report "DMEM has a physical size of 64kB. Logical size must be less or equal." severity error; - end if; - -- buffer -- - if rising_edge(clk_i) then - ack_o <= mem_cs; - rden <= acc_en and rden_i; - end if; - end process buffer_ff; - - -- output gate -- - data_o <= rdata when (rden = '1') else (others => '0'); - - -end neorv32_dmem_rtl; Index: rtl/core/neorv32_bootloader_image.vhd =================================================================== --- rtl/core/neorv32_bootloader_image.vhd (revision 56) +++ rtl/core/neorv32_bootloader_image.vhd (revision 57) @@ -6,7 +6,7 @@ package neorv32_bootloader_image is - type bootloader_init_image_t is array (0 to 1014) of std_ulogic_vector(31 downto 0); + type bootloader_init_image_t is array (0 to 1021) of std_ulogic_vector(31 downto 0); constant bootloader_init_image : bootloader_init_image_t := ( 00000000 => x"00000093", 00000001 => x"00000113", @@ -52,7 +52,7 @@ 00000041 => x"00158593", 00000042 => x"ff5ff06f", 00000043 => x"00001597", - 00000044 => x"f2c58593", + 00000044 => x"f4858593", 00000045 => x"80010617", 00000046 => x"f4c60613", 00000047 => x"80010697", @@ -111,20 +111,20 @@ 00000100 => x"00200513", 00000101 => x"0087f463", 00000102 => x"00400513", - 00000103 => x"36d000ef", + 00000103 => x"389000ef", 00000104 => x"00100513", - 00000105 => x"40d000ef", + 00000105 => x"429000ef", 00000106 => x"00005537", 00000107 => x"00000613", 00000108 => x"00000593", 00000109 => x"b0050513", - 00000110 => x"2a9000ef", - 00000111 => x"1c5000ef", + 
00000110 => x"2b5000ef", + 00000111 => x"1d1000ef", 00000112 => x"00245793", 00000113 => x"00a78533", 00000114 => x"00f537b3", 00000115 => x"00b785b3", - 00000116 => x"1dd000ef", + 00000116 => x"1e9000ef", 00000117 => x"ffff07b7", 00000118 => x"4d478793", 00000119 => x"30579073", @@ -134,78 +134,78 @@ 00000123 => x"00000013", 00000124 => x"00000013", 00000125 => x"ffff1537", - 00000126 => x"eec50513", - 00000127 => x"309000ef", + 00000126 => x"f0850513", + 00000127 => x"315000ef", 00000128 => x"f1302573", 00000129 => x"260000ef", 00000130 => x"ffff1537", - 00000131 => x"f2450513", - 00000132 => x"2f5000ef", + 00000131 => x"f4050513", + 00000132 => x"301000ef", 00000133 => x"fe002503", 00000134 => x"24c000ef", 00000135 => x"ffff1537", - 00000136 => x"f2c50513", - 00000137 => x"2e1000ef", + 00000136 => x"f4850513", + 00000137 => x"2ed000ef", 00000138 => x"fe402503", 00000139 => x"238000ef", 00000140 => x"ffff1537", - 00000141 => x"f3450513", - 00000142 => x"2cd000ef", + 00000141 => x"f5050513", + 00000142 => x"2d9000ef", 00000143 => x"30102573", 00000144 => x"224000ef", 00000145 => x"ffff1537", - 00000146 => x"f3c50513", - 00000147 => x"2b9000ef", + 00000146 => x"f5850513", + 00000147 => x"2c5000ef", 00000148 => x"fc002573", 00000149 => x"210000ef", 00000150 => x"ffff1537", - 00000151 => x"f4450513", - 00000152 => x"2a5000ef", + 00000151 => x"f6050513", + 00000152 => x"2b1000ef", 00000153 => x"fe802503", 00000154 => x"ffff14b7", 00000155 => x"00341413", 00000156 => x"1f4000ef", 00000157 => x"ffff1537", - 00000158 => x"f4c50513", - 00000159 => x"289000ef", + 00000158 => x"f6850513", + 00000159 => x"295000ef", 00000160 => x"ff802503", 00000161 => x"1e0000ef", - 00000162 => x"f5448513", - 00000163 => x"279000ef", + 00000162 => x"f7048513", + 00000163 => x"285000ef", 00000164 => x"ff002503", 00000165 => x"1d0000ef", 00000166 => x"ffff1537", - 00000167 => x"f6050513", - 00000168 => x"265000ef", + 00000167 => x"f7c50513", + 00000168 => x"271000ef", 00000169 => x"ffc02503", 
00000170 => x"1bc000ef", - 00000171 => x"f5448513", - 00000172 => x"255000ef", + 00000171 => x"f7048513", + 00000172 => x"261000ef", 00000173 => x"ff402503", 00000174 => x"1ac000ef", 00000175 => x"ffff1537", - 00000176 => x"f6850513", - 00000177 => x"241000ef", - 00000178 => x"0b9000ef", + 00000176 => x"f8450513", + 00000177 => x"24d000ef", + 00000178 => x"0c5000ef", 00000179 => x"00a404b3", 00000180 => x"0084b433", 00000181 => x"00b40433", - 00000182 => x"1d1000ef", + 00000182 => x"1dd000ef", 00000183 => x"02050263", 00000184 => x"ffff1537", - 00000185 => x"f9450513", - 00000186 => x"21d000ef", - 00000187 => x"0d9000ef", + 00000185 => x"fb050513", + 00000186 => x"229000ef", + 00000187 => x"0e5000ef", 00000188 => x"02300793", 00000189 => x"02f51263", 00000190 => x"00000513", 00000191 => x"0180006f", - 00000192 => x"081000ef", + 00000192 => x"08d000ef", 00000193 => x"fc85eae3", 00000194 => x"00b41463", 00000195 => x"fc9566e3", 00000196 => x"00100513", - 00000197 => x"5dc000ef", + 00000197 => x"5e8000ef", 00000198 => x"0b4000ef", 00000199 => x"ffff1937", 00000200 => x"ffff19b7", @@ -215,13 +215,13 @@ 00000204 => x"07500b93", 00000205 => x"ffff14b7", 00000206 => x"ffff1c37", - 00000207 => x"fa090513", - 00000208 => x"1c5000ef", - 00000209 => x"155000ef", + 00000207 => x"fbc90513", + 00000208 => x"1d1000ef", + 00000209 => x"161000ef", 00000210 => x"00050413", - 00000211 => x"129000ef", - 00000212 => x"ea498513", - 00000213 => x"1b1000ef", + 00000211 => x"135000ef", + 00000212 => x"ec098513", + 00000213 => x"1bd000ef", 00000214 => x"fb4400e3", 00000215 => x"01541863", 00000216 => x"ffff02b7", @@ -234,7 +234,7 @@ 00000223 => x"03740063", 00000224 => x"07300793", 00000225 => x"00f41663", - 00000226 => x"67c000ef", + 00000226 => x"688000ef", 00000227 => x"fb1ff06f", 00000228 => x"06c00793", 00000229 => x"00f41863", @@ -246,20 +246,20 @@ 00000235 => x"02c000ef", 00000236 => x"f8dff06f", 00000237 => x"03f00793", - 00000238 => x"fa8c0513", + 00000238 => x"fc4c0513", 00000239 
=> x"00f40463", - 00000240 => x"fbc48513", - 00000241 => x"141000ef", + 00000240 => x"fd848513", + 00000241 => x"14d000ef", 00000242 => x"f75ff06f", 00000243 => x"ffff1537", - 00000244 => x"db850513", - 00000245 => x"1310006f", + 00000244 => x"dd450513", + 00000245 => x"13d0006f", 00000246 => x"800007b7", 00000247 => x"0007a783", 00000248 => x"00079863", 00000249 => x"ffff1537", - 00000250 => x"e1c50513", - 00000251 => x"1190006f", + 00000250 => x"e3850513", + 00000251 => x"1250006f", 00000252 => x"ff010113", 00000253 => x"00112623", 00000254 => x"30047073", @@ -266,9 +266,9 @@ 00000255 => x"00000013", 00000256 => x"00000013", 00000257 => x"ffff1537", - 00000258 => x"e3850513", - 00000259 => x"0f9000ef", - 00000260 => x"075000ef", + 00000258 => x"e5450513", + 00000259 => x"105000ef", + 00000260 => x"081000ef", 00000261 => x"fe051ee3", 00000262 => x"ff002783", 00000263 => x"00078067", @@ -277,17 +277,17 @@ 00000266 => x"00812423", 00000267 => x"00050413", 00000268 => x"ffff1537", - 00000269 => x"e4850513", + 00000269 => x"e6450513", 00000270 => x"00112623", - 00000271 => x"0c9000ef", + 00000271 => x"0d5000ef", 00000272 => x"03040513", 00000273 => x"0ff57513", - 00000274 => x"02d000ef", + 00000274 => x"039000ef", 00000275 => x"30047073", 00000276 => x"00000013", 00000277 => x"00000013", 00000278 => x"00100513", - 00000279 => x"155000ef", + 00000279 => x"171000ef", 00000280 => x"0000006f", 00000281 => x"fe010113", 00000282 => x"01212823", @@ -294,14 +294,14 @@ 00000283 => x"00050913", 00000284 => x"ffff1537", 00000285 => x"00912a23", - 00000286 => x"e5450513", + 00000286 => x"e7050513", 00000287 => x"ffff14b7", 00000288 => x"00812c23", 00000289 => x"01312623", 00000290 => x"00112e23", 00000291 => x"01c00413", - 00000292 => x"075000ef", - 00000293 => x"fc848493", + 00000292 => x"081000ef", + 00000293 => x"fe448493", 00000294 => x"ffc00993", 00000295 => x"008957b3", 00000296 => x"00f7f793", @@ -308,7 +308,7 @@ 00000297 => x"00f487b3", 00000298 => x"0007c503", 00000299 
=> x"ffc40413", - 00000300 => x"7c4000ef", + 00000300 => x"7d0000ef", 00000301 => x"ff3414e3", 00000302 => x"01c12083", 00000303 => x"01812403", @@ -340,14 +340,14 @@ 00000329 => x"00778793", 00000330 => x"06f41a63", 00000331 => x"00000513", - 00000332 => x"065000ef", - 00000333 => x"64c000ef", + 00000332 => x"081000ef", + 00000333 => x"658000ef", 00000334 => x"fe002783", 00000335 => x"0027d793", 00000336 => x"00a78533", 00000337 => x"00f537b3", 00000338 => x"00b785b3", - 00000339 => x"660000ef", + 00000339 => x"66c000ef", 00000340 => x"03c12403", 00000341 => x"04c12083", 00000342 => x"04812283", @@ -373,13 +373,13 @@ 00000362 => x"00100513", 00000363 => x"02079863", 00000364 => x"ffff1537", - 00000365 => x"e5850513", - 00000366 => x"74c000ef", + 00000365 => x"e7450513", + 00000366 => x"758000ef", 00000367 => x"00040513", 00000368 => x"ea5ff0ef", 00000369 => x"ffff1537", - 00000370 => x"e6c50513", - 00000371 => x"738000ef", + 00000370 => x"e8850513", + 00000371 => x"744000ef", 00000372 => x"34102573", 00000373 => x"e91ff0ef", 00000374 => x"00500513", @@ -388,14 +388,14 @@ 00000377 => x"00000513", 00000378 => x"00112623", 00000379 => x"00812423", - 00000380 => x"74c000ef", + 00000380 => x"768000ef", 00000381 => x"09e00513", - 00000382 => x"788000ef", + 00000382 => x"7a4000ef", 00000383 => x"00000513", - 00000384 => x"780000ef", + 00000384 => x"79c000ef", 00000385 => x"00050413", 00000386 => x"00000513", - 00000387 => x"750000ef", + 00000387 => x"76c000ef", 00000388 => x"00c12083", 00000389 => x"0ff47513", 00000390 => x"00812403", @@ -405,15 +405,15 @@ 00000394 => x"00112623", 00000395 => x"00812423", 00000396 => x"00000513", - 00000397 => x"708000ef", + 00000397 => x"724000ef", 00000398 => x"00500513", - 00000399 => x"744000ef", + 00000399 => x"760000ef", 00000400 => x"00000513", - 00000401 => x"73c000ef", + 00000401 => x"758000ef", 00000402 => x"00050413", 00000403 => x"00147413", 00000404 => x"00000513", - 00000405 => x"708000ef", + 00000405 => x"724000ef", 
00000406 => x"fc041ce3", 00000407 => x"00c12083", 00000408 => x"00812403", @@ -422,13 +422,13 @@ 00000411 => x"ff010113", 00000412 => x"00000513", 00000413 => x"00112623", - 00000414 => x"6c4000ef", + 00000414 => x"6e0000ef", 00000415 => x"00600513", - 00000416 => x"700000ef", + 00000416 => x"71c000ef", 00000417 => x"00c12083", 00000418 => x"00000513", 00000419 => x"01010113", - 00000420 => x"6cc0006f", + 00000420 => x"6e80006f", 00000421 => x"ff010113", 00000422 => x"00812423", 00000423 => x"00050413", @@ -435,30 +435,30 @@ 00000424 => x"01055513", 00000425 => x"0ff57513", 00000426 => x"00112623", - 00000427 => x"6d4000ef", + 00000427 => x"6f0000ef", 00000428 => x"00845513", 00000429 => x"0ff57513", - 00000430 => x"6c8000ef", + 00000430 => x"6e4000ef", 00000431 => x"0ff47513", 00000432 => x"00812403", 00000433 => x"00c12083", 00000434 => x"01010113", - 00000435 => x"6b40006f", + 00000435 => x"6d00006f", 00000436 => x"ff010113", 00000437 => x"00812423", 00000438 => x"00050413", 00000439 => x"00000513", 00000440 => x"00112623", - 00000441 => x"658000ef", + 00000441 => x"674000ef", 00000442 => x"00300513", - 00000443 => x"694000ef", + 00000443 => x"6b0000ef", 00000444 => x"00040513", 00000445 => x"fa1ff0ef", 00000446 => x"00000513", - 00000447 => x"684000ef", + 00000447 => x"6a0000ef", 00000448 => x"00050413", 00000449 => x"00000513", - 00000450 => x"654000ef", + 00000450 => x"670000ef", 00000451 => x"00c12083", 00000452 => x"0ff47513", 00000453 => x"00812403", @@ -477,7 +477,7 @@ 00000466 => x"00000413", 00000467 => x"00400a13", 00000468 => x"02091e63", - 00000469 => x"544000ef", + 00000469 => x"550000ef", 00000470 => x"00a481a3", 00000471 => x"00140413", 00000472 => x"fff48493", @@ -519,509 +519,516 @@ 00000508 => x"04079663", 00000509 => x"02041863", 00000510 => x"ffff1537", - 00000511 => x"e7450513", - 00000512 => x"504000ef", + 00000511 => x"e9050513", + 00000512 => x"510000ef", 00000513 => x"008005b7", 00000514 => x"00040513", 00000515 => x"f15ff0ef", 00000516 
=> x"4788d7b7", 00000517 => x"afe78793", - 00000518 => x"02f50463", + 00000518 => x"02f50a63", 00000519 => x"00000513", 00000520 => x"01c0006f", 00000521 => x"ffff1537", - 00000522 => x"e9450513", - 00000523 => x"4d8000ef", - 00000524 => x"db1ff0ef", - 00000525 => x"fc0518e3", + 00000522 => x"eb050513", + 00000523 => x"4e4000ef", + 00000524 => x"4e4000ef", + 00000525 => x"00051663", 00000526 => x"00300513", 00000527 => x"be9ff0ef", - 00000528 => x"008009b7", - 00000529 => x"00498593", - 00000530 => x"00040513", - 00000531 => x"ed5ff0ef", - 00000532 => x"00050a93", - 00000533 => x"00898593", - 00000534 => x"00040513", - 00000535 => x"ec5ff0ef", - 00000536 => x"ff002c03", - 00000537 => x"00050b13", - 00000538 => x"ffcafb93", - 00000539 => x"00000913", - 00000540 => x"00000493", - 00000541 => x"00c98993", - 00000542 => x"013905b3", - 00000543 => x"052b9c63", - 00000544 => x"016484b3", - 00000545 => x"00200513", - 00000546 => x"fa049ae3", - 00000547 => x"ffff1537", - 00000548 => x"ea050513", - 00000549 => x"470000ef", - 00000550 => x"02c12083", - 00000551 => x"02812403", - 00000552 => x"800007b7", - 00000553 => x"0157a023", - 00000554 => x"000a2023", - 00000555 => x"02412483", - 00000556 => x"02012903", - 00000557 => x"01c12983", - 00000558 => x"01812a03", - 00000559 => x"01412a83", - 00000560 => x"01012b03", - 00000561 => x"00c12b83", - 00000562 => x"00812c03", - 00000563 => x"03010113", - 00000564 => x"00008067", - 00000565 => x"00040513", - 00000566 => x"e49ff0ef", - 00000567 => x"012c07b3", - 00000568 => x"00a484b3", - 00000569 => x"00a7a023", - 00000570 => x"00490913", - 00000571 => x"f8dff06f", - 00000572 => x"ff010113", - 00000573 => x"00112623", - 00000574 => x"ea1ff0ef", - 00000575 => x"ffff1537", - 00000576 => x"ea450513", - 00000577 => x"400000ef", - 00000578 => x"ad1ff0ef", - 00000579 => x"0000006f", - 00000580 => x"ff010113", - 00000581 => x"00112623", - 00000582 => x"00812423", - 00000583 => x"00912223", - 00000584 => x"00058413", - 00000585 => 
x"00050493", - 00000586 => x"d45ff0ef", - 00000587 => x"00000513", - 00000588 => x"40c000ef", - 00000589 => x"00200513", - 00000590 => x"448000ef", - 00000591 => x"00048513", - 00000592 => x"d55ff0ef", - 00000593 => x"00040513", - 00000594 => x"438000ef", - 00000595 => x"00000513", - 00000596 => x"40c000ef", - 00000597 => x"00812403", - 00000598 => x"00c12083", - 00000599 => x"00412483", - 00000600 => x"01010113", - 00000601 => x"cc1ff06f", - 00000602 => x"fe010113", - 00000603 => x"00812c23", - 00000604 => x"00912a23", - 00000605 => x"01212823", - 00000606 => x"00112e23", - 00000607 => x"00b12623", - 00000608 => x"00300413", - 00000609 => x"00350493", - 00000610 => x"fff00913", - 00000611 => x"00c10793", - 00000612 => x"008787b3", - 00000613 => x"0007c583", - 00000614 => x"40848533", - 00000615 => x"fff40413", - 00000616 => x"f71ff0ef", - 00000617 => x"ff2414e3", - 00000618 => x"01c12083", - 00000619 => x"01812403", - 00000620 => x"01412483", - 00000621 => x"01012903", - 00000622 => x"02010113", - 00000623 => x"00008067", - 00000624 => x"ff010113", - 00000625 => x"00112623", - 00000626 => x"00812423", - 00000627 => x"00050413", - 00000628 => x"c9dff0ef", - 00000629 => x"00000513", - 00000630 => x"364000ef", - 00000631 => x"0d800513", - 00000632 => x"3a0000ef", - 00000633 => x"00040513", - 00000634 => x"cadff0ef", - 00000635 => x"00000513", - 00000636 => x"36c000ef", - 00000637 => x"00812403", - 00000638 => x"00c12083", - 00000639 => x"01010113", - 00000640 => x"c25ff06f", - 00000641 => x"fe010113", - 00000642 => x"800007b7", - 00000643 => x"00812c23", - 00000644 => x"0007a403", - 00000645 => x"00112e23", - 00000646 => x"00912a23", - 00000647 => x"01212823", - 00000648 => x"01312623", - 00000649 => x"01412423", - 00000650 => x"01512223", - 00000651 => x"02041863", - 00000652 => x"ffff1537", - 00000653 => x"e1c50513", - 00000654 => x"01812403", - 00000655 => x"01c12083", - 00000656 => x"01412483", - 00000657 => x"01012903", - 00000658 => x"00c12983", - 00000659 => 
x"00812a03", - 00000660 => x"00412a83", - 00000661 => x"02010113", - 00000662 => x"2ac0006f", - 00000663 => x"ffff1537", - 00000664 => x"ea850513", - 00000665 => x"2a0000ef", - 00000666 => x"00040513", - 00000667 => x"9f9ff0ef", - 00000668 => x"ffff1537", - 00000669 => x"eb450513", - 00000670 => x"28c000ef", - 00000671 => x"00800537", - 00000672 => x"9e5ff0ef", - 00000673 => x"ffff1537", - 00000674 => x"ed050513", - 00000675 => x"278000ef", - 00000676 => x"208000ef", - 00000677 => x"00050493", - 00000678 => x"1dc000ef", - 00000679 => x"07900793", - 00000680 => x"0af49e63", - 00000681 => x"b3dff0ef", - 00000682 => x"00051663", - 00000683 => x"00300513", - 00000684 => x"975ff0ef", - 00000685 => x"ffff1537", - 00000686 => x"edc50513", - 00000687 => x"01045493", - 00000688 => x"244000ef", - 00000689 => x"00148493", - 00000690 => x"00800937", - 00000691 => x"fff00993", - 00000692 => x"00010a37", - 00000693 => x"fff48493", - 00000694 => x"07349063", - 00000695 => x"4788d5b7", - 00000696 => x"afe58593", - 00000697 => x"00800537", - 00000698 => x"e81ff0ef", - 00000699 => x"00800537", - 00000700 => x"00040593", - 00000701 => x"00450513", - 00000702 => x"e71ff0ef", - 00000703 => x"ff002a03", - 00000704 => x"008009b7", - 00000705 => x"ffc47413", - 00000706 => x"00000493", - 00000707 => x"00000913", - 00000708 => x"00c98a93", - 00000709 => x"01548533", - 00000710 => x"009a07b3", - 00000711 => x"02849663", - 00000712 => x"00898513", - 00000713 => x"412005b3", - 00000714 => x"e41ff0ef", - 00000715 => x"ffff1537", - 00000716 => x"ea050513", - 00000717 => x"f05ff06f", - 00000718 => x"00090513", - 00000719 => x"e85ff0ef", - 00000720 => x"01490933", - 00000721 => x"f91ff06f", - 00000722 => x"0007a583", - 00000723 => x"00448493", - 00000724 => x"00b90933", - 00000725 => x"e15ff0ef", - 00000726 => x"fbdff06f", - 00000727 => x"01c12083", - 00000728 => x"01812403", - 00000729 => x"01412483", - 00000730 => x"01012903", - 00000731 => x"00c12983", - 00000732 => x"00812a03", - 00000733 => 
x"00412a83", - 00000734 => x"02010113", - 00000735 => x"00008067", - 00000736 => x"ff010113", - 00000737 => x"f9402783", - 00000738 => x"f9002703", - 00000739 => x"f9402683", - 00000740 => x"fed79ae3", - 00000741 => x"00e12023", - 00000742 => x"00f12223", - 00000743 => x"00012503", - 00000744 => x"00412583", - 00000745 => x"01010113", - 00000746 => x"00008067", - 00000747 => x"f9800693", - 00000748 => x"fff00613", - 00000749 => x"00c6a023", - 00000750 => x"00a6a023", - 00000751 => x"00b6a223", - 00000752 => x"00008067", - 00000753 => x"fa402503", - 00000754 => x"0ff57513", + 00000528 => x"da1ff0ef", + 00000529 => x"fc0510e3", + 00000530 => x"ff1ff06f", + 00000531 => x"008009b7", + 00000532 => x"00498593", + 00000533 => x"00040513", + 00000534 => x"ec9ff0ef", + 00000535 => x"00050a93", + 00000536 => x"00898593", + 00000537 => x"00040513", + 00000538 => x"eb9ff0ef", + 00000539 => x"ff002c03", + 00000540 => x"00050b13", + 00000541 => x"ffcafb93", + 00000542 => x"00000913", + 00000543 => x"00000493", + 00000544 => x"00c98993", + 00000545 => x"013905b3", + 00000546 => x"052b9c63", + 00000547 => x"016484b3", + 00000548 => x"00200513", + 00000549 => x"fa0494e3", + 00000550 => x"ffff1537", + 00000551 => x"ebc50513", + 00000552 => x"470000ef", + 00000553 => x"02c12083", + 00000554 => x"02812403", + 00000555 => x"800007b7", + 00000556 => x"0157a023", + 00000557 => x"000a2023", + 00000558 => x"02412483", + 00000559 => x"02012903", + 00000560 => x"01c12983", + 00000561 => x"01812a03", + 00000562 => x"01412a83", + 00000563 => x"01012b03", + 00000564 => x"00c12b83", + 00000565 => x"00812c03", + 00000566 => x"03010113", + 00000567 => x"00008067", + 00000568 => x"00040513", + 00000569 => x"e3dff0ef", + 00000570 => x"012c07b3", + 00000571 => x"00a484b3", + 00000572 => x"00a7a023", + 00000573 => x"00490913", + 00000574 => x"f8dff06f", + 00000575 => x"ff010113", + 00000576 => x"00112623", + 00000577 => x"e95ff0ef", + 00000578 => x"ffff1537", + 00000579 => x"ec050513", + 00000580 => 
x"400000ef", + 00000581 => x"ac5ff0ef", + 00000582 => x"0000006f", + 00000583 => x"ff010113", + 00000584 => x"00112623", + 00000585 => x"00812423", + 00000586 => x"00912223", + 00000587 => x"00058413", + 00000588 => x"00050493", + 00000589 => x"d39ff0ef", + 00000590 => x"00000513", + 00000591 => x"41c000ef", + 00000592 => x"00200513", + 00000593 => x"458000ef", + 00000594 => x"00048513", + 00000595 => x"d49ff0ef", + 00000596 => x"00040513", + 00000597 => x"448000ef", + 00000598 => x"00000513", + 00000599 => x"41c000ef", + 00000600 => x"00812403", + 00000601 => x"00c12083", + 00000602 => x"00412483", + 00000603 => x"01010113", + 00000604 => x"cb5ff06f", + 00000605 => x"fe010113", + 00000606 => x"00812c23", + 00000607 => x"00912a23", + 00000608 => x"01212823", + 00000609 => x"00112e23", + 00000610 => x"00b12623", + 00000611 => x"00300413", + 00000612 => x"00350493", + 00000613 => x"fff00913", + 00000614 => x"00c10793", + 00000615 => x"008787b3", + 00000616 => x"0007c583", + 00000617 => x"40848533", + 00000618 => x"fff40413", + 00000619 => x"f71ff0ef", + 00000620 => x"ff2414e3", + 00000621 => x"01c12083", + 00000622 => x"01812403", + 00000623 => x"01412483", + 00000624 => x"01012903", + 00000625 => x"02010113", + 00000626 => x"00008067", + 00000627 => x"ff010113", + 00000628 => x"00112623", + 00000629 => x"00812423", + 00000630 => x"00050413", + 00000631 => x"c91ff0ef", + 00000632 => x"00000513", + 00000633 => x"374000ef", + 00000634 => x"0d800513", + 00000635 => x"3b0000ef", + 00000636 => x"00040513", + 00000637 => x"ca1ff0ef", + 00000638 => x"00000513", + 00000639 => x"37c000ef", + 00000640 => x"00812403", + 00000641 => x"00c12083", + 00000642 => x"01010113", + 00000643 => x"c19ff06f", + 00000644 => x"fe010113", + 00000645 => x"800007b7", + 00000646 => x"00812c23", + 00000647 => x"0007a403", + 00000648 => x"00112e23", + 00000649 => x"00912a23", + 00000650 => x"01212823", + 00000651 => x"01312623", + 00000652 => x"01412423", + 00000653 => x"01512223", + 00000654 => 
x"02041863", + 00000655 => x"ffff1537", + 00000656 => x"e3850513", + 00000657 => x"01812403", + 00000658 => x"01c12083", + 00000659 => x"01412483", + 00000660 => x"01012903", + 00000661 => x"00c12983", + 00000662 => x"00812a03", + 00000663 => x"00412a83", + 00000664 => x"02010113", + 00000665 => x"2ac0006f", + 00000666 => x"ffff1537", + 00000667 => x"ec450513", + 00000668 => x"2a0000ef", + 00000669 => x"00040513", + 00000670 => x"9edff0ef", + 00000671 => x"ffff1537", + 00000672 => x"ed050513", + 00000673 => x"28c000ef", + 00000674 => x"00800537", + 00000675 => x"9d9ff0ef", + 00000676 => x"ffff1537", + 00000677 => x"eec50513", + 00000678 => x"278000ef", + 00000679 => x"208000ef", + 00000680 => x"00050493", + 00000681 => x"1dc000ef", + 00000682 => x"07900793", + 00000683 => x"0af49e63", + 00000684 => x"b31ff0ef", + 00000685 => x"00051663", + 00000686 => x"00300513", + 00000687 => x"969ff0ef", + 00000688 => x"ffff1537", + 00000689 => x"ef850513", + 00000690 => x"01045493", + 00000691 => x"244000ef", + 00000692 => x"00148493", + 00000693 => x"00800937", + 00000694 => x"fff00993", + 00000695 => x"00010a37", + 00000696 => x"fff48493", + 00000697 => x"07349063", + 00000698 => x"4788d5b7", + 00000699 => x"afe58593", + 00000700 => x"00800537", + 00000701 => x"e81ff0ef", + 00000702 => x"00800537", + 00000703 => x"00040593", + 00000704 => x"00450513", + 00000705 => x"e71ff0ef", + 00000706 => x"ff002a03", + 00000707 => x"008009b7", + 00000708 => x"ffc47413", + 00000709 => x"00000493", + 00000710 => x"00000913", + 00000711 => x"00c98a93", + 00000712 => x"01548533", + 00000713 => x"009a07b3", + 00000714 => x"02849663", + 00000715 => x"00898513", + 00000716 => x"412005b3", + 00000717 => x"e41ff0ef", + 00000718 => x"ffff1537", + 00000719 => x"ebc50513", + 00000720 => x"f05ff06f", + 00000721 => x"00090513", + 00000722 => x"e85ff0ef", + 00000723 => x"01490933", + 00000724 => x"f91ff06f", + 00000725 => x"0007a583", + 00000726 => x"00448493", + 00000727 => x"00b90933", + 00000728 => 
x"e15ff0ef", + 00000729 => x"fbdff06f", + 00000730 => x"01c12083", + 00000731 => x"01812403", + 00000732 => x"01412483", + 00000733 => x"01012903", + 00000734 => x"00c12983", + 00000735 => x"00812a03", + 00000736 => x"00412a83", + 00000737 => x"02010113", + 00000738 => x"00008067", + 00000739 => x"ff010113", + 00000740 => x"f9402783", + 00000741 => x"f9002703", + 00000742 => x"f9402683", + 00000743 => x"fed79ae3", + 00000744 => x"00e12023", + 00000745 => x"00f12223", + 00000746 => x"00012503", + 00000747 => x"00412583", + 00000748 => x"01010113", + 00000749 => x"00008067", + 00000750 => x"f9800693", + 00000751 => x"fff00613", + 00000752 => x"00c6a023", + 00000753 => x"00a6a023", + 00000754 => x"00b6a223", 00000755 => x"00008067", - 00000756 => x"fa002023", - 00000757 => x"fe002703", - 00000758 => x"00151513", - 00000759 => x"00000793", - 00000760 => x"04a77463", - 00000761 => x"000016b7", - 00000762 => x"00000713", - 00000763 => x"ffe68693", - 00000764 => x"04f6e663", - 00000765 => x"00367613", - 00000766 => x"0035f593", - 00000767 => x"fff78793", - 00000768 => x"01461613", - 00000769 => x"00c7e7b3", - 00000770 => x"01659593", - 00000771 => x"01871713", - 00000772 => x"00b7e7b3", - 00000773 => x"00e7e7b3", - 00000774 => x"10000737", - 00000775 => x"00e7e7b3", - 00000776 => x"faf02023", - 00000777 => x"00008067", - 00000778 => x"00178793", - 00000779 => x"01079793", - 00000780 => x"40a70733", - 00000781 => x"0107d793", - 00000782 => x"fa9ff06f", - 00000783 => x"ffe70513", - 00000784 => x"0fd57513", - 00000785 => x"00051a63", - 00000786 => x"0037d793", - 00000787 => x"00170713", - 00000788 => x"0ff77713", - 00000789 => x"f9dff06f", - 00000790 => x"0017d793", - 00000791 => x"ff1ff06f", - 00000792 => x"f71ff06f", - 00000793 => x"fa002783", - 00000794 => x"fe07cee3", - 00000795 => x"faa02223", - 00000796 => x"00008067", - 00000797 => x"ff1ff06f", - 00000798 => x"fa002503", - 00000799 => x"01f55513", - 00000800 => x"00008067", - 00000801 => x"ff5ff06f", - 00000802 => 
x"fa402503", - 00000803 => x"fe055ee3", - 00000804 => x"0ff57513", - 00000805 => x"00008067", - 00000806 => x"ff1ff06f", - 00000807 => x"fa402503", - 00000808 => x"01f55513", - 00000809 => x"00008067", - 00000810 => x"ff5ff06f", - 00000811 => x"ff010113", - 00000812 => x"00812423", - 00000813 => x"01212023", - 00000814 => x"00112623", - 00000815 => x"00912223", - 00000816 => x"00050413", - 00000817 => x"00a00913", - 00000818 => x"00044483", - 00000819 => x"00140413", - 00000820 => x"00049e63", - 00000821 => x"00c12083", - 00000822 => x"00812403", - 00000823 => x"00412483", - 00000824 => x"00012903", - 00000825 => x"01010113", - 00000826 => x"00008067", - 00000827 => x"01249663", - 00000828 => x"00d00513", - 00000829 => x"f71ff0ef", - 00000830 => x"00048513", - 00000831 => x"f69ff0ef", - 00000832 => x"fc9ff06f", - 00000833 => x"fa9ff06f", - 00000834 => x"00757513", - 00000835 => x"00367613", - 00000836 => x"0015f593", - 00000837 => x"00a51513", - 00000838 => x"00d61613", - 00000839 => x"00c56533", - 00000840 => x"00959593", - 00000841 => x"fa800793", - 00000842 => x"00b56533", - 00000843 => x"0007a023", - 00000844 => x"10056513", - 00000845 => x"00a7a023", - 00000846 => x"00008067", - 00000847 => x"fa800713", - 00000848 => x"00072683", - 00000849 => x"00757793", - 00000850 => x"00100513", - 00000851 => x"00f51533", - 00000852 => x"00d56533", - 00000853 => x"00a72023", - 00000854 => x"00008067", - 00000855 => x"fa800713", - 00000856 => x"00072683", - 00000857 => x"00757513", - 00000858 => x"00100793", - 00000859 => x"00a797b3", - 00000860 => x"fff7c793", - 00000861 => x"00d7f7b3", - 00000862 => x"00f72023", - 00000863 => x"00008067", - 00000864 => x"faa02623", - 00000865 => x"fa802783", - 00000866 => x"fe07cee3", - 00000867 => x"fac02503", - 00000868 => x"00008067", - 00000869 => x"f8400713", - 00000870 => x"00072683", - 00000871 => x"00100793", - 00000872 => x"00a797b3", - 00000873 => x"00d7c7b3", - 00000874 => x"00f72023", + 00000756 => x"fa402503", + 00000757 => 
x"0ff57513", + 00000758 => x"00008067", + 00000759 => x"fa002023", + 00000760 => x"fe002703", + 00000761 => x"00151513", + 00000762 => x"00000793", + 00000763 => x"04a77463", + 00000764 => x"000016b7", + 00000765 => x"00000713", + 00000766 => x"ffe68693", + 00000767 => x"04f6e663", + 00000768 => x"00367613", + 00000769 => x"0035f593", + 00000770 => x"fff78793", + 00000771 => x"01461613", + 00000772 => x"00c7e7b3", + 00000773 => x"01659593", + 00000774 => x"01871713", + 00000775 => x"00b7e7b3", + 00000776 => x"00e7e7b3", + 00000777 => x"10000737", + 00000778 => x"00e7e7b3", + 00000779 => x"faf02023", + 00000780 => x"00008067", + 00000781 => x"00178793", + 00000782 => x"01079793", + 00000783 => x"40a70733", + 00000784 => x"0107d793", + 00000785 => x"fa9ff06f", + 00000786 => x"ffe70513", + 00000787 => x"0fd57513", + 00000788 => x"00051a63", + 00000789 => x"0037d793", + 00000790 => x"00170713", + 00000791 => x"0ff77713", + 00000792 => x"f9dff06f", + 00000793 => x"0017d793", + 00000794 => x"ff1ff06f", + 00000795 => x"f71ff06f", + 00000796 => x"fa002783", + 00000797 => x"fe07cee3", + 00000798 => x"faa02223", + 00000799 => x"00008067", + 00000800 => x"ff1ff06f", + 00000801 => x"fa002503", + 00000802 => x"01f55513", + 00000803 => x"00008067", + 00000804 => x"ff5ff06f", + 00000805 => x"fa402503", + 00000806 => x"fe055ee3", + 00000807 => x"0ff57513", + 00000808 => x"00008067", + 00000809 => x"ff1ff06f", + 00000810 => x"fa402503", + 00000811 => x"01f55513", + 00000812 => x"00008067", + 00000813 => x"ff5ff06f", + 00000814 => x"ff010113", + 00000815 => x"00812423", + 00000816 => x"01212023", + 00000817 => x"00112623", + 00000818 => x"00912223", + 00000819 => x"00050413", + 00000820 => x"00a00913", + 00000821 => x"00044483", + 00000822 => x"00140413", + 00000823 => x"00049e63", + 00000824 => x"00c12083", + 00000825 => x"00812403", + 00000826 => x"00412483", + 00000827 => x"00012903", + 00000828 => x"01010113", + 00000829 => x"00008067", + 00000830 => x"01249663", + 00000831 => 
x"00d00513", + 00000832 => x"f71ff0ef", + 00000833 => x"00048513", + 00000834 => x"f69ff0ef", + 00000835 => x"fc9ff06f", + 00000836 => x"fa9ff06f", + 00000837 => x"fe802503", + 00000838 => x"01355513", + 00000839 => x"00157513", + 00000840 => x"00008067", + 00000841 => x"00757513", + 00000842 => x"00367613", + 00000843 => x"0015f593", + 00000844 => x"00a51513", + 00000845 => x"00d61613", + 00000846 => x"00c56533", + 00000847 => x"00959593", + 00000848 => x"fa800793", + 00000849 => x"00b56533", + 00000850 => x"0007a023", + 00000851 => x"10056513", + 00000852 => x"00a7a023", + 00000853 => x"00008067", + 00000854 => x"fa800713", + 00000855 => x"00072683", + 00000856 => x"00757793", + 00000857 => x"00100513", + 00000858 => x"00f51533", + 00000859 => x"00d56533", + 00000860 => x"00a72023", + 00000861 => x"00008067", + 00000862 => x"fa800713", + 00000863 => x"00072683", + 00000864 => x"00757513", + 00000865 => x"00100793", + 00000866 => x"00a797b3", + 00000867 => x"fff7c793", + 00000868 => x"00d7f7b3", + 00000869 => x"00f72023", + 00000870 => x"00008067", + 00000871 => x"faa02623", + 00000872 => x"fa802783", + 00000873 => x"fe07cee3", + 00000874 => x"fac02503", 00000875 => x"00008067", - 00000876 => x"f8a02223", - 00000877 => x"00008067", - 00000878 => x"69617641", - 00000879 => x"6c62616c", - 00000880 => x"4d432065", - 00000881 => x"0a3a7344", - 00000882 => x"203a6820", - 00000883 => x"706c6548", - 00000884 => x"3a72200a", - 00000885 => x"73655220", - 00000886 => x"74726174", - 00000887 => x"3a75200a", - 00000888 => x"6c705520", - 00000889 => x"0a64616f", - 00000890 => x"203a7320", - 00000891 => x"726f7453", - 00000892 => x"6f742065", - 00000893 => x"616c6620", - 00000894 => x"200a6873", - 00000895 => x"4c203a6c", - 00000896 => x"2064616f", - 00000897 => x"6d6f7266", - 00000898 => x"616c6620", - 00000899 => x"200a6873", - 00000900 => x"45203a65", - 00000901 => x"75636578", - 00000902 => x"00006574", - 00000903 => x"65206f4e", - 00000904 => x"75636578", - 00000905 => 
x"6c626174", - 00000906 => x"76612065", - 00000907 => x"616c6961", - 00000908 => x"2e656c62", - 00000909 => x"00000000", - 00000910 => x"746f6f42", - 00000911 => x"2e676e69", - 00000912 => x"0a0a2e2e", - 00000913 => x"00000000", - 00000914 => x"52450a07", - 00000915 => x"5f524f52", + 00000876 => x"f8400713", + 00000877 => x"00072683", + 00000878 => x"00100793", + 00000879 => x"00a797b3", + 00000880 => x"00d7c7b3", + 00000881 => x"00f72023", + 00000882 => x"00008067", + 00000883 => x"f8a02223", + 00000884 => x"00008067", + 00000885 => x"69617641", + 00000886 => x"6c62616c", + 00000887 => x"4d432065", + 00000888 => x"0a3a7344", + 00000889 => x"203a6820", + 00000890 => x"706c6548", + 00000891 => x"3a72200a", + 00000892 => x"73655220", + 00000893 => x"74726174", + 00000894 => x"3a75200a", + 00000895 => x"6c705520", + 00000896 => x"0a64616f", + 00000897 => x"203a7320", + 00000898 => x"726f7453", + 00000899 => x"6f742065", + 00000900 => x"616c6620", + 00000901 => x"200a6873", + 00000902 => x"4c203a6c", + 00000903 => x"2064616f", + 00000904 => x"6d6f7266", + 00000905 => x"616c6620", + 00000906 => x"200a6873", + 00000907 => x"45203a65", + 00000908 => x"75636578", + 00000909 => x"00006574", + 00000910 => x"65206f4e", + 00000911 => x"75636578", + 00000912 => x"6c626174", + 00000913 => x"76612065", + 00000914 => x"616c6961", + 00000915 => x"2e656c62", 00000916 => x"00000000", - 00000917 => x"00007830", - 00000918 => x"58450a0a", - 00000919 => x"54504543", - 00000920 => x"204e4f49", - 00000921 => x"7561636d", - 00000922 => x"003d6573", - 00000923 => x"70204020", - 00000924 => x"00003d63", - 00000925 => x"69617741", - 00000926 => x"676e6974", - 00000927 => x"6f656e20", - 00000928 => x"32337672", - 00000929 => x"6578655f", - 00000930 => x"6e69622e", - 00000931 => x"202e2e2e", - 00000932 => x"00000000", - 00000933 => x"64616f4c", - 00000934 => x"2e676e69", - 00000935 => x"00202e2e", - 00000936 => x"00004b4f", - 00000937 => x"0000000a", - 00000938 => x"74697257", - 00000939 => 
x"78302065", - 00000940 => x"00000000", - 00000941 => x"74796220", - 00000942 => x"74207365", - 00000943 => x"5053206f", - 00000944 => x"6c662049", - 00000945 => x"20687361", - 00000946 => x"78302040", + 00000917 => x"746f6f42", + 00000918 => x"2e676e69", + 00000919 => x"0a0a2e2e", + 00000920 => x"00000000", + 00000921 => x"52450a07", + 00000922 => x"5f524f52", + 00000923 => x"00000000", + 00000924 => x"00007830", + 00000925 => x"58450a0a", + 00000926 => x"54504543", + 00000927 => x"204e4f49", + 00000928 => x"7561636d", + 00000929 => x"003d6573", + 00000930 => x"70204020", + 00000931 => x"00003d63", + 00000932 => x"69617741", + 00000933 => x"676e6974", + 00000934 => x"6f656e20", + 00000935 => x"32337672", + 00000936 => x"6578655f", + 00000937 => x"6e69622e", + 00000938 => x"202e2e2e", + 00000939 => x"00000000", + 00000940 => x"64616f4c", + 00000941 => x"2e676e69", + 00000942 => x"00202e2e", + 00000943 => x"00004b4f", + 00000944 => x"0000000a", + 00000945 => x"74697257", + 00000946 => x"78302065", 00000947 => x"00000000", - 00000948 => x"7928203f", - 00000949 => x"20296e2f", - 00000950 => x"00000000", - 00000951 => x"616c460a", - 00000952 => x"6e696873", - 00000953 => x"2e2e2e67", - 00000954 => x"00000020", - 00000955 => x"0a0a0a0a", - 00000956 => x"4e203c3c", - 00000957 => x"56524f45", - 00000958 => x"42203233", - 00000959 => x"6c746f6f", - 00000960 => x"6564616f", - 00000961 => x"3e3e2072", - 00000962 => x"4c420a0a", - 00000963 => x"203a5644", - 00000964 => x"20727041", - 00000965 => x"32203331", - 00000966 => x"0a313230", - 00000967 => x"3a565748", - 00000968 => x"00002020", - 00000969 => x"4b4c430a", - 00000970 => x"0020203a", - 00000971 => x"4553550a", - 00000972 => x"00203a52", - 00000973 => x"53494d0a", - 00000974 => x"00203a41", - 00000975 => x"58455a0a", - 00000976 => x"00203a54", - 00000977 => x"4f52500a", - 00000978 => x"00203a43", - 00000979 => x"454d490a", - 00000980 => x"00203a4d", - 00000981 => x"74796220", - 00000982 => x"40207365", - 00000983 => 
x"00000020", - 00000984 => x"454d440a", - 00000985 => x"00203a4d", - 00000986 => x"75410a0a", - 00000987 => x"6f626f74", - 00000988 => x"6920746f", - 00000989 => x"3828206e", - 00000990 => x"202e7329", - 00000991 => x"73657250", - 00000992 => x"656b2073", - 00000993 => x"6f742079", - 00000994 => x"6f626120", - 00000995 => x"0a2e7472", - 00000996 => x"00000000", - 00000997 => x"726f6241", - 00000998 => x"2e646574", - 00000999 => x"00000a0a", - 00001000 => x"444d430a", - 00001001 => x"00203e3a", - 00001002 => x"53207962", - 00001003 => x"68706574", - 00001004 => x"4e206e61", - 00001005 => x"69746c6f", - 00001006 => x"0000676e", - 00001007 => x"61766e49", - 00001008 => x"2064696c", - 00001009 => x"00444d43", - 00001010 => x"33323130", - 00001011 => x"37363534", - 00001012 => x"62613938", - 00001013 => x"66656463", + 00000948 => x"74796220", + 00000949 => x"74207365", + 00000950 => x"5053206f", + 00000951 => x"6c662049", + 00000952 => x"20687361", + 00000953 => x"78302040", + 00000954 => x"00000000", + 00000955 => x"7928203f", + 00000956 => x"20296e2f", + 00000957 => x"00000000", + 00000958 => x"616c460a", + 00000959 => x"6e696873", + 00000960 => x"2e2e2e67", + 00000961 => x"00000020", + 00000962 => x"0a0a0a0a", + 00000963 => x"4e203c3c", + 00000964 => x"56524f45", + 00000965 => x"42203233", + 00000966 => x"6c746f6f", + 00000967 => x"6564616f", + 00000968 => x"3e3e2072", + 00000969 => x"4c420a0a", + 00000970 => x"203a5644", + 00000971 => x"20727041", + 00000972 => x"32203132", + 00000973 => x"0a313230", + 00000974 => x"3a565748", + 00000975 => x"00002020", + 00000976 => x"4b4c430a", + 00000977 => x"0020203a", + 00000978 => x"4553550a", + 00000979 => x"00203a52", + 00000980 => x"53494d0a", + 00000981 => x"00203a41", + 00000982 => x"58455a0a", + 00000983 => x"00203a54", + 00000984 => x"4f52500a", + 00000985 => x"00203a43", + 00000986 => x"454d490a", + 00000987 => x"00203a4d", + 00000988 => x"74796220", + 00000989 => x"40207365", + 00000990 => x"00000020", + 00000991 => 
x"454d440a", + 00000992 => x"00203a4d", + 00000993 => x"75410a0a", + 00000994 => x"6f626f74", + 00000995 => x"6920746f", + 00000996 => x"3828206e", + 00000997 => x"202e7329", + 00000998 => x"73657250", + 00000999 => x"656b2073", + 00001000 => x"6f742079", + 00001001 => x"6f626120", + 00001002 => x"0a2e7472", + 00001003 => x"00000000", + 00001004 => x"726f6241", + 00001005 => x"2e646574", + 00001006 => x"00000a0a", + 00001007 => x"444d430a", + 00001008 => x"00203e3a", + 00001009 => x"53207962", + 00001010 => x"68706574", + 00001011 => x"4e206e61", + 00001012 => x"69746c6f", + 00001013 => x"0000676e", + 00001014 => x"61766e49", + 00001015 => x"2064696c", + 00001016 => x"00444d43", + 00001017 => x"33323130", + 00001018 => x"37363534", + 00001019 => x"62613938", + 00001020 => x"66656463", others => x"00000000" );
/rtl/core/neorv32_bus_keeper.vhd
0,0 → 1,144
-- #################################################################################################
-- # << NEORV32 - Bus Keeper (BUSKEEPER) >> #
-- # ********************************************************************************************* #
-- # This unit monitors the processor-internal bus. If the accessed INTERNAL (IMEM if enabled, #
-- # DMEM if enabled, BOOTROM + IO region) module does not respond within the defined number of #
-- # cycles (VHDL package: max_proc_int_response_time_c) it asserts the error signal to inform the #
-- # CPU / bus driver. This timeout does not track accesses via the processor-external bus #
-- # interface! #
-- # ********************************************************************************************* #
-- # BSD 3-Clause License #
-- # #
-- # Copyright (c) 2021, Stephan Nolting. All rights reserved. #
-- # #
-- # Redistribution and use in source and binary forms, with or without modification, are #
-- # permitted provided that the following conditions are met: #
-- # #
-- # 1. Redistributions of source code must retain the above copyright notice, this list of #
-- # conditions and the following disclaimer. #
-- # #
-- # 2. Redistributions in binary form must reproduce the above copyright notice, this list of #
-- # conditions and the following disclaimer in the documentation and/or other materials #
-- # provided with the distribution. #
-- # #
-- # 3. Neither the name of the copyright holder nor the names of its contributors may be used to #
-- # endorse or promote products derived from this software without specific prior written #
-- # permission. #
-- # #
-- # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS #
-- # OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF #
-- # MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE #
-- # COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, #
-- # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE #
-- # GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED #
-- # AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING #
-- # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED #
-- # OF THE POSSIBILITY OF SUCH DAMAGE. #
-- # ********************************************************************************************* #
-- # The NEORV32 Processor - https://github.com/stnolting/neorv32 (c) Stephan Nolting #
-- #################################################################################################
 
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
 
library neorv32;
use neorv32.neorv32_package.all;
 
entity neorv32_bus_keeper is
generic (
-- Internal instruction memory --
MEM_INT_IMEM_EN : boolean := true; -- implement processor-internal instruction memory
MEM_INT_IMEM_SIZE : natural := 8*1024; -- size of processor-internal instruction memory in bytes
-- Internal data memory --
MEM_INT_DMEM_EN : boolean := true; -- implement processor-internal data memory
MEM_INT_DMEM_SIZE : natural := 8*1024 -- size of processor-internal data memory in bytes
);
port (
-- host access --
clk_i : in std_ulogic; -- global clock line
rstn_i : in std_ulogic; -- global reset line, low-active
addr_i : in std_ulogic_vector(31 downto 0); -- address
rden_i : in std_ulogic; -- read enable
wren_i : in std_ulogic; -- write enable
ack_i : in std_ulogic; -- transfer acknowledge from bus system
err_i : in std_ulogic; -- transfer error from bus system
err_o : out std_ulogic -- bus error
);
end neorv32_bus_keeper;
 
architecture neorv32_bus_keeper_rtl of neorv32_bus_keeper is
 
-- access check --
type access_check_t is record
int_imem : std_ulogic;
int_dmem : std_ulogic;
int_bootrom_io : std_ulogic;
valid : std_ulogic;
end record;
signal access_check : access_check_t;
 
-- controller --
type control_t is record
pending : std_ulogic;
timeout : std_ulogic_vector(index_size_f(max_proc_int_response_time_c)-1 downto 0);
bus_err : std_ulogic;
end record;
signal control : control_t;
 
begin
 
-- Sanity Check --------------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
assert not (max_proc_int_response_time_c < 2) report "NEORV32 PROCESSOR CONFIG ERROR! Processor-internal bus timeout <max_proc_int_response_time_c> has to be >= 2." severity error;
 
 
-- Access Control -------------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
-- access to processor-internal IMEM or DMEM? --
access_check.int_imem <= '1' when (addr_i(31 downto index_size_f(MEM_INT_IMEM_SIZE)) = imem_base_c(31 downto index_size_f(MEM_INT_IMEM_SIZE))) and (MEM_INT_IMEM_EN = true) else '0';
access_check.int_dmem <= '1' when (addr_i(31 downto index_size_f(MEM_INT_DMEM_SIZE)) = dmem_base_c(31 downto index_size_f(MEM_INT_DMEM_SIZE))) and (MEM_INT_DMEM_EN = true) else '0';
-- access to processor-internal BOOTROM or IO devices? --
access_check.int_bootrom_io <= '1' when (addr_i(31 downto 16) = boot_rom_base_c(31 downto 16)) else '0'; -- hacky!
-- actual internal bus access? --
access_check.valid <= access_check.int_imem or access_check.int_dmem or access_check.int_bootrom_io;
 
 
-- Keeper ---------------------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
keeper_control: process(rstn_i, clk_i)
begin
if (rstn_i = '0') then
control.pending <= '0';
control.bus_err <= '0';
control.timeout <= (others => def_rst_val_c);
elsif rising_edge(clk_i) then
 
-- pending access? --
control.bus_err <= '0';
if (control.pending = '0') then -- idle
if ((rden_i or wren_i) = '1') and (access_check.valid = '1') then
control.pending <= '1';
end if;
else -- pending
if (ack_i = '1') or (err_i = '1') then -- termination by bus system
control.pending <= '0';
elsif (or_all_f(control.timeout) = '0') then -- timeout! terminate bus transfer
control.pending <= '0';
control.bus_err <= '1';
end if;
end if;
 
-- timeout counter --
if (control.pending = '0') then
control.timeout <= std_ulogic_vector(to_unsigned(max_proc_int_response_time_c, index_size_f(max_proc_int_response_time_c)));
else
control.timeout <= std_ulogic_vector(unsigned(control.timeout) - 1); -- countdown timer
end if;
end if;
end process keeper_control;
 
err_o <= control.bus_err;
 
 
end neorv32_bus_keeper_rtl;
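
The `keeper_control` process above can be followed cycle by cycle with a small software model. The sketch below is a hedged Python approximation, not part of the repository. `BusKeeperModel`, `clock`, `request` and `internal` are hypothetical names chosen for illustration. The address decode (`access_check.valid`) is reduced to the boolean `internal`, and the bit-width truncation of the timeout counter (`index_size_f`) is not modelled.

```python
from dataclasses import dataclass


@dataclass
class BusKeeperModel:
    """Cycle-level sketch of the keeper_control process (assumed simplifications)."""

    max_response_time: int  # stands in for max_proc_int_response_time_c
    pending: bool = False   # control.pending
    timeout: int = 0        # control.timeout (counter width is NOT modelled)
    bus_err: bool = False   # control.bus_err, drives err_o

    def clock(self, request: bool, internal: bool,
              ack: bool = False, err: bool = False) -> bool:
        """Apply one rising clock edge and return the new err_o value.

        request  -- rden_i or wren_i
        internal -- access_check.valid (address hits IMEM, DMEM, BOOTROM or IO)
        """
        # VHDL signal semantics: every decision uses the values from before the edge.
        pending, timeout = self.pending, self.timeout

        self.bus_err = False  # error flag is a single-cycle pulse
        if not pending:  # idle
            if request and internal:
                self.pending = True
        else:  # access in progress
            if ack or err:  # terminated by the bus system
                self.pending = False
            elif timeout == 0:  # nobody answered in time
                self.pending = False
                self.bus_err = True

        # Timeout counter: reloaded while idle, counting down while pending.
        if not pending:
            self.timeout = self.max_response_time
        else:
            self.timeout = timeout - 1
        return self.bus_err


if __name__ == "__main__":
    keeper = BusKeeperModel(max_response_time=4)
    trace = [keeper.clock(request=True, internal=True)]
    trace += [keeper.clock(request=False, internal=True) for _ in range(5)]
    print(trace)
```

In this model, with `max_response_time=4` and no acknowledge, the error flag is raised on the sixth clock edge, counting the edge that accepts the request as the first. An access that never hits the internal address ranges leaves the keeper idle, which matches the header comment that accesses via the processor-external bus interface are not tracked.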
/rtl/core/neorv32_busswitch.vhd
58,8 → 58,7
ca_bus_ben_i : in std_ulogic_vector(03 downto 0); -- byte enable
ca_bus_we_i : in std_ulogic; -- write enable
ca_bus_re_i : in std_ulogic; -- read enable
ca_bus_cancel_i : in std_ulogic; -- cancel current bus transaction
ca_bus_excl_i : in std_ulogic; -- exclusive access
ca_bus_lock_i : in std_ulogic; -- exclusive access request
ca_bus_ack_o : out std_ulogic; -- bus transfer acknowledge
ca_bus_err_o : out std_ulogic; -- bus transfer error
-- controller interface b --
69,8 → 68,7
cb_bus_ben_i : in std_ulogic_vector(03 downto 0); -- byte enable
cb_bus_we_i : in std_ulogic; -- write enable
cb_bus_re_i : in std_ulogic; -- read enable
cb_bus_cancel_i : in std_ulogic; -- cancel current bus transaction
cb_bus_excl_i : in std_ulogic; -- exclusive access
cb_bus_lock_i : in std_ulogic; -- exclusive access request
cb_bus_ack_o : out std_ulogic; -- bus transfer acknowledge
cb_bus_err_o : out std_ulogic; -- bus transfer error
-- peripheral bus --
81,8 → 79,7
p_bus_ben_o : out std_ulogic_vector(03 downto 0); -- byte enable
p_bus_we_o : out std_ulogic; -- write enable
p_bus_re_o : out std_ulogic; -- read enable
p_bus_cancel_o : out std_ulogic; -- cancel current bus transaction
p_bus_excl_o : out std_ulogic; -- exclusive access
p_bus_lock_o : out std_ulogic; -- exclusive access request
p_bus_ack_i : in std_ulogic; -- bus transfer acknowledge
p_bus_err_i : in std_ulogic -- bus transfer error
);
129,8 → 126,7
if (ca_rd_req_buf = '0') and (ca_wr_req_buf = '0') then -- idle
ca_rd_req_buf <= ca_bus_re_i;
ca_wr_req_buf <= ca_bus_we_i;
elsif (ca_bus_cancel_i = '1') or -- controller cancels access
(ca_bus_err = '1') or -- peripheral cancels access
elsif (ca_bus_err = '1') or -- error termination
(ca_bus_ack = '1') then -- normal termination
ca_rd_req_buf <= '0';
ca_wr_req_buf <= '0';
140,8 → 136,7
if (cb_rd_req_buf = '0') and (cb_wr_req_buf = '0') then
cb_rd_req_buf <= cb_bus_re_i;
cb_wr_req_buf <= cb_bus_we_i;
elsif (cb_bus_cancel_i = '1') or -- controller cancels access
(cb_bus_err = '1') or -- peripheral cancels access
elsif (cb_bus_err = '1') or -- error termination
(cb_bus_ack = '1') then -- normal termination
cb_rd_req_buf <= '0';
cb_wr_req_buf <= '0';
175,8 → 170,7
-- Peripheral Bus Arbiter -----------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
arbiter_comb: process(arbiter, ca_req_current, cb_req_current, ca_req_buffered, cb_req_buffered,
ca_rd_req_buf, ca_wr_req_buf, cb_rd_req_buf, cb_wr_req_buf,
ca_bus_cancel_i, cb_bus_cancel_i, p_bus_ack_i, p_bus_err_i)
ca_rd_req_buf, ca_wr_req_buf, cb_rd_req_buf, cb_wr_req_buf, p_bus_ack_i, p_bus_err_i)
begin
-- arbiter defaults --
arbiter.state_nxt <= arbiter.state;
210,8 → 204,7
-- ------------------------------------------------------------
p_bus_src_o <= '0'; -- access from port A
arbiter.bus_sel <= '0';
if (ca_bus_cancel_i = '1') or -- controller cancels access
(p_bus_err_i = '1') or -- peripheral cancels access
if (p_bus_err_i = '1') or -- error termination
(p_bus_ack_i = '1') then -- normal termination
arbiter.state_nxt <= IDLE;
end if;
230,8 → 223,7
-- ------------------------------------------------------------
p_bus_src_o <= '1'; -- access from port B
arbiter.bus_sel <= '1';
if (cb_bus_cancel_i = '1') or -- controller cancels access
(p_bus_err_i = '1') or -- peripheral cancels access
if (p_bus_err_i = '1') or -- error termination
(p_bus_ack_i = '1') then -- normal termination
if (ca_req_buffered = '1') or (ca_req_current = '1') then -- any request from A?
arbiter.state_nxt <= RETIRE;
263,10 → 255,9
ca_bus_ben_i when (arbiter.bus_sel = '0') else cb_bus_ben_i;
p_bus_we <= ca_bus_we_i when (arbiter.bus_sel = '0') else cb_bus_we_i;
p_bus_re <= ca_bus_re_i when (arbiter.bus_sel = '0') else cb_bus_re_i;
p_bus_cancel_o <= ca_bus_cancel_i when (arbiter.bus_sel = '0') else cb_bus_cancel_i;
p_bus_we_o <= (p_bus_we or arbiter.we_trig);
p_bus_re_o <= (p_bus_re or arbiter.re_trig);
p_bus_excl_o <= ca_bus_excl_i or cb_bus_excl_i;
p_bus_lock_o <= ca_bus_lock_i or cb_bus_lock_i;
 
ca_bus_rdata_o <= p_bus_rdata_i;
cb_bus_rdata_o <= p_bus_rdata_i;
/rtl/core/neorv32_cpu.vhd
58,7 → 58,6
-- General --
HW_THREAD_ID : natural := 0; -- hardware thread id (32-bit)
CPU_BOOT_ADDR : std_ulogic_vector(31 downto 0):= x"00000000"; -- cpu boot address
BUS_TIMEOUT : natural := 63; -- cycles after an UNACKNOWLEDGED bus access triggers a bus fault exception
-- RISC-V CPU Extensions --
CPU_EXTENSION_RISCV_A : boolean := false; -- implement atomic extension?
CPU_EXTENSION_RISCV_B : boolean := false; -- implement bit manipulation extensions?
93,7 → 92,7
i_bus_ben_o : out std_ulogic_vector(03 downto 0); -- byte enable
i_bus_we_o : out std_ulogic; -- write enable
i_bus_re_o : out std_ulogic; -- read enable
i_bus_cancel_o : out std_ulogic; -- cancel current bus transaction
i_bus_lock_o : out std_ulogic; -- exclusive access request
i_bus_ack_i : in std_ulogic := '0'; -- bus transfer acknowledge
i_bus_err_i : in std_ulogic := '0'; -- bus transfer error
i_bus_fence_o : out std_ulogic; -- executed FENCEI operation
105,13 → 104,11
d_bus_ben_o : out std_ulogic_vector(03 downto 0); -- byte enable
d_bus_we_o : out std_ulogic; -- write enable
d_bus_re_o : out std_ulogic; -- read enable
d_bus_cancel_o : out std_ulogic; -- cancel current bus transaction
d_bus_lock_o : out std_ulogic; -- exclusive access request
d_bus_ack_i : in std_ulogic := '0'; -- bus transfer acknowledge
d_bus_err_i : in std_ulogic := '0'; -- bus transfer error
d_bus_fence_o : out std_ulogic; -- executed FENCE operation
d_bus_priv_o : out std_ulogic_vector(1 downto 0); -- privilege level
d_bus_excl_o : out std_ulogic; -- exclusive access request
d_bus_excl_i : in std_ulogic; -- state of exclusive access (set if success)
-- system time input from MTIME --
time_i : in std_ulogic_vector(63 downto 0) := (others => '0'); -- current system time
-- interrupts (risc-v compliant) --
143,7 → 140,7
signal ma_instr : std_ulogic; -- misaligned instruction address
signal ma_load : std_ulogic; -- misaligned load data address
signal ma_store : std_ulogic; -- misaligned store data address
signal bus_excl_ok : std_ulogic; -- atomic memory access successful
signal excl_state : std_ulogic; -- atomic/exclusive access lock status
signal be_instr : std_ulogic; -- bus error on instruction access
signal be_load : std_ulogic; -- bus error on load data access
signal be_store : std_ulogic; -- bus error on store data access
161,11 → 158,6
signal pmp_addr : pmp_addr_if_t;
signal pmp_ctrl : pmp_ctrl_if_t;
 
-- atomic memory access - success? --
signal atomic_sc_res : std_ulogic;
signal atomic_sc_res_ff : std_ulogic;
signal atomic_sc_val : std_ulogic;
 
begin
 
-- Sanity Checks --------------------------------------------------------------------------
185,9 → 177,6
-- U-extension requires Zicsr extension --
assert not ((CPU_EXTENSION_RISCV_Zicsr = false) and (CPU_EXTENSION_RISCV_U = true)) report "NEORV32 CPU CONFIG ERROR! User mode requires <CPU_EXTENSION_RISCV_Zicsr> extension to be enabled." severity error;
 
-- Bus timeout --
assert not (BUS_TIMEOUT < 2) report "NEORV32 CPU CONFIG ERROR! Invalid bus access timeout value <BUS_TIMEOUT>. Has to be >= 2." severity error;
 
-- Instruction prefetch buffer size --
assert not (is_power_of_two_f(ipb_entries_c) = false) report "NEORV32 CPU CONFIG ERROR! Number of entries in instruction prefetch buffer <ipb_entries_c> has to be a power of two." severity error;
-- A extension - only lr.w and sc.w are supported yet --
252,6 → 241,7
alu_wait_i => alu_wait, -- wait for ALU
bus_i_wait_i => bus_i_wait, -- wait for bus
bus_d_wait_i => bus_d_wait, -- wait for bus
excl_state_i => excl_state, -- atomic/exclusive access lock status
-- data input --
instr_i => instr, -- instruction
cmp_i => comparator, -- comparator status
341,8 → 331,16
);
 
 
-- Co-Processor 0: Integer Multiplication/Division ('M' Extension) ------------------------
-- Co-Processor 0: CSR (Read) Access ('Zicsr' Extension) ----------------------------------
-- -------------------------------------------------------------------------------------------
-- "pseudo" co-processor for CSR *read* access operations
-- required to get CSR read data into the data path
cp_result(0) <= csr_rdata when (CPU_EXTENSION_RISCV_Zicsr = true) else (others => '0');
cp_valid(0) <= cp_start(0); -- always assigned even if Zicsr extension is disabled to make sure CPU does not get stalled if there is an accidental access
 
 
-- Co-Processor 1: Integer Multiplication/Division ('M' Extension) ------------------------
-- -------------------------------------------------------------------------------------------
neorv32_cpu_cp_muldiv_inst_true:
if (CPU_EXTENSION_RISCV_M = true) generate
neorv32_cpu_cp_muldiv_inst: neorv32_cpu_cp_muldiv
354,50 → 352,23
clk_i => clk_i, -- global clock, rising edge
rstn_i => rstn_i, -- global reset, low-active, async
ctrl_i => ctrl, -- main control bus
start_i => cp_start(0), -- trigger operation
start_i => cp_start(1), -- trigger operation
-- data input --
rs1_i => rs1, -- rf source 1
rs2_i => rs2, -- rf source 2
-- result and status --
res_o => cp_result(0), -- operation result
valid_o => cp_valid(0) -- data output valid
res_o => cp_result(1), -- operation result
valid_o => cp_valid(1) -- data output valid
);
end generate;
 
neorv32_cpu_cp_muldiv_inst_false:
if (CPU_EXTENSION_RISCV_M = false) generate
cp_result(0) <= (others => '0');
cp_valid(0) <= cp_start(0); -- to make sure CPU does not get stalled if there is an accidental access
cp_result(1) <= (others => '0');
cp_valid(1) <= cp_start(1); -- to make sure CPU does not get stalled if there is an accidental access
end generate;
 
 
-- Co-Processor 1: Atomic Memory Access ('A' Extension) -----------------------------------
-- -------------------------------------------------------------------------------------------
-- "pseudo" co-processor for atomic operations
-- required to get the result of a store-conditional operation into the data path
atomic_op_cp: process(rstn_i, clk_i)
begin
if (rstn_i = '0') then
atomic_sc_val <= def_rst_val_c;
atomic_sc_res <= def_rst_val_c;
atomic_sc_res_ff <= def_rst_val_c;
elsif rising_edge(clk_i) then
atomic_sc_val <= cp_start(1);
atomic_sc_res <= bus_excl_ok;
if (atomic_sc_val = '1') then
atomic_sc_res_ff <= not atomic_sc_res;
else
atomic_sc_res_ff <= '0';
end if;
end if;
end process atomic_op_cp;
 
-- CP result --
cp_result(1)(data_width_c-1 downto 1) <= (others => '0');
cp_result(1)(0) <= atomic_sc_res_ff when (CPU_EXTENSION_RISCV_A = true) else '0';
cp_valid(1) <= atomic_sc_val when (CPU_EXTENSION_RISCV_A = true) else cp_start(1); -- assigned even if A extension is disabled so CPU does not get stalled on accidental access
 
 
-- Co-Processor 2: Bit Manipulation ('B' Extension) ---------------------------------------
-- -------------------------------------------------------------------------------------------
neorv32_cpu_cp_bitmanip_inst_true:
426,16 → 397,8
end generate;
 
 
-- Co-Processor 3: CSR (Read) Access ('Zicsr' Extension) ----------------------------------
-- Co-Processor 3: Single-Precision Floating-Point Unit ('Zfinx' Extension) ---------------
-- -------------------------------------------------------------------------------------------
-- "pseudo" co-processor for CSR *read* access operations
-- required to get CSR read data into the data path
cp_result(3) <= csr_rdata when (CPU_EXTENSION_RISCV_Zicsr = true) else (others => '0');
cp_valid(3) <= cp_start(3); -- always assigned even if Zicsr extension is disabled to make sure CPU does not get stalled if there is an accidental access
 
 
-- Co-Processor 4: Single-Precision Floating-Point Unit ('Zfinx' Extension) ---------------
-- -------------------------------------------------------------------------------------------
neorv32_cpu_cp_fpu_inst_true:
if (CPU_EXTENSION_RISCV_Zfinx = true) generate
neorv32_cpu_cp_fpu_inst: neorv32_cpu_cp_fpu
444,7 → 407,7
clk_i => clk_i, -- global clock, rising edge
rstn_i => rstn_i, -- global reset, low-active, async
ctrl_i => ctrl, -- main control bus
start_i => cp_start(4), -- trigger operation
start_i => cp_start(3), -- trigger operation
-- data input --
frm_i => fpu_rm, -- rounding mode
cmp_i => comparator, -- comparator status
451,22 → 414,25
rs1_i => rs1, -- rf source 1
rs2_i => rs2, -- rf source 2
-- result and status --
res_o => cp_result(4), -- operation result
res_o => cp_result(3), -- operation result
fflags_o => fpu_flags, -- exception flags
valid_o => cp_valid(4) -- data output valid
valid_o => cp_valid(3) -- data output valid
);
end generate;
 
neorv32_cpu_cp_fpu_inst_false:
if (CPU_EXTENSION_RISCV_Zfinx = false) generate
cp_result(4) <= (others => '0');
cp_result(3) <= (others => '0');
fpu_flags <= (others => '0');
cp_valid(4) <= cp_start(4); -- to make sure CPU does not get stalled if there is an accidental access
cp_valid(3) <= cp_start(3); -- to make sure CPU does not get stalled if there is an accidental access
end generate;
 
 
-- Co-Processor 5,6,7: Not Implemented Yet ------------------------------------------------
-- Co-Processor 4,5,6,7: Not Implemented --------------------------------------------------
-- -------------------------------------------------------------------------------------------
cp_result(4) <= (others => '0');
cp_valid(4) <= '0';
--
cp_result(5) <= (others => '0');
cp_valid(5) <= '0';
--
485,9 → 451,7
CPU_EXTENSION_RISCV_C => CPU_EXTENSION_RISCV_C, -- implement compressed extension?
-- Physical memory protection (PMP) --
PMP_NUM_REGIONS => PMP_NUM_REGIONS, -- number of regions (0..64)
PMP_MIN_GRANULARITY => PMP_MIN_GRANULARITY, -- minimal region granularity in bytes, has to be a power of 2, min 8 bytes
-- Bus Timeout --
BUS_TIMEOUT => BUS_TIMEOUT -- cycles after an UNACKNOWLEDGED bus access triggers a bus fault exception
PMP_MIN_GRANULARITY => PMP_MIN_GRANULARITY -- minimal region granularity in bytes, has to be a power of 2, min 8 bytes
)
port map (
-- global control --
508,7 → 472,7
mar_o => mar, -- current memory address register
d_wait_o => bus_d_wait, -- wait for access to complete
--
bus_excl_ok_o => bus_excl_ok, -- bus exclusive access successful
excl_state_o => excl_state, -- atomic/exclusive access status
ma_load_o => ma_load, -- misaligned load data address
ma_store_o => ma_store, -- misaligned store data address
be_load_o => be_load, -- bus error on load data access
523,7 → 487,7
i_bus_ben_o => i_bus_ben_o, -- byte enable
i_bus_we_o => i_bus_we_o, -- write enable
i_bus_re_o => i_bus_re_o, -- read enable
i_bus_cancel_o => i_bus_cancel_o, -- cancel current bus transaction
i_bus_lock_o => i_bus_lock_o, -- exclusive access request
i_bus_ack_i => i_bus_ack_i, -- bus transfer acknowledge
i_bus_err_i => i_bus_err_i, -- bus transfer error
i_bus_fence_o => i_bus_fence_o, -- fence operation
534,12 → 498,10
d_bus_ben_o => d_bus_ben_o, -- byte enable
d_bus_we_o => d_bus_we_o, -- write enable
d_bus_re_o => d_bus_re_o, -- read enable
d_bus_cancel_o => d_bus_cancel_o, -- cancel current bus transaction
d_bus_lock_o => d_bus_lock_o, -- exclusive access request
d_bus_ack_i => d_bus_ack_i, -- bus transfer acknowledge
d_bus_err_i => d_bus_err_i, -- bus transfer error
d_bus_fence_o => d_bus_fence_o, -- fence operation
d_bus_excl_o => d_bus_excl_o, -- exclusive access request
d_bus_excl_i => d_bus_excl_i -- state of exclusive access (set if success)
d_bus_fence_o => d_bus_fence_o -- fence operation
);
 
-- current privilege level --
/rtl/core/neorv32_cpu_bus.vhd
43,13 → 43,11
 
entity neorv32_cpu_bus is
generic (
CPU_EXTENSION_RISCV_A : boolean := false; -- implement atomic extension?
CPU_EXTENSION_RISCV_C : boolean := true; -- implement compressed extension?
CPU_EXTENSION_RISCV_A : boolean := false; -- implement atomic extension?
CPU_EXTENSION_RISCV_C : boolean := true; -- implement compressed extension?
-- Physical memory protection (PMP) --
PMP_NUM_REGIONS : natural := 0; -- number of regions (0..64)
PMP_MIN_GRANULARITY : natural := 64*1024; -- minimal region granularity in bytes, has to be a power of 2, min 8 bytes
-- Bus Timeout --
BUS_TIMEOUT : natural := 63 -- cycles after an UNACKNOWLEDGED bus access triggers a bus fault exception
PMP_NUM_REGIONS : natural := 0; -- number of regions (0..64)
PMP_MIN_GRANULARITY : natural := 64*1024 -- minimal region granularity in bytes, has to be a power of 2, min 8 bytes
);
port (
-- global control --
70,7 → 68,7
mar_o : out std_ulogic_vector(data_width_c-1 downto 0); -- current memory address register
d_wait_o : out std_ulogic; -- wait for access to complete
--
bus_excl_ok_o : out std_ulogic; -- bus exclusive access successful
excl_state_o : out std_ulogic; -- atomic/exclusive access status
ma_load_o : out std_ulogic; -- misaligned load data address
ma_store_o : out std_ulogic; -- misaligned store data address
be_load_o : out std_ulogic; -- bus error on load data access
85,7 → 83,7
i_bus_ben_o : out std_ulogic_vector(03 downto 0); -- byte enable
i_bus_we_o : out std_ulogic; -- write enable
i_bus_re_o : out std_ulogic; -- read enable
i_bus_cancel_o : out std_ulogic; -- cancel current bus transaction
i_bus_lock_o : out std_ulogic; -- exclusive access request
i_bus_ack_i : in std_ulogic; -- bus transfer acknowledge
i_bus_err_i : in std_ulogic; -- bus transfer error
i_bus_fence_o : out std_ulogic; -- fence operation
96,12 → 94,10
d_bus_ben_o : out std_ulogic_vector(03 downto 0); -- byte enable
d_bus_we_o : out std_ulogic; -- write enable
d_bus_re_o : out std_ulogic; -- read enable
d_bus_cancel_o : out std_ulogic; -- cancel current bus transaction
d_bus_lock_o : out std_ulogic; -- exclusive access request
d_bus_ack_i : in std_ulogic; -- bus transfer acknowledge
d_bus_err_i : in std_ulogic; -- bus transfer error
d_bus_fence_o : out std_ulogic; -- fence operation
d_bus_excl_o : out std_ulogic; -- exclusive access request
d_bus_excl_i : in std_ulogic -- state of exclusive access (set if success)
d_bus_fence_o : out std_ulogic -- fence operation
);
end neorv32_cpu_bus;
 
130,6 → 126,7
-- data access --
signal d_bus_wdata : std_ulogic_vector(data_width_c-1 downto 0); -- write data
signal d_bus_rdata : std_ulogic_vector(data_width_c-1 downto 0); -- read data
signal rdata_align : std_ulogic_vector(data_width_c-1 downto 0); -- read-data alignment
signal d_bus_ben : std_ulogic_vector(3 downto 0); -- write data byte enable
 
-- misaligned access? --
141,10 → 138,13
wr_req : std_ulogic; -- write access in progress
err_align : std_ulogic; -- alignment error
err_bus : std_ulogic; -- bus access error
timeout : std_ulogic_vector(index_size_f(BUS_TIMEOUT)-1 downto 0);
end record;
signal i_arbiter, d_arbiter : bus_arbiter_t;
 
-- atomic/exclusive access - reservation controller --
signal exclusive_lock : std_ulogic;
signal exclusive_lock_status : std_ulogic_vector(data_width_c-1 downto 0); -- read data
 
-- physical memory protection --
type pmp_addr_t is array (0 to PMP_NUM_REGIONS-1) of std_ulogic_vector(data_width_c-1 downto 0);
type pmp_t is record
258,7 → 258,7
 
-- Data Interface: Read Data --------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
mem_out_buf: process(rstn_i, clk_i)
mem_di_reg: process(rstn_i, clk_i)
begin
if (rstn_i = '0') then
mdi <= (others => def_rst_val_c);
267,7 → 267,7
mdi <= d_bus_rdata; -- memory data input register (MDI)
end if;
end if;
end process mem_out_buf;
end process mem_di_reg;
 
-- input data alignment and sign extension --
read_align: process(mdi, mar, ctrl_i)
284,17 → 284,20
-- actual data size --
case ctrl_i(ctrl_bus_size_msb_c downto ctrl_bus_size_lsb_c) is
when "00" => -- byte
rdata_o(31 downto 08) <= (others => ((not ctrl_i(ctrl_bus_unsigned_c)) and byte_in_v(7))); -- sign extension
rdata_o(07 downto 00) <= byte_in_v;
rdata_align(31 downto 08) <= (others => ((not ctrl_i(ctrl_bus_unsigned_c)) and byte_in_v(7))); -- sign extension
rdata_align(07 downto 00) <= byte_in_v;
when "01" => -- half-word
rdata_o(31 downto 16) <= (others => ((not ctrl_i(ctrl_bus_unsigned_c)) and hword_in_v(15))); -- sign extension
rdata_o(15 downto 00) <= hword_in_v; -- high half-word
rdata_align(31 downto 16) <= (others => ((not ctrl_i(ctrl_bus_unsigned_c)) and hword_in_v(15))); -- sign extension
rdata_align(15 downto 00) <= hword_in_v; -- high half-word
when others => -- word
rdata_o <= mdi; -- full word
rdata_align <= mdi; -- full word
end case;
end process read_align;
 
-- insert exclusive lock status for SC operations only --
rdata_o <= exclusive_lock_status when (CPU_EXTENSION_RISCV_A = true) and (ctrl_i(ctrl_bus_ch_lock_c) = '1') else rdata_align;
 
 
-- Data Access Arbiter --------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
data_access_arbiter: process(rstn_i, clk_i)
304,7 → 307,6
d_arbiter.rd_req <= '0';
d_arbiter.err_align <= '0';
d_arbiter.err_bus <= '0';
d_arbiter.timeout <= (others => '0');
elsif rising_edge(clk_i) then
-- data access request --
if (d_arbiter.wr_req = '0') and (d_arbiter.rd_req = '0') then -- idle
312,12 → 314,10
d_arbiter.rd_req <= ctrl_i(ctrl_bus_rd_c);
d_arbiter.err_align <= d_misaligned;
d_arbiter.err_bus <= '0';
d_arbiter.timeout <= std_ulogic_vector(to_unsigned(BUS_TIMEOUT, index_size_f(BUS_TIMEOUT)));
else -- in progress
d_arbiter.timeout <= std_ulogic_vector(unsigned(d_arbiter.timeout) - 1);
d_arbiter.err_align <= (d_arbiter.err_align or d_misaligned) and (not ctrl_i(ctrl_bus_derr_ack_c));
d_arbiter.err_bus <= (d_arbiter.err_bus or (not or_all_f(d_arbiter.timeout)) or d_bus_err_i or
(st_pmp_fault and d_arbiter.wr_req) or (ld_pmp_fault and d_arbiter.rd_req)) and (not ctrl_i(ctrl_bus_derr_ack_c));
d_arbiter.err_bus <= (d_arbiter.err_bus or d_bus_err_i or (st_pmp_fault and d_arbiter.wr_req) or (ld_pmp_fault and d_arbiter.rd_req)) and
(not ctrl_i(ctrl_bus_derr_ack_c));
if (d_bus_ack_i = '1') or (ctrl_i(ctrl_bus_derr_ack_c) = '1') then -- wait for normal termination / CPU abort
d_arbiter.wr_req <= '0';
d_arbiter.rd_req <= '0';
326,9 → 326,6
end if;
end process data_access_arbiter;
 
-- cancel bus access --
d_bus_cancel_o <= (d_arbiter.wr_req or d_arbiter.rd_req) and ctrl_i(ctrl_bus_derr_ack_c);
 
-- wait for bus transaction to finish --
d_wait_o <= (d_arbiter.wr_req or d_arbiter.rd_req) and (not d_bus_ack_i);
 
348,7 → 345,6
d_bus_re_o <= d_bus_re_buf when (PMP_NUM_REGIONS > pmp_num_regions_critical_c) else d_bus_re;
d_bus_fence_o <= ctrl_i(ctrl_bus_fence_c);
d_bus_rdata <= d_bus_rdata_i;
d_bus_excl_o <= ctrl_i(ctrl_bus_excl_c);
 
-- additional register stage for control signals if using PMP_NUM_REGIONS > pmp_num_regions_critical_c --
pmp_dbus_buffer: process(rstn_i, clk_i)
362,25 → 358,38
end if;
end process pmp_dbus_buffer;
 
-- Atomic memory access - status buffer --
atomic_access_status: process(rstn_i, clk_i)
 
-- Reservation Controller (LR/SC [A extension]) -------------------------------------------
-- -------------------------------------------------------------------------------------------
exclusive_access_controller: process(rstn_i, clk_i)
begin
if (rstn_i = '0') then
bus_excl_ok_o <= '0';
exclusive_lock <= '0';
elsif rising_edge(clk_i) then
if (CPU_EXTENSION_RISCV_A = true) then
if (d_bus_ack_i = '1') then
bus_excl_ok_o <= d_bus_excl_i; -- set if access was exclusive
elsif (d_arbiter.rd_req = '0') and (d_arbiter.wr_req = '0') then -- bus access done
bus_excl_ok_o <= '0';
if (ctrl_i(ctrl_trap_c) = '1') or (ctrl_i(ctrl_bus_de_lock_c) = '1') then -- remove lock if entering a trap or executing a non-load-reserved memory access
exclusive_lock <= '0';
elsif (ctrl_i(ctrl_bus_lock_c) = '1') then -- set new lock
exclusive_lock <= '1';
end if;
else
bus_excl_ok_o <= '0';
exclusive_lock <= '0';
end if;
end if;
end process atomic_access_status;
end process exclusive_access_controller;
 
-- lock status for SC operation --
exclusive_lock_status(data_width_c-1 downto 1) <= (others => '0');
exclusive_lock_status(0) <= not exclusive_lock;
 
-- output reservation status to control unit (to check if SC should write at all) --
excl_state_o <= exclusive_lock;
 
-- output to memory system --
i_bus_lock_o <= '0'; -- instruction fetches cannot be locked
d_bus_lock_o <= exclusive_lock;
 
 
-- Instruction Fetch Arbiter --------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
ifetch_arbiter: process(rstn_i, clk_i)
389,7 → 398,6
i_arbiter.rd_req <= '0';
i_arbiter.err_align <= '0';
i_arbiter.err_bus <= '0';
i_arbiter.timeout <= (others => '0');
elsif rising_edge(clk_i) then
-- instruction fetch request --
if (i_arbiter.rd_req = '0') then -- idle
396,11 → 404,9
i_arbiter.rd_req <= ctrl_i(ctrl_bus_if_c);
i_arbiter.err_align <= i_misaligned;
i_arbiter.err_bus <= '0';
i_arbiter.timeout <= std_ulogic_vector(to_unsigned(BUS_TIMEOUT, index_size_f(BUS_TIMEOUT)));
else -- in progress
i_arbiter.timeout <= std_ulogic_vector(unsigned(i_arbiter.timeout) - 1);
i_arbiter.err_align <= (i_arbiter.err_align or i_misaligned) and (not ctrl_i(ctrl_bus_ierr_ack_c));
i_arbiter.err_bus <= (i_arbiter.err_bus or (not or_all_f(i_arbiter.timeout)) or i_bus_err_i or if_pmp_fault) and (not ctrl_i(ctrl_bus_ierr_ack_c));
else -- in progress
i_arbiter.err_align <= (i_arbiter.err_align or i_misaligned) and (not ctrl_i(ctrl_bus_ierr_ack_c));
i_arbiter.err_bus <= (i_arbiter.err_bus or i_bus_err_i or if_pmp_fault) and (not ctrl_i(ctrl_bus_ierr_ack_c));
if (i_bus_ack_i = '1') or (ctrl_i(ctrl_bus_ierr_ack_c) = '1') then -- wait for normal termination / CPU abort
i_arbiter.rd_req <= '0';
end if;
410,9 → 416,6
 
i_arbiter.wr_req <= '0'; -- instruction fetch is read-only
 
-- cancel bus access --
i_bus_cancel_o <= i_arbiter.rd_req and ctrl_i(ctrl_bus_ierr_ack_c);
 
-- wait for bus transaction to finish --
i_wait_o <= i_arbiter.rd_req and (not i_bus_ack_i);
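
The reservation logic added in this revision (the `exclusive_access_controller` process and the `exclusive_lock_status` assignment) can be summarized with a small behavioral model. This is only a sketch for reasoning about the priority of the control signals, not the RTL itself: the class name, method names, and keyword arguments are hypothetical, and each call to `clock()` stands for one rising clock edge with reset released.

```python
class ReservationController:
    """Behavioral sketch of the LR/SC lock bit (hypothetical model, not the RTL)."""

    def __init__(self, a_extension=True):
        self.a_extension = a_extension  # mirrors CPU_EXTENSION_RISCV_A
        self.lock = False               # mirrors exclusive_lock after reset

    def clock(self, trap=False, de_lock=False, set_lock=False):
        """One rising clock edge; arguments mirror ctrl_trap_c, ctrl_bus_de_lock_c, ctrl_bus_lock_c."""
        if not self.a_extension:
            self.lock = False           # lock is tied low without the A extension
        elif trap or de_lock:
            self.lock = False           # removing the lock has priority over setting it
        elif set_lock:
            self.lock = True            # LR sets a new lock

    def sc_may_write(self):
        """SC only issues its bus write while the lock is still set (excl_state_o)."""
        return self.lock

    def sc_status(self):
        """Value in bit 0 of exclusive_lock_status: 0 = success, 1 = failure."""
        return 0 if self.lock else 1


rc = ReservationController()
rc.clock(set_lock=True)   # LR acquires the reservation
print(rc.sc_may_write(), rc.sc_status())
rc.clock(trap=True)       # entering a trap drops the reservation
print(rc.sc_may_write(), rc.sc_status())
```

In this model a trap or de-lock request wins over a simultaneous lock request, matching the `if`/`elsif` ordering in the process above.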
 
/rtl/core/neorv32_cpu_control.vhd
78,6 → 78,7
alu_wait_i : in std_ulogic; -- wait for ALU
bus_i_wait_i : in std_ulogic; -- wait for bus
bus_d_wait_i : in std_ulogic; -- wait for bus
excl_state_i : in std_ulogic; -- atomic/exclusive access lock status
-- data input --
instr_i : in std_ulogic_vector(data_width_c-1 downto 0); -- instruction
cmp_i : in std_ulogic_vector(1 downto 0); -- comparator status
124,12 → 125,14
constant hpm_cnt_lo_width_c : natural := natural(cond_sel_int_f(boolean(HPM_CNT_WIDTH < 32), HPM_CNT_WIDTH, 32));
constant hpm_cnt_hi_width_c : natural := natural(cond_sel_int_f(boolean(HPM_CNT_WIDTH > 32), HPM_CNT_WIDTH-32, 0));
 
-- instruction fetch enginge --
type fetch_engine_state_t is (IFETCH_RESET, IFETCH_REQUEST, IFETCH_ISSUE);
-- instruction fetch engine --
type fetch_engine_state_t is (IFETCH_REQUEST, IFETCH_ISSUE);
type fetch_engine_t is record
state : fetch_engine_state_t;
state_nxt : fetch_engine_state_t;
state_prev : fetch_engine_state_t;
restart : std_ulogic;
restart_nxt : std_ulogic;
pc : std_ulogic_vector(data_width_c-1 downto 0);
pc_nxt : std_ulogic_vector(data_width_c-1 downto 0);
reset : std_ulogic;
137,7 → 140,7
end record;
signal fetch_engine : fetch_engine_t;
 
-- instrucion prefetch buffer (IPB, real FIFO) --
-- instruction prefetch buffer (IPB, real FIFO) --
type ipb_data_fifo_t is array (0 to ipb_entries_c-1) of std_ulogic_vector(2+31 downto 0);
type ipb_t is record
wdata : std_ulogic_vector(2+31 downto 0); -- write status (bus_error, align_error) + 32-bit instruction data
164,7 → 167,7
signal ci_instr32 : std_ulogic_vector(31 downto 0);
signal ci_illegal : std_ulogic;
 
-- instruction issue enginge --
-- instruction issue engine --
type issue_engine_state_t is (ISSUE_ACTIVE, ISSUE_REALIGN);
type issue_engine_t is record
state : issue_engine_state_t;
198,7 → 201,7
 
-- instruction execution engine --
type execute_engine_state_t is (SYS_WAIT, DISPATCH, TRAP_ENTER, TRAP_EXIT, TRAP_EXECUTE, EXECUTE, ALU_WAIT, BRANCH,
FENCE_OP,LOADSTORE_0, LOADSTORE_1, LOADSTORE_2, ATOMIC_SC_EVAL, SYS_ENV, CSR_ACCESS);
FENCE_OP,LOADSTORE_0, LOADSTORE_1, LOADSTORE_2, SYS_ENV, CSR_ACCESS);
type execute_engine_t is record
state : execute_engine_state_t;
state_nxt : execute_engine_state_t;
213,7 → 216,7
is_cp_op : std_ulogic; -- current instruction is a co-processor operation
is_cp_op_nxt : std_ulogic;
--
branch_taken : std_ulogic; -- branch condition fullfilled
branch_taken : std_ulogic; -- branch condition fulfilled
pc : std_ulogic_vector(data_width_c-1 downto 0); -- actual PC, corresponding to current executed instruction
pc_mux_sel : std_ulogic; -- source select for PC update
pc_we : std_ulogic; -- PC update enabled
363,17 → 366,19
fetch_engine_fsm_sync: process(rstn_i, clk_i)
begin
if (rstn_i = '0') then
fetch_engine.state <= IFETCH_RESET;
fetch_engine.state_prev <= IFETCH_RESET;
fetch_engine.state <= IFETCH_REQUEST;
fetch_engine.state_prev <= IFETCH_REQUEST;
fetch_engine.restart <= '1';
fetch_engine.pc <= (others => def_rst_val_c);
elsif rising_edge(clk_i) then
if (fetch_engine.reset = '1') then
fetch_engine.state <= IFETCH_RESET;
fetch_engine.state <= fetch_engine.state_nxt;
fetch_engine.state_prev <= fetch_engine.state;
fetch_engine.restart <= fetch_engine.restart_nxt;
if (fetch_engine.restart = '1') then
fetch_engine.pc <= execute_engine.pc(data_width_c-1 downto 1) & '0'; -- initialize with "real" application PC
else
fetch_engine.state <= fetch_engine.state_nxt;
fetch_engine.pc <= fetch_engine.pc_nxt;
end if;
fetch_engine.state_prev <= fetch_engine.state;
fetch_engine.pc <= fetch_engine.pc_nxt;
end if;
end process fetch_engine_fsm_sync;
 
390,41 → 395,41
fetch_engine.state_nxt <= fetch_engine.state;
fetch_engine.pc_nxt <= fetch_engine.pc;
fetch_engine.bus_err_ack <= '0';
fetch_engine.restart_nxt <= fetch_engine.restart or fetch_engine.reset;
 
-- instruction prefetch buffer interface --
ipb.we <= '0';
ipb.wdata <= be_instr_i & ma_instr_i & instr_i(31 downto 0); -- store exception info and instruction word
ipb.clear <= '0';
ipb.clear <= fetch_engine.restart;
 
-- state machine --
case fetch_engine.state is
 
when IFETCH_RESET => -- reset engine and prefetch buffer, get application PC
when IFETCH_REQUEST => -- request new 32-bit-aligned instruction word
-- ------------------------------------------------------------
fetch_engine.bus_err_ack <= '1'; -- acknowledge any instruction bus errors, the execute engine has to take care of them / terminate current transfer
fetch_engine.pc_nxt <= execute_engine.pc(data_width_c-1 downto 1) & '0'; -- initialize with "real" application PC
ipb.clear <= '1'; -- clear prefetch buffer
fetch_engine.state_nxt <= IFETCH_REQUEST;
 
when IFETCH_REQUEST => -- output current PC to bus system and request 32-bit (aligned!) instruction data
-- ------------------------------------------------------------
if (ipb.free = '1') then -- free entry in buffer?
if (ipb.free = '1') and (fetch_engine.restart = '0') then -- free entry in buffer AND no reset request?
bus_fast_ir <= '1'; -- fast instruction fetch request
fetch_engine.state_nxt <= IFETCH_ISSUE;
end if;
if (fetch_engine.restart = '1') then -- reset request?
fetch_engine.restart_nxt <= '0';
end if;
 
when IFETCH_ISSUE => -- store instruction data to prefetch buffer
-- ------------------------------------------------------------
fetch_engine.bus_err_ack <= be_instr_i or ma_instr_i; -- ACK bus/alignment errors
if (bus_i_wait_i = '0') or (be_instr_i = '1') or (ma_instr_i = '1') then -- wait for bus response
fetch_engine.pc_nxt <= std_ulogic_vector(unsigned(fetch_engine.pc) + 4);
ipb.we <= '1';
fetch_engine.pc_nxt <= std_ulogic_vector(unsigned(fetch_engine.pc) + 4);
ipb.we <= not fetch_engine.restart; -- write to IPB if not being reset
if (fetch_engine.restart = '1') then -- reset request?
fetch_engine.restart_nxt <= '0';
end if;
fetch_engine.state_nxt <= IFETCH_REQUEST;
end if;
 
when others => -- undefined
-- ------------------------------------------------------------
fetch_engine.state_nxt <= IFETCH_RESET;
fetch_engine.state_nxt <= IFETCH_REQUEST;
 
end case;
end process fetch_engine_fsm_comb;
687,7 → 692,7
execute_engine.state <= SYS_WAIT;
execute_engine.sleep <= '0';
execute_engine.branched <= '1'; -- reset is a branch from "somewhere"
-- no dedicated RESEt required --
-- no dedicated RESET required --
execute_engine.state_prev <= SYS_WAIT;
execute_engine.i_reg <= (others => def_rst_val_c);
execute_engine.is_ci <= def_rst_val_c;
774,6 → 779,7
ctrl_o(ctrl_ir_funct3_2_c downto ctrl_ir_funct3_0_c) <= execute_engine.i_reg(instr_funct3_msb_c downto instr_funct3_lsb_c);
-- cpu status --
ctrl_o(ctrl_sleep_c) <= execute_engine.sleep; -- cpu is in sleep mode
ctrl_o(ctrl_trap_c) <= trap_ctrl.env_start_ack; -- cpu is starting a trap handler
end process ctrl_output;
 
 
871,7 → 877,7
-- Execute Engine FSM Comb ----------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
execute_engine_fsm_comb: process(execute_engine, decode_aux, fetch_engine, cmd_issue, trap_ctrl, csr, ctrl, csr_acc_valid,
alu_wait_i, bus_d_wait_i, ma_load_i, be_load_i, ma_store_i, be_store_i)
alu_wait_i, bus_d_wait_i, ma_load_i, be_load_i, ma_store_i, be_store_i, excl_state_i)
variable opcode_v : std_ulogic_vector(6 downto 0);
begin
-- arbiter defaults --
915,8 → 921,12
else -- branches
ctrl_nxt(ctrl_alu_unsigned_c) <= execute_engine.i_reg(instr_funct3_lsb_c+1); -- unsigned branches? (BLTU, BGEU)
end if;
-- bus interface --
ctrl_nxt(ctrl_bus_excl_c) <= ctrl(ctrl_bus_excl_c); -- keep exclusive bus access request alive if set
-- Atomic store-conditional instruction (evaluate lock status) --
if (CPU_EXTENSION_RISCV_A = true) then
ctrl_nxt(ctrl_bus_ch_lock_c) <= decode_aux.is_atomic_sc;
else
ctrl_nxt(ctrl_bus_ch_lock_c) <= '0';
end if;
 
 
-- state machine --
937,8 → 947,7
when DISPATCH => -- Get new command from instruction issue engine
-- ------------------------------------------------------------
-- housekeeping --
execute_engine.is_cp_op_nxt <= '0'; -- init
ctrl_nxt(ctrl_bus_excl_c) <= '0'; -- clear exclusive data bus access
execute_engine.is_cp_op_nxt <= '0'; -- no co-processor operation yet
-- PC update --
execute_engine.pc_mux_sel <= '0'; -- linear next PC
-- IR update --
1070,9 → 1079,9
 
when opcode_load_c | opcode_store_c | opcode_atomic_c => -- load/store / atomic memory access
-- ------------------------------------------------------------
ctrl_nxt(ctrl_alu_opa_mux_c) <= '0'; -- use RS1 as ALU.OPA
ctrl_nxt(ctrl_alu_opb_mux_c) <= '1'; -- use IMM as ALU.OPB
ctrl_nxt(ctrl_bus_mo_we_c) <= '1'; -- write to MAR and MDO (MDO only relevant for store)
ctrl_nxt(ctrl_alu_opa_mux_c)<= '0'; -- use RS1 as ALU.OPA
ctrl_nxt(ctrl_alu_opb_mux_c)<= '1'; -- use IMM as ALU.OPB
ctrl_nxt(ctrl_bus_mo_we_c) <= '1'; -- write to MAR and MDO (MDO only relevant for store)
--
if (CPU_EXTENSION_RISCV_A = false) or -- atomic extension disabled
(execute_engine.i_reg(instr_opcode_lsb_c+3 downto instr_opcode_lsb_c+2) = "00") then -- normal integer load/store
1221,11 → 1230,17
 
when LOADSTORE_0 => -- trigger memory request
-- ------------------------------------------------------------
ctrl_nxt(ctrl_bus_excl_c) <= decode_aux.is_atomic_lr; -- atomic.LR: exclusive memory access request
ctrl_nxt(ctrl_bus_lock_c) <= decode_aux.is_atomic_lr; -- atomic.LR: set lock
if (execute_engine.i_reg(instr_opcode_msb_c-1) = '0') or (decode_aux.is_atomic_lr = '1') then -- normal load or atomic load-reservate
ctrl_nxt(ctrl_bus_rd_c) <= '1'; -- read request
ctrl_nxt(ctrl_bus_rd_c) <= '1'; -- read request
else -- store
ctrl_nxt(ctrl_bus_wr_c) <= '1'; -- write request
if (CPU_EXTENSION_RISCV_A = true) and (decode_aux.is_atomic_sc = '1') then -- evaluate lock state
if (excl_state_i = '1') then -- lock is still ok - perform write access
ctrl_nxt(ctrl_bus_wr_c) <= '1'; -- write request
end if;
else
ctrl_nxt(ctrl_bus_wr_c) <= '1'; -- (normal) write request
end if;
end if;
execute_engine.state_nxt <= LOADSTORE_1;
 
1233,23 → 1248,25
when LOADSTORE_1 => -- memory latency
-- ------------------------------------------------------------
ctrl_nxt(ctrl_bus_mi_we_c) <= '1'; -- write input data to MDI (only relevant for LOAD)
if (CPU_EXTENSION_RISCV_A = true) and (decode_aux.is_atomic_sc = '1') then -- execute and evaluate atomic store-conditional
execute_engine.state_nxt <= ATOMIC_SC_EVAL;
else -- normal load/store
execute_engine.state_nxt <= LOADSTORE_2;
end if;
execute_engine.state_nxt <= LOADSTORE_2;
 
 
when LOADSTORE_2 => -- wait for bus transaction to finish
-- ------------------------------------------------------------
ctrl_nxt(ctrl_bus_mi_we_c) <= '1'; -- keep writing input data to MDI (only relevant for load operations)
ctrl_nxt(ctrl_bus_mi_we_c) <= '1'; -- keep writing input data to MDI (only relevant for load (and SC.W) operations)
ctrl_nxt(ctrl_rf_in_mux_c) <= '1'; -- RF input = memory input (only relevant for LOADs)
-- wait for memory response --
if ((ma_load_i or be_load_i or ma_store_i or be_store_i) = '1') then -- abort if exception
execute_engine.state_nxt <= DISPATCH;
elsif (bus_d_wait_i = '0') then -- wait for bus to finish transaction
-- data write-back
if (execute_engine.i_reg(instr_opcode_msb_c-1) = '0') or (decode_aux.is_atomic_lr = '1') then -- normal load OR atomic load
-- remove atomic lock if this is NOT the LR.W instruction used to SET the lock --
if (CPU_EXTENSION_RISCV_A = true) and (decode_aux.is_atomic_lr = '0') then -- any memory access other than LR.W clears the lock
ctrl_nxt(ctrl_bus_de_lock_c) <= '1';
end if;
-- data write-back --
if (execute_engine.i_reg(instr_opcode_msb_c-1) = '0') or -- normal load
(decode_aux.is_atomic_lr = '1') or -- atomic load-reservate
(decode_aux.is_atomic_sc = '1') then -- atomic store-conditional
ctrl_nxt(ctrl_rf_wb_en_c) <= '1';
end if;
execute_engine.state_nxt <= DISPATCH;
1256,27 → 1273,6
end if;
 
 
when ATOMIC_SC_EVAL => -- wait for bus transaction to finish and evaluate if SC was successful
-- ------------------------------------------------------------
if (CPU_EXTENSION_RISCV_A = true) then
-- atomic.SC: result comes from "atomic co-processor" --
ctrl_nxt(ctrl_cp_id_msb_c downto ctrl_cp_id_lsb_c) <= cp_sel_atomic_c;
execute_engine.is_cp_op_nxt <= '1'; -- this is a CP operation
ctrl_nxt(ctrl_rf_in_mux_c) <= '0'; -- RF input = ALU.res
ctrl_nxt(ctrl_rf_wb_en_c) <= '1'; -- allow reg file write back
-- wait for memory response --
if ((ma_load_i or be_load_i or ma_store_i or be_store_i) = '1') then -- abort if exception
ctrl_nxt(ctrl_alu_func1_c downto ctrl_alu_func0_c) <= alu_func_cmd_copro_c; -- trigger atomic-coprocessor operation for SC status evaluation
execute_engine.state_nxt <= ALU_WAIT;
elsif (bus_d_wait_i = '0') then -- wait for bus to finish transaction
ctrl_nxt(ctrl_alu_func1_c downto ctrl_alu_func0_c) <= alu_func_cmd_copro_c; -- trigger atomic-coprocessor operation for SC status evaluation
execute_engine.state_nxt <= ALU_WAIT;
end if;
else
execute_engine.state_nxt <= SYS_WAIT;
end if;
 
 
when others => -- undefined
-- ------------------------------------------------------------
execute_engine.state_nxt <= SYS_WAIT;
/rtl/core/neorv32_icache.vhd
60,7 → 60,6
host_ben_i : in std_ulogic_vector(03 downto 0); -- byte enable
host_we_i : in std_ulogic; -- write enable
host_re_i : in std_ulogic; -- read enable
host_cancel_i : in std_ulogic; -- cancel current bus transaction
host_ack_o : out std_ulogic; -- bus transfer acknowledge
host_err_o : out std_ulogic; -- bus transfer error
-- peripheral bus interface --
70,7 → 69,6
bus_ben_o : out std_ulogic_vector(03 downto 0); -- byte enable
bus_we_o : out std_ulogic; -- write enable
bus_re_o : out std_ulogic; -- read enable
bus_cancel_o : out std_ulogic; -- cancel current bus transaction
bus_ack_i : in std_ulogic; -- bus transfer acknowledge
bus_err_i : in std_ulogic -- bus transfer error
);
132,17 → 130,15
 
-- control engine --
type ctrl_engine_state_t is (S_IDLE, S_CACHE_CLEAR, S_CACHE_CHECK, S_CACHE_MISS, S_BUS_DOWNLOAD_REQ, S_BUS_DOWNLOAD_GET,
S_CACHE_RESYNC_0, S_CACHE_RESYNC_1, S_BUS_ERROR, S_ERROR, S_HOST_CANCEL);
S_CACHE_RESYNC_0, S_CACHE_RESYNC_1, S_BUS_ERROR);
type ctrl_t is record
state : ctrl_engine_state_t; -- current state
state_nxt : ctrl_engine_state_t; -- next state
addr_reg : std_ulogic_vector(31 downto 0); -- address register for block download
addr_reg_nxt : std_ulogic_vector(31 downto 0);
state : ctrl_engine_state_t; -- current state
state_nxt : ctrl_engine_state_t; -- next state
addr_reg : std_ulogic_vector(31 downto 0); -- address register for block download
addr_reg_nxt : std_ulogic_vector(31 downto 0);
--
re_buf : std_ulogic; -- read request buffer
re_buf_nxt : std_ulogic;
cancel_buf : std_ulogic; -- cancel request buffer
cancel_buf_nxt : std_ulogic;
re_buf : std_ulogic; -- read request buffer
re_buf_nxt : std_ulogic;
end record;
signal ctrl : ctrl_t;
 
165,13 → 161,11
ctrl_engine_fsm_sync_rst: process(rstn_i, clk_i)
begin
if (rstn_i = '0') then
ctrl.state <= S_CACHE_CLEAR;
ctrl.re_buf <= '0';
ctrl.cancel_buf <= '0';
ctrl.state <= S_CACHE_CLEAR;
ctrl.re_buf <= '0';
elsif rising_edge(clk_i) then
ctrl.state <= ctrl.state_nxt;
ctrl.re_buf <= ctrl.re_buf_nxt;
ctrl.cancel_buf <= ctrl.cancel_buf_nxt;
ctrl.state <= ctrl.state_nxt;
ctrl.re_buf <= ctrl.re_buf_nxt;
end if;
end process ctrl_engine_fsm_sync_rst;
 
186,13 → 180,12
 
-- Control Engine FSM Comb ----------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
ctrl_engine_fsm_comb: process(ctrl, cache, clear_i, host_addr_i, host_re_i, host_cancel_i, bus_rdata_i, bus_ack_i, bus_err_i)
ctrl_engine_fsm_comb: process(ctrl, cache, clear_i, host_addr_i, host_re_i, bus_rdata_i, bus_ack_i, bus_err_i)
begin
-- control defaults --
ctrl.state_nxt <= ctrl.state;
ctrl.addr_reg_nxt <= ctrl.addr_reg;
ctrl.re_buf_nxt <= (ctrl.re_buf or host_re_i) and (not host_cancel_i);
ctrl.cancel_buf_nxt <= ctrl.cancel_buf or host_cancel_i;
ctrl.re_buf_nxt <= ctrl.re_buf or host_re_i;
 
-- cache defaults --
cache.clear <= '0';
216,7 → 209,6
bus_ben_o <= (others => '0'); -- cache is read-only
bus_we_o <= '0'; -- cache is read-only
bus_re_o <= '0';
bus_cancel_o <= '0';
 
-- fsm --
case ctrl.state is
226,9 → 218,8
if (clear_i = '1') then -- cache control operation?
ctrl.state_nxt <= S_CACHE_CLEAR;
elsif (host_re_i = '1') or (ctrl.re_buf = '1') then -- cache access
ctrl.re_buf_nxt <= '0';
ctrl.cancel_buf_nxt <= '0';
ctrl.state_nxt <= S_CACHE_CHECK;
ctrl.re_buf_nxt <= '0';
ctrl.state_nxt <= S_CACHE_CHECK;
end if;
 
when S_CACHE_CLEAR => -- invalidate all cache entries
239,7 → 230,7
when S_CACHE_CHECK => -- finalize host access if cache hit
-- ------------------------------------------------------------
if (cache.hit = '1') then -- cache HIT
host_ack_o <= not ctrl.cancel_buf; -- ACK if request has not been canceled
host_ack_o <= '1';
ctrl.state_nxt <= S_IDLE;
else -- cache MISS
ctrl.state_nxt <= S_CACHE_MISS;
252,14 → 243,11
ctrl.addr_reg_nxt((2+cache_offset_size_c)-1 downto 2) <= (others => '0'); -- block-aligned
ctrl.addr_reg_nxt(1 downto 0) <= "00"; -- word-aligned
--
if (host_cancel_i = '1') or (ctrl.cancel_buf = '1') then -- 'early' CPU cancel (abort before bus transaction has even started)
ctrl.state_nxt <= S_IDLE;
else
ctrl.state_nxt <= S_BUS_DOWNLOAD_REQ;
end if;
ctrl.state_nxt <= S_BUS_DOWNLOAD_REQ;
 
when S_BUS_DOWNLOAD_REQ => -- download new cache block: request new word
-- ------------------------------------------------------------
cache.ctrl_en <= '1'; -- we are in cache control mode
bus_re_o <= '1'; -- request new read transfer
ctrl.state_nxt <= S_BUS_DOWNLOAD_GET;
 
269,8 → 257,6
--
if (bus_err_i = '1') then -- bus error
ctrl.state_nxt <= S_BUS_ERROR;
elsif (ctrl.cancel_buf = '1') then -- 'late' CPU cancel (timeout?)
ctrl.state_nxt <= S_HOST_CANCEL;
elsif (bus_ack_i = '1') then -- ACK = write to cache and get next word
cache.ctrl_we <= '1'; -- write to cache
if (and_all_f(ctrl.addr_reg((2+cache_offset_size_c)-1 downto 2)) = '1') then -- block complete?
289,28 → 275,14
 
when S_CACHE_RESYNC_1 => -- re-sync host/cache access: finalize CPU request
-- ------------------------------------------------------------
host_ack_o <= not ctrl.cancel_buf; -- ACK if request has not been canceled
host_ack_o <= '1';
ctrl.state_nxt <= S_IDLE;
 
when S_BUS_ERROR => -- bus error during download
-- ------------------------------------------------------------
host_err_o <= '1';
ctrl.state_nxt <= S_ERROR;
ctrl.state_nxt <= S_IDLE;
 
when S_ERROR => -- wait for CPU to cancel faulting transfer
-- ------------------------------------------------------------
if (host_cancel_i = '1') then
bus_cancel_o <= '1';
ctrl.state_nxt <= S_IDLE;
end if;
 
when S_HOST_CANCEL => -- host cancels transfer
-- ------------------------------------------------------------
cache.ctrl_en <= '1'; -- we are in cache control mode
cache.ctrl_invalid_we <= '1'; -- invalidate current cache block
bus_cancel_o <= '1';
ctrl.state_nxt <= S_IDLE;
 
when others => -- undefined
-- ------------------------------------------------------------
ctrl.state_nxt <= S_IDLE;
503,7 → 475,7
history.re_ff <= host_re_i;
if (invalidate_i = '1') then -- invalidate whole cache
history.last_used_set <= (others => '1');
elsif (history.re_ff = '1') and (or_all_f(hit) = '1') then -- store last accessed set that caused a hit
elsif (history.re_ff = '1') and (or_all_f(hit) = '1') and (ctrl_en_i = '0') then -- store last accessed set that caused a hit
history.last_used_set(to_integer(unsigned(cache_index))) <= not hit(0);
end if;
history.to_be_replaced <= history.last_used_set(to_integer(unsigned(cache_index)));
/rtl/core/neorv32_mtime.vhd
2,7 → 2,7
-- # << NEORV32 - Machine System Timer (MTIME) >> #
-- # ********************************************************************************************* #
-- # Compatible to RISC-V spec's 64-bit MACHINE system timer including "mtime[h]" & "mtimecmp[h]". #
-- # Note: The 64-bit counter and compare system is broken and de-coupled into two 32-bit systems. #
-- # Note: The 64-bit counter and compare systems are de-coupled into two 32-bit systems. #
-- # ********************************************************************************************* #
-- # BSD 3-Clause License #
-- # #
71,6 → 71,11
signal addr : std_ulogic_vector(31 downto 0); -- access address
signal wren : std_ulogic; -- module access enable
 
-- time write access buffer --
signal wdata_buf : std_ulogic_vector(31 downto 0);
signal mtime_lo_we : std_ulogic;
signal mtime_hi_we : std_ulogic;
 
-- accessible regs --
signal mtimecmp_lo : std_ulogic_vector(31 downto 0);
signal mtimecmp_hi : std_ulogic_vector(31 downto 0);
77,6 → 82,7
signal mtime_lo : std_ulogic_vector(32 downto 0);
signal mtime_lo_msb_ff : std_ulogic;
signal mtime_hi : std_ulogic_vector(31 downto 0);
signal inc_hi : std_ulogic_vector(31 downto 0);
 
-- irq control --
signal cmp_lo : std_ulogic;
98,20 → 104,25
wr_access: process(clk_i)
begin
if rising_edge(clk_i) then
-- mtimecmp low --
if (wren = '1') and (addr = mtime_cmp_lo_addr_c) then
mtimecmp_lo <= data_i;
-- mtimecmp --
if (wren = '1') then
if (addr = mtime_cmp_lo_addr_c) then
mtimecmp_lo <= data_i;
end if;
if (addr = mtime_cmp_hi_addr_c) then
mtimecmp_hi <= data_i;
end if;
end if;
 
-- mtimecmp high --
if (wren = '1') and (addr = mtime_cmp_hi_addr_c) then
mtimecmp_hi <= data_i;
end if;
-- mtime access buffer --
wdata_buf <= data_i;
mtime_lo_we <= wren and bool_to_ulogic_f(boolean(addr = mtime_time_lo_addr_c));
mtime_hi_we <= wren and bool_to_ulogic_f(boolean(addr = mtime_time_hi_addr_c));
 
-- mtime low --
if (wren = '1') and (addr = mtime_time_lo_addr_c) then
if (mtime_lo_we = '1') then -- write access
mtime_lo_msb_ff <= '0';
mtime_lo <= '0' & data_i;
mtime_lo <= '0' & wdata_buf;
else -- auto increment
mtime_lo_msb_ff <= mtime_lo(mtime_lo'left);
mtime_lo <= std_ulogic_vector(unsigned(mtime_lo) + 1);
118,15 → 129,19
end if;
 
-- mtime high --
if (wren = '1') and (addr = mtime_time_hi_addr_c) then
mtime_hi <= data_i;
elsif ((mtime_lo_msb_ff xor mtime_lo(mtime_lo'left)) = '1') then -- auto increment: mtime_lo carry?
mtime_hi <= std_ulogic_vector(unsigned(mtime_hi) + 1);
if (mtime_hi_we = '1') then -- write access
mtime_hi <= wdata_buf;
else -- auto increment (if mtime.low overflows)
mtime_hi <= std_ulogic_vector(unsigned(mtime_hi) + unsigned(inc_hi));
end if;
end if;
end process wr_access;
 
-- mtime.time_HI increment (0 or 1) --
inc_hi(0) <= mtime_lo_msb_ff xor mtime_lo(mtime_lo'left);
inc_hi(31 downto 1) <= (others => '0');
 
 
-- Read Access ----------------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
rd_access: process(clk_i)
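
The carry scheme in the `wr_access` hunk above (a 33-bit `mtime_lo`, the delayed MSB in `mtime_lo_msb_ff`, and the one-bit `inc_hi` increment) can be checked with a short behavioral model. This is a sketch under stated assumptions: it covers only the auto-increment path (the buffered write path is omitted), it assumes `mtime_lo_msb_ff` starts at 0, and the class and method names are hypothetical.

```python
MASK32 = (1 << 32) - 1
MASK33 = (1 << 33) - 1


class MtimeModel:
    """Behavioral sketch of the split 64-bit timer (auto-increment path only)."""

    def __init__(self, lo=0, hi=0):
        self.lo = lo & MASK32     # 33-bit register; bit 32 (the carry bit) starts at 0
        self.lo_msb_ff = 0        # assumed initial value of mtime_lo_msb_ff
        self.hi = hi & MASK32

    def tick(self):
        """One rising clock edge; all next values are computed from the current state."""
        msb = self.lo >> 32
        inc_hi = self.lo_msb_ff ^ msb              # 1 for one cycle after bit 32 toggles
        next_hi = (self.hi + inc_hi) & MASK32
        next_lo = (self.lo + 1) & MASK33
        self.lo, self.lo_msb_ff, self.hi = next_lo, msb, next_hi

    def time(self):
        """64-bit value as software would assemble it from the high and low words."""
        return (self.hi << 32) | (self.lo & MASK32)


m = MtimeModel(lo=0xFFFFFFFF)
for _ in range(3):
    m.tick()
print(hex(m.time()))
```

In this model the high word increments one clock after the low word wraps, because the toggle of bit 32 is detected by comparing it with its registered copy.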
/rtl/core/neorv32_package.vhd
45,12 → 45,11
constant dspace_base_c : std_ulogic_vector(31 downto 0) := x"80000000"; -- default data memory address space base address
 
-- (external) bus interface --
constant bus_timeout_c : natural := 127; -- cycles after which an *unacknowledged* bus access will timeout and trigger a bus fault exception (min 2)
constant wb_pipe_mode_c : boolean := false; -- *external* bus protocol: false=classic/standard wishbone mode (default), true=pipelined wishbone mode
constant xbus_big_endian_c : boolean := true; -- external memory access byte order: true=big endian (default); false=little endian
 
-- CPU core --
constant ipb_entries_c : natural := 2; -- entries in CPU instruction prefetch buffer, has to be a power of 2, default=2
constant ipb_entries_c : natural := 4; -- entries in CPU instruction prefetch buffer, has to be a power of 2, default=2
constant cp_timeout_en_c : boolean := false; -- auto-terminate pending co-processor operations after 256 cycles (for debugging only), default = false
constant dedicated_reset_c : boolean := false; -- use dedicated hardware reset value for UNCRITICAL registers (FALSE=reset value is irrelevant (might simplify HW), default; TRUE=defined LOW reset value)
 
59,6 → 58,9
-- increasing instruction fetch & data access latency by +1 cycle but also reducing critical path length
constant pmp_num_regions_critical_c : natural := 8; -- default=8
 
-- "response time window" for processor-internal memories and IO devices
constant max_proc_int_response_time_c : natural := 15; -- cycles after which an *unacknowledged* internal bus access will timeout and trigger a bus fault exception (min 2)
 
-- Helper Functions -----------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
function index_size_f(input : natural) return natural;
81,7 → 83,7
-- Architecture Constants (do not modify!) ------------------------------------------------
-- -------------------------------------------------------------------------------------------
constant data_width_c : natural := 32; -- native data path width - do not change!
constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01050400"; -- no touchy!
constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01050408"; -- no touchy!
constant archid_c : natural := 19; -- official NEORV32 architecture ID - hands off!
constant rf_r0_is_reg_c : boolean := true; -- x0 is a *physical register* that has to be initialized to zero by the CPU
constant def_rst_val_c : std_ulogic := cond_sel_stdulogic_f(dedicated_reset_c, '0', '-');
263,14 → 265,13
constant ctrl_bus_derr_ack_c : natural := 38; -- acknowledge data access bus exceptions
constant ctrl_bus_fence_c : natural := 39; -- executed fence operation
constant ctrl_bus_fencei_c : natural := 40; -- executed fencei operation
constant ctrl_bus_excl_c : natural := 41; -- exclusive bus access
constant ctrl_bus_lock_c : natural := 41; -- make atomic/exclusive access lock
constant ctrl_bus_de_lock_c : natural := 42; -- remove atomic/exclusive access lock
constant ctrl_bus_ch_lock_c : natural := 43; -- evaluate atomic/exclusive lock (SC operation)
-- co-processors --
constant ctrl_cp_id_lsb_c : natural := 42; -- cp select ID lsb
constant ctrl_cp_id_hsb_c : natural := 43; -- cp select ID
constant ctrl_cp_id_msb_c : natural := 44; -- cp select ID msb
-- current privilege level --
constant ctrl_priv_lvl_lsb_c : natural := 45; -- privilege level lsb
constant ctrl_priv_lvl_msb_c : natural := 46; -- privilege level msb
constant ctrl_cp_id_lsb_c : natural := 44; -- cp select ID lsb
constant ctrl_cp_id_hsb_c : natural := 45; -- cp select ID
constant ctrl_cp_id_msb_c : natural := 46; -- cp select ID msb
-- instruction's control blocks (used by cpu co-processors) --
constant ctrl_ir_funct3_0_c : natural := 47; -- funct3 bit 0
constant ctrl_ir_funct3_1_c : natural := 48; -- funct3 bit 1
295,9 → 296,12
constant ctrl_ir_opcode7_5_c : natural := 67; -- opcode7 bit 5
constant ctrl_ir_opcode7_6_c : natural := 68; -- opcode7 bit 6
-- CPU status --
constant ctrl_sleep_c : natural := 69; -- set when CPU is in sleep mode
constant ctrl_priv_lvl_lsb_c : natural := 69; -- privilege level lsb
constant ctrl_priv_lvl_msb_c : natural := 70; -- privilege level msb
constant ctrl_sleep_c : natural := 71; -- set when CPU is in sleep mode
constant ctrl_trap_c : natural := 72; -- set when CPU is entering trap execution
-- control bus size --
constant ctrl_width_c : natural := 70; -- control bus size
constant ctrl_width_c : natural := 73; -- control bus size
 
-- Comparator Bus -------------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
723,12 → 727,12
 
-- Co-Processor IDs -----------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
constant cp_sel_muldiv_c : std_ulogic_vector(2 downto 0) := "000"; -- multiplication/division operations ('M' extension)
constant cp_sel_atomic_c : std_ulogic_vector(2 downto 0) := "001"; -- atomic operations; success/failure evaluation ('A' extension)
constant cp_sel_csr_rd_c : std_ulogic_vector(2 downto 0) := "000"; -- CSR read access ('Zicsr' extension)
constant cp_sel_muldiv_c : std_ulogic_vector(2 downto 0) := "001"; -- multiplication/division operations ('M' extension)
constant cp_sel_bitmanip_c : std_ulogic_vector(2 downto 0) := "010"; -- bit manipulation ('B' extension)
constant cp_sel_csr_rd_c : std_ulogic_vector(2 downto 0) := "011"; -- CSR read access ('Zicsr' extension)
constant cp_sel_fpu_c : std_ulogic_vector(2 downto 0) := "100"; -- loating-point unit ('Zfinx' extension)
--constant cp_sel_crypto_c : std_ulogic_vector(2 downto 0) := "101"; -- crypto operations ('K' extension)
constant cp_sel_fpu_c : std_ulogic_vector(2 downto 0) := "011"; -- floating-point unit ('Zfinx' extension)
--constant cp_sel_reserved_c : std_ulogic_vector(2 downto 0) := "100"; -- reserved
--constant cp_sel_reserved_c : std_ulogic_vector(2 downto 0) := "101"; -- reserved
--constant cp_sel_reserved_c : std_ulogic_vector(2 downto 0) := "110"; -- reserved
--constant cp_sel_reserved_c : std_ulogic_vector(2 downto 0) := "111"; -- reserved
 
873,7 → 877,7
CPU_EXTENSION_RISCV_E : boolean := false; -- implement embedded RF extension?
CPU_EXTENSION_RISCV_M : boolean := false; -- implement mul/div extension?
CPU_EXTENSION_RISCV_U : boolean := false; -- implement user mode extension?
CPU_EXTENSION_RISCV_Zfinx : boolean := false; -- implement 32-bit floating-point extension (using INT reg!)
CPU_EXTENSION_RISCV_Zfinx : boolean := false; -- implement 32-bit floating-point extension (using INT regs!)
CPU_EXTENSION_RISCV_Zicsr : boolean := true; -- implement CSR system?
CPU_EXTENSION_RISCV_Zifencei : boolean := false; -- implement instruction stream sync.?
-- Extension Options --
901,6 → 905,7
ICACHE_ASSOCIATIVITY : natural := 1; -- i-cache: associativity / number of sets (1=direct_mapped), has to be a power of 2
-- External memory interface --
MEM_EXT_EN : boolean := false; -- implement external memory bus interface?
MEM_EXT_TIMEOUT : natural := 255; -- cycles after a pending bus access auto-terminates (0 = disabled)
-- Processor peripherals --
IO_GPIO_EN : boolean := true; -- implement general purpose input/output port unit (GPIO)?
IO_MTIME_EN : boolean := true; -- implement machine system timer (MTIME)?
923,7 → 928,7
clk_i : in std_ulogic := '0'; -- global clock, rising edge
rstn_i : in std_ulogic := '0'; -- global reset, low-active, async
-- Wishbone bus interface (available if MEM_EXT_EN = true) --
wb_tag_o : out std_ulogic_vector(03 downto 0); -- request tag
wb_tag_o : out std_ulogic_vector(02 downto 0); -- request tag
wb_adr_o : out std_ulogic_vector(31 downto 0); -- address
wb_dat_i : in std_ulogic_vector(31 downto 0) := (others => '0'); -- read data
wb_dat_o : out std_ulogic_vector(31 downto 0); -- write data
931,7 → 936,7
wb_sel_o : out std_ulogic_vector(03 downto 0); -- byte enable
wb_stb_o : out std_ulogic; -- strobe
wb_cyc_o : out std_ulogic; -- valid cycle
wb_tag_i : in std_ulogic; -- response tag
wb_lock_o : out std_ulogic; -- exclusive access request
wb_ack_i : in std_ulogic := '0'; -- transfer acknowledge
wb_err_i : in std_ulogic := '0'; -- transfer error
-- Advanced memory control signals (available if MEM_EXT_EN = true) --
984,7 → 989,6
-- General --
HW_THREAD_ID : natural := 0; -- hardware thread id (32-bit)
CPU_BOOT_ADDR : std_ulogic_vector(31 downto 0) := x"00000000"; -- cpu boot address
BUS_TIMEOUT : natural := 63; -- cycles after an UNACKNOWLEDGED bus access triggers a bus fault exception
-- RISC-V CPU Extensions --
CPU_EXTENSION_RISCV_A : boolean := false; -- implement atomic extension?
CPU_EXTENSION_RISCV_B : boolean := false; -- implement bit manipulation extensions?
1019,7 → 1023,7
i_bus_ben_o : out std_ulogic_vector(03 downto 0); -- byte enable
i_bus_we_o : out std_ulogic; -- write enable
i_bus_re_o : out std_ulogic; -- read enable
i_bus_cancel_o : out std_ulogic := '0'; -- cancel current bus transaction
i_bus_lock_o : out std_ulogic; -- exclusive access request
i_bus_ack_i : in std_ulogic := '0'; -- bus transfer acknowledge
i_bus_err_i : in std_ulogic := '0'; -- bus transfer error
i_bus_fence_o : out std_ulogic; -- executed FENCEI operation
1031,13 → 1035,11
d_bus_ben_o : out std_ulogic_vector(03 downto 0); -- byte enable
d_bus_we_o : out std_ulogic; -- write enable
d_bus_re_o : out std_ulogic; -- read enable
d_bus_cancel_o : out std_ulogic; -- cancel current bus transaction
d_bus_lock_o : out std_ulogic; -- exclusive access request
d_bus_ack_i : in std_ulogic := '0'; -- bus transfer acknowledge
d_bus_err_i : in std_ulogic := '0'; -- bus transfer error
d_bus_fence_o : out std_ulogic; -- executed FENCE operation
d_bus_priv_o : out std_ulogic_vector(1 downto 0); -- privilege level
d_bus_excl_o : out std_ulogic; -- exclusive access
d_bus_excl_i : in std_ulogic; -- state of exclusiv access (set if success)
-- system time input from MTIME --
time_i : in std_ulogic_vector(63 downto 0) := (others => '0'); -- current system time
-- interrupts (risc-v compliant) --
1085,6 → 1087,7
alu_wait_i : in std_ulogic; -- wait for ALU
bus_i_wait_i : in std_ulogic; -- wait for bus
bus_d_wait_i : in std_ulogic; -- wait for bus
excl_state_i : in std_ulogic; -- atomic/exclusive access lock status
-- data input --
instr_i : in std_ulogic_vector(data_width_c-1 downto 0); -- instruction
cmp_i : in std_ulogic_vector(1 downto 0); -- comparator status
1236,13 → 1239,11
-- -------------------------------------------------------------------------------------------
component neorv32_cpu_bus
generic (
CPU_EXTENSION_RISCV_A : boolean := false; -- implement atomic extension?
CPU_EXTENSION_RISCV_C : boolean := true; -- implement compressed extension?
CPU_EXTENSION_RISCV_A : boolean := false; -- implement atomic extension?
CPU_EXTENSION_RISCV_C : boolean := true; -- implement compressed extension?
-- Physical memory protection (PMP) --
PMP_NUM_REGIONS : natural := 0; -- number of regions (0..64)
PMP_MIN_GRANULARITY : natural := 64*1024; -- minimal region granularity in bytes, has to be a power of 2, min 8 bytes
-- Bus Timeout --
BUS_TIMEOUT : natural := 63 -- cycles after an UNACKNOWLEDGED bus access triggers a bus fault exception
PMP_NUM_REGIONS : natural := 0; -- number of regions (0..64)
PMP_MIN_GRANULARITY : natural := 64*1024 -- minimal region granularity in bytes, has to be a power of 2, min 8 bytes
);
port (
-- global control --
1263,7 → 1264,7
mar_o : out std_ulogic_vector(data_width_c-1 downto 0); -- current memory address register
d_wait_o : out std_ulogic; -- wait for access to complete
--
bus_excl_ok_o : out std_ulogic; -- bus exclusive access successful
excl_state_o : out std_ulogic; -- atomic/exclusive access status
ma_load_o : out std_ulogic; -- misaligned load data address
ma_store_o : out std_ulogic; -- misaligned store data address
be_load_o : out std_ulogic; -- bus error on load data access
1278,7 → 1279,7
i_bus_ben_o : out std_ulogic_vector(03 downto 0); -- byte enable
i_bus_we_o : out std_ulogic; -- write enable
i_bus_re_o : out std_ulogic; -- read enable
i_bus_cancel_o : out std_ulogic; -- cancel current bus transaction
i_bus_lock_o : out std_ulogic; -- exclusive access request
i_bus_ack_i : in std_ulogic; -- bus transfer acknowledge
i_bus_err_i : in std_ulogic; -- bus transfer error
i_bus_fence_o : out std_ulogic; -- fence operation
1289,15 → 1290,37
d_bus_ben_o : out std_ulogic_vector(03 downto 0); -- byte enable
d_bus_we_o : out std_ulogic; -- write enable
d_bus_re_o : out std_ulogic; -- read enable
d_bus_cancel_o : out std_ulogic; -- cancel current bus transaction
d_bus_lock_o : out std_ulogic; -- exclusive access request
d_bus_ack_i : in std_ulogic; -- bus transfer acknowledge
d_bus_err_i : in std_ulogic; -- bus transfer error
d_bus_fence_o : out std_ulogic; -- fence operation
d_bus_excl_o : out std_ulogic; -- exclusive access request
d_bus_excl_i : in std_ulogic -- state of exclusiv access (set if success)
d_bus_fence_o : out std_ulogic -- fence operation
);
end component;
 
-- Component: Bus Keeper ------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
component neorv32_bus_keeper is
generic (
-- Internal instruction memory --
MEM_INT_IMEM_EN : boolean := true; -- implement processor-internal instruction memory
MEM_INT_IMEM_SIZE : natural := 8*1024; -- size of processor-internal instruction memory in bytes
-- Internal data memory --
MEM_INT_DMEM_EN : boolean := true; -- implement processor-internal data memory
MEM_INT_DMEM_SIZE : natural := 8*1024 -- size of processor-internal data memory in bytes
);
port (
-- host access --
clk_i : in std_ulogic; -- global clock line
rstn_i : in std_ulogic; -- global reset line, low-active
addr_i : in std_ulogic_vector(31 downto 0); -- address
rden_i : in std_ulogic; -- read enable
wren_i : in std_ulogic; -- write enable
ack_i : in std_ulogic; -- transfer acknowledge from bus system
err_i : in std_ulogic; -- transfer error from bus system
err_o : out std_ulogic -- bus error
);
end component;
 
-- Component: CPU Instruction Cache -------------------------------------------------------
-- -------------------------------------------------------------------------------------------
component neorv32_icache
1318,7 → 1341,6
host_ben_i : in std_ulogic_vector(03 downto 0); -- byte enable
host_we_i : in std_ulogic; -- write enable
host_re_i : in std_ulogic; -- read enable
host_cancel_i : in std_ulogic; -- cancel current bus transaction
host_ack_o : out std_ulogic; -- bus transfer acknowledge
host_err_o : out std_ulogic; -- bus transfer error
-- peripheral bus interface --
1328,7 → 1350,6
bus_ben_o : out std_ulogic_vector(03 downto 0); -- byte enable
bus_we_o : out std_ulogic; -- write enable
bus_re_o : out std_ulogic; -- read enable
bus_cancel_o : out std_ulogic; -- cancel current bus transaction
bus_ack_i : in std_ulogic; -- bus transfer acknowledge
bus_err_i : in std_ulogic -- bus transfer error
);
1352,8 → 1373,7
ca_bus_ben_i : in std_ulogic_vector(03 downto 0); -- byte enable
ca_bus_we_i : in std_ulogic; -- write enable
ca_bus_re_i : in std_ulogic; -- read enable
ca_bus_cancel_i : in std_ulogic; -- cancel current bus transaction
ca_bus_excl_i : in std_ulogic; -- exclusive access
ca_bus_lock_i : in std_ulogic; -- exclusive access request
ca_bus_ack_o : out std_ulogic; -- bus transfer acknowledge
ca_bus_err_o : out std_ulogic; -- bus transfer error
-- controller interface b --
1363,8 → 1383,7
cb_bus_ben_i : in std_ulogic_vector(03 downto 0); -- byte enable
cb_bus_we_i : in std_ulogic; -- write enable
cb_bus_re_i : in std_ulogic; -- read enable
cb_bus_cancel_i : in std_ulogic; -- cancel current bus transaction
cb_bus_excl_i : in std_ulogic; -- exclusive access
cb_bus_lock_i : in std_ulogic; -- exclusive access request
cb_bus_ack_o : out std_ulogic; -- bus transfer acknowledge
cb_bus_err_o : out std_ulogic; -- bus transfer error
-- peripheral bus --
1375,8 → 1394,7
p_bus_ben_o : out std_ulogic_vector(03 downto 0); -- byte enable
p_bus_we_o : out std_ulogic; -- write enable
p_bus_re_o : out std_ulogic; -- read enable
p_bus_cancel_o : out std_ulogic; -- cancel current bus transaction
p_bus_excl_o : out std_ulogic; -- exclusive access
p_bus_lock_o : out std_ulogic; -- exclusive access request
p_bus_ack_i : in std_ulogic; -- bus transfer acknowledge
p_bus_err_i : in std_ulogic -- bus transfer error
);
1635,38 → 1653,38
MEM_INT_IMEM_SIZE : natural := 8*1024; -- size of processor-internal instruction memory in bytes
-- Internal data memory --
MEM_INT_DMEM_EN : boolean := true; -- implement processor-internal data memory
MEM_INT_DMEM_SIZE : natural := 4*1024 -- size of processor-internal data memory in bytes
MEM_INT_DMEM_SIZE : natural := 4*1024; -- size of processor-internal data memory in bytes
-- Bus Timeout --
BUS_TIMEOUT : natural := 63 -- cycles after an UNACKNOWLEDGED bus access triggers a bus fault exception
);
port (
-- global control --
clk_i : in std_ulogic; -- global clock line
rstn_i : in std_ulogic; -- global reset line, low-active
clk_i : in std_ulogic; -- global clock line
rstn_i : in std_ulogic; -- global reset line, low-active
-- host access --
src_i : in std_ulogic; -- access type (0: data, 1:instruction)
addr_i : in std_ulogic_vector(31 downto 0); -- address
rden_i : in std_ulogic; -- read enable
wren_i : in std_ulogic; -- write enable
ben_i : in std_ulogic_vector(03 downto 0); -- byte write enable
data_i : in std_ulogic_vector(31 downto 0); -- data in
data_o : out std_ulogic_vector(31 downto 0); -- data out
cancel_i : in std_ulogic; -- cancel current bus transaction
excl_i : in std_ulogic; -- exclusive access request
excl_o : out std_ulogic; -- state of exclusiv access (set if success)
ack_o : out std_ulogic; -- transfer acknowledge
err_o : out std_ulogic; -- transfer error
priv_i : in std_ulogic_vector(01 downto 0); -- current CPU privilege level
src_i : in std_ulogic; -- access type (0: data, 1:instruction)
addr_i : in std_ulogic_vector(31 downto 0); -- address
rden_i : in std_ulogic; -- read enable
wren_i : in std_ulogic; -- write enable
ben_i : in std_ulogic_vector(03 downto 0); -- byte write enable
data_i : in std_ulogic_vector(31 downto 0); -- data in
data_o : out std_ulogic_vector(31 downto 0); -- data out
lock_i : in std_ulogic; -- exclusive access request
ack_o : out std_ulogic; -- transfer acknowledge
err_o : out std_ulogic; -- transfer error
priv_i : in std_ulogic_vector(01 downto 0); -- current CPU privilege level
-- wishbone interface --
wb_tag_o : out std_ulogic_vector(03 downto 0); -- request tag
wb_adr_o : out std_ulogic_vector(31 downto 0); -- address
wb_dat_i : in std_ulogic_vector(31 downto 0); -- read data
wb_dat_o : out std_ulogic_vector(31 downto 0); -- write data
wb_we_o : out std_ulogic; -- read/write
wb_sel_o : out std_ulogic_vector(03 downto 0); -- byte enable
wb_stb_o : out std_ulogic; -- strobe
wb_cyc_o : out std_ulogic; -- valid cycle
wb_tag_i : in std_ulogic; -- response tag
wb_ack_i : in std_ulogic; -- transfer acknowledge
wb_err_i : in std_ulogic -- transfer error
wb_tag_o : out std_ulogic_vector(02 downto 0); -- request tag
wb_adr_o : out std_ulogic_vector(31 downto 0); -- address
wb_dat_i : in std_ulogic_vector(31 downto 0); -- read data
wb_dat_o : out std_ulogic_vector(31 downto 0); -- write data
wb_we_o : out std_ulogic; -- read/write
wb_sel_o : out std_ulogic_vector(03 downto 0); -- byte enable
wb_stb_o : out std_ulogic; -- strobe
wb_cyc_o : out std_ulogic; -- valid cycle
wb_lock_o : out std_ulogic; -- exclusive access request
wb_ack_i : in std_ulogic; -- transfer acknowledge
wb_err_i : in std_ulogic -- transfer error
);
end component;
 
/rtl/core/neorv32_sysinfo.vhd
131,7 → 131,9
sysinfo_mem(2)(05) <= bool_to_ulogic_f(xbus_big_endian_c); -- is external memory bus interface using BIG-endian byte-order?
sysinfo_mem(2)(06) <= bool_to_ulogic_f(ICACHE_EN); -- processor-internal instruction cache implemented?
--
sysinfo_mem(2)(15 downto 07) <= (others => '0'); -- reserved
sysinfo_mem(2)(14 downto 07) <= (others => '0'); -- reserved
-- Misc --
sysinfo_mem(2)(15) <= bool_to_ulogic_f(dedicated_reset_c); -- dedicated hardware reset of all core registers?
-- IO --
sysinfo_mem(2)(16) <= bool_to_ulogic_f(IO_GPIO_EN); -- general purpose input/output port unit (GPIO) implemented?
sysinfo_mem(2)(17) <= bool_to_ulogic_f(IO_MTIME_EN); -- machine system timer (MTIME) implemented?
/rtl/core/neorv32_top.vhd
60,7 → 60,7
CPU_EXTENSION_RISCV_E : boolean := false; -- implement embedded RF extension?
CPU_EXTENSION_RISCV_M : boolean := false; -- implement mul/div extension?
CPU_EXTENSION_RISCV_U : boolean := false; -- implement user mode extension?
CPU_EXTENSION_RISCV_Zfinx : boolean := false; -- implement 32-bit floating-point extension (using INT reg!)
CPU_EXTENSION_RISCV_Zfinx : boolean := false; -- implement 32-bit floating-point extension (using INT regs!)
CPU_EXTENSION_RISCV_Zicsr : boolean := true; -- implement CSR system?
CPU_EXTENSION_RISCV_Zifencei : boolean := false; -- implement instruction stream sync.?
 
95,6 → 95,7
 
-- External memory interface --
MEM_EXT_EN : boolean := false; -- implement external memory bus interface?
MEM_EXT_TIMEOUT : natural := 255; -- cycles after a pending bus access auto-terminates (0 = disabled)
 
-- Processor peripherals --
IO_GPIO_EN : boolean := true; -- implement general purpose input/output port unit (GPIO)?
119,7 → 120,7
rstn_i : in std_ulogic := '0'; -- global reset, low-active, async
 
-- Wishbone bus interface (available if MEM_EXT_EN = true) --
wb_tag_o : out std_ulogic_vector(03 downto 0); -- request tag
wb_tag_o : out std_ulogic_vector(02 downto 0); -- request tag
wb_adr_o : out std_ulogic_vector(31 downto 0); -- address
wb_dat_i : in std_ulogic_vector(31 downto 0) := (others => '0'); -- read data
wb_dat_o : out std_ulogic_vector(31 downto 0); -- write data
127,7 → 128,7
wb_sel_o : out std_ulogic_vector(03 downto 0); -- byte enable
wb_stb_o : out std_ulogic; -- strobe
wb_cyc_o : out std_ulogic; -- valid cycle
wb_tag_i : in std_ulogic := '0'; -- response tag
wb_lock_o : out std_ulogic; -- exclusive access request
wb_ack_i : in std_ulogic := '0'; -- transfer acknowledge
wb_err_i : in std_ulogic := '0'; -- transfer error
 
190,10 → 191,6
-- CPU boot address --
constant cpu_boot_addr_c : std_ulogic_vector(31 downto 0) := cond_sel_stdulogicvector_f(BOOTLOADER_EN, boot_rom_base_c, ispace_base_c);
 
-- Bus timeout --
constant bus_timeout_temp_c : natural := 2**index_size_f(bus_timeout_c); -- round to next power-of-two
constant bus_timeout_proc_c : natural := cond_sel_natural_f(ICACHE_EN, ((ICACHE_BLOCK_SIZE/4)*bus_timeout_temp_c)-1, bus_timeout_c);
 
-- alignment check for internal memories --
constant imem_align_check_c : std_ulogic_vector(index_size_f(MEM_INT_IMEM_SIZE)-1 downto 0) := (others => '0');
constant dmem_align_check_c : std_ulogic_vector(index_size_f(MEM_INT_DMEM_SIZE)-1 downto 0) := (others => '0');
231,16 → 228,14
ben : std_ulogic_vector(03 downto 0); -- byte enable
we : std_ulogic; -- write enable
re : std_ulogic; -- read enable
cancel : std_ulogic; -- cancel current transfer
ack : std_ulogic; -- bus transfer acknowledge
err : std_ulogic; -- bus transfer error
fence : std_ulogic; -- fence(i) instruction executed
priv : std_ulogic_vector(1 downto 0); -- current privilege level
src : std_ulogic; -- access source (1=instruction fetch, 0=data access)
excl : std_ulogic; -- exclusive access
lock : std_ulogic; -- exclusive access request
end record;
signal cpu_i, i_cache, cpu_d, p_bus : bus_interface_t;
signal cpu_d_exclr : std_ulogic; -- CPU D-bus, exclusive access response
 
-- io space access --
signal io_acc : std_ulogic;
257,7 → 252,6
signal wishbone_rdata : std_ulogic_vector(data_width_c-1 downto 0);
signal wishbone_ack : std_ulogic;
signal wishbone_err : std_ulogic;
signal wishbone_exclr : std_ulogic;
signal gpio_rdata : std_ulogic_vector(data_width_c-1 downto 0);
signal gpio_ack : std_ulogic;
signal mtime_rdata : std_ulogic_vector(data_width_c-1 downto 0);
284,6 → 278,7
signal neoled_ack : std_ulogic;
signal sysinfo_rdata : std_ulogic_vector(data_width_c-1 downto 0);
signal sysinfo_ack : std_ulogic;
signal bus_keeper_err : std_ulogic;
 
-- IRQs --
signal mtime_irq : std_ulogic;
332,10 → 327,7
assert not (dspace_base_c /= x"80000000") report "NEORV32 PROCESSOR CONFIG WARNING! Non-default base address for data address space. Make sure this is sync with the software framework." severity warning;
-- memory system - the i-cache is intended to accelerate instruction fetch via the external memory interface only --
assert not ((ICACHE_EN = true) and (MEM_EXT_EN = false)) report "NEORV32 PROCESSOR CONFIG NOTE. Implementing i-cache without having the external memory interface implemented. The i-cache is intended to accelerate instruction fetch via the external memory interface." severity note;
-- memory system - cached instruction fetch latency check --
assert not (ICACHE_EN = true) report "NEORV32 PROCESSOR CONFIG WARNING! Implementing i-cache. Increasing bus access timeout from " & integer'image(bus_timeout_c) & " cycles to " & integer'image(bus_timeout_proc_c) & " cycles." severity warning;
 
 
-- Reset Generator ------------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
reset_generator_sync: process(clk_i)
411,7 → 403,6
-- General --
HW_THREAD_ID => HW_THREAD_ID, -- hardware thread id
CPU_BOOT_ADDR => cpu_boot_addr_c, -- cpu boot address
BUS_TIMEOUT => bus_timeout_proc_c, -- cycles after an UNACKNOWLEDGED bus access triggers a bus fault exception
-- RISC-V CPU Extensions --
CPU_EXTENSION_RISCV_A => CPU_EXTENSION_RISCV_A, -- implement atomic extension?
CPU_EXTENSION_RISCV_B => CPU_EXTENSION_RISCV_B, -- implement bit manipulation extensions?
445,7 → 436,7
i_bus_ben_o => cpu_i.ben, -- byte enable
i_bus_we_o => cpu_i.we, -- write enable
i_bus_re_o => cpu_i.re, -- read enable
i_bus_cancel_o => cpu_i.cancel, -- cancel current bus transaction
i_bus_lock_o => cpu_i.lock, -- exclusive access request
i_bus_ack_i => cpu_i.ack, -- bus transfer acknowledge
i_bus_err_i => cpu_i.err, -- bus transfer error
i_bus_fence_o => cpu_i.fence, -- executed FENCEI operation
457,13 → 448,11
d_bus_ben_o => cpu_d.ben, -- byte enable
d_bus_we_o => cpu_d.we, -- write enable
d_bus_re_o => cpu_d.re, -- read enable
d_bus_cancel_o => cpu_d.cancel, -- cancel current bus transaction
d_bus_lock_o => cpu_d.lock, -- exclusive access request
d_bus_ack_i => cpu_d.ack, -- bus transfer acknowledge
d_bus_err_i => cpu_d.err, -- bus transfer error
d_bus_fence_o => cpu_d.fence, -- executed FENCE operation
d_bus_priv_o => cpu_d.priv, -- privilege level
d_bus_excl_o => cpu_d.excl, -- exclusive access
d_bus_excl_i => cpu_d_exclr, -- state of exclusiv access (set if success)
-- system time input from MTIME --
time_i => mtime_time, -- current system time
-- interrupts (risc-v compliant) --
476,9 → 465,8
);
 
-- misc --
cpu_i.excl <= '0'; -- i-fetch cannot do exclusive accesses
cpu_i.src <= '1'; -- initialized but unused
cpu_d.src <= '0'; -- initialized but unused
cpu_i.src <= '1'; -- initialized but unused
cpu_d.src <= '0'; -- initialized but unused
 
-- advanced memory control --
fence_o <= cpu_d.fence; -- indicates an executed FENCE operation
530,7 → 518,6
host_ben_i => cpu_i.ben, -- byte enable
host_we_i => cpu_i.we, -- write enable
host_re_i => cpu_i.re, -- read enable
host_cancel_i => cpu_i.cancel, -- cancel current bus transaction
host_ack_o => cpu_i.ack, -- bus transfer acknowledge
host_err_o => cpu_i.err, -- bus transfer error
-- peripheral bus interface --
540,29 → 527,27
bus_ben_o => i_cache.ben, -- byte enable
bus_we_o => i_cache.we, -- write enable
bus_re_o => i_cache.re, -- read enable
bus_cancel_o => i_cache.cancel, -- cancel current bus transaction
bus_ack_i => i_cache.ack, -- bus transfer acknowledge
bus_err_i => i_cache.err -- bus transfer error
);
end generate;
 
-- TODO: do not use LOCKED instruction fetch --
i_cache.lock <= '0';
 
neorv32_icache_inst_false:
if (ICACHE_EN = false) generate
i_cache.addr <= cpu_i.addr;
cpu_i.rdata <= i_cache.rdata;
i_cache.wdata <= cpu_i.wdata;
i_cache.ben <= cpu_i.ben;
i_cache.we <= cpu_i.we;
i_cache.re <= cpu_i.re;
i_cache.cancel <= cpu_i.cancel;
cpu_i.ack <= i_cache.ack;
cpu_i.err <= i_cache.err;
i_cache.addr <= cpu_i.addr;
cpu_i.rdata <= i_cache.rdata;
i_cache.wdata <= cpu_i.wdata;
i_cache.ben <= cpu_i.ben;
i_cache.we <= cpu_i.we;
i_cache.re <= cpu_i.re;
cpu_i.ack <= i_cache.ack;
cpu_i.err <= i_cache.err;
end generate;
 
-- no exclusive accesses for i-fetch --
i_cache.excl <= '0';
 
 
-- CPU Bus Switch -------------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
neorv32_busswitch_inst: neorv32_busswitch
581,8 → 566,7
ca_bus_ben_i => cpu_d.ben, -- byte enable
ca_bus_we_i => cpu_d.we, -- write enable
ca_bus_re_i => cpu_d.re, -- read enable
ca_bus_cancel_i => cpu_d.cancel, -- cancel current bus transaction
ca_bus_excl_i => cpu_d.excl, -- exclusive access
ca_bus_lock_i => cpu_d.lock, -- exclusive access request
ca_bus_ack_o => cpu_d.ack, -- bus transfer acknowledge
ca_bus_err_o => cpu_d.err, -- bus transfer error
-- controller interface b --
592,8 → 576,7
cb_bus_ben_i => i_cache.ben, -- byte enable
cb_bus_we_i => i_cache.we, -- write enable
cb_bus_re_i => i_cache.re, -- read enable
cb_bus_cancel_i => i_cache.cancel, -- cancel current bus transaction
cb_bus_excl_i => i_cache.excl, -- exclusive access
cb_bus_lock_i => i_cache.lock, -- exclusive access request
cb_bus_ack_o => i_cache.ack, -- bus transfer acknowledge
cb_bus_err_o => i_cache.err, -- bus transfer error
-- peripheral bus --
604,8 → 587,7
p_bus_ben_o => p_bus.ben, -- byte enable
p_bus_we_o => p_bus.we, -- write enable
p_bus_re_o => p_bus.re, -- read enable
p_bus_cancel_o => p_bus.cancel, -- cancel current bus transaction
p_bus_excl_o => p_bus.excl, -- exclusive access
p_bus_lock_o => p_bus.lock, -- exclusive access request
p_bus_ack_i => p_bus.ack, -- bus transfer acknowledge
p_bus_err_i => p_bus.err -- bus transfer error
);
622,13 → 604,33
spi_ack or twi_ack or pwm_ack or wdt_ack or trng_ack or cfs_ack or nco_ack or neoled_ack or sysinfo_ack);
 
-- processor bus: CPU transfer data bus error input --
p_bus.err <= wishbone_err;
p_bus.err <= bus_keeper_err or wishbone_err;
 
-- exclusive access status --
-- since all internal modules/memories are only accessible to this CPU internal atomic access cannot fail
cpu_d_exclr <= wishbone_exclr; -- only external atomic memory accesses can fail
 
-- Processor-Internal Bus Keeper (BUSKEEPER) ----------------------------------------------
-- -------------------------------------------------------------------------------------------
neorv32_bus_keeper_inst: neorv32_bus_keeper
generic map (
-- Internal instruction memory --
MEM_INT_IMEM_EN => MEM_INT_IMEM_EN, -- implement processor-internal instruction memory
MEM_INT_IMEM_SIZE => MEM_INT_IMEM_SIZE, -- size of processor-internal instruction memory in bytes
-- Internal data memory --
MEM_INT_DMEM_EN => MEM_INT_DMEM_EN, -- implement processor-internal data memory
MEM_INT_DMEM_SIZE => MEM_INT_DMEM_SIZE -- size of processor-internal data memory in bytes
)
port map (
-- host access --
clk_i => clk_i, -- global clock line
rstn_i => sys_rstn, -- global reset line, low-active
addr_i => p_bus.addr, -- address
rden_i => p_bus.re, -- read enable
wren_i => p_bus.we, -- write enable
ack_i => p_bus.ack, -- transfer acknowledge from bus system
err_i => p_bus.err, -- transfer error from bus system
err_o => bus_keeper_err -- bus error
);
 
 
-- Processor-Internal Instruction Memory (IMEM) -------------------------------------------
-- -------------------------------------------------------------------------------------------
neorv32_int_imem_inst_true:
724,7 → 726,9
MEM_INT_IMEM_SIZE => MEM_INT_IMEM_SIZE, -- size of processor-internal instruction memory in bytes
-- Internal data memory --
MEM_INT_DMEM_EN => MEM_INT_DMEM_EN, -- implement processor-internal data memory
MEM_INT_DMEM_SIZE => MEM_INT_DMEM_SIZE -- size of processor-internal data memory in bytes
MEM_INT_DMEM_SIZE => MEM_INT_DMEM_SIZE, -- size of processor-internal data memory in bytes
-- Bus Timeout --
BUS_TIMEOUT => MEM_EXT_TIMEOUT -- cycles after an UNACKNOWLEDGED bus access triggers a bus fault exception
)
port map (
-- global control --
738,9 → 742,7
ben_i => p_bus.ben, -- byte write enable
data_i => p_bus.wdata, -- data in
data_o => wishbone_rdata, -- data out
cancel_i => p_bus.cancel, -- cancel current transaction
excl_i => p_bus.excl, -- exclusive access request
excl_o => wishbone_exclr, -- state of exclusiv access (set if success)
lock_i => p_bus.lock, -- exclusive access request
ack_o => wishbone_ack, -- transfer acknowledge
err_o => wishbone_err, -- transfer error
priv_i => p_bus.priv, -- current CPU privilege level
753,7 → 755,7
wb_sel_o => wb_sel_o, -- byte enable
wb_stb_o => wb_stb_o, -- strobe
wb_cyc_o => wb_cyc_o, -- valid cycle
wb_tag_i => wb_tag_i, -- response tag
wb_lock_o => wb_lock_o, -- exclusive access request
wb_ack_i => wb_ack_i, -- transfer acknowledge
wb_err_i => wb_err_i -- transfer error
);
764,7 → 766,6
wishbone_rdata <= (others => '0');
wishbone_ack <= '0';
wishbone_err <= '0';
wishbone_exclr <= '0';
--
wb_adr_o <= (others => '0');
wb_dat_o <= (others => '0');
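
The MEM_EXT_TIMEOUT generic shown above is documented as the number of cycles after which a pending bus access auto-terminates, with 0 disabling the timeout. The following is a minimal, self-contained sketch of that idea only. It is not the NEORV32 implementation: the entity name timeout_sketch, the generic TIMEOUT and the ports pending_i and tmo_o are hypothetical names chosen for illustration, and the exact cycle count at which the real core terminates an access may differ.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sketch (NOT taken from the NEORV32 sources): raises a one-cycle
-- pulse when a pending access sees neither ACK nor ERR before a down-counter,
-- loaded with TIMEOUT, runs out.
entity timeout_sketch is
  generic (
    TIMEOUT : natural := 255 -- cycles until auto-termination (0 = disabled)
  );
  port (
    clk_i     : in  std_ulogic; -- clock, rising edge
    rstn_i    : in  std_ulogic; -- reset, low-active, async
    pending_i : in  std_ulogic; -- access in progress (assumed host signal)
    ack_i     : in  std_ulogic; -- normal termination
    err_i     : in  std_ulogic; -- abnormal termination
    tmo_o     : out std_ulogic  -- one-cycle pulse: access timed out
  );
end entity;

architecture rtl of timeout_sketch is
  signal cnt : natural range 0 to TIMEOUT; -- remaining cycles
begin

  timeout_counter: process(rstn_i, clk_i)
  begin
    if (rstn_i = '0') then
      cnt   <= TIMEOUT;
      tmo_o <= '0';
    elsif rising_edge(clk_i) then
      tmo_o <= '0'; -- default: no timeout
      if (pending_i = '0') or (ack_i = '1') or (err_i = '1') then
        cnt <= TIMEOUT; -- reload while idle or when the access terminates
      elsif (TIMEOUT /= 0) then -- timeout enabled
        if (cnt = 0) then
          tmo_o <= '1';     -- signal forced termination
          cnt   <= TIMEOUT; -- re-arm for the next access
        else
          cnt <= cnt - 1;
        end if;
      end if;
    end if;
  end process timeout_counter;

end architecture;
```

With TIMEOUT set to 0 the counter never decrements and tmo_o stays low, so an access that is never acknowledged would stall indefinitely in this sketch.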
/rtl/core/neorv32_wishbone.vhd
2,7 → 2,7
-- # << NEORV32 - External Bus Interface (WISHBONE) >> #
-- # ********************************************************************************************* #
-- # The interface provides registers for all outgoing and for all incoming signals. If the host #
-- # cancels an activetransfer, the Wishbone arbiter still waits some time for the bus system to #
-- # cancels an active transfer, the Wishbone arbiter still waits some time for the bus system to #
-- # ACK/ERR the transfer before the arbiter forces termination. #
-- # #
-- # Even when all processor-internal memories and IO devices are disabled, the EXTERNAL address #
9,7 → 9,7
-- # space ENDS at address 0xffff0000 (begin of internal BOOTROM address space). #
-- # #
-- # All bus accesses from the CPU, which do not target the internal IO region / the internal #
-- # bootlloader / the internal instruction or data memories (if implemented), are delegated via #
-- # bootloader / the internal instruction or data memories (if implemented), are delegated via #
-- # this Wishbone gateway to the external bus interface. Accessed peripherals can have a response #
-- # latency of up to BUS_TIMEOUT - 2 cycles. #
-- # #
68,41 → 68,39
);
port (
-- global control --
clk_i : in std_ulogic; -- global clock line
rstn_i : in std_ulogic; -- global reset line, low-active
clk_i : in std_ulogic; -- global clock line
rstn_i : in std_ulogic; -- global reset line, low-active
-- host access --
src_i : in std_ulogic; -- access type (0: data, 1:instruction)
addr_i : in std_ulogic_vector(31 downto 0); -- address
rden_i : in std_ulogic; -- read enable
wren_i : in std_ulogic; -- write enable
ben_i : in std_ulogic_vector(03 downto 0); -- byte write enable
data_i : in std_ulogic_vector(31 downto 0); -- data in
data_o : out std_ulogic_vector(31 downto 0); -- data out
cancel_i : in std_ulogic; -- cancel current bus transaction
excl_i : in std_ulogic; -- exclusive access request
excl_o : out std_ulogic; -- state of exclusiv access (set if failed)
ack_o : out std_ulogic; -- transfer acknowledge
err_o : out std_ulogic; -- transfer error
priv_i : in std_ulogic_vector(01 downto 0); -- current CPU privilege level
src_i : in std_ulogic; -- access type (0: data, 1:instruction)
addr_i : in std_ulogic_vector(31 downto 0); -- address
rden_i : in std_ulogic; -- read enable
wren_i : in std_ulogic; -- write enable
ben_i : in std_ulogic_vector(03 downto 0); -- byte write enable
data_i : in std_ulogic_vector(31 downto 0); -- data in
data_o : out std_ulogic_vector(31 downto 0); -- data out
lock_i : in std_ulogic; -- exclusive access request
ack_o : out std_ulogic; -- transfer acknowledge
err_o : out std_ulogic; -- transfer error
priv_i : in std_ulogic_vector(01 downto 0); -- current CPU privilege level
-- wishbone interface --
wb_tag_o : out std_ulogic_vector(03 downto 0); -- request tag
wb_adr_o : out std_ulogic_vector(31 downto 0); -- address
wb_dat_i : in std_ulogic_vector(31 downto 0); -- read data
wb_dat_o : out std_ulogic_vector(31 downto 0); -- write data
wb_we_o : out std_ulogic; -- read/write
wb_sel_o : out std_ulogic_vector(03 downto 0); -- byte enable
wb_stb_o : out std_ulogic; -- strobe
wb_cyc_o : out std_ulogic; -- valid cycle
wb_tag_i : in std_ulogic; -- response tag
wb_ack_i : in std_ulogic; -- transfer acknowledge
wb_err_i : in std_ulogic -- transfer error
wb_tag_o : out std_ulogic_vector(02 downto 0); -- request tag
wb_adr_o : out std_ulogic_vector(31 downto 0); -- address
wb_dat_i : in std_ulogic_vector(31 downto 0); -- read data
wb_dat_o : out std_ulogic_vector(31 downto 0); -- write data
wb_we_o : out std_ulogic; -- read/write
wb_sel_o : out std_ulogic_vector(03 downto 0); -- byte enable
wb_stb_o : out std_ulogic; -- strobe
wb_cyc_o : out std_ulogic; -- valid cycle
wb_lock_o : out std_ulogic; -- exclusive access request
wb_ack_i : in std_ulogic; -- transfer acknowledge
wb_err_i : in std_ulogic -- transfer error
);
end neorv32_wishbone;
 
architecture neorv32_wishbone_rtl of neorv32_wishbone is
 
-- constants --
constant xbus_timeout_c : natural := BUS_TIMEOUT/4;
-- timeout enable --
constant timeout_en_c : boolean := boolean(BUS_TIMEOUT /= 0); -- timeout enabled if BUS_TIMEOUT > 0
 
-- access control --
signal int_imem_acc : std_ulogic;
111,7 → 109,7
signal xbus_access : std_ulogic;
 
-- bus arbiter
type ctrl_state_t is (IDLE, BUSY, CANCELED, RESYNC);
type ctrl_state_t is (IDLE, BUSY, RESYNC);
type ctrl_t is record
state : ctrl_state_t;
we : std_ulogic;
123,10 → 121,9
sel : std_ulogic_vector(3 downto 0);
ack : std_ulogic;
err : std_ulogic;
timeout : std_ulogic_vector(index_size_f(xbus_timeout_c)-1 downto 0);
timeout : std_ulogic_vector(index_size_f(BUS_TIMEOUT)-1 downto 0);
src : std_ulogic;
excl : std_ulogic;
exclr : std_ulogic; -- response
lock : std_ulogic;
priv : std_ulogic_vector(1 downto 0);
end record;
signal ctrl : ctrl_t;
137,9 → 134,10
 
-- Sanity Checks --------------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
-- max bus timeout latency lower than recommended --
assert not (BUS_TIMEOUT <= 32) report "NEORV32 PROCESSOR CONFIG WARNING: Bus timeout should be >32 when using external bus interface." severity warning;
-- external memory iterface protocol + max timeout latency notifier (warning) --
-- bus timeout --
assert not (BUS_TIMEOUT /= 0) report "NEORV32 PROCESSOR CONFIG WARNING: Using auto-timeout for external bus interface (" & integer'image(BUS_TIMEOUT) & " cycles)." severity warning;
assert not (BUS_TIMEOUT = 0) report "NEORV32 PROCESSOR CONFIG WARNING: Using no auto-timeout for external bus interface (might cause permanent CPU stall)." severity warning;
-- external memory interface protocol + max timeout latency notifier (warning) --
assert not (wb_pipe_mode_c = false) report "NEORV32 PROCESSOR CONFIG NOTE: Implementing external memory interface using STANDARD Wishbone protocol." severity note;
assert not (wb_pipe_mode_c = true) report "NEORV32 PROCESSOR CONFIG NOTE! Implementing external memory interface using PIPELINED Wishbone protocol." severity note;
-- endianness --
163,27 → 161,25
begin
if (rstn_i = '0') then
ctrl.state <= IDLE;
ctrl.we <= '0';
ctrl.we <= def_rst_val_c;
ctrl.rd_req <= '0';
ctrl.wr_req <= '0';
ctrl.adr <= (others => '0');
ctrl.wdat <= (others => '0');
ctrl.rdat <= (others => '0');
ctrl.sel <= (others => '0');
ctrl.timeout <= (others => '0');
ctrl.ack <= '0';
ctrl.err <= '0';
ctrl.src <= '0';
ctrl.excl <= '0';
ctrl.exclr <= '0';
ctrl.priv <= "00";
ctrl.adr <= (others => def_rst_val_c);
ctrl.wdat <= (others => def_rst_val_c);
ctrl.rdat <= (others => def_rst_val_c);
ctrl.sel <= (others => def_rst_val_c);
ctrl.timeout <= (others => def_rst_val_c);
ctrl.ack <= def_rst_val_c;
ctrl.err <= def_rst_val_c;
ctrl.src <= def_rst_val_c;
ctrl.lock <= def_rst_val_c;
ctrl.priv <= (others => def_rst_val_c);
elsif rising_edge(clk_i) then
-- defaults --
ctrl.rdat <= (others => '0');
ctrl.ack <= '0';
ctrl.err <= '0';
ctrl.exclr <= '0';
ctrl.timeout <= std_ulogic_vector(to_unsigned(xbus_timeout_c, index_size_f(xbus_timeout_c)));
ctrl.timeout <= std_ulogic_vector(to_unsigned(BUS_TIMEOUT, index_size_f(BUS_TIMEOUT)));
 
-- state machine --
case ctrl.state is
203,36 → 199,29
ctrl.sel <= bit_rev_f(ben_i);
end if;
ctrl.src <= src_i;
ctrl.excl <= excl_i;
ctrl.lock <= lock_i;
ctrl.priv <= priv_i;
-- valid new or buffered read/write request --
if ((xbus_access and (wren_i or ctrl.wr_req or rden_i or ctrl.rd_req) and (not cancel_i)) = '1') then
if ((xbus_access and (wren_i or ctrl.wr_req or rden_i or ctrl.rd_req)) = '1') then
ctrl.state <= BUSY;
end if;
 
when BUSY => -- transfer in progress
-- ------------------------------------------------------------
ctrl.rdat <= wb_dat_i;
ctrl.exclr <= wb_tag_i; -- set if exclusive access success
if (cancel_i = '1') then -- transfer canceled by host
ctrl.state <= CANCELED;
elsif (wb_err_i = '1') then -- abnormal bus termination
ctrl.rdat <= wb_dat_i;
if (wb_err_i = '1') then -- abnormal bus termination
ctrl.err <= '1';
ctrl.state <= CANCELED;
ctrl.state <= IDLE;
elsif (wb_ack_i = '1') then -- normal bus termination
ctrl.ack <= '1';
ctrl.state <= IDLE;
elsif (timeout_en_c = true) and (or_all_f(ctrl.timeout) = '0') then -- valid timeout
ctrl.err <= '1';
ctrl.state <= IDLE;
end if;
 
when CANCELED => -- wait for cycle to be completed either by peripheral or by timeout (ignore result of transfer)
-- ------------------------------------------------------------
ctrl.wr_req <= ctrl.wr_req or wren_i; -- buffer new request
ctrl.rd_req <= ctrl.rd_req or rden_i; -- buffer new request
-- wait for bus.peripheral to ACK transfer (as "aborted" but still somehow "completed")
-- or wait for a timeout and force termination
ctrl.timeout <= std_ulogic_vector(unsigned(ctrl.timeout) - 1); -- timeout counter
if (wb_ack_i = '1') or (or_all_f(ctrl.timeout) = '0') then
ctrl.state <= RESYNC;
-- timeout counter --
if (timeout_en_c = true) then
ctrl.timeout <= std_ulogic_vector(unsigned(ctrl.timeout) - 1); -- timeout counter
end if;
 
when RESYNC => -- make sure transfer is done!
255,14 → 244,14
data_o <= ctrl.rdat when (xbus_big_endian_c = true) else bswap32_f(ctrl.rdat); -- endianness conversion
ack_o <= ctrl.ack;
err_o <= ctrl.err;
excl_o <= ctrl.exclr;
 
-- wishbone interface --
wb_tag_o(0) <= '1' when (ctrl.priv = priv_mode_m_c) else '0'; -- privileged access when in machine mode
wb_tag_o(1) <= '0'; -- 0 = secure, 1 = non-secure
wb_tag_o(2) <= ctrl.src; -- 0 = data access, 1 = instruction access
wb_tag_o(3) <= ctrl.excl; -- 1 = exclusive access request
 
wb_lock_o <= ctrl.lock; -- 1 = exclusive access request
 
wb_adr_o <= ctrl.adr;
wb_dat_o <= ctrl.wdat;
wb_we_o <= ctrl.we;
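The reworked arbiter can be read as follows: the timeout counter is reloaded with `BUS_TIMEOUT` whenever the state machine is outside `BUSY`, counts down while a transfer is pending, and forces an error once it reaches zero. With `BUS_TIMEOUT = 0` the check is disabled and only ACK or ERR can end the transfer. The C sketch below is a host-side behavioral model of that reading, not the RTL itself. The names `wb_busy_model`, `xfer_end_t` and `xfer_result_t` are hypothetical, and the model ignores the bit width of the real counter.

```c
#include <assert.h>
#include <stdbool.h>

/* How a modeled transfer ended (hypothetical names, not part of NEORV32). */
typedef enum { XFER_ACK, XFER_ERR, XFER_TIMEOUT, XFER_STALLED } xfer_end_t;

typedef struct {
    xfer_end_t end;   /* termination cause */
    unsigned   cycle; /* 0-based BUSY cycle in which the transfer ended */
} xfer_result_t;

/*
 * Model of the BUSY state.
 *   bus_timeout : value of the BUS_TIMEOUT generic (0 = auto-timeout disabled)
 *   ack_cycle   : BUSY cycle in which the device raises ACK, negative = never
 *   err_cycle   : BUSY cycle in which the device raises ERR, negative = never
 *   max_cycles  : simulation bound, needed because a disabled timeout may stall forever
 */
xfer_result_t wb_busy_model(unsigned bus_timeout, int ack_cycle, int err_cycle,
                            unsigned max_cycles)
{
    assert(max_cycles > 0);

    const bool timeout_en = (bus_timeout != 0);
    unsigned counter = bus_timeout; /* reloaded while the FSM is not in BUSY */
    xfer_result_t res = { XFER_STALLED, max_cycles };

    for (unsigned cycle = 0; cycle < max_cycles; cycle++) {
        /* Same priority as the if/elsif chain: ERR, then ACK, then timeout. */
        if (err_cycle >= 0 && (unsigned)err_cycle == cycle) {
            res.end = XFER_ERR;
            res.cycle = cycle;
            return res;
        }
        if (ack_cycle >= 0 && (unsigned)ack_cycle == cycle) {
            res.end = XFER_ACK;
            res.cycle = cycle;
            return res;
        }
        if (timeout_en && counter == 0) {
            res.end = XFER_TIMEOUT;
            res.cycle = cycle;
            return res;
        }
        if (timeout_en) {
            counter--; /* count down only while the timeout is enabled */
        }
    }
    return res; /* still pending after max_cycles */
}
```

Under this model a device that answers in exactly the cycle the counter reaches zero still completes normally, because the ACK check comes before the timeout check.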
/rtl/top_templates/neorv32_test_setup.vhd
108,6 → 108,7
ICACHE_ASSOCIATIVITY => 1, -- i-cache: associativity / number of sets (1=direct_mapped), has to be a power of 2
-- External memory interface --
MEM_EXT_EN => false, -- implement external memory bus interface?
MEM_EXT_TIMEOUT => 0, -- cycles after a pending bus access auto-terminates (0 = disabled)
-- Processor peripherals --
IO_GPIO_EN => true, -- implement general purpose input/output port unit (GPIO)?
IO_MTIME_EN => true, -- implement machine system timer (MTIME)?
138,7 → 139,7
wb_sel_o => open, -- byte enable
wb_stb_o => open, -- strobe
wb_cyc_o => open, -- valid cycle
wb_tag_i => '0', -- response tag
wb_lock_o => open, -- exclusive access request
wb_ack_i => '0', -- transfer acknowledge
wb_err_i => '0', -- transfer error
-- Advanced memory control signals (available if MEM_EXT_EN = true) --
/rtl/top_templates/neorv32_top_axi4lite.vhd
215,16 → 215,17
 
-- internal wishbone bus --
type wb_bus_t is record
adr : std_ulogic_vector(31 downto 0); -- address
di : std_ulogic_vector(31 downto 0); -- processor input data
do : std_ulogic_vector(31 downto 0); -- processor output data
we : std_ulogic; -- write enable
sel : std_ulogic_vector(03 downto 0); -- byte enable
stb : std_ulogic; -- strobe
cyc : std_ulogic; -- valid cycle
ack : std_ulogic; -- transfer acknowledge
err : std_ulogic; -- transfer error
tag : std_ulogic_vector(03 downto 0); -- tag
adr : std_ulogic_vector(31 downto 0); -- address
di : std_ulogic_vector(31 downto 0); -- processor input data
do : std_ulogic_vector(31 downto 0); -- processor output data
we : std_ulogic; -- write enable
sel : std_ulogic_vector(03 downto 0); -- byte enable
stb : std_ulogic; -- strobe
cyc : std_ulogic; -- valid cycle
ack : std_ulogic; -- transfer acknowledge
err : std_ulogic; -- transfer error
tag : std_ulogic_vector(02 downto 0); -- tag
lock : std_ulogic; -- exclusive access request
end record;
signal wb_core : wb_bus_t;
 
244,7 → 245,7
-- Sanity Checks --------------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
assert not (wb_pipe_mode_c = true) report "NEORV32 PROCESSOR CONFIG ERROR: AXI4-Lite bridge requires STANDARD/CLASSIC Wishbone mode (package.wb_pipe_mode_c = false)." severity error;
assert not (CPU_EXTENSION_RISCV_A = true) report "NEORV32 PROCESSOR CONFIG WARNING: AXI4-Lite provides NO support for atomic memory operations." severity warning;
assert not (CPU_EXTENSION_RISCV_A = true) report "NEORV32 PROCESSOR CONFIG WARNING: AXI4-Lite provides NO support for atomic memory operations. LR/SC access via AXI will raise a bus exception." severity warning;
 
 
-- The Core Of The Problem ----------------------------------------------------------------
291,6 → 292,7
ICACHE_ASSOCIATIVITY => ICACHE_ASSOCIATIVITY, -- i-cache: associativity / number of sets (1=direct_mapped), has to be a power of 2
-- External memory interface --
MEM_EXT_EN => true, -- implement external memory bus interface?
MEM_EXT_TIMEOUT => 0, -- cycles after a pending bus access auto-terminates (0 = disabled)
-- Processor peripherals --
IO_GPIO_EN => IO_GPIO_EN, -- implement general purpose input/output port unit (GPIO)?
IO_MTIME_EN => IO_MTIME_EN, -- implement machine system timer (MTIME)?
321,7 → 323,7
wb_sel_o => wb_core.sel, -- byte enable
wb_stb_o => wb_core.stb, -- strobe
wb_cyc_o => wb_core.cyc, -- valid cycle
wb_tag_i => '0', -- response tag
wb_lock_o => wb_core.lock, -- exclusive access request
wb_ack_i => wb_core.ack, -- transfer acknowledge
wb_err_i => wb_core.err, -- transfer error
-- Advanced memory control signals (available if MEM_EXT_EN = true) --
474,7 → 476,7
 
-- Wishbone transfer termination --
wb_core.ack <= ack_read or ack_write;
wb_core.err <= (ack_read and err_read) or (ack_write and err_write);
wb_core.err <= (ack_read and err_read) or (ack_write and err_write) or wb_core.lock;
 
 
end neorv32_top_axi4lite_rtl;
/rtl/top_templates/neorv32_top_stdlogic.vhd
81,6 → 81,7
ICACHE_ASSOCIATIVITY : natural := 1; -- i-cache: associativity / number of sets (1=direct_mapped), has to be a power of 2
-- External memory interface --
MEM_EXT_EN : boolean := false; -- implement external memory bus interface?
MEM_EXT_TIMEOUT : natural := 255; -- cycles after a pending bus access auto-terminates (0 = disabled)
-- Processor peripherals --
IO_GPIO_EN : boolean := true; -- implement general purpose input/output port unit (GPIO)?
IO_MTIME_EN : boolean := true; -- implement machine system timer (MTIME)?
103,7 → 104,7
clk_i : in std_logic := '0'; -- global clock, rising edge
rstn_i : in std_logic := '0'; -- global reset, low-active, async
-- Wishbone bus interface (available if MEM_EXT_EN = true) --
wb_tag_o : out std_logic_vector(03 downto 0); -- tag
wb_tag_o : out std_logic_vector(02 downto 0); -- tag
wb_adr_o : out std_logic_vector(31 downto 0); -- address
wb_dat_i : in std_logic_vector(31 downto 0) := (others => '0'); -- read data
wb_dat_o : out std_logic_vector(31 downto 0); -- write data
111,7 → 112,7
wb_sel_o : out std_logic_vector(03 downto 0); -- byte enable
wb_stb_o : out std_logic; -- strobe
wb_cyc_o : out std_logic; -- valid cycle
wb_tag_i : in std_logic; -- response tag
wb_lock_o : out std_logic; -- exclusive access request
wb_ack_i : in std_logic := '0'; -- transfer acknowledge
wb_err_i : in std_logic := '0'; -- transfer error
-- Advanced memory control signals (available if MEM_EXT_EN = true) --
166,7 → 167,7
signal clk_i_int : std_ulogic;
signal rstn_i_int : std_ulogic;
--
signal wb_tag_o_int : std_ulogic_vector(03 downto 0);
signal wb_tag_o_int : std_ulogic_vector(02 downto 0);
signal wb_adr_o_int : std_ulogic_vector(31 downto 0);
signal wb_dat_i_int : std_ulogic_vector(31 downto 0);
signal wb_dat_o_int : std_ulogic_vector(31 downto 0);
174,7 → 175,7
signal wb_sel_o_int : std_ulogic_vector(03 downto 0);
signal wb_stb_o_int : std_ulogic;
signal wb_cyc_o_int : std_ulogic;
signal wb_tag_i_int : std_ulogic;
signal wb_lock_o_int : std_ulogic;
signal wb_ack_i_int : std_ulogic;
signal wb_err_i_int : std_ulogic;
--
261,6 → 262,7
ICACHE_ASSOCIATIVITY => ICACHE_ASSOCIATIVITY, -- i-cache: associativity / number of sets (1=direct_mapped), has to be a power of 2
-- External memory interface --
MEM_EXT_EN => MEM_EXT_EN, -- implement external memory bus interface?
MEM_EXT_TIMEOUT => MEM_EXT_TIMEOUT, -- cycles after a pending bus access auto-terminates (0 = disabled)
-- Processor peripherals --
IO_GPIO_EN => IO_GPIO_EN, -- implement general purpose input/output port unit (GPIO)?
IO_MTIME_EN => IO_MTIME_EN, -- implement machine system timer (MTIME)?
291,7 → 293,7
wb_sel_o => wb_sel_o_int, -- byte enable
wb_stb_o => wb_stb_o_int, -- strobe
wb_cyc_o => wb_cyc_o_int, -- valid cycle
wb_tag_i => wb_tag_i_int, -- response tag
wb_lock_o => wb_lock_o_int, -- exclusive access request
wb_ack_i => wb_ack_i_int, -- transfer acknowledge
wb_err_i => wb_err_i_int, -- transfer error
-- Advanced memory control signals (available if MEM_EXT_EN = true) --
348,6 → 350,7
wb_sel_o <= std_logic_vector(wb_sel_o_int);
wb_stb_o <= std_logic(wb_stb_o_int);
wb_cyc_o <= std_logic(wb_cyc_o_int);
wb_lock_o <= std_logic(wb_lock_o_int);
wb_ack_i_int <= std_ulogic(wb_ack_i);
wb_err_i_int <= std_ulogic(wb_err_i);
 
/rtl/README.md
1,16 → 1,11
## VHDL Source File Folders
## VHDL Source Folders
 
### [`core`](https://github.com/stnolting/neorv32/tree/master/rtl/core)
 
This folder contains the the core VHDL files for the NEORV32 CPU and the NEORV32 Processor. When creating a new synthesis/simulation project make
sure that all `*.vhd` files from this folder are added to a **new** design library called `neorv32`.
This folder contains the core VHDL files for the NEORV32 CPU and the NEORV32 Processor. When creating a new synthesis/simulation project make
sure that all `*.vhd` files from this folder are added to a *new design library* called `neorv32`.
 
### [`fpga_specifc`](https://github.com/stnolting/neorv32/tree/master/rtl/fpga_specific)
 
This folder provides FPGA- or technology-specific *alternatives* for certain CPU and/or processor modules (for example optimized memory modules using
FPGA-specific primitves).
 
### [`top_templates`](https://github.com/stnolting/neorv32/tree/master/rtl/top_templates)
 
Alternative top entities for the CPU and/or the processor. Actually, these *alternative* top entities are wrappers, which instantiate the *real* top entity of
Alternative top entities for the NEORV32 Processor. Actually, these *alternative* top entities are wrappers, which instantiate the *real* top entity of
processor/CPU and provide a different interface.
/sim/ghdl/ghdl_sim.sh
4,7 → 4,7
set -e
 
# Default simulation configuration
SIM_CONFIG=--stop-time=7ms
SIM_CONFIG=--stop-time=8ms
 
# Project home folder
homedir="$( cd "$(dirname "$0")" >/dev/null 2>&1 ; pwd -P )"
41,6 → 41,7
#
ghdl -a --work=neorv32 $srcdir_core/neorv32_boot_rom.vhd
ghdl -a --work=neorv32 $srcdir_core/neorv32_busswitch.vhd
ghdl -a --work=neorv32 $srcdir_core/neorv32_bus_keeper.vhd
ghdl -a --work=neorv32 $srcdir_core/neorv32_icache.vhd
ghdl -a --work=neorv32 $srcdir_core/neorv32_cfs.vhd
ghdl -a --work=neorv32 $srcdir_core/neorv32_cpu.vhd
/sim/neorv32_tb.vhd
134,8 → 134,8
cyc : std_ulogic; -- valid cycle
ack : std_ulogic; -- transfer acknowledge
err : std_ulogic; -- transfer error
tag : std_ulogic_vector(03 downto 0); -- request tag
tag_r : std_ulogic; -- response tag
tag : std_ulogic_vector(02 downto 0); -- request tag
lock : std_ulogic; -- exclusive access request
end record;
signal wb_cpu, wb_mem_a, wb_mem_b, wb_mem_c, wb_irq : wishbone_t;
 
228,6 → 228,7
ICACHE_ASSOCIATIVITY => 2, -- i-cache: associativity / number of sets (1=direct_mapped), has to be a power of 2
-- External memory interface --
MEM_EXT_EN => true, -- implement external memory bus interface?
MEM_EXT_TIMEOUT => 255, -- cycles after a pending bus access auto-terminates (0 = disabled)
-- Processor peripherals --
IO_GPIO_EN => true, -- implement general purpose input/output port unit (GPIO)?
IO_MTIME_EN => true, -- implement machine system timer (MTIME)?
258,7 → 259,7
wb_sel_o => wb_cpu.sel, -- byte enable
wb_stb_o => wb_cpu.stb, -- strobe
wb_cyc_o => wb_cpu.cyc, -- valid cycle
wb_tag_i => wb_cpu.tag_r, -- response tag
wb_lock_o => wb_cpu.lock, -- exclusive access request
wb_ack_i => wb_cpu.ack, -- transfer acknowledge
wb_err_i => wb_cpu.err, -- transfer error
-- Advanced memory control signals (available if MEM_EXT_EN = true) --
447,7 → 448,6
wb_cpu.rdata <= wb_mem_a.rdata or wb_mem_b.rdata or wb_mem_c.rdata or wb_irq.rdata;
wb_cpu.ack <= wb_mem_a.ack or wb_mem_b.ack or wb_mem_c.ack or wb_irq.ack;
wb_cpu.err <= wb_mem_a.err or wb_mem_b.err or wb_mem_c.err or wb_irq.err;
wb_cpu.tag_r <= wb_mem_a.tag_r or wb_mem_b.tag_r or wb_mem_c.tag_r or wb_irq.tag_r;
 
-- peripheral select via STROBE signal --
wb_mem_a.stb <= wb_cpu.stb when (wb_cpu.addr >= ext_mem_a_base_addr_c) and (wb_cpu.addr < std_ulogic_vector(unsigned(ext_mem_a_base_addr_c) + ext_mem_a_size_c)) else '0';
484,8 → 484,7
end if;
 
-- bus output register --
wb_mem_a.err <= '0';
wb_mem_a.tag_r <= '0';
wb_mem_a.err <= '0';
if (ext_mem_a.ack(ext_mem_a_latency_c-1) = '1') and (wb_mem_b.cyc = '1') and (wb_mem_a.ack = '0') then
wb_mem_a.rdata <= ext_mem_a.rdata(ext_mem_a_latency_c-1);
wb_mem_a.ack <= '1';
525,8 → 524,7
end if;
 
-- bus output register --
wb_mem_b.err <= '0';
wb_mem_b.tag_r <= '0';
wb_mem_b.err <= '0';
if (ext_mem_b.ack(ext_mem_b_latency_c-1) = '1') and (wb_mem_b.cyc = '1') and (wb_mem_b.ack = '0') then
wb_mem_b.rdata <= ext_mem_b.rdata(ext_mem_b_latency_c-1);
wb_mem_b.ack <= '1';
567,28 → 565,22
 
-- EXCLUSIVE bus access -----------------------------------------------------
-- -----------------------------------------------------------------------------
-- make a reservation --
if ((wb_mem_c.cyc and wb_mem_c.stb) = '1') and -- valid access
(wb_mem_c.tag(3) = '1') and -- make a reservation if there is a request (LR.W instruction)
(wb_mem_c.addr(2) = '0') then -- only possible for even word-addresses - odd word-addresses will fail
ext_mem_c_atomic_reservation <= '1';
-- clear reservation --
elsif (wb_mem_c.ack = '1') and -- end of access
(wb_mem_c.tag(3) = '0') then -- end of exclusive access
ext_mem_c_atomic_reservation <= '0';
-- Since there is only one CPU in this design, the exclusive access reservation in THIS memory CANNOT fail.
-- However, this memory module is used to simulate failing LR/SC accesses.
if ((wb_mem_c.cyc and wb_mem_c.stb) = '1') then -- valid access
ext_mem_c_atomic_reservation <= wb_mem_c.lock; -- make reservation
end if;
-- -----------------------------------------------------------------------------
 
-- bus output register --
wb_mem_c.err <= '0';
if (ext_mem_c.ack(ext_mem_c_latency_c-1) = '1') and (wb_mem_c.cyc = '1') and (wb_mem_c.ack = '0') then
wb_mem_c.rdata <= ext_mem_c.rdata(ext_mem_c_latency_c-1);
wb_mem_c.ack <= '1';
wb_mem_c.tag_r <= ext_mem_c_atomic_reservation;
wb_mem_c.err <= ext_mem_c_atomic_reservation; -- issue a bus error if there is an exclusive access request
else
wb_mem_c.rdata <= (others => '0');
wb_mem_c.ack <= '0';
wb_mem_c.tag_r <= '0';
wb_mem_c.err <= '0';
end if;
end if;
end process ext_mem_c_access;
600,10 → 592,9
begin
if rising_edge(clk_gen) then
-- bus interface --
wb_irq.rdata <= (others => '0');
wb_irq.ack <= wb_irq.cyc and wb_irq.stb and wb_irq.we and and_all_f(wb_irq.sel);
wb_irq.err <= '0';
wb_irq.tag_r <= '0';
wb_irq.rdata <= (others => '0');
wb_irq.ack <= wb_irq.cyc and wb_irq.stb and wb_irq.we and and_all_f(wb_irq.sel);
wb_irq.err <= '0';
-- trigger IRQ using CSR.MIE bit layout --
msi_ring <= '0';
mei_ring <= '0';
/sw/bootloader/bootloader.c
469,6 → 469,11
else {
neorv32_uart_print("Loading... ");
 
// check if SPI is available at all
if (neorv32_spi_available() == 0) {
system_error(ERROR_FLASH);
}
 
// check if flash ready (or available at all)
if (spi_flash_read_1st_id() == 0x00) { // manufacturer ID
system_error(ERROR_FLASH);
/sw/example/cpu_test/main.c
55,10 → 55,6
#define ADDR_UNREACHABLE (IO_BASE_ADDRESS-4)
//** external memory base address */
#define EXT_MEM_BASE (0xF0000000)
//** exclusive access to this address will always succeed */
#define ATOMIC_SUCCESS_ADDR (EXT_MEM_BASE + 0)
//** exclusive access to this address will always fail */
#define ATOMIC_FAILURE_ADDR (EXT_MEM_BASE + 4)
/**@}*/
 
 
78,7 → 74,10
/// Global number of available HPMs
uint32_t num_hpm_cnts_global = 0;
 
/// Variable to test atomic accesses
uint32_t atomic_access_addr;
 
 
/**********************************************************************//**
* High-level CPU/processor test program.
*
837,12 → 836,20
if (neorv32_mtime_available()) {
cnt_test++;
 
// force MTIME IRQ
neorv32_mtime_set_timecmp(0);
// configure MTIME IRQ (and check overflow from low word to high word)
neorv32_mtime_set_timecmp(-1);
neorv32_mtime_set_time(0);
 
// wait some time for the IRQ to arrive the CPU
neorv32_cpu_csr_write(CSR_MIP, 0); // clear all pending IRQs
 
neorv32_mtime_set_timecmp(0x0000000100000000ULL);
neorv32_mtime_set_time( 0x00000000FFFFFFFEULL);
 
// wait some time for the IRQ to trigger and arrive at the CPU
asm volatile("nop");
asm volatile("nop");
asm volatile("nop");
asm volatile("nop");
 
if (neorv32_cpu_csr_read(CSR_MCAUSE) == TRAP_CODE_MTI) {
test_ok();
1458,7 → 1465,7
 
 
// ------ EXECUTE: should fail ------
neorv32_uart_printf("[%i] PMP: U-mode [!X,!W,R] execute: ", cnt_test);
neorv32_uart_printf("[%i] PMP: U-mode [!X,!W,R] execute: ", cnt_test);
cnt_test++;
neorv32_cpu_csr_write(CSR_MCAUSE, 0);
 
1483,7 → 1490,7
 
 
// ------ LOAD: should work ------
neorv32_uart_printf("[%i] PMP: U-mode [!X,!W,R] read: ", cnt_test);
neorv32_uart_printf("[%i] PMP: U-mode [!X,!W,R] read: ", cnt_test);
cnt_test++;
neorv32_cpu_csr_write(CSR_MCAUSE, 0);
 
1508,7 → 1515,7
 
 
// ------ STORE: should fail ------
neorv32_uart_printf("[%i] PMP: U-mode [!X,!W,R] write: ", cnt_test);
neorv32_uart_printf("[%i] PMP: U-mode [!X,!W,R] write: ", cnt_test);
cnt_test++;
neorv32_cpu_csr_write(CSR_MCAUSE, 0);
 
1563,34 → 1570,33
// Test atomic LR/SC operation - should succeed
// ----------------------------------------------------------
neorv32_cpu_csr_write(CSR_MCAUSE, 0);
neorv32_uart_printf("[%i] Atomic access (LR+SC) test (succeeding access): ", cnt_test);
neorv32_uart_printf("[%i] Atomic access (LR+SC succeeding access): ", cnt_test);
 
#ifdef __riscv_atomic
if (is_simulation) { // check if this is a simulation
// skip if A-mode is not implemented
if ((neorv32_cpu_csr_read(CSR_MISA) & (1<<CSR_MISA_A_EXT)) != 0) {
 
// skip if A-mode is not implemented
if ((neorv32_cpu_csr_read(CSR_MISA) & (1<<CSR_MISA_A_EXT)) != 0) {
cnt_test++;
 
cnt_test++;
neorv32_cpu_store_unsigned_word((uint32_t)&atomic_access_addr, 0x11223344);
 
neorv32_cpu_store_unsigned_word(ATOMIC_SUCCESS_ADDR, 0x11223344);
tmp_a = neorv32_cpu_load_reservate_word((uint32_t)&atomic_access_addr); // make reservation
asm volatile ("nop");
tmp_b = neorv32_cpu_store_conditional((uint32_t)&atomic_access_addr, 0x22446688);
 
// atomic compare-and-swap
if ((neorv32_cpu_atomic_cas((uint32_t)ATOMIC_SUCCESS_ADDR, 0x11223344, 0xAABBCCDD) == 0) && // status: success
(neorv32_cpu_load_unsigned_word(ATOMIC_SUCCESS_ADDR) == 0xAABBCCDD) && // data written correctly
(neorv32_cpu_csr_read(CSR_MCAUSE) == 0)) { // no exception triggered
test_ok();
}
else {
test_fail();
}
// atomic access
if ((tmp_b == 0) && // status: success
(tmp_a == 0x11223344) && // correct data read
(neorv32_cpu_load_unsigned_word((uint32_t)&atomic_access_addr) == 0x22446688) && // correct data write
(neorv32_cpu_csr_read(CSR_MCAUSE) == 0)) { // no exception triggered
test_ok();
}
else {
neorv32_uart_printf("skipped (not implemented)\n");
test_fail();
}
}
else {
neorv32_uart_printf("skipped (on real HW)\n");
neorv32_uart_printf("skipped (not implemented)\n");
}
#else
neorv32_uart_printf("skipped (not implemented)\n");
1601,32 → 1607,67
// Test atomic LR/SC operation - should fail
// ----------------------------------------------------------
neorv32_cpu_csr_write(CSR_MCAUSE, 0);
neorv32_uart_printf("[%i] Atomic access (LR+SC) test (failing access): ", cnt_test);
neorv32_uart_printf("[%i] Atomic access (LR+SC failing access 1): ", cnt_test);
 
#ifdef __riscv_atomic
if (is_simulation) { // check if this is a simulation
// skip if A-mode is not implemented
if ((neorv32_cpu_csr_read(CSR_MISA) & (1<<CSR_MISA_A_EXT)) != 0) {
 
// skip if A-mode is not implemented
if ((neorv32_cpu_csr_read(CSR_MISA) & (1<<CSR_MISA_A_EXT)) != 0) {
cnt_test++;
 
cnt_test++;
neorv32_cpu_store_unsigned_word((uint32_t)&atomic_access_addr, 0xAABBCCDD);
 
neorv32_cpu_store_unsigned_word(ATOMIC_FAILURE_ADDR, 0x55667788);
// atomic access
tmp_a = neorv32_cpu_load_reservate_word((uint32_t)&atomic_access_addr); // make reservation
neorv32_cpu_store_unsigned_word((uint32_t)&atomic_access_addr, 0xDEADDEAD); // destroy reservation
tmp_b = neorv32_cpu_store_conditional((uint32_t)&atomic_access_addr, 0x22446688);
 
// atomic compare-and-swap
if ((neorv32_cpu_atomic_cas((uint32_t)ATOMIC_FAILURE_ADDR, 0x55667788, 0xEEFFDDBB) != 0) && // staus: failed
(neorv32_cpu_csr_read(CSR_MCAUSE) == 0)) { // no exception triggered
test_ok();
}
else {
test_fail();
}
if ((tmp_b != 0) && // status: fail
(tmp_a == 0xAABBCCDD) && // correct data read
(neorv32_cpu_load_unsigned_word((uint32_t)&atomic_access_addr) == 0xDEADDEAD)) { // correct data write
test_ok();
}
else {
neorv32_uart_printf("skipped (not implemented)\n");
test_fail();
}
}
else {
neorv32_uart_printf("skipped (not implemented)\n");
}
#else
neorv32_uart_printf("skipped (not implemented)\n");
#endif
 
 
// ----------------------------------------------------------
// Test atomic LR/SC operation - should fail
// ----------------------------------------------------------
neorv32_cpu_csr_write(CSR_MCAUSE, 0);
neorv32_uart_printf("[%i] Atomic access (LR+SC failing access 2): ", cnt_test);
 
#ifdef __riscv_atomic
// skip if A-mode is not implemented
if ((neorv32_cpu_csr_read(CSR_MISA) & (1<<CSR_MISA_A_EXT)) != 0) {
 
cnt_test++;
 
neorv32_cpu_store_unsigned_word((uint32_t)&atomic_access_addr, 0x12341234);
 
// atomic access
tmp_a = neorv32_cpu_load_reservate_word((uint32_t)&atomic_access_addr); // make reservation
asm volatile ("ecall"); // destroy reservation via trap (simulate a context switch)
tmp_b = neorv32_cpu_store_conditional((uint32_t)&atomic_access_addr, 0xDEADBEEF);
 
if ((tmp_b != 0) && // status: fail
(tmp_a == 0x12341234) && // correct data read
(neorv32_cpu_load_unsigned_word((uint32_t)&atomic_access_addr) == 0x12341234)) { // correct data write
test_ok();
}
else {
test_fail();
}
}
else {
neorv32_uart_printf("skipped (on real HW)\n");
}
#else
/sw/example/hex_viewer/main.c
112,7 → 112,7
" help - show this text\n"
" read - read single word from address\n"
" write - write single word to address\n"
" atomic - perform atomic compare-and-swap operation\n"
" atomic - perform atomic LR/SC access\n"
" dump - dump several words from base address\n");
}
 
208,7 → 208,7
void atomic_cas(void) {
 
char terminal_buffer[16];
uint32_t mem_address, cas_expected, cas_desired;
uint32_t mem_address, rdata, wdata, status;
 
if ((neorv32_cpu_csr_read(CSR_MISA) & (1<<CSR_MISA_A_EXT)) != 0) {
 
217,22 → 217,22
neorv32_uart_scan(terminal_buffer, 8+1, 1); // 8 hex chars for address plus '\0'
mem_address = (uint32_t)hexstr_to_uint(terminal_buffer, strlen(terminal_buffer));
 
// enter expected value
neorv32_uart_printf("\nEnter expected value @0x%x (8 hex chars): 0x", mem_address);
neorv32_uart_scan(terminal_buffer, 8+1, 1); // 8 hex chars for address plus '\0'
cas_expected = (uint32_t)hexstr_to_uint(terminal_buffer, strlen(terminal_buffer));
 
// enter desired value
neorv32_uart_printf("\nEnter desired (new) value @0x%x (8 hex chars): 0x", mem_address);
neorv32_uart_printf("\nEnter new value @0x%x (8 hex chars): 0x", mem_address);
neorv32_uart_scan(terminal_buffer, 8+1, 1); // 8 hex chars for address plus '\0'
cas_desired = (uint32_t)hexstr_to_uint(terminal_buffer, strlen(terminal_buffer));
wdata = (uint32_t)hexstr_to_uint(terminal_buffer, strlen(terminal_buffer));
 
// try to execute atomic compare-and-swap
if (neorv32_cpu_atomic_cas(mem_address, cas_expected, cas_desired) == 0) {
neorv32_uart_printf("\nAtomic-CAS: Successful!\n");
rdata = neorv32_cpu_load_reservate_word(mem_address); // make reservation
status = neorv32_cpu_store_conditional(mem_address, wdata);
 
// status
neorv32_uart_printf("\nOld data: 0x%x\n", rdata);
if (status == 0) {
neorv32_uart_printf("Atomic access successful!\n");
neorv32_uart_printf("New data: 0x%x\n", neorv32_cpu_load_unsigned_word(mem_address));
}
else {
neorv32_uart_printf("\nAtomic-CAS: Failed!\n");
neorv32_uart_printf("Atomic access failed!\n");
}
}
else {
/sw/lib/include/neorv32.h
1080,6 → 1080,8
SYSINFO_FEATURES_MEM_EXT_ENDIAN = 5, /**< SYSINFO_FEATURES (5) (r/-): External bus interface uses BIG-endian byte-order when 1 (via package.xbus_big_endian_c constant) */
SYSINFO_FEATURES_ICACHE = 6, /**< SYSINFO_FEATURES (6) (r/-): Processor-internal instruction cache implemented when 1 (via ICACHE_EN generic) */
 
SYSINFO_FEATURES_HW_RESET = 15, /**< SYSINFO_FEATURES (15) (r/-): Dedicated hardware reset of core registers implemented when 1 (via package's dedicated_reset_c constant) */
 
SYSINFO_FEATURES_IO_GPIO = 16, /**< SYSINFO_FEATURES (16) (r/-): General purpose input/output port unit implemented when 1 (via IO_GPIO_EN generic) */
SYSINFO_FEATURES_IO_MTIME = 17, /**< SYSINFO_FEATURES (17) (r/-): Machine system timer implemented when 1 (via IO_MTIME_EN generic) */
SYSINFO_FEATURES_IO_UART0 = 18, /**< SYSINFO_FEATURES (18) (r/-): Primary universal asynchronous receiver/transmitter 0 implemented when 1 (via IO_UART0_EN generic) */
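The new `SYSINFO_FEATURES_HW_RESET` entry is a bit index (15), so software has to shift a mask into place before testing the features word. A minimal host-side sketch of that check follows. The names `FEATURES_HW_RESET_BIT` and `features_has_hw_reset` are hypothetical, and the features word is passed in as a plain argument instead of being read from the SYSINFO module.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the bit index of SYSINFO_FEATURES_HW_RESET (hypothetical local name). */
enum { FEATURES_HW_RESET_BIT = 15 };

static_assert(FEATURES_HW_RESET_BIT < 32, "bit index must fit into a 32-bit word");

/* Returns true if the "dedicated hardware reset" flag is set in the given features word. */
bool features_has_hw_reset(uint32_t features)
{
    return (features & (UINT32_C(1) << FEATURES_HW_RESET_BIT)) != 0;
}
```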
/sw/lib/include/neorv32_cpu.h
52,7 → 52,6
uint64_t neorv32_cpu_get_systime(void);
void neorv32_cpu_delay_ms(int16_t time_ms);
void __attribute__((naked)) neorv32_cpu_goto_user_mode(void);
int neorv32_cpu_atomic_cas(uint32_t addr, uint32_t expected, uint32_t desired);
uint32_t neorv32_cpu_pmp_get_num_regions(void);
uint32_t neorv32_cpu_pmp_get_granularity(void);
int neorv32_cpu_pmp_configure_region(uint32_t index, uint32_t base, uint32_t size, uint8_t config);
68,7 → 67,32
*
* @param[in] addr Address (32-bit).
* @param[in] wdata Data word (32-bit) to store.
* @return Operation status (32-bit).
**************************************************************************/
inline uint32_t __attribute__ ((always_inline)) neorv32_cpu_store_conditional(uint32_t addr, uint32_t wdata) {
 
#if defined __riscv_atomic || defined __riscv_a
register uint32_t reg_addr = addr;
register uint32_t reg_data = wdata;
register uint32_t reg_status;
 
asm volatile ("sc.w %[status], %[da], (%[ad])" : [status] "=r" (reg_status) : [da] "r" (reg_data), [ad] "r" (reg_addr));
 
return reg_status;
#else
return 1; // always failing
#endif
}
 
 
/**********************************************************************//**
* Conditional store unsigned word to address space if atomic access reservation is valid.
*
* @note An unaligned access address will raise an alignment exception.
*
* @param[in] addr Address (32-bit).
* @param[in] wdata Data word (32-bit) to store.
**************************************************************************/
inline void __attribute__ ((always_inline)) neorv32_cpu_store_unsigned_word(uint32_t addr, uint32_t wdata) {
 
register uint32_t reg_addr = addr;
111,6 → 135,30
 
 
/**********************************************************************//**
* Load unsigned word from address space and make reservation for atomic access.
*
* @note An unaligned access address will raise an alignment exception.
*
* @param[in] addr Address (32-bit).
* @return Read data word (32-bit).
**************************************************************************/
inline uint32_t __attribute__ ((always_inline)) neorv32_cpu_load_reservate_word(uint32_t addr) {
 
register uint32_t reg_addr = addr;
register uint32_t reg_data;
 
#if defined __riscv_atomic || defined __riscv_a
asm volatile ("lr.w %[da], 0(%[ad])" : [da] "=r" (reg_data) : [ad] "r" (reg_addr));
#else
asm volatile ("lw %[da], 0(%[ad])" : [da] "=r" (reg_data) : [ad] "r" (reg_addr));
#endif
 
return (uint32_t)reg_data;
}
 
 
 
/**********************************************************************//**
* Load unsigned word from address space.
*
* @note An unaligned access address will raise an alignment exception.
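The two new helpers expose LR/SC directly: a load-reservate makes a reservation, a store-conditional returns 0 and writes only while that reservation is still valid, and (per the reworked tests in `cpu_test`) an ordinary store or a trap in between invalidates it. The sketch below is a host-side model of those rules for a single memory word, assuming exactly the behavior the tests check. All `model_*` names and the `lrsc_model_t` type are hypothetical and do not exist in the NEORV32 library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One memory word plus the reservation state that LR/SC operate on. */
typedef struct {
    uint32_t mem;      /* the memory word under test */
    bool     reserved; /* true between a load-reservate and whatever ends it */
} lrsc_model_t;

/* Models lr.w: return the current data and make a reservation. */
uint32_t model_load_reservate(lrsc_model_t *m)
{
    assert(m != NULL);
    m->reserved = true;
    return m->mem;
}

/* Models an ordinary store: writes the data and invalidates the reservation. */
void model_store(lrsc_model_t *m, uint32_t wdata)
{
    assert(m != NULL);
    m->mem = wdata;
    m->reserved = false;
}

/* Models a trap (for example a context switch): the reservation is lost. */
void model_trap(lrsc_model_t *m)
{
    assert(m != NULL);
    m->reserved = false;
}

/* Models sc.w: returns 0 and writes on success, returns 1 and leaves memory untouched on failure. */
uint32_t model_store_conditional(lrsc_model_t *m, uint32_t wdata)
{
    assert(m != NULL);
    const bool ok = m->reserved;
    m->reserved = false; /* every store-conditional ends the reservation */
    if (ok) {
        m->mem = wdata;
        return 0;
    }
    return 1;
}
```

The three test cases in `cpu_test` map onto this model as LR followed by SC (success), LR, plain store, SC (failure, the plain store's data remains), and LR, trap, SC (failure, the original data remains).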
/sw/lib/source/neorv32_cpu.c
322,45 → 322,6
 
 
/**********************************************************************//**
* Atomic compare-and-swap operation (for implemeneting semaphores and mutexes).
*
* @warning This function requires the A (atomic) CPU extension.
*
* @param[in] addr Address of memory location.
* @param[in] expected Expected value (for comparison).
* @param[in] desired Desired value (new value).
* @return Returns 0 on success, 1 on failure.
**************************************************************************/
int __attribute__ ((noinline)) neorv32_cpu_atomic_cas(uint32_t addr, uint32_t expected, uint32_t desired) {
#ifdef __riscv_atomic
 
register uint32_t addr_reg = addr;
register uint32_t des_reg = desired;
register uint32_t tmp_reg;
 
// load original value + reservation (lock)
asm volatile ("lr.w %[result], (%[input])" : [result] "=r" (tmp_reg) : [input] "r" (addr_reg));
 
if (tmp_reg != expected) {
asm volatile ("lw x0, 0(%[input])" : : [input] "r" (addr_reg)); // clear reservation lock
return 1;
}
 
// store-conditional
asm volatile ("sc.w %[result], %[input_i], (%[input_j])" : [result] "=r" (tmp_reg) : [input_i] "r" (des_reg), [input_j] "r" (addr_reg));
 
if (tmp_reg) {
return 1;
}
 
return 0;
#else
return 1; // A extension not implemented - function always fails
#endif
}
 
 
/**********************************************************************//**
* Physical memory protection (PMP): Get number of available regions.
*
* @warning This function overrides all available PMPCFG* CSRs.
/sw/lib/source/neorv32_rte.c
273,8 → 273,9
 
// Processor - general stuff
neorv32_uart_printf("\n=== << General >> ===\n");
neorv32_uart_printf("Clock: %u Hz\n", SYSINFO_CLK);
neorv32_uart_printf("User ID: 0x%x\n", SYSINFO_USER_CODE);
neorv32_uart_printf("Clock: %u Hz\n", SYSINFO_CLK);
neorv32_uart_printf("User ID: 0x%x\n", SYSINFO_USER_CODE);
neorv32_uart_printf("Dedicated HW reset: "); __neorv32_rte_print_true_false(SYSINFO_FEATURES & (1 << SYSINFO_FEATURES_HW_RESET));
 
 
// CPU configuration
/CHANGELOG.md
24,6 → 24,13
 
| Date (*dd.mm.yyyy*) | Version | Comment |
|:----------:|:-------:|:--------|
| 29.04.2021 | 1.5.4.8 | minor edits in CPU instruction fetch engine; reduced **processor-internal bus timeout** (`max_proc_int_response_time_c`) to 15 cycles; added flag to SYSINFO module (`SYSINFO_FEATURES_HW_RESET`) to check if a dedicated hardware reset of all core registers is implemented (via package's `dedicated_reset_c` constant) |
| 28.04.2021 | 1.5.4.7 | :bug: fixed bug in instruction cache (iCACHE) when using two sets - `ICACHE_ASSOCIATIVITY` = 2: cache was corrupting the non-active set |
| 26.04.2021 | 1.5.4.6 | optimized CPU's instruction fetch unit: less overhead for branches, reduced unit's hardware complexity |
| 25.04.2021 | 1.5.4.5 | :sparkles: :warning: removed `cancel` signals from processor-internal bus system; removed CPU's internal bus access timeout counter; added new top generic: `MEM_EXT_TIMEOUT` - type `natural`, default = 255; used to configure optional auto-timeout of Wishbone interface (if an **external** device is not responding within `MEM_EXT_TIMEOUT` clock cycles); set to zero to disable auto-timeout (required to comply with AXI4-Lite specs. when using the top's AXI wrapper) |
| 25.04.2021 | 1.5.4.3 | :sparkles: converted NEORV32.pdf data sheet to [`asciidoc` using asciidoctor](https://asciidoctor.org/); added data sheet sources to [`docs/src_adoc`](https://github.com/stnolting/neorv32/blob/master/docs/src_adoc) |
| 21.04.2021 | 1.5.4.3 | :warning: :bug: reworked *atomic memory access* system due to conceptual design errors: new system will make atomic LR/SC combinations fail when there is a trap (like a context switch) between the two instructions; new system prohibits SC from writing to memory if exclusive access fails; removed top's `wb_tag_i` signal, pruned one bit of top's `wb_tag_o` signal (atomic access), added top's `wb_lock_o` signal; updated sections in NEORV32.pdf regarding atomic memory accesses |
| 19.04.2021 | 1.5.4.1 | added register stage to `MTIME.time` write access to improve timing closure |
| 17.04.2021 | [**:rocket:1.5.4.0**](https://github.com/stnolting/neorv32/releases/tag/v1.5.4.0) | **New release** |
| 16.04.2021 | 1.5.3.13 | :warning: added new top configuration generic `TINY_SHIFT_EN` (type = `boolean`, default = `false`) to configure a tiny single-bit (iterative) shifter for CPU ALU shift operations (for highly area-constrained setups) |
| 16.04.2021 | 1.5.3.12 | :sparkles: reworked reset system of the complete CPU: by default most registers (= "uncritical registers") **do not** provide an initialization via hardware reset; a **defined reset value** can be enabled by setting a constant from the main VHDL package (`rtl/core/neorv32_package.vhd`): `constant dedicated_reset_c : boolean := false;` (set `true` to enable CPU-wide dedicated register reset); see new section "2.11. CPU Hardware Reset" of NEORV32.pdf for more information |
/README.md
6,6 → 6,7
[![riscv-arch-test](https://github.com/stnolting/neorv32/actions/workflows/riscv-arch-test.yml/badge.svg)](https://github.com/stnolting/neorv32/actions/workflows/riscv-arch-test.yml)
[![license](https://img.shields.io/github/license/stnolting/neorv32)](https://github.com/stnolting/neorv32/blob/master/LICENSE)
[![release](https://img.shields.io/github/v/release/stnolting/neorv32)](https://github.com/stnolting/neorv32/releases)
[![datasheet](https://img.shields.io/badge/data%20sheet-NEORV32.pdf-ffbd00)](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf)
 
* [Overview](#Overview)
* [Status](#Status)
28,6 → 29,8
designs or as *ready-to-go* stand-alone custom microcontroller.
 
:books: For detailed information take a look at the [NEORV32 data sheet (pdf)](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf).
The `asciidoc` sources can be found in [`docs/src_adoc`](https://github.com/stnolting/neorv32/blob/master/docs/src_adoc). The latest automatic build
can be downloaded as artifacts from the [_Build Data Sheet_ GitHub workflow](https://github.com/stnolting/neorv32/actions/workflows/build_datasheet.yml).
The doxygen-based documentation of the *software framework* is available online at [GitHub-pages](https://stnolting.github.io/neorv32/files.html).
 
:label: The project’s change log is available as [CHANGELOG.md](https://github.com/stnolting/neorv32/blob/master/CHANGELOG.md) in the root directory of this repository.
99,8 → 102,10
 
### Status
 
The processor is [synthesizable](#FPGA-Implementation-Results) (tested on *real hardware* using Intel Quartus Prime, Xilinx Vivado and Lattice Radiant/Synplify Pro) and can successfully execute
all the [provided example programs](https://github.com/stnolting/neorv32/tree/master/sw/example) including the [CoreMark benchmark](#CoreMark-Benchmark).
The processor is [synthesizable](#FPGA-Implementation-Results) (tested on *real hardware* using Intel Quartus Prime, Xilinx Vivado and Lattice Radiant) and can successfully execute
all the [provided example programs](https://github.com/stnolting/neorv32/tree/master/sw/example) including the [CoreMark benchmark](#CoreMark-Benchmark) and the custom
NEORV32 processor check ([`sw/example/cpu_test`](https://github.com/stnolting/neorv32/tree/master/sw/example/cpu_test), see the status report in the corresponding
[GitHub workflow](https://github.com/stnolting/neorv32/actions/workflows/processor-check.yml)).
 
**RISC-V Architecture Tests**: The processor passes the official `rv32_m/C`, `rv32_m/I`, `rv32_m/M`, `rv32_m/privilege` and `rv32_m/Zifencei`
[riscv-arch-test](https://github.com/riscv/riscv-arch-test) tests. More information regarding the NEORV32 port of the riscv-arch-test test framework can be found in
109,7 → 114,8
| Project component | CI status |
|:----------------- |:----------|
| [NEORV32 processor](https://github.com/stnolting/neorv32) | [![Processor Check](https://github.com/stnolting/neorv32/workflows/Processor%20Check/badge.svg)](https://github.com/stnolting/neorv32/actions?query=workflow%3A%22Processor+Check%22) |
| [SW Framework Documentation (online at GH-pages)](https://stnolting.github.io/neorv32/files.html) | [![Doc@GitHub-pages](https://github.com/stnolting/neorv32/workflows/Deploy%20SW%20Framework%20Documentation%20to%20GitHub-Pages/badge.svg)](https://stnolting.github.io/neorv32/files.html) |
| [SW Framework Documentation (online at GH-pages)](https://stnolting.github.io/neorv32/files.html) | [![Doc@GitHub-pages](https://github.com/stnolting/neorv32/workflows/Deploy%20SW%20Framework%20Documentation%20to%20GitHub-Pages/badge.svg)](https://stnolting.github.io/neorv32/files.html) |
| Build data sheet from `asciidoc` sources | [![Build Data Sheet](https://github.com/stnolting/neorv32/actions/workflows/build_datasheet.yml/badge.svg)](https://github.com/stnolting/neorv32/actions/workflows/build_datasheet.yml) |
| [Pre-built toolchains](https://github.com/stnolting/riscv-gcc-prebuilt) | [![Test Toolchains](https://github.com/stnolting/riscv-gcc-prebuilt/workflows/Test%20Toolchains/badge.svg)](https://github.com/stnolting/riscv-gcc-prebuilt/actions?query=workflow%3A%22Test+Toolchains%22) |
| [RISC-V architecture test](https://github.com/stnolting/neorv32/blob/master/riscv-arch-test/README.md) | [![riscv-arch-test](https://github.com/stnolting/neorv32/actions/workflows/riscv-arch-test.yml/badge.svg)](https://github.com/stnolting/neorv32/actions/workflows/riscv-arch-test.yml) |
 
401,7 → 407,6
**_Notes_**
* The "default" implementation strategy of the respective toolchain is used.
* The Lattice iCE40 UltraPlus setup uses the FPGA's SPRAM memory primitives for the internal IMEM and DMEM (each 64kb).
The FPGA-specific memory components can be found in [`rtl/fpga_specific`](https://github.com/stnolting/neorv32/blob/master/rtl/fpga_specific/lattice_ice40up).
* The clock frequencies marked with a "c" are constrained clocks. The remaining ones are _f_max_ results from the place and route timing reports.
* The Upduino and the Arty board have on-board SPI flash memories for storing the FPGA configuration. These devices can also be used by the default NEORV32
bootloader to store and automatically boot an application program after reset (both tested successfully).
529,6 → 534,22
[:page_facing_up: NEORV32 data sheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf)
 
 
### 0. Build the Documentation
 
This step is optional since there are pre-built versions of the [processor data sheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf)
and the [software documentation](https://stnolting.github.io/neorv32/files.html). If you want to build the documentation yourself:
 
**NEORV32 Data Sheet**
 
To build the data sheet open a console and navigate to the project's `docs` folder. Run `$ sh make_datasheet.sh` (make sure `asciidoctor-pdf` is installed).
This will take all the `asciidoc` sources from [`docs/src_adoc`](https://github.com/stnolting/neorv32/blob/master/docs/src_adoc) to generate `docs/NEORV32.pdf`.
 
**Software Framework Documentation**
 
Make sure `doxygen` is installed. Open a console and navigate to the project's `docs` folder and run `$ doxygen Doxyfile`. This will create (if not already there)
a new folder `docs/doxygen_build/html` where doxygen will generate the HTML-based documentation pages. Open `docs/doxygen_build/html/files.html` to get started.
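
The two documentation builds described above can be combined into one small script. The following is only a sketch, assuming it is sourced from the repository root and that `docs/make_datasheet.sh` and `docs/Doxyfile` exist as described; the helper names `have_tool` and `build_docs` are hypothetical and not part of the project.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, paths are taken from the text above.

# Return success if the given tool can be found on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Build the data sheet and the software documentation, skipping
# whichever part lacks its required tool.
build_docs() {
  (
    cd docs || exit 1

    if have_tool asciidoctor-pdf; then
      sh make_datasheet.sh   # expected to generate docs/NEORV32.pdf
    else
      echo "asciidoctor-pdf not found, skipping data sheet" >&2
    fi

    if have_tool doxygen; then
      doxygen Doxyfile       # expected to generate docs/doxygen_build/html
    else
      echo "doxygen not found, skipping software documentation" >&2
    fi
  )
}
```

After sourcing the file, run `build_docs` from the repository root. The subshell keeps the `cd docs` from changing the caller's working directory.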
 
 
### 1. Get the Toolchain
 
At first you need a **RISC-V GCC toolchain**. You can either [download the sources](https://github.com/riscv/riscv-gnu-toolchain)
