Line 19... |
Line 19... |
|
|
|
|
## Introduction
|
## Introduction
|
|
|
The **NEORV32 processor** is a customizable full-scale microcontroller-like processor system based on the RISC-V-compliant
|
The **NEORV32 processor** is a customizable full-scale microcontroller-like processor system based on the RISC-V-compliant
|
`rv32i` NEORV32 CPU with optional `M`, `E`, `C` and `Zicsr` and `Zifencei` extensions. The CPU was built from scratch and
|
`rv32i` NEORV32 CPU with optional `M`, `E`, `C` and `U`, `Zicsr` and `Zifencei` extensions and optional physical memory protection (PMP).
|
is compliant to the *Unprivileged ISA Specification Version 2.2* and a subset of the *Privileged Architecture
|
The CPU was built from scratch and is compliant to the *Unprivileged ISA Specification Version 2.2* and a subset of the *Privileged Architecture
|
Specification Version 1.12-draft*.
|
Specification Version 1.12-draft*.
|
|
|
The **processor** is intended as an auxiliary processor within larger SoC designs or as a stand-alone
|
The **processor** is intended as an auxiliary processor within larger SoC designs or as a stand-alone
|
custom microcontroller. Its top entity can be directly synthesized for any FPGA without modifications and
|
custom microcontroller. Its top entity can be directly synthesized for any FPGA without modifications and
|
provides a full-scale RISC-V based microcontroller with common peripherals like GPIO, serial interfaces for
|
provides a full-scale RISC-V based microcontroller with common peripherals like GPIO, serial interfaces for
|
Line 43... |
Line 43... |
[compile the GCC toolchains](https://github.com/riscv/riscv-gnu-toolchain) by yourself, you can also
|
[compile the GCC toolchains](https://github.com/riscv/riscv-gnu-toolchain) by yourself, you can also
|
download [pre-compiled toolchains](https://github.com/stnolting/riscv_gcc_prebuilt) for Linux.
|
download [pre-compiled toolchains](https://github.com/stnolting/riscv_gcc_prebuilt) for Linux.
|
|
|
For more information take a look at the [![NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/PDF_32.png) NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf).
|
For more information take a look at the [![NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/PDF_32.png) NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf).
|
|
|
|
### Key Features
|
|
|
|
- RISC-V-compliant `rv32i` CPU with optional `C`, `E`, `M`, `U`, `Zicsr`, `rv32Zifencei` and PMP (physical memory protection) extensions
|
|
- GCC-based toolchain ([pre-compiled rv32i and rv32e toolchains available](https://github.com/stnolting/riscv_gcc_prebuilt))
|
|
- Application compilation based on [GNU makefiles](https://github.com/stnolting/neorv32/blob/master/sw/example/blink_led/makefile)
|
|
- [Doxygen-based](https://github.com/stnolting/neorv32/blob/master/docs/doxygen_makefile_sw) documentation of the software framework: available on [GitHub pages](https://stnolting.github.io/neorv32/files.html)
|
|
- Detailed [datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf) (pdf)
|
|
- Completely described in behavioral, platform-independent VHDL – no primitives, macros, etc.
|
|
- Fully synchronous design, no latches, no gated clocks
|
|
- Small hardware footprint and high operating frequency
|
|
- Highly configurable CPU and processor setup
|
|
|
### Design Principles
|
### Design Principles
|
|
|
* From zero to main(): Completely open source and documented.
|
* From zero to main(): Completely open source and documented.
|
* Plain VHDL without technology-specific parts like attributes, macros or primitives.
|
* Plain VHDL without technology-specific parts like attributes, macros or primitives.
|
Line 73... |
Line 84... |
|
|
* No exception is triggered for the `E` CPU extension when using registers above `x15` (*needs fixing*)
|
* No exception is triggered for the `E` CPU extension when using registers above `x15` (*needs fixing*)
|
* `misa` CSR is read-only - no dynamic enabling/disabling of implemented CPU extensions during runtime
|
* `misa` CSR is read-only - no dynamic enabling/disabling of implemented CPU extensions during runtime
|
* `mcause` CSR is read-only
|
* `mcause` CSR is read-only
|
* The `[m]cycleh` and `[m]instreth` CSR counters are only 20-bit wide (in contrast to original 32-bit)
|
* The `[m]cycleh` and `[m]instreth` CSR counters are only 20-bit wide (in contrast to original 32-bit)
|
|
* The physical memory protection (**PMP**) only supports `NAPOT` mode and only up to 8 regions
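
The reduced counter width noted above means a full cycle or instruction count spans 52 bits (32 low bits plus 20 high bits) instead of 64. The following Python sketch only illustrates the arithmetic. The function names are hypothetical, and it assumes that the unimplemented upper 12 bits of the high word should simply be ignored:

```python
CYCLEH_BITS = 20  # width of the high word, as stated in the list above
COUNTER_BITS = 32 + CYCLEH_BITS  # total usable counter width (52 bits)


def combine_counter(low: int, high: int) -> int:
    """Combine the 32-bit low word and the 20-bit high word into one value."""
    low &= 0xFFFF_FFFF
    # Assumption: the upper 12 bits of the high word carry no information.
    high &= (1 << CYCLEH_BITS) - 1
    return (high << 32) | low


def counter_delta(start: int, end: int) -> int:
    """Elapsed counts between two samples, tolerating one counter wrap."""
    return (end - start) % (1 << COUNTER_BITS)
```

Software that measures long intervals would therefore wrap after 2^52 counts rather than 2^64.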
|
|
|
|
|
### Custom CPU Extensions
|
### Custom CPU Extensions
|
|
|
|
The custom extensions are always enabled and are indicated via the `X` bit in the `misa` CSR.
|
|
|
* Four *fast interrupt* request channels with corresponding control/status bits in `mie` and `mip` and custom exception codes in `mcause`
|
* Four *fast interrupt* request channels with corresponding control/status bits in `mie` and `mip` and custom exception codes in `mcause`
|
|
|
|
|
### To-Do / Wish List
|
### To-Do / Wish List
|
|
|
- Add instructions on how to use the NEORV32 CPU without the processor surroundings
|
|
- Add AXI / AXI-Lite bridges
|
- Add AXI / AXI-Lite bridges
|
- Option to use DSP-based multiplier in `M` extension (would be so much faster)
|
- Option to use DSP-based multiplier in `M` extension (would be so much faster)
|
- Synthesis results for more platforms
|
- Synthesis results for more platforms
|
- Implement user mode (`U` extension)
|
|
- Port Dhrystone benchmark
|
- Port Dhrystone benchmark
|
- Implement atomic operations (`A` extension) and floating-point operations (`F` extension)
|
- Implement atomic operations (`A` extension) and floating-point operations (`F` extension)
|
- Maybe port an RTOS (like [Zephyr](https://github.com/zephyrproject-rtos/zephyr), [freeRTOS](https://www.freertos.org) or [RIOT](https://www.riot-os.org))
|
- Maybe port an RTOS (like [Zephyr](https://github.com/zephyrproject-rtos/zephyr), [freeRTOS](https://www.freertos.org) or [RIOT](https://www.riot-os.org))
|
- Make a 64-bit branch someday
|
- Make a 64-bit branch someday
|
|
|
Line 100... |
Line 112... |
|
|
### Processor Features
|
### Processor Features
|
|
|
![neorv32 Overview](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/neorv32_processor.png)
|
![neorv32 Overview](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/neorv32_processor.png)
|
|
|
- RISC-V-compliant `rv32i` CPU with optional `C`, `E`, `M`, `Zicsr` and `rv32Zifencei` extensions
|
Highly customizable processor configuration:
|
- GCC-based toolchain ([pre-compiled rv32i and rv32e toolchains available](https://github.com/stnolting/riscv_gcc_prebuilt))
|
- Optional processor-internal data and instruction memories (DMEM/IMEM)
|
- Application compilation based on [GNU makefiles](https://github.com/stnolting/neorv32/blob/master/sw/example/blink_led/makefile)
|
- Optional internal bootloader with UART console and automatic SPI flash boot option
|
- [Doxygen-based](https://github.com/stnolting/neorv32/blob/master/docs/doxygen_makefile_sw) documentation of the software framework: available on [GitHub pages](https://stnolting.github.io/neorv32/files.html)
|
- Optional machine system timer (MTIME), RISC-V-compliant
|
- Detailed [datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf) (pdf)
|
- Optional universal asynchronous receiver and transmitter (UART)
|
- Completely described in behavioral, platform-independent VHDL – no primitives, macros, etc.
|
- Optional 8/16/24/32-bit serial peripheral interface controller (SPI) with 8 dedicated chip select lines
|
- Fully synchronous design, no latches, no gated clocks
|
- Optional two wire serial interface controller (TWI), compatible to the I²C standard
|
- Small hardware footprint and high operating frequency
|
- Optional general purpose parallel IO port (GPIO), 16xOut & 16xIn, with pin-change interrupt
|
- Highly customizable processor configuration
|
- Optional 32-bit external bus interface, Wishbone b4 compliant (WISHBONE)
|
- _Optional_ processor-internal data and instruction memories (DMEM/IMEM)
|
- Optional watchdog timer (WDT)
|
- _Optional_ internal bootloader with UART console and automatic SPI flash boot option
|
- Optional PWM controller with 4 channels and 8-bit duty cycle resolution (PWM)
|
- _Optional_ machine system timer (MTIME), RISC-V-compliant
|
- Optional GARO-based true random number generator (TRNG)
|
- _Optional_ universal asynchronous receiver and transmitter (UART)
|
- Optional dummy device (DEVNULL) (can be used for *fast* simulation console output)
|
- _Optional_ 8/16/24/32-bit serial peripheral interface controller (SPI) with 8 dedicated chip select lines
|
|
- _Optional_ two wire serial interface controller (TWI), compatible to the I²C standard
|
|
- _Optional_ general purpose parallel IO port (GPIO), 16xOut & 16xIn, with pin-change interrupt
|
|
- _Optional_ 32-bit external bus interface, Wishbone b4 compliant (WISHBONE)
|
|
- _Optional_ watchdog timer (WDT)
|
|
- _Optional_ PWM controller with 4 channels and 8-bit duty cycle resolution (PWM)
|
|
- _Optional_ GARO-based true random number generator (TRNG)
|
|
- _Optional_ dummy device (DEVNULL) (can be used for *fast* simulation console output)
|
|
- System configuration information memory to check hardware configuration by software (SYSINFO)
|
- System configuration information memory to check hardware configuration by software (SYSINFO)
|
|
|
### CPU Features
|
### CPU Features
|
|
|
![neorv32 Overview](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/neorv32_cpu.png)
|
![neorv32 Overview](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/neorv32_cpu.png)
|
Line 138... |
Line 142... |
|
|
|
|
**General**:
|
**General**:
|
* Modified Harvard architecture (separate CPU interfaces for data and instructions; single processor-bus via bus switch)
|
* Modified Harvard architecture (separate CPU interfaces for data and instructions; single processor-bus via bus switch)
|
* Two-stage in-order pipeline (FETCH, EXECUTE); each stage uses a multi-cycle processing scheme
|
* Two-stage in-order pipeline (FETCH, EXECUTE); each stage uses a multi-cycle processing scheme
|
* No hardware support of unaligned accesses - they will trigger and exception
|
* No hardware support of unaligned accesses - they will trigger an exception
|
|
* Privilege levels: `machine` mode, `user` mode (if enabled via `U` extension)
|
|
|
|
|
**RV32I base instruction set** (`I` extension):
|
**RV32I base instruction set** (`I` extension):
|
* ALU instructions: `LUI` `AUIPC` `ADDI` `SLTI` `SLTIU` `XORI` `ORI` `ANDI` `SLLI` `SRLI` `SRAI` `ADD` `SUB` `SLL` `SLT` `SLTU` `XOR` `SRL` `SRA` `OR` `AND`
|
* ALU instructions: `LUI` `AUIPC` `ADDI` `SLTI` `SLTIU` `XORI` `ORI` `ANDI` `SLLI` `SRLI` `SRAI` `ADD` `SUB` `SLL` `SLT` `SLTU` `XOR` `SRL` `SRA` `OR` `AND`
|
* Jump and branch instructions: `JAL` `JALR` `BEQ` `BNE` `BLT` `BGE` `BLTU` `BGEU`
|
* Jump and branch instructions: `JAL` `JALR` `BEQ` `BNE` `BLT` `BGE` `BLTU` `BGEU`
|
Line 176... |
Line 181... |
* Load address misaligned
|
* Load address misaligned
|
* Load access fault
|
* Load access fault
|
* Store address misaligned
|
* Store address misaligned
|
* Store access fault
|
* Store access fault
|
* Environment call from M-mode (via `ecall` instruction)
|
* Environment call from M-mode (via `ecall` instruction)
|
* Machine timer interrupt `mti` (via MTIME unit)
|
* Machine timer interrupt `mti` (via processor's MTIME unit)
|
* Machine external interrupt `mei` (via CLIC unit)
|
* Machine software interrupt `msi` (via external signal)
|
|
* Machine external interrupt `mei` (via external signal)
|
|
* Four fast interrupt requests (custom extension)
|
|
|
|
**Privileged architecture / User mode** (`U` extension, requires `Zicsr` extension):
|
|
* Privilege levels: `M-mode` (Machine mode) + `U-Mode` (User mode)
|
|
|
**Privileged architecture / FENCE.I** (`Zifencei` extension):
|
**Privileged architecture / FENCE.I** (`Zifencei` extension):
|
* System instructions: `FENCE.I`
|
* System instructions: `FENCEI`
|
|
|
|
**Physical memory protection** (`PMP`, requires `Zicsr` extension):
|
|
* Additional machine CSRs: `pmpcfgx` `pmpaddrx`
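
Since only `NAPOT` regions are supported, each `pmpaddrx` value encodes both the base and the size of a naturally aligned power-of-two region. The sketch below follows the generic NAPOT encoding from the RISC-V privileged specification. The helper names are hypothetical, and the NEORV32-specific minimum granularity is not modeled here:

```python
def napot_pmpaddr(base: int, size: int) -> int:
    """Encode a naturally aligned power-of-two region as a pmpaddr value."""
    if size < 8 or size & (size - 1):
        raise ValueError("NAPOT size must be a power of two of at least 8 bytes")
    if base % size:
        raise ValueError("NAPOT base must be aligned to the region size")
    # pmpaddr holds address bits [33:2]; the trailing ones encode the size.
    return (base >> 2) | ((size >> 3) - 1)


def napot_decode(pmpaddr: int) -> tuple:
    """Recover (base, size) in bytes from a NAPOT pmpaddr value."""
    ones = 0
    while (pmpaddr >> ones) & 1:  # count trailing one bits
        ones += 1
    size = 1 << (ones + 3)
    base = (pmpaddr & ~((1 << ones) - 1)) << 2
    return base, size
```

For instance, a 4 KiB region at `0x80000000` encodes to `0x200001FF`, and decoding that value yields the original base and size.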
|
|
|
|
|
## FPGA Implementation Results
|
## FPGA Implementation Results
|
|
|
This chapter shows exemplary implementation results of the NEORV32 processor for an **Intel Cyclone IV EP4CE22F17C6N FPGA** on
|
This chapter shows exemplary implementation results of the NEORV32 processor for an **Intel Cyclone IV EP4CE22F17C6N FPGA** on
|
a DE0-nano board. The design was synthesized using **Intel Quartus Prime Lite 19.1** ("balanced implementation"). The timing
|
a DE0-nano board. The design was synthesized using **Intel Quartus Prime Lite 19.1** ("balanced implementation"). The timing
|
information is derived from the Timing Analyzer / Slow 1200mV 0C Model. If not otherwise specified, the default configuration
|
information is derived from the Timing Analyzer / Slow 1200mV 0C Model. If not otherwise specified, the default configuration
|
of the processor's generics is assumed. No constraints were used at all.
|
of the CPU's generics is assumed (no PMP). No constraints were used at all.
|
|
|
### CPU
|
### CPU
|
|
|
Results generated for hardware version: `1.3.0.0`
|
Results generated for hardware version: `1.3.0.0`
|
|
|
Line 227... |
Line 240... |
|
|
|
|
### Exemplary FPGA Setups
|
### Exemplary FPGA Setups
|
|
|
Exemplary implementation results for different FPGA platforms. The processor setup uses *all provided peripherals*,
|
Exemplary implementation results for different FPGA platforms. The processor setup uses *all provided peripherals*,
|
no external memory interface and only internal instruction and data memories. IMEM uses 16kB and DMEM uses 8kB memory space. The setup's top entity connects most of the
|
no external memory interface, no PMP and only internal instruction and data memories. IMEM uses 16kB and DMEM uses 8kB memory space. The setup's top entity connects most of the
|
processor's [top entity](https://github.com/stnolting/neorv32/blob/master/rtl/core/neorv32_top.vhd) signals
|
processor's [top entity](https://github.com/stnolting/neorv32/blob/master/rtl/core/neorv32_top.vhd) signals
|
to FPGA pins - except for the Wishbone bus and the interrupt signals.
|
to FPGA pins - except for the Wishbone bus and the interrupt signals.
|
|
|
Results generated for hardware version: `1.3.0.0`
|
Results generated for hardware version: `1.3.0.0`
|
|
|
Line 323... |
Line 336... |
USER_CODE : std_ulogic_vector(31 downto 0) := x"00000000"; -- custom user code
|
USER_CODE : std_ulogic_vector(31 downto 0) := x"00000000"; -- custom user code
|
-- RISC-V CPU Extensions --
|
-- RISC-V CPU Extensions --
|
CPU_EXTENSION_RISCV_C : boolean := false; -- implement compressed extension?
|
CPU_EXTENSION_RISCV_C : boolean := false; -- implement compressed extension?
|
CPU_EXTENSION_RISCV_E : boolean := false; -- implement embedded RF extension?
|
CPU_EXTENSION_RISCV_E : boolean := false; -- implement embedded RF extension?
|
CPU_EXTENSION_RISCV_M : boolean := false; -- implement mul/div extension?
|
CPU_EXTENSION_RISCV_M : boolean := false; -- implement mul/div extension?
|
|
CPU_EXTENSION_RISCV_U : boolean := false; -- implement user mode extension?
|
CPU_EXTENSION_RISCV_Zicsr : boolean := true; -- implement CSR system?
|
CPU_EXTENSION_RISCV_Zicsr : boolean := true; -- implement CSR system?
|
CPU_EXTENSION_RISCV_Zifencei : boolean := true; -- implement instruction stream sync.?
|
CPU_EXTENSION_RISCV_Zifencei : boolean := true; -- implement instruction stream sync.?
|
|
-- Physical Memory Protection (PMP) --
|
|
PMP_USE : boolean := false; -- implement PMP?
|
|
PMP_NUM_REGIONS : natural := 4; -- number of regions (max 16)
|
|
PMP_GRANULARITY : natural := 15; -- region granularity (1=8B, 2=16B, 3=32B, ...) default is 64k
|
-- Memory configuration: Instruction memory --
|
-- Memory configuration: Instruction memory --
|
MEM_ISPACE_BASE : std_ulogic_vector(31 downto 0) := x"00000000"; -- base address of instruction memory space
|
MEM_ISPACE_BASE : std_ulogic_vector(31 downto 0) := x"00000000"; -- base address of instruction memory space
|
MEM_ISPACE_SIZE : natural := 16*1024; -- total size of instruction memory space in byte
|
MEM_ISPACE_SIZE : natural := 16*1024; -- total size of instruction memory space in byte
|
MEM_INT_IMEM_USE : boolean := true; -- implement processor-internal instruction memory
|
MEM_INT_IMEM_USE : boolean := true; -- implement processor-internal instruction memory
|
MEM_INT_IMEM_SIZE : natural := 16*1024; -- size of processor-internal instruction memory in bytes
|
MEM_INT_IMEM_SIZE : natural := 16*1024; -- size of processor-internal instruction memory in bytes
|
Line 404... |
Line 422... |
CPU_BOOT_ADDR : std_ulogic_vector(31 downto 0):= (others => '0'); -- cpu boot address
|
CPU_BOOT_ADDR : std_ulogic_vector(31 downto 0):= (others => '0'); -- cpu boot address
|
-- RISC-V CPU Extensions --
|
-- RISC-V CPU Extensions --
|
CPU_EXTENSION_RISCV_C : boolean := false; -- implement compressed extension?
|
CPU_EXTENSION_RISCV_C : boolean := false; -- implement compressed extension?
|
CPU_EXTENSION_RISCV_E : boolean := false; -- implement embedded RF extension?
|
CPU_EXTENSION_RISCV_E : boolean := false; -- implement embedded RF extension?
|
CPU_EXTENSION_RISCV_M : boolean := false; -- implement mul/div extension?
|
CPU_EXTENSION_RISCV_M : boolean := false; -- implement mul/div extension?
|
|
CPU_EXTENSION_RISCV_U : boolean := false; -- implement user mode extension?
|
CPU_EXTENSION_RISCV_Zicsr : boolean := true; -- implement CSR system?
|
CPU_EXTENSION_RISCV_Zicsr : boolean := true; -- implement CSR system?
|
CPU_EXTENSION_RISCV_Zifencei : boolean := true; -- implement instruction stream sync.?
|
CPU_EXTENSION_RISCV_Zifencei : boolean := true; -- implement instruction stream sync.?
|
|
-- Physical Memory Protection (PMP) --
|
|
PMP_USE : boolean := false; -- implement PMP?
|
|
PMP_NUM_REGIONS : natural := 4; -- number of regions (max 16)
|
|
PMP_GRANULARITY : natural := 15; -- region granularity (1=8B, 2=16B, 3=32B, ...) default is 64k
|
-- Bus Interface --
|
-- Bus Interface --
|
BUS_TIMEOUT : natural := 15 -- cycles after which a valid bus access will timeout
|
BUS_TIMEOUT : natural := 15 -- cycles after which a valid bus access will timeout
|
);
|
);
|
port (
|
port (
|
-- global control --
|
-- global control --
|
Line 467... |
Line 490... |
`mul`/`div` instructions! Hence, this code cannot be executed (without emulation) on an architecture without these extensions!
|
`mul`/`div` instructions! Hence, this code cannot be executed (without emulation) on an architecture without these extensions!
|
|
|
To build the toolchain by yourself, follow the official [build instructions](https://github.com/riscv/riscv-gnu-toolchain).
|
To build the toolchain by yourself, follow the official [build instructions](https://github.com/riscv/riscv-gnu-toolchain).
|
Make sure to use the `ilp32` or `ilp32e` ABI.
|
Make sure to use the `ilp32` or `ilp32e` ABI.
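
The ABI choice above follows from the CPU configuration: `ilp32e` goes with the `E` extension, `ilp32` with the regular register file. As a rough illustration, the hypothetical helper below derives matching GCC `-march`/`-mabi` flags from the enabled `M`, `C` and `E` extensions. It is only a sketch: how `Zicsr` and `Zifencei` are spelled in `-march` depends on the GCC version, so they are left out here.

```python
def gcc_target_flags(m: bool = False, c: bool = False, e: bool = False) -> list:
    """Build -march/-mabi flags for a 32-bit RISC-V target (illustrative only)."""
    base = "e" if e else "i"  # embedded (16 registers) or full register file
    march = "rv32" + base + ("m" if m else "") + ("c" if c else "")
    mabi = "ilp32e" if e else "ilp32"
    return [f"-march={march}", f"-mabi={mabi}"]


# Example: a CPU configured with the M and C extensions
print(" ".join(gcc_target_flags(m=True, c=True)))
```

Code compiled with these flags only uses instructions from the listed extensions, which avoids the `mul`/`div` problem described above when `M` is not implemented.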
|
|
|
Alternatively, you can download a prebuilt toolchain. I have uploaded the toolchain(s) I am using to GitHub. This toolchain
|
**Alternatively**, you can download a prebuilt toolchain. I have uploaded the toolchains I am using to GitHub. These toolchains
|
has been compiled on a 64-bit x86 Ubuntu (Ubuntu on Windows, actually). Download the toolchain of choice:
|
were compiled on a 64-bit x86 Ubuntu 20.04 LTS (Ubuntu on Windows, actually). Download the toolchain of choice:
|
|
|
[https://github.com/stnolting/riscv_gcc_prebuilt](https://github.com/stnolting/riscv_gcc_prebuilt)
|
[https://github.com/stnolting/riscv_gcc_prebuilt](https://github.com/stnolting/riscv_gcc_prebuilt)
|
|
|
|
|
### Download the NEORV32 and Create a Hardware Project
|
### Download the NEORV32 and Create a Hardware Project
|