Line 61... |
Line 61... |
with the NEORV32. If you do not want to [compile the GCC toolchains](https://github.com/riscv/riscv-gnu-toolchain) by yourself, you can also
|
with the NEORV32. If you do not want to [compile the GCC toolchains](https://github.com/riscv/riscv-gnu-toolchain) by yourself, you can also
|
download [pre-compiled toolchains](https://github.com/stnolting/riscv_gcc_prebuilt) for Linux.
|
download [pre-compiled toolchains](https://github.com/stnolting/riscv_gcc_prebuilt) for Linux.
|
|
|
For more information take a look at the [![NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/PDF_32.png) NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf).
|
For more information take a look at the [![NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/PDF_32.png) NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf).
|
|
|
This project is hosted on [GitHub](https://github.com/stnolting/neorv32) and [opencores.org](https://opencores.org/projects/neorv32).
|
|
A not-so-complete project log can be found on [hackaday.io](https://hackaday.io/project/174167-the-neorv32-risc-v-processor).
|
|
|
|
|
|
### Key Features
|
### Key Features
|
|
|
- RISC-V-compliant `rv32i` CPU with optional `C`, `E`, `M`, `U`, `Zicsr`, `Zifencei` and `PMP` (physical memory protection) extensions
|
- RISC-V-compliant `rv32i` CPU with optional `C`, `E`, `M`, `U`, `Zicsr`, `Zifencei` and `PMP` (physical memory protection) extensions
|
- GCC-based toolchain ([pre-compiled rv32i and rv32e toolchains available](https://github.com/stnolting/riscv_gcc_prebuilt))
|
- GCC-based toolchain ([pre-compiled rv32i and rv32e toolchains available](https://github.com/stnolting/riscv_gcc_prebuilt))
|
- Application compilation based on [GNU makefiles](https://github.com/stnolting/neorv32/blob/master/sw/example/blink_led/makefile)
|
- Application compilation based on [GNU makefiles](https://github.com/stnolting/neorv32/blob/master/sw/example/blink_led/makefile)
|
- [Doxygen-based](https://github.com/stnolting/neorv32/blob/master/docs/doxygen_makefile_sw) documentation of the software framework: available on [GitHub pages](https://stnolting.github.io/neorv32/files.html)
|
- [Doxygen-based](https://github.com/stnolting/neorv32/blob/master/docs/doxygen_makefile_sw) documentation of the software framework: available on [GitHub pages](https://stnolting.github.io/neorv32/files.html)
|
- Detailed [datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf) (pdf)
|
- [**Detailed data sheet**](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf) (pdf)
|
- Completely described in behavioral, platform-independent VHDL – no primitives, macros, etc.
|
- Completely described in behavioral, platform-independent VHDL – no primitives, macros, etc.
|
- Fully synchronous design, no latches, no gated clocks
|
- Fully synchronous design, no latches, no gated clocks
|
- Small hardware footprint and high operating frequency
|
- Small hardware footprint and high operating frequency
|
- Highly configurable CPU and processor setup
|
- Highly configurable CPU and processor setup
|
- [FreeRTOS port](https://github.com/stnolting/neorv32/blob/master/sw/example/demo_freeRTOS) available
|
- [FreeRTOS port](https://github.com/stnolting/neorv32/blob/master/sw/example/demo_freeRTOS) available
|
Line 91... |
Line 88... |
* The processor has to fit in a Lattice iCE40 UltraPlus 5k FPGA running at 20+ MHz.
|
* The processor has to fit in a Lattice iCE40 UltraPlus 5k FPGA running at 20+ MHz.
|
|
|
|
|
### Status
|
### Status
|
|
|
The processor is [synthesizable](#NEORV32-Processor-Exemplary-FPGA-Setups) (tested with *real hardware* using Intel Quartus Prime, Xilinx Vivado and Lattice Radiant/Synplify Pro) and can successfully execute
|
The processor is [synthesizable](#FPGA-Implementation-Results) (tested on *real hardware* using Intel Quartus Prime, Xilinx Vivado and Lattice Radiant/Synplify Pro) and can successfully execute
|
all the [provided example programs](https://github.com/stnolting/neorv32/tree/master/sw/example) including the [CoreMark benchmark](#CoreMark-Benchmark).
|
all the [provided example programs](https://github.com/stnolting/neorv32/tree/master/sw/example) including the [CoreMark benchmark](#CoreMark-Benchmark).
|
|
|
The processor passes the official `rv32i`, `rv32im`, `rv32imc`, `rv32Zicsr` and `rv32Zifencei` [RISC-V compliance tests](https://github.com/riscv/riscv-compliance).
|
The processor passes the official `rv32i`, `rv32im`, `rv32imc`, `rv32Zicsr` and `rv32Zifencei` [RISC-V compliance tests](https://github.com/riscv/riscv-compliance).
|
|
|
|
The project’s change log is available in the [CHANGELOG.md](https://github.com/stnolting/neorv32/blob/master/CHANGELOG.md) file in the root directory of this repository.
|
|
|
| Project component | CI status | Note |
|
| Project component | CI status | Note |
|
|:--------------------------------------------------------------------------------|:----------|:---------|
|
|:--------------------------------------------------------------------------------|:----------|:---------|
|
| [NEORV32 processor](https://github.com/stnolting/neorv32) | [![Build Status](https://travis-ci.com/stnolting/neorv32.svg?branch=master)](https://travis-ci.com/stnolting/neorv32) | [![sw doc](https://img.shields.io/badge/SW%20documentation-gh--pages-blue)](https://stnolting.github.io/neorv32/files.html) |
|
| [NEORV32 processor](https://github.com/stnolting/neorv32) | [![Build Status](https://travis-ci.com/stnolting/neorv32.svg?branch=master)](https://travis-ci.com/stnolting/neorv32) | [![sw doc](https://img.shields.io/badge/SW%20documentation-gh--pages-blue)](https://stnolting.github.io/neorv32/files.html) |
|
| [Pre-built toolchain](https://github.com/stnolting/riscv_gcc_prebuilt) | [![Build Status](https://travis-ci.com/stnolting/riscv_gcc_prebuilt.svg?branch=master)](https://travis-ci.com/stnolting/riscv_gcc_prebuilt) | |
|
| [Pre-built toolchain](https://github.com/stnolting/riscv_gcc_prebuilt) | [![Build Status](https://travis-ci.com/stnolting/riscv_gcc_prebuilt.svg?branch=master)](https://travis-ci.com/stnolting/riscv_gcc_prebuilt) | |
|
| [RISC-V compliance test](https://github.com/stnolting/neorv32_riscv_compliance) | [![Build Status](https://travis-ci.com/stnolting/neorv32_riscv_compliance.svg?branch=master)](https://travis-ci.com/stnolting/neorv32_riscv_compliance) | |
|
| [RISC-V compliance test](https://github.com/stnolting/neorv32_riscv_compliance) | [![Build Status](https://travis-ci.com/stnolting/neorv32_riscv_compliance.svg?branch=master)](https://travis-ci.com/stnolting/neorv32_riscv_compliance) | |
|
Line 117... |
Line 116... |
- ...
|
- ...
|
|
|
|
|
## Features
|
## Features
|
|
|
|
The full-blown data sheet of the NEORV32 Processor/CPU is available as pdf file:
|
|
[![NEORV32 data sheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/PDF_32.png) NEORV32 data sheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf).
|
|
|
### Processor Features
|
### Processor Features
|
|
|
![neorv32 Overview](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/neorv32_processor.png)
|
![neorv32 Overview](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/neorv32_processor.png)
|
|
|
The NEORV32 Processor provides a full-scale microcontroller-like SoC based on the NEORV32 CPU. The setup
|
The NEORV32 Processor provides a full-scale microcontroller-like SoC based on the NEORV32 CPU. The setup
|
Line 131... |
Line 133... |
- Optional machine system timer (**MTIME**), RISC-V-compliant
|
- Optional machine system timer (**MTIME**), RISC-V-compliant
|
- Optional universal asynchronous receiver and transmitter (**UART**) with simulation output option via text.io
|
- Optional universal asynchronous receiver and transmitter (**UART**) with simulation output option via text.io
|
- Optional 8/16/24/32-bit serial peripheral interface controller (**SPI**) with 8 dedicated chip select lines
|
- Optional 8/16/24/32-bit serial peripheral interface controller (**SPI**) with 8 dedicated chip select lines
|
- Optional two wire serial interface controller (**TWI**), compatible to the I²C standard
|
- Optional two wire serial interface controller (**TWI**), compatible to the I²C standard
|
- Optional general purpose parallel IO port (**GPIO**), 32xOut & 32xIn, with pin-change interrupt
|
- Optional general purpose parallel IO port (**GPIO**), 32xOut & 32xIn, with pin-change interrupt
|
- Optional 32-bit external bus interface, Wishbone b4 compliant (**WISHBONE**)
|
- Optional 32-bit external bus interface, Wishbone b4 compliant (**WISHBONE**), *standard* or *pipelined* handshake/transactions mode
|
- Optional watchdog timer (**WDT**)
|
- Optional watchdog timer (**WDT**)
|
- Optional PWM controller with 4 channels and 8-bit duty cycle resolution (**PWM**)
|
- Optional PWM controller with 4 channels and 8-bit duty cycle resolution (**PWM**)
|
- Optional GARO-based true random number generator (**TRNG**)
|
- Optional GARO-based true random number generator (**TRNG**)
|
- Optional custom functions unit (**CFU**) for tightly-coupled custom co-processors
|
- Optional custom functions unit (**CFU**) for tightly-coupled custom co-processors
|
- System configuration information memory to check hardware configuration by software (**SYSINFO**)
|
- System configuration information memory to check hardware configuration by software (**SYSINFO**)
|
Line 147... |
Line 149... |
The CPU is [compliant](https://github.com/stnolting/neorv32_riscv_compliance) to the
|
The CPU is [compliant](https://github.com/stnolting/neorv32_riscv_compliance) to the
|
[official RISC-V specifications (2.2)](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/riscv-spec.pdf) including a subset of the
|
[official RISC-V specifications (2.2)](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/riscv-spec.pdf) including a subset of the
|
[RISC-V privileged architecture specifications (1.12-draft)](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/riscv-spec.pdf).
|
[RISC-V privileged architecture specifications (1.12-draft)](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/riscv-spec.pdf).
|
|
|
More information regarding the CPU including a detailed list of the instruction set and the available CSRs can be found in
|
More information regarding the CPU including a detailed list of the instruction set and the available CSRs can be found in
|
the [![NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/PDF_32.png) NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf).
|
the [NEORV32 data sheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf).
|
|
|
|
|
**General**:
|
**General**:
|
* Modified Harvard architecture (separate CPU interfaces for data and instructions; NEORV32 processor: Single processor-internal bus via I/D mux)
|
* Modified Harvard architecture (separate CPU interfaces for data and instructions; NEORV32 processor: Single processor-internal bus via I/D mux)
|
* Two stages in-order pipeline (FETCH, EXECUTE); each stage uses a multi-cycle processing scheme
|
* Two stages in-order pipeline (FETCH, EXECUTE); each stage uses a multi-cycle processing scheme
|
Line 235... |
Line 237... |
This chapter shows exemplary implementation results of the NEORV32 CPU for an **Intel Cyclone IV EP4CE22F17C6N FPGA** on
|
This chapter shows exemplary implementation results of the NEORV32 CPU for an **Intel Cyclone IV EP4CE22F17C6N FPGA** on
|
a DE0-nano board. The design was synthesized using **Intel Quartus Prime Lite 19.1** ("balanced implementation"). The timing
|
a DE0-nano board. The design was synthesized using **Intel Quartus Prime Lite 19.1** ("balanced implementation"). The timing
|
information is derived from the Timing Analyzer / Slow 1200mV 0C Model. If not otherwise specified, the default configuration
|
information is derived from the Timing Analyzer / Slow 1200mV 0C Model. If not otherwise specified, the default configuration
|
of the CPU's generics is assumed (for example no PMP). No constraints were used at all.
|
of the CPU's generics is assumed (for example no PMP). No constraints were used at all.
|
|
|
Results generated for hardware version: `1.4.3.3`
|
Results generated for hardware version: `1.4.4.8`
|
|
|
| CPU Configuration | LEs | FFs | Memory bits | DSPs | f_max |
|
| CPU Configuration | LEs | FFs | Memory bits | DSPs | f_max |
|
|:---------------------------------------|:----------:|:--------:|:-----------:|:----:|:-------:|
|
|:---------------------------------------|:----------:|:--------:|:-----------:|:----:|:--------:|
|
| `rv32i` | 1033 | 567 | 2048 | 0 | 120 MHz |
|
| `rv32i` | 983 | 438 | 2048 | 0 | ~120 MHz |
|
| `rv32i` + `u` + `Zicsr` + `Zifencei` | 1778 | 806 | 2048 | 0 | 103 MHz |
|
| `rv32i` + `u` + `Zicsr` + `Zifencei` | 1877 | 802 | 2048 | 0 | ~112 MHz |
|
| `rv32im` + `u` + `Zicsr` + `Zifencei` | 2389 | 1052 | 2048 | 0 | 102 MHz |
|
| `rv32im` + `u` + `Zicsr` + `Zifencei` | 2374 | 1048 | 2048 | 0 | ~110 MHz |
|
| `rv32imc` + `u` + `Zicsr` + `Zifencei` | 2644 | 1053 | 2048 | 0 | 106 MHz |
|
| `rv32imc` + `u` + `Zicsr` + `Zifencei` | 2650 | 1064 | 2048 | 0 | ~110 MHz |
|
| `rv32emc` + `u` + `Zicsr` + `Zifencei` | 2646 | 1050 | 1024 | 0 | 103 MHz |
|
| `rv32emc` + `u` + `Zicsr` + `Zifencei` | 2680 | 1061 | 1024 | 0 | ~110 MHz |
|
|
|
|
|
### NEORV32 Processor-Internal Peripherals and Memories
|
### NEORV32 Processor-Internal Peripherals and Memories
|
|
|
Results generated for hardware version: `1.4.3.3`
|
Results generated for hardware version: `1.4.4.8`
|
|
|
| Module | Description | LEs | FFs | Memory bits | DSPs |
|
| Module | Description | LEs | FFs | Memory bits | DSPs |
|
|:----------|:-----------------------------------------------------|:---:|:---:|:-----------:|:----:|
|
|:----------|:-----------------------------------------------------|----:|----:|------------:|-----:|
|
| BOOT ROM | Bootloader ROM (default 4kB) | 3 | 1 | 32 768 | 0 |
|
| BOOT ROM | Bootloader ROM (default 4kB) | 4 | 1 | 32 768 | 0 |
|
| BUSSWITCH | Mux for CPU I & D interfaces | 59 | 8 | 0 | 0 |
|
| BUSSWITCH | Mux for CPU I & D interfaces | 62 | 8 | 0 | 0 |
|
| CFU | Custom functions unit | - | - | - | - |
|
| CFU | Custom functions unit | - | - | - | - |
|
| DMEM | Processor-internal data memory (default 8kB) | 13 | 2 | 65 536 | 0 |
|
| DMEM | Processor-internal data memory (default 8kB) | 13 | 2 | 65 536 | 0 |
|
| GPIO | General purpose input/output ports | 69 | 65 | 0 | 0 |
|
| GPIO | General purpose input/output ports | 66 | 65 | 0 | 0 |
|
| IMEM | Processor-internal instruction memory (default 16kb) | 9 | 2 | 131 072 | 0 |
|
| IMEM | Processor-internal instruction memory (default 16kb) | 7 | 2 | 131 072 | 0 |
|
| MTIME | Machine system timer | 281 | 166 | 0 | 0 |
|
| MTIME | Machine system timer | 268 | 166 | 0 | 0 |
|
| PWM | Pulse-width modulation controller | 72 | 69 | 0 | 0 |
|
| PWM | Pulse-width modulation controller | 72 | 69 | 0 | 0 |
|
| SPI | Serial peripheral interface | 189 | 125 | 0 | 0 |
|
| SPI | Serial peripheral interface | 184 | 125 | 0 | 0 |
|
| SYSINFO | System configuration information memory | 10 | 9 | 0 | 0 |
|
| SYSINFO | System configuration information memory | 11 | 9 | 0 | 0 |
|
| TRNG | True random number generator | 175 | 132 | 0 | 0 |
|
| TRNG | True random number generator | 132 | 105 | 0 | 0 |
|
| TWI | Two-wire interface | 72 | 44 | 0 | 0 |
|
| TWI | Two-wire interface | 74 | 44 | 0 | 0 |
|
| UART | Universal asynchronous receiver/transmitter | 175 | 132 | 0 | 0 |
|
| UART | Universal asynchronous receiver/transmitter | 175 | 132 | 0 | 0 |
|
| WDT | Watchdog timer | 60 | 45 | 0 | 0 |
|
| WDT | Watchdog timer | 58 | 45 | 0 | 0 |
|
|
| WISHBONE | External memory interface (`MEM_EXT_REG_STAGES` = 2) | 106 | 104 | 0 | 0 |
|
|
|
|
|
### NEORV32 Processor - Exemplary FPGA Setups
|
### NEORV32 Processor - Exemplary FPGA Setups
|
|
|
Exemplary processor implementation results for different FPGA platforms. The processor setup uses *all provided peripherals* (but not the _CFU_),
|
Exemplary processor implementation results for different FPGA platforms. The processor setup uses *the default peripheral configuration* (like no _CFU_ and no _TRNG_),
|
no external memory interface and only internal instruction and data memories. IMEM uses 16kB and DMEM uses 8kB memory space. The setup's top entity connects most of the
|
no external memory interface and only internal instruction and data memories. IMEM uses 16kB and DMEM uses 8kB memory space. The setup's top entity connects most of the
|
processor's [top entity](https://github.com/stnolting/neorv32/blob/master/rtl/core/neorv32_top.vhd) signals
|
processor's [top entity](https://github.com/stnolting/neorv32/blob/master/rtl/core/neorv32_top.vhd) signals
|
to FPGA pins - except for the Wishbone bus and the interrupt signals.
|
to FPGA pins - except for the Wishbone bus and the interrupt signals.
|
|
|
Results generated for hardware version: `1.4.3.3`
|
Results generated for hardware version: `1.4.4.8`
|
|
|
| Vendor | FPGA | Board | Toolchain | Strategy | CPU Configuration | LUT / LE | FF / REG | DSP | Memory Bits | BRAM / EBR | SPRAM | Frequency |
|
| Vendor | FPGA | Board | Toolchain | Strategy | CPU Configuration | LUT / LE | FF / REG | DSP | Memory Bits | BRAM / EBR | SPRAM | Frequency |
|
|:--------|:----------------------------------|:-----------------|:---------------------------|:-------- |:-----------------------------------------------|:-----------|:-----------|:-------|:-------------|:-----------|:---------|--------------:|
|
|:--------|:----------------------------------|:-----------------|:---------------------------|:-------- |:-----------------------------------------------|:-----------|:-----------|:-------|:-------------|:-----------|:---------|--------------:|
|
| Intel | Cyclone IV `EP4CE22F17C6N` | Terasic DE0-Nano | Quartus Prime Lite 19.1 | balanced | `rv32imc` + `u` + `Zicsr` + `Zifencei` + `PMP` | 4120 (18%) | 1944 (9%) | 0 (0%) | 231424 (38%) | - | - | 103 MHz |
|
| Intel | Cyclone IV `EP4CE22F17C6N` | Terasic DE0-Nano | Quartus Prime Lite 19.1 | balanced | `rv32imc` + `u` + `Zicsr` + `Zifencei` + `PMP` | 4008 (18%) | 1849 (9%) | 0 (0%) | 231424 (38%) | - | - | 105 MHz |
|
| Lattice | iCE40 UltraPlus `iCE40UP5K-SG48I` | Upduino v2.0 | Radiant 2.1 (Synplify Pro) | default | `rv32ic` + `u` + `Zicsr` + `Zifencei` | 4288 (81%) | 1693 (32%) | 0 (0%) | - | 12 (40%) | 4 (100%) | *c* 22.5 MHz |
|
| Lattice | iCE40 UltraPlus `iCE40UP5K-SG48I` | Upduino v2.0 | Radiant 2.1 (Synplify Pro) | default | `rv32ic` + `u` + `Zicsr` + `Zifencei` | 4296 (81%) | 1611 (30%) | 0 (0%) | - | 12 (40%) | 4 (100%) | *c* 22.5 MHz |
|
| Xilinx | Artix-7 `XC7A35TICSG324-1L` | Arty A7-35T | Vivado 2019.2 | default | `rv32imc` + `u` + `Zicsr` + `Zifencei` + `PMP` | 2385 (11%) | 2008 (5%) | 0 (0%) | - | 8 (16%) | - | *c* 100 MHz |
|
| Xilinx | Artix-7 `XC7A35TICSG324-1L` | Arty A7-35T | Vivado 2019.2 | default | `rv32imc` + `u` + `Zicsr` + `Zifencei` + `PMP` | 2390 (11%) | 1888 (5%) | 0 (0%) | - | 8 (16%) | - | *c* 100 MHz |
|
|
|
**_Notes_**
|
**_Notes_**
|
* The Lattice iCE40 UltraPlus setup uses the FPGA's SPRAM memory primitives for the internal IMEM and DMEM (each 64kb).
|
* The Lattice iCE40 UltraPlus setup uses the FPGA's SPRAM memory primitives for the internal IMEM and DMEM (each 64kb).
|
The FPGA-specific memory components can be found in [`rtl/fpga_specific`](https://github.com/stnolting/neorv32/blob/master/rtl/fpga_specific/lattice_ice40up).
|
The FPGA-specific memory components can be found in [`rtl/fpga_specific`](https://github.com/stnolting/neorv32/blob/master/rtl/fpga_specific/lattice_ice40up).
|
* The clock frequencies marked with a "c" are constrained clocks. The remaining ones are _f_max_ results from the place and route timing reports.
|
* The clock frequencies marked with a "c" are constrained clocks. The remaining ones are _f_max_ results from the place and route timing reports.
|
Line 301... |
Line 304... |
|
|
The [CoreMark CPU benchmark](https://www.eembc.org/coremark) was executed on the NEORV32 and is available in the
|
The [CoreMark CPU benchmark](https://www.eembc.org/coremark) was executed on the NEORV32 and is available in the
|
[sw/example/coremark](https://github.com/stnolting/neorv32/blob/master/sw/example/coremark) project folder. This benchmark
|
[sw/example/coremark](https://github.com/stnolting/neorv32/blob/master/sw/example/coremark) project folder. This benchmark
|
tests the capabilities of a CPU itself rather than the functions provided by the whole system / SoC.
|
tests the capabilities of a CPU itself rather than the functions provided by the whole system / SoC.
|
|
|
Results generated for hardware version: `1.3.7.3`
|
Results generated for hardware version: `1.4.4.8`
|
|
|
~~~
|
~~~
|
**Configuration**
|
**Configuration**
|
Hardware: 32kB IMEM, 16kB DMEM, 100MHz clock
|
Hardware: 32kB IMEM, 16kB DMEM, 100MHz clock
|
CoreMark: 2000 iterations, MEM_METHOD is MEM_STACK
|
CoreMark: 2000 iterations, MEM_METHOD is MEM_STACK
|
Compiler: RISCV32-GCC 10.1.0 (rv32i)
|
Compiler: RISCV32-GCC 10.1.0 (rv32i)
|
Peripherals: UART for printing the results
|
Peripherals: UART for printing the results
|
~~~
|
~~~
|
|
|
| CPU | Executable Size | Optimization | CoreMark Score | CoreMarks/MHz |
|
| CPU | Executable Size | Optimization | CoreMark Score | CoreMarks/MHz |
|
|:-----------------------|:---------------:|:------------:|:--------------:|:-------------:|
|
|:--------------------------|:---------------:|:------------:|:--------------:|:-------------:|
|
| `rv32i` | 26 748 bytes | `-O3` | 28.98 | 0.2898 |
|
| `rv32i` | 26 940 bytes | `-O3` | 33.89 | 0.3389 |
|
| `rv32im` | 25 580 bytes | `-O3` | 60.60 | 0.6060 |
|
| `rv32im` | 25 772 bytes | `-O3` | 64.51 | 0.6451 |
|
| `rv32imc` | 19 636 bytes | `-O3` | 62.50 | 0.6250 |
|
| `rv32im` + `FAST_MUL_EN` | 25 772 bytes | `-O3` | 80.00 | 0.8000 |
|
| `rv32imc` + _FAST_MUL_ | 19 636 bytes | `-O3` | 76.92 | 0.7692 |
|
| `rv32imc` | 19 812 bytes | `-O3` | 62.50 | 0.6250 |
|
|
| `rv32imc` + `FAST_MUL_EN` | 19 812 bytes | `-O3` | 76.92 | 0.7692 |
|
|
|
The _FAST_MUL_ configuration uses DSPs for the multiplier of the `M` extension (enabled via the `FAST_MUL_EN` generic).
|
The `FAST_MUL_EN` configuration uses DSPs for the multiplier of the `M` extension (enabled via the `FAST_MUL_EN` generic).
|
|
|
|
When the `C` extension is enabled, branches to an unaligned uncompressed instruction require additional instruction fetch cycles.
|
|
|
### Instruction Cycles
|
### Instruction Cycles
|
|
|
The NEORV32 CPU is based on a two-stages pipelined architecutre. Each stage uses a multi-cycle processing scheme. Hence,
|
The NEORV32 CPU is based on a two-stages pipelined architecutre. Each stage uses a multi-cycle processing scheme. Hence,
|
each instruction requires several clock cycles to execute (2 cycles for ALU operations, ..., 40 cycles for divisions).
|
each instruction requires several clock cycles to execute (2 cycles for ALU operations, ..., 40 cycles for divisions).
|
Line 336... |
Line 341... |
The following table shows the performance results for successfully running 2000 CoreMark
|
The following table shows the performance results for successfully running 2000 CoreMark
|
iterations, which reflects a pretty good "real-life" work load. The average CPI is computed by
|
iterations, which reflects a pretty good "real-life" work load. The average CPI is computed by
|
dividing the total number of required clock cycles (only the timed core to avoid distortion due to IO wait cycles; sampled via the `cycle[h]` CSRs)
|
dividing the total number of required clock cycles (only the timed core to avoid distortion due to IO wait cycles; sampled via the `cycle[h]` CSRs)
|
by the number of executed instructions (`instret[h]` CSRs). The executables were generated using optimization `-O3`.
|
by the number of executed instructions (`instret[h]` CSRs). The executables were generated using optimization `-O3`.
|
|
|
Results generated for hardware version: `1.3.7.3`
|
Results generated for hardware version: `1.4.4.8`
|
|
|
| CPU | Required Clock Cycles | Executed Instructions | Average CPI |
|
| CPU | Required Clock Cycles | Executed Instructions | Average CPI |
|
|:-----------------------|----------------------:|----------------------:|:-----------:|
|
|:------------------------|----------------------:|----------------------:|:-----------:|
|
| `rv32i` | 6 955 817 507 | 1 468 927 290 | 4.73 |
|
| `rv32i` | 5 945 938 586 | 1 469 587 406 | 4.05 |
|
| `rv32im` | 3 376 961 507 | 601 565 750 | 5.61 |
|
| `rv32im` | 3 110 282 586 | 602 225 760 | 5.16 |
|
| `rv32imc` | 3 274 832 513 | 601 565 964 | 5.44 |
|
| `rv32im` `FAST_MUL_EN` | 2 527 730 586 | 602 225 728 | 4.19 |
|
| `rv32imc` + _FAST_MUL_ | 2 689 845 200 | 601 565 890 | 4.47 |
|
| `rv32imc` | 3 217 064 278 | 602 225 530 | 5.34 |
|
|
| `rv32imc` `FAST_MUL_EN` | 2 634 512 278 | 602 225 574 | 4.37 |
|
|
|
|
The `FAST_MUL_EN` configuration uses DSPs for the multiplier of the `M` extension (enabled via the `FAST_MUL_EN` generic).
|
|
|
The _FAST_MUL_ configuration uses DSPs for the multiplier of the `M` extension (enabled via the `FAST_MUL_EN` generic).
|
When the `C` extension is enabled, branches to an unaligned uncompressed instruction require additional instruction fetch cycles.
|
|
|
|
|
|
|
## Top Entities
|
## Top Entities
|
|
|