Line 1... |
Line 1... |
# [The NEORV32 Processor](https://github.com/stnolting/neorv32) (RISC-V)
|
[![NEORV32](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/neorv32_logo_white_bg.png)](https://github.com/stnolting/neorv32)
|
|
|
|
# The NEORV32 RISC-V Processor
|
|
|
[![Build Status](https://travis-ci.com/stnolting/neorv32.svg?branch=master)](https://travis-ci.com/stnolting/neorv32)
|
[![Build Status](https://travis-ci.com/stnolting/neorv32.svg?branch=master)](https://travis-ci.com/stnolting/neorv32)
|
[![license](https://img.shields.io/github/license/stnolting/neorv32)](https://github.com/stnolting/neorv32/blob/master/LICENSE)
|
[![license](https://img.shields.io/github/license/stnolting/neorv32)](https://github.com/stnolting/neorv32/blob/master/LICENSE)
|
[![release](https://img.shields.io/github/v/release/stnolting/neorv32)](https://github.com/stnolting/neorv32/releases)
|
[![release](https://img.shields.io/github/v/release/stnolting/neorv32)](https://github.com/stnolting/neorv32/releases)
|
|
|
Line 18... |
Line 20... |
|
|
## Overview
|
## Overview
|
|
|
The NEORV32 Processor is a customizable microcontroller-like system on chip (SoC) that is based
|
The NEORV32 Processor is a customizable microcontroller-like system on chip (SoC) that is based
|
on the RISC-V-compliant NEORV32 CPU. The processor is intended as a *ready-to-go* auxiliary processor within larger SoC
|
on the RISC-V-compliant NEORV32 CPU. The processor is intended as a *ready-to-go* auxiliary processor within larger SoC
|
designs or as a stand-alone custom microcontroller. Its top entity can be directly synthesized for *any* target technology without modifications.
|
designs or as a stand-alone custom microcontroller.
|
|
|
|
|
### Key Features
|
### Key Features
|
|
|
* RISC-V-[compliant](#Status) 32-bit `rv32i` [**NEORV32 CPU**](#NEORV32-CPU-Features)
|
* RISC-V-[compliant](#Status) 32-bit `rv32i` [**NEORV32 CPU**](#NEORV32-CPU-Features), compliant to
|
* Compliant to *Unprivileged ISA Specification* [(Version 2.2)](https://github.com/stnolting/neorv32/blob/master/docs/riscv-spec.pdf)
|
* Subset of the *Unprivileged ISA Specification* [(Version 2.2)](https://github.com/stnolting/neorv32/blob/master/docs/riscv-spec.pdf)
|
* Compliant to *Privileged Architecture Specification* [(Version 1.12-draft)](https://github.com/stnolting/neorv32/blob/master/docs/riscv-privileged.pdf)
|
* Subset of the *Privileged Architecture Specification* [(Version 1.12-draft)](https://github.com/stnolting/neorv32/blob/master/docs/riscv-privileged.pdf)
|
* Optional CPU extensions
|
* Optional CPU extensions
|
* `C` - compressed instructions (16-bit)
|
* `C` - compressed instructions (16-bit)
|
* `E` - embedded CPU (reduced register file)
|
* `E` - embedded CPU (reduced register file)
|
* `M` - integer multiplication and division hardware
|
* `M` - integer multiplication and division hardware
|
* `U` - less-privileged *user mode*
|
* `U` - less-privileged *user mode*
|
* `Zicsr` - control and status register access instructions (+ exception/irq system)
|
* `Zicsr` - control and status register access instructions (+ exception/irq system)
|
* `Zifencei` - instruction stream synchronization
|
* `Zifencei` - instruction stream synchronization
|
* `PMP` - physical memory protection
|
* `PMP` - physical memory protection
|
|
* Full-scale RISC-V microcontroller system (**SoC**) [**NEORV32 Processor**](#NEORV32-Processor-Features) with optional submodules
|
|
* optional embedded memories (instruction/data/bootloader, RAM/ROM)
|
|
* timers (watchdog, RISC-V-compliant machine timer)
|
|
* serial interfaces (SPI, TWI, UART)
|
|
* external bus interface (Wishbone / [AXI4](#AXI4-Connectivity))
|
|
* [more ...](#NEORV32-Processor-Features)
|
* Software framework
|
* Software framework
|
* Core libraries for high-level usage of the provided functions and peripherals
|
* core libraries for high-level usage of the provided functions and peripherals
|
* Application compilation based on [GNU makefiles](https://github.com/stnolting/neorv32/blob/master/sw/example/blink_led/makefile)
|
* application compilation based on [GNU makefiles](https://github.com/stnolting/neorv32/blob/master/sw/example/blink_led/makefile)
|
* GCC-based toolchain ([pre-compiled toolchains available](https://github.com/stnolting/riscv_gcc_prebuilt))
|
* GCC-based toolchain ([pre-compiled toolchains available](https://github.com/stnolting/riscv_gcc_prebuilt))
|
* runtime environment
|
* runtime environment
|
* several example programs
|
* several example programs
|
* [Doxygen-based](https://github.com/stnolting/neorv32/blob/master/docs/doxygen_makefile_sw) documentation of the software framework: available on [GitHub pages](https://stnolting.github.io/neorv32/files.html)
|
* [doxygen-based](https://github.com/stnolting/neorv32/blob/master/docs/doxygen_makefile_sw) documentation: available on [GitHub pages](https://stnolting.github.io/neorv32/files.html)
|
* [FreeRTOS port](https://github.com/stnolting/neorv32/blob/master/sw/example/demo_freeRTOS) available
|
* [FreeRTOS port](https://github.com/stnolting/neorv32/blob/master/sw/example/demo_freeRTOS) available
|
* [**Full-blown data sheet**](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf) (pdf)
|
* [**Full-blown data sheet**](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf) (pdf)
|
* Completely described in behavioral, platform-independent VHDL - no primitives, macros, etc.
|
* Completely described in behavioral, platform-independent VHDL - no primitives, macros, etc.
|
* Fully synchronous design, no latches, no gated clocks
|
* Fully synchronous design, no latches, no gated clocks
|
* Small hardware footprint and high operating frequency
|
* Small hardware footprint and high operating frequency
|
* Full-scale RISC-V microcontroller system (**SoC**): [**NEORV32 Processor**](#NEORV32-Processor-Features)
|
|
* Optional embedded memories, timers, serial interfaces, external interfaces (Wishbone or [AXI4-Lite](#AXI4-Connectivity)) ...
|
|
|
|
The project’s change log is available in the [CHANGELOG.md](https://github.com/stnolting/neorv32/blob/master/CHANGELOG.md) file in the root directory of this repository.
|
The project’s change log is available in the [CHANGELOG.md](https://github.com/stnolting/neorv32/blob/master/CHANGELOG.md) file in the root directory of this repository.
|
To see the changes between releases visit the project's [release page](https://github.com/stnolting/neorv32/releases).
|
To see the changes between releases visit the project's [release page](https://github.com/stnolting/neorv32/releases).
|
For more information take a look at the [NEORV32 data sheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf) (pdf).
|
For more information take a look at the [NEORV32 data sheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf) (pdf).
|
|
|
Line 81... |
Line 87... |
|
|
|
|
### To-Do / Wish List / [Help Wanted](#Contribute)
|
### To-Do / Wish List / [Help Wanted](#Contribute)
|
|
|
* Use LaTeX for data sheet
|
* Use LaTeX for data sheet
|
|
* More support for FreeRTOS
|
* Further size and performance optimization
|
* Further size and performance optimization
|
* Add a cache for the external memory interface
|
* Add a cache for the external memory interface
|
* Synthesis results (+ wrappers?) for more/specific platforms
|
* Synthesis results (+ wrappers?) for more/specific platforms
|
* Maybe port additional RTOSs (like [Zephyr](https://github.com/zephyrproject-rtos/zephyr) or [RIOT](https://www.riot-os.org))
|
* Maybe port additional RTOSs (like [Zephyr](https://github.com/zephyrproject-rtos/zephyr) or [RIOT](https://www.riot-os.org))
|
* Implement further CPU extensions:
|
* Implement further CPU extensions:
|
Line 103... |
Line 110... |
### NEORV32 Processor Features
|
### NEORV32 Processor Features
|
|
|
![neorv32 Overview](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/neorv32_processor.png)
|
![neorv32 Overview](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/neorv32_processor.png)
|
|
|
The NEORV32 Processor provides a full-scale microcontroller-like SoC based on the NEORV32 CPU. The setup
|
The NEORV32 Processor provides a full-scale microcontroller-like SoC based on the NEORV32 CPU. The setup
|
is highly customizable via the processor's top generics.
|
is highly customizable via the processor's top generics and already provides the following *optional* modules:
|
|
|
* Optional processor-internal data and instruction memories (**DMEM** / **IMEM**)
|
* processor-internal data and instruction memories (**DMEM** / **IMEM**)
|
* Optional internal **Bootloader** with UART console and automatic application boot from SPI flash option
|
* internal **Bootloader** with UART console and automatic application boot from SPI flash option
|
* Optional machine system timer (**MTIME**), RISC-V-compliant
|
* machine system timer (**MTIME**), RISC-V-compliant
|
* Optional universal asynchronous receiver and transmitter (**UART**) with simulation output option via text.io
|
* watchdog timer (**WDT**)
|
* Optional 8/16/24/32-bit serial peripheral interface controller (**SPI**) with 8 dedicated chip select lines
|
* universal asynchronous receiver and transmitter (**UART**) with simulation output option via text.io
|
* Optional two wire serial interface controller (**TWI**), with optional clock-stretching, compatible to the I²C standard
|
* 8/16/24/32-bit serial peripheral interface controller (**SPI**) with 8 dedicated chip select lines
|
* Optional general purpose parallel IO port (**GPIO**), 32xOut & 32xIn, with pin-change interrupt
|
* two wire serial interface controller (**TWI**), with optional clock-stretching, compatible to the I²C standard
|
* Optional 32-bit external bus interface, Wishbone b4 compliant (**WISHBONE**), *standard* or *pipelined* handshake/transactions mode
|
* general purpose parallel IO port (**GPIO**), 32xOut & 32xIn, with pin-change interrupt
|
* Optional wrapper for **AXI4-Lite Master Interface** (see [AXI Connectivity](#AXI4-Connectivity)), compatibility verified with Xilinx Vivado Block Designer
|
* 32-bit external bus interface, Wishbone b4 compliant (**WISHBONE**), *standard* or *pipelined* handshake/transactions mode
|
* Optional watchdog timer (**WDT**)
|
* wrapper for **AXI4-Lite Master Interface** (see [AXI Connectivity](#AXI4-Connectivity))
|
* Optional PWM controller with 4 channels and 8-bit duty cycle resolution (**PWM**)
|
* PWM controller with 4 channels and 8-bit duty cycle resolution (**PWM**)
|
* Optional GARO-based true random number generator (**TRNG**)
|
* GARO-based true random number generator (**TRNG**)
|
* Optional custom functions units (**CFU0** and **CFU1**) for tightly-coupled custom co-processors
|
* custom functions units (**CFU0** and **CFU1**) for tightly-coupled custom co-processors
|
* System configuration information memory to check hardware configuration by software (**SYSINFO**)
|
* system configuration information memory to check hardware configuration by software (**SYSINFO**, mandatory - not *optional*)
|
|
|
### NEORV32 CPU Features
|
### NEORV32 CPU Features
|
|
|
![neorv32 Overview](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/neorv32_cpu.png)
|
![neorv32 Overview](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/neorv32_cpu.png)
|
|
|
Line 214... |
Line 221... |
## FPGA Implementation Results
|
## FPGA Implementation Results
|
|
|
### NEORV32 CPU
|
### NEORV32 CPU
|
|
|
This chapter shows exemplary implementation results of the NEORV32 CPU for an **Intel Cyclone IV EP4CE22F17C6N FPGA** on
|
This chapter shows exemplary implementation results of the NEORV32 CPU for an **Intel Cyclone IV EP4CE22F17C6N FPGA** on
|
a DE0-nano board. The design was synthesized using **Intel Quartus Prime Lite 19.1** ("balanced implementation"). The timing
|
a DE0-nano board. The design was synthesized using **Intel Quartus Prime Lite 20.1** ("balanced implementation"). The timing
|
information is derived from the Timing Analyzer / Slow 1200mV 0C Model. If not otherwise specified, the default configuration
|
information is derived from the Timing Analyzer / Slow 1200mV 0C Model. If not otherwise specified, the default configuration
|
of the CPU's generics is assumed (for example no PMP). No constraints were used at all.
|
of the CPU's generics is assumed (for example no PMP). No constraints were used at all.
|
|
|
Results generated for hardware version `1.4.4.8`.
|
Results generated for hardware version `1.4.7.0`.
|
|
|
| CPU Configuration | LEs | FFs | Memory bits | DSPs | f_max |
|
| CPU Configuration | LEs | FFs | Memory bits | DSPs | f_max |
|
|:---------------------------------------|:----------:|:--------:|:-----------:|:----:|:--------:|
|
|:---------------------------------------|:----------:|:--------:|:-----------:|:----:|:--------:|
|
| `rv32i` | 983 | 438 | 2048 | 0 | ~120 MHz |
|
| `rv32i` | 932 | 413 | 2048 | 0 | ~120 MHz |
|
| `rv32i` + `u` + `Zicsr` + `Zifencei` | 1877 | 802 | 2048 | 0 | ~112 MHz |
|
| `rv32i` + `u` + `Zicsr` + `Zifencei` | 1800 | 815 | 2048 | 0 | ~118 MHz |
|
| `rv32im` + `u` + `Zicsr` + `Zifencei` | 2374 | 1048 | 2048 | 0 | ~110 MHz |
|
| `rv32im` + `u` + `Zicsr` + `Zifencei` | 2368 | 1058 | 2048 | 0 | ~117 MHz |
|
| `rv32imc` + `u` + `Zicsr` + `Zifencei` | 2650 | 1064 | 2048 | 0 | ~110 MHz |
|
| `rv32imc` + `u` + `Zicsr` + `Zifencei` | 2604 | 1073 | 2048 | 0 | ~113 MHz |
|
| `rv32emc` + `u` + `Zicsr` + `Zifencei` | 2680 | 1061 | 1024 | 0 | ~110 MHz |
|
| `rv32emc` + `u` + `Zicsr` + `Zifencei` | 2613 | 1073 | 1024 | 0 | ~113 MHz |
|
|
|
|
|
### NEORV32 Processor-Internal Peripherals and Memories
|
### NEORV32 Processor-Internal Peripherals and Memories
|
|
|
Results generated for hardware version `1.4.4.8`.
|
Results generated for hardware version `1.4.7.0`.
|
|
|
| Module | Description | LEs | FFs | Memory bits | DSPs |
|
| Module | Description | LEs | FFs | Memory bits | DSPs |
|
|:----------|:-----------------------------------------------------|----:|----:|------------:|-----:|
|
|:----------|:-----------------------------------------------------|----:|----:|------------:|-----:|
|
| BOOT ROM | Bootloader ROM (default 4kB) | 4 | 1 | 32 768 | 0 |
|
| BOOT ROM | Bootloader ROM (default 4kB) | 3 | 1 | 32 768 | 0 |
|
| BUSSWITCH | Mux for CPU I & D interfaces | 62 | 8 | 0 | 0 |
|
| BUSSWITCH | Mux for CPU I & D interfaces | 63 | 8 | 0 | 0 |
|
| CFU0 | Custom functions unit 0 | - | - | - | - |
|
| CFU0 | Custom functions unit 0 | - | - | - | - |
|
| CFU1 | Custom functions unit 1 | - | - | - | - |
|
| CFU1 | Custom functions unit 1 | - | - | - | - |
|
| DMEM | Processor-internal data memory (default 8kB) | 13 | 2 | 65 536 | 0 |
|
| DMEM | Processor-internal data memory (default 8kB) | 12 | 2 | 65 536 | 0 |
|
| GPIO | General purpose input/output ports | 66 | 65 | 0 | 0 |
|
| GPIO | General purpose input/output ports | 66 | 65 | 0 | 0 |
|
| IMEM | Processor-internal instruction memory (default 16kB) | 7 | 2 | 131 072 | 0 |
|
| IMEM | Processor-internal instruction memory (default 16kB) | 7 | 2 | 131 072 | 0 |
|
| MTIME | Machine system timer | 268 | 166 | 0 | 0 |
|
| MTIME | Machine system timer | 272 | 166 | 0 | 0 |
|
| PWM | Pulse-width modulation controller | 72 | 69 | 0 | 0 |
|
| PWM | Pulse-width modulation controller | 72 | 69 | 0 | 0 |
|
| SPI | Serial peripheral interface | 184 | 125 | 0 | 0 |
|
| SPI | Serial peripheral interface | 142 | 124 | 0 | 0 |
|
| SYSINFO | System configuration information memory | 11 | 9 | 0 | 0 |
|
| SYSINFO | System configuration information memory | 11 | 9 | 0 | 0 |
|
| TRNG | True random number generator | 132 | 105 | 0 | 0 |
|
| TRNG | True random number generator | 132 | 105 | 0 | 0 |
|
| TWI | Two-wire interface | 74 | 44 | 0 | 0 |
|
| TWI | Two-wire interface | 77 | 44 | 0 | 0 |
|
| UART | Universal asynchronous receiver/transmitter | 175 | 132 | 0 | 0 |
|
| UART | Universal asynchronous receiver/transmitter | 173 | 132 | 0 | 0 |
|
| WDT | Watchdog timer | 58 | 45 | 0 | 0 |
|
| WDT | Watchdog timer | 58 | 45 | 0 | 0 |
|
| WISHBONE | External memory interface | 106 | 104 | 0 | 0 |
|
| WISHBONE | External memory interface | 106 | 104 | 0 | 0 |
|
|
|
|
|
### NEORV32 Processor - Exemplary FPGA Setups
|
### NEORV32 Processor - Exemplary FPGA Setups
|
Line 260... |
Line 267... |
Exemplary processor implementation results for different FPGA platforms. The processor setup uses *the default peripheral configuration* (like no _CFUs_ and no _TRNG_),
|
Exemplary processor implementation results for different FPGA platforms. The processor setup uses *the default peripheral configuration* (like no _CFUs_ and no _TRNG_),
|
no external memory interface and only internal instruction and data memories. IMEM uses 16kB and DMEM uses 8kB memory space. The setup's top entity connects most of the
|
no external memory interface and only internal instruction and data memories. IMEM uses 16kB and DMEM uses 8kB memory space. The setup's top entity connects most of the
|
processor's [top entity](https://github.com/stnolting/neorv32/blob/master/rtl/core/neorv32_top.vhd) signals
|
processor's [top entity](https://github.com/stnolting/neorv32/blob/master/rtl/core/neorv32_top.vhd) signals
|
to FPGA pins - except for the Wishbone bus and the interrupt signals.
|
to FPGA pins - except for the Wishbone bus and the interrupt signals.
|
|
|
Results generated for hardware version `1.4.4.8`.
|
Results generated for hardware version `1.4.7.0`.
|
|
|
| Vendor | FPGA | Board | Toolchain | Strategy | CPU Configuration | LUT / LE | FF / REG | DSP | Memory Bits | BRAM / EBR | SPRAM | Frequency |
|
| Vendor | FPGA | Board | Toolchain | Strategy | CPU Configuration | LUT / LE | FF / REG | DSP | Memory Bits | BRAM / EBR | SPRAM | Frequency |
|
|:--------|:----------------------------------|:-----------------|:---------------------------|:-------- |:-----------------------------------------------|:-----------|:-----------|:-------|:-------------|:-----------|:---------|--------------:|
|
|:--------|:----------------------------------|:-----------------|:---------------------------|:-------- |:-----------------------------------------------|:-----------|:-----------|:-------|:-------------|:-----------|:---------|--------------:|
|
| Intel | Cyclone IV `EP4CE22F17C6N` | Terasic DE0-Nano | Quartus Prime Lite 19.1 | balanced | `rv32imc` + `u` + `Zicsr` + `Zifencei` + `PMP` | 4008 (18%) | 1849 (9%) | 0 (0%) | 231424 (38%) | - | - | 105 MHz |
|
| Intel | Cyclone IV `EP4CE22F17C6N` | Terasic DE0-Nano | Quartus Prime Lite 20.1 | balanced | `rv32imc` + `u` + `Zicsr` + `Zifencei` + `PMP` | 3892 (17%) | 1859 (8%) | 0 (0%) | 231424 (38%) | - | - | 113 MHz |
|
| Lattice | iCE40 UltraPlus `iCE40UP5K-SG48I` | Upduino v2.0 | Radiant 2.1 (Synplify Pro) | default | `rv32ic` + `u` + `Zicsr` + `Zifencei` | 4296 (81%) | 1611 (30%) | 0 (0%) | - | 12 (40%) | 4 (100%) | *c* 22.5 MHz |
|
| Lattice | iCE40 UltraPlus `iCE40UP5K-SG48I` | Upduino v2.0 | Radiant 2.1 (Synplify Pro) | default | `rv32ic` + `u` + `Zicsr` + `Zifencei` | 4331 (82%) | 1673 (31%) | 0 (0%) | - | 12 (40%) | 4 (100%) | *c* 22.5 MHz |
|
| Xilinx | Artix-7 `XC7A35TICSG324-1L` | Arty A7-35T | Vivado 2019.2 | default | `rv32imc` + `u` + `Zicsr` + `Zifencei` + `PMP` | 2390 (11%) | 1888 (5%) | 0 (0%) | - | 8 (16%) | - | *c* 100 MHz |
|
| Xilinx | Artix-7 `XC7A35TICSG324-1L` | Arty A7-35T | Vivado 2019.2 | default | `rv32imc` + `u` + `Zicsr` + `Zifencei` + `PMP` | 2416 (12%) | 1900 (5%) | 0 (0%) | - | 8 (16%) | - | *c* 100 MHz |
|
|
|
**_Notes_**
|
**_Notes_**
|
* The Lattice iCE40 UltraPlus setup uses the FPGA's SPRAM memory primitives for the internal IMEM and DMEM (each 64kB).
|
* The Lattice iCE40 UltraPlus setup uses the FPGA's SPRAM memory primitives for the internal IMEM and DMEM (each 64kB).
|
The FPGA-specific memory components can be found in [`rtl/fpga_specific`](https://github.com/stnolting/neorv32/blob/master/rtl/fpga_specific/lattice_ice40up).
|
The FPGA-specific memory components can be found in [`rtl/fpga_specific`](https://github.com/stnolting/neorv32/blob/master/rtl/fpga_specific/lattice_ice40up).
|
* The clock frequencies marked with a "c" are constrained clocks. The remaining ones are _f_max_ results from the place and route timing reports.
|
* The clock frequencies marked with a "c" are constrained clocks. The remaining ones are _f_max_ results from the place and route timing reports.
|
Line 286... |
Line 293... |
|
|
The [CoreMark CPU benchmark](https://www.eembc.org/coremark) was executed on the NEORV32 and is available in the
|
The [CoreMark CPU benchmark](https://www.eembc.org/coremark) was executed on the NEORV32 and is available in the
|
[sw/example/coremark](https://github.com/stnolting/neorv32/blob/master/sw/example/coremark) project folder. This benchmark
|
[sw/example/coremark](https://github.com/stnolting/neorv32/blob/master/sw/example/coremark) project folder. This benchmark
|
tests the capabilities of a CPU itself rather than the functions provided by the whole system / SoC.
|
tests the capabilities of a CPU itself rather than the functions provided by the whole system / SoC.
|
|
|
Results generated for hardware version `1.4.5.4`.
|
Results generated for hardware version `1.4.7.0`.
|
|
|
~~~
|
~~~
|
**Configuration**
|
**Configuration**
|
Hardware: 32kB IMEM, 16kB DMEM, 100MHz clock
|
Hardware: 32kB IMEM, 16kB DMEM, 100MHz clock
|
CoreMark: 2000 iterations, MEM_METHOD is MEM_STACK
|
CoreMark: 2000 iterations, MEM_METHOD is MEM_STACK
|
Line 299... |
Line 306... |
Peripherals: UART for printing the results
|
Peripherals: UART for printing the results
|
~~~
|
~~~
|
|
|
| CPU | Executable Size | Optimization | CoreMark Score | CoreMarks/MHz |
|
| CPU | Executable Size | Optimization | CoreMark Score | CoreMarks/MHz |
|
|:--------------------------------------------|:---------------:|:------------:|:--------------:|:-------------:|
|
|:--------------------------------------------|:---------------:|:------------:|:--------------:|:-------------:|
|
| `rv32i` | 26 940 bytes | `-O3` | 33.89 | **0.3389** |
|
| `rv32i` | 27 424 bytes | `-O3` | 35.71 | **0.3571** |
|
| `rv32im` | 25 772 bytes | `-O3` | 64.51 | **0.6451** |
|
| `rv32im` | 26 232 bytes | `-O3` | 66.66 | **0.6666** |
|
| `rv32imc` | 20 524 bytes | `-O3` | 64.51 | **0.6451** |
|
| `rv32imc` | 20 876 bytes | `-O3` | 66.66 | **0.6666** |
|
| `rv32imc` + `FAST_MUL_EN` | 20 524 bytes | `-O3` | 80.00 | **0.8000** |
|
| `rv32imc` + `FAST_MUL_EN` | 20 876 bytes | `-O3` | 83.33 | **0.8333** |
|
| `rv32imc` + `FAST_MUL_EN` + `FAST_SHIFT_EN` | 20 524 bytes | `-O3` | 83.33 | **0.8333** |
|
| `rv32imc` + `FAST_MUL_EN` + `FAST_SHIFT_EN` | 20 876 bytes | `-O3` | 86.96 | **0.8696** |
|
|
|
The `FAST_MUL_EN` configuration uses DSPs for the multiplier of the `M` extension (enabled via the `FAST_MUL_EN` generic). The `FAST_SHIFT_EN` configuration
|
The `FAST_MUL_EN` configuration uses DSPs for the multiplier of the `M` extension (enabled via the `FAST_MUL_EN` generic). The `FAST_SHIFT_EN` configuration
|
uses a barrel shifter for CPU shift operations (enabled via the `FAST_SHIFT_EN` generic).
|
uses a barrel shifter for CPU shift operations (enabled via the `FAST_SHIFT_EN` generic).
|
|
|
When the `C` extension is enabled, branches to an unaligned uncompressed instruction require additional instruction fetch cycles.
|
When the `C` extension is enabled, branches to an unaligned uncompressed instruction require additional instruction fetch cycles.
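The CoreMarks/MHz column is the CoreMark score divided by the clock frequency in MHz. The following minimal Python sketch reproduces that column from the scores listed for hardware version `1.4.7.0`, assuming the 100 MHz clock given in the configuration block. The helper name `coremarks_per_mhz` is illustrative only and is not part of the NEORV32 software framework.

```python
# Clock frequency from the benchmark configuration ("100MHz clock").
CLOCK_MHZ = 100

# CoreMark scores from the table rows for hardware version 1.4.7.0.
SCORES = {
    "rv32i": 35.71,
    "rv32im": 66.66,
    "rv32imc": 66.66,
    "rv32imc + FAST_MUL_EN": 83.33,
    "rv32imc + FAST_MUL_EN + FAST_SHIFT_EN": 86.96,
}


def coremarks_per_mhz(score, clock_mhz=CLOCK_MHZ):
    """Normalize a CoreMark score to the clock frequency (hypothetical helper)."""
    return score / clock_mhz


if __name__ == "__main__":
    for cpu, score in SCORES.items():
        print(f"{cpu:40s} {coremarks_per_mhz(score):.4f} CoreMarks/MHz")
```

Normalizing to the clock frequency makes the results comparable across FPGA setups that run at different clock speeds.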
|
Line 326... |
Line 333... |
The following table shows the performance results for successfully running 2000 CoreMark
|
The following table shows the performance results for successfully running 2000 CoreMark
|
iterations, which reflects a pretty good "real-life" workload. The average CPI is computed by
|
iterations, which reflects a pretty good "real-life" workload. The average CPI is computed by
|
dividing the total number of required clock cycles (only the timed core to avoid distortion due to IO wait cycles; sampled via the `cycle[h]` CSRs)
|
dividing the total number of required clock cycles (only the timed core to avoid distortion due to IO wait cycles; sampled via the `cycle[h]` CSRs)
|
by the number of executed instructions (`instret[h]` CSRs). The executables were generated using optimization `-O3`.
|
by the number of executed instructions (`instret[h]` CSRs). The executables were generated using optimization `-O3`.
|
|
|
Results generated for hardware version `1.4.5.4`.
|
Results generated for hardware version `1.4.7.0`.
|
|
|
| CPU | Required Clock Cycles | Executed Instructions | Average CPI |
|
| CPU | Required Clock Cycles | Executed Instructions | Average CPI |
|
|:--------------------------------------------|----------------------:|----------------------:|:-----------:|
|
|:--------------------------------------------|----------------------:|----------------------:|:-----------:|
|
| `rv32i` | 5 945 938 586 | 1 469 587 406 | **4.05** |
|
| `rv32i` | 5 648 997 774 | 1 469 233 238 | **3.84** |
|
| `rv32im` | 3 110 282 586 | 602 225 760 | **5.16** |
|
| `rv32im` | 3 036 749 774 | 601 871 338 | **5.05** |
|
| `rv32imc` | 3 172 969 968 | 615 388 890 | **5.16** |
|
| `rv32imc` | 3 036 959 882 | 615 034 616 | **4.94** |
|
| `rv32imc` + `FAST_MUL_EN` | 2 590 417 968 | 615 388 890 | **4.21** |
|
| `rv32imc` + `FAST_MUL_EN` | 2 454 407 882 | 615 034 588 | **3.99** |
|
| `rv32imc` + `FAST_MUL_EN` + `FAST_SHIFT_EN` | 2 456 318 408 | 615 388 890 | **3.99** |
|
| `rv32imc` + `FAST_MUL_EN` + `FAST_SHIFT_EN` | 2 320 308 322 | 615 034 676 | **3.77** |
|
|
|
|
|
The `FAST_MUL_EN` configuration uses DSPs for the multiplier of the `M` extension (enabled via the `FAST_MUL_EN` generic). The `FAST_SHIFT_EN` configuration
|
The `FAST_MUL_EN` configuration uses DSPs for the multiplier of the `M` extension (enabled via the `FAST_MUL_EN` generic). The `FAST_SHIFT_EN` configuration
|
uses a barrel shifter for CPU shift operations (enabled via the `FAST_SHIFT_EN` generic).
|
uses a barrel shifter for CPU shift operations (enabled via the `FAST_SHIFT_EN` generic).
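The average CPI values can be recomputed from the cycle and instruction counts. The sketch below divides the required clock cycles by the executed instructions and rounds to two decimals, using the numbers reported for hardware version `1.4.7.0`. The function name `average_cpi` is chosen for illustration; on the target itself the two counts would come from the `cycle[h]` and `instret[h]` CSRs.

```python
# (required clock cycles, executed instructions) per CPU configuration,
# copied from the table rows for hardware version 1.4.7.0.
RESULTS = {
    "rv32i": (5_648_997_774, 1_469_233_238),
    "rv32im": (3_036_749_774, 601_871_338),
    "rv32imc": (3_036_959_882, 615_034_616),
    "rv32imc + FAST_MUL_EN": (2_454_407_882, 615_034_588),
    "rv32imc + FAST_MUL_EN + FAST_SHIFT_EN": (2_320_308_322, 615_034_676),
}


def average_cpi(cycles, instructions):
    """Average cycles per instruction: total cycles / retired instructions."""
    return cycles / instructions


if __name__ == "__main__":
    for cpu, (cycles, instructions) in RESULTS.items():
        print(f"{cpu:40s} CPI = {average_cpi(cycles, instructions):.2f}")
```

Because both counters cover only the timed benchmark core, the ratio is not distorted by UART wait cycles.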
|
|
|
Line 498... |
Line 505... |
Use the bootloader console to upload the `neorv32_exe.bin` executable and run your application image.
|
Use the bootloader console to upload the `neorv32_exe.bin` executable and run your application image.
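The bootloader console reports several values as raw hexadecimal words. The following Python sketch decodes a few of the fields from the console log in this section. The `MISA` decoding follows the RISC-V `misa` layout for RV32 (MXL in the two most significant bits, one extension bit per letter in bits 0 to 25). Reading `HWV` as four one-byte version fields is an assumption made for this example, and the function names are hypothetical helpers, not part of the NEORV32 software framework.

```python
def decode_hwv(word):
    """Format a 32-bit HWV word as a dotted version string.

    Assumption: each byte holds one field of the version number.
    """
    return ".".join(str((word >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def decode_misa_rv32(word):
    """Split an RV32 misa value into (MXL, extension letters)."""
    mxl = (word >> 30) & 0x3  # 1 means XLEN = 32
    letters = "".join(
        chr(ord("A") + bit) for bit in range(26) if word & (1 << bit)
    )
    return mxl, letters


if __name__ == "__main__":
    # Values copied from the bootloader console log in this section.
    print("HWV :", decode_hwv(0x01040606))
    print("CLK :", 0x0134FD90, "Hz")
    print("IMEM:", 0x00010000, "bytes")
    print("MISA:", decode_misa_rv32(0x42801104))
```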
|
|
|
```
|
```
|
<< NEORV32 Bootloader >>
|
<< NEORV32 Bootloader >>
|
|
|
BLDV: Jul 6 2020
|
BLDV: Nov 7 2020
|
HWV: 1.0.1.0
|
HWV: 0x01040606
|
CLK: 0x0134FD90 Hz
|
CLK: 0x0134FD90 Hz
|
USER: 0x0001CE40
|
USER: 0x0001CE40
|
MISA: 0x42801104
|
MISA: 0x42801104
|
PROC: 0x03FF0035
|
PROC: 0x03FF0035
|
IMEM: 0x00010000 bytes @ 0x00000000
|
IMEM: 0x00010000 bytes @ 0x00000000
|
Line 548... |
Line 555... |
## Legal
|
## Legal
|
|
|
This project is released under the BSD 3-Clause license. No copyright infringement intended.
|
This project is released under the BSD 3-Clause license. No copyright infringement intended.
|
Other implied or used projects might have different licensing - see their documentation to get more information.
|
Other implied or used projects might have different licensing - see their documentation to get more information.
|
|
|
#### Citation
|
#### Citing
|
|
|
If you are using the NEORV32 or some parts of the project in some kind of publication, please cite it as follows:
|
If you are using the NEORV32 or some parts of the project in some kind of publication, please cite it as follows:
|
|
|
> S. Nolting, "The NEORV32 Processor", github.com/stnolting/neorv32
|
> S. Nolting, "The NEORV32 Processor", github.com/stnolting/neorv32
|
|
|
Line 621... |
Line 628... |
|
|
This project is not affiliated with or endorsed by the Open Source Initiative (https://www.oshwa.org / https://opensource.org).
|
This project is not affiliated with or endorsed by the Open Source Initiative (https://www.oshwa.org / https://opensource.org).
|
|
|
--------
|
--------
|
|
|
This repository was created on June 23th, 2020.
|
This repository was created on June 23rd, 2020.
|
|
|
Made with :coffee: in Hannover, Germany :eu:
|
Made with :coffee: in Hannover, Germany :eu:
|