URL
https://opencores.org/ocsvn/neorv32/neorv32/trunk
Subversion Repositories neorv32
[/] [neorv32/] [trunk/] [docs/] [datasheet/] [overview.adoc] - Rev 74
Compare with Previous | Blame | View Log
:sectnums:== OverviewThe NEORV32footnote:[Pronounced "neo-R-V-thirty-two" or "neo-risc-five-thirty-two" in its long form.] is an open-sourceRISC-V compatible processor system that is intended as *ready-to-go* auxiliary processor within a larger SoCdesigns or as stand-alone custom / customizable microcontroller.The system is highly configurable and provides optional common peripherals like embedded memories,timers, serial interfaces, general purpose IO ports and an external bus interface to connect custom IP likememories, NoCs and other peripherals. On-line and in-system debugging is supported by an OpenOCD/gdbcompatible on-chip debugger accessible via JTAG.Special focus is paid on **execution safety** to provide defined and predictable behavior at any time.Therefore, the CPU ensures that all memory access are acknowledged and no invalid/malformed instructionsare executed. Whenever an unexpected situation occurs, the application code is informed via hardware exceptions.The software framework of the processor comes with application makefiles, software libraries for all CPUand processor features, a bootloader, a runtime environment and several example programs - including a portof the CoreMark MCU benchmark and the official RISC-V architecture test suite. RISC-V GCC is used asdefault toolchain (https://github.com/stnolting/riscv-gcc-prebuilt[prebuilt toolchains are also provided]).Check out the processor's **https://stnolting.github.io/neorv32/ug[online User Guide]**that provides hands-on tutorials to get you started.**Structure**[start=2]. <<_neorv32_processor_soc>>. <<_neorv32_central_processing_unit_cpu>>. <<_software_framework>>. <<_on_chip_debugger_ocd>>. <<_legal>>**Annotations**[WARNING]Warning[IMPORTANT]Important[NOTE]Note[TIP]Tip<<<// ####################################################################################################################include::rationale.adoc[]// ####################################################################################################################:sectnums:=== Project Key Features* open-source and documented; including user guides to get started* completely described in behavioral, platform-independent VHDL (yet platform-optimized modules are provided)* fully synchronous design, no latches, no gated clocks* small hardware footprint and high operating frequency for easy integration* **NEORV32 CPU**: 32-bit `rv32i` RISC-V CPU** RISC-V compatibility: passes the official architecture tests** base architecture + privileged architecture (optional) + ISA extensions (optional)** option to add custom RISC-V instructions (as custom ISA extension)** rich set of customization options (ISA extensions, design goal: performance / area (/ energy), ...)** aims to support <<_full_virtualization>> capabilities (CPU _and_ SoC) to increase execution safety** official https://github.com/riscv/riscv-isa-manual/blob/master/marchid.md[RISC-V open source architecture ID]* **NEORV32 Processor (SoC)**: highly-configurable full-scale microcontroller-like processor system** based on the NEORV32 CPU** optional serial interfaces (UARTs, TWI, SPI)** optional timers and counters (WDT, MTIME)** optional general purpose IO and PWM and native NeoPixel (c) compatible smart LED interface** optional embedded memories / caches for data, instructions and bootloader** optional external memory interface (Wishbone / AXI4-Lite) and stream link interface (AXI4-Stream) for custom connectivity** optional execute in place (XIP) module** on-chip debugger compatible with OpenOCD and gdb including hardware trigger module* **Software framework**** GCC-based toolchain - prebuilt toolchains available; application compilation based on GNU makefiles** internal bootloader with serial user interface** core libraries for high-level usage of the provided functions and peripherals** runtime environment and several example programs** doxygen-based documentation of the software framework; a deployed version is available at https://stnolting.github.io/neorv32/sw/files.html** FreeRTOS port + demos available[TIP]For more in-depth details regarding the feature provided by he hardware see the according sections:<<_neorv32_central_processing_unit_cpu>> and <<_neorv32_processor_soc>>.**Extensibility and Customization**The NEORV32 processor was designed to ease customization and extensibility and provides several options for addingapplication-specific custom hardware modules and accelerators. The three most common options for adding customon-chip modules are listed below.* <<_processor_external_memory_interface_wishbone_axi4_lite>> for processor-external modules* <<_custom_functions_subsystem_cfs>> for tightly-coupled processor-internal co-processors* <<_custom_functions_unit_cfu>> for custom RISC-V instructions[TIP]A more detailed comparison of the extension/customization options can be found in sectionhttps://stnolting.github.io/neorv32/ug/#_adding_custom_hardware_modules[Adding Custom Hardware Modules]of the user guide.<<<// ####################################################################################################################:sectnums:=== Project Folder Structure...................................neorv32 - Project home folder│├docs - Project documentation│├datasheet - AsciiDoc sources for the NEORV32 data sheet│├figures - Figures and logos│├icons - Misc. symbols│├references - Data sheets and RISC-V specs.│└userguide - AsciiDoc sources for the NEORV32 user guide│├rtl - VHDL sources│├core - Core sources of the CPU & SoC││└mem - SoC-internal memories (default architectures)│├processor_templates - Pre-configured SoC wrappers│├system_integration - System wrappers for advanced connectivity│└test_setups - Minimal test setup "SoCs" used in the User Guide│├sim - Simulation files (see User Guide)│└sw - Software framework├bootloader - Sources of the processor-internal bootloader├common - Linker script, crt0.S start-up code and central makefile├example - Various example programs│└...├lib - Processor core library│├include - Header files (*.h)│└source - Source files (*.c)├image_gen - Helper program to generate NEORV32 executables^├ocd_firmware - Source code for on-chip debugger's "park loop"├openocd - OpenOCD on-chip debugger configuration files└svd - Processor system view description file (CMSIS-SVD)...................................<<<// ####################################################################################################################:sectnums:=== VHDL File HierarchyAll necessary VHDL hardware description files are located in the project's `rtl/core` folder. The top entityof the entire processor including all the required configuration generics is **`neorv32_top.vhd`**.[IMPORTANT]All core VHDL files from the list below have to be assigned to a new design library named **`neorv32`**. Additionalfiles, like alternative top entities, can be assigned to any library....................................neorv32_top.vhd - NEORV32 Processor top entity│├neorv32_fifo.vhd - General purpose FIFO component├neorv32_package.vhd - Processor/CPU main VHDL package file│├neorv32_cpu.vhd - NEORV32 CPU top entity│├neorv32_cpu_alu.vhd - Arithmetic/logic unit││├neorv32_cpu_cp_bitmanip.vhd - Bit-manipulation co-processor (B ext.)││├neorv32_cpu_cp_cfu.vhd - Custom functions (instruction) co-processor (Zxcfu ext.)││├neorv32_cpu_cp_fpu.vhd - Floating-point co-processor (Zfinx ext.)││├neorv32_cpu_cp_muldiv.vhd - Mul/Div co-processor (M extension)││└neorv32_cpu_cp_shifter.vhd - Bit-shift co-processor│├neorv32_cpu_bus.vhd - Bus interface + physical memory protection│├neorv32_cpu_control.vhd - CPU control, exception/IRQ system and CSRs││└neorv32_cpu_decompressor.vhd - Compressed instructions decoder│└neorv32_cpu_regfile.vhd - Data register file│├neorv32_boot_rom.vhd - Bootloader ROM│└neorv32_bootloader_image.vhd - Bootloader boot ROM memory image├neorv32_busswitch.vhd - Processor bus switch for CPU buses (I&D)├neorv32_bus_keeper.vhd - Processor-internal bus monitor├neorv32_cfs.vhd - Custom functions subsystem├neorv32_debug_dm.vhd - on-chip debugger: debug module├neorv32_debug_dtm.vhd - on-chip debugger: debug transfer module├neorv32_dmem.entity.vhd - Processor-internal data memory (entity-only!)├neorv32_gpio.vhd - General purpose input/output port unit├neorv32_gptmr.vhd - General purpose 32-bit timer├neorv32_icache.vhd - Processor-internal instruction cache├neorv32_imem.entity.vhd - Processor-internal instruction memory (entity-only!)│└neor32_application_image.vhd - IMEM application initialization image├neorv32_mtime.vhd - Machine system timer├neorv32_neoled.vhd - NeoPixel (TM) compatible smart LED interface├neorv32_pwm.vhd - Pulse-width modulation controller├neorv32_slink.vhd - Stream link controller├neorv32_spi.vhd - Serial peripheral interface controller├neorv32_sysinfo.vhd - System configuration information memory├neorv32_trng.vhd - True random number generator├neorv32_twi.vhd - Two wire serial interface controller├neorv32_uart.vhd - Universal async. receiver/transmitter├neorv32_wdt.vhd - Watchdog timer├neorv32_wishbone.vhd - External (Wishbone) bus interface├neorv32_xip.vhd - Execute in place module├neorv32_xirq.vhd - External interrupt controller│├mem/neorv32_dmem.default.vhd - _Default_ data memory (architecture-only)└mem/neorv32_imem.default.vhd - _Default_ instruction memory (architecture-only)...................................[NOTE]The processor-internal instruction and data memories (IMEM and DMEM) are split into two design files each:a plain entity definition (`neorv32_*mem.entity.vhd`) and the actual architecture definition(`mem/neorv32_*mem.default.vhd`). The `*.default.vhd` architecture definitions from `rtl/core/mem` provide a _generic_ and_platform independent_ memory design that (should) infers embedded memory blocks. You can replace/modify the architecturesource file in order to use platform-specific features (like advanced memory resources) or to improve technology mappingand/or timing.<<<// ####################################################################################################################:sectnums:=== FPGA Implementation ResultsThis section shows _exemplary_ FPGA implementation results for the NEORV32 CPU and NEORV32 Processor modules.Note that certain configuration options might also have an impact on other configuration options. Furthermore,this report cannot cover all possible option combinations. Hence, the presented implementation results arejust _exemplary_. If not otherwise mentioned all implementations use the default generic configurations.:sectnums:==== CPU[cols="<2,<8"][grid="topbot"]|=======================| HW version: | `1.6.9.8`| Top entity: | `rtl/core/neorv32_cpu.vhd`| FPGA: | Intel Cyclone IV E `EP4CE22F17C6`| Toolchain: | Quartus Prime Lite 21.1| Constraints: | **no timing constraints**, "_balanced optimization_", f~max~ from "_Slow 1200mV 0C Model_"|=======================[cols="<6,>1,>1,>1,>1,>1"][options="header",grid="rows"]|=======================| CPU ISA Configuration | LEs | FFs | MEM bits | DSPs | _f~max~_| `rv32e` | 830 | 400 | 512 | 0 | 129 MHz| `rv32i` | 834 | 400 | 1024 | 0 | 129 MHz| `rv32i_Zicsr` | 1328 | 678 | 1024 | 0 | 128 MHz| `rv32i_Zicsr_Zicntr` | 1614 | 808 | 1024 | 0 | 128 MHz| `rv32im_Zicsr_Zicntr` | 2087 | 983 | 1024 | 0 | 128 MHz| `rv32ima_Zicsr_Zicntr` | 2129 | 987 | 1024 | 0 | 128 MHz| `rv32imac_Zicsr_Zicntr` | 2338 | 992 | 1024 | 0 | 128 MHz| `rv32imacb_Zicsr_Zicntr` | 3175 | 1247 | 1024 | 0 | 128 MHz| `rv32imacbu_Zicsr_Zicntr` | 3186 | 1254 | 1024 | 0 | 128 MHz| `rv32imacbu_Zicsr_Zicntr_Zifencei` | 3187 | 1254 | 1024 | 0 | 128 MHz| `rv32imacbu_Zicsr_Zicntr_Zifencei_Zfinx` | 4450 | 1906 | 1024 | 7 | 123 MHz| `rv32imacbu_Zicsr_Zicntr_Zifencei_Zfinx_DebugMode` | 4825 | 2018 | 1024 | 7 | 123 MHz|=======================.**RISC-V Compliance**[NOTE]The `Zicsr` ISA extension implements the privileged machine architecture(see <<_zicsr_control_and_status_register_access_privileged_architecture>>). The `Zicntr` ISAextension implements the basic counters and timers (see <<_zicntr_cpu_base_counters>>). Bothextensions are _mandatory_ in order to comply with the RISC-V architecture specifications.[NOTE]The table above does not show _all_ CPU ISA extensions. More sophisticated and application-specificoptions like PMP and HMP are not included in this overview..Goal-Driven Optimization[TIP]The CPU provides further options to reduce the area footprint (for example by constraining the CPU-internalcounter sizes) or to increase performance (for example by using a barrel-shifter; at cost of extra hardware).See section <<_processor_top_entity_generics>> for more information. Also, take a look at the User Guide sectionhttps://stnolting.github.io/neorv32/ug/#_application_specific_processor_configuration[Application-Specific Processor Configuration].:sectnums:==== Processor - Modules[cols="<2,<8"][grid="topbot"]|=======================| HW version: | `1.6.8.3`| Top entity: | `rtl/core/neorv32_top.vhd`| FPGA: | Intel Cyclone IV E `EP4CE22F17C6`| Toolchain: | Quartus Prime Lite 21.1| Constraints: | **no timing constraints**, "_balanced optimization_"|=======================.Hardware utilization by processor module (mandatory modules highlighted in **bold**)[cols="<2,<8,>1,>1,>2,>1"][options="header",grid="rows"]|=======================| Module | Description | LEs | FFs | MEM bits | DSPs| Boot ROM | Bootloader ROM (4kB) | 3 | 2 | 32768 | 0| **BUSKEEPER** | Processor-internal bus monitor | 28 | 15 | 0 | 0| **BUSSWITCH** | Bus multiplexer for CPU instr. and data interface | 69 | 8 | 0 | 0| CFS | Custom functions subsystemfootnote:[Resource utilization depends on custom design logic.] | - | - | - | -| DM | On-chip debugger - debug module | 473 | 240 | 0 | 0| DTM | On-chip debugger - debug transfer module (JTAG) | 259 | 221 | 0 | 0| DMEM | Processor-internal data memory (8kB) | 18 | 2 | 65536 | 0| GPIO | General purpose input/output ports | 102 | 98 | 0 | 0| GPTMR | General Purpose Timer | 153 | 105 | 0 | 0| iCACHE | Instruction cache (2x4 blocks, 64 bytes per block) | 417 | 297 | 4096 | 0| IMEM | Processor-internal instruction memory (16kB) | 12 | 2 | 131072 | 0| MTIME | Machine system timer | 345 | 166 | 0 | 0| NEOLED | Smart LED Interface (NeoPixel/WS28128) (FIFO_depth=1) | 227 | 184 | 0 | 0| PWM | Pulse_width modulation controller (8 channels) | 128 | qq7 | 0 | 0| SLINK | Stream link interface (2xRX, 2xTX, FIFO_depth=1) | 136 | 116 | 0 | 0| SPI | Serial peripheral interface | 114 | 94 | 0 | 0| **SYSINFO** | System configuration information memory | 13 | 11 | 0 | 0| TRNG | True random number generator | 89 | 79 | 0 | 0| TWI | Two-wire interface | 77 | 43 | 0 | 0| UART0, UART1 | Universal asynchronous receiver/transmitter 0/1 (FIFO_depth=1) | 195 | 143 | 0 | 0| WDT | Watchdog timer | 61 | 46 | 0 | 0| WISHBONE | External memory interface | 120 | 112 | 0 | 0| XIP | Execute in place module | 318 | 244 | 0 | 0| XIRQ | External interrupt controller (32 channels) | 245 | 200 | 0 | 0|=======================[NOTE]Note that not all IOs were actually connected to FPGA pins (for example some GPIO inputs and outputs)when generating these reports.<<<:sectnums:==== Exemplary SetupsCheck out the `neorv32-setups` repository (@GitHub: https://github.com/stnolting/neorv32-setups),which provides several demo setups for various FPGA boards and toolchains.<<<// ####################################################################################################################:sectnums:=== CPU PerformanceThe performance of the NEORV32 was tested and evaluated using the https://www.eembc.org/coremark/[Core Mark CPU benchmark].This benchmark focuses on testing the capabilities of the CPU core itself rather than the performance of the wholesystem. The according sources can be found in the `sw/example/coremark` folder..Dhrystone[TIP]A _simple_ port of the Dhrystone benchmark is also available in `sw/example/dhrystone`.The resulting CoreMark score is defined as CoreMark iterations per second.The execution time is determined via the RISC-V `[m]cycle[h]` CSRs. The relative CoreMark score isdefined as CoreMark score divided by the CPU's clock frequency in MHz..Configuration[cols="<2,<8"][grid="topbot"]|=======================| HW version: | `1.5.7.10`| Hardware: | 32kB int. IMEM, 16kB int. DMEM, no caches, 100MHz clock| CoreMark: | 2000 iterations, MEM_METHOD is MEM_STACK| Compiler: | RISCV32-GCC 10.2.0| Compiler flags: | default, see makefile|=======================.CoreMark results[cols="<4,^1,^1,^1"][options="header",grid="rows"]|=======================| CPU | CoreMark Score | CoreMarks/MHz | Average CPI| _small_ (`rv32i_Zicsr`) | 33.89 | **0.3389** | **4.04**| _medium_ (`rv32imc_Zicsr`) | 62.50 | **0.6250** | **5.34**| _performance_ (`rv32imc_Zicsr` + perf. options) | 95.23 | **0.9523** | **3.54**|=======================[NOTE]The CoreMark results were generated using a `rv32i` toolchain. This toolchain supports standard extensionslike `M` and `C` but the built-in libraries only use the base `I` ISA.[NOTE]The "_performance_" CPU configuration uses the <<_fast_mul_en>> and <<_fast_shift_en>> options.The NEORV32 CPU is based on a multi-cycle architecture. Each instruction is executed in a sequence ofseveral consecutive micro operations.The average CPI (cycles per instruction) depends on the instruction mix of a specific applications and also onthe available CPU extensions. The average CPI is computed by dividing the total number of required clock cycles(only the timed core to avoid distortion due to IO wait cycles) by the number of executed instructions(`[m]instret[h]` CSRs). More information regarding the execution time of each implemented instruction can be found inchapter <<_instruction_timing>>.
