:sectnums:
== Overview

The NEORV32footnote:[Pronounced "neo-R-V-thirty-two" or "neo-risc-five-thirty-two" in its long form.] is an open-source
RISC-V compatible processor system that is intended as a *ready-to-go* auxiliary processor within larger SoC
designs or as a stand-alone custom / customizable microcontroller.

The system is highly configurable and provides optional common peripherals like embedded memories,
timers, serial interfaces, general purpose IO ports and an external bus interface to connect custom IP like
memories, NoCs and other peripherals. On-line and in-system debugging is supported by an OpenOCD/gdb
compatible on-chip debugger accessible via JTAG.

The software framework of the processor comes with application makefiles, software libraries for all CPU
and processor features, a bootloader, a runtime environment and several example programs - including a port
of the CoreMark MCU benchmark and the official RISC-V architecture test suite. RISC-V GCC is used as the
default toolchain (https://github.com/stnolting/riscv-gcc-prebuilt[prebuilt toolchains are also provided]).

[TIP]
Check out the processor's **https://stnolting.github.io/neorv32/ug[online User Guide]**,
which provides hands-on tutorials to get you started.

[TIP]
The project's change log is available in https://github.com/stnolting/neorv32/blob/master/CHANGELOG.md[CHANGELOG.md]
in the root directory of the NEORV32 repository. Please also check out the <<_legal>> section.


**Structure**

[start=2]
. <<_neorv32_processor_soc>>
. <<_neorv32_central_processing_unit_cpu>>
. <<_software_framework>>
. <<_on_chip_debugger_ocd>>

[TIP]
Links in this document are <<_overview,highlighted>>.



<<<
// ####################################################################################################################
:sectnums:
=== Rationale

**Why did you make this?**

I am fascinated by processor and CPU architecture design: it is the magic frontier where software meets hardware.
This project started as something like a _journey_ into this magic realm to understand how things actually work
down at this very low level.

But there is more! When I started to dive into the emerging RISC-V ecosystem I felt overwhelmed by the complexity.
As a beginner it is hard to get an overview - especially when you want to set up a minimal platform to tinker with:
Which core to use? How to get the right toolchain? What features do I need? How does the booting work? How do I
create an actual executable? How to get that into the hardware? How to customize things? **_Where to start???_**

So this project aims to provide a _simple to understand_ and _easy to use_ yet _powerful_ and _flexible_ platform
that targets FPGA and RISC-V beginners as well as advanced users. Join me and us on this journey! 🙃


**Why a _soft_-core processor?**

As a matter of fact, soft-core processors _cannot_ compete with discrete or FPGA hard-macro processors in terms
of performance, energy and size. But they do fill a niche in the FPGA design space. For example, soft-core processors
make it possible to implement the _control flow part_ of certain applications (like communication protocol handling)
in software, for example in plain C. This provides high flexibility as software can be easily changed, re-compiled
and re-uploaded.

Furthermore, the concept of flexibility applies to all aspects of a soft-core processor. The user can add
_exactly_ the features that are required by the application: additional memories, custom interfaces, specialized
IP and even user-defined instructions.


**Why RISC-V?**

[quote, RISC-V International, https://riscv.org/about/]
____
RISC-V is a free and open ISA enabling a new era of processor innovation through open standard collaboration.
____

I love the idea of open-source. **Knowledge can help best if it is freely available.**
While open-source has already become quite popular in _software_, hardware projects still need to catch up.
Admittedly, there has been quite a development, but mainly in terms of _platforms_ and _applications_ (so
schematics, PCBs, etc.). Although processors and CPUs are the heart of almost every digital system, truly
open-source silicon is still a rarity. RISC-V aims to change that. Even if it is _just one approach_, it helps pave
the road for future development.

Furthermore, I welcome the community aspect of RISC-V. The ISA and everything beyond is developed in direct
contact with the community: this includes businesses and professionals but also hobbyists, amateurs and people
who are just curious. Everyone can join discussions and contribute to RISC-V in their very own way.

Finally, I really like the RISC-V ISA itself. It aims to be a clean, orthogonal and "intuitive" ISA that
reflects the basic concepts of _RISC_: simple yet effective.


**Yet another RISC-V core? What makes it special?**

The NEORV32 is not based on another RISC-V core. It was built entirely from the ground up (just following the official
ISA specs) with a different design goal in mind. The project does not intend to replace certain RISC-V cores or
just beat existing ones like https://github.com/SpinalHDL/VexRiscv[VexRiscv] in terms of performance or
https://github.com/olofk/serv[SERV] in terms of size.

The project aims to provide _another option_ in the RISC-V / soft-core design space with a different performance
vs. size trade-off and a different focus: _embrace_ concepts like documentation, platform-independence / portability,
RISC-V compatibility, _customization_ and _ease of use_. See the <<_project_key_features>> below.


// ####################################################################################################################
:sectnums:
=== Project Key Features

* open-source and documented; including user guides to get started
* completely described in behavioral, platform-independent VHDL (yet platform-optimized modules are provided)
* fully synchronous design, no latches, no gated clocks
* small hardware footprint and high operating frequency for easy integration
* **NEORV32 CPU**: 32-bit `rv32i` RISC-V CPU
** RISC-V compatibility: passes the official architecture tests
** base architecture + privileged architecture (optional) + ISA extensions (optional)
** rich set of customization options (ISA extensions, design goal: performance / area (/ energy), ...)
** aims to support <<_full_virtualization>> capabilities (CPU _and_ SoC) to increase execution safety
** official https://github.com/riscv/riscv-isa-manual/blob/master/marchid.md[RISC-V open source architecture ID]
* **NEORV32 Processor (SoC)**: highly-configurable full-scale microcontroller-like processor system
** based on the NEORV32 CPU
** optional serial interfaces (UARTs, TWI, SPI)
** optional timers and counters (WDT, MTIME)
** optional general purpose IO and PWM and native NeoPixel (c) compatible smart LED interface
** optional embedded memories / caches for data, instructions and bootloader
** optional external memory interface (Wishbone / AXI4-Lite) and stream link interface (AXI4-Stream) for custom connectivity
** on-chip debugger compatible with OpenOCD and gdb
* **Software framework**
** GCC-based toolchain - prebuilt toolchains available; application compilation based on GNU makefiles
** internal bootloader with serial user interface
** core libraries for high-level usage of the provided functions and peripherals
** runtime environment and several example programs
** doxygen-based documentation of the software framework; a deployed version is available at https://stnolting.github.io/neorv32/sw/files.html
** FreeRTOS port + demos available

[TIP]
For more in-depth details regarding the features provided by the hardware, see the corresponding sections:
<<_neorv32_central_processing_unit_cpu>> and <<_neorv32_processor_soc>>.


<<<
// ####################################################################################################################
:sectnums:
=== Project Folder Structure

...................................
neorv32                - Project home folder
│
├docs                  - Project documentation
│├datasheet            - .adoc sources for NEORV32 data sheet
│├doxygen_build        - Software framework documentation (generated by doxygen)
│├figures              - Figures and logos
│├icons                - Misc. symbols
│├references           - Data sheets and RISC-V specs.
│└src_adoc             - AsciiDoc sources for this document
│
├rtl                   - VHDL sources
│├core                 - Core sources of the CPU & SoC
││└mem                 - SoC-internal memories (default architectures)
│├processor_templates  - Pre-configured SoC wrappers
│├system_integration   - System wrappers for advanced connectivity
│└test_setups          - Minimal test setup "SoCs" used in the User Guide
│
├setups                - Example setups for various FPGAs, boards and toolchains
│└...
│
├sim                   - Simulation files (see User Guide)
│
└sw                    - Software framework
 ├bootloader           - Sources and scripts for the NEORV32 internal bootloader
 ├common               - Linker script and crt0.S start-up code
 ├example              - Various example programs
 │└...
 ├isa-test
 │├riscv-arch-test     - RISC-V spec. compatibility test framework (submodule)
 │└port-neorv32        - Port files for the official RISC-V architecture tests
 ├ocd_firmware         - source code for on-chip debugger's "park loop"
 ├openocd              - OpenOCD on-chip debugger configuration files
 ├image_gen            - Helper program to generate NEORV32 executables
 └lib                  - Processor core library
  ├include             - Header files (*.h)
  └source              - Source files (*.c)
...................................



<<<
// ####################################################################################################################
:sectnums:
=== VHDL File Hierarchy

All necessary VHDL hardware description files are located in the project's `rtl/core` folder. The top entity
of the entire processor, including all the required configuration generics, is **`neorv32_top.vhd`**.

[IMPORTANT]
All core VHDL files from the list below have to be assigned to a new design library named **`neorv32`**. Additional
files, like alternative top entities, can be assigned to any library.

...................................
neorv32_top.vhd                  - NEORV32 Processor top entity
│
├neorv32_fifo.vhd                - General purpose FIFO component
├neorv32_package.vhd             - Processor/CPU main VHDL package file
│
├neorv32_cpu.vhd                 - NEORV32 CPU top entity
│├neorv32_cpu_alu.vhd            - Arithmetic/logic unit
││├neorv32_cpu_cp_bitmanip.vhd   - Bit-manipulation co-processor (B ext.)
││├neorv32_cpu_cp_fpu.vhd        - Floating-point co-processor (Zfinx ext.)
││├neorv32_cpu_cp_muldiv.vhd     - Mul/Div co-processor (M extension)
││└neorv32_cpu_cp_shifter.vhd    - Bit-shift co-processor
│├neorv32_cpu_bus.vhd            - Bus interface + physical memory protection
│├neorv32_cpu_control.vhd        - CPU control, exception/IRQ system and CSRs
││└neorv32_cpu_decompressor.vhd  - Compressed instructions decoder
│└neorv32_cpu_regfile.vhd        - Data register file
│
├neorv32_boot_rom.vhd            - Bootloader ROM
│└neorv32_bootloader_image.vhd   - Bootloader boot ROM memory image
├neorv32_busswitch.vhd           - Processor bus switch for CPU buses (I&D)
├neorv32_bus_keeper.vhd          - Processor-internal bus monitor
├neorv32_cfs.vhd                 - Custom functions subsystem
├neorv32_debug_dm.vhd            - on-chip debugger: debug module
├neorv32_debug_dtm.vhd           - on-chip debugger: debug transfer module
├neorv32_dmem.entity.vhd         - Processor-internal data memory (entity-only!)
├neorv32_gpio.vhd                - General purpose input/output port unit
├neorv32_icache.vhd              - Processor-internal instruction cache
├neorv32_imem.entity.vhd         - Processor-internal instruction memory (entity-only!)
│└neorv32_application_image.vhd  - IMEM application initialization image
├neorv32_mtime.vhd               - Machine system timer
├neorv32_neoled.vhd              - NeoPixel (TM) compatible smart LED interface
├neorv32_pwm.vhd                 - Pulse-width modulation controller
├neorv32_spi.vhd                 - Serial peripheral interface controller
├neorv32_sysinfo.vhd             - System configuration information memory
├neorv32_trng.vhd                - True random number generator
├neorv32_twi.vhd                 - Two wire serial interface controller
├neorv32_uart.vhd                - Universal async. receiver/transmitter
├neorv32_wdt.vhd                 - Watchdog timer
├neorv32_wishbone.vhd            - External (Wishbone) bus interface
│
├mem/neorv32_dmem.default.vhd    - _Default_ data memory (architecture-only!)
└mem/neorv32_imem.default.vhd    - _Default_ instruction memory (architecture-only!)
...................................

[NOTE]
The processor-internal instruction and data memories (IMEM and DMEM) are split into two design files each:
a plain entity definition (`neorv32_*mem.entity.vhd`) and the actual architecture definition
(`mem/neorv32_*mem.default.vhd`). The **default** architecture definitions from `rtl/core/mem` provide a _generic_ and
_platform independent_ memory design that should infer embedded memory blocks. You can replace or modify the architecture
source file in order to use platform-specific features (like advanced memory resources) or to improve technology mapping
and/or timing.


<<<
// ####################################################################################################################
:sectnums:
=== FPGA Implementation Results

This chapter shows _exemplary_ implementation results of the NEORV32 CPU and NEORV32 Processor.

:sectnums:
==== CPU

[cols="<2,<8"]
[grid="topbot"]
|=======================
| Hardware version: | `1.5.7.10`
| Top entity:       | `rtl/core/neorv32_cpu.vhd`
|=======================

[cols="<5,>1,>1,>1,>1,>1"]
[options="header",grid="rows"]
|=======================
| CPU                                        | LEs  | FFs  | MEM bits | DSPs | _f~max~_
| `rv32i`                                    |  806 |  359 |     1024 |    0 | 125 MHz
| `rv32i_Zicsr`                              | 1729 |  813 |     1024 |    0 | 124 MHz
| `rv32im_Zicsr`                             | 2269 | 1055 |     1024 |    0 | 124 MHz
| `rv32imc_Zicsr`                            | 2501 | 1070 |     1024 |    0 | 124 MHz
| `rv32imac_Zicsr`                           | 2511 | 1074 |     1024 |    0 | 124 MHz
| `rv32imacu_Zicsr`                          | 2521 | 1079 |     1024 |    0 | 124 MHz
| `rv32imacu_Zicsr_Zifencei`                 | 2522 | 1079 |     1024 |    0 | 122 MHz
| `rv32imacu_Zicsr_Zifencei_Zfinx`           | 3807 | 1731 |     1024 |    7 | 116 MHz
| `rv32imacu_Zicsr_Zifencei_Zfinx_DebugMode` | 3974 | 1815 |     1024 |    7 | 116 MHz
|=======================

[NOTE]
No HPM counters and no PMP regions were implemented for generating these results.
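
As a reading aid, the incremental logic cost of each extension can be derived from the LE column of the table above.
The following Python sketch only re-uses the LE counts listed in the table; the names `LE_COUNTS` and `le_increments`
are illustrative and not part of the project. The resulting deltas apply to this particular synthesis run only and
should not be assumed to be additive for other configurations or toolchains.

```python
# LE counts copied from the CPU implementation results table above.
LE_COUNTS = [
    ("rv32i", 806),
    ("rv32i_Zicsr", 1729),
    ("rv32im_Zicsr", 2269),
    ("rv32imc_Zicsr", 2501),
    ("rv32imac_Zicsr", 2511),
    ("rv32imacu_Zicsr", 2521),
    ("rv32imacu_Zicsr_Zifencei", 2522),
    ("rv32imacu_Zicsr_Zifencei_Zfinx", 3807),
    ("rv32imacu_Zicsr_Zifencei_Zfinx_DebugMode", 3974),
]


def le_increments(rows):
    """Return (configuration, additional LEs compared to the previous row) pairs."""
    return [(name, les - prev_les)
            for (_, prev_les), (name, les) in zip(rows, rows[1:])]


# Print the additional LEs each configuration needs over the one listed before it.
for name, delta in le_increments(LE_COUNTS):
    print(f"{name:42s} +{delta} LEs")
```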

[TIP]
The CPU provides further options to reduce the area footprint (for example by constraining the CPU-internal
counter sizes) or to increase performance (for example by using a barrel-shifter; at cost of extra hardware).
See section <<_processor_top_entity_generics>> for more information. Also, take a look at the User Guide section
https://stnolting.github.io/neorv32/ug/#_application_specific_processor_configuration[Application-Specific Processor Configuration].


:sectnums:
==== Processor Modules

[cols="<2,<8"]
[grid="topbot"]
|=======================
| Hardware version: | `1.5.7.15`
| Top entity:       | `rtl/core/neorv32_top.vhd`
|=======================

.Hardware utilization by the processor modules (mandatory core modules in **bold**)
[cols="<2,<8,>1,>1,>2,>1"]
[options="header",grid="rows"]
|=======================
| Module        | Description                                           | LEs | FFs | MEM bits | DSPs
| Boot ROM      | Bootloader ROM (4kB)                                  |   2 |   1 |    32768 |    0
| **BUSKEEPER** | Processor-internal bus monitor                        |   9 |   6 |        0 |    0
| **BUSSWITCH** | Bus mux for CPU instr. and data interface             |  63 |   8 |        0 |    0
| CFS           | Custom functions subsystemfootnote:[Resource utilization depends on actually implemented custom functionality.] | - | - | - | -
| DMEM          | Processor-internal data memory (8kB)                  |  19 |   2 |    65536 |    0
| DM            | On-chip debugger - debug module                       | 493 | 240 |        0 |    0
| DTM           | On-chip debugger - debug transfer module (JTAG)       | 254 | 218 |        0 |    0
| GPIO          | General purpose input/output ports                    | 134 | 161 |        0 |    0
| iCACHE        | Instruction cache (1x4 blocks, 256 bytes per block)   | 221 | 156 |     8192 |    0
| IMEM          | Processor-internal instruction memory (16kB)          |  13 |   2 |   131072 |    0
| MTIME         | Machine system timer                                  | 319 | 167 |        0 |    0
| NEOLED        | Smart LED Interface (NeoPixel/WS2812) [FIFO_depth=1]  | 226 | 182 |        0 |    0
| SLINK         | Stream link interface (2xRX, 2xTX, FIFO_depth=1)      | 208 | 181 |        0 |    0
| PWM           | Pulse-width modulation controller (4 channels)        |  71 |  69 |        0 |    0
| SPI           | Serial peripheral interface                           | 148 | 127 |        0 |    0
| **SYSINFO**   | System configuration information memory               |  14 |  11 |        0 |    0
| TRNG          | True random number generator                          |  89 |  76 |        0 |    0
| TWI           | Two-wire interface                                    |  77 |  43 |        0 |    0
| UART0/1       | Universal asynchronous receiver/transmitter 0/1       | 183 | 132 |        0 |    0
| WDT           | Watchdog timer                                        |  53 |  43 |        0 |    0
| WISHBONE      | External memory interface                             | 114 | 110 |        0 |    0
| XIRQ          | External interrupt controller (32 channels)           | 241 | 201 |        0 |    0
|=======================
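
The _MEM bits_ entries of the memory modules follow directly from the sizes given in the description column.
A quick plausibility check in Python, assuming that "kB" means 1024 bytes and that only the data array of the
instruction cache is counted (both are assumptions made for this sketch; the helper name `mem_bits` is hypothetical):

```python
KB = 1024  # assumption: "kB" in the table means 1024 bytes


def mem_bits(size_bytes):
    """Number of memory bits needed to store size_bytes bytes."""
    return size_bytes * 8


boot_rom_bits = mem_bits(4 * KB)      # Boot ROM (4kB)
dmem_bits = mem_bits(8 * KB)          # DMEM (8kB)
imem_bits = mem_bits(16 * KB)         # IMEM (16kB)
icache_bits = mem_bits(1 * 4 * 256)   # iCACHE: 1x4 blocks, 256 bytes per block

print(boot_rom_bits, dmem_bits, imem_bits, icache_bits)
```

The four values match the _MEM bits_ column of the Boot ROM, DMEM, IMEM and iCACHE rows.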


<<<
:sectnums:
==== Exemplary Setups

Check out the `setups` folder (@GitHub: https://github.com/stnolting/neorv32/tree/master/setups),
which provides several demo setups for various FPGA boards and toolchains.


<<<
// ####################################################################################################################
:sectnums:
=== CPU Performance

The performance of the NEORV32 was tested and evaluated using the https://www.eembc.org/coremark/[CoreMark CPU benchmark].
This benchmark focuses on testing the capabilities of the CPU core itself rather than the performance of the whole
system. The corresponding sources can be found in the `sw/example/coremark` folder.

.Dhrystone
[TIP]
A _simple_ port of the Dhrystone benchmark is also available in `sw/example/dhrystone`.

The resulting CoreMark score is defined as CoreMark iterations per second.
The execution time is determined via the RISC-V `[m]cycle[h]` CSRs. The relative CoreMark score is
defined as CoreMark score divided by the CPU's clock frequency in MHz.
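
The two definitions above can be sketched as a small host-side calculation. This is only an illustration: the
function names are hypothetical, the cycle count in the usage lines is made up, and the 64-bit cycle counter is
assumed to be assembled from the high (`[m]cycleh`) and low (`[m]cycle`) 32-bit CSR words.

```python
def combine_counter(high_word, low_word):
    """Combine the 32-bit high and low CSR words into one 64-bit counter value."""
    return ((high_word & 0xFFFFFFFF) << 32) | (low_word & 0xFFFFFFFF)


def coremark_score(iterations, cycles, clock_hz):
    """CoreMark score: iterations per second of execution time."""
    seconds = cycles / clock_hz
    return iterations / seconds


def relative_score(score, clock_hz):
    """Relative CoreMark score: score divided by the clock frequency in MHz."""
    return score / (clock_hz / 1_000_000)


# Hypothetical measurement: 2000 iterations at a 100 MHz clock.
cycles = combine_counter(0x1, 0x2A05F200)  # made-up CSR readout
score = coremark_score(2000, cycles, 100_000_000)
print(score, relative_score(score, 100_000_000))
```

With a 100 MHz clock the relative score is simply the score divided by 100.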

.Configuration
[cols="<2,<8"]
[grid="topbot"]
|=======================
| HW version:     | `1.5.7.10`
| Hardware:       | 32kB int. IMEM, 16kB int. DMEM, no caches, 100MHz clock
| CoreMark:       | 2000 iterations, MEM_METHOD is MEM_STACK
| Compiler:       | RISCV32-GCC 10.2.0
| Compiler flags: | default, see makefile
|=======================

.CoreMark results
[cols="<4,^1,^1,^1"]
[options="header",grid="rows"]
|=======================
| CPU                                             | CoreMark Score | CoreMarks/MHz | Average CPI
| _small_ (`rv32i_Zicsr`)                         |          33.89 | **0.3389**    | **4.04**
| _medium_ (`rv32imc_Zicsr`)                      |          62.50 | **0.6250**    | **5.34**
| _performance_ (`rv32imc_Zicsr` + perf. options) |          95.23 | **0.9523**    | **3.54**
|=======================

[NOTE]
The "_performance_" CPU configuration uses the <<_fast_mul_en>> and <<_fast_shift_en>> options.

[NOTE]
The NEORV32 CPU is based on a multi-cycle architecture. Each instruction is executed in a sequence of
several consecutive micro operations.

[NOTE]
The average CPI (cycles per instruction) depends on the instruction mix of a specific application and also on
the available CPU extensions. The average CPI is computed by dividing the total number of required clock cycles
(only the timed core to avoid distortion due to IO wait cycles) by the number of executed instructions
(`[m]instret[h]` CSRs).
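
The CPI computation described in the note above can be sketched as follows. The sketch assumes that the cycle and
instruction counters are sampled right before and right after the timed core; the function names and the sample
values are hypothetical.

```python
def counter_delta(start, end, width_bits=64):
    """Elapsed counter ticks between two samples, tolerating a single wrap-around."""
    return (end - start) % (1 << width_bits)


def average_cpi(cycles, instructions):
    """Average cycles per instruction over the timed region."""
    if instructions <= 0:
        raise ValueError("instruction count must be positive")
    return cycles / instructions


# Made-up counter samples taken around the timed core.
cycles = counter_delta(100, 4_040_100)
instructions = counter_delta(50, 1_000_050)
print(average_cpi(cycles, instructions))
```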

[TIP]
More information regarding the execution time of each implemented instruction can be found in
chapter <<_instruction_timing>>.
