OpenCores
URL https://opencores.org/ocsvn/neorv32/neorv32/trunk

Subversion Repositories neorv32

[/] [neorv32/] [trunk/] [docs/] [datasheet/] [overview.adoc] - Diff between revs 72 and 73

Rev 72 Rev 73
Line 17... Line 17...
The software framework of the processor comes with application makefiles, software libraries for all CPU
The software framework of the processor comes with application makefiles, software libraries for all CPU
and processor features, a bootloader, a runtime environment and several example programs - including a port
and processor features, a bootloader, a runtime environment and several example programs - including a port
of the CoreMark MCU benchmark and the official RISC-V architecture test suite. RISC-V GCC is used as
of the CoreMark MCU benchmark and the official RISC-V architecture test suite. RISC-V GCC is used as
default toolchain (https://github.com/stnolting/riscv-gcc-prebuilt[prebuilt toolchains are also provided]).
default toolchain (https://github.com/stnolting/riscv-gcc-prebuilt[prebuilt toolchains are also provided]).
 
 
[TIP]
 
Check out the processor's **https://stnolting.github.io/neorv32/ug[online User Guide]**
Check out the processor's **https://stnolting.github.io/neorv32/ug[online User Guide]**
that provides hands-on tutorials to get you started.
that provides hands-on tutorials to get you started.
The project's change log is available in https://github.com/stnolting/neorv32/blob/main/CHANGELOG.md[CHANGELOG.md]
 
in the root directory of the NEORV32 repository. Please also check out the <<_legal>> section.
 
 
 
 
 
**Structure**
**Structure**
 
 
[start=2]
[start=2]
. <<_neorv32_processor_soc>>
. <<_neorv32_processor_soc>>
. <<_neorv32_central_processing_unit_cpu>>
. <<_neorv32_central_processing_unit_cpu>>
. <<_software_framework>>
. <<_software_framework>>
. <<_on_chip_debugger_ocd>>
. <<_on_chip_debugger_ocd>>
 
. <<_legal>>
 
 
 
 
 
**Annotations**
 
 
 
[WARNING]
 
Warning
 
 
 
[IMPORTANT]
 
Important
 
 
 
[NOTE]
 
Note
 
 
 
[TIP]
 
Tip
 
 
 
 
 
 
// ####################################################################################################################
// ####################################################################################################################
 
 
Line 211... Line 223...
 
 
// ####################################################################################################################
// ####################################################################################################################
:sectnums:
:sectnums:
=== FPGA Implementation Results
=== FPGA Implementation Results
 
 
This chapter shows _exemplary_ implementation results of the NEORV32 CPU and NEORV32 Processor.
This section shows _exemplary_ FPGA implementation results for the NEORV32 CPU and NEORV32 Processor modules.
 
Note that certain configuration options might also have an impact on other configuration options. Furthermore,
 
this report cannot cover all possible option combinations. Hence, the presented implementation results are
 
just _exemplary_. Unless otherwise noted, all implementations use the default generic configuration.
 
 
:sectnums:
:sectnums:
==== CPU
==== CPU
 
 
[cols="<2,<8"]
[cols="<2,<8"]
[grid="topbot"]
[grid="topbot"]
|=======================
|=======================
 
| HW version:  | `1.6.8.3`
| Top entity: | `rtl/core/neorv32_cpu.vhd`
| Top entity: | `rtl/core/neorv32_cpu.vhd`
| FPGA:       | Intel Cyclone IV E `EP4CE22F17C6`
| FPGA:       | Intel Cyclone IV E `EP4CE22F17C6`
| Toolchain:  | Quartus Prime 20.1.0
| Toolchain:   | Quartus Prime Lite 21.1
 
| Constraints: | **no timing constraints**, "_balanced optimization_", f~max~ from "_Slow 1200mV 0C Model_"
|=======================
|=======================
 
 
[cols="<5,>1,>1,>1,>1,>1"]
[cols="<6,>1,>1,>1,>1,>1"]
[options="header",grid="rows"]
[options="header",grid="rows"]
|=======================
|=======================
| CPU                                               | LEs  | FFs  | MEM bits | DSPs | _f~max~_
| CPU ISA Configuration                              | LEs  | FFs  | MEM bits | DSPs | _f~max~_
| `rv32i`                                           |  806 |  359 |     1024 |    0 | 125 MHz
| `rv32e`                                            |  900 |  388 |      512 |    0 | 121 MHz
| `rv32i_Zicsr_Zicntr`                              | 1729 |  813 |     1024 |    0 | 124 MHz
| `rv32i`                                            |  904 |  388 |     1024 |    0 | 121 MHz
| `rv32im_Zicsr_Zicntr`                             | 2269 | 1055 |     1024 |    0 | 124 MHz
| `rv32i_Zicsr`                                      | 1425 |  673 |     1024 |    0 | 118 MHz
| `rv32imc_Zicsr_Zicntr`                            | 2501 | 1070 |     1024 |    0 | 124 MHz
| `rv32i_Zicsr_Zicntr`                               | 1778 |  803 |     1024 |    0 | 118 MHz
| `rv32imac_Zicsr_Zicntr`                           | 2511 | 1074 |     1024 |    0 | 124 MHz
| `rv32im_Zicsr_Zicntr`                              | 2244 |  978 |     1024 |    0 | 118 MHz
| `rv32imacu_Zicsr_Zicntr`                          | 2521 | 1079 |     1024 |    0 | 124 MHz
| `rv32ima_Zicsr_Zicntr`                             | 2267 |  982 |     1024 |    0 | 118 MHz
| `rv32imacu_Zicsr_Zicntr_Zifencei`                 | 2522 | 1079 |     1024 |    0 | 122 MHz
| `rv32imac_Zicsr_Zicntr`                            | 2453 |  994 |     1024 |    0 | 118 MHz
| `rv32imacu_Zicsr_Zicntr_Zifencei_Zfinx`           | 3807 | 1731 |     1024 |    7 | 116 MHz
| `rv32imacb_Zicsr_Zicntr`                           | 3270 | 1249 |     1024 |    0 | 118 MHz
| `rv32imacu_Zicsr_Zicntr_Zifencei_Zfinx_DebugMode` | 3974 | 1815 |     1024 |    7 | 116 MHz
| `rv32imacbu_Zicsr_Zicntr`                          | 3286 | 1254 |     1024 |    0 | 118 MHz
 
| `rv32imacbu_Zicsr_Zicntr_Zifencei`                 | 3278 | 1254 |     1024 |    0 | 118 MHz
 
| `rv32imacbu_Zicsr_Zicntr_Zifencei_Zfinx`           | 4536 | 1906 |     1024 |    7 | 115 MHz
 
| `rv32imacbu_Zicsr_Zicntr_Zifencei_Zfinx_DebugMode` | 5989 | 2416 |     1024 |    7 | 110 MHz
|=======================
|=======================
 
 
 
.**RISC-V Compliance**
 
[NOTE]
 
The `Zicsr` ISA extension implements the privileged machine architecture
 
(see <<_zicsr_control_and_status_register_access_privileged_architecture>>). The `Zicntr` ISA
 
extension implements the basic counters and timers (see <<_zicntr_cpu_base_counters>>). Both
 
extensions are _mandatory_ in order to comply with the RISC-V architecture specifications.
 
 
 
[NOTE]
 
The table above does not show _all_ CPU ISA extensions. More sophisticated and application-specific
 
options like PMP and HPM are not included in this overview.
 
 
 
.Goal-Driven Optimization
[TIP]
[TIP]
The CPU provides further options to reduce the area footprint (for example by constraining the CPU-internal
The CPU provides further options to reduce the area footprint (for example by constraining the CPU-internal
counter sizes) or to increase performance (for example by using a barrel-shifter; at cost of extra hardware).
counter sizes) or to increase performance (for example by using a barrel-shifter; at cost of extra hardware).
See section <<_processor_top_entity_generics>> for more information. Also, take a look at the User Guide section
See section <<_processor_top_entity_generics>> for more information. Also, take a look at the User Guide section
https://stnolting.github.io/neorv32/ug/#_application_specific_processor_configuration[Application-Specific Processor Configuration].
https://stnolting.github.io/neorv32/ug/#_application_specific_processor_configuration[Application-Specific Processor Configuration].
 
 
 
 
:sectnums:
:sectnums:
==== Processor Modules
==== Processor - Modules
 
 
[cols="<2,<8"]
[cols="<2,<8"]
[grid="topbot"]
[grid="topbot"]
|=======================
|=======================
 
| HW version: | `1.6.8.3`
| Top entity: | `rtl/core/neorv32_top.vhd`
| Top entity: | `rtl/core/neorv32_top.vhd`
| FPGA:       | Intel Cyclone IV E `EP4CE22F17C6`
| FPGA:       | Intel Cyclone IV E `EP4CE22F17C6`
| Toolchain:  | Quartus Prime 20.1.0
| Toolchain:  | Quartus Prime Lite 21.1
 
| Constraints: | **no timing constraints**, "_balanced optimization_"
|=======================
|=======================
 
 
.Hardware utilization by the processor modules (mandatory core modules in **bold**)
.Hardware utilization by processor module (mandatory modules highlighted in **bold**)
[cols="<2,<8,>1,>1,>2,>1"]
[cols="<2,<8,>1,>1,>2,>1"]
[options="header",grid="rows"]
[options="header",grid="rows"]
|=======================
|=======================
| Module        | Description                                           | LEs | FFs | MEM bits | DSPs
| Module        | Description                                           | LEs | FFs | MEM bits | DSPs
| Boot ROM      | Bootloader ROM (4kB)                                  |   2 |   1 |    32768 |    0
| Boot ROM      | Bootloader ROM (4kB)                                           |   3 |   2 |    32768 |    0
| **BUSKEEPER** | Processor-internal bus monitor                        |   9 |   6 |        0 |    0
| **BUSKEEPER** | Processor-internal bus monitor                                 |  28 |  15 |        0 |    0
| **BUSSWITCH** | Bus multiplexer for CPU instr. and data interface     |  63 |   8 |        0 |    0
| **BUSSWITCH** | Bus multiplexer for CPU instr. and data interface              |  69 |   8 |        0 |    0
| CFS           | Custom functions subsystemfootnote:[Resource utilization depends on actually implemented custom functionality.] | - | - | - | -
| CFS           | Custom functions subsystemfootnote:[Resource utilization depends on custom design logic.] | - | - | - | -
| DMEM          | Processor-internal data memory (8kB)                  |  19 |   2 |    65536 |    0
| DM            | On-chip debugger - debug module                                | 473 | 240 |        0 |    0
| DM            | On-chip debugger - debug module                       | 493 | 240 |        0 |    0
| DTM           | On-chip debugger - debug transfer module (JTAG)                | 259 | 221 |        0 |    0
| DTM           | On-chip debugger - debug transfer module (JTAG)       | 254 | 218 |        0 |    0
| DMEM          | Processor-internal data memory (8kB)                           |  18 |   2 |    65536 |    0
| GPIO          | General purpose input/output ports                    | 134 | 161 |        0 |    0
| GPIO          | General purpose input/output ports                             | 102 |  98 |        0 |    0
| iCACHE        | Instruction cache (1x4 blocks, 256 bytes per block)   | 221 | 156 |     8192 |    0
| GPTMR         | General Purpose Timer                                          | 153 | 105 |        0 |    0
| IMEM          | Processor-internal instruction memory (16kB)          |  13 |   2 |   131072 |    0
| iCACHE        | Instruction cache (2x4 blocks, 64 bytes per block)             | 417 | 297 |     4096 |    0
| MTIME         | Machine system timer                                  | 319 | 167 |        0 |    0
| IMEM          | Processor-internal instruction memory (16kB)                   |  12 |   2 |   131072 |    0
| NEOLED        | Smart LED Interface (NeoPixel/WS2812) [FIFO_depth=1]  | 226 | 182 |        0 |    0
| MTIME         | Machine system timer                                           | 345 | 166 |        0 |    0
| SLINK         | Stream link interface (2xRX, 2xTX, FIFO_depth=1)      | 208 | 181 |        0 |    0
| NEOLED        | Smart LED Interface (NeoPixel/WS2812) (FIFO_depth=1)           | 227 | 184 |        0 |    0
| PWM           | Pulse-width modulation controller (4 channels)        |  71 |  69 |        0 |    0
| PWM           | Pulse_width modulation controller (8 channels)                 | 128 | qq7 |        0 |    0
| SPI           | Serial peripheral interface                           | 148 | 127 |        0 |    0
| SLINK         | Stream link interface (2xRX, 2xTX, FIFO_depth=1)               | 136 | 116 |        0 |    0
| **SYSINFO**   | System configuration information memory               |  14 |  11 |        0 |    0
| SPI           | Serial peripheral interface                                    | 114 |  94 |        0 |    0
| TRNG          | True random number generator                          |  89 |  76 |        0 |    0
| **SYSINFO**   | System configuration information memory                        |  13 |  11 |        0 |    0
 
| TRNG          | True random number generator                                   |  89 |  79 |        0 |    0
| TWI           | Two-wire interface                                    |  77 |  43 |        0 |    0
| TWI           | Two-wire interface                                    |  77 |  43 |        0 |    0
| UART0/1       | Universal asynchronous receiver/transmitter 0/1       | 183 | 132 |        0 |    0
| UART0, UART1  | Universal asynchronous receiver/transmitter 0/1 (FIFO_depth=1) | 195 | 143 |        0 |    0
| WDT           | Watchdog timer                                        |  53 |  43 |        0 |    0
| WDT           | Watchdog timer                                                 |  61 |  46 |        0 |    0
| WISHBONE      | External memory interface                             | 114 | 110 |        0 |    0
| WISHBONE      | External memory interface                                      | 120 | 112 |        0 |    0
| XIRQ          | External interrupt controller (32 channels)           | 241 | 201 |        0 |    0
| XIP           | Execute in place module                                        | 318 | 244 |        0 |    0
| GPTMR         | General Purpose Timer                                 | 153 | 107 |        0 |    0
| XIRQ          | External interrupt controller (32 channels)                    | 245 | 200 |        0 |    0
| XIP           | Execute in place module                               | 305 | 243 |        0 |    0
 
|=======================
|=======================
 
 
 
[NOTE]
 
Note that not all IOs were actually connected to FPGA pins (for example some GPIO inputs and outputs)
 
when generating these reports.
 
 
 
 
 
 
:sectnums:
:sectnums:
==== Exemplary Setups
==== Exemplary Setups
 
 
Line 335... Line 373...
| _small_ (`rv32i_Zicsr`)                         |          33.89 | **0.3389**    | **4.04**
| _small_ (`rv32i_Zicsr`)                         |          33.89 | **0.3389**    | **4.04**
| _medium_ (`rv32imc_Zicsr`)                      |          62.50 | **0.6250**    | **5.34**
| _medium_ (`rv32imc_Zicsr`)                      |          62.50 | **0.6250**    | **5.34**
| _performance_ (`rv32imc_Zicsr` + perf. options) |          95.23 | **0.9523**    | **3.54**
| _performance_ (`rv32imc_Zicsr` + perf. options) |          95.23 | **0.9523**    | **3.54**
|=======================
|=======================
 
 
[IMPORTANT]
[NOTE]
The CoreMark results were generated using a `rv32i` toolchain. This toolchain supports standard extensions
The CoreMark results were generated using a `rv32i` toolchain. This toolchain supports standard extensions
like `M` and `C` but the built-in libraries only use the base `I` ISA.
like `M` and `C` but the built-in libraries only use the base `I` ISA.
 
 
[NOTE]
[NOTE]
The "_performance_" CPU configuration uses the <<_fast_mul_en>> and <<_fast_shift_en>> options.
The "_performance_" CPU configuration uses the <<_fast_mul_en>> and <<_fast_shift_en>> options.
 
 
[NOTE]
 
The NEORV32 CPU is based on a multi-cycle architecture. Each instruction is executed in a sequence of
The NEORV32 CPU is based on a multi-cycle architecture. Each instruction is executed in a sequence of
several consecutive micro operations.
several consecutive micro operations.
 
 
[NOTE]
 
The average CPI (cycles per instruction) depends on the instruction mix of a specific application and also on
The average CPI (cycles per instruction) depends on the instruction mix of a specific application and also on
the available CPU extensions. The average CPI is computed by dividing the total number of required clock cycles
the available CPU extensions. The average CPI is computed by dividing the total number of required clock cycles
(only the timed core to avoid distortion due to IO wait cycles) by the number of executed instructions
(only the timed core to avoid distortion due to IO wait cycles) by the number of executed instructions
(`[m]instret[h]` CSRs).
(`[m]instret[h]` CSRs). More information regarding the execution time of each implemented instruction can be found in
 
 
[TIP]
 
More information regarding the execution time of each implemented instruction can be found in
 
chapter <<_instruction_timing>>.
chapter <<_instruction_timing>>.
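
The CPI computation described above can be sketched as follows. This is a minimal, hypothetical post-processing helper and not part of the NEORV32 software framework: the function names and the sample counter values are assumptions made for illustration. It also assumes that the cycle count is read from the `[m]cycle[h]` CSRs in the same way the instruction count is read from `[m]instret[h]`, with each 64-bit counter delivered as a high and a low 32-bit word.

```python
def combine_counter_words(high_word: int, low_word: int) -> int:
    """Join the high and low 32-bit words of a 64-bit counter (hypothetical helper)."""
    return ((high_word & 0xFFFFFFFF) << 32) | (low_word & 0xFFFFFFFF)


def average_cpi(cycles: int, instructions: int) -> float:
    """Average CPI = clock cycles of the timed core / retired instructions."""
    if instructions <= 0:
        raise ValueError("instruction count must be positive")
    return cycles / instructions


# Illustrative values only (assumptions, not measured results).
cycles = combine_counter_words(0x0, 4_040_000)        # e.g. from [m]cycle[h]
instructions = combine_counter_words(0x0, 1_000_000)  # e.g. from [m]instret[h]
print(f"average CPI: {average_cpi(cycles, instructions):.2f}")
```

Both counters should be sampled immediately before and after the timed core so that IO wait cycles outside that region do not distort the ratio; the values passed to `average_cpi` are then the differences between the two samples.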
