:sectnums:
== Overview

The NEORV32footnote:[Pronounced "neo-R-V-thirty-two" or "neo-risc-five-thirty-two" in its long form.] is an open-source
RISC-V compatible processor system that is intended as a *ready-to-go* auxiliary processor within larger SoC
designs or as a stand-alone custom / customizable microcontroller.

The system is highly configurable and provides optional common peripherals like embedded memories,
timers, serial interfaces, general purpose IO ports and an external bus interface to connect custom IP like
memories, NoCs and other peripherals. On-line and in-system debugging is supported by an OpenOCD/gdb
compatible on-chip debugger accessible via JTAG.

The software framework of the processor comes with application makefiles, software libraries for all CPU
and processor features, a bootloader, a runtime environment and several example programs – including a port
of the CoreMark MCU benchmark and the official RISC-V architecture test suite. RISC-V GCC is used as
default toolchain (https://github.com/stnolting/riscv-gcc-prebuilt[prebuilt toolchains are also provided]).
 
[TIP]
Check out the processor's **https://stnolting.github.io/neorv32/ug[online User Guide]**,
which provides hands-on tutorials to get you started.

[TIP]
The project's change log is available in https://github.com/stnolting/neorv32/blob/master/CHANGELOG.md[CHANGELOG.md]
in the root directory of the NEORV32 repository. Please also check out the <<_legal>> section.


**Structure**

* <<_neorv32_processor_soc>>
* <<_neorv32_central_processing_unit_cpu>>
* <<_on_chip_debugger_ocd>>
* <<_software_framework>>

[TIP]
Links in this document are <<_overview,highlighted>>.


<<<
// ####################################################################################################################
:sectnums:
=== Rationale
 
**Why did you make this?**

I am fascinated by processor and CPU architecture design: it is the magic frontier where software meets hardware.
This project started as something like a _journey_ into this magic realm to understand how things actually work
down on this very low level.

But there is more! When I started to dive into the emerging RISC-V ecosystem I felt overwhelmed by the complexity.
As a beginner it is hard to get an overview - especially when you want to set up a minimal platform to tinker with:
Which core to use? How to get the right toolchain? What features do I need? How does the booting work? How do I
create an actual executable? How to get that into the hardware? How to customize things? **_Where to start???_**

So this project aims to provide a _simple to understand_ and _easy to use_ yet _powerful_ and _flexible_ platform
that targets FPGA and RISC-V beginners as well as advanced users. Join us on this journey! 🙃


**Why a _soft_-core processor?**

As a matter of fact soft-core processors _cannot_ compete with discrete or FPGA hard-macro processors in terms
of performance, energy and size. But they do fill a niche in the FPGA design space. For example, soft-core processors
make it possible to implement the _control flow part_ of certain applications (like communication protocol handling)
in software, for example in plain C. This provides high flexibility as software can be easily changed, re-compiled and
re-uploaded.

Furthermore, the concept of flexibility applies to all aspects of a soft-core processor. The user can add
_exactly_ the features that are required by the application: additional memories, custom interfaces, specialized
IP and even user-defined instructions.
 
**Why RISC-V?**

[quote, RISC-V International, https://riscv.org/about/]
____
RISC-V is a free and open ISA enabling a new era of processor innovation through open standard collaboration.
____

I love the idea of open-source. **Knowledge can help best if it is freely available.**
While open-source has already become quite popular in _software_, hardware projects still need to catch up.
Admittedly, there has been quite a development, but mainly in terms of _platforms_ and _applications_ (so
schematics, PCBs, etc.). Although processors and CPUs are the heart of almost every digital system, truly
open-source silicon is still a rarity. RISC-V aims to change that. Even if it is _just one approach_, it helps pave
the road for future development.

Furthermore, I welcome the community aspect of RISC-V. The ISA and everything beyond is developed in direct
contact with the community: this includes businesses and professionals but also hobbyists, amateurs and people
who are just curious. Everyone can join discussions and contribute to RISC-V in their very own way.

Finally, I really like the RISC-V ISA itself. It aims to be a clean, orthogonal and "intuitive" ISA that
reflects the basic concepts of _RISC_: simple yet effective.


**Yet another RISC-V core? What makes it special?**

The NEORV32 is not based on another RISC-V core. It was built entirely from the ground up (just following the official
ISA specs) with a different design goal in mind. The project does not intend to replace certain RISC-V cores or
just beat existing ones like https://github.com/SpinalHDL/VexRiscv[VexRiscv] in terms of performance or
https://github.com/olofk/serv[SERV] in terms of size.

The project aims to provide _another option_ in the RISC-V / soft-core design space with a different performance
vs. size trade-off and a different focus: _embrace_ concepts like documentation, platform-independence / portability,
RISC-V compatibility, _customization_ and _ease of use_. See the <<_project_key_features>> below.
 
// ####################################################################################################################
:sectnums:
=== Project Key Features

* open-source and documented; including user guides to get started
* completely described in behavioral, platform-independent VHDL (yet platform-optimized modules are provided)
* fully synchronous design, no latches, no gated clocks
* small hardware footprint and high operating frequency for easy integration
* **NEORV32 CPU**: 32-bit `rv32i` RISC-V CPU
** RISC-V compatibility: passes the official architecture tests
** base architecture + privileged architecture (optional) + ISA extensions (optional)
** rich set of customization options (ISA extensions, design goal: performance / area (/ energy), ...)
** official https://github.com/riscv/riscv-isa-manual/blob/master/marchid.md[RISC-V open source architecture ID]
* **NEORV32 Processor (SoC)**: highly-configurable full-scale microcontroller-like processor system
** based on the NEORV32 CPU
** optional serial interfaces (UARTs, TWI, SPI)
** optional timers and counters (WDT, MTIME)
** optional general purpose IO and PWM and native NeoPixel (c) compatible smart LED interface
** optional embedded memories / caches for data, instructions and bootloader
** optional external memory interface (Wishbone / AXI4-Lite) and stream link interface (AXI4-Stream) for custom connectivity
** on-chip debugger compatible with OpenOCD and gdb
* **Software framework**
** GCC-based toolchain - prebuilt toolchains available; application compilation based on GNU makefiles
** internal bootloader with serial user interface
** core libraries for high-level usage of the provided functions and peripherals
** runtime environment and several example programs
** doxygen-based documentation of the software framework; a deployed version is available at https://stnolting.github.io/neorv32/sw/files.html
** FreeRTOS port + demos available

[TIP]
For more in-depth details regarding the features provided by the hardware, see the corresponding sections:
<<_neorv32_central_processing_unit_cpu>> and <<_neorv32_processor_soc>>.
 
<<<
// ####################################################################################################################
:sectnums:
=== Project Folder Structure

...................................
neorv32           - Project home folder
├.ci              - Scripts for continuous integration
├setups           - Example setups for various FPGA boards and toolchains
│└...
├CHANGELOG.md     - Project change log
├docs             - Project documentation
│├doxygen_build   - Software framework documentation (generated by doxygen)
│├src_adoc        - AsciiDoc sources for this document
│├references      - Data sheets and RISC-V specs.
│└figures         - Figures and logos
├riscv-arch-test  - Port files for the official RISC-V architecture tests
├rtl              - VHDL sources
│├core            - Sources of the CPU & SoC
│└templates       - Alternate/additional top entities/wrappers
│ ├processor      - Processor wrappers
│ └system         - System wrappers for advanced connectivity
├sim              - Simulation files
│└rtl_modules     - Processor modules for simulation-only
└sw               - Software framework
 ├bootloader      - Sources and scripts for the NEORV32 internal bootloader
 ├common          - Linker script and crt0.S start-up code
 ├example         - Various example programs
 │└...
 ├ocd_firmware    - Source code for the on-chip debugger's "park loop"
 ├openocd         - OpenOCD on-chip debugger configuration files
 ├image_gen       - Helper program to generate NEORV32 executables
 └lib             - Processor core library
  ├include        - Header files (*.h)
  └source         - Source files (*.c)
...................................

[NOTE]
There are further files and folders starting with a dot which – for example – contain
data/configurations only relevant for git or for the continuous integration framework (`.ci`).
 
<<<
// ####################################################################################################################
:sectnums:
=== VHDL File Hierarchy

All necessary VHDL hardware description files are located in the project's `rtl/core` folder. The top entity
of the entire processor including all the required configuration generics is **`neorv32_top.vhd`**.

[IMPORTANT]
All core VHDL files from the list below have to be assigned to a new design library named **`neorv32`**. Additional
files, like alternative top entities, can be assigned to any library.

...................................
neorv32_top.vhd                  - NEORV32 Processor top entity
│
├neorv32_fifo.vhd                - General purpose FIFO component
├neorv32_package.vhd             - Processor/CPU main VHDL package file
│
├neorv32_cpu.vhd                 - NEORV32 CPU top entity
│├neorv32_cpu_alu.vhd            - Arithmetic/logic unit
││├neorv32_cpu_cp_fpu.vhd        - Floating-point co-processor (Zfinx ext.)
││├neorv32_cpu_cp_muldiv.vhd     - Mul/Div co-processor (M extension)
││└neorv32_cpu_cp_shifter.vhd    - Bit-shift co-processor
│├neorv32_cpu_bus.vhd            - Bus interface + physical memory protection
│├neorv32_cpu_control.vhd        - CPU control, exception/IRQ system and CSRs
││└neorv32_cpu_decompressor.vhd  - Compressed instructions decoder
│└neorv32_cpu_regfile.vhd        - Data register file
│
├neorv32_boot_rom.vhd            - Bootloader ROM
│└neorv32_bootloader_image.vhd   - Bootloader boot ROM memory image
├neorv32_busswitch.vhd           - Processor bus switch for CPU buses (I&D)
├neorv32_bus_keeper.vhd          - Processor-internal bus monitor
├neorv32_icache.vhd              - Processor-internal instruction cache
├neorv32_cfs.vhd                 - Custom functions subsystem
├neorv32_debug_dm.vhd            - On-chip debugger: debug module
├neorv32_debug_dtm.vhd           - On-chip debugger: debug transfer module
├neorv32_dmem.vhd                - Processor-internal data memory
├neorv32_gpio.vhd                - General purpose input/output port unit
├neorv32_imem.vhd                - Processor-internal instruction memory
│└neorv32_application_image.vhd  - IMEM application initialization image
├neorv32_mtime.vhd               - Machine system timer
├neorv32_neoled.vhd              - NeoPixel (TM) compatible smart LED interface
├neorv32_pwm.vhd                 - Pulse-width modulation controller
├neorv32_spi.vhd                 - Serial peripheral interface controller
├neorv32_sysinfo.vhd             - System configuration information memory
├neorv32_trng.vhd                - True random number generator
├neorv32_twi.vhd                 - Two wire serial interface controller
├neorv32_uart.vhd                - Universal async. receiver/transmitter
├neorv32_wdt.vhd                 - Watchdog timer
├neorv32_wishbone.vhd            - External (Wishbone) bus interface
└neorv32_xirq.vhd                - External interrupt controller
...................................
 
<<<
// ####################################################################################################################
:sectnums:
=== FPGA Implementation Results

This chapter shows exemplary implementation results of the NEORV32 CPU and Processor. Please note that
the provided results are just a relative measure as logic functions of different modules might be merged
across entity boundaries, so the actual utilization results might vary a bit.

:sectnums:
==== CPU

[cols="<2,<8"]
[grid="topbot"]
|=======================
| Hardware version: | `1.5.5.5`
| Top entity:       | `rtl/core/neorv32_cpu.vhd`
|=======================

[cols="<5,>1,>1,>1,>1,>1"]
[options="header",grid="rows"]
|=======================
| CPU                                   | LEs  | FFs  | MEM bits | DSPs | _f~max~_
| `rv32i`                               |  980 |  409 | 1024     | 0    | 125 MHz
| `rv32i_Zicsr`                         | 1835 |  856 | 1024     | 0    | 125 MHz
| `rv32im_Zicsr`                        | 2443 | 1134 | 1024     | 0    | 125 MHz
| `rv32imc_Zicsr`                       | 2669 | 1149 | 1024     | 0    | 125 MHz
| `rv32imac_Zicsr`                      | 2685 | 1156 | 1024     | 0    | 125 MHz
| `rv32imac_Zicsr` + `debug_mode`       | 3058 | 1225 | 1024     | 0    | 125 MHz
| `rv32imac_Zicsr` + `u`                | 2698 | 1162 | 1024     | 0    | 125 MHz
| `rv32imac_Zicsr_Zifencei` + `u`       | 2715 | 1162 | 1024     | 0    | 125 MHz
| `rv32imac_Zicsr_Zifencei_Zfinx` + `u` | 4004 | 1812 | 1024     | 7    | 118 MHz
|=======================
 
:sectnums:
==== Processor Modules

[cols="<2,<8"]
[grid="topbot"]
|=======================
| Hardware version: | `1.5.7.8`
| Top entity:       | `rtl/core/neorv32_top.vhd`
|=======================

.Hardware utilization by the processor modules (mandatory core modules in **bold**)
[cols="<2,<8,>1,>1,>2,>1"]
[options="header",grid="rows"]
|=======================
| Module        | Description                                         | LEs | FFs | MEM bits | DSPs
| Boot ROM      | Bootloader ROM (4kB)                                |   2 |   1 |    32768 |    0
| **BUSKEEPER** | Processor-internal bus monitor                      |   9 |   6 |        0 |    0
| **BUSSWITCH** | Bus mux for CPU instr. and data interface           |  63 |   8 |        0 |    0
| CFS           | Custom functions subsystemfootnote:[Resource utilization depends on actually implemented custom functionality.] | - | - | - | -
| DMEM          | Processor-internal data memory (8kB)                |  19 |   2 |    65536 |    0
| DM            | On-chip debugger - debug module                     | 493 | 240 |        0 |    0
| DTM           | On-chip debugger - debug transfer module (JTAG)     | 254 | 218 |        0 |    0
| GPIO          | General purpose input/output ports                  | 134 | 161 |        0 |    0
| iCACHE        | Instruction cache (1x4 blocks, 256 bytes per block) | 221 | 156 |     8192 |    0
| IMEM          | Processor-internal instruction memory (16kB)        |  13 |   2 |   131072 |    0
| MTIME         | Machine system timer                                | 319 | 167 |        0 |    0
| NEOLED        | Smart LED Interface (NeoPixel/WS2812) [4xFIFO]      | 342 | 307 |        0 |    0
| SLINK         | Stream link interface (4 links, FIFO_depth=1)       | 345 | 313 |        0 |    0
| PWM           | Pulse-width modulation controller (4 channels)      |  71 |  69 |        0 |    0
| SPI           | Serial peripheral interface                         | 148 | 127 |        0 |    0
| **SYSINFO**   | System configuration information memory             |  14 |  11 |        0 |    0
| TRNG          | True random number generator                        |  89 |  76 |        0 |    0
| TWI           | Two-wire interface                                  |  77 |  43 |        0 |    0
| UART0/1       | Universal asynchronous receiver/transmitter 0/1     | 183 | 132 |        0 |    0
| WDT           | Watchdog timer                                      |  53 |  43 |        0 |    0
| WISHBONE      | External memory interface                           | 114 | 110 |        0 |    0
| XIRQ          | External interrupt controller (32 channels)         | 241 | 201 |        0 |    0
|=======================
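The _MEM bits_ column can be cross-checked against the memory sizes quoted in the module descriptions. The following Python snippet is only a sketch of that arithmetic: it assumes that "kB" means 1024 bytes, and the helper name `kib_to_bits` is made up for this illustration.

```python
# Sketch: reproduce the "MEM bits" column from the sizes in the descriptions.
# Assumption: "kB" in the table means 1024 bytes.
BITS_PER_BYTE = 8


def kib_to_bits(size_kib):
    """Convert a memory size given in units of 1024 bytes to bits."""
    return size_kib * 1024 * BITS_PER_BYTE


# Memory sizes as quoted in the table descriptions (in kB)
memory_sizes_kib = {"Boot ROM": 4, "DMEM": 8, "IMEM": 16}
mem_bits = {name: kib_to_bits(size) for name, size in memory_sizes_kib.items()}

# iCACHE: 1 set x 4 blocks x 256 bytes per block (data storage only)
icache_bits = 1 * 4 * 256 * BITS_PER_BYTE

for name, bits in mem_bits.items():
    print(f"{name}: {bits} bits")
print(f"iCACHE: {icache_bits} bits")
```

Under that assumption, the computed values match the table entries for the Boot ROM, DMEM, IMEM and iCACHE rows.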
 
<<<
:sectnums:
==== Exemplary Setups

[TIP]
Check out the `setups` folder (on GitHub: https://github.com/stnolting/neorv32/tree/master/setups),
which provides several demo setups for various FPGA boards and toolchains.


<<<
// ####################################################################################################################
:sectnums:
=== CPU Performance

:sectnums:
==== CoreMark Benchmark

.Configuration
[cols="<2,<8"]
[grid="topbot"]
|=======================
| Hardware:       | 32kB IMEM, 16kB DMEM, no caches, 100 MHz clock
| CoreMark:       | 2000 iterations, MEM_METHOD is MEM_STACK
| Compiler:       | RISCV32-GCC 10.1.0
| Peripherals:    | UART for printing the results
| Compiler flags: | default, see makefile
|=======================

The performance of the NEORV32 was tested and evaluated using the https://www.eembc.org/coremark/[CoreMark CPU benchmark]. This
benchmark focuses on testing the capabilities of the CPU core itself rather than the performance of the whole
system. The corresponding source code and the SW project can be found in the `sw/example/coremark` folder.

The resulting CoreMark score is defined as CoreMark iterations per second.
The execution time is determined via the RISC-V `[m]cycle[h]` CSRs. The relative CoreMark score is
defined as CoreMark score divided by the CPU's clock frequency in MHz.

[cols="<2,<8"]
[grid="topbot"]
|=======================
| Hardware version: | `1.4.9.8`
|=======================

.CoreMark results
[cols="<4,>1,>1,>1"]
[options="header",grid="rows"]
|=======================
| CPU (incl. `Zicsr`)                         | Executable size | CoreMark Score | CoreMarks/MHz
| `rv32i`                                     |     28756 bytes |          36.36 | **0.3636**
| `rv32im`                                    |     27516 bytes |          68.97 | **0.6897**
| `rv32imc`                                   |     22008 bytes |          68.97 | **0.6897**
| `rv32imc` + _FAST_MUL_EN_                   |     22008 bytes |          86.96 | **0.8696**
| `rv32imc` + _FAST_MUL_EN_ + _FAST_SHIFT_EN_ |     22008 bytes |          90.91 | **0.9091**
|=======================
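The two score definitions used in this section can be written down in a few lines of Python. This is an illustrative sketch, not part of the NEORV32 software framework: the function names are made up, and the cycle count in the last step is a round, hypothetical number chosen only to show the calculation.

```python
# Sketch of the CoreMark score definitions used in this section.
# Function names are hypothetical and not part of the NEORV32 sources.

def coremark_score(iterations, cycles, f_clk_hz):
    """CoreMark score: iterations per second, with time derived from a cycle count."""
    exec_time_s = cycles / f_clk_hz
    return iterations / exec_time_s


def relative_score(score, f_clk_hz):
    """Relative CoreMark score: score divided by the clock frequency in MHz."""
    return score / (f_clk_hz / 1_000_000)


F_CLK_HZ = 100_000_000  # 100 MHz clock, as listed in the configuration table

# CoreMark scores copied from the results table
table_scores = {
    "rv32i": 36.36,
    "rv32im": 68.97,
    "rv32imc": 68.97,
    "rv32imc + FAST_MUL_EN": 86.96,
    "rv32imc + FAST_MUL_EN + FAST_SHIFT_EN": 90.91,
}
relative = {cpu: round(relative_score(s, F_CLK_HZ), 4) for cpu, s in table_scores.items()}

# Hypothetical measurement: 2000 iterations taking 2e9 clock cycles at 100 MHz
demo_score = coremark_score(2000, 2_000_000_000, F_CLK_HZ)

for cpu, value in relative.items():
    print(f"{cpu}: {value} CoreMarks/MHz")
print(f"demo score: {demo_score} iterations/s")
```

With the 100 MHz clock from the configuration table, dividing each listed score by 100 gives the values in the _CoreMarks/MHz_ column.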
 
[NOTE]
All executables were generated using maximum optimization `-O3`.
The _FAST_MUL_EN_ configuration uses DSPs for the multiplier of the _M_ extension (enabled via the
_FAST_MUL_EN_ generic). The _FAST_SHIFT_EN_ configuration uses a barrel shifter for CPU shift
operations (enabled via the _FAST_SHIFT_EN_ generic).


<<<
:sectnums:
==== Instruction Timing

The NEORV32 CPU is based on a multi-cycle architecture. Each instruction is executed in a sequence of
several consecutive micro operations. Hence, each instruction requires several clock cycles to execute.

The average CPI (cycles per instruction) depends on the instruction mix of a specific application and also on
the available CPU extensions. The following table shows the performance results for successfully (!) running
2000 CoreMark iterations.

The average CPI is computed by dividing the total number of required clock cycles (only the timed core to
avoid distortion due to IO wait cycles) by the number of executed instructions (`[m]instret[h]` CSRs). The
executables were generated using optimization `-O3`.

[cols="<2,<8"]
[grid="topbot"]
|=======================
| Hardware version: | `1.4.9.8`
|=======================

.CoreMark instruction timing
[cols="<4,>2,>2,>2"]
[options="header",grid="rows"]
|=======================
| CPU (incl. `Zicsr`)                         | Required clock cycles | Executed instructions | Average CPI
| `rv32i`                                     |            5595750503 | 1466028607            | **3.82**
| `rv32im`                                    |            2966086503 |  598651143            | **4.95**
| `rv32imc`                                   |            2981786734 |  611814918            | **4.87**
| `rv32imc` + _FAST_MUL_EN_                   |            2399234734 |  611814918            | **3.92**
| `rv32imc` + _FAST_MUL_EN_ + _FAST_SHIFT_EN_ |            2265135174 |  611814948            | **3.70**
|=======================
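The average CPI values can be reproduced from the two counter columns of the table. The Python sketch below uses a made-up helper name (`average_cpi`) and simply divides the cycle count by the instruction count, rounding to two decimals as in the table.

```python
# Sketch: recompute the "Average CPI" column from the cycle and instruction counts.
# The helper name is hypothetical; the numbers are copied from the table.

def average_cpi(cycles, instructions):
    """Average cycles per instruction."""
    return cycles / instructions


# (required clock cycles, executed instructions) per CPU configuration
measurements = {
    "rv32i": (5595750503, 1466028607),
    "rv32im": (2966086503, 598651143),
    "rv32imc": (2981786734, 611814918),
    "rv32imc + FAST_MUL_EN": (2399234734, 611814918),
    "rv32imc + FAST_MUL_EN + FAST_SHIFT_EN": (2265135174, 611814948),
}

cpi = {cpu: round(average_cpi(c, i), 2) for cpu, (c, i) in measurements.items()}

for cpu, value in cpi.items():
    print(f"{cpu}: {value:.2f}")
```

On real hardware the two inputs would be read from the `[m]cycle[h]` and `[m]instret[h]` CSRs before and after the timed section; here they are plain constants.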
 
[TIP]
The _FAST_MUL_EN_ configuration uses DSPs for the multiplier of the M extension (enabled via the
_FAST_MUL_EN_ generic). The _FAST_SHIFT_EN_ configuration uses a barrel shifter for CPU shift
operations (enabled via the _FAST_SHIFT_EN_ generic).

[TIP]
More information regarding the execution time of each implemented instruction can be found in
chapter <<_instruction_timing>>.
 
