# [The NEORV32 Processor](https://github.com/stnolting/neorv32)

[![Build Status](https://travis-ci.com/stnolting/neorv32.svg?branch=master)](https://travis-ci.com/stnolting/neorv32)
[![license](https://img.shields.io/github/license/stnolting/neorv32)](https://github.com/stnolting/neorv32/blob/master/LICENSE)
[![release](https://img.shields.io/github/v/release/stnolting/neorv32)](https://github.com/stnolting/neorv32/releases)

[![issues](https://img.shields.io/github/issues/stnolting/neorv32)](https://github.com/stnolting/neorv32/issues)
[![pull requests](https://img.shields.io/github/issues-pr/stnolting/neorv32)](https://github.com/stnolting/neorv32/pulls)
[![last commit](https://img.shields.io/github/last-commit/stnolting/neorv32)](https://github.com/stnolting/neorv32/commits/master)

## Table of Contents

* [Introduction](#Introduction)
* [Features](#Features)
* [FPGA Implementation Results](#FPGA-Implementation-Results)
* [Performance](#Performance)
* [Top Entity](#Top-Entity)
* [**Getting Started**](#Getting-Started)
* [Contact](#Contact)
* [Citation](#Citation)
* [Legal](#Legal)

## Introduction

The NEORV32 is a customizable microcontroller-like processor system based on a RISC-V `rv32i` or `rv32e` CPU with optional
`M`, `C` and `Zicsr` extensions. The CPU was built from scratch and is compliant with the **Unprivileged
ISA Specification Version 2.1** and a subset of the **Privileged Architecture Specification Version 1.12**. The NEORV32 is intended
as an auxiliary processor within larger SoC designs or as a stand-alone custom microcontroller.

The processor provides common peripherals and interfaces like input and output ports, serial interfaces for UART, I²C and SPI,
interrupt controller, timers and embedded memories. External memories, peripherals and custom IP can be attached via a
Wishbone-based external memory interface. All optional features beyond the base CPU can be enabled and configured via VHDL generics.

This project comes with a complete software ecosystem that features core libraries for high-level usage of the
provided functions and peripherals, application makefiles and example programs. All software source files
provide Doxygen-based documentation.

The project is intended to work "out of the box". Just synthesize the test setup from this project, upload
it to your FPGA board of choice and start playing with the NEORV32. If you do not want to [compile the GCC toolchain](https://github.com/riscv/riscv-gnu-toolchain)
yourself, you can also download a [pre-compiled toolchain](https://github.com/stnolting/riscv_gcc_prebuilt) for Linux.

For more information take a look at the [![NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/PDF_32.png) NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf).


### Design Principles

* From zero to main(): Completely open source and documented.
* Plain VHDL without technology-specific parts like attributes, macros or primitives.
* Easy to use: works out of the box.
* Clean synchronous design, no wacky combinatorial interfaces.
* The processor has to fit in a Lattice iCE40 UltraPlus 5k FPGA running at 20+ MHz.


### Status

![processor status](https://img.shields.io/badge/processor%20status-beta-orange)

The processor is synthesizable (tested with Intel Quartus Prime and Lattice Radiant/Synplify) and can successfully execute all the [provided example programs](https://github.com/stnolting/neorv32/tree/master/sw/example).


## Features

![neorv32 Overview](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/neorv32_overview.png)

### Processor Features

- RISC-V-compliant `rv32i` or `rv32e` CPU with optional `C`, `E`, `M` and `Zicsr` extensions
- GCC-based toolchain ([pre-compiled rv32i and rv32e toolchains available](https://github.com/stnolting/riscv_gcc_prebuilt))
- Application compilation based on [GNU makefiles](https://github.com/stnolting/neorv32/blob/master/sw/example/blink_led/makefile)
- [Doxygen-based](https://github.com/stnolting/neorv32/blob/master/docs/doxygen_makefile_sw) documentation of the software framework
- Completely described in behavioral, platform-independent VHDL (no primitives, macros, etc.)
- Fully synchronous design, no latches, no gated clocks
- Small hardware footprint and high operating frequency
- Highly customizable processor configuration
- Optional processor-internal data and instruction memories (DMEM/IMEM)
- Optional internal bootloader with UART console and automatic SPI flash boot option
- Optional machine system timer (MTIME), RISC-V-compliant
- Optional universal asynchronous receiver and transmitter (UART)
- Optional 8/16/24/32-bit serial peripheral interface master (SPI) with 8 dedicated chip select lines
- Optional two-wire serial interface master (TWI), compatible with the I²C standard
- Optional general purpose parallel IO port (GPIO), 16xOut & 16xIn, with pin-change interrupt
- Optional 32-bit external bus interface, Wishbone b4 compliant (WISHBONE)
- Optional watchdog timer (WDT)
- Optional PWM controller with 4 channels and 8-bit duty cycle resolution (PWM)
- Optional GARO-based true random number generator (TRNG)
- Optional core-local interrupt controller with 8 channels (CLIC)
- Optional dummy device (DEVNULL) (can be used for *fast* simulation console output)


### CPU Features

The CPU is compliant with the [official RISC-V specifications](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/riscv-spec.pdf) including a subset of the
[RISC-V privileged architecture specifications](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/riscv-spec.pdf).

**RV32I base instruction set** (`I` extension):
* ALU instructions: `LUI` `AUIPC` `ADDI` `SLTI` `SLTIU` `XORI` `ORI` `ANDI` `SLLI` `SRLI` `SRAI` `ADD` `SUB` `SLL` `SLT` `SLTU` `XOR` `SRL` `SRA` `OR` `AND`
* Branch instructions: `JAL` `JALR` `BEQ` `BNE` `BLT` `BGE` `BLTU` `BGEU`
* Memory instructions: `LB` `LH` `LW` `LBU` `LHU` `SB` `SH` `SW`

**Compressed instructions** (`C` extension):
* ALU instructions: `C.ADDI4SPN` `C.ADDI` `C.ADD` `C.ADDI16SP` `C.LI` `C.LUI` `C.SLLI` `C.SRLI` `C.SRAI` `C.ANDI` `C.SUB` `C.XOR` `C.OR` `C.AND` `C.MV` `C.NOP`
* Branch instructions: `C.J` `C.JAL` `C.JR` `C.JALR` `C.BEQZ` `C.BNEZ`
* Memory instructions: `C.LW` `C.SW` `C.LWSP` `C.SWSP`
* Misc instructions: `C.EBREAK` (only with `Zicsr` extension)

**Embedded CPU version** (`E` extension):
* Reduced register file (only the 16 lowest registers)
* No performance counter CSRs

**Integer multiplication and division hardware** (`M` extension):
* Multiplication instructions: `MUL` `MULH` `MULHSU` `MULHU`
* Division instructions: `DIV` `DIVU` `REM` `REMU`

**Privileged architecture** (`Zicsr` extension):
* Privilege levels: `M-mode` (Machine mode)
* CSR access instructions: `CSRRW` `CSRRS` `CSRRC` `CSRRWI` `CSRRSI` `CSRRCI`
* System instructions: `ECALL` `EBREAK` `MRET` `WFI`
* Counter CSRs: `cycle` `cycleh` `time` `timeh` `instret` `instreth` `mcycle` `mcycleh` `minstret` `minstreth`
* Machine CSRs: `mstatus` `misa` `mie` `mtvec` `mscratch` `mepc` `mcause` `mtval` `mip` `mtinst` `mimpid` `mhartid`
* Custom CSRs: `mfeatures` `mclock` `mispacebase` `mdspacebase` `mispacesize` `mdspacesize`
* Supported exceptions and interrupts:
  * Misaligned instruction address
  * Instruction access fault
  * Illegal instruction
  * Breakpoint
  * Load address misaligned
  * Load access fault
  * Store address misaligned
  * Store access fault
  * Environment call from M-mode
  * Machine software interrupt
  * Machine timer interrupt (from MTIME)
  * Machine external interrupt (via CLIC)
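
To relate the list above to what a trap handler would read from `mcause`, here is a minimal host-side sketch in Python. It uses the standard `mcause` encoding from the RISC-V privileged specification (interrupt flag in bit 31, cause code in the lower bits). That the NEORV32 reports exactly these codes is an assumption; the datasheet is the authoritative source. The function name `decode_mcause` is purely illustrative.

```python
# Standard RISC-V mcause codes, restricted to the causes listed above.
# Assumption: the NEORV32 follows this standard encoding unchanged.
EXCEPTIONS = {
    0: "Instruction address misaligned",
    1: "Instruction access fault",
    2: "Illegal instruction",
    3: "Breakpoint",
    4: "Load address misaligned",
    5: "Load access fault",
    6: "Store address misaligned",
    7: "Store access fault",
    11: "Environment call from M-mode",
}
INTERRUPTS = {
    3: "Machine software interrupt",
    7: "Machine timer interrupt",
    11: "Machine external interrupt",
}


def decode_mcause(mcause: int) -> str:
    """Split a 32-bit mcause value into interrupt flag and cause code."""
    is_interrupt = (mcause >> 31) & 1   # bit 31: 1 = interrupt, 0 = exception
    code = mcause & 0x7FFFFFFF          # bits 30:0: cause code
    table = INTERRUPTS if is_interrupt else EXCEPTIONS
    kind = "interrupt" if is_interrupt else "exception"
    return f"{kind}: {table.get(code, 'unknown / not supported')}"


print(decode_mcause(0x00000002))  # exception: Illegal instruction
print(decode_mcause(0x80000007))  # interrupt: Machine timer interrupt
```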

**General**:
* No hardware support for unaligned accesses (with the `C` extension, instructions still have to be aligned on 16-bit boundaries)
* Multi-cycle in-order instruction execution

More information, including a detailed list of the available CSRs, can be found in
the [![NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/PDF_32.png) NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf).


### To-Do / Wish List

- Testing, testing and even more testing
- Port official [RISC-V compliance test](https://github.com/riscv/riscv-compliance)
- Port Dhrystone benchmark
- Implement atomic extensions (`A` extension)
- Implement co-processor for single-precision floating-point (`F` extension)
- Implement user mode (`U` extension)
- Make a 64-bit branch
- Maybe port an RTOS (like [FreeRTOS](https://www.freertos.org/) or [RIOT](https://www.riot-os.org/))



## FPGA Implementation Results

This chapter shows exemplary implementation results of the NEORV32 processor for an **Intel Cyclone IV EP4CE22F17C6N FPGA** on
a DE0-nano board. The design was synthesized using **Intel Quartus Prime Lite 19.1** ("balanced implementation"). The timing
information is derived from the Timing Analyzer / Slow 1200mV 0C Model. Unless otherwise specified, the default configuration
of the processor's generics is assumed. No constraints were used.

Results generated for hardware version: `0.0.2.3`

### CPU

| CPU Configuration   | LEs        | FFs      | Memory bits | DSPs   | f_max   |
|:--------------------|:----------:|:--------:|:-----------:|:------:|:-------:|
| `rv32i`             |   852 (4%) | 326 (1%) |  2048 (<1%) | 0 (0%) | 111 MHz |
| `rv32i` + `Zicsr`   |  1488 (7%) | 694 (3%) |  2048 (<1%) | 0 (0%) | 107 MHz |
| `rv32im` + `Zicsr`  |  2057 (9%) | 941 (4%) |  2048 (<1%) | 0 (0%) | 102 MHz |
| `rv32imc` + `Zicsr` | 2209 (10%) | 958 (4%) |  2048 (<1%) | 0 (0%) | 102 MHz |
| `rv32e`             |   848 (4%) | 326 (1%) |  1024 (<1%) | 0 (0%) | 111 MHz |
| `rv32e` + `Zicsr`   |  1316 (6%) | 594 (3%) |  1024 (<1%) | 0 (0%) | 106 MHz |
| `rv32em` + `Zicsr`  |  1879 (8%) | 841 (4%) |  1024 (<1%) | 0 (0%) | 101 MHz |
| `rv32emc` + `Zicsr` |  2065 (9%) | 858 (4%) |  1024 (<1%) | 0 (0%) | 100 MHz |

### Peripherals / Others

| Module   | Description                                  | LEs | FFs | Memory bits | DSPs |
|:---------|:---------------------------------------------|:---:|:---:|:-----------:|:----:|
| Boot ROM | Bootloader ROM (4kB)                         |   3 |   1 |      32 768 |    0 |
| DEVNULL  | Dummy device                                 |   2 |   1 |           0 |    0 |
| DMEM     | Processor-internal data memory (8kB)         |  12 |   2 |      65 536 |    0 |
| GPIO     | General purpose input/output ports           |  37 |  33 |           0 |    0 |
| IMEM     | Processor-internal instruction memory (16kB) |   7 |   2 |     131 072 |    0 |
| MTIME    | Machine system timer                         | 369 | 168 |           0 |    0 |
| PWM      | Pulse-width modulation controller            |  77 |  69 |           0 |    0 |
| SPI      | Serial peripheral interface                  | 198 | 125 |           0 |    0 |
| TRNG     | True random number generator                 | 103 |  93 |           0 |    0 |
| TWI      | Two-wire interface                           |  76 |  44 |           0 |    0 |
| UART     | Universal asynchronous receiver/transmitter  | 154 | 108 |           0 |    0 |
| WDT      | Watchdog timer                               |  57 |  45 |           0 |    0 |
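
The "Memory bits" column for the three memory modules follows directly from their sizes. The short Python sketch below reproduces those figures, assuming 1 kB = 1024 bytes and that each byte occupies exactly 8 block RAM bits (no parity or padding); the helper name `memory_bits` is illustrative only.

```python
# Memory sizes as given in the table above (1 kB = 1024 bytes assumed).
MEMORY_SIZES_BYTES = {
    "Boot ROM": 4 * 1024,
    "DMEM": 8 * 1024,
    "IMEM": 16 * 1024,
}


def memory_bits(size_bytes: int) -> int:
    """Number of memory bits needed to hold size_bytes bytes."""
    return size_bytes * 8


for name, size in MEMORY_SIZES_BYTES.items():
    print(f"{name}: {memory_bits(size)} bits")
# Boot ROM: 32768 bits
# DMEM: 65536 bits
# IMEM: 131072 bits
```

The same helper can be used to estimate block RAM usage when changing `MEM_INT_IMEM_SIZE` or `MEM_INT_DMEM_SIZE`.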


### Lattice iCE40 UltraPlus 5k

The following table shows the hardware utilization for an [iCE40 UP5K](http://www.latticesemi.com/en/Products/FPGAandCPLD/iCE40UltraPlus) FPGA.
The setup uses all provided peripherals, all CPU extensions (except for the `E` extension), no external memory interface and internal
instruction and data memories (each 64kB) based on SPRAM primitives. The FPGA-specific memory components can be found in the
[`rtl/fpga_specific`](https://github.com/stnolting/neorv32/blob/master/rtl/fpga_specific/lattice_ice40up) folder.

Place & route reports were generated with **Lattice Radiant 1.1** (Synplify). The clock frequency is constrained and generated via the
PLL from the internal HF oscillator running at 12 MHz.

| CPU Configuration   | Slices     | LUT        | REG        | DSPs   | SRAM     | EBR      | f         |
|:--------------------|:----------:|:----------:|:----------:|:------:|:--------:|:--------:|:---------:|
| `rv32imc`           | 2593 (98%) | 5059 (95%) | 1776 (33%) | 0 (0%) | 4 (100%) | 12 (40%) | 20.25 MHz |


## Performance

### CoreMark Benchmark

The [CoreMark CPU benchmark](https://www.eembc.org/coremark) was executed on the NEORV32 and is available in the
[sw/example/coremark](https://github.com/stnolting/neorv32/blob/master/sw/example/coremark) project folder. This benchmark
tests the capabilities of a CPU itself rather than the functions provided by the whole system / SoC.

Results generated for hardware version: `0.0.2.3`

~~~
**Configuration**
Hardware: 32kB IMEM, 16kB DMEM, 100MHz clock
CoreMark: 2000 iterations, MEM_METHOD is MEM_STACK
CPU extensions: `rv32i` or `rv32im` or `rv32imc`
Used peripherals: MTIME for time measurement, UART for printing the results
~~~

| __Configuration__ | __Optimization__ | __Executable Size__ | __CoreMark Score__ | __CoreMarks/MHz__ |
|:------------------|:----------------:|:-------------------:|:------------------:|:-----------------:|
| `rv32i`           | `-Os`            | 17 944 bytes        | 23.26              | 0.232             |
| `rv32i`           | `-O2`            | 20 264 bytes        | 25.64              | 0.256             |
| `rv32im`          | `-Os`            | 16 880 bytes        | 40.81              | 0.408             |
| `rv32im`          | `-O2`            | 19 312 bytes        | 47.62              | 0.476             |
| `rv32imc`         | `-Os`            | 13 000 bytes        | 32.78              | 0.327             |
| `rv32imc`         | `-O2`            | 15 004 bytes        | 37.04              | 0.370             |
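
The CoreMarks/MHz column is the CoreMark score divided by the clock frequency in MHz (100 MHz in the configuration above). The following Python sketch reproduces the column. The tabulated values appear to be truncated to three decimals rather than rounded (32.78 / 100 is listed as 0.327); that is an inference from the numbers, so the truncation step here is an assumption.

```python
from decimal import Decimal, ROUND_DOWN

CLOCK_MHZ = Decimal(100)  # clock frequency from the configuration block

# (configuration, optimization) -> CoreMark score, copied from the table
SCORES = {
    ("rv32i", "-Os"): Decimal("23.26"),
    ("rv32i", "-O2"): Decimal("25.64"),
    ("rv32im", "-Os"): Decimal("40.81"),
    ("rv32im", "-O2"): Decimal("47.62"),
    ("rv32imc", "-Os"): Decimal("32.78"),
    ("rv32imc", "-O2"): Decimal("37.04"),
}


def coremarks_per_mhz(score: Decimal, clock_mhz: Decimal = CLOCK_MHZ) -> Decimal:
    """Score normalized to the clock, truncated (not rounded) to 3 decimals."""
    return (score / clock_mhz).quantize(Decimal("0.001"), rounding=ROUND_DOWN)


for (config, opt), score in SCORES.items():
    print(config, opt, coremarks_per_mhz(score))
```

Normalizing to the clock makes the results comparable with setups that run at a different frequency, such as the 20.25 MHz iCE40 implementation.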


### Instruction Cycles

The NEORV32 CPU is based on a multi-cycle architecture. Each instruction is executed in a sequence of several
consecutive micro operations. Hence, each instruction requires several clock cycles to execute. The average CPI
(cycles per instruction) depends on the instruction mix of a specific application and also on the available
CPU extensions.

Please note that the CPU-internal shifter (e.g. for the `SLL` instruction) as well as the multiplier and divider of the
`M` extension use a bit-serial approach and require several cycles for completion.

The following table shows the performance results for successfully (!) running 2000 CoreMark
iterations. The average CPI is computed by dividing the total number of required clock cycles (all of CoreMark,
not only the timed core) by the number of executed instructions (`instret[h]` CSRs). The executables
were generated using optimization `-O2`.

| CPU / Toolchain Config. | Required Clock Cycles | Executed Instructions | Average CPI |
|:------------------------|----------------------:|----------------------:|:-----------:|
| `rv32i`                 |        10 385 023 697 |         1 949 310 506 |     5.3     |
| `rv32im`                |         6 276 943 488 |           995 011 883 |     6.3     |
| `rv32imc`               |         7 340 734 652 |           934 952 588 |     7.6     |
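
The average CPI in the table is simply cycles divided by instructions. The Python sketch below recomputes it from the two count columns and also converts cycles into run time, assuming the 100 MHz clock of the CoreMark configuration applies to these runs as well. The names `average_cpi` and `runtime_seconds` are illustrative.

```python
# (required clock cycles, executed instructions), copied from the table
RESULTS = {
    "rv32i": (10_385_023_697, 1_949_310_506),
    "rv32im": (6_276_943_488, 995_011_883),
    "rv32imc": (7_340_734_652, 934_952_588),
}
CLOCK_HZ = 100_000_000  # assumption: same 100 MHz clock as the CoreMark setup


def average_cpi(cycles: int, instructions: int) -> float:
    """Average number of clock cycles per executed instruction."""
    return cycles / instructions


def runtime_seconds(cycles: int, clock_hz: int = CLOCK_HZ) -> float:
    """Wall-clock time needed for the given number of cycles."""
    return cycles / clock_hz


for config, (cycles, instructions) in RESULTS.items():
    cpi = average_cpi(cycles, instructions)
    print(f"{config}: CPI = {cpi:.2f}, run time = {runtime_seconds(cycles):.1f} s")
```

For `rv32i` and `rv32im` this gives about 5.33 and 6.31, matching the tabulated 5.3 and 6.3. For the `rv32imc` row the same division gives about 7.85 rather than 7.6, so one of the three figures in that row may have been transcribed incorrectly.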


### Evaluation

Based on the provided performance measurement and the hardware utilization for the
different CPU configurations, the following configurations are suggested:

| Design Goal                    | NEORV32 CPU Config. |
|:-------------------------------|:--------------------|
| Highest performance:           | `rv32im`            |
| Lowest memory requirements:    | `rv32imc`           |
| Lowest hardware requirements*: | `rv32ec`            |

*) Including on-chip memory hardware requirements.



## Top Entity

The top entity of the processor is [**neorv32_top.vhd**](https://github.com/stnolting/neorv32/blob/master/rtl/core/neorv32_top.vhd) (from the `rtl/core` folder).
Just instantiate this file in your project and you are ready to go! All signals of this top entity are of type *std_ulogic* or *std_ulogic_vector*, respectively
(except for the TWI signals, which are of type *std_logic*).

Use the generics to configure the processor according to your needs. Each generic is initialized with its default value.
Detailed information regarding the signals and configuration generics can be found in the [NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf).

```vhdl
entity neorv32_top is
  generic (
    -- General --
    CLOCK_FREQUENCY           : natural := 0; -- clock frequency of clk_i in Hz
    HART_ID                   : std_ulogic_vector(31 downto 0) := x"00000000"; -- custom hardware thread ID
    BOOTLOADER_USE            : boolean := true; -- implement processor-internal bootloader?
    -- RISC-V CPU Extensions --
    CPU_EXTENSION_RISCV_C     : boolean := false; -- implement compressed extension?
    CPU_EXTENSION_RISCV_E     : boolean := false; -- implement embedded RF extension?
    CPU_EXTENSION_RISCV_M     : boolean := false; -- implement mul/div extension?
    CPU_EXTENSION_RISCV_Zicsr : boolean := true; -- implement CSR system?
    -- Memory configuration: Instruction memory --
    MEM_ISPACE_BASE           : std_ulogic_vector(31 downto 0) := x"00000000"; -- base address of instruction memory space
    MEM_ISPACE_SIZE           : natural := 16*1024; -- total size of instruction memory space in bytes
    MEM_INT_IMEM_USE          : boolean := true; -- implement processor-internal instruction memory
    MEM_INT_IMEM_SIZE         : natural := 16*1024; -- size of processor-internal instruction memory in bytes
    MEM_INT_IMEM_ROM          : boolean := false; -- implement processor-internal instruction memory as ROM
    -- Memory configuration: Data memory --
    MEM_DSPACE_BASE           : std_ulogic_vector(31 downto 0) := x"80000000"; -- base address of data memory space
    MEM_DSPACE_SIZE           : natural := 8*1024; -- total size of data memory space in bytes
    MEM_INT_DMEM_USE          : boolean := true; -- implement processor-internal data memory
    MEM_INT_DMEM_SIZE         : natural := 8*1024; -- size of processor-internal data memory in bytes
    -- Memory configuration: External memory interface --
    MEM_EXT_USE               : boolean := false; -- implement external memory bus interface?
    MEM_EXT_REG_STAGES        : natural := 2; -- number of interface register stages (0,1,2)
    MEM_EXT_TIMEOUT           : natural := 15; -- cycles after which a valid bus access will timeout (>=1)
    -- Processor peripherals --
    IO_GPIO_USE               : boolean := true; -- implement general purpose input/output port unit (GPIO)?
    IO_MTIME_USE              : boolean := true; -- implement machine system timer (MTIME)?
    IO_UART_USE               : boolean := true; -- implement universal asynchronous receiver/transmitter (UART)?
    IO_SPI_USE                : boolean := true; -- implement serial peripheral interface (SPI)?
    IO_TWI_USE                : boolean := true; -- implement two-wire interface (TWI)?
    IO_PWM_USE                : boolean := true; -- implement pulse-width modulation unit (PWM)?
    IO_WDT_USE                : boolean := true; -- implement watch dog timer (WDT)?
    IO_CLIC_USE               : boolean := true; -- implement core local interrupt controller (CLIC)?
    IO_TRNG_USE               : boolean := false; -- implement true random number generator (TRNG)?
    IO_DEVNULL_USE            : boolean := true -- implement dummy device (DEVNULL)?
  );
  port (
    -- Global control --
    clk_i      : in  std_ulogic := '0'; -- global clock, rising edge
    rstn_i     : in  std_ulogic := '0'; -- global reset, low-active, async
    -- Wishbone bus interface (available if MEM_EXT_USE = true) --
    wb_adr_o   : out std_ulogic_vector(31 downto 0); -- address
    wb_dat_i   : in  std_ulogic_vector(31 downto 0) := (others => '0'); -- read data
    wb_dat_o   : out std_ulogic_vector(31 downto 0); -- write data
    wb_we_o    : out std_ulogic; -- read/write
    wb_sel_o   : out std_ulogic_vector(03 downto 0); -- byte enable
    wb_stb_o   : out std_ulogic; -- strobe
    wb_cyc_o   : out std_ulogic; -- valid cycle
    wb_ack_i   : in  std_ulogic := '0'; -- transfer acknowledge
    wb_err_i   : in  std_ulogic := '0'; -- transfer error
    -- GPIO (available if IO_GPIO_USE = true) --
    gpio_o     : out std_ulogic_vector(15 downto 0); -- parallel output
    gpio_i     : in  std_ulogic_vector(15 downto 0) := (others => '0'); -- parallel input
    -- UART (available if IO_UART_USE = true) --
    uart_txd_o : out std_ulogic; -- UART send data
    uart_rxd_i : in  std_ulogic := '0'; -- UART receive data
    -- SPI (available if IO_SPI_USE = true) --
    spi_sclk_o : out std_ulogic; -- serial clock line
    spi_mosi_o : out std_ulogic; -- serial data line out
    spi_miso_i : in  std_ulogic := '0'; -- serial data line in
    spi_csn_o  : out std_ulogic_vector(07 downto 0); -- SPI CS
    -- TWI (available if IO_TWI_USE = true) --
    twi_sda_io : inout std_logic := 'H'; -- twi serial data line
    twi_scl_io : inout std_logic := 'H'; -- twi serial clock line
    -- PWM (available if IO_PWM_USE = true) --
    pwm_o      : out std_ulogic_vector(03 downto 0); -- pwm channels
    -- Interrupts (available if IO_CLIC_USE = true) --
    ext_irq_i  : in  std_ulogic_vector(01 downto 0) := (others => '0'); -- external interrupt request
    ext_ack_o  : out std_ulogic_vector(01 downto 0) -- external interrupt request acknowledge
  );
end neorv32_top;
```
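
The default memory generics above define two address spaces. As a rough illustration of the resulting address map, the Python sketch below classifies an address using only the default base and size values. It is a simplified model, not the processor's actual decode logic: everything outside the two spaces (peripherals, bootloader ROM, external bus) is lumped together as "other", and the function name `address_space` is hypothetical.

```python
# Default address spaces, copied from the generics above.
MEM_ISPACE_BASE = 0x00000000
MEM_ISPACE_SIZE = 16 * 1024
MEM_DSPACE_BASE = 0x80000000
MEM_DSPACE_SIZE = 8 * 1024


def address_space(addr: int) -> str:
    """Classify a byte address against the default instruction/data spaces."""
    if MEM_ISPACE_BASE <= addr < MEM_ISPACE_BASE + MEM_ISPACE_SIZE:
        return "instruction"
    if MEM_DSPACE_BASE <= addr < MEM_DSPACE_BASE + MEM_DSPACE_SIZE:
        return "data"
    return "other"  # not modeled here: peripherals, boot ROM, external bus


print(address_space(0x00000000))  # instruction
print(address_space(0x00004000))  # other (first byte past the 16 kB space)
print(address_space(0x80001FFF))  # data (last byte of the 8 kB space)
```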



## Getting Started

This overview is just a short excerpt from the *Let's Get It Started* section of the NEORV32 datasheet:

[![NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/PDF_32.png) NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf)


### Building the Toolchain

First you need the **RISC-V GCC toolchain**. You can either [download the sources](https://github.com/riscv/riscv-gnu-toolchain)
and build the toolchain yourself, or you can download a prebuilt one and install it.

To build the toolchain yourself, get the sources from the official [RISCV-GNU-TOOLCHAIN](https://github.com/riscv/riscv-gnu-toolchain) GitHub page:

    $ git clone --recursive https://github.com/riscv/riscv-gnu-toolchain

Download and install the prerequisite standard packages:

    $ sudo apt-get install autoconf automake autotools-dev curl python3 libmpc-dev libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf libtool patchutils bc zlib1g-dev libexpat-dev

To build the Newlib cross-compiler, pick an install path. If you choose, say, `/opt/riscv`, then add `/opt/riscv/bin` to your `PATH` environment variable:

    $ export PATH=$PATH:/opt/riscv/bin

Then, simply run the following commands in the RISC-V GNU toolchain source folder (for the `rv32i` toolchain):

    riscv-gnu-toolchain$ ./configure --prefix=/opt/riscv --with-arch=rv32i --with-abi=ilp32
    riscv-gnu-toolchain$ make

After a while (hours!) you will get `riscv32-unknown-elf-gcc` and all of its friends in your `/opt/riscv/bin` folder.


### Using a Prebuilt Toolchain

Alternatively, you can download a prebuilt toolchain. I have uploaded the toolchain I am using to GitHub. This toolchain
has been compiled on a 64-bit x86 Ubuntu (Ubuntu on Windows). Download the toolchain of choice:

[https://github.com/stnolting/riscv_gcc_prebuilt](https://github.com/stnolting/riscv_gcc_prebuilt)


### Download the Project and Create a Hardware Project

Now it's time to get the most recent version of the NEORV32 Processor project from GitHub. Clone the NEORV32 repository using
`git` from the command line (suggested for easy project updates via `git pull`):

    $ git clone https://github.com/stnolting/neorv32.git

Create a new HW project with your FPGA synthesis tool of choice. Add all files from the [`rtl/core`](https://github.com/stnolting/neorv32/blob/master/rtl)
folder to this project and add them to a **new library** called `neorv32`.

You can either instantiate the [processor's top entity](https://github.com/stnolting/neorv32#top-entity) in your own project, or you
can use a simple [test setup](https://github.com/stnolting/neorv32/blob/master/rtl/top_templates/neorv32_test_setup.vhd) as top entity. This test
setup instantiates the processor and implements most of the peripherals and the basic ISA. Only the UART, clock, reset and some GPIO output signals are
propagated:

```vhdl
entity neorv32_test_setup is
  port (
    -- Global control --
    clk_i      : in  std_ulogic := '0'; -- global clock, rising edge
    rstn_i     : in  std_ulogic := '0'; -- global reset, low-active, async
    -- GPIO --
    gpio_o     : out std_ulogic_vector(7 downto 0); -- parallel output
    -- UART --
    uart_txd_o : out std_ulogic; -- UART send data
    uart_rxd_i : in  std_ulogic := '0' -- UART receive data
  );
end neorv32_test_setup;
```

This test setup is intended as a quick and easy "hello world" setup to get started with the NEORV32.


### Compiling and Uploading One of the Example Projects

Make sure `GNU Make` and a native `GCC` compiler are installed. To test the installation of the RISC-V toolchain, navigate to an example project like
`sw/example/blink_led` and run:

    neorv32/sw/example/blink_led$ make check

The NEORV32 project includes some example programs from which you can start your own application:
[SW example projects](https://github.com/stnolting/neorv32/tree/master/sw/example)

Simply compile one of these projects. This will create a NEORV32 executable `neorv32_exe.bin` in the same folder.

    neorv32/sw/example/blink_led$ make clean_all compile

Connect your FPGA board via UART to your computer and open the corresponding serial port to interface with the NEORV32 bootloader. The bootloader
uses the following default UART configuration:

- 19200 Baud
- 8 data bits
- 1 stop bit
- No parity bits
- No transmission / flow control protocol (raw bytes only)
- Newline on `\r\n` (carriage return & newline)
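
With this configuration each byte takes 10 bit times on the wire (1 start bit, 8 data bits, 1 stop bit). The Python sketch below estimates the resulting raw throughput and a lower bound for the upload time of an image. It ignores any bootloader protocol overhead and terminal latency, and the 15 004 byte image size is simply the `rv32imc` `-O2` CoreMark executable size from the Performance section, used here as an example.

```python
BAUD = 19200
BITS_PER_FRAME = 1 + 8 + 1  # start bit + 8 data bits + 1 stop bit, no parity


def bytes_per_second(baud: int = BAUD, bits_per_frame: int = BITS_PER_FRAME) -> float:
    """Raw UART payload throughput in bytes per second."""
    return baud / bits_per_frame


def upload_seconds(image_bytes: int, baud: int = BAUD) -> float:
    """Lower bound for the transfer time of an image of the given size."""
    return image_bytes / bytes_per_second(baud)


print(bytes_per_second())                # 1920.0
print(round(upload_seconds(15_004), 1))  # 7.8
```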

Use the bootloader console to upload and execute your application image.

```
<< NEORV32 Bootloader >>

BLDV: Jun 22 2020
HWV:  0.0.2.3
CLK:  0x0134FD90 Hz
MISA: 0x42801104
CONF: 0x01FF0015
IMEM: 0x00010000 bytes @ 0x00000000
DMEM: 0x00010000 bytes @ 0x80000000

Autoboot in 8s. Press key to abort.
Aborted.

Available commands:
 h: Help
 r: Restart
 u: Upload
 s: Store to flash
 l: Load from flash
 e: Execute
CMD:>
```
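
The hexadecimal fields in the bootloader banner can be decoded on the host. The Python sketch below converts the `CLK` and `IMEM` values to decimal and interprets `MISA` using the standard RISC-V `misa` layout (MXL in bits 31:30, one extension letter per bit 0 to 25). Note that bit 25 ("Z") is reserved in the standard layout, so its meaning in this banner is implementation-specific and not interpreted here. The `CONF` field is NEORV32-specific and left undecoded.

```python
def decode_misa(misa: int):
    """Return (xlen, extension letters) for a 32-bit misa value."""
    mxl = (misa >> 30) & 0x3                       # MXL field, bits 31:30
    xlen = {1: 32, 2: 64, 3: 128}.get(mxl)         # None if MXL is 0
    extensions = [chr(ord("A") + bit) for bit in range(26) if (misa >> bit) & 1]
    return xlen, extensions


clk_hz = int("0x0134FD90", 16)      # CLK field from the banner
imem_bytes = int("0x00010000", 16)  # IMEM size field from the banner
xlen, extensions = decode_misa(0x42801104)

print(clk_hz)            # 20250000
print(imem_bytes)        # 65536
print(xlen, extensions)  # 32 ['C', 'I', 'M', 'X', 'Z']
```

The decoded values (20.25 MHz clock, 64 kB memories, `I`, `M` and `C` extensions) are consistent with the iCE40 UltraPlus setup described in the implementation results.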

Going further: Take a look at the [![NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/PDF_32.png) NEORV32 datasheet](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/NEORV32.pdf).



## Contact

If you have any questions, bug reports or ideas, if you are facing problems with the NEORV32, or if you want to give some kind of feedback, open a
[new issue](https://github.com/stnolting/neorv32/issues) or drop me a line directly:

  stnolting@gmail.com



## Citation

If you are using the NEORV32 Processor in some kind of publication, please cite it as follows:

> S. Nolting, "The NEORV32 Processor", github.com/stnolting/neorv32



## Legal

This is a hobby project released under the BSD 3-Clause license. No copyright infringement intended.

**BSD 3-Clause License**

Copyright (c) 2020, Stephan Nolting. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are
permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of
   conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of
   conditions and the following disclaimer in the documentation and/or other materials
   provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors may be used to
   endorse or promote products derived from this software without specific prior written
   permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
OF THE POSSIBILITY OF SUCH DAMAGE.


"Windows" is a trademark of Microsoft Corporation.

"Artix" and "Vivado" are trademarks of Xilinx Inc.

"Cyclone", "Quartus Prime" and "Avalon Bus" are trademarks of Intel Corporation.

"iCE40", "UltraPlus" and "Lattice Radiant" are trademarks of Lattice Semiconductor Corporation.

"AXI4" and "AXI4-Lite" are trademarks of Arm Holdings plc.


[![Continuous Integration provided by Travis CI](https://travis-ci.com/images/logos/TravisCI-Full-Color.png)](https://travis-ci.com/stnolting/neorv32)

Continuous integration provided by [Travis CI](https://travis-ci.com/stnolting/neorv32) and powered by [GHDL](https://github.com/ghdl/ghdl).


![Open Source Hardware Logo https://www.oshwa.org](https://raw.githubusercontent.com/stnolting/neorv32/master/docs/figures/oshw_logo.png)

This project is not affiliated with or endorsed by the Open Source Initiative (https://www.oshwa.org / https://opensource.org).


Made with :heart: in Hannover, Germany.