# Forth computing system

| Project   | Forth SoC written in VHDL |
| --------- | ------------------------- |
| Author    | Richard James Howe        |
| Copyright | 2013-2017 Richard Howe    |
| License   | MIT/LGPL                  |
| Email     | howe.r.j.89@gmail.com     |

# Introduction

This project implements a small stack computer tailored to executing Forth,
based on the [J1][] CPU. The processor has been rewritten in [VHDL][] from
[Verilog][], and extended slightly.

The goals of the project are as follows:

* Create a working version of the [J1][] processor (called the H2).
* Make a working toolchain for the processor.
* Create a [FORTH][] for the processor which can take its input from either a
  [UART][] or a USB keyboard paired with a [VGA][] adapter.

The H2 processor, like the [J1][], is a stack-based processor that executes an
instruction set especially suited to [FORTH][].

The current target is the [Nexys3][] board, with a [Xilinx][] Spartan-6
XC6SLX16-CSG324 [FPGA][]. New boards will be targeted in the future, as this
board is reaching its end of life. The [VHDL][] is written in a generic way,
with hardware components being inferred instead of explicitly instantiated.
This should make the code fairly portable, although the interfaces to the
[Nexys3][] board components are specific to the peripherals on that board.

A video of the project in action on the hardware is available, along with a
lower quality version of the same video that should play automatically.

The SoC can also be simulated with a simulator written in C.

(Figure: the system architecture.)

# License

The licenses used by the project are mixed and are on a per-file basis. For my
code I use the [MIT][] license, so feel free to use it as you wish. The only
other license used is the [LGPL][]; it is confined to single modules, which
could be removed if you have some aversion to [LGPL][] code.

# Target Board

The only target board available at the moment is the [Nexys3][]. This should
change in the future, as the board is currently at its end of life. The next
boards I am looking to support are its successor, the Nexys 4, and the myStorm
BlackIce. The myStorm board uses a completely open source toolchain for
synthesis, place and route, and bit file generation.

# Build and Running requirements

The build has been tested under [Debian][] [Linux][], version 8.

You will require:

* [GCC][], or a suitable [C][] compiler capable of compiling [C99][]
* [Make][]
* [Xilinx ISE][] version 14.7
* [GHDL][]
* [GTKWave][]
* [tcl][] version 8.6
* Digilent Adept2 runtime and Digilent Adept2 utilities
* [freeglut][] (for the GUI simulator only)
* [pandoc][] for building the documentation
* [picocom][] (or an alternative terminal client)

You will also need the following hardware:

* VGA Monitor
* USB Keyboard (plugs into the Nexys3 USB to PS/2 bridge)
* [Nexys3][] development board

[Xilinx ISE][] can (or could) be downloaded for free, but requires
registration. ISE needs to be on your path.

# Building and Running

To make the [C][] based toolchain:

    make h2

To make a bit file that can be flashed to the target board:

    make simulation synthesis implementation bitfile

To upload the bitfile to the target board:

    make upload

To view the waveform generated by "make simulation":

    make viewer

The [C][] based CLI simulator can be invoked with:

    make run

This will assemble the H2 Forth source file [h2.fth][], and run the assembled
object file under the H2 simulator with the debugger activated. A graphical
simulator can be run with:

    make gui-run

This requires [freeglut][] as well as a [C][] compiler.

# Related Projects

The original [J1][] project is the basis for this work.

Another project targets the original [J1][] core and provides an eForth
implementation (written using [Gforth][] for meta-compilation/cross
compilation to the [J1][] core). It also provides a simulator for the system
written in [C][].

# Manual

The H2 processor and associated peripherals are subject to change, so the code
is the definitive source for which instructions are available, the register
map, and how the peripherals behave.

There are a few modifications to the [J1][] CPU, which include:

* New instructions
* A CPU hold line which keeps the processor in the same state so long as it is
  asserted
* Interrupt Service Routines
* Larger return and data stacks

The Interrupt Service Routines (ISR) have not been thoroughly tested and will
be subject to the most change.

### H2 CPU

The H2 CPU behaves very similarly to the [J1][] CPU, and the [J1 PDF][] can be
read in order to better understand this processor. The processor is 16-bit,
with instructions taking a single clock cycle. Most of the primitive Forth
words can also be executed in a single cycle; one notable exception is store
("!"), which is split into two instructions.

The CPU has the following state within it:

* A 64 deep return stack (up from 32 in the original [J1][])
* A 65 deep variable stack (up from 33 in the original [J1][])
* A program counter
* An interrupt enable and interrupt request bit
* An interrupt address register

Loads and stores into the block RAM that holds the H2 program discard the
lowest bit; every other memory operation uses the lowest bit (such as jumps,
and loads and stores to Input/Output peripherals). This is so applications can
use the lowest bit for character operations when accessing the program RAM.

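The instruction set is decoded in the following manner:

    +---------------------------------------------------------------+
    | F | E | D | C | B | A | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
    +---------------------------------------------------------------+
    | 1 |                    LITERAL VALUE                          |
    +---------------------------------------------------------------+
    | 0 | 0 | 0 |            BRANCH TARGET ADDRESS                  |
    +---------------------------------------------------------------+
    | 0 | 0 | 1 |            CONDITIONAL BRANCH TARGET ADDRESS      |
    +---------------------------------------------------------------+
    | 0 | 1 | 0 |            CALL TARGET ADDRESS                    |
    +---------------------------------------------------------------+
    | 0 | 1 | 1 |   ALU OPERATION   |T2N|T2R|N2A|R2P| RSTACK| DSTACK|
    +---------------------------------------------------------------+
    | F | E | D | C | B | A | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
    +---------------------------------------------------------------+

    T   : Top of data stack
    N   : Next on data stack
    PC  : Program Counter

    LITERAL VALUES       : push a value onto the data stack
    CONDITIONAL BRANCHES : pop and test T
    CALLS                : push PC+1 onto the return stack

    T2N : Move T to N
    T2R : Move T to top of return stack
    N2A : STORE T to memory location addressed by N
    R2P : Move top of return stack to PC

    RSTACK and DSTACK are signed values (two's complement) that are
    the stack delta (the amount to increment or decrement the stack
    by for their respective stacks: return and data)
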
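The decoding can be sketched in [C][] as below. This is a minimal
illustration and not the code used in [h2.c][]: the names `h2_decode`,
`h2_decoded_t` and `h2_delta` are hypothetical, the literal and branch
encodings are assumed to follow the original [J1][] layout, and the ALU field
positions are read off the column positions in the encoding table.

```c
#include <assert.h>
#include <stdint.h>

/* Instruction classes, selected by the top bits of the 16-bit word. */
typedef enum { H2_LITERAL, H2_BRANCH, H2_0BRANCH, H2_CALL, H2_ALU } h2_class_t;

typedef struct {
	h2_class_t cls;
	uint16_t value;              /* literal value, or branch/call target */
	unsigned alu;                /* ALU operation, bits C..8 */
	unsigned t2n, t2r, n2a, r2p; /* single bit flags, bits 7..4 */
	int rstack, dstack;          /* signed stack deltas */
} h2_decoded_t;

/* Sign extend a two bit two's complement field: 0, 1, -2, -1. */
static int h2_delta(unsigned field)
{
	assert(field < 4);
	return (field & 2u) ? (int)field - 4 : (int)field;
}

/* Hypothetical decoder: splits an instruction into its fields. */
h2_decoded_t h2_decode(uint16_t inst)
{
	h2_decoded_t d = { H2_ALU, 0, 0, 0, 0, 0, 0, 0, 0 };
	if (inst & 0x8000u) {            /* bit F set: literal (assumed) */
		d.cls = H2_LITERAL;
		d.value = inst & 0x7FFFu;
		return d;
	}
	switch (inst >> 13) {            /* bits E..D select the class */
	case 0: d.cls = H2_BRANCH;  d.value = inst & 0x1FFFu; break;
	case 1: d.cls = H2_0BRANCH; d.value = inst & 0x1FFFu; break;
	case 2: d.cls = H2_CALL;    d.value = inst & 0x1FFFu; break;
	default:
		d.cls    = H2_ALU;
		d.alu    = (inst >> 8) & 0x1Fu;
		d.t2n    = (inst >> 7) & 1u;
		d.t2r    = (inst >> 6) & 1u;
		d.n2a    = (inst >> 5) & 1u;
		d.r2p    = (inst >> 4) & 1u;
		d.rstack = h2_delta((inst >> 2) & 3u);
		d.dstack = h2_delta(inst & 3u);
		break;
	}
	return d;
}
```

With this sketch, `h2_decode(0x4889)` yields a call to address `0x889`, which
agrees with the `call 889` line in the debugger trace shown later in this
document.
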
#### ALU operations

All ALU operations replace T:

| Value | Operation      | Description           |
|-------|----------------|-----------------------|
| 0     | T              | Top of Stack          |
| 1     | N              | Copy N to T           |
| 2     | T + N          | Addition              |
| 3     | T & N          | Bitwise AND           |
| 4     | T or N         | Bitwise OR            |
| 5     | T ^ N          | Bitwise XOR           |
| 6     | ~T             | Bitwise Inversion     |
| 7     | T = N          | Equality test         |
| 8     | N < T          | Signed comparison     |
| 9     | N >> T         | Logical Right Shift   |
| 10    | T - 1          | Decrement             |
| 11    | R              | Top of return stack   |
| 12    | [T]            | Load from address     |
| 13    | N << T         | Logical Left Shift    |
| 14    | depth          | Depth of stack        |
| 15    | N u< T         | Unsigned comparison   |
| 16    | set interrupts | Enable interrupts     |
| 17    | interrupts on? | Are interrupts on?    |
| 18    | rdepth         | Depth of return stack |
| 19    | 0=             | T == 0?               |
| 20    | CPU ID         | CPU Identifier        |

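The purely combinational ALU operations can be modelled in a few lines of
[C][]. The sketch below is illustrative only: `h2_alu` and `h2_flag` are
hypothetical names, true is assumed to be all bits set (the usual Forth
convention), shifts of 16 or more are assumed to give zero, and operations
that need CPU state (R, loads, stack depths, interrupts, CPU ID) simply return
T unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Forth style flag (assumed): all bits set for true, zero for false. */
static uint16_t h2_flag(int condition)
{
	return condition ? 0xFFFFu : 0u;
}

/* Hypothetical model of the stateless ALU operations; returns the new T. */
uint16_t h2_alu(unsigned op, uint16_t t, uint16_t n)
{
	assert(op <= 20);
	switch (op) {
	case 0:  return t;
	case 1:  return n;
	case 2:  return (uint16_t)(t + n);
	case 3:  return t & n;
	case 4:  return t | n;
	case 5:  return t ^ n;
	case 6:  return (uint16_t)~t;
	case 7:  return h2_flag(t == n);
	/* Signed compare: flipping the sign bit maps signed order to unsigned. */
	case 8:  return h2_flag((n ^ 0x8000u) < (t ^ 0x8000u));
	case 9:  return t < 16 ? (uint16_t)(n >> t) : 0;
	case 10: return (uint16_t)(t - 1);
	case 13: return t < 16 ? (uint16_t)((uint32_t)n << t) : 0;
	case 15: return h2_flag(n < t);
	case 19: return h2_flag(t == 0);
	default: return t; /* needs CPU state that is not modelled here */
	}
}
```

For example, with N holding `0xFFFF` and T holding `1`, the signed comparison
(8) is true because N is read as -1, while the unsigned comparison (15) is
false.
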
### Peripherals and registers

Registers prefixed with an 'o' are output registers, those with an 'i' prefix
are input registers. Registers are divided into an input and an output
section, and the addresses of the input and output registers do not
correspond to each other in all cases. Unlike for RAM reads, the I/O registers
are indexed by word aligned addresses, without the lowest bit being discarded
(this should be fixed at a later date).

The following peripherals have been implemented in the [VHDL][] SoC to
interface with devices on the [Nexys3][] board:

* [VGA][] output device, text mode only, 80 by 40 characters
* Timer
* [UART][] (Rx/Tx) with a [FIFO][]
* [PS/2][] Keyboard
* [LED][] next to a bank of switches
* An [8 Segment LED Display][] driver (a 7 segment display with a decimal point)

The SoC also features a limited set of interrupts that can be enabled or
disabled.

The output register map:

| Register    | Address | Description                 |
|-------------|---------|-----------------------------|
| oUart       | 0x4000  | UART register               |
| oVT100      | 0x4002  | VT100 Terminal Write        |
| oLeds       | 0x4004  | LED outputs                 |
| oTimerCtrl  | 0x4006  | Timer control               |
| oMemDout    | 0x4008  | Memory Data Output          |
| oMemControl | 0x400A  | Memory Control / Hi Address |
| oMemAddrLow | 0x400C  | Memory Lo Address           |
| o7SegLED    | 0x400E  | 4 x LED 8 Segment display   |
| oIrcMask    | 0x4010  | CPU Interrupt Mask          |

The input registers:

| Register  | Address | Description                     |
|-----------|---------|---------------------------------|
| iUart     | 0x4000  | UART register                   |
| iVT100    | 0x4002  | Terminal status & PS/2 Keyboard |
| iSwitches | 0x4004  | Buttons and switches            |
| iTimerDin | 0x4006  | Current Timer Value             |
| iMemDin   | 0x4008  | Memory Data Input               |

The following descriptions of the registers should be read in order; they also
describe how the peripherals work.

#### oUart

A UART with a fixed baud rate and format (115200, 8 bits, 1 stop bit) is
present on the SoC. The UART has a FIFO of depth 8 on both the RX and TX
channels. The control of the UART is split across oUart and iUart.

To write a value to the UART, assert TXWE along with putting the data in TXDO.
The FIFO state can be determined by looking at the iUart register.

To read a value from the UART, check iUart to see if data is present in the
FIFO. If it is, assert RXRE in the oUart register; on the next clock cycle
the data will be present in the iUart register.

The baud rate of the UART can be changed by rebuilding the VHDL project. Bit
length, parity bits and stop bits can only be changed with modifications to
the [VHDL][] source.

    +-------------------------------------------------------------------------------+
    | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
    +-------------------------------------------------------------------------------+
    |  X |  X |TXWE|  X |  X |RXRE|  X |  X |                 TXDO                  |
    +-------------------------------------------------------------------------------+

    TXWE: UART TX Write Enable
    RXRE: UART RX Read Enable
    TXDO: UART TX Data Output

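As a rough illustration of the write and read sequences described above, the
following [C][] sketch builds the 16-bit words involved. The macro and
function names are hypothetical, the bit positions are read off the oUart
layout (TXWE in bit 13, RXRE in bit 10, TXDO in the low 8 bits), and it is an
assumption that received data appears in the low 8 bits of iUart.

```c
#include <assert.h>
#include <stdint.h>

#define UART_ADDR 0x4000u      /* oUart and iUart share this address */
#define UART_TXWE (1u << 13)   /* TX write enable */
#define UART_RXRE (1u << 10)   /* RX read enable */

/* Word to write to oUart in order to transmit one byte. */
uint16_t uart_tx_word(uint8_t byte)
{
	return (uint16_t)(UART_TXWE | byte);
}

/* Word to write to oUart in order to request the next received byte. */
uint16_t uart_rx_request(void)
{
	return (uint16_t)UART_RXRE;
}

/* Extract the received byte from an iUart value (low 8 bits, assumed). */
uint8_t uart_rx_data(uint16_t iuart)
{
	assert((UART_TXWE & UART_RXRE) == 0); /* the two enables are distinct bits */
	return (uint8_t)(iuart & 0xFFu);
}
```

Transmitting the character `'A'` would therefore mean writing `0x2041` to
address `0x4000`, after first checking in iUart that the TX FIFO has room.
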
#### oVT100

The VGA Text device emulates a terminal which the user can talk to by writing
to the oVT100 register. It supports a subset of the [VT100][] terminal
functionality. The interface behaves much like writing to a UART, with the same
busy and control signals. The input is taken from a [PS/2][] keyboard available
on the board; this behaves like the RX mechanism of the UART.

    +-------------------------------------------------------------------------------+
    | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
    +-------------------------------------------------------------------------------+
    |  X |  X |TXWE|  X |  X |RXRE|  X |  X |                 TXDO                  |
    +-------------------------------------------------------------------------------+

    TXWE: VT100 TX Write Enable
    RXRE: PS/2 Keyboard RX Read Enable
    TXDO: VT100 TX Data Output

#### oLeds

On the [Nexys3][] board there is a bank of LEDs situated next to the switches.
These LEDs can be turned on (1) or off (0) by writing to LEDO. Each LED here
corresponds to the switch it is next to.

    +-------------------------------------------------------------------------------+
    | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
    +-------------------------------------------------------------------------------+
    |  X |  X |  X |  X |  X |  X |  X |  X |                 LEDO                  |
    +-------------------------------------------------------------------------------+

    LEDO: LED Output

#### oTimerCtrl

The timer is controlled by the oTimerCtrl register. It is a 13-bit timer
running at 100MHz, it can optionally generate interrupts, and the timer's
current internal count can be read back with the iTimerDin register.

The timer counts once the TE bit is asserted. Once the timer reaches the TCMP
value it wraps around, and can optionally generate an interrupt if INTE is
asserted. This also toggles the Q and NQ lines that come out of the timer and
are routed to pins on the board (see the constraints file [top.ucf][] for the
pins).

The timer can be reset by writing to RST.

    +-------------------------------------------------------------------------------+
    | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
    +-------------------------------------------------------------------------------+
    | TE | RST|INTE|                              TCMP                              |
    +-------------------------------------------------------------------------------+

    TE:   Timer Enable
    RST:  Timer Reset
    INTE: Interrupt Enable
    TCMP: Timer Compare Value

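A control word for the timer could be assembled as in the [C][] sketch below.
The names are hypothetical, and the layout is an assumption: TE, RST and INTE
are taken to occupy bits 15, 14 and 13, with the 13-bit TCMP value in the
remaining low bits. This is consistent with the `Timer Control: 8032` value
shown in the debugger output later in this document, but the VHDL remains the
definitive source.

```c
#include <assert.h>
#include <stdint.h>

#define TIMER_TE        (1u << 15) /* timer enable (assumed position) */
#define TIMER_RST       (1u << 14) /* timer reset (assumed position) */
#define TIMER_INTE      (1u << 13) /* interrupt enable (assumed position) */
#define TIMER_TCMP_MASK 0x1FFFu    /* 13-bit compare value */

/* Build an oTimerCtrl word from its fields. */
uint16_t timer_ctrl(int enable, int interrupt, unsigned tcmp)
{
	assert(tcmp <= TIMER_TCMP_MASK);
	return (uint16_t)((enable ? TIMER_TE : 0u)
			| (interrupt ? TIMER_INTE : 0u)
			| tcmp);
}

/* Word that resets the timer, leaving everything else cleared. */
uint16_t timer_reset_word(void)
{
	return (uint16_t)TIMER_RST;
}
```

Under these assumptions `timer_ctrl(1, 0, 0x32)` gives `0x8032`: the timer is
enabled, interrupts are off and the compare value is `0x32`.
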
#### oIrcMask

The H2 core has a mechanism for interrupts. Interrupts have to be enabled or
disabled as a whole with an instruction, and each interrupt can also be masked
individually with a bit in IMSK. A '1' in a bit of IMSK enables that specific
interrupt, which will be delivered to the CPU if interrupts are enabled within
it.

    +-------------------------------------------------------------------------------+
    | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
    +-------------------------------------------------------------------------------+
    |  X |  X |  X |  X |  X |  X |  X |  X |                 IMSK                  |
    +-------------------------------------------------------------------------------+

    IMSK: Interrupt Mask

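#### oMemDout

Data to be output to the selected address when write enable (WE) is issued in
oMemControl.

    +-------------------------------------------------------------------------------+
    | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
    +-------------------------------------------------------------------------------+
    |                                  Data Output                                  |
    +-------------------------------------------------------------------------------+
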
#### oMemControl

This register contains the control bits for the onboard memory on the
[Nexys3][] board. The board contains three memory devices: two non-volatile
memory devices and a volatile RAM based device. The two devices with a simple
SRAM interface (one volatile, an M45W8MW16, and one non-volatile, an
NP8P128A13T1760E) are both accessible. The third is an SPI based memory device
(an NP5Q128A13ESFC0E) and is currently not accessible.

    +-------------------------------------------------------------------------------+
    | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
    +-------------------------------------------------------------------------------+
    | OE | WE | RST|WAIT| RCS| FCS|                   Address Hi                    |
    +-------------------------------------------------------------------------------+

    OE:  Output Enable - enable reading from current address into iMemDin
    WE:  Write Enable  - enable writing oMemDout into ram at current address
    RST: Reset the Flash memory controller
    RCS: RAM Chip Select, Enable Volatile Memory
    FCS: Flash Chip Select, Enable Non-Volatile Memory
    Address Hi: High Bits of RAM address

OE and WE are mutually exclusive; if both are set then there is no effect.

The memory controller is in active development, and the interface to it might
change.

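#### oMemAddrLow

This is the lower address bits of the RAM.

    +-------------------------------------------------------------------------------+
    | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
    +-------------------------------------------------------------------------------+
    |                                  Address Lo                                   |
    +-------------------------------------------------------------------------------+
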
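Since a memory address is split across oMemControl and oMemAddrLow, a read
request can be sketched in [C][] as follows. The type and function names are
hypothetical. The sketch assumes that Address Hi occupies the ten bits left
over after the six control bits in oMemControl, giving a 26-bit address, and
that exactly one chip select is raised per access.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_OE  (1u << 15) /* output enable: read into iMemDin */
#define MEM_WE  (1u << 14) /* write enable: write oMemDout */
#define MEM_RCS (1u << 11) /* RAM chip select */
#define MEM_FCS (1u << 10) /* flash chip select */
#define MEM_ADDR_HI_MASK 0x03FFu /* ten high address bits (assumed) */

typedef struct {
	uint16_t control;  /* value for oMemControl */
	uint16_t addr_low; /* value for oMemAddrLow */
} mem_request_t;

/* Build the register values for a read of `address` from RAM or flash. */
mem_request_t mem_read_request(uint32_t address, int flash)
{
	mem_request_t r;
	assert(address < (1ul << 26));
	r.addr_low = (uint16_t)(address & 0xFFFFu);
	r.control  = (uint16_t)(MEM_OE
			| (flash ? MEM_FCS : MEM_RCS)
			| ((address >> 16) & MEM_ADDR_HI_MASK));
	return r;
}
```

The low half would be written to oMemAddrLow first and the control word
second, so that the address is stable when OE is raised; that ordering is a
design choice of this sketch, not something the register description states.
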
#### o7SegLED

On the [Nexys3][] board there is a bank of 7 segment displays, each with a dot
(8-segment really), which can be used for numeric output. The LED segments
cannot be directly addressed. Instead the value stored in each L7SD field is
mapped to a hexadecimal display value (or a BCD value, but this requires
regeneration of the SoC and modification of a generic in the VHDL).

The value '0' corresponds to a zero displayed on the LED segment, '15' to an
'F', etcetera.

There are 4 displays in a row.

    +-------------------------------------------------------------------------------+
    | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
    +-------------------------------------------------------------------------------+
    |       L7SD0       |       L7SD1       |       L7SD2       |       L7SD3       |
    +-------------------------------------------------------------------------------+

    L7SD0: LED 7 Segment Display (leftmost display)
    L7SD1: LED 7 Segment Display
    L7SD2: LED 7 Segment Display
    L7SD3: LED 7 Segment Display (rightmost display)

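Because each display takes one 4-bit field, with the leftmost display in the
top bits, a 16-bit value written to o7SegLED shows up as its own four digit
hexadecimal representation. The [C][] sketch below (with hypothetical function
names) packs individual digits and, as an example use, converts a decimal
number into digit fields so that it reads as decimal on the hexadecimal
display.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four digit values, leftmost display first, into an o7SegLED word. */
uint16_t seg_pack(unsigned d0, unsigned d1, unsigned d2, unsigned d3)
{
	assert(d0 < 16 && d1 < 16 && d2 < 16 && d3 < 16);
	return (uint16_t)((d0 << 12) | (d1 << 8) | (d2 << 4) | d3);
}

/* Split a number in the range 0..9999 into one decimal digit per display. */
uint16_t seg_from_decimal(unsigned value)
{
	assert(value <= 9999);
	return seg_pack(value / 1000,
			(value / 100) % 10,
			(value / 10) % 10,
			value % 10);
}
```

For instance `seg_from_decimal(1234)` gives `0x1234`, which the display shows
as "1234".
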
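#### iUart

The iUart register works in conjunction with the oUart register. The status of
the FIFOs that buffer both transmission and reception of bytes is available in
the iUart register, as well as any received bytes.

    +-------------------------------------------------------------------------------+
    | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
    +-------------------------------------------------------------------------------+
    |  X |  X |  X |TFFL|TFEM|  X |RFFL|RFEM|                 RXDI                  |
    +-------------------------------------------------------------------------------+

    TFFL: UART TX FIFO Full
    TFEM: UART TX FIFO Empty
    RFFL: UART RX FIFO Full
    RFEM: UART RX FIFO Empty
    RXDI: UART RX Data Input
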
#### iVT100

The iVT100 register works in conjunction with the oVT100 register. The status of
the FIFOs that buffer both transmission and reception of bytes is available in
the iVT100 register, as well as any received bytes. It works the same as the
iUart/oUart registers.

    +-------------------------------------------------------------------------------+
    | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
    +-------------------------------------------------------------------------------+
    |  X |  X |  X |TFFL|TFEM|  X |RFFL|RFEM|  0 |               ACHR               |
    +-------------------------------------------------------------------------------+

    TFFL: VT100 TX FIFO Full
    TFEM: VT100 TX FIFO Empty
    RFFL: PS/2 Keyboard RX FIFO Full
    RFEM: PS/2 Keyboard RX FIFO Empty
    ACHR: New character available on PS/2 Keyboard

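#### iTimerDin

This register contains the current value of the timer's counter.

    +-------------------------------------------------------------------------------+
    | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
    +-------------------------------------------------------------------------------+
    |  X |  X |  X |                              TCNT                              |
    +-------------------------------------------------------------------------------+

    TCNT: Timer Counter Value
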
#### iSwitches

iSwitches contains input lines from multiple sources. The buttons
(BUP, BDWN, BLFT, BRGH, and BCNT) correspond to a [D-Pad][] on the [Nexys3][]
board. The switches (TSWI) are the ones mentioned in oLeds; each has an LED
next to it.

The switches and the buttons are already debounced in hardware, so they do not
have to be further processed once read in from these registers.

    +-------------------------------------------------------------------------------+
    | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
    +-------------------------------------------------------------------------------+

    BUP:  Button Up
    BDWN: Button Down
    BLFT: Button Left
    BRGH: Button Right
    BCNT: Button Center
    TSWI: Two Position Switches

#### iMemDin

Memory input, either from the SRAM or Flash, indexed by oMemControl and
oMemAddrLow. When reading from flash this might actually be status information
or information from the query table.

    +-------------------------------------------------------------------------------+
    | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
    +-------------------------------------------------------------------------------+
    |                                  Data Input                                   |
    +-------------------------------------------------------------------------------+

### Interrupt Service Routines

The following interrupt service routines are defined:

| Name              | Number | Description                   |
|-------------------|--------|-------------------------------|
| isrNone           | 0      | Not used                      |
| isrRxFifoNotEmpty | 1      | UART RX FIFO Is Not Empty     |
| isrRxFifoFull     | 2      | UART RX FIFO Is Full          |
| isrTxFifoNotEmpty | 3      | UART TX FIFO Is Not Empty     |
| isrTxFifoFull     | 4      | UART TX FIFO Is Full          |
| isrKbdNew         | 5      | New PS/2 Keyboard Character   |
| isrTimer          | 6      | Timer Counter                 |
| isrDPadButton     | 7      | Any D-Pad Button Change State |

When an interrupt occurs, and interrupts are enabled within the processor, a
call to a location in memory is performed; the location is the same as the
ISR number. For example, an ISR with a number of '4' will perform a call (not
a jump) to the location '4' within memory.

Interrupts have a latency of at least 4-5 cycles before they are acted on:
there is a two to three cycle delay in the interrupt request handler, then the
call to the ISR location in memory has to be done, then the call to the word
that implements the ISR itself.

If two interrupts occur at the same time they are processed from the lowest
interrupt number to the highest.

An interrupt is lost if another interrupt with the same number occurs before
the first has been processed.

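The priority and masking rules above can be modelled with a pending bit per
interrupt, as in this [C][] sketch. The function names are hypothetical, and
it is an assumption that bit `n` of IMSK corresponds to interrupt number `n`.
Holding pending requests as single bits also reproduces the loss of a repeated
interrupt: setting a bit that is already set changes nothing.

```c
#include <assert.h>
#include <stdint.h>

/* Return the lowest numbered pending and unmasked interrupt, or -1 if none.
 * The returned number is also the address that would be called. */
int irq_next(uint8_t pending, uint8_t mask)
{
	uint8_t active = pending & mask;
	for (int n = 0; n < 8; n++)
		if (active & (1u << n))
			return n;
	return -1;
}

/* Record a new request; a second request for a pending number is lost. */
uint8_t irq_raise(uint8_t pending, int n)
{
	assert(n >= 0 && n < 8);
	return (uint8_t)(pending | (1u << n));
}

/* Clear a request once it has been serviced. */
uint8_t irq_acknowledge(uint8_t pending, int n)
{
	assert(n >= 0 && n < 8);
	return (uint8_t)(pending & ~(1u << n));
}
```

If the timer (6) and TX FIFO not empty (3) interrupts are both pending and
unmasked, `irq_next` returns 3 first, and 6 once 3 has been acknowledged.
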
# The Toolchain

The assembler, disassembler and [C][] based simulator for the H2 are in a
single program (see [h2.c][]). This simulator complements the [VHDL][] test
bench [tb.vhd][] and is not a replacement for it.

To build it a [C][] compiler is needed. The build target "h2" will build the
executable:

    make h2

And it can be run on the source file [h2.fth][] with the make target:

    make run

The make file is not needed. On a Unix-like system:

    cc -std=c99 h2.c -o h2   # To build the h2 executable
    ./h2 -h                  # For a list of options
    ./h2 -T -R h2.fth        # Assemble h2.fth and run it

And on Windows:

    gcc -std=c99 h2.c -o h2.exe  # Builds the h2.exe executable
    h2.exe -h                    # For a list of options
    h2.exe -T -R h2.fth          # Assemble h2.fth and run it

A list of command line options available:

    -       stop processing options, following arguments are files
    -h      print a help message and exit
    -v      increase logging level
    -d      disassemble input files (default)
    -D      full disassembly of input files
    -T      Enter debug mode when running simulation
    -a      assemble file
    -r      run hex file
    -R      assemble file then run it
    -L #    load symbol file
    -S #    save symbols to file
    -s #    number of steps to run simulation (0 = forever)
    -n #    specify NVRAM block file (default is nvram.blk)
    file*   file to process

This program is released under the [MIT][] license, feel free to use it and
modify it as you please. With minimal modification it should be able to
assemble programs for the original [J1][] core.

## Assembler

The assembler is actually a compiler for a pseudo Forth-like language with a
fixed grammar. It is a much more restricted language than Forth and cannot be
extended within itself like Forth can.

The main program can be found in [h2.fth][], which is still a work in
progress.

The assembler/compiler reads in a text file containing a program and produces a
hex file which can be read in by the simulator, the disassembler, the VHDL test
bench, or the [Xilinx ISE][] toolchain when it generates the bit file
for the [Spartan 6][] on the [Nexys3][] board.

A rough [EBNF][] grammar for the language is as follows:

    Program     := Statement* EOF
    Statement   :=   Label | Branch | 0Branch | Call | Literal | Instruction
                   | Identifier | Constant | Variable | Location | Definition | If
                   | Begin | Char | Set | Pc | Pwd | Break | Mode | String | BuiltIn
    Label       := Identifier ";"
    Branch      := "branch"  ( Identifier | Literal )
    0Branch     := "0branch" ( Identifier | Literal )
    Call        := "call"    ( Identifier | Literal )
    Set         := ".set"    ( Identifier | Literal | String ) ( Identifier | Literal | String )
    Pc          := ".pc"     ( Identifier | Literal )
    Pwd         := ".pwd"    ( Identifier | Literal )
    Break       := ".break"
    BuiltIn     := ".built-in"
    Mode        := ".mode"      Literal
    Allocate    := ".allocate"  ( Identifier | Literal )
    Constant    := "constant" Identifier Literal "hidden"?
    Variable    := "variable" Identifier ( Literal | String ) "hidden"?
    Location    := "Location" Identifier ( Literal | String )
    Instruction := "@" | "store" | "exit" | ...
    Definition  := ":" ( Identifier | String) Statement* ";" ( "hidden" | "immediate" | "inline")?
    If          := "if" Statement* [ "else" ] Statement* "then"
    Begin       := "begin" Statement* ("until" | "again" | "while" Statement* "repeat")
    For         := "for" Statement* ("aft" Statement* "then" Statement* | "next")
    Literal     := [ "-" ] Number
    String      := '"' SChar* '"'
    Char        := "[char]" ASCII ","
    Number      := Hex | Decimal
    Decimal     := "0" ... "9" ("0" ... "9")*
    Hex         := "$" HexDigit HexDigit*
    HexDigit    := ( "0" ... "9" | "a" ... "f" | "A" ... "F" )
    SChar       := Any character except quote

Literals have higher priority than Identifiers. Comments run from '\' until a
new line is encountered, or from '(' until a ')' is encountered.

The grammar allows for nested word definitions, however state is held in the
lexer to prevent this.

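To make the `Literal`, `Number`, `Decimal` and `Hex` rules concrete, here is a
small [C][] recognizer for a single token. It is a sketch and not the lexer
from [h2.c][]: the name `parse_literal` is hypothetical, and rejecting values
that do not fit in 16 bits is an assumption. A token that fails this check
would be treated as an identifier, in line with literals having the higher
priority.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 and stores the 16-bit value if `s` is a literal, otherwise 0. */
int parse_literal(const char *s, uint16_t *out)
{
	int negative = 0;
	unsigned base = 10;
	unsigned long v = 0;

	assert(s && out);
	if (*s == '-') { negative = 1; s++; } /* Literal := [ "-" ] Number */
	if (*s == '$') { base = 16; s++; }    /* Hex := "$" HexDigit HexDigit* */
	if (!*s)
		return 0;                     /* at least one digit is needed */
	for (; *s; s++) {
		unsigned digit;
		if (*s >= '0' && *s <= '9')
			digit = (unsigned)(*s - '0');
		else if (base == 16 && *s >= 'a' && *s <= 'f')
			digit = (unsigned)(*s - 'a') + 10;
		else if (base == 16 && *s >= 'A' && *s <= 'F')
			digit = (unsigned)(*s - 'A') + 10;
		else
			return 0;             /* not a literal */
		v = v * base + digit;
		if (v > 0xFFFFul)
			return 0;             /* does not fit in a 16-bit cell */
	}
	/* Negative literals wrap to their two's complement form. */
	*out = (uint16_t)(negative ? (0x10000ul - v) & 0xFFFFul : v);
	return 1;
}
```

With this sketch `$10` parses as 16 and `-1` as `0xFFFF`, while `dup` and a
bare `$` are rejected.
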
The assembler has the following directives:

    .pc        Set the program counter
    .pwd       Set the previous word pointer
    .allocate  Increment the program counter
    .set       Set location in memory
    .mode      Change compiler mode
    .built-in  Assemble built-in words here

There are several optimizations that can be performed. The ".mode" directive
controls whether they are active, along with controlling whether word
definitions are compiled with their headers or not. Optimizations performed
include merging a call to exit with the previous instruction if it is possible
to do so, and performing tail call optimization where possible.

The built-in words, with their instruction encodings:

| Word   | 0 | 1 | 1 | ALU OPERATION  |T2N|T2R|N2A|R2P| RSTACK| DSTACK|
|--------|---|---|---|----------------|---|---|---|---|-------|-------|
| dup    | 0 | 1 | 1 | T              | X |   |   |   |       |  +1   |
| over   | 0 | 1 | 1 | N              | X |   |   |   |       |  +1   |
| invert | 0 | 1 | 1 | ~ T            |   |   |   |   |       |       |
| +      | 0 | 1 | 1 | T + N          |   |   |   |   |       |  -1   |
| swap   | 0 | 1 | 1 | N              | X |   |   |   |       |       |
| nip    | 0 | 1 | 1 | T              |   |   |   |   |       |  -1   |
| drop   | 0 | 1 | 1 | N              |   |   |   |   |       |  -1   |
| exit   | 0 | 1 | 1 | T              |   |   |   | X |  -1   |       |
| >r     | 0 | 1 | 1 | N              |   | X |   |   |  +1   |  -1   |
| r>     | 0 | 1 | 1 | R              | X |   |   |   |  -1   |  +1   |
| r@     | 0 | 1 | 1 | R              | X |   |   |   |       |  +1   |
| @      | 0 | 1 | 1 | [T]            |   |   |   |   |       |       |
| store  | 0 | 1 | 1 | N              |   |   | X |   |       |  -1   |
| rshift | 0 | 1 | 1 | N >> T         |   |   |   |   |       |  -1   |
| lshift | 0 | 1 | 1 | N << T         |   |   |   |   |       |  -1   |
| =      | 0 | 1 | 1 | N = T          |   |   |   |   |       |  -1   |
| u<     | 0 | 1 | 1 | N u< T         |   |   |   |   |       |  -1   |
| <      | 0 | 1 | 1 | N < T          |   |   |   |   |       |  -1   |
| and    | 0 | 1 | 1 | T & N          |   |   |   |   |       |  -1   |
| xor    | 0 | 1 | 1 | T ^ N          |   |   |   |   |       |  -1   |
| or     | 0 | 1 | 1 | T or N         |   |   |   |   |       |  -1   |
| depth  | 0 | 1 | 1 | depth          |   |   |   |   |       |  +1   |
| 1-     | 0 | 1 | 1 | T - 1          |   |   |   |   |       |       |
| seti   | 0 | 1 | 1 | set interrupts |   |   |   |   |       |  -1   |
| iset?  | 0 | 1 | 1 | interrupts on? |   |   |   |   |       |  +1   |
| rdepth | 0 | 1 | 1 | rdepth         |   |   |   |   |       |  +1   |
| 0=     | 0 | 1 | 1 | 0=             |   |   |   |   |       |       |
| up1    | 0 | 1 | 1 | T              |   |   |   |   |       |  +1   |
| nop    | 0 | 1 | 1 | T              |   |   |   |   |       |       |
| cpu-id | 0 | 1 | 1 | CPU ID         |   |   |   |   |       |  +1   |
| rdrop  | 0 | 1 | 1 | T              |   |   |   |   |  -1   |       |

The language used in the assembler is Forth-like. The best example of how to
use it is in the file "h2.fth", which contains a working Forth interpreter and
many Forth definitions. New words can be defined in the usual manner:

    : 2+ 2 + ;
    : ?dup dup if dup then ;

Control structure mismatches cause the parser to terminate with an error
condition, as the structures are handled by the parser itself: each ":" must
have a corresponding ";", an "if" must have either an "else" and then a
"then", or just a "then", and so on.

Variables and constants can also be defined, but the grammar is slightly
different to how it works in a normal Forth:

    variable x 55
    constant y 20

Constants take up no space unless they are used, whereas variables are
allocated a location and set to an initial value. The above example creates a
variable 'x' and sets the variable to '55'. It also adds a constant 'y' to the
current symbol table, which can be used in other function definitions.

Code that does not appear within a word definition is assembled at that
point in the program.

The following control structures are available:

    T = value to consume off the top of the stack
    A = First clause
    B = Second clause
    C = Third clause

    T if A else B then           If T != 0 execute A else execute B
    T if A then                  If T != 0 execute A
    begin A T until              Execute A until T != 0
    begin A again                Infinite loop, execute A
    begin A T while B repeat     Execute A, if T = 0 exit loop, else execute B
    T for A next                 Execute loop T times (stores
                                 loop parameter on the return stack)
    T for A aft B then C next    Execute loop T times, skip B on first loop
    label: A branch label        Branch to label
    label: A T 0branch label     Branch to label if T = 0

Unlike in a normal Forth environment, these control structures can be used
outside of function definitions. They must also be matched up correctly,
otherwise a syntax error will be raised.

## Disassembler

The disassembler takes a text file containing the assembled program, which
consists of 16-bit hexadecimal numbers, and attempts to disassemble the
instructions. It can also be fed a symbols file, which can be generated by the
assembler, and will then attempt to find the locations that jumps and calls
point to.

The disassembler is used by a [tcl][] script called by [GTKWave][]. It
turns the instruction trace of the H2 from a series of numbers into the
instructions and branch destinations that they represent. This makes debugging
the VHDL much easier.

In the [GTKWave][] view, the purple trace shows the disassembled instructions.

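Reading such a file is straightforward. The [C][] sketch below parses a buffer
of hexadecimal text into 16-bit words. It is illustrative only:
`parse_hex_words` is a hypothetical name, and it assumes the words are
separated by whitespace (for example one per line) and carry no prefix.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse whitespace separated hexadecimal words from `text` into `out`.
 * Returns the number of words read, or -1 on malformed input, on a value
 * wider than 16 bits, or if more than `max` words are present. */
long parse_hex_words(const char *text, uint16_t *out, size_t max)
{
	size_t count = 0;
	assert(text && out);
	for (;;) {
		char *end = NULL;
		unsigned long v;
		while (isspace((unsigned char)*text))
			text++;
		if (!*text)
			break;                /* end of input */
		v = strtoul(text, &end, 16);
		if (end == text || v > 0xFFFFul || count >= max)
			return -1;
		out[count++] = (uint16_t)v;
		text = end;
	}
	return (long)count;
}
```

Each word read this way can then be handed to a decoder to recover the
instruction class and its fields.
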
## Simulator

The simulator in C implements the H2 core and most of the SoC. The IO for the
simulator is not cycle accurate (and most likely will never be), but can be
used for running and debugging programs with results that are very similar to
how the hardware behaves. This is much faster than rebuilding the bit file used
to flash the [FPGA][].

## Debugger

The simulator also includes a debugger, which is designed to be similar to the
[DEBUG.COM][] program available in [DOS][]. The debugger can be used to
disassemble sections of memory, inspect the status of the peripherals and dump
sections of memory to the screen. It can also be used to set breakpoints,
single step and run through the code until a breakpoint is hit.

To run the debugger either a hex file or a source file must be given:

    # -T turns debugging mode on
    ./h2 -T -r file.hex  # Run simulator
    ./h2 -T -R file.fth  # Assemble and run some code

Both modes of operation can be augmented with a symbols file, which lists where
variables, labels and functions are located within the assembled core.

When the "-T" option is given, debug mode will be entered before the simulation
is executed. A prompt should appear and the command line should look like this:

    $ ./h2 -T -R h2.fth
    Debugger running, type 'h' for a list of command
    debug>

Breakpoints can be set either symbolically or by program location; the 'b'
command is used to set breakpoints.

Numbers can be entered in octal (prefix the number with '0'), hexadecimal
(prefix with '0x') or in decimal. As an example, the following three debug
commands all set a breakpoint at the same location:

    debug> b 16
    debug> b 0x10
    debug> b 020

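The three prefixes are the same ones the standard [C][] function `strtol`
accepts when it is given a base of 0, so number parsing of this kind can be
sketched as below. Whether [h2.c][] does it this way is not stated here, and
`debug_number` is a hypothetical name. An argument that is not a number, such
as a symbol name, is rejected so that it can be looked up symbolically
instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Returns 1 and stores the value if `s` is an octal, hexadecimal or decimal
 * number in C notation, otherwise returns 0. */
int debug_number(const char *s, long *out)
{
	char *end = NULL;
	long v;
	assert(s && out);
	errno = 0;
	v = strtol(s, &end, 0); /* base 0: the prefix selects the base */
	if (errno || end == s || *end != '\0')
		return 0;
	*out = v;
	return 1;
}
```

Here `16`, `0x10` and `020` all produce the same value, while `key?` is not
accepted as a number.
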
'k' can be used to list the current break points that are set:
852 |
853 |
debug> k
854 |
855 |
856 |
This sets a breakpoint when the function "key?" is called:
857 |
858 |
debug> b key?
859 |
860 |
Functions and labels can both be halted on, this requires either a
861 |
symbols file to be specified on the command line or assemble and run
862 |
to be used on a source file, not a hex file. Symbol files can be used
863 |
on source or on hex files.
864 |
865 |
To single step the 's' command can be given, although not much will happen if
tracing is turned off (tracing is off by default). Tracing can be toggled on or
off with the 't' command:

    debug> s
    debug> s
    debug> t
    trace on
    debug> s
    0001: pc(089a) inst(4889) sp(0) rp(0) tos(0000) r(0000) call 889 init
    debug> s
    0002: pc(0889) inst(807a) sp(0) rp(1) tos(0000) r(089b) 7a
    debug> s
    0003: pc(088a) inst(e004) sp(1) rp(1) tos(007a) r(089b) 6004

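The trace shows how an instruction word relates to its disassembly: 4889 is a
call to 889, while 807a and e004 push the literals 7a and 6004. A minimal
sketch of that top-level decoding follows. The call and literal cases agree
with the trace above; the remaining classes are an assumption that the H2 keeps
the J1's layout, and all of the names are hypothetical.

```c
#include <stdint.h>

/* Instruction classes, assuming the H2 keeps the J1's top-level encoding. */
typedef enum {
	INSTR_LITERAL, /* 1xxx xxxx xxxx xxxx: push a 15-bit literal  */
	INSTR_JUMP,    /* 000a aaaa aaaa aaaa: unconditional jump     */
	INSTR_CJUMP,   /* 001a aaaa aaaa aaaa: conditional jump       */
	INSTR_CALL,    /* 010a aaaa aaaa aaaa: call                   */
	INSTR_ALU      /* 011x xxxx xxxx xxxx: ALU operation          */
} instr_class_e;

instr_class_e instr_class(uint16_t inst)
{
	if (inst & 0x8000u)
		return INSTR_LITERAL;
	switch ((inst >> 13) & 0x3u) {
	case 0:  return INSTR_JUMP;
	case 1:  return INSTR_CJUMP;
	case 2:  return INSTR_CALL;
	default: return INSTR_ALU;
	}
}

/* The literal value, or the low 13 bits (address or ALU fields) otherwise. */
uint16_t instr_operand(uint16_t inst)
{
	return (inst & 0x8000u) ? (uint16_t)(inst & 0x7fffu) : (uint16_t)(inst & 0x1fffu);
}
```
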
It is advisable to turn tracing off before issuing the 'c', or continue,
command.

The '.' command can be used to display the H2 core's internal state:

    debug> .
    Return Stack:
    0000: 0000 08aa 0883 017b 0000 031b 0000 ffb0 0000 02eb ffb5 0210 0167 0167
    0167 0167
    0010: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
    0000 0000

    Variable Stack:
    tos: 0000
    0001: 0000 0000 0000 0001 0004 0005 0000 ffb0 0000 0000 0000 0000 0000 0000
    0000 0000
    0011: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
    0000 0000

    pc: 0538
    rp: 0001
    dp: 0000
    ie: false

And the 'p' command can be used to display the state of the simulated
peripherals:

    debug> p
    LEDS: 00
    VGA Cursor: 0005
    VGA Control: 007a
    Timer Control: 8032
    Timer: 001b
    IRC Mask: 0000
    UART Input: 6c
    LED 7seg: 0005
    Switches: 00
    LFSR: 40ba
    Waiting: false

For a complete list of commands, use the 'h' command.

Other ways to enter debug mode include putting the ".break" assembler directive
into the source code (this only works if the assemble and run command is used
on source files, not on hex files), and hitting the escape character when the
simulator is trying to read data via the simulated UART or PS/2 keyboard (the
escape will still be passed onto the simulator, but it also activates debug
mode).

## Graphical simulator

A separate program, tested under [Linux][] and [Windows][], can also be compiled.
This simulates the [Nexys3][] board peripherals that the SoC interfaces with,
but provides a graphical environment, unlike the command line utility. It is easier
to interact with the device and see what it is doing, but the debugging sessions
are less controlled. It requires [free glut][].

* VGA output works
* UART or PS/2 input (selectable by pressing F11) comes from typing in the window,
  and in the case of the UART this is buffered with a FIFO.
* UART output gets written to a display box.
* There are four 7-Segment displays as on the original board.
* The switches and push buttons can take their input from either keyboard keys
  or from mouse clicks.
* The LED indicators above the switches can be lit up.

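Buffering the typed characters means that keys pressed faster than the
simulated core reads them are not lost until the buffer fills up. A minimal
sketch of such a FIFO as a fixed size ring buffer follows; the depth, the type
and the function names are assumptions made for illustration and are not taken
from the GUI simulator.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_DEPTH 8u /* assumed depth, chosen only for the example */

typedef struct {
	uint8_t data[FIFO_DEPTH];
	size_t head;  /* index of the next free slot   */
	size_t tail;  /* index of the oldest character */
	size_t count; /* number of characters stored   */
} fifo_t;

/* Add a character, returns false (and drops it) if the FIFO is full. */
bool fifo_push(fifo_t *f, uint8_t c)
{
	if (f->count == FIFO_DEPTH)
		return false;
	f->data[f->head] = c;
	f->head = (f->head + 1u) % FIFO_DEPTH;
	f->count++;
	return true;
}

/* Remove the oldest character, returns false if the FIFO is empty. */
bool fifo_pop(fifo_t *f, uint8_t *c)
{
	if (f->count == 0)
		return false;
	*c = f->data[f->tail];
	f->tail = (f->tail + 1u) % FIFO_DEPTH;
	f->count--;
	return true;
}
```

In this arrangement the keyboard callback would call `fifo_push` and the
simulated UART receive register would be filled with `fifo_pop`.
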
Below is an image of a running session in the GUI simulator:

Building can be done with:

    make gui

And running:

    make gui-run

Or:

    ./gui h2.hex (on Linux)
    gui.exe h2.hex (on Windows)

The [Linux][] build should work when the development package for [free glut][]
is installed on your system; the [Windows][] build may require changes to the
build system and/or manual installation of the compiler, libraries and headers.

The current key map is:

    Up         Activate Up D-Pad Button, Release turns off
    Down       Activate Down D-Pad Button, Release turns off
    Left       Activate Left D-Pad Button, Release turns off
    Right      Activate Right D-Pad Button, Release turns off
    F1 - F8    Toggle Switch On/Off, F1 is left most, F8 Right Most
    F11        Toggle UART/PS2 Keyboard Input
    F12        Toggle Debugging Information
    Escape     Quit simulator

All other keyboard keys are redirected to the UART or PS/2 Keyboard input.

The Switches and D-Pad buttons can be clicked on to turn them on; the switches
turn on with left clicks and off with right clicks. The D-Pad buttons turn on
with a click on top of them and turn off with a key release anywhere on the
screen.

# VHDL Components

The VHDL components used in this system are designed to be reusable and
portable across different toolchains and vendors. Hardware components, like block
RAM, are inferred and not explicitly instantiated. The components are also made
to be as generic as possible, with most having selectable widths. This would be
taken to the extreme, but unfortunately many vendors still do not support the
VHDL-2008 standard.

| File     | License    | Author          | Description                         |
| -------- | ---------- | --------------- | ----------------------------------- |
| util.vhd | MIT        | Richard J Howe  | A collection of generic components  |
| h2.vhd   | MIT        | Richard J Howe  | H2 Forth CPU Core                   |
| vga.vhd  | LGPL 3.0   | Javier V García | Text Mode VGA 80x40 Display         |
| uart.vhd | Apache 2.0 | Peter A Bennett | UART, modified from original        |
| kbd.vhd  | ???        | Scott Larson    | PS/2 Keyboard                       |
| led.vhd  | MIT        | Richard J Howe  | LED 7-Segment + Dot Display Driver  |

# eForth on the H2

The pseudo Forth like language used as an assembler is described above; the
application that actually runs on the Forth core is itself a Forth
interpreter. This section describes the Forth interpreter that runs on the H2
core, which is contained within [h2.fth][].

TODO:

- Fully implement the Forth interpreter
- Describe and show its operation here, including memory layout, list of
  words, word layout, ...

# Using Forth as a bootloader

A running Forth environment can quite easily be used as a bootloader with no
further modification. A simple protocol for sending data and verifying it
can be built using only Forth primitives, although it is not the most
efficient use of bandwidth.

The sender can interrogate the running Forth environment over the serial link
to determine the amount of space left in memory, and then populate it with an
assembled binary.

The Forth words needed are:

| Word    | Description           |
| ------- | --------------------- |
| .free   | show free space       |
| cp      | compile pointer       |
| pwd     | previous word pointer |
| @       | load                  |
| !       | store                 |
| cr      | print new line        |
| execute | execute               |
| decimal | set decimal output    |
| cells   | size of cell          |
| .       | print number          |

And of course numeric input, all of which are provided by this interpreter. The
protocol is line oriented: the host with the program to transfer to the H2
(called PC) sends a line of text and expects a reply from the H2 board (called
H2):

    PC: decimal          ( set the H2 core to a known numeric output )
    PC: .free cp @ . cr  ( query how much space is left, and where to put it )
    H2: ADDR ADDR        ( H2 replies with both addresses )
    PC: 1 cells . cr     ( PC queries size of cells )
    H2: 2                ( H2 responds, PC now knows how much to increment ADDR by )
    PC: NUM ADDR !       ( PC writes NUM to ADDR )
    PC: ADDR @ . cr      ( optionally PC checks value )
    H2: NUM              ( H2 responds with value stored at ADDR )
    ...                  ( PC and H2 do this as often as necessary )
    PC: ADDR pwd !       ( PC optionally updates previous word register )
    PC: ADDR cp !        ( PC optionally updates compile pointer )
    PC: ADDR execute     ( Begin execution of word )

The advantage of this "protocol" is that it is human readable, and includes a
debugger for the microcontroller it is operating on.

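The PC side of the store and check steps only has to format and parse short
lines of text. The sketch below shows one possible way to do that; the function
names are hypothetical, and the serial port handling is left out. It assumes
the H2 has been put into decimal mode first, and it reduces the reply to 16
bits so that a value such as 0xffff matches whether "." prints it as "-1" or
as "65535".

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical host-side helpers; none of these names come from the project. */

/* Format the "NUM ADDR !" line that stores one cell on the H2. */
int boot_store_line(char *buf, size_t len, uint16_t num, uint16_t addr)
{
	return snprintf(buf, len, "%u %u !", (unsigned)num, (unsigned)addr);
}

/* Format the "ADDR @ . cr" line that asks the H2 to print a cell back. */
int boot_check_line(char *buf, size_t len, uint16_t addr)
{
	return snprintf(buf, len, "%u @ . cr", (unsigned)addr);
}

/* Compare the H2's decimal reply with the value that was sent. Returns
 * non-zero on a match, zero on a mismatch or if no number was found. */
int boot_reply_ok(const char *reply, uint16_t expected)
{
	char *end = NULL;
	const long v = strtol(reply, &end, 10);
	return end != reply && (uint16_t)v == expected;
}
```

A transfer loop would call `boot_store_line` for each cell, advance the address
by the cell size reported by the H2, and optionally verify each store with
`boot_check_line` and `boot_reply_ok`.
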
# A simple Forth block editor

TODO:

- Add an SPI master with Rx/Tx FIFOs to the SoC, which can then be used to
  communicate with the memory on the [Nexys3][]; this can then be used in
  conjunction with a simple block editor for a full Forth experience.
- Talk about implementing a simple block editor in a few words of Forth.

# Coding standards

There are several languages used throughout this project, all of which are
radically different from each other and require their own set of coding
standards and style guides.

## VHDL

Common signal names:

    clk       - The system clock
    rst       - A reset signal for the module
    we        - Write Enable
    re        - Read Enable
    di        - Data In
    din       - Data In
    do        - Data Out
    dout      - Data Out
    control   - Generally an input to a register, the documentation
                for the module will need to be consulted to find out
                what each bit means
    signal_we - The write enable for 'signal'
    signal_i  - This is an input signal
    signal_o  - This is an output signal

Generally the "\_i" and "\_o" suffixes are not used; modules are
kept short and names chosen so their meaning is obvious. This rule might be
revisited once the project grows.

Components should:

* Be as generic as possible
* Use an asynchronous reset
* If a feature of a module can be made optional, by either ignoring outputs
  or setting inputs to sensible values, it should be.
* Where possible use a function; it is easy enough to turn a generic
  component into a module that can be synthesized but not the other way around.
* Use "downto" not "to" when specifying variable ranges.
* Use assertions throughout the code with the correct severity level ('failure'
  for when something has seriously gone wrong or 'error' for debugging purposes)
* Constrain types and generic parameters if possible. As an example, if a generic
  value should never be zero, use "positive" not "natural".
* Try not to specify constants with fixed lengths where an expression using
  "others" can be used instead, for example:

      constant N: positive := 4;
      signal a: std_logic_vector(N - 1 downto 0) := (others => '1');

  Instead of:

      signal a: std_logic_vector(3 downto 0) := x"F";

The style rules are as follows:

* All words, including keywords, are to be in lower case. An underscore
  will separate words in names.
* Tabs are to be used to indent text; a tab spacing of 8 has been used when
  making the VHDL code.
* Do not repeat the name of an entity, component, function or architecture.
  There is little point in repeating this; it just means that when a unit has
  to be renamed it has to be done in two places instead of one.
* The ":" in definitions of signals belongs next to the signal name, not
  some arbitrary amount of spaces after it.
* Group related signals.
* Try to line up rows of signals.
* Trigger logic on the rising edge, and use the "rising\_edge" function not
  "clk'event and clk ='1'".
* By and large, each warning produced by the synthesis tool should be
  justified, and there should be very few warnings in the entire project, if any.
* Do not use inferred latches.
* Load data from a file instead of generating VHDL files that contain the data.
  Synthesis tools can handle impure VHDL functions that can read the initial data
  (for a ROM or block RAM as an example) from textual files.

An example of the formatting guidelines, which describes a simple arbitrary
width register:

    -- Lots of comments about what the unit does should go
    -- here. Describe the waveforms, states and use ASCII
    -- art where possible.
    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all; -- numeric_std not std_logic_arith

    entity reg is -- generic and port indented one tab, their parameters two
    	generic(
    		N: positive); -- Generic parameters make for a generic component
    	port(
    		clk: in  std_logic; -- standard signal names
    		rst: in  std_logic; --
    		we:  in  std_logic;
    		di:  in  std_logic_vector(N - 1 downto 0);
    		do:  out std_logic_vector(N - 1 downto 0)); -- note the position of ");
    end entity; -- "end entity", not "end reg"

    architecture rtl of reg is
    	signal r_c, r_n: std_logic_vector(N - 1 downto 0) := (others => '0');
    begin
    	do <= r_c;

    	process(rst, clk)
    	begin
    		if rst = '1' then -- asynchronous reset
    			r_c <= (others => '0');
    		elsif rising_edge(clk) then -- rising edge, not "clk'event and clk = '1'"
    			r_c <= r_n;
    		end if;
    	end process;

    	process(r_c, di, we)
    	begin
    		r_n <= r_c;
    		if we = '1' then
    			r_n <= di;
    		end if;
    	end process;
    end; -- "end" or "end architecture"

## C

There is quite a lot of [C][] code within this project, which is used to make a
tool chain for the H2 core and to simulate the system. It follows a fairly
strict coding style.

* Tabs are to be used instead of spaces. A tab width of 8 was used when coding
  the C; if this causes any code to go off screen then there is a problem with
  the code and not the tab length.
* Generally the [K&R][] style is followed.
* Line lengths should ideally be limited to 80 characters, but this is
  definitely not an enforced limit.
* Where there are two or more data structures that must be kept in sync, with a
  one to one correspondence of elements, such as an enumeration and an array of
  strings that each enumeration maps onto, an [X-Macro][] should be used to
  keep the data in sync and to initialize the enumeration and array of strings.
* Try to use only portable constructs and isolate the constructs that are not
  portable.

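As an illustration of the [X-Macro][] rule, the sketch below generates an
enumeration and its matching array of strings from a single list. The three
entries and their order are chosen for the example only and are not copied from
the project's source; `alu_op_name` is a hypothetical function.

```c
#include <stddef.h>

/* The single list that both the enumeration and the string table are built
 * from. Adding a line here updates both, so they cannot drift apart. */
#define X_MACRO_ALU_OPS \
	X(ALU_OP_T,        "T")\
	X(ALU_OP_N,        "N")\
	X(ALU_OP_T_PLUS_N, "T+N")

typedef enum {
#define X(ENUM, NAME) ENUM,
	X_MACRO_ALU_OPS
#undef X
	ALU_OP_COUNT /* number of entries, always last */
} alu_op_e;

static const char *alu_op_names[] = {
#define X(ENUM, NAME) [ENUM] = NAME,
	X_MACRO_ALU_OPS
#undef X
};

/* Map an operation number onto its name, "unknown" if out of range. */
const char *alu_op_name(unsigned op)
{
	return op < ALU_OP_COUNT ? alu_op_names[op] : "unknown";
}
```
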
There is nothing too surprising about the [C][] code in this project, so only
the exceptions to the usual conventions are dealt with here.

* Switch statements are formatted depending upon what their 'case'
  clauses look like. If the clauses are simple one liners, such as an assignment
  or a mapping, then each clause should occupy only a single line, for
  example:

      static const char *alu_op_to_string(uint16_t instruction)
      {
      	/* notice also that the 'case' clauses are inline with the
      	 * switch selector */
      	switch(ALU_OP(instruction)) {
      	case ALU_OP_T: return "T";
      	case ALU_OP_N: return "N";
      	case ALU_OP_T_PLUS_N: return "T+N";
      	case ALU_OP_T_AND_N: return "T&N";
      	case ALU_OP_T_OR_N: return "T|N";
      	case ALU_OP_T_XOR_N: return "T^N";
      	case ALU_OP_T_INVERT: return "~T";
      	case ALU_OP_T_EQUAL_N: return "N=T";
      	case ALU_OP_N_LESS_T: return "T>N";
      	case ALU_OP_N_RSHIFT_T: return "N>>T";
      	case ALU_OP_T_DECREMENT: return "T-1";
      	case ALU_OP_R: return "R";
      	case ALU_OP_T_LOAD: return "[T]";
      	case ALU_OP_N_LSHIFT_T: return "N<<T";
      	case ALU_OP_DEPTH: return "depth";
      	case ALU_OP_N_ULESS_T: return "Tu>N";
      	case ALU_OP_ENABLE_INTERRUPTS: return "seti";
      	case ALU_OP_INTERRUPTS_ENABLED: return "iset?";
      	case ALU_OP_RDEPTH: return "rdepth";
      	case ALU_OP_T_EQUAL_0: return "0=";
      	case ALU_OP_CPU_ID: return "cpu-id";
      	default: return "unknown";
      	}
      }

* Unnecessary braces are avoided.
* "goto" can be used. It can be misused, but using it does not instantly make
  code inscrutable, contrary to popular belief.

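A common and readable use of "goto" in C is to share clean-up code between
several failure paths. The sketch below is an illustrative example and is not
taken from the project; `count_lines` is a hypothetical function. It also
leaves single statement bodies without braces.

```c
#include <stdio.h>

/* Hypothetical example: count the newlines in a file. Both failure paths
 * share the clean-up code at the bottom instead of repeating it.
 * Returns 0 on success and -1 on failure. */
int count_lines(const char *path, long *lines)
{
	int r = -1, c = 0;
	long n = 0;
	FILE *in = fopen(path, "rb");
	if (!in)
		goto fail;
	while ((c = fgetc(in)) != EOF)
		if (c == '\n')
			n++;
	if (ferror(in))
		goto cleanup;
	*lines = n;
	r = 0;
cleanup:
	fclose(in);
fail:
	return r;
}
```
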
## Forth

The code in [h2.fth][], which is used to build an actual Forth system for the
target core, is not written in a proper Forth but in a Forth like language with
its own compiler; this idiosyncratic language has its own way of doing things.
The workings of the language will not be described in this section, only the
coding standards and style guide.

Either type of comment ("( )" or "\\") can be used, although "( )" comments are
preferred. Single line words should be short, at only a few words; multi line
words should be indented properly.

    : 1+ 1 + ;
    : negate invert 1+ ;
    : dnegate not >r not 1 um+ r> + ; ( d -- d )

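The short words above rely on two's complement arithmetic: negating a number is
the same as inverting it and adding one, and for a double cell number the carry
out of the low cell has to be added to the inverted high cell. The same
arithmetic is sketched below in C for 16-bit cells; the type and function names
are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* "negate" as "invert 1+": two's complement negation of one 16-bit cell. */
uint16_t negate16(uint16_t x)
{
	return (uint16_t)(~x + 1u);
}

/* A double cell number, low cell and high cell. */
typedef struct {
	uint16_t lo, hi;
} dcell_t;

/* "dnegate": invert both cells, add one to the low cell (as "1 um+" does,
 * which also yields a carry) and add that carry to the high cell. */
dcell_t dnegate(dcell_t d)
{
	const uint32_t lo = (uint32_t)(uint16_t)~d.lo + 1u; /* carry in bit 16 */
	const dcell_t r = {
		(uint16_t)lo,
		(uint16_t)((uint16_t)~d.hi + (lo >> 16))
	};
	return r;
}
```
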
Tabs should be used for indentation and a stack comment should be present for
long or complex words.

    : ?rx ( -- c -1 | 0 : read in a character of input from UART )
    	iUart @ 0x0100 and 0= ( the value to test goes on one line )
    	if ( the 'if' on another line )
    		0x0400 oUart ! iUart @ 0xff and -1
    	else
    		0
    	then ; ( ';' is part of the final statement )

    \ This word is too long for the stack comment and to be on a
    \ single line
    : parse ( c -- b u ; )
    	>r tib >in @ + #tib @ >in @ - r> parser >in +! ;

Space is seriously limited on the target device at only 8192 cells (16KiB), so
words are kept as short as possible, and programs are highly factored. Speed is
not so much of an issue as the board and core run at 100MHz.

Stack comments describe what values a Forth word takes and returns. It is
good practice to make words that accept and return a fixed number of parameters,
but in certain circumstances it is advantageous to return a variable number of
arguments. The comments also describe the type of the arguments the word
accepts; the Forth kernel will do no checking on the data it gets, however.

Stack comments should be added with the following scheme:

| Comment   | Meaning                             |
| --------- | ----------------------------------- |
| a         | cell address                        |
| n         | signed number                       |
| u         | unsigned number                     |
| b         | string address                      |
| c         | single character                    |
| d         | double width number (2 Cells)       |
| f         | boolean flag (-1 = true, 0 = false) |
| k         | block number                        |
| cfa       | code field address of a word        |
| nfa       | name field address of a word        |
| pwd       | previous word address of a word     |
| \<string> | a parsing word                      |

Stack comments have the following format:

    Variable Stack Effects   Return Stack Effects      Parsing    Description
    ( arguments -- returns;  R: arguments -- returns;  <string> : comment )

Examples of words and their stack comments:

    dup   ( n -- n n : duplicate a number )
    >r    ( n -- ; R: -- n : move a number to the return stack )
    r>    ( -- n ; R: n -- : move a number from the return stack )
    parse ( c -- b u; : parse a word delimited by 'c' )
    over  ( n1 n2 -- n1 n2 n1 : duplicate next on stack over top )

Words can have their arguments numbered to make it clearer what the effects
are.

# To Do

* My [embed][] project, which was derived from the simulator and Forth for this
  project, has an improved version of Forth which could be reintegrated with
  this project. The [embed][] project features a metacompiler suitable for 16-bit
  systems like this one; it could be used in lieu of the Pseudo-Forth compiler.
* Guide to reusing the VHDL in this project, and component listing
* Turn the [h2.fth][] into a literate program file, describing how to build up
  a Forth system like in "The Zen of eForth by C. H. Ting".
* Make a bootloader/program loader as a single, simple program
* Make diagrams of the SoC layout, take GIFs and pictures of the simulators and
  the running board.
* Make a JavaScript based simulator for the H2, perhaps with [emscripten][]
* Move this file to "h2.md" and make a simpler "readme.md" with a short
  description and flashy GIFs
* Add notes about picocom, and setting up the hardware:
* Add a [Wishbone interface][] for each component
* Put the project up on [opencores][]
* It should be possible to turn [h2.c][] into a Forth interpreter usable in
  a hosted environment. Some of the remaining instructions could be used for
  function calls and floating point operations and the "uint16\_t" would have to
  be changed to "uintptr\_t". As the interpreter is so small a Forth to "C"
  compiler could just spit out a copy of the interpreter. It should make for
  a very compact system.
* Spin off the VT100 VHDL code into a separate project, and the C code in the
  simulator into a simple terminal emulator for Unixen.
* Add some example games, any of the following: Space Invaders, Tetris, Pong,
  Minesweeper, Sokoban, Sudoku, Hack, 2048, Breakout, Brain F\*ck, Conway's
  Game of Life, and other terminal based games could be added, see
  for a
  minimal version of Nethack, and for a
  cool 1 dimensional game.
* Other programs like a compressor, decompressor, ARC4, database functions based
  around blocks, and a meta compiler (removing the need for the C assembler),
  could be added.

The [picocom][] invocation referred to above:

    picocom --omap delbs -b 115200 -e b /dev/ttyUSB1

# Resources

[javascript]: https://www.javascript.com/
[emscripten]: https://github.com/kripken/emscripten
[DEBUG.COM]: https://en.wikipedia.org/wiki/Debug_%28command%29
[DOS]: https://en.wikipedia.org/wiki/DOS
[h2.c]: h2.c
[h2.fth]: h2.fth
[tb.vhd]: tb.vhd
[uart.vhd]: uart.vhd
[top.ucf]: top.ucf
[font.bin]: font.bin
[text.bin]: text.bin
[J1]: http://www.excamera.com/sphinx/fpga-j1.html
[J1 PDF]: http://excamera.com/files/j1.pdf
[PL/0]: https://github.com/howerj/pl0
[libforth]: https://github.com/howerj/libforth/
[MIT]: https://en.wikipedia.org/wiki/MIT_License
[LGPL]: https://www.gnu.org/licenses/lgpl-3.0.en.html
[VHDL]: https://en.wikipedia.org/wiki/VHDL
[Verilog]: https://en.wikipedia.org/wiki/Verilog
[UART]: https://en.wikipedia.org/wiki/Universal_asynchronous_receiver/transmitter
[FORTH]: https://en.wikipedia.org/wiki/Forth_%28programming_language%29
[Nexys3]: http://store.digilentinc.com/nexys-3-spartan-6-fpga-trainer-board-limited-time-see-nexys4-ddr/
[Make]: https://en.wikipedia.org/wiki/Make_%28software%29
[C]: https://en.wikipedia.org/wiki/C_%28programming_language%29
[Debian]: https://en.wikipedia.org/wiki/Debian
[Linux]: https://en.wikipedia.org/wiki/Linux
[GCC]: https://en.wikipedia.org/wiki/GNU_Compiler_Collection
[Xilinx ISE]: https://www.xilinx.com/products/design-tools/ise-design-suite.html
[Xilinx]: https://www.xilinx.com
[GHDL]: http://ghdl.free.fr/
[GTKWave]: http://gtkwave.sourceforge.net/
[C99]: https://en.wikipedia.org/wiki/C99
[tcl]: https://en.wikipedia.org/wiki/Tcl
[Wishbone interface]: https://en.wikipedia.org/wiki/Wishbone_%28computer_bus%29
[D-Pad]: https://en.wikipedia.org/wiki/D-pad
[FIFO]: https://en.wikipedia.org/wiki/FIFO_%28computing_and_electronics%29
[VGA]: https://en.wikipedia.org/wiki/Video_Graphics_Array
[PS/2]: https://en.wikipedia.org/wiki/PS/2_port
[LED]: https://en.wikipedia.org/wiki/Light-emitting_diode
[8 Segment LED Display]: https://en.wikipedia.org/wiki/Seven-segment_display
[ISO 8859-1 (Latin-1)]: https://cs.stanford.edu/people/miles/iso8859.html
[Spartan 6]: https://www.xilinx.com/products/silicon-devices/fpga/spartan-6.html
[FPGA]: https://en.wikipedia.org/wiki/Field-programmable_gate_array
[ASCII]: https://en.wikipedia.org/wiki/ASCII
[free glut]: http://freeglut.sourceforge.net/
[pthreads]: https://en.wikipedia.org/wiki/POSIX_Threads
[LFSR]: https://en.wikipedia.org/wiki/Linear-feedback_shift_register
[freeglut]: http://freeglut.sourceforge.net/
[EBNF]: https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form
[K&R]: https://en.wikipedia.org/wiki/Indent_style#K.26R
[X-Macro]: https://en.wikipedia.org/wiki/X_Macro
[Windows]: https://en.wikipedia.org/wiki/Windows_7
[pandoc]: https://pandoc.org
[picocom]: https://github.com/npat-efault/picocom
[Gforth]: https://www.gnu.org/software/gforth/
[opencores]: https://opencores.org
[VT100]: https://en.wikipedia.org/wiki/VT100
[embed]: https://github.com/howerj/embed