# Forth computing system

| Project   | Forth SoC written in VHDL |
| --------- | ------------------------- |
| Author    | Richard James Howe        |
| Copyright | 2013-2017 Richard Howe    |
| License   | MIT/LGPL                  |
| Email     | howe.r.j.89@gmail.com     |

![H2 build status](https://travis-ci.org/howerj/forth-cpu.svg?branch=master "Build status of the H2 Assembler")

# Introduction

This project implements a small stack computer tailored to executing Forth
based on the [J1][] CPU. The processor has been rewritten in [VHDL][] from
[Verilog][], and extended slightly. 

The goals of the project are as follows:

* Create a working version of the [J1][] processor (called the H2).
* Make a working toolchain for the processor.
* Create a [FORTH][] for the processor which can take its input either from a
  [UART][] or a USB keyboard and a [VGA][] adapter.

The H2 processor, like the [J1][], is a stack based processor that executes an
instruction set especially suited for [FORTH][].

The current target is the [Nexys3][] board, with a [Xilinx][] Spartan-6 XC6SLX16-CSG324
[FPGA][]. New boards will be targeted in the future, as this board is reaching its
end of life. The [VHDL][] is written in a generic way, with hardware components
being inferred instead of explicitly instantiated. This should make the code
fairly portable, although the interfaces to the [Nexys3][] board components are
specific to the peripherals on that board.

A video of the project in action, on the hardware, can be viewed here:
![Nexys-3 Board](https://raw.githubusercontent.com/howerj/howerj.github.io/master/h2/107.mp4)

And a lower quality version of the same video that should play automatically:

![Nexys-3 Board](https://raw.githubusercontent.com/howerj/howerj.github.io/master/h2/107.gif)

The SoC can also be simulated with a simulator written in C, as shown below:

![GUI Simulator](https://raw.githubusercontent.com/howerj/howerj.github.io/master/h2/sim2.gif)

The System Architecture is as follows:

![System Architecture](https://raw.githubusercontent.com/howerj/howerj.github.io/master/h2/system.svg)
![System Architecture](https://raw.githubusercontent.com/howerj/howerj.github.io/master/h2/system.png)

# License

The licenses used by the project are mixed and are on a per file basis. For my
code I use the [MIT][] license, so feel free to use it as you wish. The other
license used is the [LGPL][]; it is confined to single modules, so those could be
removed if you have some aversion to [LGPL][] code.

# Target Board

The only target board available at the moment is the [Nexys3][]. This should
change in the future, as the board is currently at its End Of Life. The next
boards I am looking to support are its successor, the Nexys 4, and the myStorm
BlackIce (<https://mystorm.uk/>). The myStorm board uses a completely open
source toolchain for synthesis, place and route and bit file generation.

# Build and Running requirements

The build has been tested under [Debian][] [Linux][], version 8.

You will require:

* [GCC][], or a suitable [C][] compiler capable of compiling [C99][]
* [Make][]
* [Xilinx ISE][] version 14.7
* [GHDL][]
* [GTKWave][]
* [tcl][] version 8.6
* Digilent Adept2 runtime and Digilent Adept2 utilities available at
  <http://store.digilentinc.com/digilent-adept-2-download-only/>
* [freeglut][] (for the GUI simulator only)
* [pandoc][] for building the documentation
* [picocom][] (or an alternative terminal client)

Hardware:

* VGA Monitor
* USB Keyboard (plugs into the Nexys3 USB to PS/2 bridge)
* [Nexys3][] development board

[Xilinx ISE][] can (or could be) downloaded for free, but requires
registration. ISE needs to be on your path:

        PATH=$PATH:/opt/Xilinx/14.7/ISE_DS/ISE/bin/lin64;
        PATH=$PATH:/opt/Xilinx/14.7/ISE_DS/ISE/lib/lin64;

# Building and Running

To make the [C][] based toolchain:

        make h2

To make a bit file that can be flashed to the target board:

        make simulation synthesis implementation bitfile

To upload the bitfile to the target board:

        make upload

To view the wave form generated by "make simulation":

        make viewer

The [C][] based CLI simulator can be invoked with:

        make run

Which will assemble the H2 Forth source file [h2.fth][], and run the assembled
object file under the H2 simulator with the debugger activated. A graphical
simulator can be run with:

        make gui-run

Which requires [freeglut][] as well as a [C][] compiler.

# Related Projects

The original [J1][] project is available at:

* <http://www.excamera.com/sphinx/fpga-j1.html>

The following project targets the original [J1][] core and provides an eForth
implementation (written using [Gforth][] for meta-compilation/cross
compilation to the [J1][] core). It also provides a simulator for the system
written in [C][].

* <https://github.com/samawati/j1eforth>


# Manual

The H2 processor and associated peripherals are subject to change, so the code
is the definitive source for which instructions are available, the register map,
and how the peripherals behave.

There are a few modifications to the [J1][] CPU which include:

* New instructions
* A CPU hold line which keeps the processor in the same state so long as it is
  high.
* Interrupt Service Routines have been added.
* Larger return and data stacks

The Interrupt Service Routines (ISR) have not been thoroughly tested and will be
subject to the most change.

### H2 CPU

The H2 CPU behaves very similarly to the [J1][] CPU, and the [J1 PDF][] can be
read in order to better understand this processor. The processor is 16-bit, with
instructions taking a single clock cycle. Most of the primitive Forth words can
also be executed in a single cycle; one notable exception is store ("!"),
which is split into two instructions.

The CPU has the following state within it:

* A 64 deep return stack (up from 32 in the original [J1][])
* A 65 deep variable stack (up from 33 in the original [J1][])
* A program counter
* An interrupt enable and interrupt request bit
* An interrupt address register

Loads and stores into the block RAM that holds the H2 program discard the
lowest bit; every other memory operation uses the lowest bit (such as jumps
and loads and stores to Input/Output peripherals). This is so applications can
use the lowest bit for character operations when accessing the program RAM.

The instruction set is decoded in the following manner:

        +---------------------------------------------------------------+
        | F | E | D | C | B | A | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
        +---------------------------------------------------------------+
        | 1 |                    LITERAL VALUE                          |
        +---------------------------------------------------------------+
        | 0 | 0 | 0 |            BRANCH TARGET ADDRESS                  |
        +---------------------------------------------------------------+
        | 0 | 0 | 1 |            CONDITIONAL BRANCH TARGET ADDRESS      |
        +---------------------------------------------------------------+
        | 0 | 1 | 0 |            CALL TARGET ADDRESS                    |
        +---------------------------------------------------------------+
        | 0 | 1 | 1 |   ALU OPERATION   |T2N|T2R|N2A|R2P| RSTACK| DSTACK|
        +---------------------------------------------------------------+
        | F | E | D | C | B | A | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
        +---------------------------------------------------------------+

        T   : Top of data stack
        N   : Next on data stack
        PC  : Program Counter

        LITERAL VALUES       : push a value onto the data stack
        CONDITIONAL BRANCHES : pop and test T
        CALLS                : push PC+1 onto the return stack

        T2N : Move T to N
        T2R : Move T to top of return stack
        N2A : STORE T to memory location addressed by N
        R2P : Move top of return stack to PC

        RSTACK and DSTACK are signed values (two's complement) that are
        the stack delta (the amount to increment or decrement the stack
        by for their respective stacks: return and data)
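
The decode diagram above can be sketched in C as follows. This is a minimal illustration derived only from the bit layout shown here, not code taken from the project's toolchain; the names `h2_classify`, `h2_delta` and `h2_alu_decode` are hypothetical.

```c
#include <stdint.h>

/* Instruction classes, following the decode diagram. */
typedef enum { H2_LITERAL, H2_BRANCH, H2_0BRANCH, H2_CALL, H2_ALU } h2_class_t;

/* Classify a 16-bit instruction by its top bits. */
static h2_class_t h2_classify(uint16_t op)
{
    if (op & 0x8000u)            /* bit F set: literal */
        return H2_LITERAL;
    switch ((op >> 13) & 0x3u) { /* bits E and D select the rest */
    case 0:  return H2_BRANCH;
    case 1:  return H2_0BRANCH;
    case 2:  return H2_CALL;
    default: return H2_ALU;
    }
}

/* Fields of an ALU instruction (illustrative names). */
typedef struct {
    unsigned alu;                /* bits C..8: ALU operation   */
    unsigned t2n, t2r, n2a, r2p; /* bits 7, 6, 5 and 4         */
    int rstack, dstack;          /* bits 3..2 and 1..0, signed */
} h2_alu_fields_t;

/* Sign-extend a 2-bit two's complement stack delta to -2..1. */
static int h2_delta(unsigned two_bits)
{
    two_bits &= 0x3u;
    return (two_bits & 0x2u) ? (int)two_bits - 4 : (int)two_bits;
}

/* Split an ALU instruction into its fields. */
static h2_alu_fields_t h2_alu_decode(uint16_t op)
{
    h2_alu_fields_t f;
    f.alu    = (op >> 8) & 0x1Fu;
    f.t2n    = (op >> 7) & 1u;
    f.t2r    = (op >> 6) & 1u;
    f.n2a    = (op >> 5) & 1u;
    f.r2p    = (op >> 4) & 1u;
    f.rstack = h2_delta((op >> 2) & 0x3u);
    f.dstack = h2_delta(op & 0x3u);
    return f;
}
```

For a literal the value is the low 15 bits (`op & 0x7FFF`), and for branches and calls the target is the low 13 bits (`op & 0x1FFF`).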

#### ALU operations


All ALU operations replace T:

| Value |   Operation    |     Description       |
|-------|----------------|-----------------------|
|   0   |       T        |  Top of Stack         |
|   1   |       N        |  Copy N to T          |
|   2   |     T + N      |  Addition             |
|   3   |     T & N      |  Bitwise AND          |
|   4   |     T or N     |  Bitwise OR           |
|   5   |     T ^ N      |  Bitwise XOR          |
|   6   |      ~T        |  Bitwise Inversion    |
|   7   |     T = N      |  Equality test        |
|   8   |     N < T      |  Signed comparison    |
|   9   |     N >> T     |  Logical Right Shift  |
|  10   |     T - 1      |  Decrement            |
|  11   |       R        |  Top of return stack  |
|  12   |      [T]       |  Load from address    |
|  13   |     N << T     |  Logical Left Shift   |
|  14   |     depth      |  Depth of stack       |
|  15   |     N u< T     |  Unsigned comparison  |
|  16   | set interrupts |  Enable interrupts    |
|  17   | interrupts on? |  Are interrupts on?   |
|  18   |     rdepth     |  Depth of return stk  |
|  19   |      0=        |  T == 0?              |
|  20   |     CPU ID     |  CPU Identifier       |
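
A rough C model of the operations in this table that depend only on T and N is given below. It is a sketch under stated assumptions, not the simulator's code: the function name `h2_alu` is hypothetical, comparisons are assumed to return a Forth-style true flag with all bits set, and shifts by 16 or more are assumed to yield zero. The hardware may behave differently in those corner cases.

```c
#include <stdint.h>

#define H2_TRUE 0xFFFFu /* assumed Forth-style true flag (all bits set) */

/* Evaluate an ALU operation that needs only T and N.
 * Returns the new 16-bit value of T, or -1 for operations that need
 * other CPU state (R, memory, stack depths, interrupts, CPU ID). */
static int32_t h2_alu(unsigned op, uint16_t t, uint16_t n)
{
    uint16_t r;
    switch (op) {
    case 0:  r = t; break;
    case 1:  r = n; break;
    case 2:  r = (uint16_t)(t + n); break;
    case 3:  r = t & n; break;
    case 4:  r = t | n; break;
    case 5:  r = t ^ n; break;
    case 6:  r = (uint16_t)~t; break;
    case 7:  r = (t == n) ? H2_TRUE : 0; break;
    /* Flipping the sign bit turns a signed compare into an unsigned one. */
    case 8:  r = ((n ^ 0x8000u) < (t ^ 0x8000u)) ? H2_TRUE : 0; break;
    case 9:  r = (t < 16) ? (uint16_t)(n >> t) : 0; break;
    case 10: r = (uint16_t)(t - 1u); break;
    case 13: r = (t < 16) ? (uint16_t)((uint32_t)n << t) : 0; break;
    case 15: r = (n < t) ? H2_TRUE : 0; break;
    case 19: r = (t == 0) ? H2_TRUE : 0; break;
    default: return -1; /* not modelled in this sketch */
    }
    return r;
}
```

For example, `h2_alu(8, 1, 0xFFFF)` treats N as -1 and so reports true, whereas the unsigned comparison `h2_alu(15, 1, 0xFFFF)` reports false.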


### Peripherals and registers

Registers prefixed with an 'o' are output registers, those with an 'i'
prefix are input registers. Registers are divided into an input and output
section of registers and the addresses of the input and output registers do not
correspond to each other in all cases. Unlike for RAM reads, the I/O registers
are indexed by word aligned addresses, without the lowest bit being discarded
(this should be fixed at a later date).

The following peripherals have been implemented in the [VHDL][] SoC to
interface with devices on the [Nexys3][] board:

* [VGA][] output device, text mode only, 80 by 40 characters from
  <http://www.javiervalcarce.eu/html/vhdl-vga80x40-en.html>
* Timer
* [UART][] (Rx/Tx) with a [FIFO][]
  from <https://github.com/pabennett/uart>
* [PS/2][] Keyboard
  from <https://eewiki.net/pages/viewpage.action?pageId=28279002>
* [LED][] next to a bank of switches
* An [8 Segment LED Display][] driver (a 7 segment display with a decimal point)

The SoC also features a limited set of interrupts that can be enabled or
disabled.

The output register map:

| Register    | Address | Description                     |
|-------------|---------|---------------------------------|
| oUart       | 0x4000  | UART register                   |
| oVT100      | 0x4002  | VT100 Terminal Write            |
| oLeds       | 0x4004  | LED outputs                     |
| oTimerCtrl  | 0x4006  | Timer control                   |
| oMemDout    | 0x4008  | Memory Data Output              |
| oMemControl | 0x400A  | Memory Control / Hi Address     |
| oMemAddrLow | 0x400C  | Memory Lo Address               |
| o7SegLED    | 0x400E  | 4 x LED 8 Segment display       |
| oIrcMask    | 0x4010  | CPU Interrupt Mask              |


The input registers:

| Register    | Address | Description                     |
|-------------|---------|---------------------------------|
| iUart       | 0x4000  | UART register                   |
| iVT100      | 0x4002  | Terminal status & PS/2 Keyboard |
| iSwitches   | 0x4004  | Buttons and switches            |
| iTimerDin   | 0x4006  | Current Timer Value             |
| iMemDin     | 0x4008  | Memory Data Input               |


The following description of the registers should be read in order and describe
how the peripherals work as well.

#### oUart

A UART with a fixed baud rate and format (115200, 8 bits, 1 stop bit) is
present on the SoC. The UART has a FIFO of depth 8 on both the RX and TX
channels. The control of the UART is split across oUart and iUart.

To write a value to the UART, assert TXWE along with putting the data in TXDO.
The FIFO state can be analyzed by looking at the iUart register.

To read a value from the UART, check iUart to see if data is present in the
FIFO. If it is, assert RXRE in the oUart register; on the next clock cycle the
data will be present in the iUart register.

The baud rate of the UART can be changed by rebuilding the VHDL project; bit
length, parity bits and stop bits can only be changed with modifications to
[uart.vhd][].

        +-------------------------------------------------------------------------------+
        | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
        +-------------------------------------------------------------------------------+
        |  X |  X |TXWE|  X |  X |RXRE|  X |  X |               TXDO                    |
        +-------------------------------------------------------------------------------+

        TXWE: UART TX Write Enable
        RXRE: UART RX Read Enable
        TXDO: UART TX Data Output
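
As an illustration of the layout above, the words written to oUart could be composed like this. The macro and function names are hypothetical, and how the word actually reaches the register (a store to the oUart address) is left out of the sketch.

```c
#include <stdint.h>

#define UART_TXWE (1u << 13) /* TX write enable, bit 13 */
#define UART_RXRE (1u << 10) /* RX read enable, bit 10  */

/* Word to write to oUart in order to queue one byte for transmission. */
static uint16_t uart_tx_word(uint8_t byte)
{
    return (uint16_t)(UART_TXWE | byte);
}

/* Word to write to oUart in order to request the next received byte. */
static uint16_t uart_rx_request(void)
{
    return (uint16_t)UART_RXRE;
}
```

Sending the character 'A' (0x41) would therefore mean writing 0x2041 to oUart.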

#### oVT100

The VGA Text device emulates a terminal which the user can talk to by writing
to the oVT100 register. It supports a subset of the [VT100][] terminal
functionality. The interface behaves much like writing to a UART with the same
busy and control signals. The input is taken from a [PS/2][] keyboard available
on the board, this behaves like the RX mechanism of the UART.

        +-------------------------------------------------------------------------------+
        | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
        +-------------------------------------------------------------------------------+
        |  X |  X |TXWE|  X |  X |RXRE|  X |  X |               TXDO                    |
        +-------------------------------------------------------------------------------+

        TXWE: VT100 TX Write Enable
        RXRE: VT100 RX Read Enable
        TXDO: VT100 TX Data Output

#### oLeds

On the [Nexys3][] board there is a bank of LEDs that are situated next to the
switches. These LEDs can be turned on (1) or off (0) by writing to LEDO. Each
LED here corresponds to the switch it is next to.

        +-------------------------------------------------------------------------------+
        | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
        +-------------------------------------------------------------------------------+
        |  X |  X |  X |  X |  X |  X |  X |  X |              LEDO                     |
        +-------------------------------------------------------------------------------+

        LEDO: LED Output

#### oTimerCtrl

The timer is controllable by the oTimerCtrl register. It is a 13-bit timer
running at 100MHz; it can optionally generate interrupts, and the timer's
current internal count can be read back in with the iTimerDin register.

The timer counts once the TE bit is asserted. Once the timer reaches the TCMP
value it wraps around and can optionally generate an interrupt if INTE is asserted.
This also toggles the Q and NQ lines that come out of the timer and are routed
to pins on the board (see the constraints file [top.ucf][] for the pins).

The timer can be reset by writing to RST.

        +-------------------------------------------------------------------------------+
        | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
        +-------------------------------------------------------------------------------+
        | TE | RST|INTE|                      TCMP                                      |
        +-------------------------------------------------------------------------------+

        TE:   Timer Enable
        RST:  Timer Reset
        INTE: Interrupt Enable
        TCMP: Timer Compare Value
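
A small sketch of building an oTimerCtrl value from the fields above follows. The function name `timer_ctrl` is hypothetical and the code only reflects the bit positions in the diagram.

```c
#include <stdint.h>

/* Compose an oTimerCtrl word: TE is bit 15, RST bit 14, INTE bit 13,
 * and TCMP occupies the low 13 bits. Extra compare bits are dropped. */
static uint16_t timer_ctrl(int enable, int reset, int irq, uint16_t compare)
{
    return (uint16_t)((enable ? 0x8000u : 0u)
                    | (reset  ? 0x4000u : 0u)
                    | (irq    ? 0x2000u : 0u)
                    | (compare & 0x1FFFu));
}
```

With a 13-bit compare value the largest TCMP is 8191, which at 100MHz corresponds to a wrap period of roughly 82 microseconds.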


#### oIrcMask

The H2 core has a mechanism for interrupts; interrupts have to be enabled or
disabled with an instruction. Each interrupt can also be individually masked
with a bit in IMSK. A '1' in a bit of IMSK enables that specific interrupt,
which will be delivered to the CPU if interrupts are enabled within it.

        +-------------------------------------------------------------------------------+
        | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
        +-------------------------------------------------------------------------------+
        |  X |  X |  X |  X |  X |  X |  X |  X |                 IMSK                  |
        +-------------------------------------------------------------------------------+

        IMSK: Interrupt Mask

#### oMemDout

Data to be output to the selected address when write enable (WE) is issued in
oMemControl.

        +-------------------------------------------------------------------------------+
        | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
        +-------------------------------------------------------------------------------+
        |                           Data Output                                         |
        +-------------------------------------------------------------------------------+

#### oMemControl

This register contains the control bits for the onboard memory on the
[Nexys3][] board. The board contains three memory devices, two non-volatile
memory devices and a volatile RAM based device. The two devices with a simple
SRAM interface (one volatile, an M45W8MW16, and one non-volatile, an
NP8P128A13T1760E) are both accessible. The third is an SPI based memory device
(an NP5Q128A13ESFC0E) and is currently not accessible.

        +-------------------------------------------------------------------------------+
        | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
        +-------------------------------------------------------------------------------+
        | OE | WE | RST|WAIT| RCS| FCS|                 Address Hi                      |
        +-------------------------------------------------------------------------------+

        OE:  Output Enable - enable reading from current address into iMemDin
        WE:  Write Enable  - enable writing oMemDout into ram at current address
        RST: Reset the Flash memory controller
        RCS: RAM Chip Select, Enable Volatile Memory
        FCS: Flash Chip Select, Enable Non-Volatile Memory
        Address Hi: High Bits of RAM address

OE and WE are mutually exclusive; if both are set then there is no effect.

The memory controller is in active development, and the interface to it might
change.

#### oMemAddrLow

This is the lower address bits of the RAM.

        +-------------------------------------------------------------------------------+
        | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
        +-------------------------------------------------------------------------------+
        |                           Address Lo                                          |
        +-------------------------------------------------------------------------------+
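
The split of an external memory address across oMemControl and oMemAddrLow might be sketched as below. This assumes that "Address Hi" holds the address bits above bit 15, giving a 26-bit address; that reading follows from the field widths but is not stated explicitly here, and the names are hypothetical.

```c
#include <stdint.h>

/* Control flags of oMemControl, taken from the diagram. */
#define MEM_OE   0x8000u /* output enable */
#define MEM_WE   0x4000u /* write enable  */
#define MEM_RST  0x2000u /* reset flash memory controller */
#define MEM_WAIT 0x1000u /* present in the diagram, not described in the text */
#define MEM_RCS  0x0800u /* RAM chip select   */
#define MEM_FCS  0x0400u /* flash chip select */

/* Value for oMemControl: flags plus the assumed upper 10 address bits. */
static uint16_t mem_control(uint16_t flags, uint32_t address)
{
    return (uint16_t)((flags & 0xFC00u) | ((address >> 16) & 0x03FFu));
}

/* Value for oMemAddrLow: the lower 16 address bits. */
static uint16_t mem_addr_low(uint32_t address)
{
    return (uint16_t)(address & 0xFFFFu);
}
```

Reading from RAM address 0x12345 would then use `mem_addr_low(0x12345)` for oMemAddrLow and `mem_control(MEM_OE | MEM_RCS, 0x12345)` for oMemControl, with the result appearing in iMemDin.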

#### o7SegLED

On the [Nexys3][] board there is a bank of 7 segment displays, with a dot
(8-segment really), which can be used for numeric output. The LED segments
cannot be directly addressed. Instead the value stored in each L7SD field is
mapped to a hexadecimal display value (or a BCD value, but this requires
regeneration of the SoC and modification of a generic in the VHDL).

The value '0' corresponds to a zero displayed on the LED segment, '15' to an
'F', etcetera.

There are 4 displays in a row.

        +-------------------------------------------------------------------------------+
        | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
        +-------------------------------------------------------------------------------+
        |      L7SD0        |       L7SD1       |       L7SD2       |       L7SD3       |
        +-------------------------------------------------------------------------------+

        L7SD0: LED 7 Segment Display (leftmost display)
        L7SD1: LED 7 Segment Display
        L7SD2: LED 7 Segment Display
        L7SD3: LED 7 Segment Display (right most display)
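
Packing four digits into the register can be sketched as follows; `seg7_pack` is a hypothetical helper that only mirrors the field layout above.

```c
#include <stdint.h>

/* Pack four 4-bit digit values into an o7SegLED word.
 * d0 is the leftmost display (bits 15..12), d3 the rightmost (bits 3..0). */
static uint16_t seg7_pack(unsigned d0, unsigned d1, unsigned d2, unsigned d3)
{
    return (uint16_t)(((d0 & 0xFu) << 12) | ((d1 & 0xFu) << 8)
                    | ((d2 & 0xFu) << 4)  |  (d3 & 0xFu));
}
```

Because each field is one hexadecimal digit, writing a 16-bit value directly shows it in hex: `seg7_pack(1, 2, 3, 15)` gives 0x123F, which would display as "123F".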

#### iUart

The iUart register works in conjunction with the oUart register. The status of
the FIFO that buffers both transmission and reception of bytes is available in
the iUart register, as well as any received bytes.

        +-------------------------------------------------------------------------------+
        | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
        +-------------------------------------------------------------------------------+
        |  X |  X |  X |TFFL|TFEM|  X |RFFL|RFEM|                RXDI                   |
        +-------------------------------------------------------------------------------+

        TFFL: UART TX FIFO Full
        TFEM: UART TX FIFO Empty
        RFFL: UART RX FIFO Full
        RFEM: UART RX FIFO Empty
        RXDI: UART RX Data Input
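
Decoding a value read from iUart could look like the sketch below. The function names are hypothetical and only the bit positions come from the diagram.

```c
#include <stdint.h>

/* Status flags in a value read from iUart. */
static int uart_tx_full(uint16_t status)  { return (status >> 12) & 1; }
static int uart_tx_empty(uint16_t status) { return (status >> 11) & 1; }
static int uart_rx_full(uint16_t status)  { return (status >> 9) & 1; }
static int uart_rx_empty(uint16_t status) { return (status >> 8) & 1; }

/* Received byte, valid after RXRE has been asserted in oUart. */
static uint8_t uart_rx_data(uint16_t status)
{
    return (uint8_t)(status & 0xFFu);
}
```

A transmit routine would typically wait until `uart_tx_full` reports 0 before writing to oUart, and a receive routine would check that `uart_rx_empty` reports 0 before asserting RXRE.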

#### iVT100

The iVT100 register works in conjunction with the oVT100 register. The status of
the FIFO that buffers both transmission and reception of bytes is available in
the iVT100 register, as well as any received bytes. It works the same as the
iUart/oUart registers.

        +-------------------------------------------------------------------------------+
        | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
        +-------------------------------------------------------------------------------+
        |  X |  X |  X |TFFL|TFEM|  X |RFFL|RFEM|  0 |           ACHR                   |
        +-------------------------------------------------------------------------------+

        TFFL: VGA VT100 TX FIFO Full
        TFEM: VGA VT100 TX FIFO Empty
        RFFL: PS2 VT100 RX FIFO Full
        RFEM: PS2 VT100 RX FIFO Empty
        ACHR: New character available on PS2 Keyboard

#### iTimerDin

This register contains the current value of the timer's counter.

        +-------------------------------------------------------------------------------+
        | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
        +-------------------------------------------------------------------------------+
        |  X |  X |  X |                       TCNT                                     |
        +-------------------------------------------------------------------------------+

        TCNT: Timer Counter Value

#### iSwitches

iSwitches contains input lines from multiple sources. The buttons
(BUP, BDWN, BLFT, BRGH, and BCNT) correspond to a [D-Pad][] on the [Nexys3][]
board. The switches (TSWI) are the ones mentioned in oLeds; each has an LED
next to it.

The switches and the buttons are already debounced in hardware so they do not
have to be further processed once read in from these registers.

        +-------------------------------------------------------------------------------+
        | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
        +-------------------------------------------------------------------------------+
        |  X |  X |  X | BUP|BDWN|BLFT|BRGH|BCNT|               TSWI                    |
        +-------------------------------------------------------------------------------+

        BUP:  Button Up
        BDWN: Button Down
        BLFT: Button Left
        BRGH: Button Right
        BCNT: Button Center
        TSWI: Two Position Switches
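
The fields of iSwitches can be picked apart as in this sketch. The mask names and helper functions are hypothetical; the bit positions are taken from the diagram.

```c
#include <stdint.h>

/* Button masks within a value read from iSwitches. */
enum {
    BTN_UP     = 1u << 12,
    BTN_DOWN   = 1u << 11,
    BTN_LEFT   = 1u << 10,
    BTN_RIGHT  = 1u << 9,
    BTN_CENTER = 1u << 8
};

/* Non-zero if the given button is pressed. */
static int button_pressed(uint16_t value, unsigned button_mask)
{
    return (value & button_mask) != 0;
}

/* State of the eight two position switches (TSWI). */
static uint8_t switch_state(uint16_t value)
{
    return (uint8_t)(value & 0xFFu);
}
```

Since each switch sits next to its LED, writing `switch_state(value)` back to oLeds would mirror the switches on the LEDs.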

#### iMemDin

Memory input, either from the SRAM or Flash, indexed by oMemControl and
oMemAddrLow. When reading from flash this might actually be status information
or information from the query table.

        +-------------------------------------------------------------------------------+
        | 15 | 14 | 13 | 12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
        +-------------------------------------------------------------------------------+
        |                           Data Input                                          |
        +-------------------------------------------------------------------------------+


### Interrupt Service Routines

The following interrupt service routines are defined:

|       Name        | Number |         Description           |
|-------------------|--------|-------------------------------|
| isrNone           |   0    | Not used                      |
| isrRxFifoNotEmpty |   1    | UART RX FIFO Is Not Empty     |
| isrRxFifoFull     |   2    | UART RX FIFO Is Full          |
| isrTxFifoNotEmpty |   3    | UART TX FIFO Is Not Empty     |
| isrTxFifoFull     |   4    | UART TX FIFO Is Full          |
| isrKbdNew         |   5    | New PS/2 Keyboard Character   |
| isrTimer          |   6    | Timer Counter                 |
| isrDPadButton     |   7    | Any D-Pad Button Change State |


When an interrupt occurs, and interrupts are enabled within the processor, a
call to a location in memory is performed; the location is the same as the ISR
number. For example, an ISR with a number of '4' will perform a call (not a
jump) to the location '4' within memory.

Interrupts have a latency of at least 4-5 cycles before they are acted on: there
is a two to three cycle delay in the interrupt request handler, then the call
to the ISR location in memory has to be done, then the call to the word that
implements the ISR itself.

If two interrupts occur at the same time they are processed from the lowest
interrupt number to the highest.

Interrupts are lost when an interrupt with the same number occurs that has not
been processed.
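
The priority rule can be illustrated with a short C sketch. It assumes that bit n of the pending set and of the oIrcMask value corresponds to ISR number n, which is not stated explicitly in this document, and `isr_next` is a hypothetical name.

```c
#include <stdint.h>

/* ISR numbers from the table above. */
enum {
    isrNone, isrRxFifoNotEmpty, isrRxFifoFull, isrTxFifoNotEmpty,
    isrTxFifoFull, isrKbdNew, isrTimer, isrDPadButton
};

/* Pick the next interrupt to service: the lowest numbered interrupt that is
 * both pending and enabled in the mask. Returns -1 if there is none.
 * The returned number is also the address that would be called. */
static int isr_next(uint8_t pending, uint8_t mask)
{
    uint8_t active = pending & mask;
    for (int n = 0; n < 8; n++)
        if (active & (1u << n))
            return n;
    return -1;
}
```

If `isrTxFifoFull` and `isrTimer` are pending together with every interrupt enabled, `isr_next` returns 4, so location 4 is called first.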

# The Toolchain

The Assembler, Disassembler and [C][] based simulator for the H2 are in a single
program (see [h2.c][]). This simulator complements the [VHDL][] test bench
[tb.vhd][] and is not a replacement for it.

To build it a [C][] compiler is needed, the build target "h2" will build the
executable:

        make h2

And it can be run on the source file [h2.fth][] with the make target:

        make run

The make file is not strictly needed; the tool can be built and run directly:

        Linux:

        cc -std=c99 h2.c -o h2  # To build the h2 executable
        ./h2 -h                 # For a list of options
        ./h2 -T -R h2.fth       # Assemble h2.fth and run it

        Windows:

        gcc -std=c99 h2.c -o h2.exe # Builds the h2.exe executable
        h2.exe -h                   # For a list of options
        h2.exe -T -R h2.fth         # Assemble h2.fth and run it

A list of command line options available:

        -       stop processing options, following arguments are files
        -h      print a help message and exit
        -v      increase logging level
        -d      disassemble input files (default)
        -D      full disassembly of input files
        -T      Enter debug mode when running simulation
        -a      assemble file
        -r      run hex file
        -R      assemble file then run it
        -L #    load symbol file
        -S #    save symbols to file
        -s #    number of steps to run simulation (0 = forever)
        -n #    specify NVRAM block file (default is nvram.blk)
        file*   file to process

This program is released under the [MIT][] license, feel free to use it and
modify it as you please. With minimal modification it should be able to
assemble programs for the original [J1][] core.

## Assembler

The assembler is actually a compiler for a pseudo Forth like language with a
fixed grammar. It is a much more restricted language than Forth and cannot be
extended within itself like Forth can.

The main program can be found in [h2.fth][], which is still currently in
testing.

The assembler/compiler reads in a text file containing a program and produces a
hex file which can be read in by the simulator, disassembler, the VHDL test
bench or read in by the [Xilinx ISE][] toolchain when it generates the bit file
for the [Spartan 6][] on the [Nexys3][] board.

A rough [EBNF][] grammar for the language is as follows:

        Program     := Statement* EOF
        Statement   :=   Label | Branch | 0Branch | Call | Literal | Instruction
                       | Identifier | Constant | Variable | Location | Definition | If
                       | Begin | Char | Set | Pc | Pwd | Break | Mode | String | BuiltIn
        Label       := Identifier ";"
        Branch      := "branch"  ( Identifier | Literal )
        0Branch     := "0branch" ( Identifier | Literal )
        Call        := "call"    ( Identifier | Literal )
        Set         := ".set"    ( Identifier | Literal | String ) ( Identifier | Literal | String )
        Pc          := ".pc"     ( Identifier | Literal )
        Pwd         := ".pwd"    ( Identifier | Literal )
        Break       := ".break"
        BuiltIn     := ".built-in"
        Mode        := ".mode"      Literal
        Allocate    := ".allocate" ( Identifier | Literal )
        Constant    := "constant" Identifier Literal "hidden"?
        Variable    := "variable" Identifier ( Literal | String ) "hidden"?
        Location    := "Location" Identifier ( Literal | String )
        Instruction := "@" | "store" | "exit" | ...
        Definition  := ":" ( Identifier | String ) Statement* ";" ( "hidden" | "immediate" | "inline" )?
        If          := "if" Statement* [ "else" ] Statement* "then"
        Begin       := "begin" Statement* ("until" | "again" | "while" Statement* "repeat")
        For         := "for"   Statement* ("aft" Statement* "then" Statement* | "next")
        Literal     := [ "-" ] Number
        String      := '"' SChar* '"'
        Char        := "[char]" ASCII ","
        Number      := Hex | Decimal
        Decimal     := "0" ... "9" ("0" ... "9")*
        Hex         := "$" HexDigit HexDigit*
        HexDigit    := ( "0" ... "9" | "a" ... "f" | "A" ... "F" )
        SChar       := Any character except quote

Literals have higher priority than Identifiers, and comments are '\'
until a new line is encountered, or '(' until a ')' is encountered.
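
To make the `Literal` rule concrete, here is a minimal C sketch of a recognizer for it. It is not the lexer from the toolchain: the name `parse_literal` is hypothetical, hexadecimal digits are assumed to include "0" to "9", and values are assumed to wrap silently to 16 bits, where the real assembler may well report an error instead.

```c
#include <ctype.h>
#include <stdint.h>

/* Recognize Literal := [ "-" ] ( "$" HexDigit+ | DecimalDigit+ ).
 * Returns the 16-bit value (0..65535), or -1 if `s` is not a literal. */
static int32_t parse_literal(const char *s)
{
    int negative = 0;
    uint32_t base = 10, value = 0;

    if (*s == '-') { negative = 1; s++; }
    if (*s == '$') { base = 16; s++; }
    if (*s == '\0')
        return -1;                      /* at least one digit is required */

    for (; *s; s++) {
        unsigned char c = (unsigned char)*s;
        uint32_t digit;
        if (isdigit(c))
            digit = (uint32_t)(c - '0');
        else if (base == 16 && isxdigit(c))
            digit = (uint32_t)(tolower(c) - 'a') + 10u;
        else
            return -1;                  /* anything else is an identifier */
        value = (value * base + digit) & 0xFFFFu; /* assumed 16-bit wrap */
    }
    if (negative)
        value = (0x10000u - value) & 0xFFFFu;     /* two's complement */
    return (int32_t)value;
}
```

A token such as `2+` fails this check and would therefore be treated as an identifier, while `-1` becomes 0xFFFF.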

The grammar allows for nested word definitions, however state is held in the
lexer to prevent this.

The assembler has the following directives:

        .pc        Set the program counter 
        .pwd       Set the previous word pointer
        .allocate  Increment the program counter
        .set       Set location in memory
        .mode      Change compiler mode
        .built-in  Assemble built-in words here


There are several optimizations that can be performed, the ".mode" directive
controls whether they are active, along with controlling whether word
definitions are compiled with their headers or not. Optimizations performed
include merging a call to exit with the previous instruction if it is possible
to do so and performing tail call optimization where possible.

The built in words, with their instruction encodings:


| Word   | 0 | 1 | 1 |   ALU OPERATION   |T2N|T2R|N2A|R2P| RSTACK| DSTACK|
|--------|---|---|---|-------------------|---|---|---|---|-------|-------|
| dup    | 0 | 1 | 1 |       T           | X |   |   |   |       |  +1   |
| over   | 0 | 1 | 1 |       N           | X |   |   |   |       |  +1   |
| invert | 0 | 1 | 1 |       ~ T         |   |   |   |   |       |       |
| +      | 0 | 1 | 1 |       T + N       |   |   |   |   |       |  -1   |
| swap   | 0 | 1 | 1 |       N           | X |   |   |   |       |       |
| nip    | 0 | 1 | 1 |       T           |   |   |   |   |       |  -1   |
| drop   | 0 | 1 | 1 |       N           |   |   |   |   |       |  -1   |
| exit   | 0 | 1 | 1 |       T           |   |   |   | X |   -1  |       |
| &gt;r  | 0 | 1 | 1 |       N           |   | X |   |   |   +1  |  -1   |
| r&gt;  | 0 | 1 | 1 |       R           | X |   |   |   |   -1  |  +1   |
| r@     | 0 | 1 | 1 |       R           | X |   |   |   |       |  +1   |
| @      | 0 | 1 | 1 |       [T]         |   |   |   |   |       |       |
| store  | 0 | 1 | 1 |       N           |   |   | X |   |       |  -1   |
| rshift | 0 | 1 | 1 |       N >> T      |   |   |   |   |       |  -1   |
| lshift | 0 | 1 | 1 |       N << T      |   |   |   |   |       |  -1   |
| =      | 0 | 1 | 1 |       N = T       |   |   |   |   |       |  -1   |
| u<     | 0 | 1 | 1 |       N u< T      |   |   |   |   |       |  -1   |
| <      | 0 | 1 | 1 |       N < T       |   |   |   |   |       |  -1   |
| and    | 0 | 1 | 1 |       T & N       |   |   |   |   |       |  -1   |
| xor    | 0 | 1 | 1 |       T ^ N       |   |   |   |   |       |  -1   |
| or     | 0 | 1 | 1 |       T or N      |   |   |   |   |       |  -1   |
| depth  | 0 | 1 | 1 |       depth       |   |   |   |   |       |  +1   |
| 1-     | 0 | 1 | 1 |       T - 1       |   |   |   |   |       |       |
| seti   | 0 | 1 | 1 |   set interrupts  |   |   |   |   |       |  -1   |
| iset?  | 0 | 1 | 1 |   interrupts on?  |   |   |   |   |       |  +1   |
| rdepth | 0 | 1 | 1 |      rdepth       |   |   |   |   |       |  +1   |
| 0=     | 0 | 1 | 1 |        0=         |   |   |   |   |       |       |
| up1    | 0 | 1 | 1 |        T          |   |   |   |   |       |  +1   |
| nop    | 0 | 1 | 1 |        T          |   |   |   |   |       |       |
| cpu-id | 0 | 1 | 1 |      CPU ID       |   |   |   |   |       |  +1   |
| rdrop  | 0 | 1 | 1 |        T          |   |   |   |   |  -1   |       |
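
As a rough sketch of how the columns above could be unpacked in C, the
following assumes the fields are packed in table order into a 16-bit word:
three prefix bits, a five bit ALU operation, four single bit flags and two
two-bit stack deltas stored in two's complement. The field widths and the
`alu_fields_t` and `alu_decode` names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
	unsigned alu;       /* ALU operation number */
	bool t2n, t2r, n2a, r2p;
	int rstack, dstack; /* change applied to each stack pointer */
} alu_fields_t;

/* Sign extend a two bit field: 0 -> 0, 1 -> +1, 2 -> -2, 3 -> -1 */
static int stack_delta(unsigned bits)
{
	bits &= 3u;
	return (bits & 2u) ? (int)bits - 4 : (int)bits;
}

static alu_fields_t alu_decode(uint16_t instruction)
{
	const unsigned i = instruction;
	assert((i & 0xE000u) == 0x6000u); /* must start with 0 1 1 */
	const alu_fields_t f = {
		.alu    = (i >> 8) & 0x1Fu,
		.t2n    = (i >> 7) & 1u,
		.t2r    = (i >> 6) & 1u,
		.n2a    = (i >> 5) & 1u,
		.r2p    = (i >> 4) & 1u,
		.rstack = stack_delta(i >> 2),
		.dstack = stack_delta(i),
	};
	return f;
}
```

Under these assumptions a word with only R2P set and an RSTACK field of -1
decodes to the flags shown in the "exit" row.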


The language used in the assembler is Forth like, the best example of how to
use it is in the file "h2.fth", which contains a working Forth interpreter and
many Forth definitions. New words can be defined in the usual manner:

        : 2+ 2 + ;
        : ?dup dup if dup then ;

Control structures are handled by the parser, so a mismatch causes it to
terminate with an error: each ":" must have a corresponding ";", an "if" must
be followed by either an "else" and then a "then", or by just a "then", and so
on.

Variables and constants can also be defined, but the grammar is slightly
different to how it works in a normal Forth:

        variable x 55
        constant y 20

Constants take up no space unless they are used, whereas variables are
allocated a location and set to an initial value. The above example creates a
variable 'x' and sets the variable to '55'. It also adds a constant 'y' to the
current symbol table, which can be used in other function definitions.

Code that does not appear within a word definition is assembled at that
location.

The following control structures are available:

        T = value to consume off the top of the stack
        A = First clause
        B = Second clause
        C = Third clause

        T if     A    else    B then           If T != 0 execute A else execute B
        T if     A    then                     If T != 0 execute A
          begin  A T  until                    Execute A until T != 0
          begin  A    again                    Infinite loop, execute A
          begin  A T  while   B repeat         Execute A, if T = 0 exit loop, else execute B
        T for    A    next                     Execute loop T times (stores
                                               loop parameter on the return stack)
        T for    A    aft     B then C next    Execute loop T times, skip B on first loop
          label: A    branch  label            Branch to label
          label: A T  0branch label            Branch to label if T = 0

Unlike in a normal Forth environment, these control structures can be used
outside of function definitions. They must also be matched up correctly,
otherwise a syntax error will be raised.

## Disassembler

The disassembler takes a text file containing the assembled program, which
consists of 16-bit hexadecimal numbers. It then attempts to disassemble the
instructions. It can also be fed a symbols file, which can be generated by the
assembler, and will then attempt to resolve the locations that jumps and calls
point to.

The disassembler is used by a [tcl][] script called by [GTKwave][], it
turns the instruction trace of the H2 from a series of numbers into the
instructions and branch destinations that they represent. This makes debugging
the VHDL much easier.

![H2 Disassembly Results](https://raw.githubusercontent.com/howerj/howerj.github.io/master/h2/forth-cpu-wave.png "Disassembled Instructions in GTKWave")

The purple trace shows the disassembled instructions.

## Simulator

The simulator in C implements the H2 core and most of the SoC. The IO for the
simulator is not cycle accurate (and most likely will never be), but can be
used for running and debugging programs with results that are very similar to
how the hardware behaves. This is much faster than rebuilding the bit file used
to flash the [FPGA][].

## Debugger

The simulator also includes a debugger, which is designed to be similar to the
[DEBUG.COM][] program available in [DOS][]. The debugger can be used to
disassemble sections of memory, inspect the status of the peripherals and dump
sections of memory to the screen. It can also be used to set breakpoints,
single step and run through the code until a breakpoint is hit.

To run the debugger either a hex file or a source file must be given:

        # -T turns debugging mode on
        ./h2 -T -r file.hex  # Run simulator
        ./h2 -T -R file.fth  # Assemble and run some code

Both modes of operation can be augmented with a symbols file, which lists where
variables, labels and functions are located with the assembled core.

When the "-T" option is given debug mode will be entered before the simulation
is executed. A prompt should appear and the command line should look like this:

        $ ./h2 -T -R h2.fth
        Debugger running, type 'h' for a list of command
        debug>

Breakpoints can be set either symbolically or by program location, using the
'b' command.

Numbers can be entered in octal (prefix the number with '0'), hexadecimal
(prefix with '0x') or in decimal. As an example, the following three debug
commands all set a breakpoint at the same location:

        debug> b 16
        debug> b 0x10
        debug> b 020
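
The prefix rules described above match what the standard C function `strtol`
does when it is given a base of zero. A minimal sketch follows; whether the
debugger itself parses numbers this way is an assumption, and `parse_number`
is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse an octal ("020"), hexadecimal ("0x10") or decimal ("16") number.
 * Returns the value, or -1 if the text is not a non-negative number. */
static long parse_number(const char *s)
{
	assert(s);
	char *end = NULL;
	errno = 0;
	const long value = strtol(s, &end, 0); /* base 0: the prefix decides */
	if(errno != 0 || end == s || *end != '\0' || value < 0)
		return -1;
	return value;
}
```

A caller could fall back to a symbol table lookup when `parse_number` returns
-1, which would cover symbolic breakpoints such as "key?".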

'k' can be used to list the current break points that are set:

        debug> k
                0x0010

This sets a breakpoint when the function "key?" is called:

        debug> b key?

Functions and labels can both be halted on, this requires either a
symbols file to be specified on the command line or assemble and run
to be used on a source file, not a hex file. Symbol files can be used
on source or on hex files.

To single step the 's' command can be given, although not much will happen if
tracing is turned off (tracing is off by default). Tracing can be toggled on or
off with the 't' command:

        debug> s
        debug> s
        debug> t
        trace on
        debug> s
        0001: pc(089a) inst(4889) sp(0) rp(0) tos(0000) r(0000) call 889 init
        debug> s
        0002: pc(0889) inst(807a) sp(0) rp(1) tos(0000) r(089b) 7a
        debug> s
        0003: pc(088a) inst(e004) sp(1) rp(1) tos(007a) r(089b) 6004
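
The instruction words in this trace can be classified with a few bit tests.
The sketch below assumes that a set top bit marks a literal and that the next
two bits select branch, conditional branch, call or ALU instruction; this
agrees with the three trace lines above (0x4889 is shown as "call 889", while
0x807a and 0xe004 print as the values 7a and 6004), but the branch prefixes
and all of the names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
	INST_LITERAL, INST_BRANCH, INST_0BRANCH, INST_CALL, INST_ALU
} inst_type_e;

static inst_type_e inst_type(uint16_t instruction)
{
	if(instruction & 0x8000u)
		return INST_LITERAL;
	switch((instruction >> 13) & 3u) {
	case 0:  return INST_BRANCH;
	case 1:  return INST_0BRANCH;
	case 2:  return INST_CALL;
	default: return INST_ALU;
	}
}

/* Literal value (15 bits) or the lower 13 bits of any other instruction */
static unsigned inst_operand(uint16_t instruction)
{
	const unsigned mask =
		inst_type(instruction) == INST_LITERAL ? 0x7FFFu : 0x1FFFu;
	return instruction & mask;
}

static const char *inst_name(inst_type_e type)
{
	static const char *names[] = {
		"literal", "branch", "0branch", "call", "alu"
	};
	assert((size_t)type < sizeof names / sizeof names[0]);
	return names[type];
}
```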

It is advisable to turn tracing off before issuing the 'c', or continue,
command.

The '.' command can be used to display the H2 core's internal state:

        debug> .
        Return Stack:
        0000: 0000 08aa 0883 017b 0000 031b 0000 ffb0 0000 02eb ffb5 0210 0167 0167
        0167 0167
        0010: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
        0000 0000

        Variable Stack:
        tos:  0000
        0001: 0000 0000 0000 0001 0004 0005 0000 ffb0 0000 0000 0000 0000 0000 0000
        0000 0000
        0011: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
        0000 0000

        pc:   0538
        rp:   0001
        dp:   0000
        ie:   false

And the 'p' command can be used to display the state of the simulated
peripherals:

        debug> p
        LEDS:          00
        VGA Cursor:    0005
        VGA Control:   007a
        Timer Control: 8032
        Timer:         001b
        IRC Mask:      0000
        UART Input:    6c
        LED 7seg:      0005
        Switches:      00
        LFSR:          40ba
        Waiting:       false

For a complete list of commands, use the 'h' command.

Other ways to enter debug mode include putting the ".break" assembler directive
into the source code (this only works if the assemble and run command is used
on source files, not on hex files), and hitting the escape character when the
simulator is trying to read data via the simulated UART or PS/2 keyboard (the
escape will still be passed onto the simulator, but it also activates debug
mode).

## Graphical simulator

A separate graphical simulator can also be compiled; it has been tested under
[Linux][] and [Windows][]. It simulates the [Nexys3][] board peripherals that
the SoC interfaces with but, unlike the command line utility, provides a
graphical environment. This makes it easier to interact with the device and
see what it is doing, although debugging sessions are less controlled. It
requires [free glut][].

* VGA output works
* UART or PS/2 input (selectable by pressing F11) comes from typing in the screen,
and in the case of the UART this is buffered with a FIFO.
* UART output gets written to a display box.
* There are four 7-Segment displays as on the original board.
* The switches and push buttons can take their input from either keyboard keys
or from mouse clicks.
* The LED indicators above the switches can be lit up.
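
The FIFO that buffers typed characters for the simulated UART could look
something like the ring buffer below. This is a sketch only: the depth of 16
bytes and the names `fifo_t`, `fifo_push` and `fifo_pop` are hypothetical and
are not taken from the simulator sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_SIZE 16u /* hypothetical depth */

typedef struct {
	uint8_t data[FIFO_SIZE];
	size_t head;  /* next slot to write */
	size_t tail;  /* next slot to read */
	size_t count; /* bytes currently stored */
} fifo_t;

/* Called for each key press; returns false if the FIFO is full */
static bool fifo_push(fifo_t *f, uint8_t c)
{
	assert(f);
	if(f->count == FIFO_SIZE)
		return false;
	f->data[f->head] = c;
	f->head = (f->head + 1u) % FIFO_SIZE;
	f->count++;
	return true;
}

/* Called when the core reads the UART; returns -1 if the FIFO is empty */
static int fifo_pop(fifo_t *f)
{
	assert(f);
	if(f->count == 0)
		return -1;
	const uint8_t c = f->data[f->tail];
	f->tail = (f->tail + 1u) % FIFO_SIZE;
	f->count--;
	return c;
}
```

Characters come out in the order they were typed, and a key press that
arrives while the buffer is full is simply dropped.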

Below is an image of a running session in the GUI simulator:

![H2 GUI Simulator](https://raw.githubusercontent.com/howerj/howerj.github.io/master/h2/forth-cpu-gui.png "Running GUI H2 SoC Simulator")

Building can be done with

        make gui

And running:

        make gui-run

Or:

        ./gui   h2.hex (on Linux)
        gui.exe h2.hex (on Windows)

The [Linux][] build should work when the development package for [free glut][]
is installed on your system. The [Windows][] build may require changes to the
build system and/or manual installation of the compiler, libraries and headers.

The current key map is:

        Up         Activate Up D-Pad Button, Release turns off
        Down       Activate Down D-Pad Button, Release turns off
        Left       Activate Left D-Pad Button, Release turns off
        Right      Activate Right D-Pad Button, Release turns off
        F1 - F8    Toggle Switch On/Off, F1 is left most, F8 Right Most
        F11        Toggle UART/PS2 Keyboard Input
        F12        Toggle Debugging Information
        Escape     Quit simulator

All other keyboard keys are redirected to the UART or PS/2 Keyboard input.

The switches and D-Pad buttons can be clicked on to turn them on. The switches
turn on with left clicks and off with right clicks. The D-Pad buttons turn on
with a click on top of them and turn off with a key release anywhere on the
screen.

# VHDL Components

The VHDL components used in this system are designed to be reusable and
portable across different toolchains and vendors. Hardware components, like block
RAM, are inferred and not explicitly instantiated. The components are also made
to be as generic as possible, with most having selectable widths. This would be
taken to the extreme, but unfortunately many vendors still do not support the
VHDL-2008 standard.

| File     | License    | Author          | Description                         |
| -------- | ---------- | --------------- | ----------------------------------- |
| util.vhd | MIT        | Richard J Howe  | A collection of generic components  |
| h2.vhd   | MIT        | Richard J Howe  | H2 Forth CPU Core                   |
| vga.vhd  | LGPL 3.0   | Javier V García | Text Mode VGA 80x40 Display         |
| uart.vhd | Apache 2.0 | Peter A Bennett | UART, modified from original        |
| kbd.vhd  | ???        | Scott Larson    | PS/2 Keyboard                       |
| led.vhd  | MIT        | Richard J Howe  | LED 7-Segment + Dot Display Driver  |


# eForth on the H2

The pseudo Forth like language used as an assembler is described above; the
application that actually runs on the Forth core is itself a Forth
interpreter. This section describes the Forth interpreter that runs on the H2
core, which is contained within [h2.fth][].

TODO:
- Fully implement the Forth interpreter
- Describe and show its operation on here including memory layout, list of
  words, word layout, ...

# Using Forth as a bootloader

A running Forth environment can be quite easily used as a bootloader with no
further modification, a simple protocol for sending data and verification of it
can be built using only Forth primitives - although it is not the most
efficient use of bandwidth.

The sender can interrogate the running Forth environment over the serial link
to determine the amount of space left in memory, and then populate it with an
assembled binary.

The Forth words needed are:


| Word    | Description           |
| ------- | --------------------- |
| .free   | show free space       |
| cp      | compile pointer       |
| pwd     | previous word pointer |
| @       | load                  |
| !       | store                 |
| cr      | print new line        |
| execute | execute               |
| decimal | set decimal output    |
| cells   | size of cell          |
| .       | print number          |


Numeric input is also needed; all of these are provided by this interpreter.
The protocol is line oriented: the host with the program to transfer to the H2
(called PC) sends a line of text and expects a reply from the H2 board (called
H2):

        PC: decimal           ( set the H2 core to a known numeric output )
        PC: .free cp @ . cr   ( query how much space is left, and where to put it )
        H2: ADDR ADDR         ( H2 replies with both addresses )
        PC: 1 cells . cr      ( PC queries size of cells )
        H2: 2                 ( H2 responds, PC now knows to increment ADDR )
        PC: NUM  ADDR !       ( PC writes NUM to ADDR )
        PC: ADDR @ . cr       ( optionally PC checks value )
        H2: NUM               ( H2 responds with value stored at ADDR )
        ...                   ( PC and H2 do this as often as necessary )
        PC: ADDR pwd !        ( PC optionally updates previous word register )
        PC: ADDR cp  !        ( PC optionally updates compile pointer )
        PC: ADDR execute      ( Begin execution of word )

The advantage of this "protocol" is that it is human readable, and includes a
debugger for the microcontroller it is operating on.
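
On the PC side, the store and verify steps of the exchange above might be
sketched as follows. The helper names `format_store` and `reply_matches` are
hypothetical, serial port handling is left out, and the reply check assumes
the H2 may print cells as either signed or unsigned decimal numbers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Build the line "NUM ADDR !" that writes one 16-bit cell.
 * Returns 0 on success, -1 if the buffer is too small. */
static int format_store(char *line, size_t length, unsigned value, unsigned address)
{
	assert(line && length > 0);
	const int r = snprintf(line, length, "%u %u !\n", value & 0xFFFFu, address);
	return (r < 0 || (size_t)r >= length) ? -1 : 0;
}

/* Check the reply to "ADDR @ . cr" against the value that was sent.
 * Comparing as 16-bit values accepts both "-1" and "65535" for 0xFFFF. */
static int reply_matches(const char *reply, unsigned expected)
{
	assert(reply);
	int got = 0;
	if(sscanf(reply, "%d", &got) != 1)
		return 0;
	return (uint16_t)got == (uint16_t)expected;
}
```

A loader would call `format_store` once per cell, incrementing the address by
the cell size reported by the H2, and optionally read each cell back with
`reply_matches` before moving on.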

# A simple Forth block editor

TODO:
- Add an SPI master with Rx/Tx FIFOs to the SoC, which can then be used to
communicate with the memory on the [Nexys3][]; this can then be used in
conjunction with a simple block editor for a full Forth experience.
- Talk about implementing a simple block editor in a few words of Forth.

<http://retroforth.org/pages/?PortsOfRetroEditor>

# Coding standards

There are several languages used throughout this project, all of which are
radically different from each other and require their own set of coding
standards and style guides.

## VHDL

Common signal names:

        clk       - The system clock
        rst       - A reset signal for the module
        we        - Write Enable
        re        - Read  Enable
        di        - Data  In
        din       - Data  In
        do        - Data  Out
        dout      - Data  Out
        control   - Generally an input to a register, the documentation
                    for the module will need to be consulted to find out
                    what each bit means
        signal_we - The write enable for 'signal'
        signal_i  - This is an input signal
        signal_o  - This is an output signal

Generally the "\_i" and "\_o" suffixes are not used; modules are kept short
and names are chosen so their meaning is obvious. This rule might be revisited
once the project grows.

Components should:

* Be as generic as possible
* Use an asynchronous reset
* If a feature of a module can be made optional, by either ignoring outputs
or setting inputs to sensible values, it should be.
* Where possible use a function, it is easy enough to turn a generic
component into a module that can be synthesized but not the other way around.
* Use "downto" not "to" when specifying variable ranges.
* Use assertions throughout the code with the correct severity level ('failure'
for when something has seriously gone wrong or 'error' for debugging purposes)
* Constrain types and generic parameters if possible, as an example, if a generic
value should never be zero, use "positive" not "natural".
* Try not to specify constants with fixed lengths where an expression using
"others" can be used instead, for example:

<!-- -->

        constant N: positive := 4;
        signal a: std_logic_vector(N - 1 downto 0) := (others => '1');



Instead of:


        signal a: std_logic_vector(3 downto 0) := x"F";



The style rules are as follows:

* All words, including keywords, are to be in lower case. An underscore
will separate words in names.
* Tabs are to be used to indent text, a tab spacing of 8 has been used when
making the VHDL code
* Do not repeat the name of an entity, component, function or architecture,
there is little point of repeating this, it just means when a unit has to be
renamed it has to be done in two places instead of one.
* The ":" in definitions of signals belongs next to the signal name, not
some arbitrary amount of spaces after it.
* Group related signals.
* Try to line up rows of signals
* Trigger logic on the rising edge, and use the "rising\_edge" function not
"clk'event and clk ='1'"
* By and large, each warning produced by the synthesis tool should be
justified, and there should be very few warnings in the entire project if any.
* Do not use inferred latches.
* Load data from a file instead of generating VHDL files that contain the data,
synthesis tools can handle impure VHDL functions that can read the initial data
(for a ROM or block RAM as an example) from textual files.

<!-- -->

An example of the formatting guidelines, this describes a simple arbitrary
width register:

        -- Lots of comments about what the unit does should go
        -- here. Describe the waveforms, states and use ASCII
        -- art where possible.
        library ieee;
        use ieee.std_logic_1164.all;
        use ieee.numeric_std.all;    -- numeric_std not std_logic_arith

        entity reg is -- generic and port indented one tab, their parameters two
                generic(
                        N: positive); -- Generic parameters make for a generic component
                port(
                        clk: in  std_logic; -- standard signal names
                        rst: in  std_logic; --
                        we:  in  std_logic;
                        di:  in  std_logic_vector(N - 1 downto 0);
                        do:  out std_logic_vector(N - 1 downto 0)); -- note the position of ");
        end entity; -- "end entity", not "end reg"

        architecture rtl of reg is
                signal r_c, r_n: std_logic_vector(N - 1 downto 0) := (others => '0');
        begin
                do <= r_c;

                process(rst, clk)
                begin
                        if rst = '1' then -- asynchronous reset
                                r_c <= (others => '0');
                        elsif rising_edge(clk) then -- rising edge, not "clk'event and clk = '1'"
                                r_c <= r_n;
                        end if;
                end process;

                process(r_c, di, we)
                begin
                        r_n <= r_c;
                        if we = '1' then
                                r_n <= di;
                        end if;
                end process;
        end; -- "end" or "end architecture"


## C

There is quite a lot of [C][] code used within this project, used to make a
tool chain for the H2 core and to simulate the system. It follows a fairly
strict coding style.

* Tabs are to be used instead of spaces, a tab width of 8 was used when coding
  the C, if this causes any code to go off screen then there is a problem with
  the code and not the tab length.
* Generally the [K&R][] style is followed.
* Line lengths should ideally be limited to 80 characters, but this is
  definitely not an enforced limit.
* Where there are two or more data structures that must be kept in sync, with a
  one to one correspondence of elements, such as an enumeration and an array of
  strings that each enumeration maps onto, an [X-Macro][] should be used to
  keep the data in sync and to initialize the enumeration and array of strings.
* Try to use only portable constructs and isolate the constructs that are not
  portable.

<!-- -->
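
As an illustration of the [X-Macro][] guideline, the sketch below keeps an
enumeration and its table of names in sync from a single list. The list
contents and the names `type_e`, `type_names` and `type_to_string` are made up
for this example.

```c
#include <assert.h>
#include <stddef.h>

/* The single source of truth: one X() entry per enumeration value */
#define X_MACRO_TYPES \
	X(TYPE_LITERAL, "literal")\
	X(TYPE_BRANCH,  "branch")\
	X(TYPE_0BRANCH, "0branch")\
	X(TYPE_CALL,    "call")\
	X(TYPE_ALU,     "alu")

typedef enum {
#define X(ENUM, NAME) ENUM,
	X_MACRO_TYPES
#undef X
	TYPE_COUNT /* number of entries, always last */
} type_e;

static const char *type_names[] = {
#define X(ENUM, NAME) [ENUM] = NAME,
	X_MACRO_TYPES
#undef X
};

static const char *type_to_string(type_e type)
{
	assert((unsigned)type < (unsigned)TYPE_COUNT);
	return type_names[type];
}
```

Adding a new entry to `X_MACRO_TYPES` extends both the enumeration and the
string table, so the two cannot drift apart.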

There is nothing too surprising about the [C][] code in this project, but a
few conventions that differ from common practice are worth describing.

* Switch statements are formatted depending upon what the switch statement 'case'
clauses look like, if they are a simple one liner such as an assignment or a
mapping then the entire statement should occupy only a single line, for
example:

<!-- -->

        static const char *alu_op_to_string(uint16_t instruction)
        {
                /* notice also that the 'case' clauses are inline with the
                 * switch selector */
                switch(ALU_OP(instruction)) {
                case ALU_OP_T:                  return "T";
                case ALU_OP_N:                  return "N";
                case ALU_OP_T_PLUS_N:           return "T+N";
                case ALU_OP_T_AND_N:            return "T&N";
                case ALU_OP_T_OR_N:             return "T|N";
                case ALU_OP_T_XOR_N:            return "T^N";
                case ALU_OP_T_INVERT:           return "~T";
                case ALU_OP_T_EQUAL_N:          return "N=T";
                case ALU_OP_N_LESS_T:           return "T>N";
                case ALU_OP_N_RSHIFT_T:         return "N>>T";
                case ALU_OP_T_DECREMENT:        return "T-1";
                case ALU_OP_R:                  return "R";
                case ALU_OP_T_LOAD:             return "[T]";
                case ALU_OP_N_LSHIFT_T:         return "N<<T";
                case ALU_OP_DEPTH:              return "depth";
                case ALU_OP_N_ULESS_T:          return "Tu>N";
                case ALU_OP_ENABLE_INTERRUPTS:  return "seti";
                case ALU_OP_INTERRUPTS_ENABLED: return "iset?";
                case ALU_OP_RDEPTH:             return "rdepth";
                case ALU_OP_T_EQUAL_0:          return "0=";
                case ALU_OP_CPU_ID:             return "cpu-id";
                default:                        return "unknown";
                }
        }

* Unnecessary braces are avoided:

<!-- -->

        if(foo)
                bar();
        else
                baz();

* "goto" can be used - it can be misused, but using it does not instantly make
  code inscrutable contrary to popular belief.
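
One common and readable use of "goto" is a single clean up path when several
allocations can fail. The sketch below is an example of that pattern only; the
`machine_t` structure and its functions are hypothetical and do not appear in
the project.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
	uint16_t *core, *stack;
	size_t cells, depth;
} machine_t;

static void machine_free(machine_t *m)
{
	if(!m)
		return;
	free(m->stack); /* free(NULL) is a no-op, so partial objects are fine */
	free(m->core);
	free(m);
}

static machine_t *machine_new(size_t cells, size_t depth)
{
	assert(cells > 0 && depth > 0);
	machine_t *m = calloc(1, sizeof *m);
	if(!m)
		goto fail;
	m->core = calloc(cells, sizeof *m->core);
	if(!m->core)
		goto fail;
	m->stack = calloc(depth, sizeof *m->stack);
	if(!m->stack)
		goto fail;
	m->cells = cells;
	m->depth = depth;
	return m;
fail:   /* every error path ends up here, so nothing is leaked */
	machine_free(m);
	return NULL;
}
```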

## FORTH

The Forth in [h2.fth][] used to build an actual Forth system for the target
core is not a proper Forth, but a compiler for a Forth like language, this
idiosyncratic language has its own way of doing things. The workings of the
language will not be described in this section, only the coding standards and
style guide.

Either type of comment can be used, although "( )" comments are preferred.
Single line words should be short, at only a few words long, and multi line
words should be indented properly:

        : 1+ 1 + ;
        : negate invert 1+ ;
        : dnegate not >r not 1 um+ r> + ; ( d -- d )

Tabs should be used for indentation and a stack comment present for long or
complex words.

        : ?rx ( -- c -1 | 0 : read in a character of input from UART )
                iUart @ 0x0100 and 0= ( the value to test goes on one line )
                if                    ( the 'if' on another line )
                        0x0400 oUart ! iUart @ 0xff and -1
                else
                        0
                then ; ( ';' is part of the final statement )

        \ This word is too long for the stack comment and to be on a
        \ single line
        : parse ( c -- b u ; <string> )
                >r tib >in @ + #tib @ >in @ - r> parser >in +! ;

Space is seriously limited on the target device at only 8192 cells (16KiB), so
words are kept as short as possible and programs are highly factored. Speed is
not so much of an issue as the board and core run at 100MHz.

Stack comments describe what values a Forth word takes and returns. It is good
practice to make words that accept and return a fixed number of parameters,
but in certain circumstances it is advantageous to return a variable number of
arguments. The comments also describe the type of the arguments a word
accepts; the Forth kernel, however, does no checking on the data it gets.

Stack comments should be added with the following scheme:

| Comment        | Meaning                             |
|----------------|-------------------------------------|
| a              | cell address                        |
| n              | signed number                       |
| u              | unsigned number                     |
| b              | string address                      |
| c              | single character                    |
| d              | double width number (2 Cells)       |
| f              | boolean flag (-1 = true, 0 = false) |
| k              | block number                        |
| cfa            | code field address of a word        |
| nfa            | name field address of a word        |
| pwd            | previous word address of a word     |
| &lt;string&gt; | a parsing word                      |

Stack comments have the following format:

          Variable Stack Effects  Return stack Effects     Parsing    Description
        ( arguments -- returns;   R: arguments -- returns; <parses> : comment     )

Examples of words and their stack comments:

        dup   ( n -- n n : duplicate a number )
        >r    ( n -- ; R: -- n : move a number to the return stack )
        r>    ( -- n ; R: n -- : move a number from the return stack )
        parse ( c -- b u; <string> : parse a word delimited by 'c' )
        over  ( n1 n2 -- n1 n2 n1 : duplicate next on stack over top )

Words can have their arguments numbered to make it clearer what the effects
are.

# To Do

* My [embed][] project, which was derived from the simulator and Forth for this
project, has an improved version of Forth which could be reintegrated with
this project. The [embed][] project features a metacompiler suitable for 16-bit
systems like this one, it could be used in lieu of the Pseudo-Forth compiler.
* Guide to reusing the VHDL in this project, and component listing
* Turn the [h2.fth][] into a literate program file, describing how to build up
a Forth system like in "The Zen of eForth by C. H. Ting".
* Make a bootloader/program loader as a single, simple program
* Make diagrams of the SoC layout, take GIFs and pictures of the simulators and
the running board.
* Make a javascript based simulator for the H2, perhaps with [emscripten][]
* Move this file to "h2.md" and make a simpler "readme.md" with a short
description and flashy GIFs
* Add notes about picocom, and setting up the hardware:
* Add a [Wishbone interface][] for each component
* Put the project up on [opencores][]
* It should be possible to turn [h2.c][] into a Forth interpreter usable in
a hosted environment. Some of the remaining instructions could be used for
function calls and floating point operations and the "uint16\_t" would have to
be changed to "uintptr\_t". As the interpreter is so small a Forth to "C"
compiler could just spit out a copy of the interpreter. It should make for
a very compact system. 
* Spin off the VT100 VHDL code into a separate project, and the C code in the
simulator into a simple terminal emulator for Unixen.
* Add some example games, any of the following: Space Invaders, Tetris, Pong,
Minesweeper, Sokoban, Sudoku, Hack, 2048, Breakout, Brain F\*ck, Conway's 
Game of Life, and other terminal based games could be added, see
<https://codegolf.stackexchange.com/questions/52547/minimal-nethack> for a
minimal version of Nethack, and <https://github.com/rupa/YOU_ARE_DEAD> for a
cool 1 dimensional game.
* Other programs like a compressor, decompressor, ARC4, data base function based
around blocks, and a meta compiler (removing the need for the C assembler),
could be added.



<!-- -->

        picocom --omap delbs -b 115200 -e b /dev/ttyUSB1

# Resources

* <https://nanode0000.wordpress.com/2017/04/08/exploring-the-j1-instruction-set-and-architecture/>
* <https://www.fpgarelated.com/showarticle/790.php>
* <https://opencores.org/>
* <https://en.wikipedia.org/wiki/Peephole_optimization>
* <https://en.wikipedia.org/wiki/Superoptimization>
* <https://github.com/samawati/j1eforth>
* <https://github.com/jamesbowman/j1>
* <http://sovietov.com/app/forthwiz.html>

[javascript]: https://www.javascript.com/
[emscripten]: https://github.com/kripken/emscripten
[DEBUG.COM]: https://en.wikipedia.org/wiki/Debug_%28command%29
[DOS]: https://en.wikipedia.org/wiki/DOS
[h2.c]: h2.c
[h2.fth]: h2.fth
[tb.vhd]: tb.vhd
[uart.vhd]: uart.vhd
[top.ucf]: top.ucf
[font.bin]: font.bin
[text.bin]: text.bin
[J1]: http://www.excamera.com/sphinx/fpga-j1.html
[J1 PDF]: http://excamera.com/files/j1.pdf
[PL/0]: https://github.com/howerj/pl0
[libforth]: https://github.com/howerj/libforth/
[MIT]: https://en.wikipedia.org/wiki/MIT_License
[LGPL]: https://www.gnu.org/licenses/lgpl-3.0.en.html
[VHDL]: https://en.wikipedia.org/wiki/VHDL
[Verilog]: https://en.wikipedia.org/wiki/Verilog
[UART]: https://en.wikipedia.org/wiki/Universal_asynchronous_receiver/transmitter
[FORTH]: https://en.wikipedia.org/wiki/Forth_%28programming_language%29
[Nexys3]: http://store.digilentinc.com/nexys-3-spartan-6-fpga-trainer-board-limited-time-see-nexys4-ddr/
[Make]: https://en.wikipedia.org/wiki/Make_%28software%29
[C]: https://en.wikipedia.org/wiki/C_%28programming_language%29
[Debian]: https://en.wikipedia.org/wiki/Debian
[Linux]: https://en.wikipedia.org/wiki/Linux
[GCC]: https://en.wikipedia.org/wiki/GNU_Compiler_Collection
[Xilinx ISE]: https://www.xilinx.com/products/design-tools/ise-design-suite.html
[Xilinx]: https://www.xilinx.com
[GHDL]: http://ghdl.free.fr/
[GTKWave]: http://gtkwave.sourceforge.net/
[C99]: https://en.wikipedia.org/wiki/C99
[tcl]: https://en.wikipedia.org/wiki/Tcl
[Wishbone interface]: https://en.wikipedia.org/wiki/Wishbone_%28computer_bus%29
[D-Pad]: https://en.wikipedia.org/wiki/D-pad
[FIFO]: https://en.wikipedia.org/wiki/FIFO_%28computing_and_electronics%29
[VGA]: https://en.wikipedia.org/wiki/Video_Graphics_Array
[PS/2]: https://en.wikipedia.org/wiki/PS/2_port
[LED]: https://en.wikipedia.org/wiki/Light-emitting_diode
[8 Segment LED Display]: https://en.wikipedia.org/wiki/Seven-segment_display
[ISO 8859-1 (Latin-1)]: https://cs.stanford.edu/people/miles/iso8859.html
[Spartan 6]: https://www.xilinx.com/products/silicon-devices/fpga/spartan-6.html
[FPGA]: https://en.wikipedia.org/wiki/Field-programmable_gate_array
[ASCII]: https://en.wikipedia.org/wiki/ASCII
[free glut]: http://freeglut.sourceforge.net/
[pthreads]: https://en.wikipedia.org/wiki/POSIX_Threads
[LFSR]: https://en.wikipedia.org/wiki/Linear-feedback_shift_register
[freeglut]: http://freeglut.sourceforge.net/
[EBNF]: https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form
[K&R]: https://en.wikipedia.org/wiki/Indent_style#K.26R
[X-Macro]: https://en.wikipedia.org/wiki/X_Macro
[Windows]: https://en.wikipedia.org/wiki/Windows_7
[pandoc]: https://pandoc.org
[picocom]: https://github.com/npat-efault/picocom
[Gforth]: https://www.gnu.org/software/gforth/
[opencores]: https://opencores.org
[VT100]: https://en.wikipedia.org/wiki/VT100
[embed]: https://github.com/howerj/embed
<!--

https://www.w3schools.com/css/css_table.asp
http://adis.ca/entry/2011/pretty-code-block-in-css/
https://www.w3.org/Style/Examples/007/center.en.html
https://css-tricks.com/centering-css-complete-guide/
-->

<style type="text/css">body{margin:40px auto;max-width:850px;line-height:1.6;font-size:16px;color:#444;padding:0 10px}h1,h2,h3{line-height:1.2}table {width: 100%; border-collapse: collapse;}table, th, td{border: 1px solid black;}code { color: #091992; } </style>