Rev |
Log message |
Author |
Age |
Path |
105 |
Fixed some nasty early branching bugs. Adjusted the Makefile to declare that
cpudefs.h was automatically generated from cpudefs.v, and made sure that
zipbones included the cpudefs.v so it could get the DEBUG_SCOPE define.
In addition, the test.S was updated to test long jumps, the early branching
bug we found, and all three early branching instructions: ADD #x,PC, LOC(PC),PC,
and LDI #x,PC. |
dgisselq |
3108d 11h |
/zipcpu/trunk/rtl/ |
91 |
Minor updates. |
dgisselq |
3149d 04h |
/zipcpu/trunk/rtl/ |
90 |
Removed MOV x(PC),PC from the list of possible early branching instructions.
ADD X,PC and LDI X,PC are now the only recognized early branching instructions.
This was done to spare logic, although I don't think I spared more than a
LUT or two. |
dgisselq |
3149d 04h |
/zipcpu/trunk/rtl/ |
88 |
Eliminated some warnings. The div fixes were to simplify the logic, even though
the result is less readable ... |
dgisselq |
3173d 04h |
/zipcpu/trunk/rtl/ |
84 |
Minor updates. |
dgisselq |
3175d 02h |
/zipcpu/trunk/rtl/ |
83 |
Added a flag to indicate whether an exception took place on the first
or second half of a VLIW instruction--will be zero in non-VLIW mode,
equivalent to the second half of the instruction having caused the
exception. (Expect these flags to be reordered some time in the future into
a less haphazard ordering ...)
Vastly simplified the pipeline logic, primarily for op_stall, but also touched
opA and opB. (Trying to fit within timing on Spartan 6 ...)
Changed division instruction to include a reset on clear_pipeline, to make
certain [BC $addr; DIV Rx,Ry ] works regardless of whether the condition is
true. |
dgisselq |
3175d 02h |
/zipcpu/trunk/rtl/ |
82 |
Found and (I hope) fixed a nasty bug that would send the prefetch into an
endless loop whenever you jumped to an instruction at the last location
in an unloaded cache line. |
dgisselq |
3175d 02h |
/zipcpu/trunk/rtl/ |
81 |
Trying to clean up ISE generated warnings. |
dgisselq |
3175d 02h |
/zipcpu/trunk/rtl/ |
80 |
Bug fix: declared the (combined) multiply to be signed again. Also
changed the name of the generate'd for block, to keep ISE from complaining. |
dgisselq |
3175d 02h |
/zipcpu/trunk/rtl/ |
71 |
This contains a bunch of bug fixes. (A lot ...) For example, the pipeline
stall code has also seriously changed, to fixed the pipeline memory load/op
stage conflict, while maintaining no-stall operation for operands that don't
need an offset. This had a cascading effect, however, so that the multiply
could no longer complete in a single cycle. Therefore, the timing on the
multiplies was slowed down to two cycles from a single cycle. (It's the
only two-cycle ALU operation ...) The illegal instruction code has also been
fixed, so that illegal instructions no longer stalls the prefetch bus. |
dgisselq |
3180d 06h |
/zipcpu/trunk/rtl/ |
69 |
This implements the "new Instruction Set" architecture for the Zip CPU. It's
a massive change set, that touches just about everything but probably not
enough of everything. Please see the spec.pdf for a description of this
new architecture. |
dgisselq |
3186d 10h |
/zipcpu/trunk/rtl/ |
66 |
Adjusted the support for the DEBUG_SCOPE within these so that it can be
compiled in, or not, based upon an external build configuration file: cpudefs.v.
That allows me to make that file project specific, while the rest of the CPU
is shared among all projects. |
dgisselq |
3247d 10h |
/zipcpu/trunk/rtl/ |
65 |
Lots of logic simplifications to the core, in addition to better support for
illegal instruction detection and bus error detection. The biggest change
had to deal with pushing the debug write interface into the ALU write
processing path. This simplifies the logic of adjusting the PC and CC
registers primarily, but also any writes to other registers. It also delays
these register writes by a clock, but since the debug interface is already
ridiculously slow I doubt that matters any. |
dgisselq |
3247d 10h |
/zipcpu/trunk/rtl/ |
64 |
Shuffled some comments into here from elsewhere. |
dgisselq |
3247d 10h |
/zipcpu/trunk/rtl/ |
63 |
Simplified bus interactions, and added support for detecting illegal
instructions (i.e. bus errors) in the pipefetch routine. |
dgisselq |
3247d 10h |
/zipcpu/trunk/rtl/ |
62 |
Simplified the subtraction logic, so the carry bit no longer depends on
a separate 32-bit operation but becomes part of the subtract operation. |
dgisselq |
3247d 10h |
/zipcpu/trunk/rtl/ |
61 |
Simplified the bus delay logic. Depends upon the stall line being irrelevant
outside of a bus cycle. |
dgisselq |
3247d 10h |
/zipcpu/trunk/rtl/ |
56 |
Here's a bit of work in progress for getting the Zip CPU working on a XuLA2
board. Many changes include: the existence of a cpudefs.v file to control
what "options" are included in the ZipCPU build. This allows build control
to be separated from the project directory (one build for a XuLA2 board,
another for a Basys-3 development board). Other changes have made things
perhaps harder to read, but they get rid of warnings from XST.
A big change was the addition of the (* ram_style="distributed" *) comment
for the register set. This was necessary to keep XST from inferring a block
RAM and breaking the logic that was supposed to take place between a register
read and when it was used. |
dgisselq |
3257d 13h |
/zipcpu/trunk/rtl/ |
49 |
Final set of changes finishing the Dhrystone package. Dhrystone, as
implemented by hand in assembly, now works. |
dgisselq |
3267d 04h |
/zipcpu/trunk/rtl/ |
48 |
Files added/updated to get Dhrystone benchmark to work. Several fixes
to the CPU in the process, 'cause it wasn't working. Stall-less ALU
ops now work better, to include grabbing the memory result as it comes out
of the memory unit and placing it straight into either ALU or memory unit
for the next instruction. |
dgisselq |
3267d 04h |
/zipcpu/trunk/rtl/ |
38 |
A couple of quick updates:
- The Zip CPU now supports pipelined memory access at one clock per
instruction (assuming all the instructions are in the cache)
- There is now a 'zipbones' module to build a Zip System without peripherals.
Any peripherals would then need to be external to the CPU.
- Some bug fixes.
Documentation changes coming shortly. |
dgisselq |
3270d 09h |
/zipcpu/trunk/rtl/ |
36 |
*Lots* of changes to increase processing speed and remove pipeline stalls.
Removed the useless flash cache, replacing it with a proper DMA controller.
"make test" in the main directory now runs a test program in Verilator and
reports on the results. |
dgisselq |
3279d 13h |
/zipcpu/trunk/rtl/ |
34 |
Bunches of changes, although very little changed with the core itself.
Regarding the core, some bugs were fixed within zipcpu.v (the CPU part of the
core), so that the debugger can change the program counter. The debugger
can now halt the CPU and then view, examine, and modify registers to include
the program counter, although live changes to the CC register have not been
tested.
There was also a bug in the stall handling of the wishbone bus delay line. This
has now been fixed.
Moving outwards to the system, some parameters have been added to zipsystem
to make it more configurable for whatever environment you might wish to place
it within. Other minor clean ups have taken place, mostly to the internal
documentation.
Lots of changes, though, to the assembler. The big one is the implementation
of #define macros, C style. Several buggy macros were in sys.i. These have
been fixed. The Makefile has been adjusted so that the build of test.S, which
depends upon sys.i, is now properly dependent upon sys.i for make purposes.
Further, not only will zpp, the assembler preprocessor, handle #define macros,
it will also recursive #defines. The assembler expression evaluator has also
been updated to properly handle both operator precedence, as well as modulo
arithmetic.
The master system test file, test.S, found in the sw/zasm directory has been
updated to reflect these new capabilities. (I really need to move it to the
bench/asm directory, so you may expect that change sometime later.) |
dgisselq |
3305d 08h |
/zipcpu/trunk/rtl/ |
30 |
Here's a 20% increase in performance: We've gone from 0.44 clocks per
instruction up to 0.53 clocks per instruction on the test.S testset. The
cost? Oh, only about 300 slices.
Not bad.
The specification document will also soon be updated with a list of
conditions that create stalls, as eliminating stalls was how I managed to get
the performance up like I did. |
dgisselq |
3308d 15h |
/zipcpu/trunk/rtl/ |
25 |
Lots of changes, hopefully all for the better. The result works in a
simulator, although it has yet to be tested yet in an FPGA--so it may still
have Xilinx build errors.
1. The wires brought from the CPU to the Zip System for the debug command
register were adjusted. They now include GIE and SLEEP, but no longer include
the step or break enable bits as these were fairly useless anyway.
2. The user and master A-Stall counters were re-labeled as instruction count
counters (which is what they are now anyway). This is for performance reasons
so that, after the fact, you can measure how many instructions per clock
you were actually able to achieve.
3. The CPU debug access port stall was adjusted so that the data port no longer
stalls when the CPU isn't halted. This can be useful, for example, when trying
to determine where th program counter is at without stalling the CPU. (You'll
still need to read two registers, the supervisor and user program counters, and
reading these registers still requires a write to the debug command port first,
so this still requires 4 single operand wishbone bus cycles.)
4. Signed and unsigned 16-bit multiply capabilities were added to the ALU
(cpuops.v) and support added in the Zip CPU master file as well.
5. The ZIP CPU now spports the TRAP bit in the CC register, so that after a user
interrupt the supervisor can tell that it was a user interrupt versus a hardware
interrupt. This bit is set any time the user disables the GIE bit, and cleared
any time the supervisor sets the GIE bit.
6. A reserved position was created in the CC register for a floating point
enable flag. This flag is permanently false, however, on the current
implementation as it doesn't implement floating point.
7. Logic was added to handle the break instruction. This instruction has now
been tested successfully in the simulator. If a break is issued, the CPU will
either halt (if in supervisor mode, or if in user mode with the break enable
bit set in the CC register), or the CPU will trip an interrupt for the
supervisor to transfer execution to a user-level debugging task.
8. After watching the CPU stall on a LDIHI followed by an LDILO, logic was
adjusted to keep the pipeline from stalling in thesee conditions. This lew
logic works for an 'A' operand, or equivalently for a 'B' operand with no
immediate. In the cases of such logic, the operand is loaded directly from the
output of the ALU into the input of the ALU skipping the operand read stage of
the pipelinle. This logic has not been tested on an FPGA yet, so it isn't clear
if it will break timing requirements or not. (Goal is 100 MHz clock.) As
of this new change, the CPU can now execute 0.48 instructions per clock, versus
the 0.44 it was getting before, across the test set.
9. Sleep logic was adjusted to prevent the user from switching to supervisor
mode and putting the processor to (infinite) sleep at the same time. The
justification was the fact that a user should not be able to halt the CPU when
other processes that might want it might still exist.
Other changes were made as well, but to other portions of the project. Those
will be checked in shortly. |
dgisselq |
3309d 01h |
/zipcpu/trunk/rtl/ |
18 |
A couple of changes: Registers can now be changed via the debug interface.
Also, in anticipation of being able to interrupt the break the processor,
the CPU now exports an interrupt line to the external environment to tell
when it has been halted. Thus, if it gets halted by a break instruction,
the ZipSystem will interrupt whatever's in its environment so that the
debugger can come and examine its state.
Oh, and one other: because you can't examine the state of the CPU without
halting it, I modified the debug control register to export the four
useful flags: break-enable, interrupts enabled, and sleep (step comes for
free in this implementation). |
dgisselq |
3315d 01h |
/zipcpu/trunk/rtl/ |
15 |
Updated the core CPUOPS module to make certain that the carry was properly
set on right shifts. (Carry is then the last bit shifted out to the right,
and has no relation to the high order bits of the word.) Also fixed a bug
in the busdelay.v file that prevented our Quad SPI flash controller from
working. (This bug fix has not yet been tested ...) Our test.S program, the
closest thing we have to a regression test and found in the sw/zasm directory,
still successfully passes in Verilator. |
dgisselq |
3318d 14h |
/zipcpu/trunk/rtl/ |
12 |
Bunch of changes while trying to get a hello world program:
1. Right shifts by 32 or more now result in zero, or all of the top bit in the
case of ASRs.
2. zdump now properly includes addresses with dumped lines.
3. zparser now properly handles immediate values via the .DAT instruction. |
dgisselq |
3333d 06h |
/zipcpu/trunk/rtl/ |
11 |
This version works on an FPGA!!!
(Or at least the wdt.S program passes ...) |
dgisselq |
3333d 14h |
/zipcpu/trunk/rtl/ |
9 |
This checkin is the result of a watchdog timer test, and everything it took
to get the watchdog timer working. The timer function was simplified,
although it now uses a touch more resources--being able to count down 31
bits instead of 30. The parser was modified, since it couldn't handle
storing to register plus offsets like it was supposed to be able to. The
testbench, zippy_tb, was modified to accept an assembled machine code file
such as I might place on a board to test it.
Lots of work to get it working.
Looking at the files below, it looks like I'll need a second check in to check
in the watchdog timer test itself. |
dgisselq |
3334d 04h |
/zipcpu/trunk/rtl/ |