OpenCores
URL https://opencores.org/ocsvn/m16c5x/m16c5x/trunk

M16C5x Soft-Core Microcomputer
=======================

Copyright (C) 2013, Michael A. Morris.
All Rights Reserved.

Released under LGPL.

General Description
-------------------

This project demonstrates the use of a PIC16C5x-compatible core as an
FPGA-based processor. It implements the 12-bit instruction set, the timer 0
module, the pre-scaler, and the watchdog timer. The core provided here is
compatible with the instruction set, but it is not a cycle-accurate model of
any particular PIC microcomputer.

As configured, the core supports single cycle (1) operation with internal
block RAM serving as program memory. In addition to the block RAM program
store, a 4x clock generator and reset controller are included as part of the
demonstration.

Three I/O ports are supported, but they are accessed as external registers and
buffers using a bidirectional data bus. The TRIS I/O control registers are
similarly supported. Thus, the core's user is able to map the TRIS and I/O
port registers in a manner appropriate to the intended application.

Read-modify-write operations on the I/O ports do not generate read strobes.
Read strobes of the three I/O ports are generated only if the ports are being
read using MOVF xxx,0 instructions. Similarly, the write enables for the three
I/O ports are asserted whenever the ports are updated. This occurs during
MOVWF instructions, or during read-modify-write operations such as XORF, MOVF,
etc.
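
The strobe rules above can be sketched as a small C model. This is only an
illustration, not code from the core: the names port_strobes_t, is_io_port and
port_strobes are hypothetical, the opcode patterns are the standard 12-bit PIC
encodings (MOVWF f = 0000 001f ffff, byte-oriented "op f,d" = opcodes 0x080
through 0x3FF with d in bit 5), and the ports are assumed to sit at file
addresses 5 to 7. Bit-oriented instructions and CLRF are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: which strobe a port access would raise. */
typedef struct {
    bool read_strobe;   /* asserted only for MOVF port,0                  */
    bool write_enable;  /* asserted for MOVWF and read-modify-write ops   */
} port_strobes_t;

/* Assumed file addresses of PORTA, PORTB and PORTC (5, 6 and 7). */
static bool is_io_port(unsigned f)
{
    return f >= 5 && f <= 7;
}

/* Classify one 12-bit opcode according to the rules in the text. */
port_strobes_t port_strobes(uint16_t opcode)
{
    port_strobes_t s = { false, false };
    assert(opcode <= 0x0FFF);

    unsigned f = opcode & 0x1F;        /* 5-bit file register address */
    unsigned d = (opcode >> 5) & 1;    /* destination: 0 = W, 1 = f   */

    if (!is_io_port(f))
        return s;

    if ((opcode & 0xFE0) == 0x020) {
        /* MOVWF f: the port is written from W. */
        s.write_enable = true;
    } else if (opcode >= 0x080 && opcode <= 0x3FF) {
        if (d == 1) {
            /* Result goes back to the port: write enable, no read strobe. */
            s.write_enable = true;
        } else if ((opcode & 0xFC0) == 0x200) {
            /* MOVF f,0: the only case that produces a read strobe. */
            s.read_strobe = true;
        }
    }
    return s;
}
```

For example, port_strobes(0x205) (MOVF 5,0) reports a read strobe only, while
port_strobes(0x225) (MOVF 5,1) reports a write enable only.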
 
Implementation
--------------

The implementation of the core provided consists of several Verilog source files
and memory initialization files:

    M16C5x.v                - Top level module
        M16C5x_ClkGen.v     - M16C5x Clock/Reset Generator
        P16C5x.v            - PIC16C5x-compatible processor core
            P16C5x_IDEC.v   - ROM-based instruction decoder for PIC16C5x core
            P16C5x_ALU.v    - Arithmetic & Logic Unit for PIC16C5x core
        M16C5x_SPI.v        - High-Speed, FIFO-buffered SPI Master Interface
            DPSFmnCE.v      - Configurable Depth/Width LUT-based Synch FIFO
                TF_Init.coe - Transmit FIFO Initialization file
                RF_Init.coe - Receive FIFO Initialization file
            SPIxIF.v        - Configurable Master SPI I/F with clock Generator
        M16C5x_UART.v       - UART with Serial Interface
            SSPx_Slv.v      - SSP-compatible Slave Interface
            SSP_UART.v      - SSP-compatible UART
                re1ce.v     - Rising Edge Clock Domain Crossing Synchronizer
                DPSFmnCE.v  - Configurable Depth/Width LUT-based Synch FIFO
                    UART_TF.coe - UART Transmit FIFO Initialization file
                    UART_RF.coe - UART Receive FIFO Initialization file
                UART_BRG.v  - UART Baud Rate Generator
                UART_TXSM.v - UART Transmit State Machine (includes SR)
                UART_RXSM.v - UART Receive State Machine (includes SR)
                UART_RTO.v  - UART Receive Timeout Generator
                UART_INT.v  - UART Interrupt Generator

        M16C5x_Test.coe     - M16C5x Test Program Memory Initialization File
        M16C5x_Tst2.coe     - M16C5x Test #2 Program Memory Initialization File
        M16C5x_Tst3.coe     - M16C5x Test #3 Program Memory Initialization File
        M16C5x_Tst4.coe     - M16C5x Test #4 Program Memory Initialization File

        M16C5x.ucf          - M16C5x User Constraint File
        M16C5x.bmm          - M16C5x Block RAM Memory Map File
 
Verilog testbench files are included for the processor core, the FIFO, and the
SPI modules.

    tb_M16C5x.v             - testbench for the soft-core processor module
    tb_P16C5x.v             - testbench for the processor core module
    tb_DPSFmnCE.v           - testbench for the LUT-based FIFO module
    tb_SPIxIF.v             - testbench for the SPI Master Interface module

Also provided are the MPLAB project and the source files used to create the
memory initialization files for testing the microcomputer application. These
files are found in the MPLAB subdirectory of the Code directory.

Finally, the configuration of the Xilinx tools used to synthesize, map, place,
and route the design is captured in the TCL file:

        M16C5x_3S50A.tcl    - TCL file for XC3S50A-4VQG100I FPGA

Run this TCL script from within the TCL console of ISE, or examine it in a
text editor, to set up the project files and to set the tools to the options
used to achieve the results provided here.

A utility program has been added to convert MPLAB Intel Hex programming files
into MEM files for use with the Xilinx Data2MEM utility program. This speeds
the process of incorporating program/data/parameter data into block RAMs. The
TCL file also incorporates the process parameter changes needed to get the BMM
file processed by Map/PAR/Bitgen.

    IH2MEM.c                    - Source code for Intel Hex to MEM utility
    IH2MEM.exe                  - Windows Executable (32-bit)

        M16C5x_Tst3.mem         - M16C5x Test #3 Program Memory Data2Mem File
        M16C5x_Tst4.mem         - M16C5x Test #4 Program Memory Data2Mem File
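
As an illustration of the first step such a converter has to perform, the
sketch below parses and checksums one Intel Hex record. It is not the code in
IH2MEM.c: the names ihex_record_t, ihex_parse and ihex_word are hypothetical,
and ihex_word assumes the usual MPLAB 8-bit hex layout in which each 12-bit
program word occupies two bytes, low byte first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One decoded record of the form ":LLAAAATT<data>CC". */
typedef struct {
    uint8_t  length;     /* LL: number of data bytes       */
    uint16_t address;    /* AAAA: byte address of the data */
    uint8_t  type;       /* TT: 00 = data, 01 = end of file */
    uint8_t  data[255];
} ihex_record_t;

static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

static int hex_byte(const char *p)
{
    int hi = hex_nibble(p[0]);
    int lo = hex_nibble(p[1]);
    return (hi < 0 || lo < 0) ? -1 : ((hi << 4) | lo);
}

/* Returns 0 on success, -1 on a malformed record or a checksum error. */
int ihex_parse(const char *line, ihex_record_t *rec)
{
    size_t len = strlen(line);
    if (len < 11 || line[0] != ':')
        return -1;

    int count = hex_byte(line + 1);
    if (count < 0 || len < 11 + 2 * (size_t)count)
        return -1;

    /* count, address (2 bytes), type, data bytes, checksum */
    uint8_t raw[260];
    size_t total = (size_t)count + 5;
    uint8_t sum = 0;
    for (size_t i = 0; i < total; i++) {
        int b = hex_byte(line + 1 + 2 * i);
        if (b < 0)
            return -1;
        raw[i] = (uint8_t)b;
        sum = (uint8_t)(sum + b);
    }
    if (sum != 0)               /* all bytes, checksum included, sum to 0 */
        return -1;

    rec->length  = raw[0];
    rec->address = (uint16_t)((raw[1] << 8) | raw[2]);
    rec->type    = raw[3];
    memcpy(rec->data, raw + 4, rec->length);
    return 0;
}

/* Assumed layout: word i is stored as low byte, then high byte. */
uint16_t ihex_word(const ihex_record_t *rec, size_t i)
{
    assert(2 * i + 1 < rec->length);
    return (uint16_t)((rec->data[2 * i] | (rec->data[2 * i + 1] << 8)) & 0x0FFF);
}
```

With the record ":04000000550C250076", ihex_parse succeeds and ihex_word
returns 0xC55 for word 0 and 0x025 for word 1.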
 
Synthesis
---------

The primary objective of the M16C5x is to synthesize a processor core, 4kW of
program memory, a buffered SPI master, and a buffered UART into a Xilinx
XC3S50A-4VQG100I FPGA. The present implementation includes the P16C5x core,
4kW of program memory, a dual-channel SPI Master I/F, and an SSP-compatible
UART supporting baud rates from 3M bps to 1200 bps.

Using ISE 10.1i SP3, the implementation results for an XC3S50A-4VQ100I are as
follows:

    Number of Slice FFs:                619 of 1408      43%
    Number of 4-input LUTs:            1287 of 1408      92%
    Number of Occupied Slices:          701 of  704      99%
    Total Number of 4-input LUTs:      1333 of 1408      94%

                    Logic:             1052
                    Route-Through:       46
                    16x1 RAMs:            8
                    Dual-Port RAMs:     194
                    32x1 RAMs:           32
                    Shift Registers:      1

    Number of BUFGMUXs:                   4 of   24      16%
    Number of DCMs:                       1 of    2      50%
    Number of RAMB16BWEs:                 3 of    3     100%

    Best Case Achievable:           12.381 ns (0.119 ns Setup, 0.691 ns Hold)
 
Status
------

Design and verification are complete. Verification was performed using ISim,
MPLAB, and a board with an XC3S200AN-4VQG100I FPGA, various oscillators,
SEEPROMs, and RS-232/RS-485 transceivers.
 
Release Notes
-------------

###Release 1.0

In this release, the M16C5x has been synthesized, mapped, placed, routed, and
used to configure an FPGA. The FPGA used for this initial test of the M16C5x
was the XC3S200A-4VQG100I FPGA. The test program provided demonstrated that
the M16C5x was executing the program in the same manner as simulated with the
MPLAB simulator.

Using an external 14.7456 MHz oscillator, selected for use with the UART,
square waves were generated by the core to illuminate external LEDs using the
upper 6 bits of PortA. The square waves have the appropriate ratios, and the
frequency of the fastest LED drive signal is ~4.753 kHz.

The clock generator multiplies the input frequency to 58.9824 MHz, which
results in an effective instruction frequency of 29.4912 MHz because of the
two-cycle nature of the core. The instruction loop is essentially
8*(8+3*256), which equals 6208 cycles per LED toggle. The measured toggle
frequency of the fastest LED is approximately equal to 29.4912 MHz / 6208, or
4.750 kHz.
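
The arithmetic in this paragraph can be checked with a few lines of C. This is
a worked example only; the function names are hypothetical, and the loop
estimate is read as 8*(8+3*256), which is the form that yields the quoted 6208
cycles.

```c
#include <assert.h>

/* Instruction rate: oscillator times the clock multiplier, divided by the
 * number of clocks the core spends on each instruction. */
double instruction_rate_hz(double osc_hz, unsigned multiplier,
                           unsigned clocks_per_instruction)
{
    assert(clocks_per_instruction > 0);
    return osc_hz * (double)multiplier / (double)clocks_per_instruction;
}

/* Instruction cycles per LED toggle, from the loop estimate 8*(8+3*256). */
unsigned long led_loop_cycles(void)
{
    return 8UL * (8UL + 3UL * 256UL);
}

/* Toggle rate of the fastest LED for the Release 1.0 configuration:
 * 4x clock multiplier and two clocks per instruction. */
double led_toggle_rate_hz(double osc_hz)
{
    return instruction_rate_hz(osc_hz, 4, 2) / (double)led_loop_cycles();
}
```

For the 14.7456 MHz oscillator, led_toggle_rate_hz(14745600.0) comes out at
about 4750.5 Hz, which agrees with the ~4.753 kHz measurement to within 0.1%.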
 
Work will continue to verify the testbench results with the FPGA. The next
release should include the UART, and test the ability of the core to
send/receive data using the FIFOs at rates of 115,200 baud or greater.

###Release 2.0

In this release, the UART has been added. An update has been made to the SPI
I/F Master function; the update corrects a fault with the framing of SPI Mode
3 frames with shift lengths greater than 1 byte. A correction, not fully
tested or verified, was made to the P16C5x core to correct anomalous behavior
for BTFSC/BTFSS instructions.

The UART has been integrated with the Release 1.0 core. Verification of the
integrated interface is underway.
 
###Release 2.1

Testing in ISim with an M16C5x core processor program assembled using MPLAB
showed that polling of the UART status register to determine whether the
transmit FIFO was empty or not (using the iTFE interrupt flag) would clear the
generated interrupt flags before they had actually been captured and shifted
in the SSP response to the core.

This indicated a clock domain crossing issue in the interrupt clearing logic.
This release fixes that issue. Previous uses of the UART did not poll the USR,
so this problem did not manifest itself in a reasonable amount of time, if
ever. In other words, the synchronization fault has been present all along in
the implementation, but the module's usage in the application (or testbench)
did not present the conditions under which the fault manifests.

The correction required registering the USR data on the SSP clock domain, and
qualifying the clearing of the interrupt flags on the basis of whether the
flag is set in both domains when the USR is read. The addition of the register
reduced the logic utilization, and only a small additional time delay was
incurred. The resulting design is still able to fit into a Spartan 3A
XC3S50A-4VQG100I FPGA.

Modified the UART Baud Rate Generator. Removed the fixed 16x12 ROM that
provided the pre-scaler and divider constants for a fixed set of 16 baud
rates. Added a 12-bit, write-only register, BRR - Baud Rate Register, that can
be used to set the baud rate from 1/16 of the processor clock. With a
58.9824 MHz oscillator, the baud rate can range from 3.6864 Mbps down to
900 bps. Set the default baud rate to 9600 for a 58.9824 MHz UART clock.
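
The quoted range is consistent with a simple linear divider, which can be
sketched in C as follows. The formula baud = clock / (16 * (BRR + 1)) is an
assumption chosen because it reproduces both endpoints given above; the actual
field layout of BRR is defined in UART_BRG.v and may differ. The names
brr_to_baud and baud_to_brr are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BRR_MAX 0x0FFFu   /* 12-bit register */

/* Assumed relation: baud = clk / (16 * (BRR + 1)). */
uint32_t brr_to_baud(uint32_t clk_hz, uint16_t brr)
{
    assert(brr <= BRR_MAX);
    return clk_hz / (16u * ((uint32_t)brr + 1u));
}

/* Nearest register value for a requested baud rate under the same
 * assumption, or -1 if the rate cannot be reached with 12 bits. */
int32_t baud_to_brr(uint32_t clk_hz, uint32_t baud)
{
    if (baud == 0)
        return -1;

    /* Round the divisor to the nearest integer. */
    uint32_t divisor = (clk_hz / 16u + baud / 2u) / baud;
    if (divisor < 1u || divisor > BRR_MAX + 1u)
        return -1;
    return (int32_t)(divisor - 1u);
}
```

Under this assumption a 58.9824 MHz clock gives 3686400 baud for BRR = 0 and
900 baud for BRR = 4095, and the 9600 baud default corresponds to BRR = 383.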
 
Utilization for a XC3S50A-4VQG100I FPGA is 100%. The 128 byte LUT-based
receive FIFO can be reduced to accommodate some additional functions.
Synthesis and MAP/PAR are able to implement the design. There is also some
place holder logic that can be used for other purposes.

###Release 2.2

Updated the soft-core so as to be able to parameterize the microcontroller
from the top module. Changed the frequency multiplication from 4 to 5 in order
to test operation at the frequency to which the UCF constrains the Map/PAR
tools. The input clock is driven by a 14.7456 MHz oscillator, and the clock
multiplier (DCM) generates **73.7280 MHz**. The default baud rate, 9600,
required that the default settings be adjusted. All other parameters remain
the same.

Also added a Block RAM Memory Map file to the project. Utilized Xilinx's
Data2MEM tool to insert modified program contents into the affected Block RAMs
using MEM files derived from standard MPLAB outputs. A tutorial on this
subject is being prepared and will be released on an associated Wiki soon.
 
###Release 2.3

Updated the soft-core microcomputer. Fixed the UART clock, Clk_UART, to twice
the input frequency. This means that the UART operates with a fixed reference
frequency, unlike Release 2.2 where Clk_UART was set to the system clock
frequency.

Also added asynchronous resets to several registers in the UART so that it
would simulate correctly with ISim. Direct control of the UART prescaler and
divider was previously untested in simulation. With that change made to the
UART baud rate generator, the reset/power-on values of these two logic
functions are unknown. The unknowns, "X", propagate through the baud rate
generator and prevent the simulator from resolving the state of the internal
baud rate clock of the UART. Thus, although the rest of the circuits simulate
as expected, the transmit shift register never shifts because there's an
"unknown" signal level applied on the bit clock.
 
###Release 2.4

Polling the UART's Receive Data Register (RDR) uncovered a race condition like
that previously found and corrected with regard to polling the UART Status
Register (USR). The correction required registering the RDR in the SCK clock
domain, and qualifying the read enable pulse for the receive FIFO so that it
is only generated if the Receive Rdy flag is present in the SCK clock domain.
Otherwise, the Receive FIFO is not read, which prevents the inadvertent
clearing of the FIFO empty flag.

Test Program 4, M16C5x_Tst4.asm, is used to test the receive signal path.
Hyperterminal and Tera Term were used to send (without local echo) several
large text files through the M16C5x UART. The test program polls the RDR, and
if a character is received without error, then upper case characters are
converted to lower case characters, and vice-versa. Using a Keyspan Quad Port
USB serial port adapter, characters were sent to the M16C5x at a rate of
921.6k baud, the highest programmable baud rate supported by the Keyspan
device. The echo back to the terminal emulator appeared to be without error.
(**Note:** _the two wire RS-232 mode of the UART was used for this test. The
ADM3232 charge-pump RS-232 transceiver appeared to work well at this
frequency. Some slew rate limiting is visible on an O-scope, but it appears to
be tolerable. These tests were conducted while the core was operating at
**117.9648 MHz**._)
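
The case conversion performed on each received character can be modelled in C
as shown below. The actual test program is PIC assembly (M16C5x_Tst4.asm);
this sketch only illustrates the transformation for ASCII letters, and the
names swap_case and echo_swapped are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Swap the case of an ASCII letter; any other character passes through. */
char swap_case(char c)
{
    if (c >= 'A' && c <= 'Z')
        return (char)(c + ('a' - 'A'));
    if (c >= 'a' && c <= 'z')
        return (char)(c - ('a' - 'A'));
    return c;
}

/* Apply swap_case to n received characters, as an echo loop would. */
void echo_swapped(const char *rx, char *tx, size_t n)
{
    assert(rx != NULL && tx != NULL);
    for (size_t i = 0; i < n; i++)
        tx[i] = swap_case(rx[i]);
}
```

Sending "Hi, 9" through echo_swapped yields "hI, 9", so a terminal user sees
letters come back with inverted case while digits and punctuation are
unchanged.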
 
This release is expected to be the last public release of this soft-core
microcomputer. The released core and peripherals are sufficient to demonstrate
a non-trivial FPGA implementation of a soft-core microcomputer. Further
developments will be focused on improving access to the internal block RAMs,
and improving the I/O capabilities of the released core.

###Release 2.5

Converted the core to operate in a single cycle mode with the block RAM
memories of the FPGA. Operating frequency, in a -4 Spartan 3A FPGA, is 60+
MHz. This rate is equivalent to the 117.9648 MHz reported above for Release
2.4. Some combinatorial path improvements were made to the processor core,
P16C5x, by using wired-OR bus connections rather than explicit multiplexers.
These improvements also provided some reductions in the resource utilization
of the project.

####Release 2.5.1

Modified the BMM file to allow the MEM file data fields to be represented in
natural order. In other words, unlike the previous release, the most
significant nibble is the first (leftmost) character of each data word, and
the least significant nibble is the last (rightmost) character in a data word.
Also modified the utility provided that converts Intel Hex programming files
into files compatible with the Xilinx Data2MEM utility program.
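
The natural nibble order described above can be illustrated with a small C
helper that formats one 12-bit program word as three hex characters, most
significant nibble first. This is a sketch under that reading of the text, not
code taken from IH2MEM.c, and the name format_mem_word is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Write a 12-bit word as three hex characters, MS nibble first, followed
 * by a terminating NUL. Returns the number of characters written. */
int format_mem_word(uint16_t word, char out[4])
{
    assert(word <= 0x0FFF);
    return snprintf(out, 4, "%03X", (unsigned)word);
}
```

For instance, the word 0xC55 is written as "C55" and the word 0x025 as "025".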
