-----------------------------------------
 USB 1.1 / 2.0 serial data transfer core
-----------------------------------------

Version:   2009-10-06
Author:    Joris van Rantwijk
Language:  VHDL
License:   GPL - GNU General Public License
Website:   http://www.xs4all.nl/~rjoris/fpga/usb.html


usb_serial is a synthesizable VHDL core, implementing serial data
transfer over USB.  Combined with a UTMI-compatible USB transceiver
chip, this core acts as a USB device that transfers a byte stream
in both directions over the bus.

This package is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.


-----------------------------------------
See MANUAL.pdf for detailed information.
-----------------------------------------


Files in this package
---------------------

 COPYING               Text of the GNU General Public License.
 MANUAL.pdf            Manual for usb_serial core.
 Makefile              Script to synthesize the VHDL code for Xilinx devices.
 usb_serial.vhdl       Main core.
 usb_control.vhdl      Sub-entity handling control requests.
 usb_init.vhdl         Sub-entity handling device initialization.
 usb_packet.vhdl       Sub-entity for sending and receiving packets.
 usb_transact.vhdl     Sub-entity for transaction handling.
 usbtest.vhdl          Sample top-level design for testing.
 te0146.ucf            Constraints file for a TE0146 FPGA module.
 testdev.py            Python program running a torture test on usbtest.bit.
 perftest.c            C program measuring data transfer performance on Linux.
 crcformula.py         Python program for computing CRC update formulas.

----

The rest of this file contains some unorganized notes.


Changes from version 2007-04-19 to 2009-10-06
---------------------------------------------

* usb_init:
  + Add generic HSSUPPORT
  + Rename USBRST to I_USBRST
  + Add output signal I_HIGHSPEED, active iff attached in high speed mode
  + Add output signal I_SUSPEND, active iff suspended by host
  + Add output P_CHIRPK
  + Add output PHY_XCVRSELECT
  + Add output PHY_TERMSELECT
  + Implement HS handshake / FS fallback protocol.
  + Implement suspend detection.

* usb_packet:
  + Add input P_CHIRPK.
  + Send continuous chirp-K when P_CHIRPK asserted.
  + Use signal s_dataout instead of variable v_dataout as register.
  + Recognize PING as a valid token packet.
  + Clear PHY_TXVALID and PHY_DATAOUT in response to RESET.
  + Pay attention to PHY_RXERROR while receiving handshake packet.
  + Eliminate ST_RFIN state and release P_RXACT one cycle earlier,
    i.e. at the same time as raising P_RXFIN. (Necessary because PHY_RXACTIVE
    may be low for just a single cycle between packets).

* usb_transact:
  + Verified that releasing P_RXACT while asserting P_RXFIN is handled fine.
  + Add generic HSSUPPORT.
  + Add output signal T_PING.
  + Add input signal T_NYET; must be valid when SEND goes down.
  + Eliminate ST_FIN so that we will always be in time to catch the rising
    edge of P_RXACT even in the cycle immediately following P_RXFIN.
  + Implement PING transaction (same application timing as IN transaction).
  + Implement NYET handshake.
  + Reduce guaranteed decision time for application from 10 to 2 cycles.
  + Separate inter-packet delay and response timeout values for FS and HS;
    increase FS inter-packet delay from 10 to 14 cycles.
  + Ignore our own transmitted packet while waiting for ACK.
  + Fail transaction if empty packet received while waiting for ACK or DATA.
  + Again rejected (after extensive consideration) the idea of using
    PHY_LINESTATE for inter-packet delay, even though this is actually
    required according to the UTMI standard. It is difficult to reliably
    relate PHY_LINESTATE to logical send/receive activity. The best I can come
    up with is to have an inter-packet timer which counts down iff the line
    is idle as indicated by PHY_LINESTATE. But detecting line idle in FS mode
    depends on the SE0-to-J transition, which makes the scheme vulnerable in
    case the SE0 state is missed somehow.
    So we stay with the concept of inter-packet timing based on PHY_RXACTIVE
    plus a much relaxed timeout for host responses.
    Note to self: please don't waste more time on this.

* usb_control:
  + Add generic HSSUPPORT.
  + Rename upstream interface signals to C_xxx.
  + Add input signal T_PING (ignored, therefore always ACK-ed).
  + Add output signal T_NYET (always driven to zero).
  + Redesigned descriptor ROM interface.
  + Implement ENDPOINT_HALT feature.
  + Implement self-powered bit in status word.

* usb_serial:
  + Changed interface to sub-entities.
  + Redesigned descriptor ROM interface.
  + Implement device_qualifier and other_speed_configuration descriptors.
  + Split single block RAM into three separate RAMs for RX buffer,
    TX buffer and descriptor ROM.
  + Streamline state machine.
  + Implement PING / NYET handshake.
  + Add RXLEN / TXROOM status signals.
  + Add TXCORK control signal.
  + Add HIGHSPEED and SUSPEND signals to application interface.
  + Prepare for separate clock domains.
  + Support halting of endpoints.

* usb_serial_wb:
  + Removed. Wishbone is not intended for this kind of thing.

* usbtest:
  + Add testing of TXCORK flag.
  + Add blast mode for test of fast streaming transmission.

* Makefile:
  + Fix command line options for newer versions of Xilinx tools.

* testdev.py:
  + Testing of TXCORK feature.
  + Adapt test parameters for bigger TX/RX buffers in the device.
  + Test partial read of incoming data.

* perftest.c:
  + Performance measurements.
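
The PHY_LINESTATE scheme rejected under usb_transact above can be
illustrated with a small software model. This is only a sketch in Python,
not code from the package: the function name and the one-sample-per-cycle
input are made up for illustration, and the 2-bit values are the usual UTMI
LineState encodings (SE0 = 00, J = 01, K = 10). It shows why the timer never
starts when the SE0 at the end of a packet is missed.

```python
# UTMI LineState encodings (2-bit value on PHY_LINESTATE).
SE0, J, K = 0b00, 0b01, 0b10

def idle_timer_expires(linestate_samples, delay_cycles):
    """Model of the rejected scheme: count down only while the line is idle.

    In full speed mode a J state alone does not mean idle, because J also
    occurs inside packet data. The line is therefore treated as idle only
    after an SE0-to-J transition (end of packet), and as busy again as soon
    as a K state appears. Returns the sample index at which the timer
    reaches zero, or None if it never does.
    """
    idle = False
    prev = J
    timer = delay_cycles
    for i, state in enumerate(linestate_samples):
        if prev == SE0 and state == J:
            idle = True            # end of packet seen: start the delay
            timer = delay_cycles
        elif state == K:
            idle = False           # bus activity: hold the timer
        if idle:
            if timer == 0:
                return i
            timer -= 1
        prev = state
    return None

# A packet ending in a proper SE0, followed by idle J.
good = [K, J, K, J, SE0, SE0, J, J, J, J, J, J]
# The same traffic with the SE0 samples missed.
missed = [K, J, K, J, J, J, J, J, J, J, J, J]

print(idle_timer_expires(good, 3))
print(idle_timer_expires(missed, 3))
```

In the second case the model never considers the line idle, so the delay
never elapses. Timing based on PHY_RXACTIVE, as the core does, avoids this
dependency.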


Performance measurements
------------------------

Version 20090929:

Performance full speed, RX 128, TX 128, libusb-1.0 async:
  RX 67108864 bytes in 61.673 s =  1088137 bytes/s
  TX 64000000 bytes in 58.816 s =  1088146 bytes/s

Performance high speed, RX 2k, TX 1k, libusb-1.0 async:
  RX 67108864 bytes in  1.490 s = 45049302 bytes/s
  TX 64000000 bytes in  1.953 s = 32766457 bytes/s


Intermediate version 20090917:
( Comparing performance of normal code against error injection. )

Performance FS, normal:
  RX 67108864 bytes in 61.674 s =  1088118 bytes/s
  TX 64000000 bytes in 58.820 s =  1088073 bytes/s

Performance HS, normal:
  RX 67108864 bytes in  1.535 s = 43727704 bytes/s
  TX 64000000 bytes in  1.961 s = 32635212 bytes/s

Performance FS, error injection:
  RX 67108864 bytes in 82.163 s =   816777 bytes/s
  TX 64000000 bytes in 78.420 s =   816113 bytes/s

Performance HS, error injection:
  RX 67108864 bytes in  1.965 s = 34144882 bytes/s
  TX 64000000 bytes in  3.110 s = 20576099 bytes/s
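
To put these numbers in perspective, the sketch below relates them to the
raw signalling rate of the bus (12 Mbit/s for full speed, 480 Mbit/s for
high speed) and computes the slowdown under error injection. It is a rough
illustration in Python, not part of perftest.c; the helper names are made
up. The raw rate is an upper bound that payload data cannot reach because
of protocol overhead. Rates recomputed from the listed times differ
slightly from the listed bytes/s because the times are rounded to
milliseconds.

```python
# Raw signalling rate of the bus in bytes per second.
BUS_RATE = {"full": 12_000_000 // 8, "high": 480_000_000 // 8}

def bus_fraction(nbytes, seconds, speed):
    """Measured payload rate as a fraction of the raw bus rate."""
    return (nbytes / seconds) / BUS_RATE[speed]

def slowdown(rate_with_errors, rate_normal):
    """Throughput with error injection relative to the normal run."""
    return rate_with_errors / rate_normal

# Version 20090929, byte counts and times copied from the list above.
print(f"FS RX: {bus_fraction(67108864, 61.673, 'full'):.1%} of raw bus rate")
print(f"HS RX: {bus_fraction(67108864, 1.490, 'high'):.1%} of raw bus rate")
print(f"HS TX: {bus_fraction(64000000, 1.953, 'high'):.1%} of raw bus rate")

# Version 20090917, bytes/s copied from the list above.
print(f"FS RX with error injection: {slowdown(816777, 1088118):.1%} of normal")
print(f"HS TX with error injection: {slowdown(20576099, 32635212):.1%} of normal")
```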


Tested
------

  + Suspend/resume with SUSPEND signal used as clock gate.
  + Verified that none of the following events occur during functional test:
    aborted transaction; duplicate OUT packet; OUT-NAK in high speed mode.
  + Deliberate error injection: works ok, but reduced performance as expected.
  + Tested SetFeature(ENDPOINT_HALT)
  + Functional test and performance test:
    + full speed, konijn, linux, RX 128, TX 128
    + full speed, konijn, linux, RX 128, TX 128, no_fullpacket
    + full speed, konijn, linux, RX 1k, TX 128
    + full speed, konijn, linux, RX 128, TX 1k (one time hang in perftest)
    + full speed, konijn, linux, RX 2k, TX 1k
    + high speed, konijn, linux, RX 1k, TX 1k  (problems with usbserial, fixed)
    + high speed, konijn, linux, RX 2k, TX 1k
    + high speed, konijn, linux, RX 2k, TX 1k, no_fullpacket
    + high speed, konijn, linux, RX 1k, TX 2k
    + high speed, konijn, linux, RX 4k, TX 2k
    + full speed, schildpad, linux, RX 128, TX 128
    + fallback to full speed, schildpad, linux, RX 2k, TX 1k
    + full speed, sron, linux
    + high speed, sron, linux
  + Limited functional test:
    + full speed, schildpad, Win2k, RX 128, TX 128 (fails due to zero length packet)
    + full speed, schildpad, Win2k, RX 128, TX 128, no_fullpacket
    + fallback to full speed, schildpad, Win2k, RX 2k, TX 1k, no_fullpacket (failed)
    + full speed, iBook, RX 128, TX 128
    + high speed, iBook, RX 2k, TX 1k
    + full speed, sron, Windows XP
    + high speed, sron, Windows XP
  + Performance:
    + full speed, konijn, linux, RX 128, TX 128
    + high speed, konijn, linux, RX 2k, TX 1k
  + Verify descriptors, device, config, qualifier, other_speed_config, status:
    + full speed, konijn, linux
    + high speed, konijn, linux
  + Test suspend/resume:
    + full speed, konijn
    + full speed, iBook
    + high speed, konijn
    + high speed, iBook
  + Plug-in handling:
    + high speed, konijn, linux
    + fallback to full speed, schildpad, linux
    + high speed, sron, Windows XP
    + high speed, iBook


Misc issues
-----------

* USB 2.0 high speed requires support of SET_FEATURE(TEST_MODE).
  We will not implement this.
  Reason: overkill, no way to test it properly.

* Suspend detection is implemented.
  The output signal SUSPEND from usb_serial can be used to combinatorially
  drive the suspend pin on the UTMI interface. Reset of the SUSPEND signal
  is asynchronous and can therefore work even when the FPGA has no clock.

* We will not implement detection of SOF packets.
  Reason: usefulness is questionable.

* No separate clock domains.
  Reason: difficult to implement, very hard to validate.

* There is a problem with empty packets under Windows 2000.
  The Windows 2000 version of usbser.dll chokes on unexpected empty packets,
  such as those sent by the device after a final full-length packet.
  This has been solved in Windows XP.

* A babble error occurs when a device sends more bytes than expected
  by the host, even if this is less than the maximum packet size.
  This may happen if software submits an IN request which is not
  a multiple of the maximum packet size. It may also happen if the host
  sends an invalid standard device request, for example GET_STATUS with
  wLength=0.
  To avoid this, always submit IN requests with the transfer size set to
  a multiple of the maximum packet size.
  Note that babble errors can freeze the host controller; this is a known
  bug of VIA UHCI controllers:
  http://www.mail-archive.com/linux-usb-devel@lists.sourceforge.net/msg17019.html

* After plugging in, the Linux kernel log shows
  "device descriptor read/64, error -62" and
  "Cannot enable port 2.  Maybe the USB cable is bad?".
  After the errors, the kernel retries and the second attempt is successful.
  It seems pretty reproducible; it occurs in FS and HS mode after plug-in,
  but not after soft-reattach of the device.
  It is worse under Win2k; the USB subsystem seems to crash after plugging in.
  Theory: Initialization of the FPGA takes longer than 100 ms,
  causing us to miss the initial port handshake.

* Even 8k TX buffer is not sufficient for loss-free transmission @ 25 MB/s.
  Loss rate becomes much higher under CPU load.
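
The advice in the babble error note above (always request a multiple of
the maximum packet size) amounts to rounding the transfer size up. A
minimal sketch in Python follows; the function name is made up for
illustration. The packet sizes in the usage lines, 64 and 512 bytes, are
the largest bulk packet sizes USB 2.0 allows in full speed and high speed.
That the core uses exactly these values is an assumption; check the
endpoint descriptors or MANUAL.pdf.

```python
def in_request_size(wanted, max_packet_size):
    """Round an IN transfer length up to a whole number of packets."""
    if wanted <= 0 or max_packet_size <= 0:
        raise ValueError("sizes must be positive")
    packets = -(-wanted // max_packet_size)   # ceiling division
    return packets * max_packet_size

# Asking for 100 bytes in full speed mode: request 128 instead.
print(in_request_size(100, 64))
# Asking for 513 bytes in high speed mode: request 1024 instead.
print(in_request_size(513, 512))
```

The host may still receive fewer bytes than requested; the rounding only
guarantees that a full-size packet from the device always fits.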


FPGA Resources
--------------

( From mapper log file; target = XC3S1000 )

Design:     usbtest-20070419
Tools:      Xilinx Webpack 7.1i

Number of errors:      0
Number of warnings:    2
Logic Utilization:
  Number of Slice Flip Flops:         301 out of  15,360    1%
  Number of 4 input LUTs:             969 out of  15,360    6%
Logic Distribution:
  Number of occupied Slices:                          573 out of   7,680    7%
    Number of Slices containing only related logic:     573 out of     573  100%
    Number of Slices containing unrelated logic:          0 out of     573    0%
      *See NOTES below for an explanation of the effects of unrelated logic
Total Number 4 input LUTs:          1,034 out of  15,360    6%
  Number used as logic:                969
  Number used as a route-thru:          65
  Number of bonded IOBs:               31 out of     173   17%
    IOB Flip Flops:                    27
  Number of Block RAMs:                2 out of      24    8%
  Number of GCLKs:                     1 out of       8   12%

Total equivalent gate count for design:  140,539

----

Design:     usbtest-20090927, full speed, RX 128, TX 128
Tools:      Xilinx Webpack 7.1i

Number of errors:      0
Number of warnings:    2
Logic Utilization:
  Number of Slice Flip Flops:         337 out of  15,360    2%
  Number of 4 input LUTs:           1,151 out of  15,360    7%
Logic Distribution:
  Number of occupied Slices:                          671 out of   7,680    8%
    Number of Slices containing only related logic:     671 out of     671  100%
    Number of Slices containing unrelated logic:          0 out of     671    0%
      *See NOTES below for an explanation of the effects of unrelated logic
Total Number 4 input LUTs:          1,249 out of  15,360    8%
  Number used as logic:              1,151
  Number used as a route-thru:          98
  Number of bonded IOBs:               31 out of     173   17%
    IOB Flip Flops:                    31
  Number of Block RAMs:                4 out of      24   16%
  Number of GCLKs:                     1 out of       8   12%

Total equivalent gate count for design:  273,110

----

Design:     usbtest-20090929, high speed, RX 2k, TX 1k
Tools:      Xilinx Webpack 7.1i

Number of errors:      0
Number of warnings:    2
Logic Utilization:
  Number of Slice Flip Flops:         380 out of  15,360    2%
  Number of 4 input LUTs:           1,349 out of  15,360    8%
Logic Distribution:
  Number of occupied Slices:                          787 out of   7,680   10%
    Number of Slices containing only related logic:     787 out of     787  100%
    Number of Slices containing unrelated logic:          0 out of     787    0%
      *See NOTES below for an explanation of the effects of unrelated logic
Total Number 4 input LUTs:          1,465 out of  15,360    9%
  Number used as logic:              1,349
  Number used as a route-thru:         116
  Number of bonded IOBs:               31 out of     173   17%
    IOB Flip Flops:                    34
  Number of Block RAMs:                4 out of      24   16%
  Number of GCLKs:                     1 out of       8   12%

Total equivalent gate count for design:  274,894

----

Design:     usb_serial only, 20090929, full speed, RX 128, TX 128
Tools:      Xilinx Webpack 7.1i

Number of errors:      0
Number of warnings:    2
Logic Utilization:
  Number of Slice Flip Flops:         235 out of  15,360    1%
  Number of 4 input LUTs:             841 out of  15,360    5%
Logic Distribution:
  Number of occupied Slices:                          479 out of   7,680    6%
    Number of Slices containing only related logic:     479 out of     479  100%
    Number of Slices containing unrelated logic:          0 out of     479    0%
      *See NOTES below for an explanation of the effects of unrelated logic
Total Number 4 input LUTs:            899 out of  15,360    5%
  Number used as logic:                841
  Number used as a route-thru:          58
  Number of bonded IOBs:               69 out of     173   39%
    IOB Flip Flops:                    33
  Number of Block RAMs:                3 out of      24   12%
  Number of GCLKs:                     1 out of       8   12%

Total equivalent gate count for design:  204,527

----

Design:     usb_serial only, 20090929, high speed, RX 2k, TX 1k
Tools:      Xilinx Webpack 7.1i

Number of errors:      0
Number of warnings:    2
Logic Utilization:
  Number of Slice Flip Flops:         285 out of  15,360    1%
  Number of 4 input LUTs:           1,062 out of  15,360    6%
Logic Distribution:
  Number of occupied Slices:                          610 out of   7,680    7%
    Number of Slices containing only related logic:     610 out of     610  100%
    Number of Slices containing unrelated logic:          0 out of     610    0%
      *See NOTES below for an explanation of the effects of unrelated logic
Total Number 4 input LUTs:          1,139 out of  15,360    7%
  Number used as logic:              1,062
  Number used as a route-thru:          77
  Number of bonded IOBs:               76 out of     173   43%
    IOB Flip Flops:                    36
  Number of Block RAMs:                3 out of      24   12%
  Number of GCLKs:                     1 out of       8   12%

Total equivalent gate count for design:  206,559

----

Design:     usb_serial only, 20090929, full speed, RX 128, TX 128
Tools:      Xilinx ISE 11.2

Number of errors:      0
Number of warnings:    1
Logic Utilization:
  Number of Slice Flip Flops:           227 out of  15,360    1%
  Number of 4 input LUTs:               808 out of  15,360    5%
Logic Distribution:
  Number of occupied Slices:            459 out of   7,680    5%
    Number of Slices containing only related logic:     459 out of     459 100%
    Number of Slices containing unrelated logic:          0 out of     459   0%
      *See NOTES below for an explanation of the effects of unrelated logic.
  Total Number of 4 input LUTs:         835 out of  15,360    5%
    Number used as logic:               808
    Number used as a route-thru:         27
  The Slice Logic Distribution report is not meaningful if the design is
  over-mapped for a non-slice resource or if Placement fails.
  Number of bonded IOBs:                 69 out of     173   39%
    IOB Flip Flops:                      27
  Number of RAMB16s:                      3 out of      24   12%
  Number of BUFGMUXs:                     1 out of       8   12%

----

Design:     usb_serial only, 20090929, high speed, RX 2k, TX 1k
Tools:      Xilinx ISE 11.2

Number of errors:      0
Number of warnings:    2
Logic Utilization:
  Number of Slice Flip Flops:           265 out of  15,360    1%
  Number of 4 input LUTs:               955 out of  15,360    6%
Logic Distribution:
  Number of occupied Slices:            555 out of   7,680    7%
    Number of Slices containing only related logic:     555 out of     555 100%
    Number of Slices containing unrelated logic:          0 out of     555   0%
      *See NOTES below for an explanation of the effects of unrelated logic.
  Total Number of 4 input LUTs:       1,010 out of  15,360    6%
    Number used as logic:               955
    Number used as a route-thru:         55
  The Slice Logic Distribution report is not meaningful if the design is
  over-mapped for a non-slice resource or if Placement fails.
  Number of bonded IOBs:                 76 out of     173   43%
    IOB Flip Flops:                      32
  Number of RAMB16s:                      3 out of      24   12%
  Number of BUFGMUXs:                     1 out of       8   12%

----
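
When comparing the reports above, note that the percentages appear to be
truncated rather than rounded (for example 301 out of 15,360 is listed as
1%, although it is about 1.96%). The sketch below, in Python and with
made-up helper names, reproduces the listed slice percentages for the
usb_serial-only builds and computes how many extra slices the high speed,
RX 2k, TX 1k configuration needs compared to full speed, RX 128, TX 128.
The numbers are copied from the reports; nothing here comes from the
Xilinx tools themselves.

```python
TOTAL_SLICES = 7680  # XC3S1000, as listed in the reports

def percent(used, total):
    """Utilization truncated to a whole percent, as the reports list it."""
    return used * 100 // total

# Occupied slices of the usb_serial-only builds, version 20090929.
slices = {
    ("Webpack 7.1i", "full speed"): 479,
    ("Webpack 7.1i", "high speed"): 610,
    ("ISE 11.2", "full speed"): 459,
    ("ISE 11.2", "high speed"): 555,
}

def hs_cost(tools):
    """Extra slices of the high speed build over the full speed build."""
    return slices[(tools, "high speed")] - slices[(tools, "full speed")]

for (tools, speed), used in slices.items():
    print(f"{tools:13s} {speed}: {used} slices, {percent(used, TOTAL_SLICES)}%")

for tools in ("Webpack 7.1i", "ISE 11.2"):
    print(f"{tools}: high speed costs {hs_cost(tools)} extra slices")
```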
