MPEG-2 Decoder User Guide
 
Koenraad De Vleeschauwer
kdv@kdvelectronics.eu

  Copyright Notice

Copyright 2007-2009, Koenraad De Vleeschauwer.

Redistribution and use in source (LyX format) and "compiled"
forms (PDF, PostScript, HTML, RTF, etc.), with or without
modification, are permitted provided that the following
conditions are met:

1. Redistributions of source code (LyX format) must retain the
  above copyright notice, this list of conditions and the
  following disclaimer.

2. Redistributions in compiled form (transformed to other DTDs,
  converted to PDF, PostScript, HTML, RTF, and other formats)
  must reproduce the above copyright notice, this list of
  conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

3. The name of the author may not be used to endorse or promote
  products derived from this documentation without specific prior
  written permission.

This documentation is provided by the author "as is" and any
express or implied warranties, including, but not limited to, the
implied warranties of merchantability and fitness for a
particular purpose are disclaimed. In no event shall the author
be liable for any direct, indirect, incidental, special,
exemplary, or consequential damages (including, but not limited
to, procurement of substitute goods or services; loss of use,
data, or profits; or business interruption) however caused and on
any theory of liability, whether in contract, strict liability,
or tort (including negligence or otherwise) arising in any way
out of the use of this documentation, even if advised of the
possibility of such damage.

  MPEG-2 License Notice

Commercial implementations of MPEG-1 and MPEG-2 video, including
shareware, are subject to royalty fees to patent holders. Many of
these patents are general enough such that they are unavoidable
regardless of implementation design.

MPEG-2 intermediate product. Use of this product in any manner
that complies with the MPEG-2 standard is expressly prohibited
without a license under applicable patents in the MPEG-2 patent
portfolio, which license is available from MPEG LA, L.L.C., 250
Stelle Street, suite 300, Denver, Colorado 80206.
 
Table of Contents

Copyright Notice
MPEG-2 License Notice
Chapter 1 Processor Interface
1.1 Decoder Block Diagram
1.2 Ports
1.2.1 Clocks
1.2.2 Reset
1.2.3 Stream Input
1.2.4 Register File Access
1.2.5 Memory Controller
1.2.6 Memory Request FIFO
1.2.7 Memory Response FIFO
1.2.8 Video Output
1.2.9 Test Point
1.2.10 Status
1.3 Processor Tasks
1.4 Registers
1.5 Read-only Registers
1.6 On-Screen Display
1.7 Frame Store
1.8 Video Modeline
1.9 Interrupts
1.10 Watchdog
1.11 Trick mode
1.12 Test point
Chapter 2 Decoder Sources
2.1 Source Directory Structure
2.2 MPEG2 Decoder
2.2.1 FIFO sizes
2.2.2 Dual-ported memory and FIFO models
2.2.3 Memory mapping
2.2.4 Modeline
2.2.5 Inverse Discrete Cosine Transform
2.2.6 Bilinear chroma upsampling
2.3 Simulation
2.3.1 Icarus Verilog Simulation
2.3.2 Conformance Tests
2.4 Tools
2.4.1 Logic Analyzer
2.4.2 Finite State Machine Graphs
2.4.3 IEEE-1180 IDCT Accuracy Test
2.4.4 Reference software decoder
2.4.5 MPEG2 Test Streams
 
Chapter 1 Processor Interface

An MPEG2 decoder, implemented in Verilog, is presented. Chapter 1
describes the decoder for the software engineer who wishes to
write a device driver.

1.1 Decoder Block Diagram

[Figure 1.1: Decoder Block Diagram]
 
Figure 1.1 shows the MPEG2 decoder block
diagram. An external source such as a DVB tuner or DVD drive
provides an MPEG2 stream. The video elementary stream is
extracted and sent to the decoder. The video buffer acts as a
FIFO between the incoming MPEG2 video stream and the variable
length decoder. The video buffer evens out temporary differences
between the bitrate of the incoming MPEG2 bitstream and the
bitrate at which the decoder parses the bitstream.

The MPEG2 codec is a variable length codec; codewords which occur
often occupy fewer bits than codewords which occur only rarely.
Getbits provides a sliding window over the incoming stream. As
the codewords have a variable length, the sliding window moves
forward a variable number of bits at a time.
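
The sliding window can be pictured with a small software model. The
sketch below is illustrative only and is not taken from the decoder
sources: the names bit_reader, peek_bits and skip_bits are
hypothetical, and it assumes bits are consumed most significant bit
first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of a getbits-style sliding window. */
typedef struct {
    const uint8_t *data;  /* byte-aligned stream      */
    size_t size;          /* stream length in bytes   */
    size_t pos;           /* current position in bits */
} bit_reader;

/* Return the next n bits (1 <= n <= 24) without consuming them.
   Bits past the end of the stream read as zero. */
static uint32_t peek_bits(const bit_reader *br, unsigned n)
{
    assert(n >= 1 && n <= 24);
    uint32_t window = 0;
    size_t byte = br->pos / 8;
    /* Load four bytes, most significant byte first. */
    for (size_t i = 0; i < 4; i++) {
        window <<= 8;
        if (byte + i < br->size)
            window |= br->data[byte + i];
    }
    /* Drop the bits already consumed in the first byte, keep n bits. */
    window <<= br->pos % 8;
    return window >> (32 - n);
}

/* Move the window forward by n bits, the length of the codeword
   that was just decoded. */
static void skip_bits(bit_reader *br, unsigned n)
{
    br->pos += n;
}
```

With the four-byte stream 00 00 01 b3, peeking 24 bits at position 0
yields 0x000001; after skipping those 24 bits, peeking 8 bits yields
0xb3.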
 
Variable length decoding does the actual parsing of the
bitstream. Variable length decoding stores stream parameters such
as horizontal and vertical resolution, and produces run/length
values and motion vectors. Run/length values and motion vectors
are different ways of describing an image. The run/length values
describe an image as compressed data contained within the
bitstream. The motion vectors describe an image as a mosaic of
already decoded images.

Run-length decoding, inverse quantizing and inverse discrete
cosine transform decompress the run/length values.

Motion compensation retrieves already decoded images from memory
and applies the motion vector translations.

The reconstructed image is the sum of the decompressed run/length
values and translated pieces of already decoded images. The
reconstructed image is stored in the frame store for later
display and reference.
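
As a sketch of this sum (not the decoder's actual datapath), each
reconstructed sample is the motion-compensated prediction plus the
decoded residual, saturated to the 8-bit sample range as ISO/IEC
13818-2 specifies. The function name reconstruct_sample is
hypothetical.

```c
#include <stdint.h>

/* prediction: sample fetched by motion compensation (0..255).
   residual:   inverse DCT output for the same position; may be
               negative. */
static uint8_t reconstruct_sample(uint8_t prediction, int16_t residual)
{
    int sum = (int)prediction + (int)residual;
    if (sum < 0)
        return 0;      /* saturate low  */
    if (sum > 255)
        return 255;    /* saturate high */
    return (uint8_t)sum;
}
```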
 
The frame store receives requests to store and retrieve pixels
from three different sources:

• motion compensation, which writes reconstructed image frames to
  memory

• chroma resampling, which reads reconstructed image frames from
  memory for displaying

• writes to the on-screen display, under software control.

Some of these blocks have multiple accesses to the frame store.
Within the MPEG2 decoder a total of six memory read or write
requests may occur simultaneously. The frame store prioritizes
these requests and serializes them into a single stream of memory
read/write requests, which is sent to the memory controller.

The memory controller is external to the MPEG2 decoder. The
memory controller handles the low-level details of interfacing
with the memory chips. If memory is static RAM, interfacing
requires little more than a buffer; dynamic memory requires a
more complex controller.

The MPEG2 decoder accepts 4:2:0 format video, in which color and
brightness information have a different resolution: color
information (chrominance) is sent at half the horizontal and half
the vertical resolution of brightness information (luminance).
This makes sense because the human eye uses different mechanisms
to perceive color and brightness; and the different mechanisms
used have different sensitivities.

Sending color information at half the horizontal and half the
vertical resolution of brightness information implies the
reconstructed image in the frame store has only one color pixel
for every four brightness pixels. Assigning the same color
information to the four pixels of brightness information would
result in a chunky image. Chroma resampling does horizontal and
vertical interpolation of the color information, resulting in a
smooth color image.
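
To illustrate the interpolation, the sketch below doubles the
horizontal resolution of one row of chroma samples by linear
interpolation. It is a simplified model under stated assumptions: the
function name upsample_row is hypothetical, output samples alternate
between a copy of an input sample and the rounded average of two
neighbours, and the last sample is repeated at the right edge. The
sample positions and filter weights used by the decoder itself may
differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Write 2*n output samples to out, computed from the n input
   samples in in. */
static void upsample_row(const uint8_t *in, size_t n, uint8_t *out)
{
    for (size_t i = 0; i < n; i++) {
        /* Right neighbour; repeat the last sample at the edge. */
        uint8_t next = (i + 1 < n) ? in[i + 1] : in[i];
        out[2 * i]     = in[i];
        out[2 * i + 1] = (uint8_t)((in[i] + next + 1) / 2);
    }
}
```

Applied to the row 10, 20, 40 this produces 10, 15, 20, 30, 40, 40.
The same step applied along columns gives the vertical interpolation.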
 
A dot clock marks the frequency at which pixels are sent to the
display. The dot clock is external to the MPEG2 decoder and can
be either free running or synchronized to another clock.

The video synchronization generator counts pixels, lines and
image frames at the dot clock frequency. At any given moment, the
video synchronization generator knows the horizontal and vertical
coordinate of the pixel to be displayed.

The pixels generated in chroma resampling and the coordinates
generated by the video synchronization generator are joined in
the mixer. The result is a stream of pixels, at the current
horizontal/vertical coordinate, at the dot clock frequency.

At this point the on-screen display is added. The on-screen
display has the same resolution as the video and uses a 256-color
palette. Software can choose to put the on-screen display on top,
completely hiding the video; or to blend on-screen display and
video, as if they were two translucent glass plates.

The MPEG2 decoder works with chrominance (color) and luminance
(brightness) information throughout. The final step is converting
chrominance and luminance to red, green and blue in yuv2rgb. The
red, green and blue information is the output of the decoder.
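
The conversion can be sketched in software as follows. This is an
assumption-laden illustration, not the yuv2rgb implementation: it
uses a common 8-bit fixed-point approximation of the ITU-R BT.601
matrix for studio-range input (luminance 16..235, chrominance
16..240), whereas the coefficients a stream calls for depend on its
matrix_coefficients value. The type rgb8 and the function names are
hypothetical.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b; } rgb8;

/* Convert a value scaled by 256 to the range 0..255. */
static uint8_t scale_clamp(int32_t v)
{
    if (v < 0)
        return 0;
    v >>= 8;
    return (uint8_t)(v > 255 ? 255 : v);
}

/* y: luminance, cb: blue-difference chroma, cr: red-difference chroma. */
static rgb8 ycbcr_to_rgb(uint8_t y, uint8_t cb, uint8_t cr)
{
    int32_t c = (int32_t)y  - 16;
    int32_t d = (int32_t)cb - 128;
    int32_t e = (int32_t)cr - 128;
    rgb8 out;
    out.r = scale_clamp(298 * c           + 409 * e + 128);
    out.g = scale_clamp(298 * c - 100 * d - 208 * e + 128);
    out.b = scale_clamp(298 * c + 516 * d           + 128);
    return out;
}
```

With these coefficients, (y, cb, cr) = (16, 128, 128) maps to black
and (235, 128, 128) maps to white.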
 
1.2 Ports

Table 1.2 lists MPEG2 decoder input/output ports.

+-------------------------+-------+--------------------------------+------+---------+
|          Port           | Bits  |          Description           | I/O  |  Clock  |
+-------------------------+-------+--------------------------------+------+---------+
+-------------------------+-------+--------------------------------+------+---------+
|          clk            |  1    |         Decoder clock          |  I   |    -    |
+-------------------------+-------+--------------------------------+------+---------+
|        dot_clk          |  1    |          Video clock           |  I   |    -    |
+-------------------------+-------+--------------------------------+------+---------+
|        mem_clk          |  1    |    Memory Controller clock     |  I   |    -    |
+-------------------------+-------+--------------------------------+------+---------+
|          rst            |  1    |             Reset              |  I   |    -    |
+-------------------------+-------+--------------------------------+------+---------+
|      stream_data        |  8    |     Elementary stream data     |  I   |   clk   |
+-------------------------+-------+--------------------------------+------+---------+
|      stream_valid       |  1    |       stream_data valid        |  I   |   clk   |
+-------------------------+-------+--------------------------------+------+---------+
|          busy           |  1    |       Decoder busy flag        |  O   |   clk   |
+-------------------------+-------+--------------------------------+------+---------+
|        reg_addr         |  4    |       Register address         |  I   |   clk   |
+-------------------------+-------+--------------------------------+------+---------+
|       reg_dta_in        |  32   |      Register write data       |  I   |   clk   |
+-------------------------+-------+--------------------------------+------+---------+
|       reg_wr_en         |  1    |     Register write enable      |  I   |   clk   |
+-------------------------+-------+--------------------------------+------+---------+
|      reg_dta_out        |  32   |      Register read data        |  O   |   clk   |
+-------------------------+-------+--------------------------------+------+---------+
|       reg_rd_en         |  1    |     Register read enable       |  I   |   clk   |
+-------------------------+-------+--------------------------------+------+---------+
|         error           |  1    |      Decoding error flag       |  O   |   clk   |
+-------------------------+-------+--------------------------------+------+---------+
|       interrupt         |  1    |           Interrupt            |  O   |   clk   |
+-------------------------+-------+--------------------------------+------+---------+
|      watchdog_rst       |  1    |   Watchdog-generated Reset     |  O   |   clk   |
+-------------------------+-------+--------------------------------+------+---------+
|           r             |  8    |              Red               |  O   | dot_clk |
+-------------------------+-------+--------------------------------+------+---------+
|           g             |  8    |             Green              |  O   | dot_clk |
+-------------------------+-------+--------------------------------+------+---------+
|           b             |  8    |             Blue               |  O   | dot_clk |
+-------------------------+-------+--------------------------------+------+---------+
|           y             |  8    |          Y Luminance           |  O   | dot_clk |
+-------------------------+-------+--------------------------------+------+---------+
|           u             |  8    |        Cr Chrominance          |  O   | dot_clk |
+-------------------------+-------+--------------------------------+------+---------+
|           v             |  8    |        Cb Chrominance          |  O   | dot_clk |
+-------------------------+-------+--------------------------------+------+---------+
|        pixel_en         |  1    |         Pixel enable           |  O   | dot_clk |
+-------------------------+-------+--------------------------------+------+---------+
|         h_sync          |  1    |  Horizontal synchronization    |  O   | dot_clk |
+-------------------------+-------+--------------------------------+------+---------+
|         v_sync          |  1    |   Vertical synchronization     |  O   | dot_clk |
+-------------------------+-------+--------------------------------+------+---------+
|         c_sync          |  1    |   Composite synchronization    |  O   | dot_clk |
+-------------------------+-------+--------------------------------+------+---------+
|     mem_req_rd_cmd      |  2    |    Memory request command      |  O   | mem_clk |
+-------------------------+-------+--------------------------------+------+---------+
|    mem_req_rd_addr      |  22   |    Memory request address      |  O   | mem_clk |
+-------------------------+-------+--------------------------------+------+---------+
|     mem_req_rd_dta      |  64   |      Memory request data       |  O   | mem_clk |
+-------------------------+-------+--------------------------------+------+---------+
|     mem_req_rd_en       |  1    |  Memory request read enable    |  I   | mem_clk |
+-------------------------+-------+--------------------------------+------+---------+
|    mem_req_rd_valid     |  1    |     Memory request valid       |  O   | mem_clk |
+-------------------------+-------+--------------------------------+------+---------+
|     mem_res_wr_dta      |  64   |     Memory response data       |  I   | mem_clk |
+-------------------------+-------+--------------------------------+------+---------+
|     mem_res_wr_en       |  1    |    Memory response enable      |  I   | mem_clk |
+-------------------------+-------+--------------------------------+------+---------+
| mem_res_wr_almost_full  |  1    |  Memory response almost full   |  O   | mem_clk |
+-------------------------+-------+--------------------------------+------+---------+
|    testpoint_dip_en     |  1    | Testpoint dip switches enable  |  I   |    -    |
+-------------------------+-------+--------------------------------+------+---------+
|     testpoint_dip       |  4    |    Testpoint dip switches      |  I   |    -    |
+-------------------------+-------+--------------------------------+------+---------+
|       testpoint         |  34   |   Logic analyzer test point    |  O   |    -    |
+-------------------------+-------+--------------------------------+------+---------+

[Table 1.2: Ports]
 
1.2.1 Clocks

Up to three different clocks may be supplied to the MPEG2
decoder.

clk Main decoder clock, input.

dot_clk Video clock, input. Variable frequency, varying with
current video modeline.

mem_clk Memory Controller Clock, input.

The decoder produces pixels at a maximum rate of one per clk
cycle.

1.2.2 Reset

rst Asynchronous reset, input, active low, internally
synchronized.

1.2.3 Stream Input

stream_data 8-bit elementary stream data, input, synchronous with
clk, byte aligned. The elementary stream is an MPEG2 4:2:0 video
elementary stream.

stream_valid elementary stream data valid, input, synchronous
with clk. Assert when stream_data is valid.

busy decoder busy flag, active high, output, synchronous with clk.
When high, indicates that keeping stream_valid high will overflow
the decoder input buffers.

1.2.4 Register File Access

reg_addr 4-bit register address, input, synchronous with clk.

reg_dta_in 32-bit register data in, input, synchronous with clk.

reg_wr_en register write enable, input, active high, synchronous
with clk. Assert to write reg_dta_in to reg_addr.

reg_dta_out 32-bit register data out, output, synchronous with
clk.

reg_rd_en Active high register read enable, input, synchronous
with clk. Assert to obtain the contents of register reg_addr at
reg_dta_out.
 
1.2.5 Memory Controller

The interface between MPEG2 decoder and memory controller
consists of two FIFOs. The memory request FIFO sends memory read,
write or refresh requests from decoder to memory controller. The
memory response FIFO sends data read from memory controller to
MPEG2 decoder. The data from the memory read requests appears in
the memory response FIFO in the same order as the memory reads
were issued in the memory request FIFO.

1.2.6 Memory Request FIFO

mem_req_rd_cmd memory request command, output, synchronous with
mem_clk. Valid values are defined in Table 1.3.

+-----------------+--------------+--------------------+
| mem_req_rd_cmd  |  Mnemonic    |    Description     |
+-----------------+--------------+--------------------+
+-----------------+--------------+--------------------+
|       0         |  CMD_NOOP    |    No operation    |
+-----------------+--------------+--------------------+
|       1         | CMD_REFRESH  |   Refresh memory   |
+-----------------+--------------+--------------------+
|       2         |  CMD_READ    |  Read 64-bit word  |
+-----------------+--------------+--------------------+
|       3         |  CMD_WRITE   | Write 64-bit word  |
+-----------------+--------------+--------------------+

[Table 1.3: Memory controller commands]

mem_req_rd_addr 22-bit memory request address, output,
synchronous with mem_clk.

mem_req_rd_dta 64-bit memory request data, output, synchronous
with mem_clk.

mem_req_rd_en memory request read enable, input, active high,
synchronous with mem_clk.

mem_req_rd_valid memory request read valid, output, active high,
synchronous with mem_clk. Indicates when mem_req_rd_cmd,
mem_req_rd_addr and mem_req_rd_dta have meaningful values.

1.2.7 Memory Response FIFO

mem_res_wr_dta 64-bit memory response write data, input,
synchronous with mem_clk.

mem_res_wr_en memory response write enable, input, active high,
synchronous with mem_clk. Assert to write mem_res_wr_dta to the
memory response FIFO.

mem_res_wr_almost_full memory response write almost full, output,
active high, synchronous with mem_clk. When high, indicates that
keeping mem_res_wr_en high will overflow the memory response
FIFO. The current clock cycle can be completed without
overflowing the memory response FIFO.
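
The request/response protocol can be modelled in software. The sketch
below is a behavioural model written for illustration only: the type
and function names are hypothetical, the memory is a small array of
64-bit words, and clocking, FIFO depth and the almost-full handshake
are ignored. The command encodings are those of the memory controller
commands table in section 1.2.6.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum mem_cmd { CMD_NOOP = 0, CMD_REFRESH = 1, CMD_READ = 2, CMD_WRITE = 3 };

#define MODEL_WORDS 1024u  /* model size only; the real address is 22 bits */
#define RES_DEPTH   16u    /* arbitrary depth chosen for this model        */

typedef struct {
    uint64_t mem[MODEL_WORDS];
    uint64_t res[RES_DEPTH];  /* response FIFO contents, oldest first */
    size_t   res_count;
} mem_model;

/* Service one entry taken from the memory request FIFO. */
static void mem_service(mem_model *m, enum mem_cmd cmd,
                        uint32_t addr, uint64_t dta)
{
    assert(addr < MODEL_WORDS);
    switch (cmd) {
    case CMD_WRITE:
        m->mem[addr] = dta;
        break;
    case CMD_READ:
        /* Read data is returned in the order the reads were issued. */
        assert(m->res_count < RES_DEPTH);
        m->res[m->res_count++] = m->mem[addr];
        break;
    case CMD_REFRESH:  /* nothing to do for an array model */
    case CMD_NOOP:
        break;
    }
}
```

Only CMD_READ produces an entry in the response FIFO; writes,
refreshes and no-ops complete without a response.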
 
1.2.8 Video Output

r red component, output, synchronous with dot_clk.

g green component, output, synchronous with dot_clk.

b blue component, output, synchronous with dot_clk.

y Y luminance, output, synchronous with dot_clk.

u Cr chrominance, output, synchronous with dot_clk.

v Cb chrominance, output, synchronous with dot_clk.

pixel_en pixel enable, output, active high, synchronous with
dot_clk. When pixel_en is high, r, g, b, y, u and v are valid;
when pixel_en is low video is blanked.

h_sync horizontal synchronization, output, active high,
synchronous with dot_clk.

v_sync vertical synchronization, output, active high, synchronous
with dot_clk.

c_sync composite synchronization, output, active low, synchronous
with dot_clk.

1.2.9 Test Point

The decoder provides a test point for connecting a logic
analyzer. The signals available at the test point can be selected
either by software control, or using dip switches. The signals
available at the test point are not defined as part of this
specification, may vary even for implementations with the same
status register version number and are subject to change without
notice. See Verilog source probe.v for details.

testpoint_dip_en 1-bit input. If testpoint_dip_en is high, the
registers visible at testpoint are selected using testpoint_dip.
If testpoint_dip_en is low, the registers visible at testpoint
output are selected using the testpoint_sel field of register 15.

testpoint_dip 4-bit input. testpoint_dip selects test point
output if testpoint_dip_en is high.

testpoint 34-bit output. testpoint is a test point to connect a
34-channel logic analyzer probe to the MPEG2 decoder. Up to 16
different sets of signals are available, hardware selectable
using the testpoint_dip dip switches or software selectable by
writing to register 15. Any clocks present are on bits 32 and/or
33; bits 0 to 31 are data only. Bits 0 to 31 can also be accessed
by software, by reading register 15.

1.2.10 Status

error error, output, active high, synchronous with clk. Indicates
variable length decoding encountered an error in the bitstream.

interrupt interrupt, output, active high, synchronous with clk.
Reading the status register allows software to determine the
cause of the interrupt, and will clear the interrupt.

watchdog_rst watchdog-generated reset signal, output, active low,
synchronous with clk. Normally high; low during one clock cycle
if the watchdog timer expires.
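
An interrupt handler might decode the status register as sketched
below. The bit positions are those listed for read-mode register 1
(status) in section 1.4; the struct, the function name and the way
the 32-bit value is obtained from the hardware are hypothetical.
Because reading the status register clears the interrupt, the value
should be read once and decoded from the saved copy.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t matrix_coefficients;  /* bits 15-8 */
    bool watchdog_status;         /* bit 7 */
    bool osd_wr_en;               /* bit 6 */
    bool osd_wr_ack;              /* bit 5 */
    bool osd_wr_full;             /* bit 4 */
    bool picture_hdr;             /* bit 3 */
    bool frame_end;               /* bit 2 */
    bool video_ch;                /* bit 1 */
    bool error;                   /* bit 0 */
} decoder_status;

/* Split a saved copy of the status register into its fields. */
static decoder_status decode_status(uint32_t reg)
{
    decoder_status s;
    s.matrix_coefficients = (uint8_t)((reg >> 8) & 0xFFu);
    s.watchdog_status = ((reg >> 7) & 1u) != 0;
    s.osd_wr_en       = ((reg >> 6) & 1u) != 0;
    s.osd_wr_ack      = ((reg >> 5) & 1u) != 0;
    s.osd_wr_full     = ((reg >> 4) & 1u) != 0;
    s.picture_hdr     = ((reg >> 3) & 1u) != 0;
    s.frame_end       = ((reg >> 2) & 1u) != 0;
    s.video_ch        = ((reg >> 1) & 1u) != 0;
    s.error           = (reg & 1u) != 0;
    return s;
}
```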
 
1.3 Processor Tasks

To decode an MPEG-2 bitstream, the processor should execute the
following tasks, in order:

1. Initialize the horizontal, horizontal sync, vertical, vertical
  sync and video mode registers with reasonable defaults. Clear
  osd_enable, picture_hdr_intr_en and frame_end_intr_en. Set the
  video_ch_intr_en flag.

2. Start feeding the MPEG-2 bitstream to the stream_data port of
  the decoder.

3. The decoder will issue an interrupt when video resolution or
  frame rate changes. Whenever the decoder issues an interrupt,
  clear the interrupt by reading the status register. Read the
  size, display size and frame rate registers. Calculate a new
  modeline, change dot clock frequency if necessary, and write
  the new video timing parameters to the horizontal, horizontal
  sync, vertical, vertical sync and video mode registers.

4. At bitstream end, append the sequence end code, hex 000001b7,
  to the stream eight times (ISO/IEC 13818-2, par. 6.2.1, Start
  Codes).
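
Tasks 1, 3 and 4 largely come down to packing and unpacking register
fields and padding the stream. The sketch below uses the field
positions given in the register tables of section 1.4 (write-mode
register 0, stream; read-mode register 2, size; read-mode register 4,
frame rate). All function and type names are hypothetical. The frame
rate formula and the frame_rate_code values are taken from ISO/IEC
13818-2, not from this guide. How registers are actually read and
written depends on the processor bus and is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Task 1: value for write-mode register 0 (stream) with only
   video_ch_intr_en set. */
static uint32_t stream_reg_init(uint8_t watchdog_interval)
{
    return ((uint32_t)watchdog_interval << 8)  /* bits 15-8            */
         | (0u << 3)                           /* osd_enable           */
         | (0u << 2)                           /* picture_hdr_intr_en  */
         | (0u << 1)                           /* frame_end_intr_en    */
         | (1u << 0);                          /* video_ch_intr_en     */
}

/* Task 3: fields of read-mode register 2 (size). */
static unsigned horizontal_size(uint32_t reg) { return (reg >> 16) & 0x3FFFu; }
static unsigned vertical_size(uint32_t reg)   { return reg & 0x3FFFu; }

typedef struct { uint32_t num, den; } rational;

/* Task 3: frame rate in frames per second as num/den, from read-mode
   register 4. Returns 0/1 for forbidden or reserved codes. */
static rational frame_rate(uint32_t reg)
{
    static const rational base[9] = {
        {0, 1}, {24000, 1001}, {24, 1}, {25, 1}, {30000, 1001},
        {30, 1}, {50, 1}, {60000, 1001}, {60, 1}
    };
    unsigned code  = reg & 0xFu;          /* frame_rate_code, bits 3-0         */
    unsigned ext_n = (reg >> 4) & 0x3u;   /* frame_rate_extension_n, bits 5-4  */
    unsigned ext_d = (reg >> 6) & 0x1Fu;  /* frame_rate_extension_d, bits 10-6 */
    rational r = {0, 1};
    if (code >= 1 && code <= 8) {
        r.num = base[code].num * (ext_n + 1);
        r.den = base[code].den * (ext_d + 1);
    }
    return r;
}

/* Task 4: append eight sequence end codes at buf[len]. The caller
   must provide room for 32 more bytes. Returns the new length. */
static size_t append_sequence_end_codes(uint8_t *buf, size_t len)
{
    static const uint8_t end_code[4] = { 0x00, 0x00, 0x01, 0xB7 };
    for (int i = 0; i < 8; i++)
        for (int j = 0; j < 4; j++)
            buf[len++] = end_code[j];
    return len;
}
```

For a 720 by 576 stream with frame_rate_code 3 and both extension
fields zero, horizontal_size and vertical_size return 720 and 576,
and frame_rate returns 25/1.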
 
If the On-Screen Display (OSD) is used, the processor should
execute the following tasks as well:

1. Initialize the On-Screen Display color look-up table.

2. Wait until horizontal_size and vertical_size have meaningful
  values.

3. Write to the On-Screen Display.

4. Set osd_enable to one.

5. If a video change interrupt occurs, and horizontal_size or
  vertical_size has changed, rewrite the On-Screen Display.

Writing to the OSD is described in detail in section 1.6,
On-Screen Display. Interrupt handling is treated in section 1.9,
Interrupts.
 
1.4 Registers

The processor interface to the decoder consists of two sets of
sixteen 32-bit registers: 16 read-mode registers (Table 1.4,
Read-mode Registers) and 16 write-mode registers (the Write-mode
Registers table that follows it). The read-mode registers allow
reading decoder status, while the write-mode registers allow
setting video timing parameters and writing to the On-Screen
Display (OSD).

+----+--------------++--------+---------------------------+------------+
|    |   register   || bits   |         content           | read/write |
+----+--------------++--------+---------------------------+------------+
+----+--------------++--------+---------------------------+------------+
| 0  |   version    || 15-0   |         version           |     r      |
+----+--------------++--------+---------------------------+------------+
| 1  |    status    || 15-8   |   matrix_coefficients     |     r      |
+----+--------------++--------+---------------------------+------------+
|    |              ||   7    |     watchdog_status       |     r      |
+----+--------------++--------+---------------------------+------------+
|    |              ||   6    |        osd_wr_en          |     r      |
+----+--------------++--------+---------------------------+------------+
|    |              ||   5    |        osd_wr_ack         |     r      |
+----+--------------++--------+---------------------------+------------+
|    |              ||   4    |       osd_wr_full         |     r      |
+----+--------------++--------+---------------------------+------------+
|    |              ||   3    |       picture_hdr         |     r      |
+----+--------------++--------+---------------------------+------------+
|    |              ||   2    |        frame_end          |     r      |
+----+--------------++--------+---------------------------+------------+
|    |              ||   1    |         video_ch          |     r      |
+----+--------------++--------+---------------------------+------------+
|    |              ||   0    |          error            |     r      |
+----+--------------++--------+---------------------------+------------+
| 2  |     size     || 29-16  |     horizontal_size       |     r      |
+----+--------------++--------+---------------------------+------------+
|    |              || 13-0   |      vertical_size        |     r      |
+----+--------------++--------+---------------------------+------------+
| 3  | display size || 29-16  | display_horizontal_size   |     r      |
+----+--------------++--------+---------------------------+------------+
|    |              || 13-0   |  display_vertical_size    |     r      |
+----+--------------++--------+---------------------------+------------+
| 4  |  frame rate  || 15-12  | aspect_ratio_information  |     r      |
+----+--------------++--------+---------------------------+------------+
|    |              ||  11    |   progressive_sequence    |     r      |
+----+--------------++--------+---------------------------+------------+
|    |              || 10-6   |  frame_rate_extension_d   |     r      |
+----+--------------++--------+---------------------------+------------+
|    |              ||  5-4   |  frame_rate_extension_n   |     r      |
+----+--------------++--------+---------------------------+------------+
|    |              ||  3-0   |     frame_rate_code       |     r      |
+----+--------------++--------+---------------------------+------------+
| f  |  testpoint   || 31-0   |        testpoint          |     r      |
+----+--------------++--------+---------------------------+------------+

[Table 1.4: Read-mode Registers]

[float Table:
 
+----+------------------+--------+------------------------+------------+
|    |    register      | bits   |        content         | read/write |
+----+------------------+--------+------------------------+------------+
+----+------------------+--------+------------------------+------------+
| 0  |     stream       | 15-8   |   watchdog_interval    |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  |   3    |      osd_enable        |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  |   2    |  picture_hdr_intr_en   |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  |   1    |   frame_end_intr_en    |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  |   0    |   video_ch_intr_en     |     w      |
+----+------------------+--------+------------------------+------------+
| 1  |   horizontal     | 27-16  | horizontal_resolution  |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  | 11-0   |   horizontal_length    |     w      |
+----+------------------+--------+------------------------+------------+
| 2  | horizontal sync  | 27-16  | horizontal_sync_start  |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  | 11-0   |  horizontal_sync_end   |     w      |
+----+------------------+--------+------------------------+------------+
| 3  |    vertical      | 27-16  |  vertical_resolution   |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  | 11-0   |    vertical_length     |     w      |
+----+------------------+--------+------------------------+------------+
| 4  |  vertical sync   | 27-16  |  vertical_sync_start   |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  | 11-0   |   vertical_sync_end    |     w      |
+----+------------------+--------+------------------------+------------+
| 5  |   video mode     | 27-16  |  horizontal_halfline   |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  |   2    |   clip_display_size    |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  |   1    |   pixel_repetition     |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  |   0    |      interlaced        |     w      |
+----+------------------+--------+------------------------+------------+
| 6  |  osd clt yuvm    | 31-24  |           y            |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  | 23-16  |           u            |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  | 15-8   |           v            |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  |  7-0   |     osd_clt_mode       |     w      |
+----+------------------+--------+------------------------+------------+
| 7  |  osd clt addr    |  7-0   |     osd_clt_addr       |     w      |
+----+------------------+--------+------------------------+------------+
| 8  |  osd dta high    | 31-0   |     osd_dta_high       |     w      |
+----+------------------+--------+------------------------+------------+
| 9  |   osd dta low    | 31-0   |      osd_dta_low       |     w      |
+----+------------------+--------+------------------------+------------+
| a  |    osd_addr      | 31-29  |       osd_frame        |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  | 28-27  |       osd_comp         |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  | 26-16  |      osd_addr_x        |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  | 10-0   |      osd_addr_y        |     w      |
+----+------------------+--------+------------------------+------------+
| b  |   trick mode     |  10    |      deinterlace       |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  |  9-5   |     repeat_frame       |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  |   4    |      persistence       |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  |  3-1   |     source_select      |     w      |
+----+------------------+--------+------------------------+------------+
|    |                  |   0    |      flush_vbuf        |     w      |
657
+----+------------------+--------+------------------------+------------+
658
| f  |    testpoint     |  3-0   |     testpoint_sel      |     w      |
659
+----+------------------+--------+------------------------+------------+
660
 
661
 
662
[Table 1.5:
663
Write-mode Registers
664
]
665
]
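
The bit positions in Table 1.5 map directly onto shift-and-mask operations in driver software. The following is a minimal sketch of that packing, not part of the decoder sources: the FIELDS dictionary, its key layout and the pack_fields helper are hypothetical names, and only registers 2 to 5 are transcribed from the table.

```python
# Bit ranges (msb, lsb) transcribed from Table 1.5 for registers 2..5.
# The dictionary layout and the function name are illustrative assumptions.
FIELDS = {
    0x2: {"horizontal_sync_start": (27, 16), "horizontal_sync_end": (11, 0)},
    0x3: {"vertical_resolution": (27, 16), "vertical_length": (11, 0)},
    0x4: {"vertical_sync_start": (27, 16), "vertical_sync_end": (11, 0)},
    0x5: {
        "horizontal_halfline": (27, 16),
        "clip_display_size": (2, 2),
        "pixel_repetition": (1, 1),
        "interlaced": (0, 0),
    },
}


def pack_fields(reg, **values):
    """Return the 32-bit word for register `reg` built from named field values."""
    word = 0
    for name, value in values.items():
        msb, lsb = FIELDS[reg][name]
        width = msb - lsb + 1
        # Reject values that would spill into a neighbouring field.
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << lsb
    return word
```

For example, pack_fields(0x3, vertical_resolution=480, vertical_length=525) yields 0x01E0020D.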
666
 
667
1.5 Read-only Registers
668
 
669
version contains a non-zero FPGA bitstream (hardware) version
670
number.  Software should at least print a warning “Warning:
671
hardware version (%i.%i) more recent than software driver” if the
672
hardware version is higher than expected.
673
 
674
picture_hdr is set whenever a picture header is encountered in
675
the bitstream.  picture_hdr is cleared whenever the status
676
register is read. In a well-behaved MPEG-2 stream,
677
horizontal_size, vertical_size, display_horizontal_size,
678
display_vertical_size, aspect_ratio_information and frame_rate
679
will have meaningful values when a picture header is encountered.
680
 
681
frame_end is set when video vertical synchronization begins.
682
frame_end is cleared whenever the status register is read.
683
 
684
video_ch is set whenever video resolution or frame rate changes.
685
video_ch is cleared whenever the status register is read.
686
 
687
error is set when variable length decoding cannot parse the
688
bitstream.  error is cleared whenever the status register is
689
read.
690
 
691
watchdog_status is high if the watchdog timer expired.
692
watchdog_status is cleared whenever the status register is read.
693
 
694
horizontal_size is defined in ISO/IEC 13818-2, par.  6.2.2.1,
695
par. 6.3.3.
696
 
697
vertical_size is defined in ISO/IEC 13818-2, par.  6.2.2.1, par.
698
6.3.3.
699
 
700
display_horizontal_size is defined in ISO/IEC 13818-2, par.
701
6.2.2.4, par. 6.3.6.
702
 
703
display_vertical_size is defined in ISO/IEC 13818-2, par.
704
6.2.2.4, par. 6.3.6.
705
 
706
aspect_ratio_information is defined in ISO/IEC 13818-2, par.
707
6.3.3.
708
 
709
matrix_coefficients is defined in ISO/IEC 13818-2, par.  6.3.6.
710
 
711
frame_rate_extension_n is defined in ISO/IEC 13818-2, par.
712
6.3.3, par. 6.3.5.
713
 
714
frame_rate_code is defined in ISO/IEC 13818-2, par.  6.3.3, Table
715
6-4.
716
 
717
progressive_sequence is defined in ISO/IEC 13818-2, par.  6.3.5.
718
 
719
frame_rate_extension_d is defined in ISO/IEC 13818-2, par.
720
6.3.3, par. 6.3.5.
721
 
722
1.6 On-Screen Display
723
 
724
The OSD has the same resolution and aspect
725
ratio as the MPEG-2 video being displayed. If no MPEG-2 video is
726
being displayed, the OSD is undefined. Note feeding the decoder a
727
simple MPEG-2 sequence header with horizontal_size and
728
vertical_size already satisfies the requirements for using the
729
OSD.
730
 
731
The OSD is only shown if there is video output. If one wishes to
732
display an OSD when no MPEG2 video is being reproduced, video
733
output can be forced by setting source_select to 4, 5, 6 or 7.
734
 
735
 
736
The OSD may use up to 256 different colors. The OSD color lookup
737
table (CLT) stores y, u, v and osd_clt_mode data for each color.
738
The y, u and v values are interpreted as defined by
739
matrix_coefficients. The osd_clt_mode value determines the color
740
displayed according to Table [tab:On-Screen-Display-Modes]. [float Table:
741
 
742
+---------------+----------------------------------------+
| osd_clt_mode  |                Comment                 |
+---------------+----------------------------------------+
|   xxx00000    |              alpha = 0/16              |
+---------------+----------------------------------------+
|   xxx00001    |              alpha = 1/16              |
+---------------+----------------------------------------+
|   xxx00010    |              alpha = 2/16              |
+---------------+----------------------------------------+
|   xxx00011    |              alpha = 3/16              |
+---------------+----------------------------------------+
|   xxx00100    |              alpha = 4/16              |
+---------------+----------------------------------------+
|   xxx00101    |              alpha = 5/16              |
+---------------+----------------------------------------+
|   xxx00110    |              alpha = 6/16              |
+---------------+----------------------------------------+
|   xxx00111    |              alpha = 7/16              |
+---------------+----------------------------------------+
|   xxx01000    |              alpha = 8/16              |
+---------------+----------------------------------------+
|   xxx01001    |              alpha = 9/16              |
+---------------+----------------------------------------+
|   xxx01010    |             alpha = 10/16              |
+---------------+----------------------------------------+
|   xxx01011    |             alpha = 11/16              |
+---------------+----------------------------------------+
|   xxx01100    |             alpha = 12/16              |
+---------------+----------------------------------------+
|   xxx01101    |             alpha = 13/16              |
+---------------+----------------------------------------+
|   xxx01110    |             alpha = 14/16              |
+---------------+----------------------------------------+
|   xxx01111    |             alpha = 15/16              |
+---------------+----------------------------------------+
|   xxx11111    |             alpha = 16/16              |
+---------------+----------------------------------------+
|   xx0xxxxx    |     attenuate video pixel by alpha     |
+---------------+----------------------------------------+
|   xx1xxxxx    |    alpha blend osd and video pixel     |
+---------------+----------------------------------------+
|   00xxxxxx    |          display video pixel           |
+---------------+----------------------------------------+
|   01xxxxxx    | display attenuated/alpha blended pixel |
+---------------+----------------------------------------+
|   10xxxxxx    |           display osd pixel            |
+---------------+----------------------------------------+
|   11xxxxxx    |       display blinking osd pixel       |
+---------------+----------------------------------------+
792
 
793
 
794
[Table 1.6:
795
On-Screen Display Modes
796
]
797
]The different modes combine osd and video in various ways:
798
 
799
• video. This is the normal mode of operation.
800
 
801
• attenuated video. 16 discrete levels of attenuation can be used
802
  to fade video in or out.
803
 
804
• on-screen display.
805
 
806
• blend of on-screen display and video. 16 discrete levels of
807
  translucency.
808
 
809
• blinking on-screen display. Alternates between osd pixel and
810
  attenuated/alpha blended video pixel with a period of about
811
  one second.
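
The three sub-fields of osd_clt_mode in Table 1.6 can be combined with a small helper. This is a sketch only: the function name, the constant names and the argument names are hypothetical, and the bit assignments are read directly from the table (bits 7-6 select what is displayed, bit 5 selects attenuation or blending, bits 4-0 hold alpha).

```python
# Display selection, bits 7..6 of osd_clt_mode (Table 1.6).
# Constant and function names are illustrative assumptions.
SHOW_VIDEO = 0b00
SHOW_MIXED = 0b01          # attenuated or alpha blended pixel
SHOW_OSD = 0b10
SHOW_BLINKING_OSD = 0b11


def osd_clt_mode(display, blend=False, alpha_sixteenths=0):
    """Build an 8-bit osd_clt_mode value from its three sub-fields."""
    if display not in (SHOW_VIDEO, SHOW_MIXED, SHOW_OSD, SHOW_BLINKING_OSD):
        raise ValueError("display must be a 2-bit code")
    if not 0 <= alpha_sixteenths <= 16:
        raise ValueError("alpha must be 0..16 sixteenths")
    # Table 1.6 lists alpha = 16/16 as the bit pattern 11111, not 10000.
    alpha_bits = 0b11111 if alpha_sixteenths == 16 else alpha_sixteenths
    # Bit 5: 0 attenuates the video pixel, 1 blends osd and video pixel.
    return (display << 6) | (int(blend) << 5) | alpha_bits
```

With these definitions, an opaque OSD color is osd_clt_mode(SHOW_OSD), which is 0x80, and a half-translucent color is osd_clt_mode(SHOW_MIXED, blend=True, alpha_sixteenths=8), which is 0x68.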
812
 
813
osd_enable determines whether the On-Screen Display is shown or
814
not.  If osd_enable is low, the On-Screen Display is not shown.
815
If osd_enable is high, the On-Screen Display is shown. The osd
816
color lookup table has to be initialized and the osd has to be
817
written before osd_enable is raised. osd_enable is 0 on power-up
818
or reset.
819
 
820
osd_wr_en is set whenever an osd write has been accepted,
whether the osd write was successful or not. osd_wr_en is
822
cleared whenever the status register is read.
823
 
824
osd_wr_ack is set whenever an osd write has been successful.
825
osd_wr_ack is cleared whenever the status register is read.
826
 
827
osd_wr_full is set when the osd write fifo is full.  When the osd
828
write fifo is full, osd writes are not accepted.
829
 
830
When writing to the osd color lookup table:
831
 
832
1. Write osd_clt_yuvm.
833
 
834
2. Write osd_clt_addr.
835
 
836
Writes to the osd color lookup table take effect immediately.
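
The two-step color lookup table write can be sketched as follows. The register numbers and bit positions come from Table 1.5; write_reg is a hypothetical callback standing in for whatever register-write mechanism the host system provides, and the function names are assumptions of this sketch.

```python
OSD_CLT_YUVM = 0x6  # register addresses from Table 1.5
OSD_CLT_ADDR = 0x7


def pack_osd_clt_yuvm(y, u, v, mode):
    """Pack y (31-24), u (23-16), v (15-8) and osd_clt_mode (7-0)."""
    for name, value in (("y", y), ("u", u), ("v", v), ("mode", mode)):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{name} must fit in 8 bits")
    return (y << 24) | (u << 16) | (v << 8) | mode


def write_clt_entry(write_reg, index, y, u, v, mode):
    """Write one color lookup table entry: color data first, then the address.

    write_reg(register, value) is a hypothetical host-side callback.
    """
    if not 0 <= index <= 0xFF:
        raise ValueError("index must be 0..255")
    write_reg(OSD_CLT_YUVM, pack_osd_clt_yuvm(y, u, v, mode))
    write_reg(OSD_CLT_ADDR, index)
```

Passing a callback keeps the sketch independent of the bus used to reach the register file; a test can simply collect the (register, value) pairs in a list.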
837
 
838
When writing to the osd:
839
 
840
1. Only write to the osd when horizontal_size and vertical_size
841
  have meaningful values. This is the case when a picture header
842
  has been encountered.
843
 
844
2. Verify osd_wr_full is low. Writing when osd_wr_full is high
845
  has no effect.
846
 
847
3. Write the leftmost four pixels to osd_dta_high.
848
 
849
4. Write the rightmost four pixels to osd_dta_low.
850
 
851
5. Write x and y position of the leftmost pixel to osd_addr. Note
852
  x has to be a multiple of 8. osd_frame always has value 4 for
853
  OSD writes. osd_comp always has value 0 for OSD writes.
854
 
855
6. Read the status register until osd_wr_en is asserted. When
856
  osd_wr_en is high, the value of osd_wr_ack indicates whether
857
  the write was successful.
858
 
859
Writes to the osd pass through a 32-position fifo. This
860
introduces some latency. Repeating the last osd write 32 times
861
flushes fifo contents, ensuring osd memory has been updated.
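
Steps 2 to 5 of the OSD write procedure, together with the fifo flush, can be sketched as below. Several points are assumptions rather than facts from this guide: write_reg and osd_wr_full are hypothetical host-side callbacks, each pixel is taken to be an 8-bit color lookup table index, and the leftmost pixel is placed in the most significant byte of each data word. Checking for a picture header (step 1) and polling osd_wr_en and osd_wr_ack (step 6) are left to the caller.

```python
OSD_DTA_HIGH = 0x8  # register addresses from Table 1.5
OSD_DTA_LOW = 0x9
OSD_ADDR = 0xA
OSD_FRAME_OSD = 4   # osd_frame value for OSD writes
OSD_COMP_Y = 0      # osd_comp value for OSD writes
FIFO_DEPTH = 32     # osd writes pass through a 32-position fifo


def pack_osd_addr(x, y, frame=OSD_FRAME_OSD, comp=OSD_COMP_Y):
    """Pack osd_frame (31-29), osd_comp (28-27), x (26-16) and y (10-0)."""
    if x % 8 != 0 or not 0 <= x < (1 << 11):
        raise ValueError("x must be a multiple of 8 that fits in 11 bits")
    if not 0 <= y < (1 << 11):
        raise ValueError("y must fit in 11 bits")
    return (frame << 29) | (comp << 27) | (x << 16) | y


def pack_pixels(four_pixels):
    """Pack four 8-bit values; leftmost pixel in the top byte (an assumption)."""
    word = 0
    for p in four_pixels:
        if not 0 <= p <= 0xFF:
            raise ValueError("pixel must fit in 8 bits")
        word = (word << 8) | p
    return word


def write_osd_block(write_reg, osd_wr_full, pixels, x, y, flush=False):
    """Write eight pixels at (x, y); with flush, repeat the write 32 more times."""
    if len(pixels) != 8:
        raise ValueError("exactly eight pixels are written per block")
    for _ in range(1 + (FIFO_DEPTH if flush else 0)):
        while osd_wr_full():
            pass  # busy-wait; a real driver would sleep or time out
        write_reg(OSD_DTA_HIGH, pack_pixels(pixels[:4]))
        write_reg(OSD_DTA_LOW, pack_pixels(pixels[4:]))
        write_reg(OSD_ADDR, pack_osd_addr(x, y))
```

Setting flush=True on the final block of an update repeats that block 32 times, which pushes all earlier writes out of the fifo.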
862
 
863
1.7 Frame Store
864
 
865
Pixels can be written directly to the frame store, using the same
866
mechanism as OSD writes. By writing pixels to the frame store and
867
afterwards setting the source_select field of the trick register
868
(described in Section [sec:Trick-mode]), arbitrary bitmaps can be shown.
869
 
870
 
871
The only difference between an OSD write and a frame store write
872
is the value of osd_frame and/or osd_comp. Tables [tab:OSD-Frame]
873
and [tab:OSD-Component] list the frame and component codes.
874
Frames 0 and 1 are used for storing I and P frames. Frames 2 and
875
3 are used for storing B frames. All frames are stored in 4:2:0
876
format, with u and v frames having half the width and height of
877
the y frame. Note y, u and v values are stored in memory with an
878
offset of 128. [float Table:
879
 
880
+------------+-------+
| osd_frame  | Frame |
+------------+-------+
|     0      |   0   |
+------------+-------+
|     1      |   1   |
+------------+-------+
|     2      |   2   |
+------------+-------+
|     3      |   3   |
+------------+-------+
|     4      |  OSD  |
+------------+-------+
894
 
895
 
896
[Table 1.7:
897
OSD Frame
898
]
899
 
900
][float Table:
901
 
902
+-----------+-----------+
| osd_comp  | Component |
+-----------+-----------+
|    0      |     y     |
+-----------+-----------+
|    1      |     u     |
+-----------+-----------+
|    2      |     v     |
+-----------+-----------+
912
 
913
 
914
[Table 1.8:
915
OSD Component
916
]
917
]
918
 
919
Writes to the frame store are only defined when horizontal_size
920
and vertical_size have meaningful values. Writes with osd_frame 4
921
are only defined when osd_comp is 0.
922
 
923
1.8 Video Modeline
924
 
925
The video timing parameters are:
926
 
927
• horizontal_resolution
928
 
929
• horizontal_sync_start
930
 
931
• horizontal_sync_end
932
 
933
• horizontal_length
934
 
935
• vertical_resolution
936
 
937
• vertical_sync_start
938
 
939
• vertical_sync_end
940
 
941
• vertical_length
942
 
943
• horizontal_halfline
944
 
945
• interlaced
946
 
947
• pixel_repetition
948
 
949
These parameters can be deduced from the X11 modeline for the
950
display, which is described in the “XFree86 Video Timings HOWTO”.
951
Writing to the internal registers which contain the video timing
952
parameters will restart the video synchronization generator.
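
As an illustration of deducing the timing parameters from an X11 modeline, the sketch below splits a modeline into the parameter names used in this section. It assumes the X11 numbers can be used unchanged, without any off-by-one adjustment, and it handles progressive modes only; both points are assumptions of this sketch and should be checked against the decoder sources. The function name parse_modeline is hypothetical, and mode names containing spaces are not handled.

```python
def parse_modeline(line):
    """Split an X11 modeline into the timing parameters named in this section.

    Assumes the X11 numbers map one-to-one onto the register fields and that
    the mode is progressive; both are assumptions of this sketch.
    """
    tokens = line.split()
    if tokens[0].lower() == "modeline":
        tokens = tokens[1:]
    name = tokens[0].strip('"')
    dot_clock_mhz = float(tokens[1])
    numbers = [int(t) for t in tokens[2:10]]
    flags = [t.lower() for t in tokens[10:]]
    if "interlace" in flags:
        # Interlaced registers hold per-field values; not covered here.
        raise NotImplementedError("interlaced modelines need per-field values")
    keys = (
        "horizontal_resolution", "horizontal_sync_start",
        "horizontal_sync_end", "horizontal_length",
        "vertical_resolution", "vertical_sync_start",
        "vertical_sync_end", "vertical_length",
    )
    params = dict(zip(keys, numbers))
    params.update(name=name, dot_clock_mhz=dot_clock_mhz, interlaced=0)
    return params
```

For the common 640x480 mode, parse_modeline('Modeline "640x480" 25.175 640 656 752 800 480 490 492 525 -hsync -vsync') returns horizontal_length 800 and vertical_length 525.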
953
 
954
Two video timing diagrams are shown, one for progressive video
955
(Figure [fig:Progressive-Video]) and one for interlaced video
956
(Figure [fig:Interlaced-Video]). The diagrams show the picture
957
area (a light grey rectangle), flanked by horizontal sync (a dark
958
grey vertical bar) and vertical sync (a dark grey horizontal
959
bar).[float Figure:
960
961
 
962
[Figure 1.2:
963
Progressive Video
964
]
965
 
966
][float Figure:
967
968
 
969
[Figure 1.3:
970
Interlaced Video
971
]
972
]
973
 
974
horizontal_resolution number of dots per scan line.
975
 
976
horizontal_sync_start specifies the horizontal position at which the
horizontal sync pulse begins. The leftmost pixel of a line has
position zero.
979
 
980
horizontal_sync_end specifies the horizontal position at which the
horizontal sync pulse ends.
982
 
983
horizontal_length total length, in pixels, of one scan line.
984
 
985
vertical_resolution number of visible lines per frame
986
(progressive) or field  (interlaced).
987
 
988
vertical_sync_start specifies the line number within the frame
(progressive) or field (interlaced) at which the vertical sync pulse
begins. The topmost line of a frame or field is line number zero.
991
 
992
vertical_sync_end specifies the line number within the frame
(progressive) or field (interlaced) at which the vertical sync
pulse ends.
995
 
996
horizontal_halfline specifies the horizontal position at which the
vertical sync begins on odd fields of interlaced video. Not used
in progressive mode.
999
 
1000
vertical_length total number of lines of a frame (progressive)
or field (interlaced).
1002
 
1003
clip_display_size If asserted, the image is clipped to
1004
(display_horizontal_size, display_vertical_size). If not
1005
asserted, the image is clipped to (horizontal_size,
1006
vertical_size).
1007
 
1008
interlaced specifies that interlaced output is required. If
interlaced is asserted, vertical sync is delayed one-half scan
line at the end of odd fields.
1011
 
1012
pixel_repetition If pixel_repetition is asserted, each pixel is
1013
output twice. This can be used if the original dot clock is too
1014
low for the transmitter. As an example, suppose valid dot clock
1015
rates are 25…165 MHz, but the SDTV video being decoded has a dot
1016
clock of only 13.5 MHz. Asserting pixel_repetition and doubling
1017
dot clock frequency results in a dot clock of 27 MHz, sufficient
1018
for SDTV video to be transmitted across the link.
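
The pixel repetition decision in the example above can be written down as a short helper. This is a sketch under stated assumptions: the function name is hypothetical, and the default limits of 25 MHz and 165 MHz are the example figures from the text, not properties of any particular transmitter.

```python
def choose_pixel_repetition(dot_clock_mhz, min_link_mhz=25.0, max_link_mhz=165.0):
    """Return (pixel_repetition, link_clock_mhz) for a given video dot clock.

    The default limits are the example values from the text above.
    """
    if min_link_mhz <= dot_clock_mhz <= max_link_mhz:
        return 0, dot_clock_mhz
    doubled = 2 * dot_clock_mhz
    # Doubling only helps when the original clock is too low.
    if dot_clock_mhz < min_link_mhz and min_link_mhz <= doubled <= max_link_mhz:
        return 1, doubled
    raise ValueError("dot clock cannot be brought into the valid range")
```

For the SDTV case, choose_pixel_repetition(13.5) returns (1, 27.0): pixel_repetition is asserted and the link runs at 27 MHz.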
1019
 
1020
1.9 Interrupts
1021
 
1022
Three independent conditions may trigger an
1023
interrupt: when a picture header is encountered in the bitstream,
1024
when frame display ends, and when video resolution or frame rate
1025
changes. All three interrupt sources are optional and can be
1026
disabled individually.
1027
 
1028
When picture_hdr_intr_en is high and a picture header is
1029
encountered in the bitstream, picture_hdr is set and the
1030
interrupt signal is asserted until the status register is read.
1031
If picture_hdr_intr_en is low, the interrupt signal is never
1032
raised. picture_hdr and picture_hdr_intr_en are 0 on power-up or
1033
reset. The picture header interrupt marks the “heartbeat” of the
1034
video decoding engine.
1035
 
1036
When video vertical synchronization begins and frame_end_intr_en
1037
is high, frame_end is set and the interrupt signal is asserted
1038
until the status register is read. If frame_end_intr_en is low,
1039
the interrupt signal is never raised. frame_end and
1040
frame_end_intr_en are 0 on power-up or reset. The frame end
1041
interrupt marks the “heartbeat” of the video display engine.
1042
 
1043
When one of horizontal_size, vertical_size,
1044
display_horizontal_size, display_vertical_size,
1045
progressive_sequence, aspect_ratio_information, frame_rate_code,
1046
frame_rate_extension_n, or frame_rate_extension_d changes, and
1047
video_ch_intr_en is high, video_ch is set and the interrupt
1048
signal is asserted until the status register is read. If
1049
video_ch_intr_en is low, the interrupt signal is never raised.
1050
video_ch and video_ch_intr_en are 0 on power-up or reset. The
1051
video change interrupt marks an abrupt change in the MPEG2
1052
bitstream.
1053
 
1054
It is suggested that software, when receiving a video change
1055
interrupt:
1056
 
1057
1. Reads the size, display size and frame rate registers.
1058
 
1059
2. If frame_rate_code, frame_rate_extension_d or
1060
  frame_rate_extension_n have changed, changes the dot clock
1061
  frequency.
1062
 
1063
3. Calculates a video modeline, either using a look-up table or
1064
  algebraically, e.g. using the VESA General Timing Formula.
1065
 
1066
4. Writes the new video modeline parameters to the horizontal,
1067
  horizontal sync, vertical, vertical sync and video mode
1068
  registers. This restarts the video synchronization.
1069
 
1070
5. If horizontal_size or vertical_size have changed and
1071
  osd_enable is high, rewrites the On-Screen Display.
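
For step 2 of the list above, the frame rate follows from frame_rate_code and the two extension fields as defined in ISO/IEC 13818-2, and the dot clock of a progressive mode is the number of pixels per frame, blanking included, times the frame rate. The sketch below shows the arithmetic only; the function names are hypothetical, and interlaced modes, where vertical_length counts lines per field, are not covered.

```python
from fractions import Fraction

# frame_rate_value for each frame_rate_code, ISO/IEC 13818-2 Table 6-4.
FRAME_RATE_VALUE = {
    1: Fraction(24000, 1001),
    2: Fraction(24),
    3: Fraction(25),
    4: Fraction(30000, 1001),
    5: Fraction(30),
    6: Fraction(50),
    7: Fraction(60000, 1001),
    8: Fraction(60),
}


def frame_rate(frame_rate_code, frame_rate_extension_n=0, frame_rate_extension_d=0):
    """frame_rate = frame_rate_value * (extension_n + 1) / (extension_d + 1)."""
    value = FRAME_RATE_VALUE[frame_rate_code]
    return value * (frame_rate_extension_n + 1) / (frame_rate_extension_d + 1)


def dot_clock_hz(horizontal_length, vertical_length, frames_per_second):
    """Dot clock of a progressive mode: pixels per frame times frames per second."""
    return horizontal_length * vertical_length * frames_per_second
```

Exact fractions avoid rounding the 1000/1001 rates. As a check, dot_clock_hz(800, 525, frame_rate(8)) equals 25200000, that is, 25.2 MHz for an 800 by 525 total raster at 60 frames per second.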
1072
 
1073
1.10 Watchdog
1074
 
1075
The MPEG2 decoder contains a watchdog circuit. The watchdog
1076
circuit resets the decoder if the decoder is unresponsive. The
1077
decoder is considered unresponsive if the decoder does not accept
1078
MPEG2 data for a period of time longer than the watchdog timeout
1079
interval. We outline how to configure the watchdog timeout
1080
interval, define under which conditions the watchdog circuit
1081
activates, and describe what happens when the watchdog timer
1082
expires.
1083
 
1084
The watchdog timeout interval can be configured by writing
1085
watchdog_interval, register 0, bits 15-8.
1086
 
1087
• writing 0 to watchdog_interval causes the watchdog timer to
1088
  expire immediately.
1089
 
1090
• writing a value from 1 to 254, inclusive, to watchdog_interval
1091
  enables the watchdog circuit.
1092
 
1093
• writing 255 decimal to watchdog_interval disables the watchdog
1094
  circuit.
1095
 
1096
The default value of watchdog_interval is 127. If
1097
watchdog_interval has a value from 1 to 254, inclusive, the
1098
watchdog timeout is

  watchdog\_timeout = (watchdog\_interval + 1) \cdot (repeat\_frame + 1) \cdot 2^{18}

clk clock cycles. repeat_frame (Section [sec:Trick-mode])
1100
determines the number of times a decoded video frame is displayed.
1101
Each decoded video image is shown repeat_frame + 1 times. If a
1102
video frame is shown n times, the watchdog timeout is multiplied
1103
by n as well. This implies there is no need to adjust the
1104
watchdog timer if video is reproduced in slow motion.
1105
 
1106
The default value of repeat_frame is 0. If decoder clk frequency
1107
is 75 MHz the default watchdog timeout interval is 0.45 seconds.
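
The timeout arithmetic can be checked with a few lines of code. The sketch below implements the formula above; the function names are hypothetical, and the 75 MHz clock is the example figure from the text.

```python
def watchdog_timeout_cycles(watchdog_interval=127, repeat_frame=0):
    """Watchdog timeout in clk cycles, for watchdog_interval 1..254 only."""
    if not 1 <= watchdog_interval <= 254:
        raise ValueError("0 expires immediately and 255 disables the watchdog")
    if not 0 <= repeat_frame <= 30:
        # repeat_frame 31 (freeze-frame) holds the watchdog timer in reset.
        raise ValueError("repeat_frame must be 0..30 for a running watchdog")
    return (watchdog_interval + 1) * (repeat_frame + 1) * 2**18


def cycles_to_seconds(cycles, clk_hz):
    """Convert a number of clk cycles to seconds."""
    return cycles / clk_hz
```

With the defaults the timeout is 2^{25} clk cycles, about 0.45 seconds at 75 MHz. Setting repeat_frame to 1 doubles this to 2^{26} cycles, about 0.89 seconds.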
1108
 
1109
The watchdog timer starts running when the decoder raises the
1110
busy signal. If the busy signal remains high for longer than the
1111
watchdog timeout interval, a reset is generated.
1112
 
1113
The watchdog timer is reset
1114
 
1115
• when the global rst input signal is driven low
1116
 
1117
• when the decoder busy signal is low
1118
 
1119
• when the decoder has been halted to show the current frame
1120
  (repeat_frame is 31, freeze-frame)
1121
 
1122
• when the decoder has been halted to show a particular
1123
  framestore frame (source_select is non-zero)
1124
 
1125
• when the watchdog circuit has been disabled (watchdog_interval
1126
  has been set to 0 or to 255)
1127
 
1128
• during the first 2^{26} clk clock cycles after the watchdog
1129
  timer expired, or the decoder was reset. This watchdog timer
1130
  holdoff disables the watchdog during system initialisation. If
1131
  clock frequency is 75 MHz, 2^{26} clock cycles corresponds to
1132
  0.89 seconds.
1133
 
1134
When the watchdog timer expires
1135
 
1136
• the watchdog_rst output pin becomes low during one clk clock
1137
  cycle. The watchdog_rst output can be used to reset external
1138
  hardware, or to generate a processor interrupt.
1139
 
1140
• the watchdog_status bit in the status register is set to 1.
1141
  Software can detect whether the watchdog timer expired by
1142
  checking watchdog_status in the status register. Reading the
1143
  status register resets the watchdog_status bit back to 0.
1144
 
1145
• the framestore, On-Screen Display and circular video buffer are
1146
  filled with zeroes.
1147
 
1148
• any data in the memory response fifo is discarded.
1149
 
1150
• osd_enable is set to 0. This disables the On-Screen Display, as
1151
  the On-Screen Display now contains all zeroes.
1152
 
1153
• configuration data written to the register file is not modified
1154
  when the watchdog expires. In particular, the video timing
1155
  parameters (Sec. [sec:Video-Modeline]) remain unchanged.
1156
 
1157
The watchdog_rst output pin can optionally be used to reset
1158
external hardware when the watchdog expires. Examples of external
1159
hardware are the memory controller and the DVI dot clock
1160
generator. Note, however, resetting memory controller and DVI dot
1161
clock generator when the watchdog timer expires is optional.
1162
 
1163
The MPEG2 decoder does not require the external memory controller
1164
to be reset when the watchdog timer expires. When the watchdog
1165
timer expires, the MPEG2 decoder will write zeroes to all
1166
addresses from FRAME_0_Y to VBUF_END (framestore_request.v,
1167
STATE_CLEAR). When the watchdog timer expires, the MPEG2 decoder
1168
will also read and discard any data from the memory response fifo
1169
(framestore_response.v, STATE_FLUSH). These two actions
1170
re-synchronize MPEG2 decoder and external memory controller and
1171
bring memory to a known state.
1172
 
1173
The MPEG2 decoder also does not require the DVI clock generator
1174
to be reset when the watchdog expires. When the watchdog timer
1175
expires, the video timing parameters (Sec. [sec:Video-Modeline])
1176
remain unchanged. If the DVI clock frequency remains unchanged
1177
when the watchdog timer expires, the decoder will continue with
1178
exactly the same video timing.
1179
 
1180
1.11 Trick mode
1181
 
1182
The trick mode register provides a toolbox for
1183
implementing non-standard playback modes. An example of a
1184
non-standard playback mode is slow motion. It is perhaps easiest
1185
to visualize trick mode settings as a pipeline (Figure [fig:Trick-mode-pipeline]
1186
).[float Figure:
1187
1188
 
1189
[Figure 1.4:
1190
Trick mode pipeline
1191
]
1192
]
1193
 
1194
flush_vbuf Writing one to flush_vbuf clears the incoming video
1195
buffer. Flushing the video buffer may be useful when changing
1196
channels.
1197
 
1198
persistence If persistence is set, and no new decoded image is
available at frame start, the last decoded image is shown again.
If persistence is not set, and no new decoded image is available
at frame start, a blank screen is shown. persistence is 1 on
power-up or reset.
1203
 
1204
source_select If zero, normal video is shown.  Non-zero values
1205
allow continuous output of a blank screen, or a specific frame
1206
from the frame store, as in table [tab:Source-Select].
1207
source_select is 0 on power-up or reset. [float Table:
1208
 
1209
+----------------+--------------------+
| source_select  |    Frame shown     |
+----------------+--------------------+
|       0        | last decoded frame |
+----------------+--------------------+
|       1        |    blank screen    |
+----------------+--------------------+
|       4        |      frame 0       |
+----------------+--------------------+
|       5        |      frame 1       |
+----------------+--------------------+
|       6        |      frame 2       |
+----------------+--------------------+
|       7        |      frame 3       |
+----------------+--------------------+
1225
 
1226
 
1227
[Table 1.9:
1228
Source Select
1229
]
1230
]
1231
 
1232
repeat_frame If zero, each decoded image is shown once.  If
1233
non-zero, contains the number of times the decoded image will be
1234
additionally shown, as in table [tab:Repeat-Frame]. A value of 31
1235
shows the image indefinitely. repeat_frame is 0 on power-up or
1236
reset. [float Table:
1237
 
1238
+---------------+-------------+
| repeat_frame  | times shown |
+---------------+-------------+
|      0        |      1      |
+---------------+-------------+
|      1        |      2      |
+---------------+-------------+
|      2        |      3      |
+---------------+-------------+
|      …        |             |
+---------------+-------------+
|      30       |     31      |
+---------------+-------------+
|      31       |   forever   |
+---------------+-------------+
1254
 
1255
 
1256
[Table 1.10:
1257
Repeat Frame
1258
]
1259
]
1260
 
1261
deinterlace Setting deinterlace high forces the decoder to output
1262
video as frames, even if the MPEG2 stream is interlaced. This can
1263
be used to reproduce interlaced MPEG2 streams on progressive
1264
displays. Setting deinterlace is not recommended when reproducing
1265
a progressive MPEG2 stream on a progressive display. Setting
1266
deinterlace has no effect if the video modeline specifies
1267
interlaced output (interlaced set). Note no spatial or temporal
1268
interpolation is done (“weaving”).
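
The trick mode fields share a single register, so software has to combine them into one word before writing. The sketch below packs the fields at the bit positions listed for register b in Table 1.5. The function name is hypothetical. The defaults follow the documented power-up values for persistence, source_select and repeat_frame; the defaults of 0 for deinterlace and flush_vbuf are assumptions.

```python
TRICK_MODE = 0xB  # register address from Table 1.5


def pack_trick_mode(deinterlace=0, repeat_frame=0, persistence=1,
                    source_select=0, flush_vbuf=0):
    """Pack deinterlace (10), repeat_frame (9-5), persistence (4),
    source_select (3-1) and flush_vbuf (0) into one register word."""
    if not 0 <= repeat_frame <= 31:
        raise ValueError("repeat_frame is a 5-bit field")
    if source_select not in (0, 1, 4, 5, 6, 7):
        raise ValueError("source_select value not listed in Table 1.9")
    for name, bit in (("deinterlace", deinterlace),
                      ("persistence", persistence),
                      ("flush_vbuf", flush_vbuf)):
        if bit not in (0, 1):
            raise ValueError(f"{name} is a single bit")
    return ((deinterlace << 10) | (repeat_frame << 5) | (persistence << 4)
            | (source_select << 1) | flush_vbuf)
```

Half-speed slow motion is pack_trick_mode(repeat_frame=1), which is 0x30; freeze-frame is pack_trick_mode(repeat_frame=31), which is 0x3F0; showing frame 0 of the frame store is pack_trick_mode(source_select=4), which is 0x18.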
1269
 
1270
1.12 Test point
1271
 
1272
The MPEG2 decoder provides a test point for connecting a logic
1273
analyzer. Internally, the decoder contains various test points,
1274
only one of which is actually output to the logic analyzer. Which
1275
internal test point is output to the logic analyzer is determined
1276
by the contents of testpoint_sel. The value of bits 0..31 of the
1277
test point can also be read by software. While this is no
1278
substitute for a logic analyzer, it is recognized that in many
1279
cases this may be the only option available.
1280
 
1281
testpoint_sel Used in hardware debugging.  Determines which
1282
internal test point is multiplexed to the 34-channel logic
1283
analyzer test point.
1284
 
1285
testpoint Used in hardware debugging.  Provides the current value
1286
of bits 0 to 31 of the 34-channel logic analyzer test point.
1287
 
1288
Decoder Sources
1289
 
1290
Chapter [cha:Decoder-Sources] provides an overview of the decoder
1291
sources for the hardware engineer who wishes to synthesize or
1292
modify the decoder.
1293
 
1294
2.1 Source Directory Structure
1295
 
1296
The source files are organized in directories as follows:
1297
 
1298
 
1299
  bench/    iverilog     Icarus behavioral simulation, page [subsec:Icarus-Verilog-Simulation]
1300
  doc/                   Documentation
1301
  rtl/      mpeg2        MPEG2 decoder, page [sec:MPEG2-Decoder]
1302
  tools/    fsmgraph     Finite state machine graphs, page [subsec:FSM-Graphs]
1303
            ieee1180     IEEE1180 IDCT accuracy test, page [subsec:IEEE-1180-IDCT]
1304
            logicport    Logicport logic analyzer, page [subsec:Logicport-Logic-Analyzer]
1305
            mpeg2dec     Reference MPEG2 decoder, page [subsec:mpeg2decode]
1306
            streams      MPEG2 test streams, page [subsec:MPEG2-Test-Streams]
1307
 
1308
 
1309
A Linux system with Icarus Verilog is suggested, but not
1310
required, as development environment.
1311
 
1312
2.2 MPEG2 Decoder
1313
 
1314
The rtl/mpeg2 directory contains the sources of the MPEG2 decoder
1315
itself. This section describes the changes most likely to be
1316
needed when instantiating the decoder: changing default modeline,
1317
changing FIFO sizes, choosing dual-ported ram and fifo models,
1318
changing memory mapping. In addition, references are provided for
1319
the IDCT and bilinear chroma upsampling algorithms.
1320
 
1321
2.2.1 FIFO sizes
1322
 
1323
Fifo depth and almost full/almost empty thresholds are defined in
1324
fifo_size.v. Note setting fifo depths and thresholds to arbitrary
1325
values can result in decoder deadlock.
1326
 
1327
Figure [fig:MPEG2-decoder-dataflow] shows MPEG2 decoder data
1328
flow. Together, framestore_request, memory controller and
1329
framestore_response implement the framestore. Communication with
1330
the framestore is through fifos. The incoming MPEG2 stream is
1331
written to vbuf_write_fifo. framestore_request reads the stream
1332
from vbuf_write_fifo and writes it to the circular video buffer
1333
in memory. If vbuf_read_fifo is almost empty, framestore_request
1334
issues memory read requests for the circular video buffer.
1335
framestore_response receives data from the circular video buffer
1336
and writes the data to vbuf_read_fifo. The net result is
1337
vbuf_write_fifo, circular video buffer and vbuf_read_fifo acting
1338
as a single, huge fifo.
1339
 
1340
Variable-length decoding reads the MPEG2 stream from
1341
vbuf_read_fifo, and produces motion vectors and run/length codes.
1342
Run/length decoding, inverse quantizing, inverse zig-zag and
1343
inverse discrete cosine transform (IDCT) read the run/length
1344
codes and produce the prediction error. The prediction error is
1345
written to predict_err_fifo, one row of eight pixels at a time.
1346
 
1347
Motion compensation address generation motcomp_addrgen translates
1348
the motion vectors into three sets of memory addresses: the
1349
addresses where the forward motion compensation pixels can be
1350
read, the addresses where the backward motion compensation pixels
1351
can be read, and the addresses where the reconstructed pixels can
1352
be written. The addresses of the pixels needed for forward and
1353
backward motion compensation are written to the fwd_reader and
1354
bwd_reader address fifos. The address of the reconstructed pixels
1355
is written to the motion compensation destination fifo, dst_fifo.
1356
The memory subsystem reads the fwd_reader and bwd_reader address
1357
fifos, and writes the pixel values to the fwd_reader and
1358
bwd_reader data fifos.
1359
 
1360
Motion compensation reconstruction motcomp_recon adds pixel
1361
values read from forward motion compensation data fifo, backward
1362
motion compensation data fifo and prediction error, and writes
1363
the result to the address read from the motion compensation
1364
destination fifo.
1365
 
Displaying the video image requires chroma resampling and yuv to
rgb conversion. Resampling address generation resample_addrgen
scans the reconstructed video image, line by line. The addresses
of the pixels are written to the disp_reader address fifo. The
memory subsystem reads the addresses from the disp_reader address
fifo and writes the pixel values to the disp_reader data fifo.
resample_dta reads the pixel values from the disp_reader data
fifo, while resample_bilinear does the actual bilinear chroma
upsampling calculations. After conversion from yuv to rgb, the
pixels are written to the pixel queue pixel_queue, which adapts
between decoder and DVI clocks.

[Figure 2.1: MPEG2 decoder dataflow]
 
Note the memory tag fifo mem_tag_fifo between framestore_request
and framestore_response. For every memory read request,
framestore_request writes a tag to the memory tag fifo. The tag
identifies the source of the memory read request: circular video
buffer, forward and backward motion compensation, or resampling.
For every data word received from memory, framestore_response
reads a tag from the memory tag fifo, and writes the data word
received from memory to the data fifo corresponding to the tag.
If the memory tag fifo is almost full, framestore_request stops
issuing memory read or write requests. As a result, the number of
outstanding memory read requests is always less than or equal to
the size of the memory tag fifo.
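
The tag mechanism can be illustrated with a small Python model.
This is only a sketch under stated assumptions: the tag names and
the TagRouter class are hypothetical, and the model stalls when
the tag fifo is completely full, whereas the decoder already
stalls at "almost full".

```python
from collections import deque

# Hypothetical tag names; the real tag encoding is defined in the RTL.
TAGS = ("vbuf", "fwd", "bwd", "disp")


class TagRouter:
    """Toy model of the tag handling between request and response."""

    def __init__(self, tag_fifo_size):
        self.tag_fifo_size = tag_fifo_size
        self.tag_fifo = deque()
        self.data_fifos = {tag: deque() for tag in TAGS}

    def request(self, tag):
        """Issue a read request; refuse it when the tag fifo has no room."""
        if len(self.tag_fifo) >= self.tag_fifo_size:
            return False  # the request side stalls
        self.tag_fifo.append(tag)
        return True

    def response(self, word):
        """Route one data word to the data fifo named by the oldest tag."""
        tag = self.tag_fifo.popleft()
        self.data_fifos[tag].append(word)
        return tag


router = TagRouter(tag_fifo_size=2)
accepted = [router.request(tag) for tag in ("fwd", "disp", "bwd")]
first = router.response(0x11)   # goes to the "fwd" data fifo
second = router.response(0x22)  # goes to the "disp" data fifo
```

Because a request is only accepted while there is room for its
tag, the number of tags in flight, and hence the number of
outstanding reads, cannot exceed the tag fifo size.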
 
When modifying fifo_size.v, care should be taken that the fifos
can never overflow. Note that when framestore_request stops
issuing memory read requests, there still may be outstanding
memory read requests in the memory request queue. The number of
outstanding memory read requests is always smaller than, or equal
to, the size of the memory tag fifo. When modifying fifo_size.v,
remember that fifos which receive data from memory may still
receive outstanding data, even after framestore_request has
stopped sending memory read requests.
 
2.2.2 Dual-ported memory and FIFO models

FPGAs typically provide dedicated on-chip fifos and dual-port
RAMs. The designer then has to choose between using
vendor-provided fifos and dual-port RAMs or writing their own.
 
The file wrappers.v defines the implementation of all dual-port
RAMs and fifos in the design. For each component, two versions
are provided: one where read and write port share a common clock;
and one where read and write port have independent clocks.

dpram_sc dual-ported ram, same clock for read and write ports

dpram_dc dual-ported ram, different clock for read and write
ports

fifo_sc fifo, same clock for read and write ports

fifo_dc fifo, different clock for read and write ports
 
The dual-ported rams are inferred from code in wrappers.v. The
fifos can be either implemented in Verilog, or instantiated as
FPGA primitives, depending upon wrappers.v. The following fifo
models are available:

xfifo_sc.v fifo, same clock for read and write port.

generic_fifo_sc_b.v OpenCores generic fifo, different clock for
read and write ports.

xilinx_fifo_sc.v Xilinx Virtex-5 fifo, same clock for read and
write ports. Uses xilinx_fifo.v, xilinx_fifo144.v and
xilinx_fifo216.v.

xilinx_fifo_dc.v Xilinx Virtex-5 fifo, different clock for read
and write ports. Uses xilinx_fifo.v, xilinx_fifo144.v and
xilinx_fifo216.v.
 
xilinx_fifo_sc.v and xilinx_fifo_dc.v implement fifos using
FIFO18, FIFO18_36, FIFO36 or FIFO36_72 Virtex-5 primitives. Table
2.1 lists available data and address widths. If xilinx_fifo_sc.v
or xilinx_fifo_dc.v is instantiated with data and/or address
widths different from those in Table 2.1, the actual fifo will be
larger and/or wider.

+------------+---------------+-------------+----------------+
| Data bits  | Address bits  | FIFO Depth  | Implementation |
+------------+---------------+-------------+----------------+
|     4      |      13       |    8192     |     FIFO36     |
+------------+---------------+-------------+----------------+
|     4      |      12       |    4096     |     FIFO18     |
+------------+---------------+-------------+----------------+
|     9      |      12       |    4096     |     FIFO36     |
+------------+---------------+-------------+----------------+
|     9      |      11       |    2048     |     FIFO18     |
+------------+---------------+-------------+----------------+
|    18      |      11       |    2048     |     FIFO36     |
+------------+---------------+-------------+----------------+
|    18      |      10       |    1024     |     FIFO18     |
+------------+---------------+-------------+----------------+
|    36      |      10       |    1024     |     FIFO36     |
+------------+---------------+-------------+----------------+
|    36      |      9        |    512      |     FIFO18     |
+------------+---------------+-------------+----------------+
|    72      |      9        |    512      |   FIFO36_72    |
+------------+---------------+-------------+----------------+
|    144     |      9        |    512      | 2 * FIFO36_72  |
+------------+---------------+-------------+----------------+
|    216     |      9        |    512      | 3 * FIFO36_72  |
+------------+---------------+-------------+----------------+

[Table 2.1: Xilinx FIFO address widths]
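
To get a feeling for which primitive a given fifo might end up
in, the table can be searched for the smallest entry that covers
the requested widths. The Python sketch below does just that. The
selection rule (fewest total bits) and the function name
smallest_fit are assumptions made for illustration; the rule
actually applied is the one coded in xilinx_fifo_sc.v and
xilinx_fifo_dc.v.

```python
# (data bits, address bits, implementation), copied from the table above.
XILINX_FIFOS = [
    (4, 13, "FIFO36"), (4, 12, "FIFO18"),
    (9, 12, "FIFO36"), (9, 11, "FIFO18"),
    (18, 11, "FIFO36"), (18, 10, "FIFO18"),
    (36, 10, "FIFO36"), (36, 9, "FIFO18"),
    (72, 9, "FIFO36_72"),
    (144, 9, "2 * FIFO36_72"),
    (216, 9, "3 * FIFO36_72"),
]


def smallest_fit(data_bits, addr_bits):
    """Return the smallest table entry covering the request, or None.

    "Smallest" means fewest total bits (width times depth); this
    criterion is an assumption, not taken from the RTL.
    """
    fits = [entry for entry in XILINX_FIFOS
            if entry[0] >= data_bits and entry[1] >= addr_bits]
    if not fits:
        return None
    return min(fits, key=lambda entry: entry[0] * 2 ** entry[1])


print(smallest_fit(16, 10))  # (18, 10, 'FIFO18')
print(smallest_fit(64, 9))   # (72, 9, 'FIFO36_72')
print(smallest_fit(8, 14))   # None: deeper than any table entry
```

A request for 16 data bits and 10 address bits, for instance,
does not appear in the table and would be served by the wider
18-bit entry.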
 
2.2.3 Memory mapping

The MPEG2 decoder memory mapping is defined in
rtl/mpeg2/mem_codes.v. The default memory mapping needs 4 Mbyte
RAM and is sufficient for SDTV. By defining MP_AT_HL an
alternative memory mapping can be chosen which requires 16 Mbyte
RAM and is sufficient for HDTV.
 
Translation of macroblock addresses to memory addresses is
implemented in rtl/mpeg2/mem_addr.v. A macroblock address, a
signed motion vector (mv_x, mv_y) with half-pixel precision, and
a signed offset (delta_x, delta_y) with pixel precision are
translated to an address in memory.
 
The macroblock address is assumed to iterate over all allowable
values: beginning at zero, incrementing by one, until after the
final macroblock the macroblock address is reset to zero.
Macroblock address has to be initialized to zero, or an error
condition results. Macroblock address changes other than
incrementing by one, remaining unchanged or resetting to zero
also result in an error condition.
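
The rules for the macroblock address sequence can be written down
as a short Python check. This is a sketch of the rules as stated
in the text, not a translation of mem_addr.v; the function name
is hypothetical.

```python
def check_macroblock_addresses(addresses):
    """Return the index of the first illegal address, or None if all are legal.

    Legal: the first address is zero; every later address repeats the
    previous one, increments it by one, or resets to zero.
    """
    previous = None
    for index, address in enumerate(addresses):
        if previous is None:
            legal = address == 0
        else:
            legal = address in (previous, previous + 1, 0)
        if not legal:
            return index
        previous = address
    return None


print(check_macroblock_addresses([0, 1, 1, 2, 0, 1]))  # None
print(check_macroblock_addresses([0, 1, 3]))           # 2 (skips an address)
print(check_macroblock_addresses([1]))                 # 0 (does not start at zero)
```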
 
Note the motion vector (mv_x, mv_y) is scaled down by a factor of
two when accessing chrominance, as defined in [1, par. 7.6.3.7].
The offset (delta_x, delta_y) remains unchanged when accessing
chrominance blocks.
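
As an illustration of the chrominance scaling for 4:2:0 video,
the sketch below halves both vector components using integer
division that truncates toward zero, which is how the MPEG2 video
standard defines its "/" operator. Whether mem_addr.v computes
the result in exactly this way is an assumption; the function
name is hypothetical.

```python
def chroma_vector(mv_x, mv_y):
    """Halve a luma motion vector for 4:2:0 chroma, truncating toward zero."""
    def half(value):
        # int() truncates toward zero; value // 2 would round -7 down to -4.
        return int(value / 2)
    return half(mv_x), half(mv_y)


print(chroma_vector(7, -7))  # (3, -3)
print(chroma_vector(-4, 5))  # (-2, 2)
```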
 
The translation of macroblock addresses and motion vectors to
memory addresses in rtl/mpeg2/mem_addr.v has to be kept
synchronized with the framestore dump task write_framestore in
rtl/sim/mem_ctl.v; otherwise the framestore dumps made during
simulation will not accurately represent framestore contents.

Note out-of-range memory accesses are translated to the ADDR_ERR
address. If a memory request with address mem_req_rd_addr equal
to ADDR_ERR occurs during simulation, simulation stops with an
error message.

The MPEG2 decoder zeroes out the framestore after system reset or
when the watchdog timer expires. The MPEG2 decoder writes zeroes
to all addresses from FRAME_0_Y to VBUF_END when the rst input
pin goes low or when the watchdog_rst pin goes low.
 
2.2.4 Modeline

The default modeline is 800x600 progressive @ 60 Hz (SVGA). The
modeline.v source contains the modeline parameters, and can be
edited to change horizontal and vertical resolution, sync pulse
width and position. The default pixel frequency on the ML505 is
38.21 MHz, and is defined in dotclock_synthesizer.v. Note
dotclock_synthesizer.v synthesizes two frequencies, dotclock and
dotclock90, equal in frequency but 90 degrees phase shifted. The
frequency synthesized is

  f_{out} = f_{osc} \cdot r \cdot \frac{DCM\_ADV\_INST.CLKFX\_MULTIPLY}{DCM\_ADV\_INST.CLKFX\_DIVIDE}

where f_{osc} = 100 MHz is the user clock frequency and

  r = \frac{PLL\_ADV\_INST.CLKFBOUT\_MULT}{PLL\_ADV\_INST.CLKOUT1\_DIVIDE} = 0.25

To change pixel frequency, first calculate the multiplier and
divider for the new frequency. Suppose one wishes to synthesize a
frequency of 35 MHz:
 
macpro mpeg2ether # ./mpeg2ether --dot_clock 35

dotclock ftarget =  35.00 fout =  35.00 MHz
 multiplier:  7 divider:  5
 high frequency mode: 0 ch7301 lowfreq: 1 ch7301 colorbars: 0
 
A pixel frequency of 35 MHz requires a multiplier of 7 and a
divider of 5, with lowfreq asserted. Hence, in dvi/dotclock.v:

parameter [7:0]
  DEFAULT_DIVIDER       = 8'd4, // Divider minus one, actually
  DEFAULT_MULTIPLIER    = 8'd6; // Multiplier minus one, actually

parameter
  DEFAULT_LOWFREQ       = 1'b1
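
The multiplier and divider can also be found by hand. The Python
sketch below searches all pairs by brute force, using the formula
for the synthesized frequency with a 100 MHz user clock and
r = 0.25. It is not how mpeg2ether works internally, and the
search limits of 32 are an assumption; check the limits of the
clock primitive you are using.

```python
F_OSC_MHZ = 100.0  # user clock frequency, from the text
R = 0.25           # PLL ratio r, from the text


def dot_clock(multiplier, divider):
    """Synthesized frequency in MHz for a multiplier/divider pair."""
    return F_OSC_MHZ * R * multiplier / divider


def find_ratio(target_mhz, max_multiplier=32, max_divider=32):
    """Return the (multiplier, divider) pair closest to target_mhz.

    The default limits of 32 are assumed values for illustration.
    On a tie, the pair with the smallest multiplier is kept.
    """
    best = None
    for multiplier in range(1, max_multiplier + 1):
        for divider in range(1, max_divider + 1):
            error = abs(dot_clock(multiplier, divider) - target_mhz)
            if best is None or error < best[0]:
                best = (error, multiplier, divider)
    return best[1], best[2]


multiplier, divider = find_ratio(35.0)
print(multiplier, divider)          # 7 5
print(multiplier - 1, divider - 1)  # 6 4, the "minus one" parameter values
```

For 35 MHz the search returns multiplier 7 and divider 5, in
agreement with the mpeg2ether output shown earlier.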
 
Note the modeline can be configured at any time using the
mpeg2ether utility; it is only when changing the default modeline
that modifying the sources is necessary. The mpeg2ether utility
is explained in the mpeg2ether section of this guide.
 
2.2.5 Inverse Discrete Cosine Transform

The IDCT algorithm used is described in [4]. A copy of document
[4] can be found in the doc directory. The IDCT implementation
uses twelve 18x18 multipliers and two dual-port rams, and can do
streaming. Run-length decoding (rld.v), inverse quantizing
(iquant.v, zigzag_table.v) and IDCT transform (idct.v) all
operate at the same speed of one pixel per clock. The IDCT meets
the requirements of the former IEEE-1180.
 
2.2.6 Bilinear chroma upsampling

The chrominance components have half the vertical and half the
horizontal resolution of the luminance. To obtain equal
chrominance and luminance resolution, bilinear chroma upsampling
is used. Bilinear chroma upsampling computes chroma pixel values
by vertical and horizontal interpolation. Vertical interpolation
implies adding two rows of chroma values with different weights.
The chroma row closest to the luma row gets weight 3/4, while the
chroma row farthest from the luma row gets weight 1/4. The
document doc/bilinear.pdf shows the weights used.
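
The vertical interpolation step can be sketched in a few lines of
Python. The weights 3/4 and 1/4 are those given in the text; the
rounding term (adding 2 before dividing by 4) and the function
name are assumptions, and resample_bilinear.v may round
differently.

```python
def interpolate_rows(near_row, far_row):
    """Blend two chroma rows with weights 3/4 (near) and 1/4 (far).

    The "+ 2" rounds to nearest before the divide by four; this
    rounding rule is an assumption, not taken from the RTL.
    """
    return [(3 * near + far + 2) >> 2
            for near, far in zip(near_row, far_row)]


print(interpolate_rows([100, 200], [60, 40]))  # [90, 160]
```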
 
Bilinear chroma upsampling is implemented in various source
files, as described in Table 2.2.

+----------------------+------------------------------------------------+
| Source               | Description                                    |
+----------------------+------------------------------------------------+
| resample.v           | Upsampling top-level file                      |
+----------------------+------------------------------------------------+
| resample_addrgen.v   | Generates memory addresses of chroma/luma rows |
+----------------------+------------------------------------------------+
| resample_dta.v       | Reads chroma/luma rows from memory             |
+----------------------+------------------------------------------------+
| resample_bilinear.v  | Performs bilinear upsampling calculations      |
+----------------------+------------------------------------------------+

[Table 2.2: Upsampling source files]
 
2.3 Simulation

This section describes behavioral simulation using Icarus
Verilog. For timing simulation, consult the documentation of your
synthesis software.
 
2.3.1 Icarus Verilog Simulation

Behavioral simulation of the decoder can be performed using
Icarus Verilog. The Icarus Verilog testbench in the
bench/iverilog directory contains the following files:

testbench.v Top-level Verilog source; instantiates MPEG2 decoder.

mem_ctl.v Simple memory controller, for simulation only.

Makefile Makefile to create and run the simulation.

wrappers.v Wrapper for dual-port ram and fifos. Implements
synchronous fifos using xfifo_sc.v, and implements asynchronous
fifos as OpenCores generic_fifo_sc_b.v.

generic_dpram.v, generic_fifo_dc.v, generic_fifo_sc_b.v OpenCores
generic fifos.
 
Creating the decoder is easy using the accompanying Makefile.
First, remove any files left over from a previous simulation:

koen@macpro ~/xilinx/mpeg2/bench/iverilog $ make clean
rm -f mpeg2 stream.dat testbench.lxt trace framestore_*.ppm
tv_out_*.ppm
 
Now create the decoder:

koen@macpro ~/xilinx/mpeg2/bench/iverilog $ make
iverilog -D__IVERILOG__ -DMODELINE_SIF -I ../../rtl/mpeg2 -o
mpeg2
 testbench.v mem_ctl.v wrappers.v generic_fifo_dc.v
 generic_fifo_sc_b.v generic_dpram.v ../../rtl/mpeg2/mpeg2video.v

../../rtl/mpeg2/vbuf.v ../../rtl/mpeg2/getbits.v
xxd -c 1 ../../tools/streams/stream-susi.mpg |
 cut -d\  -f 2 > stream.dat
 
This executes two commands:

• iverilog to compile the Verilog sources to an executable,
  mpeg2.

• xxd to convert the binary MPEG2 program stream file (here
  stream-susi.mpg) to an ASCII file stream.dat, which the
  simulator can load.
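
On systems without xxd, the same conversion can be done with a
few lines of Python. This is a sketch of an equivalent, not part
of the Makefile; the function names are hypothetical. It writes
one two-digit hexadecimal byte per line, which is the format the
xxd and cut pipeline produces.

```python
def to_readmemh_lines(data):
    """Format bytes as one two-digit lowercase hex value per line."""
    return "".join(f"{byte:02x}\n" for byte in data)


def convert(src_path, dst_path):
    """Convert a binary stream file to a text file of hex bytes."""
    with open(src_path, "rb") as src, open(dst_path, "w") as dst:
        dst.write(to_readmemh_lines(src.read()))


# An MPEG2 program stream begins with the pack start code 00 00 01 ba.
print(to_readmemh_lines(bytes([0x00, 0x00, 0x01, 0xBA])), end="")
```

Calling convert("stream.mpg", "stream.dat"), with file names of
your choice, then produces a file the simulator can load.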
 
When compiling the Verilog sources, two Verilog macros are
defined on the command line: __IVERILOG__ and MODELINE_SIF. The
first Verilog define, __IVERILOG__, is defined only during
simulation, and never during synthesis. It is used to enable
several run-time checks which only make sense in a simulation
environment. The second Verilog define, MODELINE_SIF, chooses one
of several pre-defined video output formats from modeline.v.
 
Finally, run the newly created executable mpeg2:

koen@macpro ~/xilinx/mpeg2/bench/iverilog $ make test
IVERILOG_DUMPER=lxt ./mpeg2
LXT info: dumpfile testbench.lxt opened for output.
$readmemh(stream.dat): Not enough words in the read file for
 requested range.
testbench.mem_ctl.write_framestore      dumping framestore to
               framestore_000.ppm @  0.02 ms
testbench.mem_ctl.write_framestore      dumping framestore to
               framestore_001.ppm @  0.02 ms
testbench.mpeg2.motcomp macroblock_address:    0
testbench.mpeg2.motcomp macroblock_address:    1
testbench.mpeg2.motcomp macroblock_address:    2
testbench.mpeg2.motcomp macroblock_address:    3
 
During simulation, the environment variable IVERILOG_DUMPER=lxt
is set. This instructs the simulator to produce a dumpfile in the
more compact lxt format, instead of the default vcd format.

By default, simulator output includes the macroblock address.
This allows easy monitoring of decoder progress.

Each Verilog source file contains a `define DEBUG statement,
which can be uncommented or commented to switch trace output for
that particular source file on or off.
 
During simulation, two kinds of graphics files are written:
framestore dumps framestore_*.ppm and video captures
tv_out_*.ppm. The framestore is where the decoder stores already
decoded images. Both kinds of file are Portable Pixmap graphics
files in ASCII format. Figure 2.2 shows a sample framestore dump.
 
The framestore consists of four frames and the on-screen display
(OSD). The first two frames contain I and P pictures, while the
last two frames contain B-pictures. Each frame consists of y
(luminance), u and v (chrominance) information, with u and v
having half the horizontal and half the vertical resolution of y.
In the framestore dump, uninitialized memory is displayed in
green. Looking at Figure 2.2, one can see that the first three
frames of the framestore have already been written; the decoder
is halfway through the fourth frame. The On-Screen Display, at
the bottom of the framestore dump, has not been initialized yet.
 
During simulation, by default, the framestore is dumped whenever
a new frame begins, and every 200 macroblocks. As a framestore
dump is a graphics file in ASCII format, one can also look at the
file using standard text file utilities. These are the first 12
lines of a sample framestore dump:

koen@macpro ~/xilinx/mpeg2/bench/iverilog $ head -12
framestore_0001.ppm
P3
# mpeg2 framestore dump @ 11.81 ms
# frame number 2
# horizontal_size 352
# vertical_size 288
# display_horizontal_size 0
# display_vertical_size 0
# mb_width 22
# mb_height 18
# picture_structure frame picture
# chroma_format 420
352 2618 255
255 255 255 255 255 255 255 255 255 255 255 255
 255 255 255 255 255 255 255 255 255 255 255 255
 
The header of the framestore dump contains information about
decoder status at the moment of the dump.

[Figure 2.2: Framestore dump]
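
Since the header consists of plain comment lines, it is easy to
read back in a script, for instance to compare dumps from
different runs. The Python sketch below is one way of doing this.
It is not part of the testbench; the function name and the
"ppm_size" key are hypothetical. It treats the first word of each
comment as a key, which works well for the underscore-style
entries such as horizontal_size.

```python
def parse_dump_header(lines):
    """Collect '# key value' comments and the 'width height maxval' line."""
    info = {}
    for line in lines:
        line = line.strip()
        if not line or line == "P3":
            continue
        if line.startswith("#"):
            key, _, value = line[1:].strip().partition(" ")
            info[key] = value
            continue
        # First non-comment line after the magic number: image size.
        width, height, maxval = (int(field) for field in line.split())
        info["ppm_size"] = (width, height, maxval)
        break
    return info


sample = [
    "P3",
    "# horizontal_size 352",
    "# vertical_size 288",
    "# mb_width 22",
    "# mb_height 18",
    "# chroma_format 420",
    "352 2618 255",
    "255 255 255",
]
info = parse_dump_header(sample)
print(info["horizontal_size"], info["mb_width"])  # 352 22
print(info["ppm_size"])                           # (352, 2618, 255)
```

In the sample, mb_width 22 times 16 pixels per macroblock gives
352, the horizontal_size.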
 
Figure 2.3 shows video capture file tv_out_0000.ppm. Horizontal
sync is displayed as a vertical black stripe, to the right of the
image. Vertical sync is displayed as a horizontal black stripe,
below the image area. Blanking is displayed in a dark grey. The
position of picture, horizontal sync and vertical sync in Figure
2.3 is as defined in figure [fig:Progressive-Video]. As with the
framestore dumps, one can look at tv_out_0000.ppm using standard
text utilities.
 
koen@macpro ~/xilinx/mpeg2/bench/iverilog $ head -10
tv_out_0000.ppm
P3
# picture 1 @ 10.73 ms
# horizontal resolution 352 sync_start 381 sync_end 388 length 458
# vertical resolution 288 sync_start 295 sync_end 298 length 315
# interlaced 0 halfline 175
459 316 255


3 0 3
2 0 2
 
The header of the video capture file contains information about
the video modeline at the moment of video capture.

[Figure 2.3: Video output capture]
 
To end the simulation, go to the window where the simulation is
running, press ctrl-c and type finish at the prompt. The
simulator will finish writing trace and testbench.lxt files, and
return control to the command prompt.
 
The binary file testbench.lxt is a log of all wire and register
changes which occurred during simulation. testbench.lxt can be
displayed using waveform viewers such as gtkwave.

koen@macpro ~/xilinx/mpeg2/bench/iverilog $ gtkwave testbench.lxt &

Once the testbench.lxt file has been loaded in gtkwave, internal
decoder wires and registers can be displayed as waveforms.
 
2.3.2 Conformance Tests

The bench/conformance directory contains a testbench for the
ISO/IEC 13818-4 MPEG2 conformance tests. The testbench assumes
the ISO/IEC 13818-4 conformance test bitstreams are available on
your system. The ISO/IEC 13818-4 MPEG2 Conformance test
bitstreams for Main Profile @ Main Level can be downloaded from
the ISO web site using the tools/streams/retrieve script.

Typing make clean test in the bench/conformance directory
simulates all MP@ML conformance test bitstreams. Table 2.3
summarizes test results.
 
When running the conformance tests, note the decoder is not
MPEG1-compatible, and does not decode MPEG1 streams. The MPEG2
decoder decodes MPEG2 4:2:0 program streams only.

+----------------------------+--------------------+---------------------+
| Test bitstream             | Profile and level  |       Remarks       |
+----------------------------+--------------------+---------------------+
| tcela/tcela-16-matrices    |      11172-2       | Fail (MPEG1 stream) |
+----------------------------+--------------------+---------------------+
| tcela/tcela-18-d-pict      |      11172-2       | Fail (MPEG1 stream) |
+----------------------------+--------------------+---------------------+
| compcore/ccm1              |      11172-2       | Fail (MPEG1 stream) |
+----------------------------+--------------------+---------------------+
| tcela/tcela-19-wide        |      11172-2       | Fail (MPEG1 stream) |
+----------------------------+--------------------+---------------------+
| toshiba/toshiba_DPall-0    |       SP@ML        |                     |
+----------------------------+--------------------+---------------------+
| nokia/nokia6_dual          |       SP@ML        |                     |
+----------------------------+--------------------+---------------------+
| nokia/nokia6_dual60        |       SP@ML        |                     |
+----------------------------+--------------------+---------------------+
| nokia/nokia_7              |       SP@ML        |                     |
+----------------------------+--------------------+---------------------+
| tcela/tcela-14-bff-dp      |       SP@ML        |                     |
+----------------------------+--------------------+---------------------+
| ibm/ibm-bw-v3              |       SP@ML        |                     |
+----------------------------+--------------------+---------------------+
| tcela/tcela-8-fp-dp        |       SP@ML        |                     |
+----------------------------+--------------------+---------------------+
| tcela/tcela-9-fp-dp        |       SP@ML        |      1 bit off      |
+----------------------------+--------------------+---------------------+
| mei/MEI.stream16v2         |       SP@ML        | Fail (MPEG1 stream) |
+----------------------------+--------------------+---------------------+
| mei/MEI.stream16.long      |       SP@ML        | Fail (MPEG1 stream) |
+----------------------------+--------------------+---------------------+
| ntr/ntr_skipped_v3         |       SP@ML        |                     |
+----------------------------+--------------------+---------------------+
| teracom/teracom_vlc4       |       SP@ML        |                     |
+----------------------------+--------------------+---------------------+
| tcela/tcela-15-stuffing    |       SP@ML        |                     |
+----------------------------+--------------------+---------------------+
| tcela/tcela-17-dots        |       SP@ML        |                     |
+----------------------------+--------------------+---------------------+
| gi/gi4                     |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| gi/gi6                     |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| gi/gi_from_tape            |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| gi/gi7                     |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| gi/gi_9                    |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| ti/TI_cl_2                 |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| tceh/tceh_conf2            |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| mei/mei.2conftest.4f       |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| mei/mei.2conftest.60f.new  |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| tek/Tek-5.2                |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| tek/Tek-5-long             |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| tcela/tcela-6-slices       |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| tcela/tcela-7-slices       |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| sony/sony-ct1              |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| sony/sony-ct2              |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| sony/sony-ct3              |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| sony/sony-ct4              |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| att/att_mismatch           |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| teracom/teracom_vlc4       |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| ccett/mcp10ccett           |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| lep/bits_conf_lep_11       |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| hhi/hhi_burst_short        |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| hhi/hhi_burst_long         |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+
| tcela/tcela-10-killer      |       MP@ML        |                     |
+----------------------------+--------------------+---------------------+

[Table 2.3: Conformance Test Suite]
 
2.4 Tools

The tools directory contains various utilities and tools used
during decoder development and test.
 
2.4.1 Logic Analyzer

On the Xilinx ML505, the MPEG2 decoder testpoint has been broken
out to the Xilinx Generic Interface (XGI). The test point
selection can be done using the GPIO DIP switches. If the ML505
is held so the LCD can be read, the GPIO DIP switches are at the
bottom right of the board. GPIO DIP switches are numbered 1 to 8,
from left to right.
 
If GPIO DIP switch 3 is off, test point selection is made by
writing to register 15 decimal, REG_WR_TESTPOINT. If GPIO DIP
switch 3 is on, test point selection is made by DIP switches 5 to
8. GPIO DIP switch 5 is MSB, GPIO DIP switch 8 is LSB.
 
Verify the probing has been enabled in probe.v. Note that, as one
adds test points, routing and timing closure become more and
more difficult. Only define those test points you need.

The Intronix Logicport is a small USB-based logic analyzer. It
has 34 channels, two of which can be used as clock inputs, and
does state analysis at up to 200 MHz. The MPEG2 decoder on the
ML505 runs at 75 MHz, with a typical dot clock of 27 MHz, well
within the capabilities of the Logicport logic analyzer. Probing
the memory controller at 200 MHz, however, is borderline. To be
on the safe side, when probing the memory controller with the
Logicport, lower the memory clock to 125 MHz.

A small two-layer adapter board has been designed to connect the
Intronix Logicport to the Xilinx ML505. Board layout can be
downloaded from
http://www.kdvelectronics.eu/probe_adapter/probe_adapter.html.

The tools/logicport directory contains Logicport configuration
files for the test points defined in probe.v. Note configuration
files can be read and waveforms displayed by Logicport software
even if no analyzer is present.
 
2.4.2 Finite State Machine Graphs

The MPEG2 decoder uses Finite State Machines throughout; no
embedded processors or microcontrollers are used. Verifying the
correctness of the Finite State Machines is important. Finite
state machine transition graphs are created from Verilog source
files as a means of visually inspecting and verifying source
correctness. The mkfsmgraph Perl script in tools/fsmgraph assumes
the comment /* next state logic */ marks the beginning of a case
statement in an always block, used to select the next state, and
that all states begin with STATE_:
 
/* next state logic */
always @*
  case (state)
    STATE_INIT: if (first_pixel_read) next = STATE_WAIT;
                else next = STATE_INIT;
    ...
    default: next = STATE_INIT;
  endcase

/* state */
always @(posedge clk)
  if (~rst) state <= STATE_INIT;
  else state <= next;
 
The mkfsmgraph tool parses the Verilog source files using the
following algorithm:

• read the Verilog file until the comment /* next state logic */
  is found

• take the first always block after the /* next state logic */
  comment

• any word beginning with STATE_ is assumed to represent an FSM
  state.

• if the character following the FSM state is a colon (:) the
  state is a graph node.

• if the character following the FSM state is a semicolon (;) the
  state is the end point of a state transition.

• if the character following the FSM state is neither a colon (:)
  nor a semicolon (;) the state is not added to the graph.
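
The steps above can be sketched in Python as follows. This is an
illustration of the algorithm as described, not a port of the
mkfsmgraph Perl script: the function name is hypothetical, the
end of the first always block is approximated by the next always
keyword, and the result is returned as lists instead of being
written out in gml format.

```python
import re

MARKER = "/* next state logic */"


def fsm_graph(verilog_source):
    """Return (nodes, edges) for the always block after MARKER."""
    start = verilog_source.find(MARKER)
    if start < 0:
        return [], []
    text = verilog_source[start + len(MARKER):]
    # Approximation: the first always block ends where the next one begins.
    blocks = re.split(r"\balways\b", text)
    if len(blocks) < 2:
        return [], []
    nodes, edges, current = [], [], None
    for match in re.finditer(r"\b(STATE_\w+)(:|;)?", blocks[1]):
        name, follower = match.group(1), match.group(2)
        if follower == ":":                      # graph node
            current = name
            if name not in nodes:
                nodes.append(name)
        elif follower == ";" and current is not None:
            edge = (current, name)               # end point of a transition
            if edge not in edges:
                edges.append(edge)
        # Any other following character: the state is ignored.
    return nodes, edges


# STATE_WAIT's transition is made up for this example.
source = """
/* next state logic */
always @*
  case (state)
    STATE_INIT: if (first_pixel_read) next = STATE_WAIT;
                else next = STATE_INIT;
    STATE_WAIT: next = STATE_INIT;
  endcase
/* state */
always @(posedge clk)
  if (~rst) state <= STATE_INIT;
  else state <= next;
"""
nodes, edges = fsm_graph(source)
print(nodes)  # ['STATE_INIT', 'STATE_WAIT']
print(edges)
```

Note that the second always block, which also mentions
STATE_INIT, does not contribute to the graph.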
 
The resulting graph is written to standard output in gml format.
The graph layout software uDrawGraph from the University of
Bremen, Germany, is then used to produce a visually appealing
graph.

No attempt has been made to write a script capable of parsing
arbitrary Verilog sources. The Verilog sources have been written
so the script can parse them.
 
The graph of the variable-length decoding FSM, vld.v, has been
simplified further by removing all transitions to
STATE_NEXT_START_CODE and STATE_ERROR, which produces a graph
with much less visual clutter. Nodes which transition to
STATE_NEXT_START_CODE are drawn with a double border instead. A
large-format version of the FSM graph of vld.v can be found in
doc/vld-poster.pdf. Becoming familiar with this graph is
recommended before making significant changes to vld.v.
 
2.4.3 IEEE-1180 IDCT Accuracy Test

idct.v has been tested for compliance with the former IEEE-1180
standard, now ISO/IEC 23002-1 [2]. The testbench can be found in
the tools/ieee1180 directory, and the test results in the file
ieee-1180-results. The results indicate that the IDCT
implementation is IEEE-1180 compliant.
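
To give an idea of what such an accuracy test measures, the
Python sketch below computes a floating-point reference 8x8 IDCT
and the error statistics between reference and tested output
blocks. It is not the testbench in tools/ieee1180; the function
names are hypothetical, the clipping range of -256 to 255 is an
assumption about the reference output, and the standard's random
input generation is not reproduced. The limits usually quoted for
IEEE-1180 are a peak error of at most 1, a per-pixel mean square
error of at most 0.06, an overall mean square error of at most
0.02, a per-pixel mean error of at most 0.015 and an overall mean
error of at most 0.0015; these should be checked against the
standard itself.

```python
import math

N = 8


def reference_idct(coeffs):
    """Floating-point 8x8 IDCT, rounded and clipped (assumed range)."""
    def c(k):
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    out = [[0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            acc = 0.0
            for v in range(N):
                for u in range(N):
                    acc += (c(u) * c(v) * coeffs[v][u]
                            * math.cos((2 * x + 1) * u * math.pi / 16.0)
                            * math.cos((2 * y + 1) * v * math.pi / 16.0))
            value = math.floor(acc / 4.0 + 0.5)  # round to nearest
            out[y][x] = max(-256, min(255, value))
    return out


def accuracy_metrics(ref_blocks, test_blocks):
    """Error statistics of test_blocks relative to ref_blocks."""
    count = len(ref_blocks)
    err_sum = [[0.0] * N for _ in range(N)]
    sq_sum = [[0.0] * N for _ in range(N)]
    peak = 0
    for ref, test in zip(ref_blocks, test_blocks):
        for y in range(N):
            for x in range(N):
                e = test[y][x] - ref[y][x]
                peak = max(peak, abs(e))
                err_sum[y][x] += e
                sq_sum[y][x] += e * e
    cells = [(y, x) for y in range(N) for x in range(N)]
    total = count * N * N
    return {
        "peak_error": peak,
        "pixel_mse": max(sq_sum[y][x] / count for y, x in cells),
        "overall_mse": sum(sq_sum[y][x] for y, x in cells) / total,
        "pixel_mean_error": max(abs(err_sum[y][x]) / count
                                for y, x in cells),
        "overall_mean_error": abs(sum(err_sum[y][x]
                                      for y, x in cells)) / total,
    }


# A block holding only a DC coefficient of 8 decodes to a flat block.
dc_block = [[0] * N for _ in range(N)]
dc_block[0][0] = 8
ref = reference_idct(dc_block)

# Simulate a tested IDCT that is off by one in a single pixel.
tested = [row[:] for row in ref]
tested[0][0] += 1
metrics = accuracy_metrics([ref], [tested])
print(metrics)
```

In a real test the tested blocks would come from simulating
idct.v on the same input coefficients, over many blocks.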
 
2.4.4 Reference software decoder

The directory tools/mpeg2dec contains the MPEG2 reference
decoder, modified to provide extensive logging and to regularly
write the framebuffers to file. A sample run could be:
 
koen@macpro ~/xilinx/mpeg2/tools $ mkdir run
koen@macpro ~/xilinx/mpeg2/tools $ cd run
koen@macpro ~/xilinx/mpeg2/tools/run $ ../mpeg2dec/mpeg2decode \
 -r -v9 -t -o0 'dump_%d_out_%c' -b ../streams/tcela-17.mpg > log
saving dump_0_out_f.y.ppm
saving dump_0_out_f.u.ppm
saving dump_0_out_f.v.ppm
saving dump_0_forward_ref_frm.y.ppm
saving dump_0_forward_ref_frm.u.ppm
saving dump_0_forward_ref_frm.v.ppm
saving dump_0_backward_ref_frm.y.ppm
saving dump_0_backward_ref_frm.u.ppm
saving dump_0_backward_ref_frm.v.ppm
saving dump_0_auxframe.y.ppm
saving dump_0_auxframe.u.ppm
saving dump_0_auxframe.v.ppm
saving dump_1_out_f.y.ppm
saving dump_1_out_f.u.ppm
...
 
The log file contains detailed information about the execution of
the MPEG2 decoding algorithm, while the .ppm files contain
framestore dumps, with a separate graphics file for each of the
y, u and v components.
 
2.4.5 MPEG2 Test Streams

The tools/streams directory contains some sample MPEG2 program
streams, useful during testing. The retrieve script in the
tools/streams directory can be used to download the ISO/IEC
13818-4 conformance test bitstreams from the ISO web site
[footnote: ISO/IEC 13818-4 test bitstreams,
http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_IEC_13818-4_2004_Conformance_Testing/Video/bitstreams/main-profile/
].
 
References

[1] ITU-T Recommendation H.262 “Information technology - Generic
coding of moving pictures and associated audio information:
Video”, 2000. Also published as ISO/IEC International Standard
13818-2.

[2] ISO/IEC International Standard 23002-1 “Information
technology - MPEG video technologies - Part 1: Accuracy
requirements for implementation of integer-output 8x8 inverse
discrete cosine transform”, 2006.

[3] “Architecture and Bus-Arbitration Schemes for MPEG-2 Video
Decoder”, Jui-Hua Li and Nam Ling, IEEE Transactions on Circuits
and Systems for Video Technology, Vol. 9, No. 5, August 1999,
pp. 727-736.

[4] “Systematic approach of Fixed Point 8x8 IDCT and DCT Design
and Implementation”, Ci-Xun Zhang, Jing Wang, Lu Yu, Institute
of Information and Communication Engineering, Zhejiang
University, Hangzhou, China, 310027.

[5] “Virtex-5 FPGA User Guide”, Xilinx UG190 (v3.2), December 11,
2007.

[6] “ML505/506 MIG Design Creation Using ISE 9.2i SP3, MIG 2.0
and ChipScope Pro 9.2i”, Xilinx, December 2007.