kdv
The following changes have been made to debug spatial scalability:


gethdr.c
--------

Temporal_reference is used to compute the frame number of each frame,
named true_framenum. The periodic reset at each GOP header as well as
the wrap of temporal_reference at 1024 cause a base value
temp_ref_base to be incremented accordingly.
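
As a rough sketch of this bookkeeping (illustrative only, not the
actual gethdr.c code; the function name and the heuristic for
detecting the wrap are assumptions):

```c
/* Maintain a running frame number from the 10-bit temporal_reference
 * field.  temp_ref_base absorbs both the reset at each GOP header and
 * the wrap of temporal_reference at 1024. */
static int temp_ref_base = 0;
static int max_temp_ref = -1;  /* highest temporal_reference in current GOP */

/* Call once per picture header; gop_start is non-zero if a GOP header
 * was seen since the previous picture. */
int update_true_framenum(int temporal_reference, int gop_start)
{
    if (gop_start) {
        temp_ref_base += max_temp_ref + 1;  /* frames in the finished GOP */
        max_temp_ref = -1;
    } else if (max_temp_ref - temporal_reference > 512) {
        temp_ref_base += 1024;              /* temporal_reference wrapped */
        max_temp_ref -= 1024;
    }
    if (temporal_reference > max_temp_ref)
        max_temp_ref = temporal_reference;
    return temp_ref_base + temporal_reference;  /* true_framenum */
}
```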

spatscal.c
----------

getspatref()

A potential problem: the variable char fname[32] is dimensioned
statically and is too small.

true_framenum is used instead of lower_layer_temporal_reference to
determine the lower layer frame to be read for spatial prediction.

Verification of lower_layer_temporal_reference is not possible,
since the temporal reference values that have been encoded into the
base layer bitstream are not available to the enhancement layer
decoder.

Since there is no decoder timing information available, the rules on
which frames can legally be used as spatial prediction frames cannot
be checked.

Lower layer frames are read field-wise or frame-wise, depending on the
lower_layer_progressive_frame flag. Consistency between layers is
checked, since the file format for frame and field pictures differs.

Note that the base layer decoder must not use the -f option to enforce
frame-wise storage.

Note further that only the yuv image format (option -o0) is supported
as input format.

spatpred()

The code for the various combinations of llprog_frame, llfieldsel and
prog_frame has been completed and verified with the tceh_conf23
bitstream, which uses all permissible combinations.


getpic.c
--------

A small bug when storing an I- or P-frame: at the time the
oldrefframe is stored, the prog_frame flag known to the decoder
belongs to the current refframe, not to the old one. Therefore the
old value of the flag needs to be memorized.

store.c
-------

A potential problem: the filename variables char outname[32] and
tmpname[32] are statically dimensioned and quite small.
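
A defensive sketch of how such a filename could be built without
overflowing a fixed buffer (make_outname is a hypothetical helper,
not the actual store.c code):

```c
#include <stdio.h>

/* Build the output filename with an explicit bound and detect
 * truncation instead of silently overflowing a fixed char[32]. */
int make_outname(char *buf, size_t bufsize, const char *pattern, int frame)
{
    int n = snprintf(buf, bufsize, pattern, frame);
    if (n < 0 || (size_t)n >= bufsize)
        return -1;  /* pattern expanded past the buffer: refuse */
    return 0;
}
```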

The concept of time in this video decoder software
--------------------------------------------------

When decoding a non-scalable bitstream, the frame number (i.e. the
temporal position) of the current I- or P-frame can be derived
implicitly from the number of preceding B-frames after they have been
decoded. Therefore the temporal_reference entry in the picture header
is somewhat redundant and does not necessarily have to be evaluated in
the decoding process.
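
The implicit derivation described above can be sketched as follows
(coded_to_display is a hypothetical helper, not decoder code; picture
types are given as a string in coded order):

```c
/* Recover each picture's display frame number from picture types in
 * coded order, without reading temporal_reference.  B-frames are
 * displayed in coded order; an I- or P-frame is displayed only after
 * all B-frames that follow it (up to the next anchor), so its frame
 * number is assigned implicitly once those B-frames are decoded. */
void coded_to_display(const char *types, int n, int display[])
{
    int next = 0;       /* next display slot to hand out */
    int pending = -1;   /* coded index of the anchor awaiting its slot */

    for (int i = 0; i < n; i++) {
        if (types[i] == 'B') {
            display[i] = next++;            /* B-frames come out immediately */
        } else {                            /* 'I' or 'P' anchor */
            if (pending >= 0)
                display[pending] = next++;  /* previous anchor now placed */
            pending = i;
        }
    }
    if (pending >= 0)
        display[pending] = next++;          /* last anchor flushes at end */
}
```

For the coded-order sequence I P B B P B B this yields display
numbers 0, 3, 1, 2, 6, 4, 5.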

Decoding the enhancement layer of a spatial scalable hierarchy,
however, requires knowing the temporal position of each frame at the
instant when it is decoded, since data from a lower layer reference
frame has to be incorporated.

In the architecture of this video-only decoder, decoding a spatial
scalable hierarchy of bitstreams is done by calling mpeg2decode once
for the base layer bitstream and a second time for the enhancement
layer bitstream, indicating where the decoded base layer frames can
be found (option -s).

Here the concept of time is only present in the form of frame numbers.
Therefore spatial scalable bitstream hierarchies can only be handled
under the assumption that base and enhancement layer bitstreams are
decoded to image sequences where corresponding images of both layers
have identical frame numbers.

More specifically, this means that base and enhancement layer
bitstreams must contain video with the same frame rate. Furthermore,
only the temporally coincident frame of the base layer can be accessed
for spatial prediction by the enhancement layer decoder, since it is
not possible to resolve unambiguously the lower_layer_temporal_reference
which is meant to further specify the lower layer reference frame.

======================== SPATIAL.DOC ========================

99 |
|
|
Decoding a spatial scalable hierarchy of bitstreams
|
100 |
|
|
---------------------------------------------------
|
101 |
|
|
|
102 |
|
|
With this video-only decoder decoding of a spatial scalable hierarchy
|
103 |
|
|
of bitstreams is done by calling mpeg2decode once for the base layer
|
104 |
|
|
bitstream and a second time for the enhancement layer bitstream,
|
105 |
|
|
indicating where the decoded base layer frames can be found
|
106 |
|
|
(using option -s and supplying ).
|
107 |
|
|
|
108 |
|
|
mpeg2decode -r -o0 base.mpg base%d%c
|
109 |
|
|
mpeg2decode -r -o0 -f -s base%d%c enh.mpg enh%d
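
As an illustration of how such a filename pattern might be expanded
(the function and the exact substitution characters used for the %c
field identifier are assumptions here, not taken from mpeg2decode):

```c
#include <stdio.h>

/* Expand a pattern like "base%d%c": %d takes the frame number; for
 * field-wise storage %c could distinguish the two fields, while
 * frame-wise storage could substitute a neutral character.  Extra
 * arguments are ignored for patterns without %c, such as "enh%d". */
void expand_pattern(char *buf, size_t size, const char *pattern,
                    int frame, int progressive, int bottom_field)
{
    char field_id = progressive ? 'f' : (bottom_field ? 'b' : 'a');
    snprintf(buf, size, pattern, frame, field_id);
}
```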

Note that the base layer decoder must not use the -f option to enforce
frame-wise storage.

Note further that only the yuv image format (option -o0) is supported
as input format.

Timing / layer synchronisation in this video decoder software
-------------------------------------------------------------

When decoding a non-scalable bitstream, the frame number (i.e. the
temporal position) of the current I- or P-frame can be derived
implicitly from the number of preceding B-frames after they have been
decoded. Therefore the temporal_reference entry in the picture header
is somewhat redundant and does not necessarily have to be evaluated in
the decoding process.

Decoding the enhancement layer of a spatial scalable hierarchy,
however, requires knowing the temporal position of each frame at the
instant when it is decoded, since data from a lower layer reference
frame has to be incorporated.
132 |
|
|
|
133 |
|
|
The concept of time is only present in the form of frame numbers.
|
134 |
|
|
Therefore spatial scalable bitstream hierarchies can only be handled
|
135 |
|
|
under the assumption that base and enhancement layer bitstreams are
|
136 |
|
|
decoded to image sequences where corresponding images of both layers
|
137 |
|
|
have identical frame numbers.
|
138 |
|
|
|
139 |
|
|
More specifically this means that base and enhancement layer
|
140 |
|
|
bitstreams must contain video with the same frame rate. Furthermore
|
141 |
|
|
only the temporally coincident frame of the base layer can be accessed
|
142 |
|
|
for spatial prediction by the enhancement layer decoder, since it is
|
143 |
|
|
not possible to resolve unambiguously the lower_layer_temporal_reference
|
144 |
|
|
which is meant to further specify the lower layer reference frame.
|
145 |
|
|
|
146 |
|
|
Lower layer frames are read field-wise or frame-wise, depending on the
|
147 |
|
|
lower_layer_progressive_frame flag. Consistency between layers in this
|
148 |
|
|
respect is checked since the file format for frame and field pictures
|
149 |
|
|
differs.
|
150 |
|
|
|
151 |
|
|
|
152 |
|
|
|
153 |
|
|
|
154 |
|
|
|