@section mmo backend
The mmo object format is used exclusively together with Professor
Donald E.@: Knuth's educational 64-bit processor MMIX.  The simulator
@command{mmix}, which is available at
@url{http://www-cs-faculty.stanford.edu/~knuth/programs/mmix.tar.gz},
understands this format.  That package also includes a combined
assembler and linker called @command{mmixal}.  The mmo format has
no advantages feature-wise compared to e.g.@: ELF.  It is a simple
non-relocatable object format with no support for archives or
debugging information, except for symbol value information and
line numbers (which are not yet implemented in BFD).  See
@url{http://www-cs-faculty.stanford.edu/~knuth/mmix.html} for more
information about MMIX.  The ELF format is used for intermediate
object files in the BFD implementation.
 
@c We want to xref the symbol table node.  A feature in "chew"
@c requires that "commands" do not contain spaces in the
@c arguments.  Hence the hyphen in "Symbol-table".
@menu
* File layout::
* Symbol-table::
* mmo section mapping::
@end menu
 
@node File layout, Symbol-table, mmo, mmo
@subsection File layout
The mmo file contents are not partitioned into named sections as
with e.g.@: ELF.  Memory areas are formed by specifying the
location of the data that follows.  Only the memory area
@samp{0x0000@dots{}00} to @samp{0x01ff@dots{}ff} is executable, so
it is used for code (and constants) and the area
@samp{0x2000@dots{}00} to @samp{0x20ff@dots{}ff} is used for
writable data.  @xref{mmo section mapping}.
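As a rough illustration of the two address ranges above, this Python
sketch classifies a 64-bit address by its most significant byte.  The
function name @code{mmo_area} and the labels it returns are made up for
this example; they are not part of BFD.

```python
def mmo_area(address):
    """Classify a 64-bit MMIX address by its top byte (illustrative only)."""
    top = (address >> 56) & 0xFF
    if top <= 0x01:
        return "code"    # 0x0000...00 to 0x01ff...ff, executable
    if top == 0x20:
        return "data"    # 0x2000...00 to 0x20ff...ff, writable data
    return "other"       # outside both areas described above
```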
 
There is provision for specifying ``special data'' of 65536
different types.  We use type 80 (decimal), arbitrarily chosen to be
the same as the ELF @code{e_machine} number for MMIX, filling it with
section information normally found in ELF objects.
@xref{mmo section mapping}.
 
Contents are entered as 32-bit words, XORed into the previous
contents, which are always zero-initialized.  A word that starts with
the byte @samp{0x98} forms a command called a @samp{lopcode}, where
the next byte distinguishes between the thirteen lopcodes.  The
two remaining bytes, called the @samp{Y} and @samp{Z} fields, or
the @samp{YZ} field (a 16-bit big-endian number), are used for
various purposes, different for each lopcode.  As documented in
@url{http://www-cs-faculty.stanford.edu/~knuth/mmixal-intro.ps.gz},
the lopcodes are:
 
@table @code
@item lop_quote
0x98000001.  The next word is contents, regardless of whether it
starts with 0x98 or not.
 
@item lop_loc
0x9801YYZZ, where @samp{Z} is 1 or 2.  This is a location
directive, setting the location for the following data to the value
of the next 32-bit word (for @math{Z = 1}) or 64-bit word
(for @math{Z = 2}), plus @math{Y * 2^56}.  Normally @samp{Y} is 0 for
the text segment and 2 for the data segment.
 
@item lop_skip
0x9802YYZZ.  Increase the current location by @samp{YZ} bytes.

@item lop_fixo
0x9803YYZZ, where @samp{Z} is 1 or 2.  Store the current location
as 64 bits into the location pointed to by the next 32-bit
(@math{Z = 1}) or 64-bit (@math{Z = 2}) word, plus
@math{Y * 2^56}.

@item lop_fixr
0x9804YYZZ.  @samp{YZ} is stored into the current location plus
@math{2 - 4 * YZ}.
 
@item lop_fixrx
0x980500ZZ.  @samp{Z} is 16 or 24.  A value @samp{L} derived from
the following 32-bit word is used in a manner similar to
@samp{YZ} in lop_fixr: it is XORed into the contents at the current
location minus @math{4 * L}.  The first byte of the word is 0 or 1.
If it is 1, then @math{L = (@var{lowest 24 bits of word}) - 2^Z}; if
it is 0, then @math{L = (@var{lowest 24 bits of word})}.
 
@item lop_file
0x9806YYZZ.  @samp{Y} is the file number, @samp{Z} is the count of
32-bit words.  Set the file number to @samp{Y} and the line
counter to 0.  The next @math{Z * 4} bytes contain the file name,
padded with zeros if its length is not a multiple of four.  The
same @samp{Y} may occur multiple times, but @samp{Z} must be 0 for
all but the first occurrence.
 
@item lop_line
0x9807YYZZ.  @samp{YZ} is the line number.  Together with
lop_file, it forms the source location for the next 32-bit word.
Note that for each non-lopcode 32-bit word, the line number is
assumed to be incremented by one.
 
@item lop_spec
0x9808YYZZ.  @samp{YZ} is the type number.  Data until the next
lopcode other than lop_quote forms special data of type @samp{YZ}.
@xref{mmo section mapping}.
 
Data of types other than 80 (or type 80 data whose contents do not
parse) are stored in sections named @code{.MMIX.spec_data.@var{n}}
where @var{n} is the @samp{YZ}-type.  The flags for such a
section say not to allocate or load the data.  The vma is 0.
Contents of multiple occurrences of special data @var{n} are
concatenated to the data of the previous lop_spec @var{n}s.  The
location in data or code at which the lop_spec occurred is lost.
 
@item lop_pre
0x980901ZZ.  The first lopcode in a file.  The @samp{Z} field forms the
length of header information in 32-bit words, where the first word
tells the time in seconds since @samp{00:00:00 GMT Jan 1 1970}.
 
@item lop_post
0x980a00ZZ.  @math{Z >= 32}.  This lopcode follows after all
content-generating lopcodes in a program.  The @samp{Z} field
denotes the value of @samp{rG} at the beginning of the program.
The following @math{256 - Z} big-endian 64-bit words are loaded
into global registers @samp{$G} @dots{} @samp{$255}.
 
@item lop_stab
0x980b0000.  The next-to-last lopcode in a program.  Must follow
immediately after the lop_post lopcode and its data.  After this
lopcode follow all symbols in a compressed format
(@pxref{Symbol-table}).
 
@item lop_end
0x980cYYZZ.  The last lopcode in a program.  It must follow the
lop_stab lopcode and its data.  The @samp{YZ} field contains the
number of 32-bit words of symbol table information after the
preceding lop_stab lopcode.
@end table
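The field layout shared by all lopcodes can be sketched in Python as
follows.  This is only an illustration of the table above, not code
taken from BFD; the names @code{split_lopcode} and @code{fixrx_offset}
are hypothetical.  Note that a caller still has to track lop_quote
itself, since a quoted word is contents even if it starts with 0x98.

```python
LOPCODE_NAMES = (
    "lop_quote", "lop_loc", "lop_skip", "lop_fixo", "lop_fixr", "lop_fixrx",
    "lop_file", "lop_line", "lop_spec", "lop_pre", "lop_post", "lop_stab",
    "lop_end",
)

def split_lopcode(word):
    """Return (name, y, z) for a lopcode word, or None for plain contents."""
    if (word >> 24) & 0xFF != 0x98:
        return None
    number = (word >> 16) & 0xFF
    if number >= len(LOPCODE_NAMES):
        raise ValueError("unknown lopcode %d" % number)
    return LOPCODE_NAMES[number], (word >> 8) & 0xFF, word & 0xFF

def fixrx_offset(z, operand):
    """Compute L for lop_fixrx from its Z field and the following word."""
    low = operand & 0xFFFFFF             # lowest 24 bits of the word
    if operand >> 24 == 1:               # first byte 1: subtract 2^Z
        return low - (1 << z)
    return low                           # first byte 0: use the bits as-is
```

For instance, @code{split_lopcode(0x98010002)} gives the name lop_loc
with @math{Y = 0} and @math{Z = 2}.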
 
Note that the ``fixup'' lopcodes, @code{lop_fixr}, @code{lop_fixrx} and
@code{lop_fixo}, are not generated by BFD, but are handled.  They are
generated by @code{mmixal}.
 
This trivial one-label, one-instruction file:

@example
 :Main TRAP 1,2,3
@end example

can be represented this way in mmo:
 
@example
 0x98090101 - lop_pre, one 32-bit word with timestamp.
 <timestamp>
 0x98010002 - lop_loc, text segment, using a 64-bit address.
              Note that mmixal does not emit this for the file above.
 0x00000000 - Address, high 32 bits.
 0x00000000 - Address, low 32 bits.
 0x98060002 - lop_file, 2 32-bit words for file-name.
 0x74657374 - "test"
 0x2e730000 - ".s\0\0"
 0x98070001 - lop_line, line 1.
 0x00010203 - TRAP 1,2,3
 0x980a00ff - lop_post, setting $255 to 0.
 0x00000000
 0x00000000
 0x980b0000 - lop_stab for ":Main" = 0, serial 1.
 0x203a4040   @xref{Symbol-table}.
 0x10404020
 0x4d206120
 0x69016e00
 0x81000000
 0x980c0005 - lop_end; symbol table contained five 32-bit words.
@end example
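To make the grouping in this listing concrete, here is a small Python
sketch that splits such a word list into lopcodes and their operand
words.  It is a simplification, not the BFD reader: the name
@code{scan} is hypothetical, the timestamp is replaced by a zero
placeholder, the symbol table length is taken from the closing lop_end,
and data after lop_spec is simply reported as ordinary contents.

```python
NAMES = (
    "lop_quote", "lop_loc", "lop_skip", "lop_fixo", "lop_fixr", "lop_fixrx",
    "lop_file", "lop_line", "lop_spec", "lop_pre", "lop_post", "lop_stab",
    "lop_end",
)

def scan(words):
    """Split a list of 32-bit words into (name, operand_words) records."""
    records = []
    i = 0
    while i < len(words):
        word = words[i]
        i += 1
        if word >> 24 != 0x98:
            records.append(("data", [word]))   # ordinary contents
            continue
        name = NAMES[(word >> 16) & 0xFF]
        z = word & 0xFF
        if name in ("lop_loc", "lop_fixo", "lop_file", "lop_pre"):
            count = z                          # Z operand words follow
        elif name in ("lop_quote", "lop_fixrx"):
            count = 1                          # exactly one word follows
        elif name == "lop_post":
            count = 2 * (256 - z)              # 64-bit register values
        elif name == "lop_stab":
            count = words[-1] & 0xFFFF         # YZ of the closing lop_end
        else:
            count = 0
        records.append((name, words[i:i + count]))
        i += count
    return records

# The example file above, with 0 standing in for the timestamp.
EXAMPLE = [
    0x98090101, 0x00000000,
    0x98010002, 0x00000000, 0x00000000,
    0x98060002, 0x74657374, 0x2E730000,
    0x98070001,
    0x00010203,
    0x980A00FF, 0x00000000, 0x00000000,
    0x980B0000, 0x203A4040, 0x10404020, 0x4D206120, 0x69016E00, 0x81000000,
    0x980C0005,
]
```

Calling @code{scan(EXAMPLE)} yields one record per lopcode, plus one
``data'' record for the TRAP instruction.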
@node Symbol-table, mmo section mapping, File layout, mmo
@subsection Symbol table format
From mmixal.w (or really, the generated mmixal.tex) in
@url{http://www-cs-faculty.stanford.edu/~knuth/programs/mmix.tar.gz}:
``Symbols are stored and retrieved by means of a @samp{ternary
search trie}, following ideas of Bentley and Sedgewick. (See
ACM--SIAM Symp.@: on Discrete Algorithms @samp{8} (1997), 360--369;
R.@: Sedgewick, @samp{Algorithms in C} (Reading, Mass.@:
Addison--Wesley, 1998), @samp{15.4}.)  Each trie node stores a
character, and there are branches to subtries for the cases where
a given character is less than, equal to, or greater than the
character in the trie.  There also is a pointer to a symbol table
entry if a symbol ends at the current node.''
 
So it's a tree encoded as a stream of bytes.  The stream of bytes
acts on a single virtual global symbol, adding and removing
characters and signalling complete symbol points.  Here, we read
the stream and create symbols at the completion points.
 
First, there's a control byte @code{m}.  If any of the listed bits
in @code{m} is nonzero, we execute what stands at the right, in
the listed order:
 
@example
 (MMO3_LEFT)
 0x40 - Traverse left trie.
        (Read a new command byte and recurse.)

 (MMO3_SYMBITS)
 0x2f - Read the next byte as a character and store it in the
        current character position; increment character position.
        Test the bits of @code{m}:

        (MMO3_WCHAR)
        0x80 - The character is 16-bit (so read another byte,
               merge into current character).

        (MMO3_TYPEBITS)
        0xf  - We have a complete symbol; parse the type, value
               and serial number and do what should be done
               with a symbol.  The type and length information
               is in j = (m & 0xf).

               (MMO3_REGQUAL_BITS)
               j == 0xf: A register variable.  The following
                         byte tells which register.
               j <= 8:   An absolute symbol.  Read j bytes as the
                         big-endian number the symbol equals.
                         A j = 2 with two zero bytes denotes an
                         unknown symbol.
               j > 8:    As with j <= 8, but add (0x20 << 56)
                         to the value in the following j - 8
                         bytes.

               Then comes the serial number, as a variant of
               uleb128, but better named ubeb128:
               Read bytes and shift the previous value left 7
               (multiply by 128).  Add in the new byte, repeat
               until a byte has bit 7 set.  The serial number
               is the computed value minus 128.

        (MMO3_MIDDLE)
        0x20 - Traverse middle trie.  (Read a new command byte
               and recurse.)  Decrement character position.

 (MMO3_RIGHT)
 0x10 - Traverse right trie.  (Read a new command byte and
        recurse.)
@end example
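The procedure above can be sketched as a short recursive reader in
Python.  This is an illustration written from the description here, not
a transcription of the BFD code; @code{decode_symbols} and the kind
labels are hypothetical names, and a 16-bit character is assumed to be
merged as high byte followed by low byte.

```python
MMO3_LEFT, MMO3_MIDDLE, MMO3_RIGHT = 0x40, 0x20, 0x10
MMO3_SYMBITS, MMO3_WCHAR, MMO3_TYPEBITS = 0x2F, 0x80, 0x0F

def decode_symbols(data):
    """Return a list of (name, kind, value, serial) from a trie byte stream."""
    stream = iter(data)
    name = []      # characters of the single virtual symbol
    found = []     # completed symbols, in stream order

    def visit():
        m = next(stream)
        if m & MMO3_LEFT:
            visit()
        if m & MMO3_SYMBITS:
            c = next(stream)
            if m & MMO3_WCHAR:
                c = (c << 8) | next(stream)    # assumed merge order
            name.append(chr(c))
            j = m & MMO3_TYPEBITS
            if j:
                if j == 0xF:
                    kind, value = "register", next(stream)
                else:
                    value = 0
                    for _ in range(j if j <= 8 else j - 8):
                        value = (value << 8) | next(stream)
                    if j > 8:
                        kind, value = "data", value + (0x20 << 56)
                    elif j == 2 and value == 0:
                        kind = "undefined"
                    else:
                        kind = "absolute"
                serial = 0
                while True:                    # the "ubeb128" serial number
                    k = next(stream)
                    serial = (serial << 7) + k
                    if k & 0x80:
                        break
                found.append(("".join(name), kind, value, serial - 128))
            if m & MMO3_MIDDLE:
                visit()
            name.pop()                         # decrement character position
        if m & MMO3_RIGHT:
            visit()

    visit()
    return found
```

Trailing zero padding after the last node is simply left unread.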
 
Let's look again at the @code{lop_stab} for the trivial file
(@pxref{File layout}).
 
@example
 0x980b0000 - lop_stab for ":Main" = 0, serial 1.
 0x203a4040
 0x10404020
 0x4d206120
 0x69016e00
 0x81000000
@end example
 
This forms the trivial trie (note that the path between ``:'' and
``M'' is redundant):
 
@example
 203a     ":"
 40       /
 40      /
 10      \
 40      /
 40     /
 204d  "M"
 2061  "a"
 2069  "i"
 016e  "n" is the last character in a full symbol, and
       with a value represented in one byte.
 00    The value is 0.
 81    The serial number is 1.
@end example
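The serial-number byte @samp{81} in this listing follows the
``ubeb128'' rule described earlier.  A minimal Python sketch of that
rule, with the hypothetical helper names @code{encode_serial} and
@code{decode_serial}, might look like this:

```python
def encode_serial(serial):
    """Encode a serial number, most significant 7-bit group first."""
    groups = [(serial & 0x7F) | 0x80]    # the final byte carries bit 7
    serial >>= 7
    while serial:
        groups.append(serial & 0x7F)     # earlier bytes have bit 7 clear
        serial >>= 7
    return bytes(reversed(groups))

def decode_serial(data):
    """Return (serial, bytes_used) for the encoding at the start of data."""
    value = 0
    for used, byte in enumerate(data, 1):
        value = (value << 7) + byte      # shift left 7, add the new byte
        if byte & 0x80:                  # bit 7 set: this was the last byte
            return value - 128, used
    raise ValueError("no terminating byte with bit 7 set")
```

Subtracting 128 at the end removes the marker bit of the final byte,
which was added in along with its low seven bits.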
 
@node mmo section mapping, , Symbol-table, mmo
@subsection mmo section mapping
The implementation in BFD uses special data type 80 (decimal) to
encapsulate and describe named sections, containing e.g.@: debug
information.  If needed, any datum in the encapsulation will be
quoted using lop_quote.  First comes a 32-bit word holding the
number of 32-bit words containing the zero-terminated zero-padded
section name.  After the name there's a 32-bit word holding flags
describing the section type.  Then comes a 64-bit big-endian word
with the section length (in bytes), then another with the section
start address.  Depending on the type of section, the contents
might follow, zero-padded to a 32-bit boundary.  For a loadable
section (such as data or code), the contents might follow at some
later point, not necessarily immediately, as a lop_loc with the
same start address as in the section description, followed by the
contents.  This in effect forms a descriptor that must be emitted
before the actual contents.  Sections described this way must not
overlap.
 
For areas that don't have such descriptors, synthetic sections are
formed by BFD.  Consecutive contents in the two memory areas
@samp{0x0000@dots{}00} to @samp{0x01ff@dots{}ff} and
@samp{0x2000@dots{}00} to @samp{0x20ff@dots{}ff} are entered in
sections named @code{.text} and @code{.data} respectively.  If an area
is not otherwise described, but would together with a neighboring
lower area be less than @samp{0x40000000} bytes long, it is joined
with the lower area and the gap is zero-filled.  For other cases,
a new section is formed, named @code{.MMIX.sec.@var{n}}.  Here,
@var{n} is a number, a running count through the mmo file,
starting at 0.
 
A loadable section specified as:

@example
 .section secname,"ax"
 TETRA 1,2,3,4,-1,-2009
 BYTE 80
@end example

and linked to address @samp{0x4}, is represented by the sequence:
 
@example
 0x98080050 - lop_spec 80
 0x00000002 - two 32-bit words for the section name
 0x7365636e - "secn"
 0x616d6500 - "ame\0"
 0x00000033 - flags CODE, READONLY, LOAD, ALLOC
 0x00000000 - high 32 bits of section length
 0x0000001c - section length is 28 bytes; 6 * 4 + 1 + alignment to 32 bits
 0x00000000 - high 32 bits of section address
 0x00000004 - section address is 4
 0x98010002 - 64 bits with address of following data
 0x00000000 - high 32 bits of address
 0x00000004 - low 32 bits: data starts at address 4
 0x00000001 - 1
 0x00000002 - 2
 0x00000003 - 3
 0x00000004 - 4
 0xffffffff - -1
 0xfffff827 - -2009
 0x50000000 - 80 as a byte, padded with zeros.
@end example
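The descriptor part of this sequence (everything up to and including
the section address) can be reproduced with a short Python sketch.  The
function name @code{section_descriptor} is hypothetical, the name is
assumed to be padded with zeros to a whole number of 32-bit words as in
the listing, and quoting of words that happen to start with 0x98 is
left out for brevity.

```python
import struct

LOP_SPEC_SECTION = 0x98080050    # lop_spec with YZ = 80 (decimal)

def section_descriptor(name, flags, length, address):
    """Return the descriptor for a named section as a list of 32-bit words."""
    raw = name.encode("ascii")
    raw += b"\0" * (-len(raw) % 4)           # zero-pad to a word boundary
    name_words = list(struct.unpack(">%dI" % (len(raw) // 4), raw))
    return (
        [LOP_SPEC_SECTION, len(name_words)]
        + name_words
        + [flags]
        + [length >> 32, length & 0xFFFFFFFF]    # 64-bit big-endian length
        + [address >> 32, address & 0xFFFFFFFF]  # 64-bit big-endian address
    )
```

With the arguments @code{("secname", 0x33, 0x1c, 4)} this gives the
first nine words of the listing.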
 
Note that the lop_spec wrapping does not include the section
contents.  Compare this to a non-loaded section specified as:

@example
 .section thirdsec
 TETRA 200001,100002
 BYTE 38,40
@end example

This, when linked to address @samp{0x200000000000001c}, is
represented by:
 
@example
 0x98080050 - lop_spec 80
 0x00000002 - two 32-bit words for the section name
 0x74686972 - "thir"
 0x64736563 - "dsec"
 0x00000010 - flag READONLY
 0x00000000 - high 32 bits of section length
 0x0000000c - section length is 12 bytes; 2 * 4 + 2 + alignment to 32 bits
 0x20000000 - high 32 bits of address
 0x0000001c - low 32 bits of address 0x200000000000001c
 0x00030d41 - 200001
 0x000186a2 - 100002
 0x26280000 - 38, 40 as bytes, padded with zeros
@end example
 
For the latter example, the section contents must not be
loaded in memory, and are therefore specified as part of the
special data.  The address is usually unimportant but might
provide information for e.g.@: the DWARF 2 debugging format.
