
\input texinfo
2
@setfilename ldint.info
3
@c Copyright 1992, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001,
4
@c 2003, 2005, 2006, 2007
5
@c Free Software Foundation, Inc.
6
 
7
@ifnottex
8
@dircategory Software development
9
@direntry
10
* Ld-Internals: (ldint).        The GNU linker internals.
11
@end direntry
12
@end ifnottex
13
 
14
@copying
15
This file documents the internals of the GNU linker ld.
16
 
17
Copyright @copyright{} 1992, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2007
18
Free Software Foundation, Inc.
19
Contributed by Cygnus Support.
20
 
21
Permission is granted to copy, distribute and/or modify this document
22
under the terms of the GNU Free Documentation License, Version 1.3 or
23
any later version published by the Free Software Foundation; with the
24
Invariant Sections being ``GNU General Public License'' and ``Funding
25
Free Software'', the Front-Cover texts being (a) (see below), and with
26
the Back-Cover Texts being (b) (see below).  A copy of the license is
27
included in the section entitled ``GNU Free Documentation License''.
28
 
29
(a) The FSF's Front-Cover Text is:
30
 
31
     A GNU Manual
32
 
33
(b) The FSF's Back-Cover Text is:
34
 
35
     You have freedom to copy and modify this GNU Manual, like GNU
36
     software.  Copies published by the Free Software Foundation raise
37
     funds for GNU development.
38
@end copying
39
 
40
@iftex
41
@finalout
42
@setchapternewpage off
43
@settitle GNU Linker Internals
44
@titlepage
45
@title{A guide to the internals of the GNU linker}
46
@author Per Bothner, Steve Chamberlain, Ian Lance Taylor, DJ Delorie
47
@author Cygnus Support
48
@page
49
 
50
@tex
51
\def\$#1${{#1}}  % Kluge: collect RCS revision info without $...$
52
\xdef\manvers{2.10.91}  % For use in headers, footers too
53
{\parskip=0pt
54
\hfill Cygnus Support\par
55
\hfill \manvers\par
56
\hfill \TeX{}info \texinfoversion\par
57
}
58
@end tex
59
 
60
@vskip 0pt plus 1filll
61
Copyright @copyright{} 1992, 1993, 1994, 1995, 1996, 1997, 1998, 2000
62
Free Software Foundation, Inc.
63
 
64
      Permission is granted to copy, distribute and/or modify this document
65
      under the terms of the GNU Free Documentation License, Version 1.3
66
      or any later version published by the Free Software Foundation;
67
      with no Invariant Sections, with no Front-Cover Texts, and with no
68
      Back-Cover Texts.  A copy of the license is included in the
69
      section entitled "GNU Free Documentation License".
70
 
71
@end titlepage
72
@end iftex
73
 
74
@node Top
75
@top
76
 
77
This file documents the internals of the GNU linker @code{ld}.  It is a
78
collection of miscellaneous information with little form at this point.
79
Mostly, it is a repository into which you can put information about
80
GNU @code{ld} as you discover it (or as you design changes to @code{ld}).
81
 
82
This document is distributed under the terms of the GNU Free
83
Documentation License.  A copy of the license is included in the
84
section entitled "GNU Free Documentation License".
85
 
86
@menu
87
* README::                      The README File
88
* Emulations::                  How linker emulations are generated
89
* Emulation Walkthrough::       A Walkthrough of a Typical Emulation
90
* Architecture Specific::       Some Architecture Specific Notes
91
* GNU Free Documentation License::  GNU Free Documentation License
92
@end menu
93
 
94
@node README
95
@chapter The @file{README} File
96
 
97
Check the @file{README} file; it often has useful information that does not
98
appear anywhere else in the directory.
99
 
100
@node Emulations
101
@chapter How linker emulations are generated
102
 
103
Each linker target has an @dfn{emulation}.  The emulation includes the
104
default linker script, and certain emulations also modify certain types
105
of linker behaviour.
106
 
107
Emulations are created during the build process by the shell script
108
@file{genscripts.sh}.
109
 
110
The @file{genscripts.sh} script starts by reading a file in the
111
@file{emulparams} directory.  This is a shell script which sets various
112
shell variables used by @file{genscripts.sh} and the other shell scripts
113
it invokes.
114
 
115
The @file{genscripts.sh} script will invoke a shell script in the
116
@file{scripttempl} directory in order to create default linker scripts
117
written in the linker command language.  The @file{scripttempl} script
118
will be invoked 5 to 9 times, with different
assignments to shell variables, to create different default scripts.
At link time, the choice of script is made based on the command line options.
121
 
122
After creating the scripts, @file{genscripts.sh} will invoke yet another
123
shell script, this time in the @file{emultempl} directory.  That shell
124
script will create the emulation source file, which contains C code.
125
This C code permits the linker emulation to override various linker
126
behaviours.  Most targets use the generic emulation code, which is in
127
@file{emultempl/generic.em}.
128
 
129
To summarize, @file{genscripts.sh} reads three shell scripts: an
130
emulation parameters script in the @file{emulparams} directory, a linker
131
script generation script in the @file{scripttempl} directory, and an
132
emulation source file generation script in the @file{emultempl}
133
directory.
134
 
135
For example, the Sun 4 linker sets up variables in
136
@file{emulparams/sun4.sh}, creates linker scripts using
137
@file{scripttempl/aout.sc}, and creates the emulation code using
138
@file{emultempl/sunos.em}.
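
The naming conventions above can be sketched as a small shell function.
This is only an illustration, not code taken from @file{genscripts.sh}:
the function name @code{emulation_inputs} and its argument order are
made up for this example.  It prints the three scripts consulted for one
emulation, and falls back to the generic emulation code when no template
is given.

```shell
# Hypothetical helper: list the three input scripts for one emulation.
#   $1 = emulation name (emulparams file name without .sh)
#   $2 = scripttempl script name (without .sc)
#   $3 = emultempl script name (without .em); optional
emulation_inputs() {
  emul=$1
  script_name=$2
  template_name=${3:-generic}   # assumed fallback: the generic template
  printf '%s\n' \
    "emulparams/${emul}.sh" \
    "scripttempl/${script_name}.sc" \
    "emultempl/${template_name}.em"
}

# The Sun 4 example from the text:
emulation_inputs sun4 aout sunos
```

Called with only two arguments, the function reports
@file{emultempl/generic.em} as the third file.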
139
 
140
Note that the linker can support several emulations simultaneously,
141
depending upon how it is configured.  An emulation can be selected with
142
the @code{-m} option.  The @code{-V} option will list all supported
143
emulations.
144
 
145
@menu
146
* emulation parameters::        @file{emulparams} scripts
147
* linker scripts::              @file{scripttempl} scripts
148
* linker emulations::           @file{emultempl} scripts
149
@end menu
150
 
151
@node emulation parameters
152
@section @file{emulparams} scripts
153
 
154
Each target selects a particular file in the @file{emulparams} directory
155
by setting the shell variable @code{targ_emul} in @file{configure.tgt}.
156
This shell variable is used by the @file{configure} script to control
157
building an emulation source file.
158
 
159
Certain conventions are enforced.  Suppose the @code{targ_emul} variable
160
is set to @var{emul} in @file{configure.tgt}.  The name of the emulation
161
shell script will be @file{emulparams/@var{emul}.sh}.  The
162
@file{Makefile} must have a target named @file{e@var{emul}.c}; this
163
target must depend upon @file{emulparams/@var{emul}.sh}, as well as the
164
appropriate scripts in the @file{scripttempl} and @file{emultempl}
165
directories.  The @file{Makefile} target must invoke @code{GENSCRIPTS}
166
with two arguments: @var{emul}, and the value of the make variable
167
@code{tdir_@var{emul}}.  The value of the latter variable will be set by
168
the @file{configure} script, and is used to set the default target
169
directory to search.
170
 
171
By convention, the @file{emulparams/@var{emul}.sh} shell script should
172
only set shell variables.  It may set shell variables which are to be
173
interpreted by the @file{scripttempl} and the @file{emultempl} scripts.
174
Certain shell variables are interpreted directly by the
175
@file{genscripts.sh} script.
176
 
177
Here is a list of shell variables interpreted by @file{genscripts.sh},
178
as well as some conventional shell variables interpreted by the
179
@file{scripttempl} and @file{emultempl} scripts.
180
 
181
@table @code
182
@item SCRIPT_NAME
183
This is the name of the @file{scripttempl} script to use.  If
184
@code{SCRIPT_NAME} is set to @var{script}, @file{genscripts.sh} will use
185
the script @file{scripttempl/@var{script}.sc}.
186
 
187
@item TEMPLATE_NAME
188
This is the name of the @file{emultempl} script to use.  If
189
@code{TEMPLATE_NAME} is set to @var{template}, @file{genscripts.sh} will
190
use the script @file{emultempl/@var{template}.em}.  If this variable is
191
not set, the default value is @samp{generic}.
192
 
193
@item GENERATE_SHLIB_SCRIPT
194
If this is set to a nonempty string, @file{genscripts.sh} will invoke
195
the @file{scripttempl} script an extra time to create a shared library
196
script.  @ref{linker scripts}.
197
 
198
@item OUTPUT_FORMAT
199
This is normally set to indicate the BFD output format to use (e.g.,
@samp{"a.out-sunos-big"}).  The @file{scripttempl} script will normally
201
use it in an @code{OUTPUT_FORMAT} expression in the linker script.
202
 
203
@item ARCH
204
This is normally set to indicate the architecture to use (e.g.,
205
@samp{sparc}).  The @file{scripttempl} script will normally use it in an
206
@code{OUTPUT_ARCH} expression in the linker script.
207
 
208
@item ENTRY
209
Some @file{scripttempl} scripts use this to set the entry address, in an
210
@code{ENTRY} expression in the linker script.
211
 
212
@item TEXT_START_ADDR
213
Some @file{scripttempl} scripts use this to set the start address of the
214
@samp{.text} section.
215
 
216
@item SEGMENT_SIZE
217
The @file{genscripts.sh} script uses this to set the default value of
218
@code{DATA_ALIGNMENT} when running the @file{scripttempl} script.
219
 
220
@item TARGET_PAGE_SIZE
221
If @code{SEGMENT_SIZE} is not defined, the @file{genscripts.sh} script
222
uses this to define it.
223
 
224
@item ALIGNMENT
225
Some @file{scripttempl} scripts set this to a number to pass to
226
@code{ALIGN} to set the required alignment for the @code{end} symbol.
227
@end table
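
As an illustration, a minimal @file{emulparams} script might look like
the sketch below.  The emulation it describes is hypothetical and every
value is a placeholder.  The last two lines would not appear in a real
@file{emulparams} file; they only mimic, in simplified form, the
fallback described above in which @code{SEGMENT_SIZE} is derived from
@code{TARGET_PAGE_SIZE} (the real logic lives in @file{genscripts.sh}
and may differ in detail).

```shell
# Sketch of a hypothetical emulparams/myemul.sh: it only sets variables.
SCRIPT_NAME=aout                 # use scripttempl/aout.sc
TEMPLATE_NAME=generic            # use emultempl/generic.em (also the default)
OUTPUT_FORMAT="a.out-sunos-big"  # BFD output format
ARCH=sparc                       # used in OUTPUT_ARCH
TEXT_START_ADDR=0x2020           # placeholder start address of .text
TARGET_PAGE_SIZE=0x2000          # placeholder page size

# Mimics the documented fallback (assumption: simplified form).
SEGMENT_SIZE=${SEGMENT_SIZE:-${TARGET_PAGE_SIZE}}
echo "SEGMENT_SIZE=${SEGMENT_SIZE}"
```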
228
 
229
@node linker scripts
230
@section @file{scripttempl} scripts
231
 
232
Each linker target uses a @file{scripttempl} script to generate the
233
default linker scripts.  The name of the @file{scripttempl} script is
234
set by the @code{SCRIPT_NAME} variable in the @file{emulparams} script.
235
If @code{SCRIPT_NAME} is set to @var{script}, @code{genscripts.sh} will
236
invoke @file{scripttempl/@var{script}.sc}.
237
 
238
The @file{genscripts.sh} script will invoke the @file{scripttempl}
239
script 5 to 9 times.  Each time it will set the shell variable
240
@code{LD_FLAG} to a different value.  When the linker is run, the
241
options used will direct it to select a particular script.  (Script
242
selection is controlled by the @code{get_script} emulation entry point;
243
this describes the conventional behaviour).
244
 
245
The @file{scripttempl} script should just write a linker script, written
246
in the linker command language, to standard output.  If the emulation
247
name--the name of the @file{emulparams} file without the @file{.sh}
248
extension--is @var{emul}, then the output will be directed to
249
@file{ldscripts/@var{emul}.@var{extension}} in the build directory,
250
where @var{extension} changes each time the @file{scripttempl} script is
251
invoked.
252
 
253
Here is the list of values assigned to @code{LD_FLAG}.
254
 
255
@table @code
256
@item (empty)
257
The script generated is used by default (when none of the following
258
cases apply).  The output has an extension of @file{.x}.
259
@item n
260
The script generated is used when the linker is invoked with the
261
@code{-n} option.  The output has an extension of @file{.xn}.
262
@item N
263
The script generated is used when the linker is invoked with the
264
@code{-N} option.  The output has an extension of @file{.xbn}.
265
@item r
266
The script generated is used when the linker is invoked with the
267
@code{-r} option.  The output has an extension of @file{.xr}.
268
@item u
269
The script generated is used when the linker is invoked with the
270
@code{-Ur} option.  The output has an extension of @file{.xu}.
271
@item shared
272
The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
273
this value if @code{GENERATE_SHLIB_SCRIPT} is defined in the
274
@file{emulparams} file.  The @file{emultempl} script must arrange to use
275
this script at the appropriate time, normally when the linker is invoked
276
with the @code{-shared} option.  The output has an extension of
277
@file{.xs}.
278
@item c
279
The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
280
this value if @code{GENERATE_COMBRELOC_SCRIPT} is defined in the
281
@file{emulparams} file or if @code{SCRIPT_NAME} is @code{elf}. The
282
@file{emultempl} script must arrange to use this script at the appropriate
283
time, normally when the linker is invoked with the @code{-z combreloc}
284
option.  The output has an extension of
285
@file{.xc}.
286
@item cshared
287
The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
288
this value if @code{GENERATE_COMBRELOC_SCRIPT} is defined in the
289
@file{emulparams} file or if @code{SCRIPT_NAME} is @code{elf} and
290
@code{GENERATE_SHLIB_SCRIPT} is defined in the @file{emulparams} file.
291
The @file{emultempl} script must arrange to use this script at the
292
appropriate time, normally when the linker is invoked with the
@code{-shared -z combreloc} options.  The output has an extension of @file{.xsc}.
294
@item auto_import
295
The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
296
this value if @code{GENERATE_AUTO_IMPORT_SCRIPT} is defined in the
297
@file{emulparams} file.  The @file{emultempl} script must arrange to
298
use this script at the appropriate time, normally when the linker is
299
invoked with the @code{--enable-auto-import} option.  The output has
300
an extension of @file{.xa}.
301
@end table
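
The table above amounts to a fixed mapping from @code{LD_FLAG} values to
file extensions.  The following sketch restates it as a shell function;
the name @code{ld_flag_extension} and the emulation name @samp{myemul}
are invented for this example, and the code is not taken from
@file{genscripts.sh}.

```shell
# Map an LD_FLAG value to the extension of the generated script,
# following the table above.  Unknown values yield a non-zero status.
ld_flag_extension() {
  case "$1" in
    "")          echo x   ;;
    n)           echo xn  ;;
    N)           echo xbn ;;
    r)           echo xr  ;;
    u)           echo xu  ;;
    shared)      echo xs  ;;
    c)           echo xc  ;;
    cshared)     echo xsc ;;
    auto_import) echo xa  ;;
    *)           return 1 ;;
  esac
}

# Example: name of the -N script for a hypothetical emulation "myemul".
echo "ldscripts/myemul.$(ld_flag_extension N)"
```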
302
 
303
Besides the shell variables set by the @file{emulparams} script, and the
304
@code{LD_FLAG} variable, the @file{genscripts.sh} script will set
305
certain variables for each run of the @file{scripttempl} script.
306
 
307
@table @code
308
@item RELOCATING
309
This will be set to a non-empty string when the linker is doing a final
310
relocation (i.e., all scripts other than @code{-r} and @code{-Ur}).
311
 
312
@item CONSTRUCTING
313
This will be set to a non-empty string when the linker is building
314
global constructor and destructor tables (i.e., all scripts other than
315
@code{-r}).
316
 
317
@item DATA_ALIGNMENT
318
This will be set to an @code{ALIGN} expression when the output should be
319
page aligned, or to @samp{.} when generating the @code{-N} script.
320
 
321
@item CREATE_SHLIB
322
This will be set to a non-empty string when generating a @code{-shared}
323
script.
324
 
325
@item COMBRELOC
326
When generating @code{-z combreloc} scripts, this will be set to the name
of a temporary file which can be used during script generation.
328
@end table
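
The rules in this table can be summarized by a sketch like the
following.  It is not the code from @file{genscripts.sh}: the function
name @code{describe_script_vars} is invented here, and only the question
of whether each variable would be set is modelled, not its actual value.
Treating the @code{cshared} script as a @code{-shared} script is an
assumption.  @code{DATA_ALIGNMENT} and @code{COMBRELOC} are left out.

```shell
# Report which of the per-run variables would be set for a given LD_FLAG.
describe_script_vars() {
  flag=$1
  vars=""
  # RELOCATING: every script except the -r and -Ur ones.
  case "$flag" in r|u) ;; *) vars="$vars RELOCATING" ;; esac
  # CONSTRUCTING: every script except the -r one.
  case "$flag" in r) ;; *) vars="$vars CONSTRUCTING" ;; esac
  # CREATE_SHLIB: only the -shared scripts (cshared included by assumption).
  case "$flag" in shared|cshared) vars="$vars CREATE_SHLIB" ;; esac
  echo "LD_FLAG='$flag':$vars"
}

for flag in "" n N r u shared; do
  describe_script_vars "$flag"
done
```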
329
 
330
The conventional way to write a @file{scripttempl} script is to first
331
set a few shell variables, and then write out a linker script using
332
@code{cat} with a here document.  The linker script will use variable
333
substitutions, based on the above variables and those set in the
334
@file{emulparams} script, to control its behaviour.
335
 
336
When there are parts of the @file{scripttempl} script which should only
337
be run when doing a final relocation, they should be enclosed within a
338
variable substitution based on @code{RELOCATING}.  For example, on many
339
targets special symbols such as @code{_end} should be defined when doing
340
a final link.  Naturally, those symbols should not be defined when doing
341
a relocatable link using @code{-r}.  The @file{scripttempl} script
342
could use a construct like this to define those symbols:
343
@smallexample
  $@{RELOCATING+ _end = .;@}
@end smallexample
346
This will do the symbol assignment only if the @code{RELOCATING}
347
variable is defined.
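
The effect of this substitution can be tried out in isolation.  The
sketch below is a made-up fragment in the style of a @file{scripttempl}
script, not an excerpt from a real one; the function name
@code{emit_fragment} and the section layout are assumptions.  It relies
only on the standard shell rule that @code{$@{VAR+word@}} expands to
@var{word} when @code{VAR} is set and to nothing otherwise.

```shell
# Hypothetical scripttempl-style fragment: cat with a here document.
emit_fragment() {
  cat <<EOF
SECTIONS
{
  .bss : { *(.bss) }
  ${RELOCATING+ _end = .;}
}
EOF
}

RELOCATING=" "   # final link: the _end assignment is emitted
emit_fragment

unset RELOCATING # -r style script: the assignment disappears
emit_fragment
```

The first call prints a script containing @code{_end = .;}; in the
second, that line is reduced to whitespace.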
348
 
349
The basic job of the linker script is to put the sections in the correct
350
order, and at the correct memory addresses.  For some targets, the
351
linker script may have to do some other operations.
352
 
353
For example, on most MIPS platforms, the linker is responsible for
354
defining the special symbol @code{_gp}, used to initialize the
355
@code{$gp} register.  It must be set to the start of the small data
356
section plus @code{0x8000}.  Naturally, it should only be defined when
357
doing a final relocation.  This will typically be done like this:
358
@smallexample
  $@{RELOCATING+ _gp = ALIGN(16) + 0x8000;@}
@end smallexample
361
This line would appear just before the sections which compose the small
362
data section (@samp{.sdata}, @samp{.sbss}).  All those sections would be
363
contiguous in memory.
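
As a worked example of the arithmetic, assume (hypothetically) that the
location counter is @code{0x10004} just before the small data sections.
@code{ALIGN(16)} rounds it up to @code{0x10010}, and adding
@code{0x8000} gives a @code{_gp} value of @code{0x18010}.  The bias
exists because MIPS load and store instructions take a signed 16-bit
offset, so a @code{$gp} that points @code{0x8000} bytes past the start
can reach 64K of small data.  The sketch below mirrors the computation
in shell; the function names are invented for this example.

```shell
# Round address $1 up to the next multiple of $2.
align_up() {
  echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

# _gp = ALIGN(16) + 0x8000, for a location counter given as $1.
gp_value() {
  echo $(( $(align_up "$1" 16) + 32768 ))   # 32768 = 0x8000
}

printf '_gp = 0x%x\n' "$(gp_value 65540)"   # 65540 = 0x10004
```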
364
 
365
Many COFF systems build constructor tables in the linker script.  The
366
compiler will arrange to output the address of each global constructor
367
in a @samp{.ctors} section, and the address of each global destructor in
a @samp{.dtors} section (this is done by defining
369
@code{ASM_OUTPUT_CONSTRUCTOR} and @code{ASM_OUTPUT_DESTRUCTOR} in the
370
@code{gcc} configuration files).  The @code{gcc} runtime support
371
routines expect the constructor table to be named @code{__CTOR_LIST__}.
372
They expect it to be a list of words, with the first word being the
373
count of the number of entries.  There should be a trailing zero word.
374
(Actually, the count may be -1 if the trailing word is present, and the
375
trailing word may be omitted if the count is correct, but, as the
376
@code{gcc} behaviour has changed slightly over the years, it is safest
377
to provide both).  Here is a typical way that might be handled in a
378
@file{scripttempl} file.
379
@smallexample
    $@{CONSTRUCTING+ __CTOR_LIST__ = .;@}
    $@{CONSTRUCTING+ LONG((__CTOR_END__ - __CTOR_LIST__) / 4 - 2)@}
    $@{CONSTRUCTING+ *(.ctors)@}
    $@{CONSTRUCTING+ LONG(0)@}
    $@{CONSTRUCTING+ __CTOR_END__ = .;@}
    $@{CONSTRUCTING+ __DTOR_LIST__ = .;@}
    $@{CONSTRUCTING+ LONG((__DTOR_END__ - __DTOR_LIST__) / 4 - 2)@}
    $@{CONSTRUCTING+ *(.dtors)@}
    $@{CONSTRUCTING+ LONG(0)@}
    $@{CONSTRUCTING+ __DTOR_END__ = .;@}
@end smallexample
391
The use of @code{CONSTRUCTING} ensures that these linker script commands
392
will only appear when the linker is supposed to be building the
393
constructor and destructor tables.  This example is written for a target
394
which uses 4 byte pointers.
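
The count expression subtracts 2 because the region between
@code{__CTOR_LIST__} and @code{__CTOR_END__} holds the count word itself
and the trailing zero word in addition to the entries.  The following
sketch checks the arithmetic with made-up addresses; the function name
@code{ctor_count} is an assumption of this example.

```shell
# Number of constructor entries between __CTOR_LIST__ ($1) and
# __CTOR_END__ ($2) for a given word size in bytes ($3).
# The count word and the trailing zero word are not entries, hence "- 2".
ctor_count() {
  echo $(( ($2 - $1) / $3 - 2 ))
}

# Hypothetical layout: table at 4096 (0x1000) holding 3 constructors.
# Size = (1 count word + 3 entries + 1 zero word) * 4 bytes = 20 bytes.
ctor_count 4096 4116 4
```

On a target with 8 byte pointers, the divisor in the linker script would
have to change accordingly.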
395
 
396
Embedded systems often need to set a stack address.  This is normally
397
best done by using the @code{PROVIDE} construct with a default stack
398
address.  This permits the user to easily override the stack address
399
using the @code{--defsym} option.  Here is an example:
400
@smallexample
  $@{RELOCATING+ PROVIDE (__stack = 0x80000000);@}
@end smallexample
403
The value of the symbol @code{__stack} would then be used in the startup
404
code to initialize the stack pointer.
405
 
406
@node linker emulations
407
@section @file{emultempl} scripts
408
 
409
Each linker target uses an @file{emultempl} script to generate the
410
emulation code.  The name of the @file{emultempl} script is set by the
411
@code{TEMPLATE_NAME} variable in the @file{emulparams} script.  If the
412
@code{TEMPLATE_NAME} variable is not set, the default is
413
@samp{generic}.  If the value of @code{TEMPLATE_NAME} is @var{template},
414
@file{genscripts.sh} will use @file{emultempl/@var{template}.em}.
415
 
416
Most targets use the generic @file{emultempl} script,
417
@file{emultempl/generic.em}.  A different @file{emultempl} script is
418
only needed if the linker must support unusual actions, such as linking
419
against shared libraries.
420
 
421
The @file{emultempl} script is normally written as a simple invocation
422
of @code{cat} with a here document.  The document will use a few
423
variable substitutions.  Typically each function name uses a
424
substitution involving @code{EMULATION_NAME}, for ease of debugging when
425
the linker supports multiple emulations.
426
 
427
Every function and variable in the emitted file should be static.  The
428
only globally visible object must be named
429
@code{ld_@var{EMULATION_NAME}_emulation}, where @var{EMULATION_NAME} is
430
the name of the emulation set in @file{configure.tgt} (this is also the
431
name of the @file{emulparams} file without the @file{.sh} extension).
432
The @file{genscripts.sh} script will set the shell variable
433
@code{EMULATION_NAME} before invoking the @file{emultempl} script.
434
 
435
The @code{ld_@var{EMULATION_NAME}_emulation} variable must be a
436
@code{struct ld_emulation_xfer_struct}, as defined in @file{ldemul.h}.
437
It defines a set of function pointers which are invoked by the linker,
438
as well as strings for the emulation name (normally set from the shell
439
variable @code{EMULATION_NAME}) and the default BFD target name (normally
440
set from the shell variable @code{OUTPUT_FORMAT} which is normally set
441
by the @file{emulparams} file).
442
 
443
The @file{genscripts.sh} script will set the shell variable
444
@code{COMPILE_IN} when it invokes the @file{emultempl} script for the
445
default emulation.  In this case, the @file{emultempl} script should
446
include the linker scripts directly, and return them from the
447
@code{get_script} entry point.  When the emulation is not the default,
the @code{get_script} entry point should just return a file name.  See
449
@file{emultempl/generic.em} for an example of how this is done.
450
 
451
At some point, the linker emulation entry points should be documented.
452
 
453
@node Emulation Walkthrough
454
@chapter A Walkthrough of a Typical Emulation
455
 
456
This chapter is to help people who are new to the way emulations
457
interact with the linker, or who are suddenly thrust into the position
458
of having to work with existing emulations.  It will discuss the files
459
you need to be aware of.  It will tell you when the given "hooks" in
460
the emulation will be called.  It will, hopefully, give you enough
461
information about when and how things happen that you'll be able to
462
get by.  As always, the source is the definitive reference to this.
463
 
464
The starting point for the linker is in @file{ldmain.c} where
465
@code{main} is defined.  The bulk of the emulation-specific code
starts out in @file{emultempl/@var{emulation}.em} and
ends up in @file{e@var{emulation}.c} once the build is done.
Most of the work to select and interface with emulations is in
@file{ldemul.h} and @file{ldemul.c}.  Specifically, @file{ldemul.h}
defines the @code{ld_emulation_xfer_struct} structure your emulation
exports.
472
 
473
Your emulation file exports a symbol
474
@code{ld_@var{EMULATION_NAME}_emulation}.  If your emulation is
475
selected (it usually is, since usually there's only one),
@file{ldemul.c} sets the variable @code{ld_emulation} to point to it.
@file{ldemul.c} also defines a number of API functions that interface
to your emulation, like @code{ldemul_after_parse}, which simply calls
your @code{ld_@var{EMULATION_NAME}_emulation.after_parse} function.  For
the rest of this chapter, only the @code{ldemul_*} functions will be
mentioned; assume that each one makes the corresponding indirect call
into your emulation.
482
 
483
We will also skip or gloss over parts of the link process that don't
484
relate to emulations, like setting up internationalization.
485
 
486
After initialization, @code{main} selects an emulation by pre-scanning
487
the command line arguments.  It calls @code{ldemul_choose_target} to
488
choose a target.  If you set @code{choose_target} to
489
@code{ldemul_default_target}, it picks your @code{target_name} by
490
default.
491
 
492
@code{main} calls @code{ldemul_before_parse}, then @code{parse_args}.
493
@code{parse_args} calls @code{ldemul_parse_args} for each arg, which
494
must update the @code{getopt} globals if it recognizes the argument.
495
If the emulation doesn't recognize the argument, @code{parse_args}
checks whether it recognizes the argument itself.
497
 
498
Now that the emulation has had access to all its command-line options,
499
@code{main} calls @code{ldemul_set_symbols}.  This can be used for any
500
initialization that may be affected by options.  It is also supposed
501
to set up any variables needed by the emulation script.
502
 
503
@code{main} now calls @code{ldemul_get_script} to get the emulation
504
script to use (based on arguments, no doubt, @pxref{Emulations}) and
505
runs it.  While parsing, @code{ldgram.y} may call @code{ldemul_hll} or
506
@code{ldemul_syslib} to handle the @code{HLL} or @code{SYSLIB}
507
commands.  It may call @code{ldemul_unrecognized_file} if you asked
508
the linker to link a file it doesn't recognize.  It will call
509
@code{ldemul_recognized_file} for each file it does recognize, in case
510
the emulation wants to handle some files specially.  All the while,
511
it's loading the files (possibly calling
512
@code{ldemul_open_dynamic_archive}) and symbols and stuff.  After it's
513
done reading the script, @code{main} calls @code{ldemul_after_parse}.
514
Use the after-parse hook to set up anything that depends on stuff the
515
script might have set up, like the entry point.
516
 
517
@code{main} next calls @code{lang_process} in @code{ldlang.c}.  This
518
appears to be the main core of the linking itself, as far as emulation
519
hooks are concerned(*).  It first opens the output file's BFD, calling
520
@code{ldemul_set_output_arch}, and calls
521
@code{ldemul_create_output_section_statements} in case you need to use
522
other means to find or create object files (i.e. shared libraries
523
found on a path, or fake stub objects).  Despite the name, nobody
524
creates output sections here.
525
 
526
(*) In most cases, the BFD library does the bulk of the actual
527
linking, handling symbol tables, symbol resolution, relocations, and
528
building the final output file.  See the BFD reference for all the
529
details.  Your emulation is usually concerned more with managing
530
things at the file and section level, like "put this here, add this
531
section", etc.
532
 
533
Next, the objects to be linked are opened and BFDs created for them,
534
and @code{ldemul_after_open} is called.  At this point, you have all
535
the objects and symbols loaded, but none of the data has been placed
536
yet.
537
 
538
Next comes the Big Linking Thingy (except for the parts BFD does).
539
All input sections are mapped to output sections according to the
540
script.  If a section doesn't get mapped by default,
541
@code{ldemul_place_orphan} will get called to figure out where it goes.
542
Next it figures out the offsets for each section, calling
543
@code{ldemul_before_allocation} before and
544
@code{ldemul_after_allocation} after deciding where each input section
545
ends up in the output sections.
546
 
547
The last part of @code{lang_process} is to figure out all the symbols'
548
values.  After assigning final values to the symbols,
549
@code{ldemul_finish} is called, and after that, any undefined symbols
550
are turned into fatal errors.
551
 
552
OK, back to @code{main}, which calls @code{ldwrite} in
553
@file{ldwrite.c}.  @code{ldwrite} calls BFD's @code{final_link}, which does
all the relocation fixups and writes the output BFD to disk, and we're
done.
556
 
557
In summary,
558
 
559
@itemize @bullet
560
 
561
@item @code{main()} in @file{ldmain.c}
562
@item @file{emultempl/@var{EMULATION}.em} has your code
563
@item @code{ldemul_choose_target} (defaults to your @code{target_name})
564
@item @code{ldemul_before_parse}
565
@item Parse argv, calls @code{ldemul_parse_args} for each
566
@item @code{ldemul_set_symbols}
567
@item @code{ldemul_get_script}
568
@item parse script
569
 
570
@itemize @bullet
571
@item may call @code{ldemul_hll} or @code{ldemul_syslib}
572
@item may call @code{ldemul_open_dynamic_archive}
573
@end itemize
574
 
575
@item @code{ldemul_after_parse}
576
@item @code{lang_process()} in @file{ldlang.c}
577
 
578
@itemize @bullet
579
@item create @code{output_bfd}
580
@item @code{ldemul_set_output_arch}
581
@item @code{ldemul_create_output_section_statements}
582
@item read objects, create input bfds - all symbols exist, but have no values
583
@item may call @code{ldemul_unrecognized_file}
584
@item will call @code{ldemul_recognized_file}
585
@item @code{ldemul_after_open}
586
@item map input sections to output sections
587
@item may call @code{ldemul_place_orphan} for remaining sections
588
@item @code{ldemul_before_allocation}
589
@item gives input sections offsets into output sections, places output sections
590
@item @code{ldemul_after_allocation} - section addresses valid
591
@item assigns values to symbols
592
@item @code{ldemul_finish} - symbol values valid
593
@end itemize
594
 
595
@item output bfd is written to disk
596
 
597
@end itemize
598
 
599
@node Architecture Specific
600
@chapter Some Architecture Specific Notes
601
 
602
This is the place for notes on the behavior of @code{ld} on
603
specific platforms.  Currently, only Intel x86 is documented (and
604
of that, only the auto-import behavior for DLLs).
605
 
606
@menu
607
* ix86::                        Intel x86
608
@end menu
609
 
610
@node ix86
611
@section Intel x86
612
 
613
@code{ld} can create DLLs that operate with various runtimes available
on a common x86 operating system.  These runtimes include native (using
the mingw "platform"), cygwin, and pw.

@table @emph
@item auto-import from DLLs
619
@enumerate
620
@item
621
With this feature on, DLL clients can import variables from a DLL
without any effort on their side (for example, without any source
code modifications).  Auto-import can be enabled using the
@code{--enable-auto-import} flag, or disabled via the
@code{--disable-auto-import} flag.  Auto-import is disabled by default.
626
 
627
@item
628
This is done completely within the bounds of the PE specification (to be
fair, there's a minor violation of the spec at one point, but in practice
auto-import works on all known variants of that common x86 operating
system).  So, the resulting DLL can be used with any other PE
compiler/linker.
633
 
634
@item
635
Auto-import is fully compatible with the standard import method, in which
variables are decorated using attribute modifiers.  Libraries of either
type may be mixed together.
638
 
639
@item
640
Overhead (space): 8 bytes per imported symbol, plus 20 for each
reference to it; Overhead (load time): negligible; Overhead
(virtual/physical memory): should be less than the effect of DLL
relocation.
644
@end enumerate
645
 
646
Motivation

The obvious and only way to get rid of dllimport insanity is
to make the client access the variable directly in the DLL, bypassing
the extra dereference imposed by ordinary DLL runtime linking.
I.e., whenever the client contains something like

@code{mov dll_var,%eax}

the address of @code{dll_var} in the instruction should be relocated to
point into the loaded DLL.  The aim is to make the OS loader do so, and
then make @code{ld} help with that.  The import section of a PE file is
laid out in the following way: there's a vector of structures, each
describing the imports from a particular DLL.  Each such structure
points to two other parallel vectors: one holding the imported names,
and one which will hold the address of the corresponding imported name.
So, the solution is to de-vectorize these structures, making the import
locations sparse and pointing directly into code.
664
 
665
Implementation

For each reference to a data symbol to be imported from a DLL (a
symbol named <sym> belongs to this set if __imp_<sym> is
found in the implib), an import fixup entry is generated.  That
entry is of type IMAGE_IMPORT_DESCRIPTOR and is stored in the .idata$3
subsection.  Each fixup entry contains a pointer to the symbol's address
within the .text section (marked with a __fuN_<sym> symbol, where N is
an integer), a pointer to the DLL name (so the DLL name is referenced by
multiple entries), and a pointer to the symbol name thunk.  The symbol
name thunk is a singleton vector (__nm_th_<symbol>) pointing to an
IMAGE_IMPORT_BY_NAME structure (__nm_<symbol>) directly containing the
imported name.  Here comes the ``on the edge'' problem mentioned above:
the PE specification says that the name vector (OriginalFirstThunk) should
run in parallel with the address vector (FirstThunk), i.e., that they
should have the same number of elements and be terminated with zero.  We
violate this, since FirstThunk points directly into machine code.  But in
practice, the OS loader is implemented the sane way: it goes through
OriginalFirstThunk and puts addresses into FirstThunk, and nothing
else.  It should be noted once again that the DLL and symbol name
structures are reused across fixup entries and have to be there
anyway to support the standard import mechanism, so the sustained
overhead is 20 bytes per reference.  Another question is whether having
several IMAGE_IMPORT_DESCRIPTORs for the same DLL is possible.  The
answer is yes; it is done even by the native compiler/linker (libth32's
functions are in fact resident in windows9x kernel32.dll, so if you use
it, you have two IMAGE_IMPORT_DESCRIPTORs for kernel32.dll).  Yet another
question is whether referencing the same PE structures several times is
valid.  The answer is why not: prohibiting that (detecting the violation)
would require more work on the part of the loader than not doing it.
695
 
696
@end table
697
 
698
@node GNU Free Documentation License
699
@chapter GNU Free Documentation License
700
 
701
@include fdl.texi
702
 
703
@contents
704
@bye
