@c Copyright (C) 2002, 2003, 2004
@c Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@node Type Information
@chapter Memory Management and Type Information
@cindex GGC
@findex GTY

GCC uses some fairly sophisticated memory management techniques, which
involve determining information about GCC's data structures from GCC's
source code and using this information to perform garbage collection and
implement precompiled headers.

A full C parser would be too complicated for this task, so a limited
subset of C is interpreted and special markers are used to determine
what parts of the source to look at. All @code{struct} and
@code{union} declarations that define data structures that are
allocated under control of the garbage collector must be marked. All
global variables that hold pointers to garbage-collected memory must
also be marked. Finally, all global variables that need to be saved
and restored by a precompiled header must be marked. (The precompiled
header mechanism can only save static variables if they're scalar.
Complex data structures must be allocated in garbage-collected memory
to be saved in a precompiled header.)

The full format of a marker is
@smallexample
GTY (([@var{option}] [(@var{param})], [@var{option}] [(@var{param})] @dots{}))
@end smallexample
@noindent
but in most cases no options are needed. The outer double parentheses
are still necessary, though: @code{GTY(())}. Markers can appear:

@itemize @bullet
@item
In a structure definition, before the open brace;
@item
In a global variable declaration, after the keyword @code{static} or
@code{extern}; and
@item
In a structure field definition, before the name of the field.
@end itemize

Here are some examples of marking simple data structures and globals.

@smallexample
struct @var{tag} GTY(())
@{
  @var{fields}@dots{}
@};

typedef struct @var{tag} GTY(())
@{
  @var{fields}@dots{}
@} *@var{typename};

static GTY(()) struct @var{tag} *@var{list};   /* @r{points to GC memory} */
static GTY(()) int @var{counter};        /* @r{save counter in a PCH} */
@end smallexample

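To the C compiler itself the markers are invisible. The following is a
minimal sketch, assuming that @code{GTY} is a macro that expands to
nothing so that only the source scanner sees its argument. The names
@code{node}, @code{node_list}, @code{node_counter} and @code{push_node}
are hypothetical and exist only for illustration.

@verbatim
```c
#include <stddef.h>

/* Assumption: GTY expands to nothing, so the compiler never sees the
   marker; only the source scanner interprets it.  */
#define GTY(x)

/* A marked structure: the marker sits before the open brace.  */
struct node GTY(())
{
  struct node *next;
  int value;
};

/* Marked globals: the marker follows the keyword static.  */
static GTY(()) struct node *node_list;   /* points to GC memory */
static GTY(()) int node_counter;         /* saved in a PCH */

/* Push N onto the marked list and return the updated counter.  */
static int
push_node (struct node *n)
{
  n->next = node_list;
  node_list = n;
  return ++node_counter;
}
```
@end verbatim

Because the macro vanishes during preprocessing, the marked
declarations behave exactly like their unmarked counterparts at run
time.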
The parser understands simple typedefs such as
@code{typedef struct @var{tag} *@var{name};} and
@code{typedef int @var{name};}.
These don't need to be marked.

@menu
* GTY Options::         What goes inside a @code{GTY(())}.
* GGC Roots::           Making global variables GGC roots.
* Files::               How the generated files work.
@end menu

@node GTY Options
@section The Inside of a @code{GTY(())}

Sometimes the C code is not enough to fully describe the type
structure. Extra information can be provided with @code{GTY} options
and additional markers. Some options take a parameter, which may be
either a string or a type name, depending on the parameter. If an
option takes no parameter, it is acceptable either to omit the
parameter entirely, or to provide an empty string as a parameter. For
example, @code{@w{GTY ((skip))}} and @code{@w{GTY ((skip ("")))}} are
equivalent.

When the parameter is a string, often it is a fragment of C code. Four
special escapes may be used in these strings, to refer to pieces of
the data structure being marked:

@cindex % in GTY option
@table @code
@item %h
The current structure.
@item %1
The structure that immediately contains the current structure.
@item %0
The outermost structure that contains the current structure.
@item %a
A partial expression of the form @code{[i1][i2]...} that indexes
the array item currently being marked.
@end table

For instance, suppose that you have a structure of the form
@smallexample
struct A @{
  ...
@};
struct B @{
  struct A foo[12];
@};
@end smallexample
@noindent
and @code{b} is a variable of type @code{struct B}. When marking
@samp{b.foo[11]}, @code{%h} would expand to @samp{b.foo[11]},
@code{%0} and @code{%1} would both expand to @samp{b}, and @code{%a}
would expand to @samp{[11]}.

As in ordinary C, adjacent strings will be concatenated; this is
helpful when you have a complicated expression.
@smallexample
@group
GTY ((chain_next ("TREE_CODE (&%h.generic) == INTEGER_TYPE"
                  " ? TYPE_NEXT_VARIANT (&%h.generic)"
                  " : TREE_CHAIN (&%h.generic)")))
@end group
@end smallexample

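The effect of the escapes can be pictured as plain textual
substitution. The sketch below is not the code used by the type
machinery; it is a hypothetical helper, @code{expand_escapes}, that
replaces @code{%h}, @code{%1}, @code{%0} and @code{%a} with strings
supplied by the caller, which is enough to reproduce the
@code{struct B} example above.

@verbatim
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy TMPL into OUT, replacing %h, %1, %0 and %a
   with the strings H, ONE, ZERO and A.  Returns 0 on success and -1 if
   OUT is too small or TMPL contains an unknown escape.  */
static int
expand_escapes (const char *tmpl, const char *h, const char *one,
                const char *zero, const char *a, char *out, size_t size)
{
  size_t used = 0;

  if (size == 0)
    return -1;

  for (const char *p = tmpl; *p != '\0'; p++)
    {
      char single[2] = { *p, '\0' };   /* the current literal character */
      const char *piece = single;

      if (*p == '%')
        switch (*++p)
          {
          case 'h': piece = h; break;
          case '1': piece = one; break;
          case '0': piece = zero; break;
          case 'a': piece = a; break;
          default: return -1;          /* unknown or truncated escape */
          }

      size_t len = strlen (piece);
      if (used + len >= size)          /* keep room for the terminator */
        return -1;
      memcpy (out + used, piece, len);
      used += len;
    }

  out[used] = '\0';
  return 0;
}
```
@end verbatim

Called with @code{"b.foo[11]"} for @code{%h}, @code{"b"} for both
@code{%1} and @code{%0}, and @code{"[11]"} for @code{%a}, the template
@code{"%h.num_elem"} becomes @code{b.foo[11].num_elem}.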
The available options are:

@table @code
@findex length
@item length ("@var{expression}")

There are two cases in which the type machinery must be told the
length of an array explicitly. The first case is when a structure ends
in a variable-length array, like this:
@smallexample
struct rtvec_def GTY(()) @{
  int num_elem;         /* @r{number of elements} */
  rtx GTY ((length ("%h.num_elem"))) elem[1];
@};
@end smallexample

In this case, the @code{length} option is used to override the specified
array length (which should usually be @code{1}). The parameter of the
option is a fragment of C code that calculates the length.

The second case is when a structure or a global variable contains a
pointer to an array, like this:
@smallexample
tree *
  GTY ((length ("%h.regno_pointer_align_length"))) regno_decl;
@end smallexample
In this case, @code{regno_decl} has been allocated by writing something like
@smallexample
  x->regno_decl =
    ggc_alloc (x->regno_pointer_align_length * sizeof (tree));
@end smallexample
and the @code{length} provides the length of the field.

This second use of @code{length} also works on global variables, like:
@verbatim
  static GTY((length ("reg_base_value_size")))
    rtx *reg_base_value;
@end verbatim

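For the first case above, what the @code{length} option provides can be
sketched as a walker that visits @code{num_elem} elements instead of
the declared bound. This is only an illustration under stated
assumptions: @code{calloc} stands in for the collector's allocator, the
names @code{vec}, @code{vec_alloc} and @code{vec_count_live} are
hypothetical, and the trailing array is written as a C99 flexible array
member rather than with the @code{elem[1]} idiom so that the sketch
stays within standard C.

@verbatim
```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical analogue of rtvec_def: num_elem records how many
   elements follow the header.  */
struct vec
{
  int num_elem;
  void *elem[];                 /* C99 flexible array member */
};

/* Allocate a zeroed vector with room for N elements.  calloc is an
   assumed stand-in for the garbage collector's allocator.  */
static struct vec *
vec_alloc (int n)
{
  struct vec *v = calloc (1, sizeof (struct vec)
                             + (size_t) n * sizeof (void *));
  if (v != NULL)
    v->num_elem = n;
  return v;
}

/* Visit elements 0 .. num_elem - 1, as a walker that was told
   length ("%h.num_elem") would, and count the non-null ones.  */
static int
vec_count_live (const struct vec *v)
{
  int live = 0;
  for (int i = 0; i < v->num_elem; i++)
    if (v->elem[i] != NULL)
      live++;
  return live;
}
```
@end verbatim

Without the length expression a walker would have no way to know how
many elements past the header are valid.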
@findex skip
@item skip

If @code{skip} is applied to a field, the type machinery will ignore it.
This is somewhat dangerous; the only safe use is in a union when one
field really isn't ever used.

@findex desc
@findex tag
@findex default
@item desc ("@var{expression}")
@itemx tag ("@var{constant}")
@itemx default

The type machinery needs to be told which field of a @code{union} is
currently active. This is done by giving each field a constant
@code{tag} value, and then specifying a discriminator using @code{desc}.
The value of the expression given by @code{desc} is compared against
each @code{tag} value, each of which should be different. If no
@code{tag} is matched, the field marked with @code{default} is used if
there is one; otherwise no field in the union will be marked.

In the @code{desc} option, the ``current structure'' is the union that
it discriminates. Use @code{%1} to mean the structure containing it.
There are no escapes available to the @code{tag} option, since it is a
constant.

For example,
@smallexample
struct tree_binding GTY(())
@{
  struct tree_common common;
  union tree_binding_u @{
    tree GTY ((tag ("0"))) scope;
    struct cp_binding_level * GTY ((tag ("1"))) level;
  @} GTY ((desc ("BINDING_HAS_LEVEL_P ((tree)&%0)"))) xscope;
  tree value;
@};
@end smallexample

In this example, the value of @code{BINDING_HAS_LEVEL_P} when applied
to a @code{struct tree_binding *} is presumed to be 0 or 1. If it is 1,
the type mechanism will treat the field @code{level} as being present;
if it is 0, it will treat the field @code{scope} as being present.

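The comparison of the @code{desc} value against the @code{tag}
constants amounts to a @code{switch} on the discriminator. The sketch
below illustrates that idea only; the structure @code{binding}, its
field @code{has_level} (playing the role of the @code{desc}
expression) and the function @code{binding_active_field} are
hypothetical and do not reproduce the generated marking code.

@verbatim
```c
#include <stddef.h>

/* Hypothetical analogue of tree_binding.  */
struct binding
{
  int has_level;                /* plays the role of the desc expression */
  union
  {
    int *scope;                 /* corresponds to tag ("0") */
    double *level;              /* corresponds to tag ("1") */
  } xscope;
};

/* Return the name of the union member a walker would follow for B, or
   NULL when no tag matches and there is no default field.  */
static const char *
binding_active_field (const struct binding *b)
{
  switch (b->has_level)
    {
    case 0:
      return "scope";
    case 1:
      return "level";
    default:
      return NULL;
    }
}
```
@end verbatim

A field marked @code{default} would correspond to returning that
field's name from the @code{default} label instead of @code{NULL}.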
@findex param_is
@findex use_param
@item param_is (@var{type})
@itemx use_param

Sometimes it's convenient to define some data structure to work on
generic pointers (that is, @code{PTR}) and then use it with a specific
type. @code{param_is} specifies the real type pointed to, and
@code{use_param} says where in the generic data structure that type
should be put.

For instance, to have a @code{htab_t} that points to trees, one would
write the definition of @code{htab_t} like this:
@smallexample
typedef struct GTY(()) @{
  @dots{}
  void ** GTY ((use_param, @dots{})) entries;
  @dots{}
@} htab_t;
@end smallexample
and then declare variables like this:
@smallexample
  static htab_t GTY ((param_is (union tree_node))) ict;
@end smallexample

@findex param@var{n}_is
@findex use_param@var{n}
@item param@var{n}_is (@var{type})
@itemx use_param@var{n}

In more complicated cases, the data structure might need to work on
several different types, which might not necessarily all be pointers.
For this, @code{param1_is} through @code{param9_is} may be used to
specify the real type of a field identified by @code{use_param1} through
@code{use_param9}.

@findex use_params
@item use_params

When a structure contains another structure that is parameterized,
there's no need to do anything special; the inner structure inherits the
parameters of the outer one. When a structure contains a pointer to a
parameterized structure, the type machinery won't automatically detect
this (it could, it just doesn't yet), so it's necessary to tell it that
the pointed-to structure should use the same parameters as the outer
structure. This is done by marking the pointer with the
@code{use_params} option.

@findex deletable
@item deletable

@code{deletable}, when applied to a global variable, indicates that when
garbage collection runs, there's no need to mark anything pointed to
by this variable; it can just be set to @code{NULL} instead. This is used
to keep a list of free structures around for re-use.

@findex if_marked
@item if_marked ("@var{expression}")

Suppose you want some kinds of object to be unique, and so you put them
in a hash table. If garbage collection marks the hash table, these
objects will never be freed, even if the last other reference to them
goes away. GGC has special handling to deal with this: if you use the
@code{if_marked} option on a global hash table, GGC will call the
routine whose name is the parameter to the option on each hash table
entry. If the routine returns nonzero, the hash table entry will
be marked as usual. If the routine returns zero, the hash table entry
will be deleted.

The routine @code{ggc_marked_p} can be used to determine if an element
has been marked already; in fact, the usual case is to use
@code{if_marked ("ggc_marked_p")}.

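The behaviour described for @code{if_marked} can be sketched as a sweep
over the table that keeps an entry only when the predicate returns
nonzero. This is a hedged illustration, not GGC's implementation: the
table is a plain array of slots, @code{obj_marked_p} is a hypothetical
stand-in for @code{ggc_marked_p}, and @code{sweep_table} is an invented
name.

@verbatim
```c
#include <stddef.h>

/* Hypothetical object carrying its own mark bit.  */
struct obj
{
  int marked;
};

/* Assumed stand-in for ggc_marked_p: nonzero if P is already marked.  */
static int
obj_marked_p (const void *p)
{
  return ((const struct obj *) p)->marked;
}

/* Call KEEP on every occupied slot; clear the slots for which it
   returns zero.  Returns the number of entries deleted.  */
static size_t
sweep_table (void **slots, size_t n, int (*keep) (const void *))
{
  size_t deleted = 0;
  for (size_t i = 0; i < n; i++)
    if (slots[i] != NULL && !keep (slots[i]))
      {
        slots[i] = NULL;
        deleted++;
      }
  return deleted;
}
```
@end verbatim

An entry therefore survives only if something other than the table
itself has already caused it to be marked.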
@findex maybe_undef
@item maybe_undef

When applied to a field, @code{maybe_undef} indicates that it's OK if
the structure that this field points to is never defined, so long as
this field is always @code{NULL}. This is used to avoid requiring
backends to define certain optional structures. It doesn't work with
language frontends.

@findex nested_ptr
@item nested_ptr (@var{type}, "@var{to expression}", "@var{from expression}")

The type machinery expects all pointers to point to the start of an
object. Sometimes for abstraction purposes it's convenient to have
a pointer which points inside an object. So long as it's possible to
convert the original object to and from the pointer, such pointers
can still be used. @var{type} is the type of the original object,
the @var{to expression} returns the pointer given the original object,
and the @var{from expression} returns the original object given
the pointer. The pointer will be available using the @code{%h}
escape.

@findex chain_next
@findex chain_prev
@item chain_next ("@var{expression}")
@itemx chain_prev ("@var{expression}")

It's helpful for the type machinery to know if objects are often
chained together in long lists; this lets it generate code that uses
less stack space by iterating along the list instead of recursing down
it. @code{chain_next} is an expression for the next item in the list,
and @code{chain_prev} is an expression for the previous item. For singly
linked lists, use only @code{chain_next}; for doubly linked lists, use
both. The machinery requires that taking the next item of the
previous item gives the original item.

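The stack-space saving comes from replacing a recursive call per list
element with a loop. The sketch below shows the iterative form under
simplifying assumptions: the structure @code{item}, its @code{marked}
flag and the function @code{mark_chain} are hypothetical, and the loop
stops at the first already-marked item on the assumption that marking
always runs to the end of a chain.

@verbatim
```c
#include <stddef.h>

/* Hypothetical list node; next corresponds to the chain_next
   expression "%h.next".  */
struct item
{
  struct item *next;
  int marked;
};

/* Mark X and everything reachable through next using a loop, so the
   stack depth does not grow with the length of the list.  Returns the
   number of items newly marked.  */
static int
mark_chain (struct item *x)
{
  int n = 0;
  for (; x != NULL && !x->marked; x = x->next)
    {
      x->marked = 1;
      n++;
    }
  return n;
}
```
@end verbatim

A recursive walker would instead call itself on @code{x->next}, using
one stack frame per element of the list.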
@findex reorder
@item reorder ("@var{function name}")

Some data structures depend on the relative ordering of pointers. If
the precompiled header machinery needs to change that ordering, it
will call the function referenced by the @code{reorder} option, before
changing the pointers in the object that's pointed to by the field the
option applies to. The function must take four arguments, with the
signature @samp{@w{void *, void *, gt_pointer_operator, void *}}.
The first parameter is a pointer to the structure that contains the
object being updated, or the object itself if there is no containing
structure. The second parameter is a cookie that should be ignored.
The third parameter is a routine that, given a pointer, will update it
to its correct new value. The fourth parameter is a cookie that must
be passed as the second argument to that routine.

PCH cannot handle data structures that depend on the absolute values
of pointers. @code{reorder} functions can be expensive. When
possible, it is better to depend on properties of the data, like an ID
number or the hash of a string, instead.

@findex special
@item special ("@var{name}")

The @code{special} option is used to mark types that have to be dealt
with by special case machinery. The parameter is the name of the
special case. See @file{gengtype.c} for further details. Avoid
adding new special cases unless there is no other alternative.
@end table

@node GGC Roots
@section Marking Roots for the Garbage Collector
@cindex roots, marking
@cindex marking roots

In addition to keeping track of types, the type machinery also locates
the global variables (@dfn{roots}) that the garbage collector starts
at. Roots must be declared using one of the following syntaxes:

@itemize @bullet
@item
@code{extern GTY(([@var{options}])) @var{type} @var{name};}
@item
@code{static GTY(([@var{options}])) @var{type} @var{name};}
@end itemize
@noindent
The syntax
@itemize @bullet
@item
@code{GTY(([@var{options}])) @var{type} @var{name};}
@end itemize
@noindent
is @emph{not} accepted. There should be an @code{extern} declaration
of such a variable in a header somewhere---mark that, not the
definition. Or, if the variable is only used in one file, make it
@code{static}.

@node Files
@section Source Files Containing Type Information
@cindex generated files
@cindex files, generated

Whenever you add @code{GTY} markers to a source file that previously
had none, or create a new source file containing @code{GTY} markers,
there are two things you need to do:

@enumerate
@item
You need to add the file to the list of source files the type
machinery scans. There are four cases:

@enumerate a
@item
For a back-end file, this is usually done
automatically; if not, you should add it to @code{target_gtfiles} in
the appropriate port's entries in @file{config.gcc}.

@item
For files shared by all front ends, add the filename to the
@code{GTFILES} variable in @file{Makefile.in}.

@item
For files that are part of one front end, add the filename to the
@code{gtfiles} variable defined in the appropriate
@file{config-lang.in}. For C, the file is @file{c-config-lang.in}.

@item
For files that are part of some but not all front ends, add the
filename to the @code{gtfiles} variable of @emph{all} the front ends
that use it.
@end enumerate

@item
If the file was a header file, you'll need to check that it's included
in the right place to be visible to the generated files. For a back-end
header file, this should be done automatically. For a front-end header
file, it needs to be included by the same file that includes
@file{gtype-@var{lang}.h}. For other header files, it needs to be
included in @file{gtype-desc.c}, which is a generated file, so add it to
@code{ifiles} in @code{open_base_file} in @file{gengtype.c}.

For source files that aren't header files, the machinery will generate a
header file that should be included in the source file you just changed.
The file will be called @file{gt-@var{path}.h} where @var{path} is the
pathname relative to the @file{gcc} directory with slashes replaced by
@verb{|-|}, so for example the header file to be included in
@file{cp/parser.c} is called @file{gt-cp-parser.h}. The
generated header file should be included after everything else in the
source file. Don't forget to mention this file as a dependency in the
@file{Makefile}!

@end enumerate

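The naming rule for the generated header can be sketched as a small
string transformation. The helper below is hypothetical
(@code{gt_header_name} is not part of the type machinery), and it
assumes that the source file's own extension is dropped before
@code{.h} is appended, which is inferred from the @file{cp/parser.c}
example rather than stated by the rule itself.

@verbatim
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build the generated header name for PATH, a
   pathname relative to the gcc directory, in OUT.  Returns 0 on
   success and -1 if OUT is too small.  Assumption: the file's own
   extension is removed before ".h" is appended.  */
static int
gt_header_name (const char *path, char *out, size_t size)
{
  const char *slash = strrchr (path, '/');
  const char *dot = strrchr (path, '.');
  size_t stem = strlen (path);

  /* Drop the extension only if the dot belongs to the last component.  */
  if (dot != NULL && (slash == NULL || dot > slash))
    stem = (size_t) (dot - path);

  /* "gt-" + stem + ".h" + terminating null byte.  */
  if (stem + 6 > size)
    return -1;

  memcpy (out, "gt-", 3);
  for (size_t i = 0; i < stem; i++)
    out[3 + i] = path[i] == '/' ? '-' : path[i];
  memcpy (out + 3 + stem, ".h", 3);
  return 0;
}
```
@end verbatim

Under those assumptions, @code{gt_header_name ("cp/parser.c", buf,
sizeof buf)} leaves @code{gt-cp-parser.h} in @code{buf}.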
For language frontends, there is another file that needs to be included
somewhere. It will be called @file{gtype-@var{lang}.h}, where
@var{lang} is the name of the subdirectory the language is contained in.