1 |
12 |
jlechner |
@c Copyright (c) 2004, 2005 Free Software Foundation, Inc.
|
2 |
|
|
@c Free Software Foundation, Inc.
|
3 |
|
|
@c This is part of the GCC manual.
|
4 |
|
|
@c For copying conditions, see the file gcc.texi.
|
5 |
|
|
|
6 |
|
|
@c ---------------------------------------------------------------------
|
7 |
|
|
@c Tree SSA
|
8 |
|
|
@c ---------------------------------------------------------------------
|
9 |
|
|
|
10 |
|
|
@node Tree SSA
|
11 |
|
|
@chapter Analysis and Optimization of GIMPLE Trees
|
12 |
|
|
@cindex Tree SSA
|
13 |
|
|
@cindex Optimization infrastructure for GIMPLE
|
14 |
|
|
|
15 |
|
|
GCC uses three main intermediate languages to represent the program
|
16 |
|
|
during compilation: GENERIC, GIMPLE and RTL@. GENERIC is a
|
17 |
|
|
language-independent representation generated by each front end. It
|
18 |
|
|
is used to serve as an interface between the parser and optimizer.
|
19 |
|
|
GENERIC is a common representation that is able to represent programs
|
20 |
|
|
written in all the languages supported by GCC@.
|
21 |
|
|
|
22 |
|
|
GIMPLE and RTL are used to optimize the program. GIMPLE is used for
|
23 |
|
|
target and language independent optimizations (e.g., inlining,
|
24 |
|
|
constant propagation, tail call elimination, redundancy elimination,
|
25 |
|
|
etc). Much like GENERIC, GIMPLE is a language independent, tree based
|
26 |
|
|
representation. However, it differs from GENERIC in that the GIMPLE
|
27 |
|
|
grammar is more restrictive: expressions contain no more than 3
|
28 |
|
|
operands (except function calls), it has no control flow structures
|
29 |
|
|
and expressions with side-effects are only allowed on the right hand
|
30 |
|
|
side of assignments. See the chapter describing GENERIC and GIMPLE
|
31 |
|
|
for more details.
|
32 |
|
|
|
33 |
|
|
This chapter describes the data structures and functions used in the
|
34 |
|
|
GIMPLE optimizers (also known as ``tree optimizers'' or ``middle
|
35 |
|
|
end''). In particular, it focuses on all the macros, data structures,
|
36 |
|
|
functions and programming constructs needed to implement optimization
|
37 |
|
|
passes for GIMPLE@.
|
38 |
|
|
|
39 |
|
|
@menu
|
40 |
|
|
* GENERIC:: A high-level language-independent representation.
|
41 |
|
|
* GIMPLE:: A lower-level factored tree representation.
|
42 |
|
|
* Annotations:: Attributes for statements and variables.
|
43 |
|
|
* Statement Operands:: Variables referenced by GIMPLE statements.
|
44 |
|
|
* SSA:: Static Single Assignment representation.
|
45 |
|
|
* Alias analysis:: Representing aliased loads and stores.
|
46 |
|
|
@end menu
|
47 |
|
|
|
48 |
|
|
@node GENERIC
|
49 |
|
|
@section GENERIC
|
50 |
|
|
@cindex GENERIC
|
51 |
|
|
|
52 |
|
|
The purpose of GENERIC is simply to provide a language-independent way of
|
53 |
|
|
representing an entire function in trees. To this end, it was necessary to
|
54 |
|
|
add a few new tree codes to the back end, but most everything was already
|
55 |
|
|
there. If you can express it with the codes in @code{gcc/tree.def}, it's
|
56 |
|
|
GENERIC@.
|
57 |
|
|
|
58 |
|
|
Early on, there was a great deal of debate about how to think about
|
59 |
|
|
statements in a tree IL@. In GENERIC, a statement is defined as any
|
60 |
|
|
expression whose value, if any, is ignored. A statement will always
|
61 |
|
|
have @code{TREE_SIDE_EFFECTS} set (or it will be discarded), but a
|
62 |
|
|
non-statement expression may also have side effects. A
|
63 |
|
|
@code{CALL_EXPR}, for instance.
|
64 |
|
|
|
65 |
|
|
It would be possible for some local optimizations to work on the
|
66 |
|
|
GENERIC form of a function; indeed, the adapted tree inliner works
|
67 |
|
|
fine on GENERIC, but the current compiler performs inlining after
|
68 |
|
|
lowering to GIMPLE (a restricted form described in the next section).
|
69 |
|
|
Indeed, currently the frontends perform this lowering before handing
|
70 |
|
|
off to @code{tree_rest_of_compilation}, but this seems inelegant.
|
71 |
|
|
|
72 |
|
|
If necessary, a front end can use some language-dependent tree codes
|
73 |
|
|
in its GENERIC representation, so long as it provides a hook for
|
74 |
|
|
converting them to GIMPLE and doesn't expect them to work with any
|
75 |
|
|
(hypothetical) optimizers that run before the conversion to GIMPLE@.
|
76 |
|
|
The intermediate representation used while parsing C and C++ looks
|
77 |
|
|
very little like GENERIC, but the C and C++ gimplifier hooks are
|
78 |
|
|
perfectly happy to take it as input and spit out GIMPLE@.
|
79 |
|
|
|
80 |
|
|
@node GIMPLE
|
81 |
|
|
@section GIMPLE
|
82 |
|
|
@cindex GIMPLE
|
83 |
|
|
|
84 |
|
|
GIMPLE is a simplified subset of GENERIC for use in optimization. The
|
85 |
|
|
particular subset chosen (and the name) was heavily influenced by the
|
86 |
|
|
SIMPLE IL used by the McCAT compiler project at McGill University,
|
87 |
|
|
though we have made some different choices. For one thing, SIMPLE
|
88 |
|
|
doesn't support @code{goto}; a production compiler can't afford that
|
89 |
|
|
kind of restriction.
|
90 |
|
|
|
91 |
|
|
GIMPLE retains much of the structure of the parse trees: lexical
|
92 |
|
|
scopes are represented as containers, rather than markers. However,
|
93 |
|
|
expressions are broken down into a 3-address form, using temporary
|
94 |
|
|
variables to hold intermediate values. Also, control structures are
|
95 |
|
|
lowered to gotos.
|
96 |
|
|
|
97 |
|
|
In GIMPLE no container node is ever used for its value; if a
|
98 |
|
|
@code{COND_EXPR} or @code{BIND_EXPR} has a value, it is stored into a
|
99 |
|
|
temporary within the controlled blocks, and that temporary is used in
|
100 |
|
|
place of the container.
|
101 |
|
|
|
102 |
|
|
The compiler pass which lowers GENERIC to GIMPLE is referred to as the
|
103 |
|
|
@samp{gimplifier}. The gimplifier works recursively, replacing complex
|
104 |
|
|
statements with sequences of simple statements.
|
105 |
|
|
|
106 |
|
|
@c Currently, the only way to
|
107 |
|
|
@c tell whether or not an expression is in GIMPLE form is by recursively
|
108 |
|
|
@c examining it; in the future there will probably be a flag to help avoid
|
109 |
|
|
@c redundant work. FIXME FIXME
|
110 |
|
|
|
111 |
|
|
@menu
|
112 |
|
|
* Interfaces::
|
113 |
|
|
* Temporaries::
|
114 |
|
|
* GIMPLE Expressions::
|
115 |
|
|
* Statements::
|
116 |
|
|
* GIMPLE Example::
|
117 |
|
|
* Rough GIMPLE Grammar::
|
118 |
|
|
@end menu
|
119 |
|
|
|
120 |
|
|
@node Interfaces
|
121 |
|
|
@subsection Interfaces
|
122 |
|
|
@cindex gimplification
|
123 |
|
|
|
124 |
|
|
The tree representation of a function is stored in
|
125 |
|
|
@code{DECL_SAVED_TREE}. It is lowered to GIMPLE by a call to
|
126 |
|
|
@code{gimplify_function_tree}.
|
127 |
|
|
|
128 |
|
|
If a front end wants to include language-specific tree codes in the tree
|
129 |
|
|
representation which it provides to the back end, it must provide a
|
130 |
|
|
definition of @code{LANG_HOOKS_GIMPLIFY_EXPR} which knows how to
|
131 |
|
|
convert the front end trees to GIMPLE@. Usually such a hook will involve
|
132 |
|
|
much of the same code for expanding front end trees to RTL@. This function
|
133 |
|
|
can return fully lowered GIMPLE, or it can return GENERIC trees and let the
|
134 |
|
|
main gimplifier lower them the rest of the way; this is often simpler.
|
135 |
|
|
|
136 |
|
|
The C and C++ front ends currently convert directly from front end
|
137 |
|
|
trees to GIMPLE, and hand that off to the back end rather than first
|
138 |
|
|
converting to GENERIC@. Their gimplifier hooks know about all the
|
139 |
|
|
@code{_STMT} nodes and how to convert them to GENERIC forms. There
|
140 |
|
|
was some work done on a genericization pass which would run first, but
|
141 |
|
|
the existence of @code{STMT_EXPR} meant that in order to convert all
|
142 |
|
|
of the C statements into GENERIC equivalents would involve walking the
|
143 |
|
|
entire tree anyway, so it was simpler to lower all the way. This
|
144 |
|
|
might change in the future if someone writes an optimization pass
|
145 |
|
|
which would work better with higher-level trees, but currently the
|
146 |
|
|
optimizers all expect GIMPLE@.
|
147 |
|
|
|
148 |
|
|
A front end which wants to use the tree optimizers (and already has
|
149 |
|
|
some sort of whole-function tree representation) only needs to provide
|
150 |
|
|
a definition of @code{LANG_HOOKS_GIMPLIFY_EXPR}, call
|
151 |
|
|
@code{gimplify_function_tree} to lower to GIMPLE, and then hand off to
|
152 |
|
|
@code{tree_rest_of_compilation} to compile and output the function.
|
153 |
|
|
|
154 |
|
|
You can tell the compiler to dump a C-like representation of the GIMPLE
|
155 |
|
|
form with the flag @option{-fdump-tree-gimple}.
|
156 |
|
|
|
157 |
|
|
@node Temporaries
|
158 |
|
|
@subsection Temporaries
|
159 |
|
|
@cindex Temporaries
|
160 |
|
|
|
161 |
|
|
When gimplification encounters a subexpression which is too complex, it
|
162 |
|
|
creates a new temporary variable to hold the value of the subexpression,
|
163 |
|
|
and adds a new statement to initialize it before the current statement.
|
164 |
|
|
These special temporaries are known as @samp{expression temporaries}, and are
|
165 |
|
|
allocated using @code{get_formal_tmp_var}. The compiler tries to
|
166 |
|
|
always evaluate identical expressions into the same temporary, to simplify
|
167 |
|
|
elimination of redundant calculations.
|
168 |
|
|
|
169 |
|
|
We can only use expression temporaries when we know that it will not be
|
170 |
|
|
reevaluated before its value is used, and that it will not be otherwise
|
171 |
|
|
modified@footnote{These restrictions are derived from those in Morgan 4.8.}.
|
172 |
|
|
Other temporaries can be allocated using
|
173 |
|
|
@code{get_initialized_tmp_var} or @code{create_tmp_var}.
|
174 |
|
|
|
175 |
|
|
Currently, an expression like @code{a = b + 5} is not reduced any
|
176 |
|
|
further. We tried converting it to something like
|
177 |
|
|
@smallexample
|
178 |
|
|
T1 = b + 5;
|
179 |
|
|
a = T1;
|
180 |
|
|
@end smallexample
|
181 |
|
|
but this bloated the representation for minimal benefit. However, a
|
182 |
|
|
variable which must live in memory cannot appear in an expression; its
|
183 |
|
|
value is explicitly loaded into a temporary first. Similarly, storing
|
184 |
|
|
the value of an expression to a memory variable goes through a
|
185 |
|
|
temporary.
|
186 |
|
|
|
187 |
|
|
@node GIMPLE Expressions
|
188 |
|
|
@subsection Expressions
|
189 |
|
|
@cindex GIMPLE Expressions
|
190 |
|
|
|
191 |
|
|
In general, expressions in GIMPLE consist of an operation and the
|
192 |
|
|
appropriate number of simple operands; these operands must either be a
|
193 |
|
|
GIMPLE rvalue (@code{is_gimple_val}), i.e.@: a constant or a register
|
194 |
|
|
variable. More complex operands are factored out into temporaries, so
|
195 |
|
|
that
|
196 |
|
|
@smallexample
|
197 |
|
|
a = b + c + d
|
198 |
|
|
@end smallexample
|
199 |
|
|
becomes
|
200 |
|
|
@smallexample
|
201 |
|
|
T1 = b + c;
|
202 |
|
|
a = T1 + d;
|
203 |
|
|
@end smallexample
|
204 |
|
|
|
205 |
|
|
The same rule holds for arguments to a @code{CALL_EXPR}.
|
206 |
|
|
|
207 |
|
|
The target of an assignment is usually a variable, but can also be an
|
208 |
|
|
@code{INDIRECT_REF} or a compound lvalue as described below.
|
209 |
|
|
|
210 |
|
|
@menu
|
211 |
|
|
* Compound Expressions::
|
212 |
|
|
* Compound Lvalues::
|
213 |
|
|
* Conditional Expressions::
|
214 |
|
|
* Logical Operators::
|
215 |
|
|
@end menu
|
216 |
|
|
|
217 |
|
|
@node Compound Expressions
|
218 |
|
|
@subsubsection Compound Expressions
|
219 |
|
|
@cindex Compound Expressions
|
220 |
|
|
|
221 |
|
|
The left-hand side of a C comma expression is simply moved into a separate
|
222 |
|
|
statement.
|
223 |
|
|
|
224 |
|
|
@node Compound Lvalues
|
225 |
|
|
@subsubsection Compound Lvalues
|
226 |
|
|
@cindex Compound Lvalues
|
227 |
|
|
|
228 |
|
|
Currently compound lvalues involving array and structure field references
|
229 |
|
|
are not broken down; an expression like @code{a.b[2] = 42} is not reduced
|
230 |
|
|
any further (though complex array subscripts are). This restriction is a
|
231 |
|
|
workaround for limitations in later optimizers; if we were to convert this
|
232 |
|
|
to
|
233 |
|
|
|
234 |
|
|
@smallexample
|
235 |
|
|
T1 = &a.b;
|
236 |
|
|
T1[2] = 42;
|
237 |
|
|
@end smallexample
|
238 |
|
|
|
239 |
|
|
alias analysis would not remember that the reference to @code{T1[2]} came
|
240 |
|
|
by way of @code{a.b}, so it would think that the assignment could alias
|
241 |
|
|
another member of @code{a}; this broke @code{struct-alias-1.c}. Future
|
242 |
|
|
optimizer improvements may make this limitation unnecessary.
|
243 |
|
|
|
244 |
|
|
@node Conditional Expressions
|
245 |
|
|
@subsubsection Conditional Expressions
|
246 |
|
|
@cindex Conditional Expressions
|
247 |
|
|
|
248 |
|
|
A C @code{?:} expression is converted into an @code{if} statement with
|
249 |
|
|
each branch assigning to the same temporary. So,
|
250 |
|
|
|
251 |
|
|
@smallexample
|
252 |
|
|
a = b ? c : d;
|
253 |
|
|
@end smallexample
|
254 |
|
|
becomes
|
255 |
|
|
@smallexample
|
256 |
|
|
if (b)
|
257 |
|
|
T1 = c;
|
258 |
|
|
else
|
259 |
|
|
T1 = d;
|
260 |
|
|
a = T1;
|
261 |
|
|
@end smallexample
|
262 |
|
|
|
263 |
|
|
Tree level if-conversion pass re-introduces @code{?:} expression, if appropriate.
|
264 |
|
|
It is used to vectorize loops with conditions using vector conditional operations.
|
265 |
|
|
|
266 |
|
|
Note that in GIMPLE, @code{if} statements are also represented using
|
267 |
|
|
@code{COND_EXPR}, as described below.
|
268 |
|
|
|
269 |
|
|
@node Logical Operators
|
270 |
|
|
@subsubsection Logical Operators
|
271 |
|
|
@cindex Logical Operators
|
272 |
|
|
|
273 |
|
|
Except when they appear in the condition operand of a @code{COND_EXPR},
|
274 |
|
|
logical `and' and `or' operators are simplified as follows:
|
275 |
|
|
@code{a = b && c} becomes
|
276 |
|
|
|
277 |
|
|
@smallexample
|
278 |
|
|
T1 = (bool)b;
|
279 |
|
|
if (T1)
|
280 |
|
|
T1 = (bool)c;
|
281 |
|
|
a = T1;
|
282 |
|
|
@end smallexample
|
283 |
|
|
|
284 |
|
|
Note that @code{T1} in this example cannot be an expression temporary,
|
285 |
|
|
because it has two different assignments.
|
286 |
|
|
|
287 |
|
|
@node Statements
|
288 |
|
|
@subsection Statements
|
289 |
|
|
@cindex Statements
|
290 |
|
|
|
291 |
|
|
Most statements will be assignment statements, represented by
|
292 |
|
|
@code{MODIFY_EXPR}. A @code{CALL_EXPR} whose value is ignored can
|
293 |
|
|
also be a statement. No other C expressions can appear at statement level;
|
294 |
|
|
a reference to a volatile object is converted into a @code{MODIFY_EXPR}.
|
295 |
|
|
In GIMPLE form, type of @code{MODIFY_EXPR} is not meaningful. Instead, use type
|
296 |
|
|
of LHS or RHS@.
|
297 |
|
|
|
298 |
|
|
There are also several varieties of complex statements.
|
299 |
|
|
|
300 |
|
|
@menu
|
301 |
|
|
* Blocks::
|
302 |
|
|
* Statement Sequences::
|
303 |
|
|
* Empty Statements::
|
304 |
|
|
* Loops::
|
305 |
|
|
* Selection Statements::
|
306 |
|
|
* Jumps::
|
307 |
|
|
* Cleanups::
|
308 |
|
|
* GIMPLE Exception Handling::
|
309 |
|
|
@end menu
|
310 |
|
|
|
311 |
|
|
@node Blocks
|
312 |
|
|
@subsubsection Blocks
|
313 |
|
|
@cindex Blocks
|
314 |
|
|
|
315 |
|
|
Block scopes and the variables they declare in GENERIC and GIMPLE are
|
316 |
|
|
expressed using the @code{BIND_EXPR} code, which in previous versions of
|
317 |
|
|
GCC was primarily used for the C statement-expression extension.
|
318 |
|
|
|
319 |
|
|
Variables in a block are collected into @code{BIND_EXPR_VARS} in
|
320 |
|
|
declaration order. Any runtime initialization is moved out of
|
321 |
|
|
@code{DECL_INITIAL} and into a statement in the controlled block. When
|
322 |
|
|
gimplifying from C or C++, this initialization replaces the
|
323 |
|
|
@code{DECL_STMT}.
|
324 |
|
|
|
325 |
|
|
Variable-length arrays (VLAs) complicate this process, as their size often
|
326 |
|
|
refers to variables initialized earlier in the block. To handle this, we
|
327 |
|
|
currently split the block at that point, and move the VLA into a new, inner
|
328 |
|
|
@code{BIND_EXPR}. This strategy may change in the future.
|
329 |
|
|
|
330 |
|
|
@code{DECL_SAVED_TREE} for a GIMPLE function will always be a
|
331 |
|
|
@code{BIND_EXPR} which contains declarations for the temporary variables
|
332 |
|
|
used in the function.
|
333 |
|
|
|
334 |
|
|
A C++ program will usually contain more @code{BIND_EXPR}s than there are
|
335 |
|
|
syntactic blocks in the source code, since several C++ constructs have
|
336 |
|
|
implicit scopes associated with them. On the other hand, although the C++
|
337 |
|
|
front end uses pseudo-scopes to handle cleanups for objects with
|
338 |
|
|
destructors, these don't translate into the GIMPLE form; multiple
|
339 |
|
|
declarations at the same level use the same @code{BIND_EXPR}.
|
340 |
|
|
|
341 |
|
|
@node Statement Sequences
|
342 |
|
|
@subsubsection Statement Sequences
|
343 |
|
|
@cindex Statement Sequences
|
344 |
|
|
|
345 |
|
|
Multiple statements at the same nesting level are collected into a
|
346 |
|
|
@code{STATEMENT_LIST}. Statement lists are modified and traversed
|
347 |
|
|
using the interface in @samp{tree-iterator.h}.
|
348 |
|
|
|
349 |
|
|
@node Empty Statements
|
350 |
|
|
@subsubsection Empty Statements
|
351 |
|
|
@cindex Empty Statements
|
352 |
|
|
|
353 |
|
|
Whenever possible, statements with no effect are discarded. But if they
|
354 |
|
|
are nested within another construct which cannot be discarded for some
|
355 |
|
|
reason, they are instead replaced with an empty statement, generated by
|
356 |
|
|
@code{build_empty_stmt}. Initially, all empty statements were shared,
|
357 |
|
|
after the pattern of the Java front end, but this caused a lot of trouble in
|
358 |
|
|
practice.
|
359 |
|
|
|
360 |
|
|
An empty statement is represented as @code{(void)0}.
|
361 |
|
|
|
362 |
|
|
@node Loops
|
363 |
|
|
@subsubsection Loops
|
364 |
|
|
@cindex Loops
|
365 |
|
|
|
366 |
|
|
At one time loops were expressed in GIMPLE using @code{LOOP_EXPR}, but
|
367 |
|
|
now they are lowered to explicit gotos.
|
368 |
|
|
|
369 |
|
|
@node Selection Statements
|
370 |
|
|
@subsubsection Selection Statements
|
371 |
|
|
@cindex Selection Statements
|
372 |
|
|
|
373 |
|
|
A simple selection statement, such as the C @code{if} statement, is
|
374 |
|
|
expressed in GIMPLE using a void @code{COND_EXPR}. If only one branch is
|
375 |
|
|
used, the other is filled with an empty statement.
|
376 |
|
|
|
377 |
|
|
Normally, the condition expression is reduced to a simple comparison. If
|
378 |
|
|
it is a shortcut (@code{&&} or @code{||}) expression, however, we try to
|
379 |
|
|
break up the @code{if} into multiple @code{if}s so that the implied shortcut
|
380 |
|
|
is taken directly, much like the transformation done by @code{do_jump} in
|
381 |
|
|
the RTL expander.
|
382 |
|
|
|
383 |
|
|
A @code{SWITCH_EXPR} in GIMPLE contains the condition and a
|
384 |
|
|
@code{TREE_VEC} of @code{CASE_LABEL_EXPR}s describing the case values
|
385 |
|
|
and corresponding @code{LABEL_DECL}s to jump to. The body of the
|
386 |
|
|
@code{switch} is moved after the @code{SWITCH_EXPR}.
|
387 |
|
|
|
388 |
|
|
@node Jumps
|
389 |
|
|
@subsubsection Jumps
|
390 |
|
|
@cindex Jumps
|
391 |
|
|
|
392 |
|
|
Other jumps are expressed by either @code{GOTO_EXPR} or @code{RETURN_EXPR}.
|
393 |
|
|
|
394 |
|
|
The operand of a @code{GOTO_EXPR} must be either a label or a variable
|
395 |
|
|
containing the address to jump to.
|
396 |
|
|
|
397 |
|
|
The operand of a @code{RETURN_EXPR} is either @code{NULL_TREE} or a
|
398 |
|
|
@code{MODIFY_EXPR} which sets the return value. It would be nice to
|
399 |
|
|
move the @code{MODIFY_EXPR} into a separate statement, but the special
|
400 |
|
|
return semantics in @code{expand_return} make that difficult. It may
|
401 |
|
|
still happen in the future, perhaps by moving most of that logic into
|
402 |
|
|
@code{expand_assignment}.
|
403 |
|
|
|
404 |
|
|
@node Cleanups
|
405 |
|
|
@subsubsection Cleanups
|
406 |
|
|
@cindex Cleanups
|
407 |
|
|
|
408 |
|
|
Destructors for local C++ objects and similar dynamic cleanups are
|
409 |
|
|
represented in GIMPLE by a @code{TRY_FINALLY_EXPR}.
|
410 |
|
|
@code{TRY_FINALLY_EXPR} has two operands, both of which are a sequence
|
411 |
|
|
of statements to execute. The first sequence is executed. When it
|
412 |
|
|
completes the second sequence is executed.
|
413 |
|
|
|
414 |
|
|
The first sequence may complete in the following ways:
|
415 |
|
|
|
416 |
|
|
@enumerate
|
417 |
|
|
|
418 |
|
|
@item Execute the last statement in the sequence and fall off the
|
419 |
|
|
end.
|
420 |
|
|
|
421 |
|
|
@item Execute a goto statement (@code{GOTO_EXPR}) to an ordinary
|
422 |
|
|
label outside the sequence.
|
423 |
|
|
|
424 |
|
|
@item Execute a return statement (@code{RETURN_EXPR}).
|
425 |
|
|
|
426 |
|
|
@item Throw an exception. This is currently not explicitly represented in
|
427 |
|
|
GIMPLE.
|
428 |
|
|
|
429 |
|
|
@end enumerate
|
430 |
|
|
|
431 |
|
|
The second sequence is not executed if the first sequence completes by
|
432 |
|
|
calling @code{setjmp} or @code{exit} or any other function that does
|
433 |
|
|
not return. The second sequence is also not executed if the first
|
434 |
|
|
sequence completes via a non-local goto or a computed goto (in general
|
435 |
|
|
the compiler does not know whether such a goto statement exits the
|
436 |
|
|
first sequence or not, so we assume that it doesn't).
|
437 |
|
|
|
438 |
|
|
After the second sequence is executed, if it completes normally by
|
439 |
|
|
falling off the end, execution continues wherever the first sequence
|
440 |
|
|
would have continued, by falling off the end, or doing a goto, etc.
|
441 |
|
|
|
442 |
|
|
@code{TRY_FINALLY_EXPR} complicates the flow graph, since the cleanup
|
443 |
|
|
needs to appear on every edge out of the controlled block; this
|
444 |
|
|
reduces the freedom to move code across these edges. Therefore, the
|
445 |
|
|
EH lowering pass which runs before most of the optimization passes
|
446 |
|
|
eliminates these expressions by explicitly adding the cleanup to each
|
447 |
|
|
edge. Rethrowing the exception is represented using @code{RESX_EXPR}.
|
448 |
|
|
|
449 |
|
|
|
450 |
|
|
@node GIMPLE Exception Handling
|
451 |
|
|
@subsubsection Exception Handling
|
452 |
|
|
@cindex GIMPLE Exception Handling
|
453 |
|
|
|
454 |
|
|
Other exception handling constructs are represented using
|
455 |
|
|
@code{TRY_CATCH_EXPR}. @code{TRY_CATCH_EXPR} has two operands. The
|
456 |
|
|
first operand is a sequence of statements to execute. If executing
|
457 |
|
|
these statements does not throw an exception, then the second operand
|
458 |
|
|
is ignored. Otherwise, if an exception is thrown, then the second
|
459 |
|
|
operand of the @code{TRY_CATCH_EXPR} is checked. The second operand
|
460 |
|
|
may have the following forms:
|
461 |
|
|
|
462 |
|
|
@enumerate
|
463 |
|
|
|
464 |
|
|
@item A sequence of statements to execute. When an exception occurs,
|
465 |
|
|
these statements are executed, and then the exception is rethrown.
|
466 |
|
|
|
467 |
|
|
@item A sequence of @code{CATCH_EXPR} expressions. Each @code{CATCH_EXPR}
|
468 |
|
|
has a list of applicable exception types and handler code. If the
|
469 |
|
|
thrown exception matches one of the caught types, the associated
|
470 |
|
|
handler code is executed. If the handler code falls off the bottom,
|
471 |
|
|
execution continues after the original @code{TRY_CATCH_EXPR}.
|
472 |
|
|
|
473 |
|
|
@item An @code{EH_FILTER_EXPR} expression. This has a list of
|
474 |
|
|
permitted exception types, and code to handle a match failure. If the
|
475 |
|
|
thrown exception does not match one of the allowed types, the
|
476 |
|
|
associated match failure code is executed. If the thrown exception
|
477 |
|
|
does match, it continues unwinding the stack looking for the next
|
478 |
|
|
handler.
|
479 |
|
|
|
480 |
|
|
@end enumerate
|
481 |
|
|
|
482 |
|
|
Currently throwing an exception is not directly represented in GIMPLE,
|
483 |
|
|
since it is implemented by calling a function. At some point in the future
|
484 |
|
|
we will want to add some way to express that the call will throw an
|
485 |
|
|
exception of a known type.
|
486 |
|
|
|
487 |
|
|
Just before running the optimizers, the compiler lowers the high-level
|
488 |
|
|
EH constructs above into a set of @samp{goto}s, magic labels, and EH
|
489 |
|
|
regions. Continuing to unwind at the end of a cleanup is represented
|
490 |
|
|
with a @code{RESX_EXPR}.
|
491 |
|
|
|
492 |
|
|
@node GIMPLE Example
|
493 |
|
|
@subsection GIMPLE Example
|
494 |
|
|
@cindex GIMPLE Example
|
495 |
|
|
|
496 |
|
|
@smallexample
|
497 |
|
|
struct A @{ A(); ~A(); @};
|
498 |
|
|
|
499 |
|
|
int i;
|
500 |
|
|
int g();
|
501 |
|
|
void f()
|
502 |
|
|
@{
|
503 |
|
|
A a;
|
504 |
|
|
int j = (--i, i ? 0 : 1);
|
505 |
|
|
|
506 |
|
|
for (int x = 42; x > 0; --x)
|
507 |
|
|
@{
|
508 |
|
|
i += g()*4 + 32;
|
509 |
|
|
@}
|
510 |
|
|
@}
|
511 |
|
|
@end smallexample
|
512 |
|
|
|
513 |
|
|
becomes
|
514 |
|
|
|
515 |
|
|
@smallexample
|
516 |
|
|
void f()
|
517 |
|
|
@{
|
518 |
|
|
int i.0;
|
519 |
|
|
int T.1;
|
520 |
|
|
int iftmp.2;
|
521 |
|
|
int T.3;
|
522 |
|
|
int T.4;
|
523 |
|
|
int T.5;
|
524 |
|
|
int T.6;
|
525 |
|
|
|
526 |
|
|
@{
|
527 |
|
|
struct A a;
|
528 |
|
|
int j;
|
529 |
|
|
|
530 |
|
|
__comp_ctor (&a);
|
531 |
|
|
try
|
532 |
|
|
@{
|
533 |
|
|
i.0 = i;
|
534 |
|
|
T.1 = i.0 - 1;
|
535 |
|
|
i = T.1;
|
536 |
|
|
i.0 = i;
|
537 |
|
|
if (i.0 == 0)
|
538 |
|
|
iftmp.2 = 1;
|
539 |
|
|
else
|
540 |
|
|
iftmp.2 = 0;
|
541 |
|
|
j = iftmp.2;
|
542 |
|
|
@{
|
543 |
|
|
int x;
|
544 |
|
|
|
545 |
|
|
x = 42;
|
546 |
|
|
goto test;
|
547 |
|
|
loop:;
|
548 |
|
|
|
549 |
|
|
T.3 = g ();
|
550 |
|
|
T.4 = T.3 * 4;
|
551 |
|
|
i.0 = i;
|
552 |
|
|
T.5 = T.4 + i.0;
|
553 |
|
|
T.6 = T.5 + 32;
|
554 |
|
|
i = T.6;
|
555 |
|
|
x = x - 1;
|
556 |
|
|
|
557 |
|
|
test:;
|
558 |
|
|
if (x > 0)
|
559 |
|
|
goto loop;
|
560 |
|
|
else
|
561 |
|
|
goto break_;
|
562 |
|
|
break_:;
|
563 |
|
|
@}
|
564 |
|
|
@}
|
565 |
|
|
finally
|
566 |
|
|
@{
|
567 |
|
|
__comp_dtor (&a);
|
568 |
|
|
@}
|
569 |
|
|
@}
|
570 |
|
|
@}
|
571 |
|
|
@end smallexample
|
572 |
|
|
|
573 |
|
|
@node Rough GIMPLE Grammar
|
574 |
|
|
@subsection Rough GIMPLE Grammar
|
575 |
|
|
@cindex Rough GIMPLE Grammar
|
576 |
|
|
|
577 |
|
|
@smallexample
|
578 |
|
|
function : FUNCTION_DECL
|
579 |
|
|
DECL_SAVED_TREE -> compound-stmt
|
580 |
|
|
|
581 |
|
|
compound-stmt: STATEMENT_LIST
|
582 |
|
|
members -> stmt
|
583 |
|
|
|
584 |
|
|
stmt : block
|
585 |
|
|
| if-stmt
|
586 |
|
|
| switch-stmt
|
587 |
|
|
| goto-stmt
|
588 |
|
|
| return-stmt
|
589 |
|
|
| resx-stmt
|
590 |
|
|
| label-stmt
|
591 |
|
|
| try-stmt
|
592 |
|
|
| modify-stmt
|
593 |
|
|
| call-stmt
|
594 |
|
|
|
595 |
|
|
block : BIND_EXPR
|
596 |
|
|
BIND_EXPR_VARS -> chain of DECLs
|
597 |
|
|
BIND_EXPR_BLOCK -> BLOCK
|
598 |
|
|
BIND_EXPR_BODY -> compound-stmt
|
599 |
|
|
|
600 |
|
|
if-stmt : COND_EXPR
|
601 |
|
|
op0 -> condition
|
602 |
|
|
op1 -> compound-stmt
|
603 |
|
|
op2 -> compound-stmt
|
604 |
|
|
|
605 |
|
|
switch-stmt : SWITCH_EXPR
|
606 |
|
|
op0 -> val
|
607 |
|
|
op1 -> NULL
|
608 |
|
|
op2 -> TREE_VEC of CASE_LABEL_EXPRs
|
609 |
|
|
The CASE_LABEL_EXPRs are sorted by CASE_LOW,
|
610 |
|
|
and default is last.
|
611 |
|
|
|
612 |
|
|
goto-stmt : GOTO_EXPR
|
613 |
|
|
op0 -> LABEL_DECL | val
|
614 |
|
|
|
615 |
|
|
return-stmt : RETURN_EXPR
|
616 |
|
|
op0 -> return-value
|
617 |
|
|
|
618 |
|
|
return-value : NULL
|
619 |
|
|
| RESULT_DECL
|
620 |
|
|
| MODIFY_EXPR
|
621 |
|
|
op0 -> RESULT_DECL
|
622 |
|
|
op1 -> lhs
|
623 |
|
|
|
624 |
|
|
resx-stmt : RESX_EXPR
|
625 |
|
|
|
626 |
|
|
label-stmt : LABEL_EXPR
|
627 |
|
|
op0 -> LABEL_DECL
|
628 |
|
|
|
629 |
|
|
try-stmt : TRY_CATCH_EXPR
|
630 |
|
|
op0 -> compound-stmt
|
631 |
|
|
op1 -> handler
|
632 |
|
|
| TRY_FINALLY_EXPR
|
633 |
|
|
op0 -> compound-stmt
|
634 |
|
|
op1 -> compound-stmt
|
635 |
|
|
|
636 |
|
|
handler : catch-seq
|
637 |
|
|
| EH_FILTER_EXPR
|
638 |
|
|
| compound-stmt
|
639 |
|
|
|
640 |
|
|
catch-seq : STATEMENT_LIST
|
641 |
|
|
members -> CATCH_EXPR
|
642 |
|
|
|
643 |
|
|
modify-stmt : MODIFY_EXPR
|
644 |
|
|
op0 -> lhs
|
645 |
|
|
op1 -> rhs
|
646 |
|
|
|
647 |
|
|
call-stmt : CALL_EXPR
|
648 |
|
|
op0 -> val | OBJ_TYPE_REF
|
649 |
|
|
op1 -> call-arg-list
|
650 |
|
|
|
651 |
|
|
call-arg-list: TREE_LIST
|
652 |
|
|
members -> lhs | CONST
|
653 |
|
|
|
654 |
|
|
addr-expr-arg: ID
|
655 |
|
|
| compref
|
656 |
|
|
|
657 |
|
|
addressable : addr-expr-arg
|
658 |
|
|
| indirectref
|
659 |
|
|
|
660 |
|
|
with-size-arg: addressable
|
661 |
|
|
| call-stmt
|
662 |
|
|
|
663 |
|
|
indirectref : INDIRECT_REF
|
664 |
|
|
op0 -> val
|
665 |
|
|
|
666 |
|
|
lhs : addressable
|
667 |
|
|
| bitfieldref
|
668 |
|
|
| WITH_SIZE_EXPR
|
669 |
|
|
op0 -> with-size-arg
|
670 |
|
|
op1 -> val
|
671 |
|
|
|
672 |
|
|
min-lval : ID
|
673 |
|
|
| indirectref
|
674 |
|
|
|
675 |
|
|
bitfieldref : BIT_FIELD_REF
|
676 |
|
|
op0 -> inner-compref
|
677 |
|
|
op1 -> CONST
|
678 |
|
|
op2 -> var
|
679 |
|
|
|
680 |
|
|
compref : inner-compref
|
681 |
|
|
| TARGET_MEM_REF
|
682 |
|
|
op0 -> ID
|
683 |
|
|
op1 -> val
|
684 |
|
|
op2 -> val
|
685 |
|
|
op3 -> CONST
|
686 |
|
|
op4 -> CONST
|
687 |
|
|
| REALPART_EXPR
|
688 |
|
|
op0 -> inner-compref
|
689 |
|
|
| IMAGPART_EXPR
|
690 |
|
|
op0 -> inner-compref
|
691 |
|
|
|
692 |
|
|
inner-compref: min-lval
|
693 |
|
|
| COMPONENT_REF
|
694 |
|
|
op0 -> inner-compref
|
695 |
|
|
op1 -> FIELD_DECL
|
696 |
|
|
op2 -> val
|
697 |
|
|
| ARRAY_REF
|
698 |
|
|
op0 -> inner-compref
|
699 |
|
|
op1 -> val
|
700 |
|
|
op2 -> val
|
701 |
|
|
op3 -> val
|
702 |
|
|
| ARRAY_RANGE_REF
|
703 |
|
|
op0 -> inner-compref
|
704 |
|
|
op1 -> val
|
705 |
|
|
op2 -> val
|
706 |
|
|
op3 -> val
|
707 |
|
|
| VIEW_CONVERT_EXPR
|
708 |
|
|
op0 -> inner-compref
|
709 |
|
|
|
710 |
|
|
condition : val
|
711 |
|
|
| RELOP
|
712 |
|
|
op0 -> val
|
713 |
|
|
op1 -> val
|
714 |
|
|
|
715 |
|
|
val : ID
|
716 |
|
|
| CONST
|
717 |
|
|
|
718 |
|
|
rhs : lhs
|
719 |
|
|
| CONST
|
720 |
|
|
| call-stmt
|
721 |
|
|
| ADDR_EXPR
|
722 |
|
|
op0 -> addr-expr-arg
|
723 |
|
|
| UNOP
|
724 |
|
|
op0 -> val
|
725 |
|
|
| BINOP
|
726 |
|
|
op0 -> val
|
727 |
|
|
op1 -> val
|
728 |
|
|
| RELOP
|
729 |
|
|
op0 -> val
|
730 |
|
|
op1 -> val
|
731 |
|
|
| COND_EXPR
|
732 |
|
|
op0 -> condition
|
733 |
|
|
op1 -> val
|
734 |
|
|
op2 -> val
|
735 |
|
|
@end smallexample
|
736 |
|
|
|
737 |
|
|
@node Annotations
|
738 |
|
|
@section Annotations
|
739 |
|
|
@cindex annotations
|
740 |
|
|
|
741 |
|
|
The optimizers need to associate attributes with statements and
|
742 |
|
|
variables during the optimization process. For instance, we need to
|
743 |
|
|
know what basic block a statement belongs to or whether a variable
|
744 |
|
|
has aliases. All these attributes are stored in data structures
|
745 |
|
|
called annotations which are then linked to the field @code{ann} in
|
746 |
|
|
@code{struct tree_common}.
|
747 |
|
|
|
748 |
|
|
Presently, we define annotations for statements (@code{stmt_ann_t}),
|
749 |
|
|
variables (@code{var_ann_t}) and SSA names (@code{ssa_name_ann_t}).
|
750 |
|
|
Annotations are defined and documented in @file{tree-flow.h}.
|
751 |
|
|
|
752 |
|
|
|
753 |
|
|
@node Statement Operands
|
754 |
|
|
@section Statement Operands
|
755 |
|
|
@cindex operands
|
756 |
|
|
@cindex virtual operands
|
757 |
|
|
@cindex real operands
|
758 |
|
|
@findex update_stmt
|
759 |
|
|
|
760 |
|
|
Almost every GIMPLE statement will contain a reference to a variable
|
761 |
|
|
or memory location. Since statements come in different shapes and
|
762 |
|
|
sizes, their operands are going to be located at various spots inside
|
763 |
|
|
the statement's tree. To facilitate access to the statement's
|
764 |
|
|
operands, they are organized into lists associated inside each
|
765 |
|
|
statement's annotation. Each element in an operand list is a pointer
|
766 |
|
|
to a @code{VAR_DECL}, @code{PARM_DECL} or @code{SSA_NAME} tree node.
|
767 |
|
|
This provides a very convenient way of examining and replacing
|
768 |
|
|
operands.
|
769 |
|
|
|
770 |
|
|
Data flow analysis and optimization is done on all tree nodes
|
771 |
|
|
representing variables. Any node for which @code{SSA_VAR_P} returns
|
772 |
|
|
nonzero is considered when scanning statement operands. However, not
|
773 |
|
|
all @code{SSA_VAR_P} variables are processed in the same way. For the
|
774 |
|
|
purposes of optimization, we need to distinguish between references to
|
775 |
|
|
local scalar variables and references to globals, statics, structures,
|
776 |
|
|
arrays, aliased variables, etc. The reason is simple, the compiler
|
777 |
|
|
can gather complete data flow information for a local scalar. On the
|
778 |
|
|
other hand, a global variable may be modified by a function call, it
|
779 |
|
|
may not be possible to keep track of all the elements of an array or
|
780 |
|
|
the fields of a structure, etc.
|
781 |
|
|
|
782 |
|
|
The operand scanner gathers two kinds of operands: @dfn{real} and
|
783 |
|
|
@dfn{virtual}. An operand for which @code{is_gimple_reg} returns true
|
784 |
|
|
is considered real, otherwise it is a virtual operand. We also
|
785 |
|
|
distinguish between uses and definitions. An operand is used if its
|
786 |
|
|
value is loaded by the statement (e.g., the operand at the RHS of an
|
787 |
|
|
assignment). If the statement assigns a new value to the operand, the
|
788 |
|
|
operand is considered a definition (e.g., the operand at the LHS of
|
789 |
|
|
an assignment).
|
790 |
|
|
|
791 |
|
|
Virtual and real operands also have very different data flow
|
792 |
|
|
properties. Real operands are unambiguous references to the
|
793 |
|
|
full object that they represent. For instance, given
|
794 |
|
|
|
795 |
|
|
@smallexample
|
796 |
|
|
@{
|
797 |
|
|
int a, b;
|
798 |
|
|
a = b
|
799 |
|
|
@}
|
800 |
|
|
@end smallexample
|
801 |
|
|
|
802 |
|
|
Since @code{a} and @code{b} are non-aliased locals, the statement
|
803 |
|
|
@code{a = b} will have one real definition and one real use because
|
804 |
|
|
variable @code{b} is completely modified with the contents of
|
805 |
|
|
variable @code{a}. Real definition are also known as @dfn{killing
|
806 |
|
|
definitions}. Similarly, the use of @code{a} reads all its bits.
|
807 |
|
|
|
808 |
|
|
In contrast, virtual operands are used with variables that can have
|
809 |
|
|
a partial or ambiguous reference. This includes structures, arrays,
|
810 |
|
|
globals, and aliased variables. In these cases, we have two types of
|
811 |
|
|
definitions. For globals, structures, and arrays, we can determine from
|
812 |
|
|
a statement whether a variable of these types has a killing definition.
|
813 |
|
|
If the variable does, then the statement is marked as having a
|
814 |
|
|
@dfn{must definition} of that variable. However, if a statement is only
|
815 |
|
|
defining a part of the variable (i.e.@: a field in a structure), or if we
|
816 |
|
|
know that a statement might define the variable but we cannot say for sure,
|
817 |
|
|
then we mark that statement as having a @dfn{may definition}. For
|
818 |
|
|
instance, given
|
819 |
|
|
|
820 |
|
|
@smallexample
|
821 |
|
|
@{
|
822 |
|
|
int a, b, *p;
|
823 |
|
|
|
824 |
|
|
if (...)
|
825 |
|
|
p = &a;
|
826 |
|
|
else
|
827 |
|
|
p = &b;
|
828 |
|
|
*p = 5;
|
829 |
|
|
return *p;
|
830 |
|
|
@}
|
831 |
|
|
@end smallexample
|
832 |
|
|
|
833 |
|
|
The assignment @code{*p = 5} may be a definition of @code{a} or
|
834 |
|
|
@code{b}. If we cannot determine statically where @code{p} is
|
835 |
|
|
pointing to at the time of the store operation, we create virtual
|
836 |
|
|
definitions to mark that statement as a potential definition site for
|
837 |
|
|
@code{a} and @code{b}. Memory loads are similarly marked with virtual
|
838 |
|
|
use operands. Virtual operands are shown in tree dumps right before
|
839 |
|
|
the statement that contains them. To request a tree dump with virtual
|
840 |
|
|
operands, use the @option{-vops} option to @option{-fdump-tree}:
|
841 |
|
|
|
842 |
|
|
@smallexample
|
843 |
|
|
@{
|
844 |
|
|
int a, b, *p;
|
845 |
|
|
|
846 |
|
|
if (...)
|
847 |
|
|
p = &a;
|
848 |
|
|
else
|
849 |
|
|
p = &b;
|
850 |
|
|
# a = V_MAY_DEF <a>
|
851 |
|
|
# b = V_MAY_DEF <b>
|
852 |
|
|
*p = 5;
|
853 |
|
|
|
854 |
|
|
# VUSE <a>
|
855 |
|
|
# VUSE <b>
|
856 |
|
|
return *p;
|
857 |
|
|
@}
|
858 |
|
|
@end smallexample
|
859 |
|
|
|
860 |
|
|
Notice that @code{V_MAY_DEF} operands have two copies of the referenced
|
861 |
|
|
variable. This indicates that this is not a killing definition of
|
862 |
|
|
that variable. In this case we refer to it as a @dfn{may definition}
|
863 |
|
|
or @dfn{aliased store}. The presence of the second copy of the
|
864 |
|
|
variable in the @code{V_MAY_DEF} operand will become important when the
|
865 |
|
|
function is converted into SSA form. This will be used to link all
|
866 |
|
|
the non-killing definitions to prevent optimizations from making
|
867 |
|
|
incorrect assumptions about them.
|
868 |
|
|
|
869 |
|
|
Operands are updated as soon as the statement is finished via a call
|
870 |
|
|
to @code{update_stmt}. If statement elements are changed via
|
871 |
|
|
@code{SET_USE} or @code{SET_DEF}, then no further action is required
|
872 |
|
|
(ie, those macros take care of updating the statement). If changes
|
873 |
|
|
are made by manipulating the statement's tree directly, then a call
|
874 |
|
|
must be made to @code{update_stmt} when complete. Calling one of the
|
875 |
|
|
@code{bsi_insert} routines or @code{bsi_replace} performs an implicit
|
876 |
|
|
call to @code{update_stmt}.
|
877 |
|
|
|
878 |
|
|
@subsection Operand Iterators And Access Routines
|
879 |
|
|
@cindex Operand Iterators
|
880 |
|
|
@cindex Operand Access Routines
|
881 |
|
|
|
882 |
|
|
Operands are collected by @file{tree-ssa-operands.c}. They are stored
|
883 |
|
|
inside each statement's annotation and can be accessed through either the
|
884 |
|
|
operand iterators or an access routine.
|
885 |
|
|
|
886 |
|
|
The following access routines are available for examining operands:
|
887 |
|
|
|
888 |
|
|
@enumerate
|
889 |
|
|
@item @code{SINGLE_SSA_@{USE,DEF,TREE@}_OPERAND}: These accessors will return
|
890 |
|
|
NULL unless there is exactly one operand matching the specified flags. If
|
891 |
|
|
there is exactly one operand, the operand is returned as either a @code{tree},
|
892 |
|
|
@code{def_operand_p}, or @code{use_operand_p}.
|
893 |
|
|
|
894 |
|
|
@smallexample
|
895 |
|
|
tree t = SINGLE_SSA_TREE_OPERAND (stmt, flags);
|
896 |
|
|
use_operand_p u = SINGLE_SSA_USE_OPERAND (stmt, SSA_ALL_VIRTUAL_USES);
|
897 |
|
|
def_operand_p d = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_ALL_DEFS);
|
898 |
|
|
@end smallexample
|
899 |
|
|
|
900 |
|
|
@item @code{ZERO_SSA_OPERANDS}: This macro returns true if there are no
|
901 |
|
|
operands matching the specified flags.
|
902 |
|
|
|
903 |
|
|
@smallexample
|
904 |
|
|
if (ZERO_SSA_OPERANDS (stmt, SSA_OP_ALL_VIRTUALS))
|
905 |
|
|
return;
|
906 |
|
|
@end smallexample
|
907 |
|
|
|
908 |
|
|
@item @code{NUM_SSA_OPERANDS}: This macro Returns the number of operands
|
909 |
|
|
matching 'flags'. This actually executes a loop to perform the count, so
|
910 |
|
|
only use this if it is really needed.
|
911 |
|
|
|
912 |
|
|
@smallexample
|
913 |
|
|
int count = NUM_SSA_OPERANDS (stmt, flags)
|
914 |
|
|
@end smallexample
|
915 |
|
|
@end enumerate
|
916 |
|
|
|
917 |
|
|
|
918 |
|
|
If you wish to iterate over some or all operands, use the
|
919 |
|
|
@code{FOR_EACH_SSA_@{USE,DEF,TREE@}_OPERAND} iterator. For example, to print
|
920 |
|
|
all the operands for a statement:
|
921 |
|
|
|
922 |
|
|
@smallexample
|
923 |
|
|
void
|
924 |
|
|
print_ops (tree stmt)
|
925 |
|
|
@{
|
926 |
|
|
ssa_op_iter;
|
927 |
|
|
tree var;
|
928 |
|
|
|
929 |
|
|
FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_ALL_OPERANDS)
|
930 |
|
|
print_generic_expr (stderr, var, TDF_SLIM);
|
931 |
|
|
@}
|
932 |
|
|
@end smallexample
|
933 |
|
|
|
934 |
|
|
|
935 |
|
|
How to choose the appropriate iterator:
|
936 |
|
|
|
937 |
|
|
@enumerate
|
938 |
|
|
@item Determine whether you are need to see the operand pointers, or just the
|
939 |
|
|
trees, and choose the appropriate macro:
|
940 |
|
|
|
941 |
|
|
@smallexample
|
942 |
|
|
Need Macro:
|
943 |
|
|
---- -------
|
944 |
|
|
use_operand_p FOR_EACH_SSA_USE_OPERAND
|
945 |
|
|
def_operand_p FOR_EACH_SSA_DEF_OPERAND
|
946 |
|
|
tree FOR_EACH_SSA_TREE_OPERAND
|
947 |
|
|
@end smallexample
|
948 |
|
|
|
949 |
|
|
@item You need to declare a variable of the type you are interested
|
950 |
|
|
in, and an ssa_op_iter structure which serves as the loop
|
951 |
|
|
controlling variable.
|
952 |
|
|
|
953 |
|
|
@item Determine which operands you wish to use, and specify the flags of
|
954 |
|
|
those you are interested in. They are documented in
|
955 |
|
|
@file{tree-ssa-operands.h}:
|
956 |
|
|
|
957 |
|
|
@smallexample
|
958 |
|
|
#define SSA_OP_USE 0x01 /* @r{Real USE operands.} */
|
959 |
|
|
#define SSA_OP_DEF 0x02 /* @r{Real DEF operands.} */
|
960 |
|
|
#define SSA_OP_VUSE 0x04 /* @r{VUSE operands.} */
|
961 |
|
|
#define SSA_OP_VMAYUSE 0x08 /* @r{USE portion of V_MAY_DEFS.} */
|
962 |
|
|
#define SSA_OP_VMAYDEF 0x10 /* @r{DEF portion of V_MAY_DEFS.} */
|
963 |
|
|
#define SSA_OP_VMUSTDEF 0x20 /* @r{V_MUST_DEF definitions.} */
|
964 |
|
|
|
965 |
|
|
/* @r{These are commonly grouped operand flags.} */
|
966 |
|
|
#define SSA_OP_VIRTUAL_USES (SSA_OP_VUSE | SSA_OP_VMAYUSE)
|
967 |
|
|
#define SSA_OP_VIRTUAL_DEFS (SSA_OP_VMAYDEF | SSA_OP_VMUSTDEF)
|
968 |
|
|
#define SSA_OP_ALL_USES (SSA_OP_VIRTUAL_USES | SSA_OP_USE)
|
969 |
|
|
#define SSA_OP_ALL_DEFS (SSA_OP_VIRTUAL_DEFS | SSA_OP_DEF)
|
970 |
|
|
#define SSA_OP_ALL_OPERANDS (SSA_OP_ALL_USES | SSA_OP_ALL_DEFS)
|
971 |
|
|
@end smallexample
|
972 |
|
|
@end enumerate
|
973 |
|
|
|
974 |
|
|
So if you want to look at the use pointers for all the @code{USE} and
|
975 |
|
|
@code{VUSE} operands, you would do something like:
|
976 |
|
|
|
977 |
|
|
@smallexample
|
978 |
|
|
use_operand_p use_p;
|
979 |
|
|
ssa_op_iter iter;
|
980 |
|
|
|
981 |
|
|
FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, (SSA_OP_USE | SSA_OP_VUSE))
|
982 |
|
|
@{
|
983 |
|
|
process_use_ptr (use_p);
|
984 |
|
|
@}
|
985 |
|
|
@end smallexample
|
986 |
|
|
|
987 |
|
|
The @code{TREE} macro is basically the same as the @code{USE} and
|
988 |
|
|
@code{DEF} macros, only with the use or def dereferenced via
|
989 |
|
|
@code{USE_FROM_PTR (use_p)} and @code{DEF_FROM_PTR (def_p)}. Since we
|
990 |
|
|
aren't using operand pointers, use and defs flags can be mixed.
|
991 |
|
|
|
992 |
|
|
@smallexample
|
993 |
|
|
tree var;
|
994 |
|
|
ssa_op_iter iter;
|
995 |
|
|
|
996 |
|
|
FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_VUSE | SSA_OP_VMUSTDEF)
|
997 |
|
|
@{
|
998 |
|
|
print_generic_expr (stderr, var, TDF_SLIM);
|
999 |
|
|
@}
|
1000 |
|
|
@end smallexample
|
1001 |
|
|
|
1002 |
|
|
@code{V_MAY_DEF}s are broken into two flags, one for the
|
1003 |
|
|
@code{DEF} portion (@code{SSA_OP_VMAYDEF}) and one for the USE portion
|
1004 |
|
|
(@code{SSA_OP_VMAYUSE}). If all you want to look at are the
|
1005 |
|
|
@code{V_MAY_DEF}s together, there is a fourth iterator macro for this,
|
1006 |
|
|
which returns both a def_operand_p and a use_operand_p for each
|
1007 |
|
|
@code{V_MAY_DEF} in the statement. Note that you don't need any flags for
|
1008 |
|
|
this one.
|
1009 |
|
|
|
1010 |
|
|
@smallexample
|
1011 |
|
|
use_operand_p use_p;
|
1012 |
|
|
def_operand_p def_p;
|
1013 |
|
|
ssa_op_iter iter;
|
1014 |
|
|
|
1015 |
|
|
FOR_EACH_SSA_MAYDEF_OPERAND (def_p, use_p, stmt, iter)
|
1016 |
|
|
@{
|
1017 |
|
|
my_code;
|
1018 |
|
|
@}
|
1019 |
|
|
@end smallexample
|
1020 |
|
|
|
1021 |
|
|
@code{V_MUST_DEF}s are broken into two flags, one for the
|
1022 |
|
|
@code{DEF} portion (@code{SSA_OP_VMUSTDEF}) and one for the kill portion
|
1023 |
|
|
(@code{SSA_OP_VMUSTKILL}). If all you want to look at are the
|
1024 |
|
|
@code{V_MUST_DEF}s together, there is a fourth iterator macro for this,
|
1025 |
|
|
which returns both a def_operand_p and a use_operand_p for each
|
1026 |
|
|
@code{V_MUST_DEF} in the statement. Note that you don't need any flags for
|
1027 |
|
|
this one.
|
1028 |
|
|
|
1029 |
|
|
@smallexample
|
1030 |
|
|
use_operand_p kill_p;
|
1031 |
|
|
def_operand_p def_p;
|
1032 |
|
|
ssa_op_iter iter;
|
1033 |
|
|
|
1034 |
|
|
FOR_EACH_SSA_MUSTDEF_OPERAND (def_p, kill_p, stmt, iter)
|
1035 |
|
|
@{
|
1036 |
|
|
my_code;
|
1037 |
|
|
@}
|
1038 |
|
|
@end smallexample
|
1039 |
|
|
|
1040 |
|
|
|
1041 |
|
|
There are many examples in the code as well, as well as the
|
1042 |
|
|
documentation in @file{tree-ssa-operands.h}.
|
1043 |
|
|
|
1044 |
|
|
There are also a couple of variants on the stmt iterators regarding PHI
|
1045 |
|
|
nodes.
|
1046 |
|
|
|
1047 |
|
|
@code{FOR_EACH_PHI_ARG} Works exactly like
|
1048 |
|
|
@code{FOR_EACH_SSA_USE_OPERAND}, except it works over @code{PHI} arguments
|
1049 |
|
|
instead of statement operands.
|
1050 |
|
|
|
1051 |
|
|
@smallexample
|
1052 |
|
|
/* Look at every virtual PHI use. */
|
1053 |
|
|
FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_VIRTUAL_USES)
|
1054 |
|
|
@{
|
1055 |
|
|
my_code;
|
1056 |
|
|
@}
|
1057 |
|
|
|
1058 |
|
|
/* Look at every real PHI use. */
|
1059 |
|
|
FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_USES)
|
1060 |
|
|
my_code;
|
1061 |
|
|
|
1062 |
|
|
/* Look at every every PHI use. */
|
1063 |
|
|
FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_ALL_USES)
|
1064 |
|
|
my_code;
|
1065 |
|
|
@end smallexample
|
1066 |
|
|
|
1067 |
|
|
@code{FOR_EACH_PHI_OR_STMT_@{USE,DEF@}} works exactly like
|
1068 |
|
|
@code{FOR_EACH_SSA_@{USE,DEF@}_OPERAND}, except it will function on
|
1069 |
|
|
either a statement or a @code{PHI} node. These should be used when it is
|
1070 |
|
|
appropriate but they are not quite as efficient as the individual
|
1071 |
|
|
@code{FOR_EACH_PHI} and @code{FOR_EACH_SSA} routines.
|
1072 |
|
|
|
1073 |
|
|
@smallexample
|
1074 |
|
|
FOR_EACH_PHI_OR_STMT_USE (use_operand_p, stmt, iter, flags)
|
1075 |
|
|
@{
|
1076 |
|
|
my_code;
|
1077 |
|
|
@}
|
1078 |
|
|
|
1079 |
|
|
FOR_EACH_PHI_OR_STMT_DEF (def_operand_p, phi, iter, flags)
|
1080 |
|
|
@{
|
1081 |
|
|
my_code;
|
1082 |
|
|
@}
|
1083 |
|
|
@end smallexample
|
1084 |
|
|
|
1085 |
|
|
@subsection Immediate Uses
|
1086 |
|
|
@cindex Immediate Uses
|
1087 |
|
|
|
1088 |
|
|
Immediate use information is now always available. Using the immediate use
|
1089 |
|
|
iterators, you may examine every use of any @code{SSA_NAME}. For instance,
|
1090 |
|
|
to change each use of @code{ssa_var} to @code{ssa_var2}:
|
1091 |
|
|
|
1092 |
|
|
@smallexample
|
1093 |
|
|
use_operand_p imm_use_p;
|
1094 |
|
|
imm_use_iterator iterator;
|
1095 |
|
|
tree ssa_var
|
1096 |
|
|
|
1097 |
|
|
FOR_EACH_IMM_USE_SAFE (imm_use_p, iterator, ssa_var)
|
1098 |
|
|
SET_USE (imm_use_p, ssa_var_2);
|
1099 |
|
|
@end smallexample
|
1100 |
|
|
|
1101 |
|
|
There are 2 iterators which can be used. @code{FOR_EACH_IMM_USE_FAST} is used
|
1102 |
|
|
when the immediate uses are not changed, ie. you are looking at the uses, but
|
1103 |
|
|
not setting them.
|
1104 |
|
|
|
1105 |
|
|
If they do get changed, then care must be taken that things are not changed
|
1106 |
|
|
under the iterators, so use the @code{FOR_EACH_IMM_USE_SAFE} iterator. It
|
1107 |
|
|
attempts to preserve the sanity of the use list by moving an iterator element
|
1108 |
|
|
through the use list, preventing insertions and deletions in the list from
|
1109 |
|
|
resulting in invalid pointers. This is a little slower since it adds a
|
1110 |
|
|
placeholder element and moves it through the list. This element must be
|
1111 |
|
|
also be removed if the loop is terminated early. A macro
|
1112 |
|
|
(@code{BREAK_FROM SAFE_IMM_USE}) is provided for this:
|
1113 |
|
|
|
1114 |
|
|
@smallexample
|
1115 |
|
|
FOR_EACH_IMM_USE_SAFE (use_p, iter, var)
|
1116 |
|
|
@{
|
1117 |
|
|
if (var == last_var)
|
1118 |
|
|
BREAK_FROM_SAFE_IMM_USE (iter);
|
1119 |
|
|
else
|
1120 |
|
|
SET_USE (use_p, var2);
|
1121 |
|
|
@}
|
1122 |
|
|
@end smallexample
|
1123 |
|
|
|
1124 |
|
|
There are checks in @code{verify_ssa} which verify that the immediate use list
|
1125 |
|
|
is up to date, as well as checking that an optimization didn't break from the
|
1126 |
|
|
loop without using this macro. It is safe to simply 'break'; from a
|
1127 |
|
|
@code{FOR_EACH_IMM_USE_FAST} traverse.
|
1128 |
|
|
|
1129 |
|
|
Some useful functions and macros:
|
1130 |
|
|
@enumerate
|
1131 |
|
|
@item @code{has_zero_uses (ssa_var)} : Returns true if there are no uses of
|
1132 |
|
|
@code{ssa_var}.
|
1133 |
|
|
@item @code{has_single_use (ssa_var)} : Returns true if there is only a
|
1134 |
|
|
single use of @code{ssa_var}.
|
1135 |
|
|
@item @code{single_imm_use (ssa_var, use_operand_p *ptr, tree *stmt)} :
|
1136 |
|
|
Returns true if there is only a single use of @code{ssa_var}, and also returns
|
1137 |
|
|
the use pointer and statement it occurs in in the second and third parameters.
|
1138 |
|
|
@item @code{num_imm_uses (ssa_var)} : Returns the number of immediate uses of
|
1139 |
|
|
@code{ssa_var}. It is better not to use this if possible since it simply
|
1140 |
|
|
utilizes a loop to count the uses.
|
1141 |
|
|
@item @code{PHI_ARG_INDEX_FROM_USE (use_p)} : Given a use within a @code{PHI}
|
1142 |
|
|
node, return the index number for the use. An assert is triggered if the use
|
1143 |
|
|
isn't located in a @code{PHI} node.
|
1144 |
|
|
@item @code{USE_STMT (use_p)} : Return the statement a use occurs in.
|
1145 |
|
|
@end enumerate
|
1146 |
|
|
|
1147 |
|
|
Note that uses are not put into an immediate use list until their statement is
|
1148 |
|
|
actually inserted into the instruction stream via a @code{bsi_*} routine.
|
1149 |
|
|
|
1150 |
|
|
It is also still possible to utilize lazy updating of statements, but this
|
1151 |
|
|
should be used only when absolutely required. Both alias analysis and the
|
1152 |
|
|
dominator optimizations currently do this.
|
1153 |
|
|
|
1154 |
|
|
When lazy updating is being used, the immediate use information is out of date
|
1155 |
|
|
and cannot be used reliably. Lazy updating is achieved by simply marking
|
1156 |
|
|
statements modified via calls to @code{mark_stmt_modified} instead of
|
1157 |
|
|
@code{update_stmt}. When lazy updating is no longer required, all the
|
1158 |
|
|
modified statements must have @code{update_stmt} called in order to bring them
|
1159 |
|
|
up to date. This must be done before the optimization is finished, or
|
1160 |
|
|
@code{verify_ssa} will trigger an abort.
|
1161 |
|
|
|
1162 |
|
|
This is done with a simple loop over the instruction stream:
|
1163 |
|
|
@smallexample
|
1164 |
|
|
block_stmt_iterator bsi;
|
1165 |
|
|
basic_block bb;
|
1166 |
|
|
FOR_EACH_BB (bb)
|
1167 |
|
|
@{
|
1168 |
|
|
for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
|
1169 |
|
|
update_stmt_if_modified (bsi_stmt (bsi));
|
1170 |
|
|
@}
|
1171 |
|
|
@end smallexample
|
1172 |
|
|
|
1173 |
|
|
@node SSA
|
1174 |
|
|
@section Static Single Assignment
|
1175 |
|
|
@cindex SSA
|
1176 |
|
|
@cindex static single assignment
|
1177 |
|
|
|
1178 |
|
|
Most of the tree optimizers rely on the data flow information provided
|
1179 |
|
|
by the Static Single Assignment (SSA) form. We implement the SSA form
|
1180 |
|
|
as described in @cite{R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and
|
1181 |
|
|
K. Zadeck. Efficiently Computing Static Single Assignment Form and the
|
1182 |
|
|
Control Dependence Graph. ACM Transactions on Programming Languages
|
1183 |
|
|
and Systems, 13(4):451-490, October 1991}.
|
1184 |
|
|
|
1185 |
|
|
The SSA form is based on the premise that program variables are
|
1186 |
|
|
assigned in exactly one location in the program. Multiple assignments
|
1187 |
|
|
to the same variable create new versions of that variable. Naturally,
|
1188 |
|
|
actual programs are seldom in SSA form initially because variables
|
1189 |
|
|
tend to be assigned multiple times. The compiler modifies the program
|
1190 |
|
|
representation so that every time a variable is assigned in the code,
|
1191 |
|
|
a new version of the variable is created. Different versions of the
|
1192 |
|
|
same variable are distinguished by subscripting the variable name with
|
1193 |
|
|
its version number. Variables used in the right-hand side of
|
1194 |
|
|
expressions are renamed so that their version number matches that of
|
1195 |
|
|
the most recent assignment.
|
1196 |
|
|
|
1197 |
|
|
We represent variable versions using @code{SSA_NAME} nodes. The
|
1198 |
|
|
renaming process in @file{tree-ssa.c} wraps every real and
|
1199 |
|
|
virtual operand with an @code{SSA_NAME} node which contains
|
1200 |
|
|
the version number and the statement that created the
|
1201 |
|
|
@code{SSA_NAME}. Only definitions and virtual definitions may
|
1202 |
|
|
create new @code{SSA_NAME} nodes.
|
1203 |
|
|
|
1204 |
|
|
Sometimes, flow of control makes it impossible to determine what is the
|
1205 |
|
|
most recent version of a variable. In these cases, the compiler
|
1206 |
|
|
inserts an artificial definition for that variable called
|
1207 |
|
|
@dfn{PHI function} or @dfn{PHI node}. This new definition merges
|
1208 |
|
|
all the incoming versions of the variable to create a new name
|
1209 |
|
|
for it. For instance,
|
1210 |
|
|
|
1211 |
|
|
@smallexample
|
1212 |
|
|
if (...)
|
1213 |
|
|
a_1 = 5;
|
1214 |
|
|
else if (...)
|
1215 |
|
|
a_2 = 2;
|
1216 |
|
|
else
|
1217 |
|
|
a_3 = 13;
|
1218 |
|
|
|
1219 |
|
|
# a_4 = PHI <a_1, a_2, a_3>
|
1220 |
|
|
return a_4;
|
1221 |
|
|
@end smallexample
|
1222 |
|
|
|
1223 |
|
|
Since it is not possible to determine which of the three branches
|
1224 |
|
|
will be taken at runtime, we don't know which of @code{a_1},
|
1225 |
|
|
@code{a_2} or @code{a_3} to use at the return statement. So, the
|
1226 |
|
|
SSA renamer creates a new version @code{a_4} which is assigned
|
1227 |
|
|
the result of ``merging'' @code{a_1}, @code{a_2} and @code{a_3}.
|
1228 |
|
|
Hence, PHI nodes mean ``one of these operands. I don't know
|
1229 |
|
|
which''.
|
1230 |
|
|
|
1231 |
|
|
The following macros can be used to examine PHI nodes
|
1232 |
|
|
|
1233 |
|
|
@defmac PHI_RESULT (@var{phi})
|
1234 |
|
|
Returns the @code{SSA_NAME} created by PHI node @var{phi} (i.e.,
|
1235 |
|
|
@var{phi}'s LHS)@.
|
1236 |
|
|
@end defmac
|
1237 |
|
|
|
1238 |
|
|
@defmac PHI_NUM_ARGS (@var{phi})
|
1239 |
|
|
Returns the number of arguments in @var{phi}. This number is exactly
|
1240 |
|
|
the number of incoming edges to the basic block holding @var{phi}@.
|
1241 |
|
|
@end defmac
|
1242 |
|
|
|
1243 |
|
|
@defmac PHI_ARG_ELT (@var{phi}, @var{i})
|
1244 |
|
|
Returns a tuple representing the @var{i}th argument of @var{phi}@.
|
1245 |
|
|
Each element of this tuple contains an @code{SSA_NAME} @var{var} and
|
1246 |
|
|
the incoming edge through which @var{var} flows.
|
1247 |
|
|
@end defmac
|
1248 |
|
|
|
1249 |
|
|
@defmac PHI_ARG_EDGE (@var{phi}, @var{i})
|
1250 |
|
|
Returns the incoming edge for the @var{i}th argument of @var{phi}.
|
1251 |
|
|
@end defmac
|
1252 |
|
|
|
1253 |
|
|
@defmac PHI_ARG_DEF (@var{phi}, @var{i})
|
1254 |
|
|
Returns the @code{SSA_NAME} for the @var{i}th argument of @var{phi}.
|
1255 |
|
|
@end defmac
|
1256 |
|
|
|
1257 |
|
|
|
1258 |
|
|
@subsection Preserving the SSA form
|
1259 |
|
|
@findex update_ssa
|
1260 |
|
|
@cindex preserving SSA form
|
1261 |
|
|
Some optimization passes make changes to the function that
|
1262 |
|
|
invalidate the SSA property. This can happen when a pass has
|
1263 |
|
|
added new symbols or changed the program so that variables that
|
1264 |
|
|
were previously aliased aren't anymore. Whenever something like this
|
1265 |
|
|
happens, the affected symbols must be renamed into SSA form again.
|
1266 |
|
|
Transformations that emit new code or replicate existing statements
|
1267 |
|
|
will also need to update the SSA form@.
|
1268 |
|
|
|
1269 |
|
|
Since GCC implements two different SSA forms for register and virtual
|
1270 |
|
|
variables, keeping the SSA form up to date depends on whether you are
|
1271 |
|
|
updating register or virtual names. In both cases, the general idea
|
1272 |
|
|
behind incremental SSA updates is similar: when new SSA names are
|
1273 |
|
|
created, they typically are meant to replace other existing names in
|
1274 |
|
|
the program@.
|
1275 |
|
|
|
1276 |
|
|
For instance, given the following code:
|
1277 |
|
|
|
1278 |
|
|
@smallexample
|
1279 |
|
|
1 L0:
|
1280 |
|
|
2 x_1 = PHI (0, x_5)
|
1281 |
|
|
3 if (x_1 < 10)
|
1282 |
|
|
4 if (x_1 > 7)
|
1283 |
|
|
5 y_2 = 0
|
1284 |
|
|
6 else
|
1285 |
|
|
7 y_3 = x_1 + x_7
|
1286 |
|
|
8 endif
|
1287 |
|
|
9 x_5 = x_1 + 1
|
1288 |
|
|
10 goto L0;
|
1289 |
|
|
11 endif
|
1290 |
|
|
@end smallexample
|
1291 |
|
|
|
1292 |
|
|
Suppose that we insert new names @code{x_10} and @code{x_11} (lines
|
1293 |
|
|
@code{4} and @code{8})@.
|
1294 |
|
|
|
1295 |
|
|
@smallexample
|
1296 |
|
|
1 L0:
|
1297 |
|
|
2 x_1 = PHI (0, x_5)
|
1298 |
|
|
3 if (x_1 < 10)
|
1299 |
|
|
4 x_10 = ...
|
1300 |
|
|
5 if (x_1 > 7)
|
1301 |
|
|
6 y_2 = 0
|
1302 |
|
|
7 else
|
1303 |
|
|
8 x_11 = ...
|
1304 |
|
|
9 y_3 = x_1 + x_7
|
1305 |
|
|
10 endif
|
1306 |
|
|
11 x_5 = x_1 + 1
|
1307 |
|
|
12 goto L0;
|
1308 |
|
|
13 endif
|
1309 |
|
|
@end smallexample
|
1310 |
|
|
|
1311 |
|
|
We want to replace all the uses of @code{x_1} with the new definitions
|
1312 |
|
|
of @code{x_10} and @code{x_11}. Note that the only uses that should
|
1313 |
|
|
be replaced are those at lines @code{5}, @code{9} and @code{11}.
|
1314 |
|
|
Also, the use of @code{x_7} at line @code{9} should @emph{not} be
|
1315 |
|
|
replaced (this is why we cannot just mark symbol @code{x} for
|
1316 |
|
|
renaming)@.
|
1317 |
|
|
|
1318 |
|
|
Additionally, we may need to insert a PHI node at line @code{11}
|
1319 |
|
|
because that is a merge point for @code{x_10} and @code{x_11}. So the
|
1320 |
|
|
use of @code{x_1} at line @code{11} will be replaced with the new PHI
|
1321 |
|
|
node. The insertion of PHI nodes is optional. They are not strictly
|
1322 |
|
|
necessary to preserve the SSA form, and depending on what the caller
|
1323 |
|
|
inserted, they may not even be useful for the optimizers@.
|
1324 |
|
|
|
1325 |
|
|
Updating the SSA form is a two step process. First, the pass has to
|
1326 |
|
|
identify which names need to be updated and/or which symbols need to
|
1327 |
|
|
be renamed into SSA form for the first time. When new names are
|
1328 |
|
|
introduced to replace existing names in the program, the mapping
|
1329 |
|
|
between the old and the new names are registered by calling
|
1330 |
|
|
@code{register_new_name_mapping} (note that if your pass creates new
|
1331 |
|
|
code by duplicating basic blocks, the call to @code{tree_duplicate_bb}
|
1332 |
|
|
will set up the necessary mappings automatically). On the other hand,
|
1333 |
|
|
if your pass exposes a new symbol that should be put in SSA form for
|
1334 |
|
|
the first time, the new symbol should be registered with
|
1335 |
|
|
@code{mark_sym_for_renaming}.
|
1336 |
|
|
|
1337 |
|
|
After the replacement mappings have been registered and new symbols
|
1338 |
|
|
marked for renaming, a call to @code{update_ssa} makes the registered
|
1339 |
|
|
changes. This can be done with an explicit call or by creating
|
1340 |
|
|
@code{TODO} flags in the @code{tree_opt_pass} structure for your pass.
|
1341 |
|
|
There are several @code{TODO} flags that control the behavior of
|
1342 |
|
|
@code{update_ssa}:
|
1343 |
|
|
|
1344 |
|
|
@itemize @bullet
|
1345 |
|
|
@item @code{TODO_update_ssa}. Update the SSA form inserting PHI nodes
|
1346 |
|
|
for newly exposed symbols and virtual names marked for updating.
|
1347 |
|
|
When updating real names, only insert PHI nodes for a real name
|
1348 |
|
|
@code{O_j} in blocks reached by all the new and old definitions for
|
1349 |
|
|
@code{O_j}. If the iterated dominance frontier for @code{O_j}
|
1350 |
|
|
is not pruned, we may end up inserting PHI nodes in blocks that
|
1351 |
|
|
have one or more edges with no incoming definition for
|
1352 |
|
|
@code{O_j}. This would lead to uninitialized warnings for
|
1353 |
|
|
@code{O_j}'s symbol@.
|
1354 |
|
|
|
1355 |
|
|
@item @code{TODO_update_ssa_no_phi}. Update the SSA form without
|
1356 |
|
|
inserting any new PHI nodes at all. This is used by passes that
|
1357 |
|
|
have either inserted all the PHI nodes themselves or passes that
|
1358 |
|
|
need only to patch use-def and def-def chains for virtuals
|
1359 |
|
|
(e.g., DCE)@.
|
1360 |
|
|
|
1361 |
|
|
|
1362 |
|
|
@item @code{TODO_update_ssa_full_phi}. Insert PHI nodes everywhere
|
1363 |
|
|
they are needed. No pruning of the IDF is done. This is used
|
1364 |
|
|
by passes that need the PHI nodes for @code{O_j} even if it
|
1365 |
|
|
means that some arguments will come from the default definition
|
1366 |
|
|
of @code{O_j}'s symbol (e.g., @code{pass_linear_transform})@.
|
1367 |
|
|
|
1368 |
|
|
WARNING: If you need to use this flag, chances are that your
|
1369 |
|
|
pass may be doing something wrong. Inserting PHI nodes for an
|
1370 |
|
|
old name where not all edges carry a new replacement may lead to
|
1371 |
|
|
silent codegen errors or spurious uninitialized warnings@.
|
1372 |
|
|
|
1373 |
|
|
@item @code{TODO_update_ssa_only_virtuals}. Passes that update the
|
1374 |
|
|
SSA form on their own may want to delegate the updating of
|
1375 |
|
|
virtual names to the generic updater. Since FUD chains are
|
1376 |
|
|
easier to maintain, this simplifies the work they need to do.
|
1377 |
|
|
NOTE: If this flag is used, any OLD->NEW mappings for real names
|
1378 |
|
|
are explicitly destroyed and only the symbols marked for
|
1379 |
|
|
renaming are processed@.
|
1380 |
|
|
@end itemize
|
1381 |
|
|
|
1382 |
|
|
|
1383 |
|
|
@subsection Examining @code{SSA_NAME} nodes
|
1384 |
|
|
@cindex examining SSA_NAMEs
|
1385 |
|
|
|
1386 |
|
|
The following macros can be used to examine @code{SSA_NAME} nodes
|
1387 |
|
|
|
1388 |
|
|
@defmac SSA_NAME_DEF_STMT (@var{var})
|
1389 |
|
|
Returns the statement @var{s} that creates the @code{SSA_NAME}
|
1390 |
|
|
@var{var}. If @var{s} is an empty statement (i.e., @code{IS_EMPTY_STMT
|
1391 |
|
|
(@var{s})} returns @code{true}), it means that the first reference to
|
1392 |
|
|
this variable is a USE or a VUSE@.
|
1393 |
|
|
@end defmac
|
1394 |
|
|
|
1395 |
|
|
@defmac SSA_NAME_VERSION (@var{var})
|
1396 |
|
|
Returns the version number of the @code{SSA_NAME} object @var{var}.
|
1397 |
|
|
@end defmac
|
1398 |
|
|
|
1399 |
|
|
|
1400 |
|
|
@subsection Walking use-def chains
|
1401 |
|
|
|
1402 |
|
|
@deftypefn {Tree SSA function} void walk_use_def_chains (@var{var}, @var{fn}, @var{data})
|
1403 |
|
|
|
1404 |
|
|
Walks use-def chains starting at the @code{SSA_NAME} node @var{var}.
|
1405 |
|
|
Calls function @var{fn} at each reaching definition found. Function
|
1406 |
|
|
@var{FN} takes three arguments: @var{var}, its defining statement
|
1407 |
|
|
(@var{def_stmt}) and a generic pointer to whatever state information
|
1408 |
|
|
that @var{fn} may want to maintain (@var{data}). Function @var{fn} is
|
1409 |
|
|
able to stop the walk by returning @code{true}, otherwise in order to
|
1410 |
|
|
continue the walk, @var{fn} should return @code{false}.
|
1411 |
|
|
|
1412 |
|
|
Note, that if @var{def_stmt} is a @code{PHI} node, the semantics are
|
1413 |
|
|
slightly different. For each argument @var{arg} of the PHI node, this
|
1414 |
|
|
function will:
|
1415 |
|
|
|
1416 |
|
|
@enumerate
|
1417 |
|
|
@item Walk the use-def chains for @var{arg}.
|
1418 |
|
|
@item Call @code{FN (@var{arg}, @var{phi}, @var{data})}.
|
1419 |
|
|
@end enumerate
|
1420 |
|
|
|
1421 |
|
|
Note how the first argument to @var{fn} is no longer the original
|
1422 |
|
|
variable @var{var}, but the PHI argument currently being examined.
|
1423 |
|
|
If @var{fn} wants to get at @var{var}, it should call
|
1424 |
|
|
@code{PHI_RESULT} (@var{phi}).
|
1425 |
|
|
@end deftypefn
|
1426 |
|
|
|
1427 |
|
|
@subsection Walking the dominator tree
|
1428 |
|
|
|
1429 |
|
|
@deftypefn {Tree SSA function} void walk_dominator_tree (@var{walk_data}, @var{bb})
|
1430 |
|
|
|
1431 |
|
|
This function walks the dominator tree for the current CFG calling a
|
1432 |
|
|
set of callback functions defined in @var{struct dom_walk_data} in
|
1433 |
|
|
@file{domwalk.h}. The call back functions you need to define give you
|
1434 |
|
|
hooks to execute custom code at various points during traversal:
|
1435 |
|
|
|
1436 |
|
|
@enumerate
|
1437 |
|
|
@item Once to initialize any local data needed while processing
|
1438 |
|
|
@var{bb} and its children. This local data is pushed into an
|
1439 |
|
|
internal stack which is automatically pushed and popped as the
|
1440 |
|
|
walker traverses the dominator tree.
|
1441 |
|
|
|
1442 |
|
|
@item Once before traversing all the statements in the @var{bb}.
|
1443 |
|
|
|
1444 |
|
|
@item Once for every statement inside @var{bb}.
|
1445 |
|
|
|
1446 |
|
|
@item Once after traversing all the statements and before recursing
|
1447 |
|
|
into @var{bb}'s dominator children.
|
1448 |
|
|
|
1449 |
|
|
@item It then recurses into all the dominator children of @var{bb}.
|
1450 |
|
|
|
1451 |
|
|
@item After recursing into all the dominator children of @var{bb} it
|
1452 |
|
|
can, optionally, traverse every statement in @var{bb} again
|
1453 |
|
|
(i.e., repeating steps 2 and 3).
|
1454 |
|
|
|
1455 |
|
|
@item Once after walking the statements in @var{bb} and @var{bb}'s
|
1456 |
|
|
dominator children. At this stage, the block local data stack
|
1457 |
|
|
is popped.
|
1458 |
|
|
@end enumerate
|
1459 |
|
|
@end deftypefn
|
1460 |
|
|
|
1461 |
|
|
@node Alias analysis
|
1462 |
|
|
@section Alias analysis
|
1463 |
|
|
@cindex alias
|
1464 |
|
|
@cindex flow-sensitive alias analysis
|
1465 |
|
|
@cindex flow-insensitive alias analysis
|
1466 |
|
|
|
1467 |
|
|
Alias analysis proceeds in 4 main phases:
|
1468 |
|
|
|
1469 |
|
|
@enumerate
|
1470 |
|
|
@item Structural alias analysis.
|
1471 |
|
|
|
1472 |
|
|
This phase walks the types for structure variables, and determines which
|
1473 |
|
|
of the fields can overlap using offset and size of each field. For each
|
1474 |
|
|
field, a ``subvariable'' called a ``Structure field tag'' (SFT)@ is
|
1475 |
|
|
created, which represents that field as a separate variable. All
|
1476 |
|
|
accesses that could possibly overlap with a given field will have
|
1477 |
|
|
virtual operands for the SFT of that field.
|
1478 |
|
|
|
1479 |
|
|
@smallexample
|
1480 |
|
|
struct foo
|
1481 |
|
|
@{
|
1482 |
|
|
int a;
|
1483 |
|
|
int b;
|
1484 |
|
|
@}
|
1485 |
|
|
struct foo temp;
|
1486 |
|
|
int bar (void)
|
1487 |
|
|
@{
|
1488 |
|
|
int tmp1, tmp2, tmp3;
|
1489 |
|
|
SFT.0_2 = V_MUST_DEF <SFT.0_1>
|
1490 |
|
|
temp.a = 5;
|
1491 |
|
|
SFT.1_4 = V_MUST_DEF <SFT.1_3>
|
1492 |
|
|
temp.b = 6;
|
1493 |
|
|
|
1494 |
|
|
VUSE <SFT.1_4>
|
1495 |
|
|
tmp1_5 = temp.b;
|
1496 |
|
|
VUSE <SFT.0_2>
|
1497 |
|
|
tmp2_6 = temp.a;
|
1498 |
|
|
|
1499 |
|
|
tmp3_7 = tmp1_5 + tmp2_6;
|
1500 |
|
|
return tmp3_7;
|
1501 |
|
|
@}
|
1502 |
|
|
@end smallexample
|
1503 |
|
|
|
1504 |
|
|
If you copy the type tag for a variable for some reason, you probably
|
1505 |
|
|
also want to copy the subvariables for that variable.
|
1506 |
|
|
|
1507 |
|
|
@item Points-to and escape analysis.
|
1508 |
|
|
|
1509 |
|
|
This phase walks the use-def chains in the SSA web looking for
|
1510 |
|
|
three things:
|
1511 |
|
|
|
1512 |
|
|
@itemize @bullet
|
1513 |
|
|
@item Assignments of the form @code{P_i = &VAR}
|
1514 |
|
|
@item Assignments of the form P_i = malloc()
|
1515 |
|
|
@item Pointers and ADDR_EXPR that escape the current function.
|
1516 |
|
|
@end itemize
|
1517 |
|
|
|
1518 |
|
|
The concept of `escaping' is the same one used in the Java world.
|
1519 |
|
|
When a pointer or an ADDR_EXPR escapes, it means that it has been
|
1520 |
|
|
exposed outside of the current function. So, assignment to
|
1521 |
|
|
global variables, function arguments and returning a pointer are
|
1522 |
|
|
all escape sites.
|
1523 |
|
|
|
1524 |
|
|
This is where we are currently limited. Since not everything is
|
1525 |
|
|
renamed into SSA, we lose track of escape properties when a
|
1526 |
|
|
pointer is stashed inside a field in a structure, for instance.
|
1527 |
|
|
In those cases, we are assuming that the pointer does escape.
|
1528 |
|
|
|
1529 |
|
|
We use escape analysis to determine whether a variable is
|
1530 |
|
|
call-clobbered. Simply put, if an ADDR_EXPR escapes, then the
|
1531 |
|
|
variable is call-clobbered. If a pointer P_i escapes, then all
|
1532 |
|
|
the variables pointed-to by P_i (and its memory tag) also escape.
|
1533 |
|
|
|
1534 |
|
|
@item Compute flow-sensitive aliases
|
1535 |
|
|
|
1536 |
|
|
We have two classes of memory tags. Memory tags associated with
|
1537 |
|
|
the pointed-to data type of the pointers in the program. These
|
1538 |
|
|
tags are called ``type memory tag'' (TMT)@. The other class are
|
1539 |
|
|
those associated with SSA_NAMEs, called ``name memory tag'' (NMT)@.
|
1540 |
|
|
The basic idea is that when adding operands for an INDIRECT_REF
|
1541 |
|
|
*P_i, we will first check whether P_i has a name tag, if it does
|
1542 |
|
|
we use it, because that will have more precise aliasing
|
1543 |
|
|
information. Otherwise, we use the standard type tag.
|
1544 |
|
|
|
1545 |
|
|
In this phase, we go through all the pointers we found in
|
1546 |
|
|
points-to analysis and create alias sets for the name memory tags
|
1547 |
|
|
associated with each pointer P_i. If P_i escapes, we mark
|
1548 |
|
|
call-clobbered the variables it points to and its tag.
|
1549 |
|
|
|
1550 |
|
|
|
1551 |
|
|
@item Compute flow-insensitive aliases
|
1552 |
|
|
|
1553 |
|
|
This pass will compare the alias set of every type memory tag and
|
1554 |
|
|
every addressable variable found in the program. Given a type
|
1555 |
|
|
memory tag TMT and an addressable variable V@. If the alias sets
|
1556 |
|
|
of TMT and V conflict (as computed by may_alias_p), then V is
|
1557 |
|
|
marked as an alias tag and added to the alias set of TMT@.
|
1558 |
|
|
@end enumerate
|
1559 |
|
|
|
1560 |
|
|
For instance, consider the following function:
|
1561 |
|
|
|
1562 |
|
|
@smallexample
|
1563 |
|
|
foo (int i)
|
1564 |
|
|
@{
|
1565 |
|
|
int *p, *q, a, b;
|
1566 |
|
|
|
1567 |
|
|
if (i > 10)
|
1568 |
|
|
p = &a;
|
1569 |
|
|
else
|
1570 |
|
|
q = &b;
|
1571 |
|
|
|
1572 |
|
|
*p = 3;
|
1573 |
|
|
*q = 5;
|
1574 |
|
|
a = b + 2;
|
1575 |
|
|
return *p;
|
1576 |
|
|
@}
|
1577 |
|
|
@end smallexample
|
1578 |
|
|
|
1579 |
|
|
After aliasing analysis has finished, the type memory tag for
|
1580 |
|
|
pointer @code{p} will have two aliases, namely variables @code{a} and
|
1581 |
|
|
@code{b}.
|
1582 |
|
|
Every time pointer @code{p} is dereferenced, we want to mark the
|
1583 |
|
|
operation as a potential reference to @code{a} and @code{b}.
|
1584 |
|
|
|
1585 |
|
|
@smallexample
|
1586 |
|
|
foo (int i)
|
1587 |
|
|
@{
|
1588 |
|
|
int *p, a, b;
|
1589 |
|
|
|
1590 |
|
|
if (i_2 > 10)
|
1591 |
|
|
p_4 = &a;
|
1592 |
|
|
else
|
1593 |
|
|
p_6 = &b;
|
1594 |
|
|
# p_1 = PHI <p_4(1), p_6(2)>;
|
1595 |
|
|
|
1596 |
|
|
# a_7 = V_MAY_DEF <a_3>;
|
1597 |
|
|
# b_8 = V_MAY_DEF <b_5>;
|
1598 |
|
|
*p_1 = 3;
|
1599 |
|
|
|
1600 |
|
|
# a_9 = V_MAY_DEF <a_7>
|
1601 |
|
|
# VUSE <b_8>
|
1602 |
|
|
a_9 = b_8 + 2;
|
1603 |
|
|
|
1604 |
|
|
# VUSE <a_9>;
|
1605 |
|
|
# VUSE <b_8>;
|
1606 |
|
|
return *p_1;
|
1607 |
|
|
@}
|
1608 |
|
|
@end smallexample
|
1609 |
|
|
|
1610 |
|
|
In certain cases, the list of may aliases for a pointer may grow
|
1611 |
|
|
too large. This may cause an explosion in the number of virtual
|
1612 |
|
|
operands inserted in the code. Resulting in increased memory
|
1613 |
|
|
consumption and compilation time.
|
1614 |
|
|
|
1615 |
|
|
When the number of virtual operands needed to represent aliased
|
1616 |
|
|
loads and stores grows too large (configurable with @option{--param
|
1617 |
|
|
max-aliased-vops}), alias sets are grouped to avoid severe
|
1618 |
|
|
compile-time slow downs and memory consumption. The alias
|
1619 |
|
|
grouping heuristic proceeds as follows:
|
1620 |
|
|
|
1621 |
|
|
@enumerate
|
1622 |
|
|
@item Sort the list of pointers in decreasing number of contributed
|
1623 |
|
|
virtual operands.
|
1624 |
|
|
|
1625 |
|
|
@item Take the first pointer from the list and reverse the role
|
1626 |
|
|
of the memory tag and its aliases. Usually, whenever an
|
1627 |
|
|
aliased variable Vi is found to alias with a memory tag
|
1628 |
|
|
T, we add Vi to the may-aliases set for T@. Meaning that
|
1629 |
|
|
after alias analysis, we will have:
|
1630 |
|
|
|
1631 |
|
|
@smallexample
|
1632 |
|
|
may-aliases(T) = @{ V1, V2, V3, ..., Vn @}
|
1633 |
|
|
@end smallexample
|
1634 |
|
|
|
1635 |
|
|
This means that every statement that references T, will get
|
1636 |
|
|
@code{n} virtual operands for each of the Vi tags. But, when
|
1637 |
|
|
alias grouping is enabled, we make T an alias tag and add it
|
1638 |
|
|
to the alias set of all the Vi variables:
|
1639 |
|
|
|
1640 |
|
|
@smallexample
|
1641 |
|
|
may-aliases(V1) = @{ T @}
|
1642 |
|
|
may-aliases(V2) = @{ T @}
|
1643 |
|
|
...
|
1644 |
|
|
may-aliases(Vn) = @{ T @}
|
1645 |
|
|
@end smallexample
|
1646 |
|
|
|
1647 |
|
|
This has two effects: (a) statements referencing T will only get
|
1648 |
|
|
a single virtual operand, and, (b) all the variables Vi will now
|
1649 |
|
|
appear to alias each other. So, we lose alias precision to
|
1650 |
|
|
improve compile time. But, in theory, a program with such a high
|
1651 |
|
|
level of aliasing should not be very optimizable in the first
|
1652 |
|
|
place.
|
1653 |
|
|
|
1654 |
|
|
@item Since variables may be in the alias set of more than one
|
1655 |
|
|
memory tag, the grouping done in step (2) needs to be extended
|
1656 |
|
|
to all the memory tags that have a non-empty intersection with
|
1657 |
|
|
the may-aliases set of tag T@. For instance, if we originally
|
1658 |
|
|
had these may-aliases sets:
|
1659 |
|
|
|
1660 |
|
|
@smallexample
|
1661 |
|
|
may-aliases(T) = @{ V1, V2, V3 @}
|
1662 |
|
|
may-aliases(R) = @{ V2, V4 @}
|
1663 |
|
|
@end smallexample
|
1664 |
|
|
|
1665 |
|
|
In step (2) we would have reverted the aliases for T as:
|
1666 |
|
|
|
1667 |
|
|
@smallexample
|
1668 |
|
|
may-aliases(V1) = @{ T @}
|
1669 |
|
|
may-aliases(V2) = @{ T @}
|
1670 |
|
|
may-aliases(V3) = @{ T @}
|
1671 |
|
|
@end smallexample
|
1672 |
|
|
|
1673 |
|
|
But note that now V2 is no longer aliased with R@. We could
|
1674 |
|
|
add R to may-aliases(V2), but we are in the process of
|
1675 |
|
|
grouping aliases to reduce virtual operands so what we do is
|
1676 |
|
|
add V4 to the grouping to obtain:
|
1677 |
|
|
|
1678 |
|
|
@smallexample
|
1679 |
|
|
may-aliases(V1) = @{ T @}
|
1680 |
|
|
may-aliases(V2) = @{ T @}
|
1681 |
|
|
may-aliases(V3) = @{ T @}
|
1682 |
|
|
may-aliases(V4) = @{ T @}
|
1683 |
|
|
@end smallexample
|
1684 |
|
|
|
1685 |
|
|
@item If the total number of virtual operands due to aliasing is
|
1686 |
|
|
still above the threshold set by max-alias-vops, go back to (2).
|
1687 |
|
|
@end enumerate
|