| 1 |
284 |
jeremybenn |
@c Copyright (c) 2004, 2005, 2007, 2008, 2010
|
| 2 |
|
|
@c Free Software Foundation, Inc.
|
| 3 |
|
|
@c This is part of the GCC manual.
|
| 4 |
|
|
@c For copying conditions, see the file gcc.texi.
|
| 5 |
|
|
|
| 6 |
|
|
@c ---------------------------------------------------------------------
|
| 7 |
|
|
@c Tree SSA
|
| 8 |
|
|
@c ---------------------------------------------------------------------
|
| 9 |
|
|
|
| 10 |
|
|
@node Tree SSA
|
| 11 |
|
|
@chapter Analysis and Optimization of GIMPLE tuples
|
| 12 |
|
|
@cindex Tree SSA
|
| 13 |
|
|
@cindex Optimization infrastructure for GIMPLE
|
| 14 |
|
|
|
| 15 |
|
|
GCC uses three main intermediate languages to represent the program
|
| 16 |
|
|
during compilation: GENERIC, GIMPLE and RTL@. GENERIC is a
|
| 17 |
|
|
language-independent representation generated by each front end. It
|
| 18 |
|
|
is used to serve as an interface between the parser and optimizer.
|
| 19 |
|
|
GENERIC is a common representation that is able to represent programs
|
| 20 |
|
|
written in all the languages supported by GCC@.
|
| 21 |
|
|
|
| 22 |
|
|
GIMPLE and RTL are used to optimize the program. GIMPLE is used for
|
| 23 |
|
|
target and language independent optimizations (e.g., inlining,
|
| 24 |
|
|
constant propagation, tail call elimination, redundancy elimination,
|
| 25 |
|
|
etc). Much like GENERIC, GIMPLE is a language independent, tree based
|
| 26 |
|
|
representation. However, it differs from GENERIC in that the GIMPLE
|
| 27 |
|
|
grammar is more restrictive: expressions contain no more than 3
|
| 28 |
|
|
operands (except function calls), it has no control flow structures
|
| 29 |
|
|
and expressions with side-effects are only allowed on the right hand
|
| 30 |
|
|
side of assignments. See the chapter describing GENERIC and GIMPLE
|
| 31 |
|
|
for more details.
|
| 32 |
|
|
|
| 33 |
|
|
This chapter describes the data structures and functions used in the
|
| 34 |
|
|
GIMPLE optimizers (also known as ``tree optimizers'' or ``middle
|
| 35 |
|
|
end''). In particular, it focuses on all the macros, data structures,
|
| 36 |
|
|
functions and programming constructs needed to implement optimization
|
| 37 |
|
|
passes for GIMPLE@.
|
| 38 |
|
|
|
| 39 |
|
|
@menu
|
| 40 |
|
|
* Annotations:: Attributes for variables.
|
| 41 |
|
|
* SSA Operands:: SSA names referenced by GIMPLE statements.
|
| 42 |
|
|
* SSA:: Static Single Assignment representation.
|
| 43 |
|
|
* Alias analysis:: Representing aliased loads and stores.
|
| 44 |
|
|
* Memory model:: Memory model used by the middle-end.
|
| 45 |
|
|
@end menu
|
| 46 |
|
|
|
| 47 |
|
|
@node Annotations
|
| 48 |
|
|
@section Annotations
|
| 49 |
|
|
@cindex annotations
|
| 50 |
|
|
|
| 51 |
|
|
The optimizers need to associate attributes with variables during the
|
| 52 |
|
|
optimization process. For instance, we need to know whether a
|
| 53 |
|
|
variable has aliases. All these attributes are stored in data
|
| 54 |
|
|
structures called annotations which are then linked to the field
|
| 55 |
|
|
@code{ann} in @code{struct tree_common}.
|
| 56 |
|
|
|
| 57 |
|
|
Presently, we define annotations for variables (@code{var_ann_t}).
|
| 58 |
|
|
Annotations are defined and documented in @file{tree-flow.h}.
|
| 59 |
|
|
|
| 60 |
|
|
|
| 61 |
|
|
@node SSA Operands
|
| 62 |
|
|
@section SSA Operands
|
| 63 |
|
|
@cindex operands
|
| 64 |
|
|
@cindex virtual operands
|
| 65 |
|
|
@cindex real operands
|
| 66 |
|
|
@findex update_stmt
|
| 67 |
|
|
|
| 68 |
|
|
Almost every GIMPLE statement will contain a reference to a variable
|
| 69 |
|
|
or memory location. Since statements come in different shapes and
|
| 70 |
|
|
sizes, their operands are going to be located at various spots inside
|
| 71 |
|
|
the statement's tree. To facilitate access to the statement's
|
| 72 |
|
|
operands, they are organized into lists associated inside each
|
| 73 |
|
|
statement's annotation. Each element in an operand list is a pointer
|
| 74 |
|
|
to a @code{VAR_DECL}, @code{PARM_DECL} or @code{SSA_NAME} tree node.
|
| 75 |
|
|
This provides a very convenient way of examining and replacing
|
| 76 |
|
|
operands.
|
| 77 |
|
|
|
| 78 |
|
|
Data flow analysis and optimization is done on all tree nodes
|
| 79 |
|
|
representing variables. Any node for which @code{SSA_VAR_P} returns
|
| 80 |
|
|
nonzero is considered when scanning statement operands. However, not
|
| 81 |
|
|
all @code{SSA_VAR_P} variables are processed in the same way. For the
|
| 82 |
|
|
purposes of optimization, we need to distinguish between references to
|
| 83 |
|
|
local scalar variables and references to globals, statics, structures,
|
| 84 |
|
|
arrays, aliased variables, etc. The reason is simple, the compiler
|
| 85 |
|
|
can gather complete data flow information for a local scalar. On the
|
| 86 |
|
|
other hand, a global variable may be modified by a function call, it
|
| 87 |
|
|
may not be possible to keep track of all the elements of an array or
|
| 88 |
|
|
the fields of a structure, etc.
|
| 89 |
|
|
|
| 90 |
|
|
The operand scanner gathers two kinds of operands: @dfn{real} and
|
| 91 |
|
|
@dfn{virtual}. An operand for which @code{is_gimple_reg} returns true
|
| 92 |
|
|
is considered real, otherwise it is a virtual operand. We also
|
| 93 |
|
|
distinguish between uses and definitions. An operand is used if its
|
| 94 |
|
|
value is loaded by the statement (e.g., the operand at the RHS of an
|
| 95 |
|
|
assignment). If the statement assigns a new value to the operand, the
|
| 96 |
|
|
operand is considered a definition (e.g., the operand at the LHS of
|
| 97 |
|
|
an assignment).
|
| 98 |
|
|
|
| 99 |
|
|
Virtual and real operands also have very different data flow
|
| 100 |
|
|
properties. Real operands are unambiguous references to the
|
| 101 |
|
|
full object that they represent. For instance, given
|
| 102 |
|
|
|
| 103 |
|
|
@smallexample
|
| 104 |
|
|
@{
|
| 105 |
|
|
int a, b;
|
| 106 |
|
|
a = b
|
| 107 |
|
|
@}
|
| 108 |
|
|
@end smallexample
|
| 109 |
|
|
|
| 110 |
|
|
Since @code{a} and @code{b} are non-aliased locals, the statement
|
| 111 |
|
|
@code{a = b} will have one real definition and one real use because
|
| 112 |
|
|
variable @code{a} is completely modified with the contents of
|
| 113 |
|
|
variable @code{b}. Real definition are also known as @dfn{killing
|
| 114 |
|
|
definitions}. Similarly, the use of @code{b} reads all its bits.
|
| 115 |
|
|
|
| 116 |
|
|
In contrast, virtual operands are used with variables that can have
|
| 117 |
|
|
a partial or ambiguous reference. This includes structures, arrays,
|
| 118 |
|
|
globals, and aliased variables. In these cases, we have two types of
|
| 119 |
|
|
definitions. For globals, structures, and arrays, we can determine from
|
| 120 |
|
|
a statement whether a variable of these types has a killing definition.
|
| 121 |
|
|
If the variable does, then the statement is marked as having a
|
| 122 |
|
|
@dfn{must definition} of that variable. However, if a statement is only
|
| 123 |
|
|
defining a part of the variable (i.e.@: a field in a structure), or if we
|
| 124 |
|
|
know that a statement might define the variable but we cannot say for sure,
|
| 125 |
|
|
then we mark that statement as having a @dfn{may definition}. For
|
| 126 |
|
|
instance, given
|
| 127 |
|
|
|
| 128 |
|
|
@smallexample
|
| 129 |
|
|
@{
|
| 130 |
|
|
int a, b, *p;
|
| 131 |
|
|
|
| 132 |
|
|
if (@dots{})
|
| 133 |
|
|
p = &a;
|
| 134 |
|
|
else
|
| 135 |
|
|
p = &b;
|
| 136 |
|
|
*p = 5;
|
| 137 |
|
|
return *p;
|
| 138 |
|
|
@}
|
| 139 |
|
|
@end smallexample
|
| 140 |
|
|
|
| 141 |
|
|
The assignment @code{*p = 5} may be a definition of @code{a} or
|
| 142 |
|
|
@code{b}. If we cannot determine statically where @code{p} is
|
| 143 |
|
|
pointing to at the time of the store operation, we create virtual
|
| 144 |
|
|
definitions to mark that statement as a potential definition site for
|
| 145 |
|
|
@code{a} and @code{b}. Memory loads are similarly marked with virtual
|
| 146 |
|
|
use operands. Virtual operands are shown in tree dumps right before
|
| 147 |
|
|
the statement that contains them. To request a tree dump with virtual
|
| 148 |
|
|
operands, use the @option{-vops} option to @option{-fdump-tree}:
|
| 149 |
|
|
|
| 150 |
|
|
@smallexample
|
| 151 |
|
|
@{
|
| 152 |
|
|
int a, b, *p;
|
| 153 |
|
|
|
| 154 |
|
|
if (@dots{})
|
| 155 |
|
|
p = &a;
|
| 156 |
|
|
else
|
| 157 |
|
|
p = &b;
|
| 158 |
|
|
# a = VDEF <a>
|
| 159 |
|
|
# b = VDEF <b>
|
| 160 |
|
|
*p = 5;
|
| 161 |
|
|
|
| 162 |
|
|
# VUSE <a>
|
| 163 |
|
|
# VUSE <b>
|
| 164 |
|
|
return *p;
|
| 165 |
|
|
@}
|
| 166 |
|
|
@end smallexample
|
| 167 |
|
|
|
| 168 |
|
|
Notice that @code{VDEF} operands have two copies of the referenced
|
| 169 |
|
|
variable. This indicates that this is not a killing definition of
|
| 170 |
|
|
that variable. In this case we refer to it as a @dfn{may definition}
|
| 171 |
|
|
or @dfn{aliased store}. The presence of the second copy of the
|
| 172 |
|
|
variable in the @code{VDEF} operand will become important when the
|
| 173 |
|
|
function is converted into SSA form. This will be used to link all
|
| 174 |
|
|
the non-killing definitions to prevent optimizations from making
|
| 175 |
|
|
incorrect assumptions about them.
|
| 176 |
|
|
|
| 177 |
|
|
Operands are updated as soon as the statement is finished via a call
|
| 178 |
|
|
to @code{update_stmt}. If statement elements are changed via
|
| 179 |
|
|
@code{SET_USE} or @code{SET_DEF}, then no further action is required
|
| 180 |
|
|
(i.e., those macros take care of updating the statement). If changes
|
| 181 |
|
|
are made by manipulating the statement's tree directly, then a call
|
| 182 |
|
|
must be made to @code{update_stmt} when complete. Calling one of the
|
| 183 |
|
|
@code{bsi_insert} routines or @code{bsi_replace} performs an implicit
|
| 184 |
|
|
call to @code{update_stmt}.
|
| 185 |
|
|
|
| 186 |
|
|
@subsection Operand Iterators And Access Routines
|
| 187 |
|
|
@cindex Operand Iterators
|
| 188 |
|
|
@cindex Operand Access Routines
|
| 189 |
|
|
|
| 190 |
|
|
Operands are collected by @file{tree-ssa-operands.c}. They are stored
|
| 191 |
|
|
inside each statement's annotation and can be accessed through either the
|
| 192 |
|
|
operand iterators or an access routine.
|
| 193 |
|
|
|
| 194 |
|
|
The following access routines are available for examining operands:
|
| 195 |
|
|
|
| 196 |
|
|
@enumerate
|
| 197 |
|
|
@item @code{SINGLE_SSA_@{USE,DEF,TREE@}_OPERAND}: These accessors will return
|
| 198 |
|
|
NULL unless there is exactly one operand matching the specified flags. If
|
| 199 |
|
|
there is exactly one operand, the operand is returned as either a @code{tree},
|
| 200 |
|
|
@code{def_operand_p}, or @code{use_operand_p}.
|
| 201 |
|
|
|
| 202 |
|
|
@smallexample
|
| 203 |
|
|
tree t = SINGLE_SSA_TREE_OPERAND (stmt, flags);
|
| 204 |
|
|
use_operand_p u = SINGLE_SSA_USE_OPERAND (stmt, SSA_ALL_VIRTUAL_USES);
|
| 205 |
|
|
def_operand_p d = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_ALL_DEFS);
|
| 206 |
|
|
@end smallexample
|
| 207 |
|
|
|
| 208 |
|
|
@item @code{ZERO_SSA_OPERANDS}: This macro returns true if there are no
|
| 209 |
|
|
operands matching the specified flags.
|
| 210 |
|
|
|
| 211 |
|
|
@smallexample
|
| 212 |
|
|
if (ZERO_SSA_OPERANDS (stmt, SSA_OP_ALL_VIRTUALS))
|
| 213 |
|
|
return;
|
| 214 |
|
|
@end smallexample
|
| 215 |
|
|
|
| 216 |
|
|
@item @code{NUM_SSA_OPERANDS}: This macro Returns the number of operands
|
| 217 |
|
|
matching 'flags'. This actually executes a loop to perform the count, so
|
| 218 |
|
|
only use this if it is really needed.
|
| 219 |
|
|
|
| 220 |
|
|
@smallexample
|
| 221 |
|
|
int count = NUM_SSA_OPERANDS (stmt, flags)
|
| 222 |
|
|
@end smallexample
|
| 223 |
|
|
@end enumerate
|
| 224 |
|
|
|
| 225 |
|
|
|
| 226 |
|
|
If you wish to iterate over some or all operands, use the
|
| 227 |
|
|
@code{FOR_EACH_SSA_@{USE,DEF,TREE@}_OPERAND} iterator. For example, to print
|
| 228 |
|
|
all the operands for a statement:
|
| 229 |
|
|
|
| 230 |
|
|
@smallexample
|
| 231 |
|
|
void
|
| 232 |
|
|
print_ops (tree stmt)
|
| 233 |
|
|
@{
|
| 234 |
|
|
ssa_op_iter;
|
| 235 |
|
|
tree var;
|
| 236 |
|
|
|
| 237 |
|
|
FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_ALL_OPERANDS)
|
| 238 |
|
|
print_generic_expr (stderr, var, TDF_SLIM);
|
| 239 |
|
|
@}
|
| 240 |
|
|
@end smallexample
|
| 241 |
|
|
|
| 242 |
|
|
|
| 243 |
|
|
How to choose the appropriate iterator:
|
| 244 |
|
|
|
| 245 |
|
|
@enumerate
|
| 246 |
|
|
@item Determine whether you are need to see the operand pointers, or just the
|
| 247 |
|
|
trees, and choose the appropriate macro:
|
| 248 |
|
|
|
| 249 |
|
|
@smallexample
|
| 250 |
|
|
Need Macro:
|
| 251 |
|
|
---- -------
|
| 252 |
|
|
use_operand_p FOR_EACH_SSA_USE_OPERAND
|
| 253 |
|
|
def_operand_p FOR_EACH_SSA_DEF_OPERAND
|
| 254 |
|
|
tree FOR_EACH_SSA_TREE_OPERAND
|
| 255 |
|
|
@end smallexample
|
| 256 |
|
|
|
| 257 |
|
|
@item You need to declare a variable of the type you are interested
|
| 258 |
|
|
in, and an ssa_op_iter structure which serves as the loop controlling
|
| 259 |
|
|
variable.
|
| 260 |
|
|
|
| 261 |
|
|
@item Determine which operands you wish to use, and specify the flags of
|
| 262 |
|
|
those you are interested in. They are documented in
|
| 263 |
|
|
@file{tree-ssa-operands.h}:
|
| 264 |
|
|
|
| 265 |
|
|
@smallexample
|
| 266 |
|
|
#define SSA_OP_USE 0x01 /* @r{Real USE operands.} */
|
| 267 |
|
|
#define SSA_OP_DEF 0x02 /* @r{Real DEF operands.} */
|
| 268 |
|
|
#define SSA_OP_VUSE 0x04 /* @r{VUSE operands.} */
|
| 269 |
|
|
#define SSA_OP_VMAYUSE 0x08 /* @r{USE portion of VDEFS.} */
|
| 270 |
|
|
#define SSA_OP_VDEF 0x10 /* @r{DEF portion of VDEFS.} */
|
| 271 |
|
|
|
| 272 |
|
|
/* @r{These are commonly grouped operand flags.} */
|
| 273 |
|
|
#define SSA_OP_VIRTUAL_USES (SSA_OP_VUSE | SSA_OP_VMAYUSE)
|
| 274 |
|
|
#define SSA_OP_VIRTUAL_DEFS (SSA_OP_VDEF)
|
| 275 |
|
|
#define SSA_OP_ALL_USES (SSA_OP_VIRTUAL_USES | SSA_OP_USE)
|
| 276 |
|
|
#define SSA_OP_ALL_DEFS (SSA_OP_VIRTUAL_DEFS | SSA_OP_DEF)
|
| 277 |
|
|
#define SSA_OP_ALL_OPERANDS (SSA_OP_ALL_USES | SSA_OP_ALL_DEFS)
|
| 278 |
|
|
@end smallexample
|
| 279 |
|
|
@end enumerate
|
| 280 |
|
|
|
| 281 |
|
|
So if you want to look at the use pointers for all the @code{USE} and
|
| 282 |
|
|
@code{VUSE} operands, you would do something like:
|
| 283 |
|
|
|
| 284 |
|
|
@smallexample
|
| 285 |
|
|
use_operand_p use_p;
|
| 286 |
|
|
ssa_op_iter iter;
|
| 287 |
|
|
|
| 288 |
|
|
FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, (SSA_OP_USE | SSA_OP_VUSE))
|
| 289 |
|
|
@{
|
| 290 |
|
|
process_use_ptr (use_p);
|
| 291 |
|
|
@}
|
| 292 |
|
|
@end smallexample
|
| 293 |
|
|
|
| 294 |
|
|
The @code{TREE} macro is basically the same as the @code{USE} and
|
| 295 |
|
|
@code{DEF} macros, only with the use or def dereferenced via
|
| 296 |
|
|
@code{USE_FROM_PTR (use_p)} and @code{DEF_FROM_PTR (def_p)}. Since we
|
| 297 |
|
|
aren't using operand pointers, use and defs flags can be mixed.
|
| 298 |
|
|
|
| 299 |
|
|
@smallexample
|
| 300 |
|
|
tree var;
|
| 301 |
|
|
ssa_op_iter iter;
|
| 302 |
|
|
|
| 303 |
|
|
FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_VUSE)
|
| 304 |
|
|
@{
|
| 305 |
|
|
print_generic_expr (stderr, var, TDF_SLIM);
|
| 306 |
|
|
@}
|
| 307 |
|
|
@end smallexample
|
| 308 |
|
|
|
| 309 |
|
|
@code{VDEF}s are broken into two flags, one for the
|
| 310 |
|
|
@code{DEF} portion (@code{SSA_OP_VDEF}) and one for the USE portion
|
| 311 |
|
|
(@code{SSA_OP_VMAYUSE}). If all you want to look at are the
|
| 312 |
|
|
@code{VDEF}s together, there is a fourth iterator macro for this,
|
| 313 |
|
|
which returns both a def_operand_p and a use_operand_p for each
|
| 314 |
|
|
@code{VDEF} in the statement. Note that you don't need any flags for
|
| 315 |
|
|
this one.
|
| 316 |
|
|
|
| 317 |
|
|
@smallexample
|
| 318 |
|
|
use_operand_p use_p;
|
| 319 |
|
|
def_operand_p def_p;
|
| 320 |
|
|
ssa_op_iter iter;
|
| 321 |
|
|
|
| 322 |
|
|
FOR_EACH_SSA_MAYDEF_OPERAND (def_p, use_p, stmt, iter)
|
| 323 |
|
|
@{
|
| 324 |
|
|
my_code;
|
| 325 |
|
|
@}
|
| 326 |
|
|
@end smallexample
|
| 327 |
|
|
|
| 328 |
|
|
There are many examples in the code as well, as well as the
|
| 329 |
|
|
documentation in @file{tree-ssa-operands.h}.
|
| 330 |
|
|
|
| 331 |
|
|
There are also a couple of variants on the stmt iterators regarding PHI
|
| 332 |
|
|
nodes.
|
| 333 |
|
|
|
| 334 |
|
|
@code{FOR_EACH_PHI_ARG} Works exactly like
|
| 335 |
|
|
@code{FOR_EACH_SSA_USE_OPERAND}, except it works over @code{PHI} arguments
|
| 336 |
|
|
instead of statement operands.
|
| 337 |
|
|
|
| 338 |
|
|
@smallexample
|
| 339 |
|
|
/* Look at every virtual PHI use. */
|
| 340 |
|
|
FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_VIRTUAL_USES)
|
| 341 |
|
|
@{
|
| 342 |
|
|
my_code;
|
| 343 |
|
|
@}
|
| 344 |
|
|
|
| 345 |
|
|
/* Look at every real PHI use. */
|
| 346 |
|
|
FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_USES)
|
| 347 |
|
|
my_code;
|
| 348 |
|
|
|
| 349 |
|
|
/* Look at every PHI use. */
|
| 350 |
|
|
FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_ALL_USES)
|
| 351 |
|
|
my_code;
|
| 352 |
|
|
@end smallexample
|
| 353 |
|
|
|
| 354 |
|
|
@code{FOR_EACH_PHI_OR_STMT_@{USE,DEF@}} works exactly like
|
| 355 |
|
|
@code{FOR_EACH_SSA_@{USE,DEF@}_OPERAND}, except it will function on
|
| 356 |
|
|
either a statement or a @code{PHI} node. These should be used when it is
|
| 357 |
|
|
appropriate but they are not quite as efficient as the individual
|
| 358 |
|
|
@code{FOR_EACH_PHI} and @code{FOR_EACH_SSA} routines.
|
| 359 |
|
|
|
| 360 |
|
|
@smallexample
|
| 361 |
|
|
FOR_EACH_PHI_OR_STMT_USE (use_operand_p, stmt, iter, flags)
|
| 362 |
|
|
@{
|
| 363 |
|
|
my_code;
|
| 364 |
|
|
@}
|
| 365 |
|
|
|
| 366 |
|
|
FOR_EACH_PHI_OR_STMT_DEF (def_operand_p, phi, iter, flags)
|
| 367 |
|
|
@{
|
| 368 |
|
|
my_code;
|
| 369 |
|
|
@}
|
| 370 |
|
|
@end smallexample
|
| 371 |
|
|
|
| 372 |
|
|
@subsection Immediate Uses
|
| 373 |
|
|
@cindex Immediate Uses
|
| 374 |
|
|
|
| 375 |
|
|
Immediate use information is now always available. Using the immediate use
|
| 376 |
|
|
iterators, you may examine every use of any @code{SSA_NAME}. For instance,
|
| 377 |
|
|
to change each use of @code{ssa_var} to @code{ssa_var2} and call fold_stmt on
|
| 378 |
|
|
each stmt after that is done:
|
| 379 |
|
|
|
| 380 |
|
|
@smallexample
|
| 381 |
|
|
use_operand_p imm_use_p;
|
| 382 |
|
|
imm_use_iterator iterator;
|
| 383 |
|
|
tree ssa_var, stmt;
|
| 384 |
|
|
|
| 385 |
|
|
|
| 386 |
|
|
FOR_EACH_IMM_USE_STMT (stmt, iterator, ssa_var)
|
| 387 |
|
|
@{
|
| 388 |
|
|
FOR_EACH_IMM_USE_ON_STMT (imm_use_p, iterator)
|
| 389 |
|
|
SET_USE (imm_use_p, ssa_var_2);
|
| 390 |
|
|
fold_stmt (stmt);
|
| 391 |
|
|
@}
|
| 392 |
|
|
@end smallexample
|
| 393 |
|
|
|
| 394 |
|
|
There are 2 iterators which can be used. @code{FOR_EACH_IMM_USE_FAST} is
|
| 395 |
|
|
used when the immediate uses are not changed, i.e., you are looking at the
|
| 396 |
|
|
uses, but not setting them.
|
| 397 |
|
|
|
| 398 |
|
|
If they do get changed, then care must be taken that things are not changed
|
| 399 |
|
|
under the iterators, so use the @code{FOR_EACH_IMM_USE_STMT} and
|
| 400 |
|
|
@code{FOR_EACH_IMM_USE_ON_STMT} iterators. They attempt to preserve the
|
| 401 |
|
|
sanity of the use list by moving all the uses for a statement into
|
| 402 |
|
|
a controlled position, and then iterating over those uses. Then the
|
| 403 |
|
|
optimization can manipulate the stmt when all the uses have been
|
| 404 |
|
|
processed. This is a little slower than the FAST version since it adds a
|
| 405 |
|
|
placeholder element and must sort through the list a bit for each statement.
|
| 406 |
|
|
This placeholder element must be also be removed if the loop is
|
| 407 |
|
|
terminated early. The macro @code{BREAK_FROM_IMM_USE_SAFE} is provided
|
| 408 |
|
|
to do this :
|
| 409 |
|
|
|
| 410 |
|
|
@smallexample
|
| 411 |
|
|
FOR_EACH_IMM_USE_STMT (stmt, iterator, ssa_var)
|
| 412 |
|
|
@{
|
| 413 |
|
|
if (stmt == last_stmt)
|
| 414 |
|
|
BREAK_FROM_SAFE_IMM_USE (iter);
|
| 415 |
|
|
|
| 416 |
|
|
FOR_EACH_IMM_USE_ON_STMT (imm_use_p, iterator)
|
| 417 |
|
|
SET_USE (imm_use_p, ssa_var_2);
|
| 418 |
|
|
fold_stmt (stmt);
|
| 419 |
|
|
@}
|
| 420 |
|
|
@end smallexample
|
| 421 |
|
|
|
| 422 |
|
|
There are checks in @code{verify_ssa} which verify that the immediate use list
|
| 423 |
|
|
is up to date, as well as checking that an optimization didn't break from the
|
| 424 |
|
|
loop without using this macro. It is safe to simply 'break'; from a
|
| 425 |
|
|
@code{FOR_EACH_IMM_USE_FAST} traverse.
|
| 426 |
|
|
|
| 427 |
|
|
Some useful functions and macros:
|
| 428 |
|
|
@enumerate
|
| 429 |
|
|
@item @code{has_zero_uses (ssa_var)} : Returns true if there are no uses of
|
| 430 |
|
|
@code{ssa_var}.
|
| 431 |
|
|
@item @code{has_single_use (ssa_var)} : Returns true if there is only a
|
| 432 |
|
|
single use of @code{ssa_var}.
|
| 433 |
|
|
@item @code{single_imm_use (ssa_var, use_operand_p *ptr, tree *stmt)} :
|
| 434 |
|
|
Returns true if there is only a single use of @code{ssa_var}, and also returns
|
| 435 |
|
|
the use pointer and statement it occurs in, in the second and third parameters.
|
| 436 |
|
|
@item @code{num_imm_uses (ssa_var)} : Returns the number of immediate uses of
|
| 437 |
|
|
@code{ssa_var}. It is better not to use this if possible since it simply
|
| 438 |
|
|
utilizes a loop to count the uses.
|
| 439 |
|
|
@item @code{PHI_ARG_INDEX_FROM_USE (use_p)} : Given a use within a @code{PHI}
|
| 440 |
|
|
node, return the index number for the use. An assert is triggered if the use
|
| 441 |
|
|
isn't located in a @code{PHI} node.
|
| 442 |
|
|
@item @code{USE_STMT (use_p)} : Return the statement a use occurs in.
|
| 443 |
|
|
@end enumerate
|
| 444 |
|
|
|
| 445 |
|
|
Note that uses are not put into an immediate use list until their statement is
|
| 446 |
|
|
actually inserted into the instruction stream via a @code{bsi_*} routine.
|
| 447 |
|
|
|
| 448 |
|
|
It is also still possible to utilize lazy updating of statements, but this
|
| 449 |
|
|
should be used only when absolutely required. Both alias analysis and the
|
| 450 |
|
|
dominator optimizations currently do this.
|
| 451 |
|
|
|
| 452 |
|
|
When lazy updating is being used, the immediate use information is out of date
|
| 453 |
|
|
and cannot be used reliably. Lazy updating is achieved by simply marking
|
| 454 |
|
|
statements modified via calls to @code{mark_stmt_modified} instead of
|
| 455 |
|
|
@code{update_stmt}. When lazy updating is no longer required, all the
|
| 456 |
|
|
modified statements must have @code{update_stmt} called in order to bring them
|
| 457 |
|
|
up to date. This must be done before the optimization is finished, or
|
| 458 |
|
|
@code{verify_ssa} will trigger an abort.
|
| 459 |
|
|
|
| 460 |
|
|
This is done with a simple loop over the instruction stream:
|
| 461 |
|
|
@smallexample
|
| 462 |
|
|
block_stmt_iterator bsi;
|
| 463 |
|
|
basic_block bb;
|
| 464 |
|
|
FOR_EACH_BB (bb)
|
| 465 |
|
|
@{
|
| 466 |
|
|
for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
|
| 467 |
|
|
update_stmt_if_modified (bsi_stmt (bsi));
|
| 468 |
|
|
@}
|
| 469 |
|
|
@end smallexample
|
| 470 |
|
|
|
| 471 |
|
|
@node SSA
|
| 472 |
|
|
@section Static Single Assignment
|
| 473 |
|
|
@cindex SSA
|
| 474 |
|
|
@cindex static single assignment
|
| 475 |
|
|
|
| 476 |
|
|
Most of the tree optimizers rely on the data flow information provided
|
| 477 |
|
|
by the Static Single Assignment (SSA) form. We implement the SSA form
|
| 478 |
|
|
as described in @cite{R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and
|
| 479 |
|
|
K. Zadeck. Efficiently Computing Static Single Assignment Form and the
|
| 480 |
|
|
Control Dependence Graph. ACM Transactions on Programming Languages
|
| 481 |
|
|
and Systems, 13(4):451-490, October 1991}.
|
| 482 |
|
|
|
| 483 |
|
|
The SSA form is based on the premise that program variables are
|
| 484 |
|
|
assigned in exactly one location in the program. Multiple assignments
|
| 485 |
|
|
to the same variable create new versions of that variable. Naturally,
|
| 486 |
|
|
actual programs are seldom in SSA form initially because variables
|
| 487 |
|
|
tend to be assigned multiple times. The compiler modifies the program
|
| 488 |
|
|
representation so that every time a variable is assigned in the code,
|
| 489 |
|
|
a new version of the variable is created. Different versions of the
|
| 490 |
|
|
same variable are distinguished by subscripting the variable name with
|
| 491 |
|
|
its version number. Variables used in the right-hand side of
|
| 492 |
|
|
expressions are renamed so that their version number matches that of
|
| 493 |
|
|
the most recent assignment.
|
| 494 |
|
|
|
| 495 |
|
|
We represent variable versions using @code{SSA_NAME} nodes. The
|
| 496 |
|
|
renaming process in @file{tree-ssa.c} wraps every real and
|
| 497 |
|
|
virtual operand with an @code{SSA_NAME} node which contains
|
| 498 |
|
|
the version number and the statement that created the
|
| 499 |
|
|
@code{SSA_NAME}. Only definitions and virtual definitions may
|
| 500 |
|
|
create new @code{SSA_NAME} nodes.
|
| 501 |
|
|
|
| 502 |
|
|
@cindex PHI nodes
|
| 503 |
|
|
Sometimes, flow of control makes it impossible to determine the
|
| 504 |
|
|
most recent version of a variable. In these cases, the compiler
|
| 505 |
|
|
inserts an artificial definition for that variable called
|
| 506 |
|
|
@dfn{PHI function} or @dfn{PHI node}. This new definition merges
|
| 507 |
|
|
all the incoming versions of the variable to create a new name
|
| 508 |
|
|
for it. For instance,
|
| 509 |
|
|
|
| 510 |
|
|
@smallexample
|
| 511 |
|
|
if (@dots{})
|
| 512 |
|
|
a_1 = 5;
|
| 513 |
|
|
else if (@dots{})
|
| 514 |
|
|
a_2 = 2;
|
| 515 |
|
|
else
|
| 516 |
|
|
a_3 = 13;
|
| 517 |
|
|
|
| 518 |
|
|
# a_4 = PHI <a_1, a_2, a_3>
|
| 519 |
|
|
return a_4;
|
| 520 |
|
|
@end smallexample
|
| 521 |
|
|
|
| 522 |
|
|
Since it is not possible to determine which of the three branches
|
| 523 |
|
|
will be taken at runtime, we don't know which of @code{a_1},
|
| 524 |
|
|
@code{a_2} or @code{a_3} to use at the return statement. So, the
|
| 525 |
|
|
SSA renamer creates a new version @code{a_4} which is assigned
|
| 526 |
|
|
the result of ``merging'' @code{a_1}, @code{a_2} and @code{a_3}.
|
| 527 |
|
|
Hence, PHI nodes mean ``one of these operands. I don't know
|
| 528 |
|
|
which''.
|
| 529 |
|
|
|
| 530 |
|
|
The following macros can be used to examine PHI nodes
|
| 531 |
|
|
|
| 532 |
|
|
@defmac PHI_RESULT (@var{phi})
|
| 533 |
|
|
Returns the @code{SSA_NAME} created by PHI node @var{phi} (i.e.,
|
| 534 |
|
|
@var{phi}'s LHS)@.
|
| 535 |
|
|
@end defmac
|
| 536 |
|
|
|
| 537 |
|
|
@defmac PHI_NUM_ARGS (@var{phi})
|
| 538 |
|
|
Returns the number of arguments in @var{phi}. This number is exactly
|
| 539 |
|
|
the number of incoming edges to the basic block holding @var{phi}@.
|
| 540 |
|
|
@end defmac
|
| 541 |
|
|
|
| 542 |
|
|
@defmac PHI_ARG_ELT (@var{phi}, @var{i})
|
| 543 |
|
|
Returns a tuple representing the @var{i}th argument of @var{phi}@.
|
| 544 |
|
|
Each element of this tuple contains an @code{SSA_NAME} @var{var} and
|
| 545 |
|
|
the incoming edge through which @var{var} flows.
|
| 546 |
|
|
@end defmac
|
| 547 |
|
|
|
| 548 |
|
|
@defmac PHI_ARG_EDGE (@var{phi}, @var{i})
|
| 549 |
|
|
Returns the incoming edge for the @var{i}th argument of @var{phi}.
|
| 550 |
|
|
@end defmac
|
| 551 |
|
|
|
| 552 |
|
|
@defmac PHI_ARG_DEF (@var{phi}, @var{i})
|
| 553 |
|
|
Returns the @code{SSA_NAME} for the @var{i}th argument of @var{phi}.
|
| 554 |
|
|
@end defmac
|
| 555 |
|
|
|
| 556 |
|
|
|
| 557 |
|
|
@subsection Preserving the SSA form
|
| 558 |
|
|
@findex update_ssa
|
| 559 |
|
|
@cindex preserving SSA form
|
| 560 |
|
|
Some optimization passes make changes to the function that
|
| 561 |
|
|
invalidate the SSA property. This can happen when a pass has
|
| 562 |
|
|
added new symbols or changed the program so that variables that
|
| 563 |
|
|
were previously aliased aren't anymore. Whenever something like this
|
| 564 |
|
|
happens, the affected symbols must be renamed into SSA form again.
|
| 565 |
|
|
Transformations that emit new code or replicate existing statements
|
| 566 |
|
|
will also need to update the SSA form@.
|
| 567 |
|
|
|
| 568 |
|
|
Since GCC implements two different SSA forms for register and virtual
|
| 569 |
|
|
variables, keeping the SSA form up to date depends on whether you are
|
| 570 |
|
|
updating register or virtual names. In both cases, the general idea
|
| 571 |
|
|
behind incremental SSA updates is similar: when new SSA names are
|
| 572 |
|
|
created, they typically are meant to replace other existing names in
|
| 573 |
|
|
the program@.
|
| 574 |
|
|
|
| 575 |
|
|
For instance, given the following code:
|
| 576 |
|
|
|
| 577 |
|
|
@smallexample
|
| 578 |
|
|
1 L0:
|
| 579 |
|
|
2 x_1 = PHI (0, x_5)
|
| 580 |
|
|
3 if (x_1 < 10)
|
| 581 |
|
|
4 if (x_1 > 7)
|
| 582 |
|
|
5 y_2 = 0
|
| 583 |
|
|
6 else
|
| 584 |
|
|
7 y_3 = x_1 + x_7
|
| 585 |
|
|
8 endif
|
| 586 |
|
|
9 x_5 = x_1 + 1
|
| 587 |
|
|
10 goto L0;
|
| 588 |
|
|
11 endif
|
| 589 |
|
|
@end smallexample
|
| 590 |
|
|
|
| 591 |
|
|
Suppose that we insert new names @code{x_10} and @code{x_11} (lines
|
| 592 |
|
|
@code{4} and @code{8})@.
|
| 593 |
|
|
|
| 594 |
|
|
@smallexample
|
| 595 |
|
|
1 L0:
|
| 596 |
|
|
2 x_1 = PHI (0, x_5)
|
| 597 |
|
|
3 if (x_1 < 10)
|
| 598 |
|
|
4 x_10 = @dots{}
|
| 599 |
|
|
5 if (x_1 > 7)
|
| 600 |
|
|
6 y_2 = 0
|
| 601 |
|
|
7 else
|
| 602 |
|
|
8 x_11 = @dots{}
|
| 603 |
|
|
9 y_3 = x_1 + x_7
|
| 604 |
|
|
10 endif
|
| 605 |
|
|
11 x_5 = x_1 + 1
|
| 606 |
|
|
12 goto L0;
|
| 607 |
|
|
13 endif
|
| 608 |
|
|
@end smallexample
|
| 609 |
|
|
|
| 610 |
|
|
We want to replace all the uses of @code{x_1} with the new definitions
|
| 611 |
|
|
of @code{x_10} and @code{x_11}. Note that the only uses that should
|
| 612 |
|
|
be replaced are those at lines @code{5}, @code{9} and @code{11}.
|
| 613 |
|
|
Also, the use of @code{x_7} at line @code{9} should @emph{not} be
|
| 614 |
|
|
replaced (this is why we cannot just mark symbol @code{x} for
|
| 615 |
|
|
renaming)@.
|
| 616 |
|
|
|
| 617 |
|
|
Additionally, we may need to insert a PHI node at line @code{11}
|
| 618 |
|
|
because that is a merge point for @code{x_10} and @code{x_11}. So the
|
| 619 |
|
|
use of @code{x_1} at line @code{11} will be replaced with the new PHI
|
| 620 |
|
|
node. The insertion of PHI nodes is optional. They are not strictly
|
| 621 |
|
|
necessary to preserve the SSA form, and depending on what the caller
|
| 622 |
|
|
inserted, they may not even be useful for the optimizers@.
|
| 623 |
|
|
|
| 624 |
|
|
Updating the SSA form is a two step process. First, the pass has to
|
| 625 |
|
|
identify which names need to be updated and/or which symbols need to
|
| 626 |
|
|
be renamed into SSA form for the first time. When new names are
|
| 627 |
|
|
introduced to replace existing names in the program, the mapping
|
| 628 |
|
|
between the old and the new names are registered by calling
|
| 629 |
|
|
@code{register_new_name_mapping} (note that if your pass creates new
|
| 630 |
|
|
code by duplicating basic blocks, the call to @code{tree_duplicate_bb}
|
| 631 |
|
|
will set up the necessary mappings automatically). On the other hand,
|
| 632 |
|
|
if your pass exposes a new symbol that should be put in SSA form for
|
| 633 |
|
|
the first time, the new symbol should be registered with
|
| 634 |
|
|
@code{mark_sym_for_renaming}.
|
| 635 |
|
|
|
| 636 |
|
|
After the replacement mappings have been registered and new symbols
|
| 637 |
|
|
marked for renaming, a call to @code{update_ssa} makes the registered
|
| 638 |
|
|
changes. This can be done with an explicit call or by creating
|
| 639 |
|
|
@code{TODO} flags in the @code{tree_opt_pass} structure for your pass.
|
| 640 |
|
|
There are several @code{TODO} flags that control the behavior of
|
| 641 |
|
|
@code{update_ssa}:
|
| 642 |
|
|
|
| 643 |
|
|
@itemize @bullet
|
| 644 |
|
|
@item @code{TODO_update_ssa}. Update the SSA form inserting PHI nodes
|
| 645 |
|
|
for newly exposed symbols and virtual names marked for updating.
|
| 646 |
|
|
When updating real names, only insert PHI nodes for a real name
|
| 647 |
|
|
@code{O_j} in blocks reached by all the new and old definitions for
|
| 648 |
|
|
@code{O_j}. If the iterated dominance frontier for @code{O_j}
|
| 649 |
|
|
is not pruned, we may end up inserting PHI nodes in blocks that
|
| 650 |
|
|
have one or more edges with no incoming definition for
|
| 651 |
|
|
@code{O_j}. This would lead to uninitialized warnings for
|
| 652 |
|
|
@code{O_j}'s symbol@.
|
| 653 |
|
|
|
| 654 |
|
|
@item @code{TODO_update_ssa_no_phi}. Update the SSA form without
|
| 655 |
|
|
inserting any new PHI nodes at all. This is used by passes that
|
| 656 |
|
|
have either inserted all the PHI nodes themselves or passes that
|
| 657 |
|
|
need only to patch use-def and def-def chains for virtuals
|
| 658 |
|
|
(e.g., DCE)@.
|
| 659 |
|
|
|
| 660 |
|
|
|
| 661 |
|
|
@item @code{TODO_update_ssa_full_phi}. Insert PHI nodes everywhere
|
| 662 |
|
|
they are needed. No pruning of the IDF is done. This is used
|
| 663 |
|
|
by passes that need the PHI nodes for @code{O_j} even if it
|
| 664 |
|
|
means that some arguments will come from the default definition
|
| 665 |
|
|
of @code{O_j}'s symbol (e.g., @code{pass_linear_transform})@.
|
| 666 |
|
|
|
| 667 |
|
|
WARNING: If you need to use this flag, chances are that your
|
| 668 |
|
|
pass may be doing something wrong. Inserting PHI nodes for an
|
| 669 |
|
|
old name where not all edges carry a new replacement may lead to
|
| 670 |
|
|
silent codegen errors or spurious uninitialized warnings@.
|
| 671 |
|
|
|
| 672 |
|
|
@item @code{TODO_update_ssa_only_virtuals}. Passes that update the
|
| 673 |
|
|
SSA form on their own may want to delegate the updating of
|
| 674 |
|
|
virtual names to the generic updater. Since FUD chains are
|
| 675 |
|
|
easier to maintain, this simplifies the work they need to do.
|
| 676 |
|
|
NOTE: If this flag is used, any OLD->NEW mappings for real names
|
| 677 |
|
|
are explicitly destroyed and only the symbols marked for
|
| 678 |
|
|
renaming are processed@.
|
| 679 |
|
|
@end itemize
|
| 680 |
|
|
|
| 681 |
|
|
@subsection Preserving the virtual SSA form
|
| 682 |
|
|
@cindex preserving virtual SSA form
|
| 683 |
|
|
|
| 684 |
|
|
The virtual SSA form is harder to preserve than the non-virtual SSA form
|
| 685 |
|
|
mainly because the set of virtual operands for a statement may change at
|
| 686 |
|
|
what some would consider unexpected times. In general, statement
|
| 687 |
|
|
modifications should be bracketed between calls to
|
| 688 |
|
|
@code{push_stmt_changes} and @code{pop_stmt_changes}. For example,
|
| 689 |
|
|
|
| 690 |
|
|
@smallexample
|
| 691 |
|
|
munge_stmt (tree stmt)
|
| 692 |
|
|
@{
|
| 693 |
|
|
push_stmt_changes (&stmt);
|
| 694 |
|
|
@dots{} rewrite STMT @dots{}
|
| 695 |
|
|
pop_stmt_changes (&stmt);
|
| 696 |
|
|
@}
|
| 697 |
|
|
@end smallexample
|
| 698 |
|
|
|
| 699 |
|
|
The call to @code{push_stmt_changes} saves the current state of the
|
| 700 |
|
|
statement operands and the call to @code{pop_stmt_changes} compares
|
| 701 |
|
|
the saved state with the current one and does the appropriate symbol
|
| 702 |
|
|
marking for the SSA renamer.
|
| 703 |
|
|
|
| 704 |
|
|
It is possible to modify several statements at a time, provided that
|
| 705 |
|
|
@code{push_stmt_changes} and @code{pop_stmt_changes} are called in
|
| 706 |
|
|
LIFO order, as when processing a stack of statements.
|
| 707 |
|
|
|
| 708 |
|
|
Additionally, if the pass discovers that it did not need to make
|
| 709 |
|
|
changes to the statement after calling @code{push_stmt_changes}, it
|
| 710 |
|
|
can simply discard the topmost change buffer by calling
|
| 711 |
|
|
@code{discard_stmt_changes}. This will avoid the expensive operand
|
| 712 |
|
|
re-scan operation and the buffer comparison that determines if symbols
|
| 713 |
|
|
need to be marked for renaming.
|
| 714 |
|
|
|
| 715 |
|
|
@subsection Examining @code{SSA_NAME} nodes
|
| 716 |
|
|
@cindex examining SSA_NAMEs
|
| 717 |
|
|
|
| 718 |
|
|
The following macros can be used to examine @code{SSA_NAME} nodes
|
| 719 |
|
|
|
| 720 |
|
|
@defmac SSA_NAME_DEF_STMT (@var{var})
|
| 721 |
|
|
Returns the statement @var{s} that creates the @code{SSA_NAME}
|
| 722 |
|
|
@var{var}. If @var{s} is an empty statement (i.e., @code{IS_EMPTY_STMT
|
| 723 |
|
|
(@var{s})} returns @code{true}), it means that the first reference to
|
| 724 |
|
|
this variable is a USE or a VUSE@.
|
| 725 |
|
|
@end defmac
|
| 726 |
|
|
|
| 727 |
|
|
@defmac SSA_NAME_VERSION (@var{var})
|
| 728 |
|
|
Returns the version number of the @code{SSA_NAME} object @var{var}.
|
| 729 |
|
|
@end defmac
|
| 730 |
|
|
|
| 731 |
|
|
|
| 732 |
|
|
@subsection Walking use-def chains
|
| 733 |
|
|
|
| 734 |
|
|
@deftypefn {Tree SSA function} void walk_use_def_chains (@var{var}, @var{fn}, @var{data})
|
| 735 |
|
|
|
| 736 |
|
|
Walks use-def chains starting at the @code{SSA_NAME} node @var{var}.
|
| 737 |
|
|
Calls function @var{fn} at each reaching definition found. Function
|
| 738 |
|
|
@var{FN} takes three arguments: @var{var}, its defining statement
|
| 739 |
|
|
(@var{def_stmt}) and a generic pointer to whatever state information
|
| 740 |
|
|
that @var{fn} may want to maintain (@var{data}). Function @var{fn} is
|
| 741 |
|
|
able to stop the walk by returning @code{true}, otherwise in order to
|
| 742 |
|
|
continue the walk, @var{fn} should return @code{false}.
|
| 743 |
|
|
|
| 744 |
|
|
Note, that if @var{def_stmt} is a @code{PHI} node, the semantics are
|
| 745 |
|
|
slightly different. For each argument @var{arg} of the PHI node, this
|
| 746 |
|
|
function will:
|
| 747 |
|
|
|
| 748 |
|
|
@enumerate
|
| 749 |
|
|
@item Walk the use-def chains for @var{arg}.
|
| 750 |
|
|
@item Call @code{FN (@var{arg}, @var{phi}, @var{data})}.
|
| 751 |
|
|
@end enumerate
|
| 752 |
|
|
|
| 753 |
|
|
Note how the first argument to @var{fn} is no longer the original
|
| 754 |
|
|
variable @var{var}, but the PHI argument currently being examined.
|
| 755 |
|
|
If @var{fn} wants to get at @var{var}, it should call
|
| 756 |
|
|
@code{PHI_RESULT} (@var{phi}).
|
| 757 |
|
|
@end deftypefn
|
| 758 |
|
|
|
| 759 |
|
|
@subsection Walking the dominator tree
|
| 760 |
|
|
|
| 761 |
|
|
@deftypefn {Tree SSA function} void walk_dominator_tree (@var{walk_data}, @var{bb})
|
| 762 |
|
|
|
| 763 |
|
|
This function walks the dominator tree for the current CFG calling a
|
| 764 |
|
|
set of callback functions defined in @var{struct dom_walk_data} in
|
| 765 |
|
|
@file{domwalk.h}. The call back functions you need to define give you
|
| 766 |
|
|
hooks to execute custom code at various points during traversal:
|
| 767 |
|
|
|
| 768 |
|
|
@enumerate
|
| 769 |
|
|
@item Once to initialize any local data needed while processing
|
| 770 |
|
|
@var{bb} and its children. This local data is pushed into an
|
| 771 |
|
|
internal stack which is automatically pushed and popped as the
|
| 772 |
|
|
walker traverses the dominator tree.
|
| 773 |
|
|
|
| 774 |
|
|
@item Once before traversing all the statements in the @var{bb}.
|
| 775 |
|
|
|
| 776 |
|
|
@item Once for every statement inside @var{bb}.
|
| 777 |
|
|
|
| 778 |
|
|
@item Once after traversing all the statements and before recursing
|
| 779 |
|
|
into @var{bb}'s dominator children.
|
| 780 |
|
|
|
| 781 |
|
|
@item It then recurses into all the dominator children of @var{bb}.
|
| 782 |
|
|
|
| 783 |
|
|
@item After recursing into all the dominator children of @var{bb} it
|
| 784 |
|
|
can, optionally, traverse every statement in @var{bb} again
|
| 785 |
|
|
(i.e., repeating steps 2 and 3).
|
| 786 |
|
|
|
| 787 |
|
|
@item Once after walking the statements in @var{bb} and @var{bb}'s
|
| 788 |
|
|
dominator children. At this stage, the block local data stack
|
| 789 |
|
|
is popped.
|
| 790 |
|
|
@end enumerate
|
| 791 |
|
|
@end deftypefn
|
| 792 |
|
|
|
| 793 |
|
|
@node Alias analysis
|
| 794 |
|
|
@section Alias analysis
|
| 795 |
|
|
@cindex alias
|
| 796 |
|
|
@cindex flow-sensitive alias analysis
|
| 797 |
|
|
@cindex flow-insensitive alias analysis
|
| 798 |
|
|
|
| 799 |
|
|
Alias analysis in GIMPLE SSA form consists of two pieces. First
|
| 800 |
|
|
the virtual SSA web ties conflicting memory accesses and provides
|
| 801 |
|
|
a SSA use-def chain and SSA immediate-use chains for walking
|
| 802 |
|
|
possibly dependent memory accesses. Second an alias-oracle can
|
| 803 |
|
|
be queried to disambiguate explicit and implicit memory references.
|
| 804 |
|
|
|
| 805 |
|
|
@enumerate
|
| 806 |
|
|
@item Memory SSA form.
|
| 807 |
|
|
|
| 808 |
|
|
All statements that may use memory have exactly one accompanied use of
|
| 809 |
|
|
a virtual SSA name that represents the state of memory at the
|
| 810 |
|
|
given point in the IL.
|
| 811 |
|
|
|
| 812 |
|
|
All statements that may define memory have exactly one accompanied
|
| 813 |
|
|
definition of a virtual SSA name using the previous state of memory
|
| 814 |
|
|
and defining the new state of memory after the given point in the IL.
|
| 815 |
|
|
|
| 816 |
|
|
@smallexample
|
| 817 |
|
|
int i;
|
| 818 |
|
|
int foo (void)
|
| 819 |
|
|
@{
|
| 820 |
|
|
# .MEM_3 = VDEF <.MEM_2(D)>
|
| 821 |
|
|
i = 1;
|
| 822 |
|
|
# VUSE <.MEM_3>
|
| 823 |
|
|
return i;
|
| 824 |
|
|
@}
|
| 825 |
|
|
@end smallexample
|
| 826 |
|
|
|
| 827 |
|
|
The virtual SSA names in this case are @code{.MEM_2(D)} and
|
| 828 |
|
|
@code{.MEM_3}. The store to the global variable @code{i}
|
| 829 |
|
|
defines @code{.MEM_3} invalidating @code{.MEM_2(D)}. The
|
| 830 |
|
|
load from @code{i} uses that new state @code{.MEM_3}.
|
| 831 |
|
|
|
| 832 |
|
|
The virtual SSA web serves as constraints to SSA optimizers
|
| 833 |
|
|
preventing illegitimate code-motion and optimization. It
|
| 834 |
|
|
also provides a way to walk related memory statements.
|
| 835 |
|
|
|
| 836 |
|
|
@item Points-to and escape analysis.
|
| 837 |
|
|
|
| 838 |
|
|
Points-to analysis builds a set of constraints from the GIMPLE
|
| 839 |
|
|
SSA IL representing all pointer operations and facts we do
|
| 840 |
|
|
or do not know about pointers. Solving this set of constraints
|
| 841 |
|
|
yields a conservatively correct solution for each pointer
|
| 842 |
|
|
variable in the program (though we are only interested in
|
| 843 |
|
|
SSA name pointers) as to what it may possibly point to.
|
| 844 |
|
|
|
| 845 |
|
|
This points-to solution for a given SSA name pointer is stored
|
| 846 |
|
|
in the @code{pt_solution} sub-structure of the
|
| 847 |
|
|
@code{SSA_NAME_PTR_INFO} record. The following accessor
|
| 848 |
|
|
functions are available:
|
| 849 |
|
|
|
| 850 |
|
|
@itemize @bullet
|
| 851 |
|
|
@item @code{pt_solution_includes}
|
| 852 |
|
|
@item @code{pt_solutions_intersect}
|
| 853 |
|
|
@end itemize
|
| 854 |
|
|
|
| 855 |
|
|
Points-to analysis also computes the solution for two special
|
| 856 |
|
|
set of pointers, @code{ESCAPED} and @code{CALLUSED}. Those
|
| 857 |
|
|
represent all memory that has escaped the scope of analysis
|
| 858 |
|
|
or that is used by pure or nested const calls.
|
| 859 |
|
|
|
| 860 |
|
|
@item Type-based alias analysis
|
| 861 |
|
|
|
| 862 |
|
|
Type-based alias analysis is frontend dependent though generic
|
| 863 |
|
|
support is provided by the middle-end in @code{alias.c}. TBAA
|
| 864 |
|
|
code is used by both tree optimizers and RTL optimizers.
|
| 865 |
|
|
|
| 866 |
|
|
Every language that wishes to perform language-specific alias analysis
|
| 867 |
|
|
should define a function that computes, given a @code{tree}
|
| 868 |
|
|
node, an alias set for the node. Nodes in different alias sets are not
|
| 869 |
|
|
allowed to alias. For an example, see the C front-end function
|
| 870 |
|
|
@code{c_get_alias_set}.
|
| 871 |
|
|
|
| 872 |
|
|
@item Tree alias-oracle
|
| 873 |
|
|
|
| 874 |
|
|
The tree alias-oracle provides means to disambiguate two memory
|
| 875 |
|
|
references and memory references against statements. The following
|
| 876 |
|
|
queries are available:
|
| 877 |
|
|
|
| 878 |
|
|
@itemize @bullet
|
| 879 |
|
|
@item @code{refs_may_alias_p}
|
| 880 |
|
|
@item @code{ref_maybe_used_by_stmt_p}
|
| 881 |
|
|
@item @code{stmt_may_clobber_ref_p}
|
| 882 |
|
|
@end itemize
|
| 883 |
|
|
|
| 884 |
|
|
In addition to those two kind of statement walkers are available
|
| 885 |
|
|
walking statements related to a reference ref.
|
| 886 |
|
|
@code{walk_non_aliased_vuses} walks over dominating memory defining
|
| 887 |
|
|
statements and calls back if the statement does not clobber ref
|
| 888 |
|
|
providing the non-aliased VUSE. The walk stops at
|
| 889 |
|
|
the first clobbering statement or if asked to.
|
| 890 |
|
|
@code{walk_aliased_vdefs} walks over dominating memory defining
|
| 891 |
|
|
statements and calls back on each statement clobbering ref
|
| 892 |
|
|
providing its aliasing VDEF. The walk stops if asked to.
|
| 893 |
|
|
|
| 894 |
|
|
@end enumerate
|
| 895 |
|
|
|
| 896 |
|
|
|
| 897 |
|
|
@node Memory model
|
| 898 |
|
|
@section Memory model
|
| 899 |
|
|
@cindex memory model
|
| 900 |
|
|
|
| 901 |
|
|
The memory model used by the middle-end models that of the C/C++
|
| 902 |
|
|
languages. The middle-end has the notion of an effective type
|
| 903 |
|
|
of a memory region which is used for type-based alias analysis.
|
| 904 |
|
|
|
| 905 |
|
|
The following is a refinement of ISO C99 6.5/6, clarifying the block copy case
|
| 906 |
|
|
to follow common sense and extending the concept of a dynamic effective
|
| 907 |
|
|
type to objects with a declared type as required for C++.
|
| 908 |
|
|
|
| 909 |
|
|
@smallexample
|
| 910 |
|
|
The effective type of an object for an access to its stored value is
|
| 911 |
|
|
the declared type of the object or the effective type determined by
|
| 912 |
|
|
a previous store to it. If a value is stored into an object through
|
| 913 |
|
|
an lvalue having a type that is not a character type, then the
|
| 914 |
|
|
type of the lvalue becomes the effective type of the object for that
|
| 915 |
|
|
access and for subsequent accesses that do not modify the stored value.
|
| 916 |
|
|
If a value is copied into an object using @code{memcpy} or @code{memmove},
|
| 917 |
|
|
or is copied as an array of character type, then the effective type
|
| 918 |
|
|
of the modified object for that access and for subsequent accesses that
|
| 919 |
|
|
do not modify the value is undetermined. For all other accesses to an
|
| 920 |
|
|
object, the effective type of the object is simply the type of the
|
| 921 |
|
|
lvalue used for the access.
|
| 922 |
|
|
@end smallexample
|
| 923 |
|
|
|