OpenCores
URL https://opencores.org/ocsvn/scarts/scarts/trunk

Subversion Repositories scarts

[/] [scarts/] [trunk/] [toolchain/] [scarts-gcc/] [gcc-4.1.1/] [gcc/] [doc/] [tree-ssa.texi] - Blame information for rev 20

Go to most recent revision | Details | Compare with Previous | View Log

Line No. Rev Author Line
1 12 jlechner
@c Copyright (c) 2004, 2005 Free Software Foundation, Inc.
2
@c Free Software Foundation, Inc.
3
@c This is part of the GCC manual.
4
@c For copying conditions, see the file gcc.texi.
5
 
6
@c ---------------------------------------------------------------------
7
@c Tree SSA
8
@c ---------------------------------------------------------------------
9
 
10
@node Tree SSA
11
@chapter Analysis and Optimization of GIMPLE Trees
12
@cindex Tree SSA
13
@cindex Optimization infrastructure for GIMPLE
14
 
15
GCC uses three main intermediate languages to represent the program
16
during compilation: GENERIC, GIMPLE and RTL@.  GENERIC is a
17
language-independent representation generated by each front end.  It
18
is used to serve as an interface between the parser and optimizer.
19
GENERIC is a common representation that is able to represent programs
20
written in all the languages supported by GCC@.
21
 
22
GIMPLE and RTL are used to optimize the program.  GIMPLE is used for
23
target and language independent optimizations (e.g., inlining,
24
constant propagation, tail call elimination, redundancy elimination,
25
etc).  Much like GENERIC, GIMPLE is a language independent, tree based
26
representation.  However, it differs from GENERIC in that the GIMPLE
27
grammar is more restrictive: expressions contain no more than 3
28
operands (except function calls), it has no control flow structures
29
and expressions with side-effects are only allowed on the right hand
30
side of assignments.  See the chapter describing GENERIC and GIMPLE
31
for more details.
32
 
33
This chapter describes the data structures and functions used in the
34
GIMPLE optimizers (also known as ``tree optimizers'' or ``middle
35
end'').  In particular, it focuses on all the macros, data structures,
36
functions and programming constructs needed to implement optimization
37
passes for GIMPLE@.
38
 
39
@menu
40
* GENERIC::             A high-level language-independent representation.
41
* GIMPLE::              A lower-level factored tree representation.
42
* Annotations::         Attributes for statements and variables.
43
* Statement Operands::  Variables referenced by GIMPLE statements.
44
* SSA::                 Static Single Assignment representation.
45
* Alias analysis::      Representing aliased loads and stores.
46
@end menu
47
 
48
@node GENERIC
49
@section GENERIC
50
@cindex GENERIC
51
 
52
The purpose of GENERIC is simply to provide a language-independent way of
53
representing an entire function in trees.  To this end, it was necessary to
54
add a few new tree codes to the back end, but most everything was already
55
there.  If you can express it with the codes in @code{gcc/tree.def}, it's
56
GENERIC@.
57
 
58
Early on, there was a great deal of debate about how to think about
59
statements in a tree IL@.  In GENERIC, a statement is defined as any
60
expression whose value, if any, is ignored.  A statement will always
61
have @code{TREE_SIDE_EFFECTS} set (or it will be discarded), but a
62
non-statement expression may also have side effects.  A
63
@code{CALL_EXPR}, for instance.
64
 
65
It would be possible for some local optimizations to work on the
66
GENERIC form of a function; indeed, the adapted tree inliner works
67
fine on GENERIC, but the current compiler performs inlining after
68
lowering to GIMPLE (a restricted form described in the next section).
69
Indeed, currently the frontends perform this lowering before handing
70
off to @code{tree_rest_of_compilation}, but this seems inelegant.
71
 
72
If necessary, a front end can use some language-dependent tree codes
73
in its GENERIC representation, so long as it provides a hook for
74
converting them to GIMPLE and doesn't expect them to work with any
75
(hypothetical) optimizers that run before the conversion to GIMPLE@.
76
The intermediate representation used while parsing C and C++ looks
77
very little like GENERIC, but the C and C++ gimplifier hooks are
78
perfectly happy to take it as input and spit out GIMPLE@.
79
 
80
@node GIMPLE
81
@section GIMPLE
82
@cindex GIMPLE
83
 
84
GIMPLE is a simplified subset of GENERIC for use in optimization.  The
85
particular subset chosen (and the name) was heavily influenced by the
86
SIMPLE IL used by the McCAT compiler project at McGill University,
87
though we have made some different choices.  For one thing, SIMPLE
88
doesn't support @code{goto}; a production compiler can't afford that
89
kind of restriction.
90
 
91
GIMPLE retains much of the structure of the parse trees: lexical
92
scopes are represented as containers, rather than markers.  However,
93
expressions are broken down into a 3-address form, using temporary
94
variables to hold intermediate values.  Also, control structures are
95
lowered to gotos.
96
 
97
In GIMPLE no container node is ever used for its value; if a
98
@code{COND_EXPR} or @code{BIND_EXPR} has a value, it is stored into a
99
temporary within the controlled blocks, and that temporary is used in
100
place of the container.
101
 
102
The compiler pass which lowers GENERIC to GIMPLE is referred to as the
103
@samp{gimplifier}.  The gimplifier works recursively, replacing complex
104
statements with sequences of simple statements.
105
 
106
@c Currently, the only way to
107
@c tell whether or not an expression is in GIMPLE form is by recursively
108
@c examining it; in the future there will probably be a flag to help avoid
109
@c redundant work.  FIXME FIXME
110
 
111
@menu
112
* Interfaces::
113
* Temporaries::
114
* GIMPLE Expressions::
115
* Statements::
116
* GIMPLE Example::
117
* Rough GIMPLE Grammar::
118
@end menu
119
 
120
@node Interfaces
121
@subsection Interfaces
122
@cindex gimplification
123
 
124
The tree representation of a function is stored in
125
@code{DECL_SAVED_TREE}.  It is lowered to GIMPLE by a call to
126
@code{gimplify_function_tree}.
127
 
128
If a front end wants to include language-specific tree codes in the tree
129
representation which it provides to the back end, it must provide a
130
definition of @code{LANG_HOOKS_GIMPLIFY_EXPR} which knows how to
131
convert the front end trees to GIMPLE@.  Usually such a hook will involve
132
much of the same code for expanding front end trees to RTL@.  This function
133
can return fully lowered GIMPLE, or it can return GENERIC trees and let the
134
main gimplifier lower them the rest of the way; this is often simpler.
135
 
136
The C and C++ front ends currently convert directly from front end
137
trees to GIMPLE, and hand that off to the back end rather than first
138
converting to GENERIC@.  Their gimplifier hooks know about all the
139
@code{_STMT} nodes and how to convert them to GENERIC forms.  There
140
was some work done on a genericization pass which would run first, but
141
the existence of @code{STMT_EXPR} meant that in order to convert all
142
of the C statements into GENERIC equivalents would involve walking the
143
entire tree anyway, so it was simpler to lower all the way.  This
144
might change in the future if someone writes an optimization pass
145
which would work better with higher-level trees, but currently the
146
optimizers all expect GIMPLE@.
147
 
148
A front end which wants to use the tree optimizers (and already has
149
some sort of whole-function tree representation) only needs to provide
150
a definition of @code{LANG_HOOKS_GIMPLIFY_EXPR}, call
151
@code{gimplify_function_tree} to lower to GIMPLE, and then hand off to
152
@code{tree_rest_of_compilation} to compile and output the function.
153
 
154
You can tell the compiler to dump a C-like representation of the GIMPLE
155
form with the flag @option{-fdump-tree-gimple}.
156
 
157
@node Temporaries
158
@subsection Temporaries
159
@cindex Temporaries
160
 
161
When gimplification encounters a subexpression which is too complex, it
162
creates a new temporary variable to hold the value of the subexpression,
163
and adds a new statement to initialize it before the current statement.
164
These special temporaries are known as @samp{expression temporaries}, and are
165
allocated using @code{get_formal_tmp_var}.  The compiler tries to
166
always evaluate identical expressions into the same temporary, to simplify
167
elimination of redundant calculations.
168
 
169
We can only use expression temporaries when we know that it will not be
170
reevaluated before its value is used, and that it will not be otherwise
171
modified@footnote{These restrictions are derived from those in Morgan 4.8.}.
172
Other temporaries can be allocated using
173
@code{get_initialized_tmp_var} or @code{create_tmp_var}.
174
 
175
Currently, an expression like @code{a = b + 5} is not reduced any
176
further.  We tried converting it to something like
177
@smallexample
178
  T1 = b + 5;
179
  a = T1;
180
@end smallexample
181
but this bloated the representation for minimal benefit.  However, a
182
variable which must live in memory cannot appear in an expression; its
183
value is explicitly loaded into a temporary first.  Similarly, storing
184
the value of an expression to a memory variable goes through a
185
temporary.
186
 
187
@node GIMPLE Expressions
188
@subsection Expressions
189
@cindex GIMPLE Expressions
190
 
191
In general, expressions in GIMPLE consist of an operation and the
192
appropriate number of simple operands; these operands must either be a
193
GIMPLE rvalue (@code{is_gimple_val}), i.e.@: a constant or a register
194
variable.  More complex operands are factored out into temporaries, so
195
that
196
@smallexample
197
  a = b + c + d
198
@end smallexample
199
becomes
200
@smallexample
201
  T1 = b + c;
202
  a = T1 + d;
203
@end smallexample
204
 
205
The same rule holds for arguments to a @code{CALL_EXPR}.
206
 
207
The target of an assignment is usually a variable, but can also be an
208
@code{INDIRECT_REF} or a compound lvalue as described below.
209
 
210
@menu
211
* Compound Expressions::
212
* Compound Lvalues::
213
* Conditional Expressions::
214
* Logical Operators::
215
@end menu
216
 
217
@node Compound Expressions
218
@subsubsection Compound Expressions
219
@cindex Compound Expressions
220
 
221
The left-hand side of a C comma expression is simply moved into a separate
222
statement.
223
 
224
@node Compound Lvalues
225
@subsubsection Compound Lvalues
226
@cindex Compound Lvalues
227
 
228
Currently compound lvalues involving array and structure field references
229
are not broken down; an expression like @code{a.b[2] = 42} is not reduced
230
any further (though complex array subscripts are).  This restriction is a
231
workaround for limitations in later optimizers; if we were to convert this
232
to
233
 
234
@smallexample
235
  T1 = &a.b;
236
  T1[2] = 42;
237
@end smallexample
238
 
239
alias analysis would not remember that the reference to @code{T1[2]} came
240
by way of @code{a.b}, so it would think that the assignment could alias
241
another member of @code{a}; this broke @code{struct-alias-1.c}.  Future
242
optimizer improvements may make this limitation unnecessary.
243
 
244
@node Conditional Expressions
245
@subsubsection Conditional Expressions
246
@cindex Conditional Expressions
247
 
248
A C @code{?:} expression is converted into an @code{if} statement with
249
each branch assigning to the same temporary.  So,
250
 
251
@smallexample
252
  a = b ? c : d;
253
@end smallexample
254
becomes
255
@smallexample
256
  if (b)
257
    T1 = c;
258
  else
259
    T1 = d;
260
  a = T1;
261
@end smallexample
262
 
263
Tree level if-conversion pass re-introduces @code{?:} expression, if appropriate.
264
It is used to vectorize loops with conditions using vector conditional operations.
265
 
266
Note that in GIMPLE, @code{if} statements are also represented using
267
@code{COND_EXPR}, as described below.
268
 
269
@node Logical Operators
270
@subsubsection Logical Operators
271
@cindex Logical Operators
272
 
273
Except when they appear in the condition operand of a @code{COND_EXPR},
274
logical `and' and `or' operators are simplified as follows:
275
@code{a = b && c} becomes
276
 
277
@smallexample
278
  T1 = (bool)b;
279
  if (T1)
280
    T1 = (bool)c;
281
  a = T1;
282
@end smallexample
283
 
284
Note that @code{T1} in this example cannot be an expression temporary,
285
because it has two different assignments.
286
 
287
@node Statements
288
@subsection Statements
289
@cindex Statements
290
 
291
Most statements will be assignment statements, represented by
292
@code{MODIFY_EXPR}.  A @code{CALL_EXPR} whose value is ignored can
293
also be a statement.  No other C expressions can appear at statement level;
294
a reference to a volatile object is converted into a @code{MODIFY_EXPR}.
295
In GIMPLE form, type of @code{MODIFY_EXPR} is not meaningful.  Instead, use type
296
of LHS or RHS@.
297
 
298
There are also several varieties of complex statements.
299
 
300
@menu
301
* Blocks::
302
* Statement Sequences::
303
* Empty Statements::
304
* Loops::
305
* Selection Statements::
306
* Jumps::
307
* Cleanups::
308
* GIMPLE Exception Handling::
309
@end menu
310
 
311
@node Blocks
312
@subsubsection Blocks
313
@cindex Blocks
314
 
315
Block scopes and the variables they declare in GENERIC and GIMPLE are
316
expressed using the @code{BIND_EXPR} code, which in previous versions of
317
GCC was primarily used for the C statement-expression extension.
318
 
319
Variables in a block are collected into @code{BIND_EXPR_VARS} in
320
declaration order.  Any runtime initialization is moved out of
321
@code{DECL_INITIAL} and into a statement in the controlled block.  When
322
gimplifying from C or C++, this initialization replaces the
323
@code{DECL_STMT}.
324
 
325
Variable-length arrays (VLAs) complicate this process, as their size often
326
refers to variables initialized earlier in the block.  To handle this, we
327
currently split the block at that point, and move the VLA into a new, inner
328
@code{BIND_EXPR}.  This strategy may change in the future.
329
 
330
@code{DECL_SAVED_TREE} for a GIMPLE function will always be a
331
@code{BIND_EXPR} which contains declarations for the temporary variables
332
used in the function.
333
 
334
A C++ program will usually contain more @code{BIND_EXPR}s than there are
335
syntactic blocks in the source code, since several C++ constructs have
336
implicit scopes associated with them.  On the other hand, although the C++
337
front end uses pseudo-scopes to handle cleanups for objects with
338
destructors, these don't translate into the GIMPLE form; multiple
339
declarations at the same level use the same @code{BIND_EXPR}.
340
 
341
@node Statement Sequences
342
@subsubsection Statement Sequences
343
@cindex Statement Sequences
344
 
345
Multiple statements at the same nesting level are collected into a
346
@code{STATEMENT_LIST}.  Statement lists are modified and traversed
347
using the interface in @samp{tree-iterator.h}.
348
 
349
@node Empty Statements
350
@subsubsection Empty Statements
351
@cindex Empty Statements
352
 
353
Whenever possible, statements with no effect are discarded.  But if they
354
are nested within another construct which cannot be discarded for some
355
reason, they are instead replaced with an empty statement, generated by
356
@code{build_empty_stmt}.  Initially, all empty statements were shared,
357
after the pattern of the Java front end, but this caused a lot of trouble in
358
practice.
359
 
360
An empty statement is represented as @code{(void)0}.
361
 
362
@node Loops
363
@subsubsection Loops
364
@cindex Loops
365
 
366
At one time loops were expressed in GIMPLE using @code{LOOP_EXPR}, but
367
now they are lowered to explicit gotos.
368
 
369
@node Selection Statements
370
@subsubsection Selection Statements
371
@cindex Selection Statements
372
 
373
A simple selection statement, such as the C @code{if} statement, is
374
expressed in GIMPLE using a void @code{COND_EXPR}.  If only one branch is
375
used, the other is filled with an empty statement.
376
 
377
Normally, the condition expression is reduced to a simple comparison.  If
378
it is a shortcut (@code{&&} or @code{||}) expression, however, we try to
379
break up the @code{if} into multiple @code{if}s so that the implied shortcut
380
is taken directly, much like the transformation done by @code{do_jump} in
381
the RTL expander.
382
 
383
A @code{SWITCH_EXPR} in GIMPLE contains the condition and a
384
@code{TREE_VEC} of @code{CASE_LABEL_EXPR}s describing the case values
385
and corresponding @code{LABEL_DECL}s to jump to.  The body of the
386
@code{switch} is moved after the @code{SWITCH_EXPR}.
387
 
388
@node Jumps
389
@subsubsection Jumps
390
@cindex Jumps
391
 
392
Other jumps are expressed by either @code{GOTO_EXPR} or @code{RETURN_EXPR}.
393
 
394
The operand of a @code{GOTO_EXPR} must be either a label or a variable
395
containing the address to jump to.
396
 
397
The operand of a @code{RETURN_EXPR} is either @code{NULL_TREE} or a
398
@code{MODIFY_EXPR} which sets the return value.  It would be nice to
399
move the @code{MODIFY_EXPR} into a separate statement, but the special
400
return semantics in @code{expand_return} make that difficult.  It may
401
still happen in the future, perhaps by moving most of that logic into
402
@code{expand_assignment}.
403
 
404
@node Cleanups
405
@subsubsection Cleanups
406
@cindex Cleanups
407
 
408
Destructors for local C++ objects and similar dynamic cleanups are
409
represented in GIMPLE by a @code{TRY_FINALLY_EXPR}.
410
@code{TRY_FINALLY_EXPR} has two operands, both of which are a sequence
411
of statements to execute.  The first sequence is executed.  When it
412
completes the second sequence is executed.
413
 
414
The first sequence may complete in the following ways:
415
 
416
@enumerate
417
 
418
@item Execute the last statement in the sequence and fall off the
419
end.
420
 
421
@item Execute a goto statement (@code{GOTO_EXPR}) to an ordinary
422
label outside the sequence.
423
 
424
@item Execute a return statement (@code{RETURN_EXPR}).
425
 
426
@item Throw an exception.  This is currently not explicitly represented in
427
GIMPLE.
428
 
429
@end enumerate
430
 
431
The second sequence is not executed if the first sequence completes by
432
calling @code{setjmp} or @code{exit} or any other function that does
433
not return.  The second sequence is also not executed if the first
434
sequence completes via a non-local goto or a computed goto (in general
435
the compiler does not know whether such a goto statement exits the
436
first sequence or not, so we assume that it doesn't).
437
 
438
After the second sequence is executed, if it completes normally by
439
falling off the end, execution continues wherever the first sequence
440
would have continued, by falling off the end, or doing a goto, etc.
441
 
442
@code{TRY_FINALLY_EXPR} complicates the flow graph, since the cleanup
443
needs to appear on every edge out of the controlled block; this
444
reduces the freedom to move code across these edges.  Therefore, the
445
EH lowering pass which runs before most of the optimization passes
446
eliminates these expressions by explicitly adding the cleanup to each
447
edge.  Rethrowing the exception is represented using @code{RESX_EXPR}.
448
 
449
 
450
@node GIMPLE Exception Handling
451
@subsubsection Exception Handling
452
@cindex GIMPLE Exception Handling
453
 
454
Other exception handling constructs are represented using
455
@code{TRY_CATCH_EXPR}.  @code{TRY_CATCH_EXPR} has two operands.  The
456
first operand is a sequence of statements to execute.  If executing
457
these statements does not throw an exception, then the second operand
458
is ignored.  Otherwise, if an exception is thrown, then the second
459
operand of the @code{TRY_CATCH_EXPR} is checked.  The second operand
460
may have the following forms:
461
 
462
@enumerate
463
 
464
@item A sequence of statements to execute.  When an exception occurs,
465
these statements are executed, and then the exception is rethrown.
466
 
467
@item A sequence of @code{CATCH_EXPR} expressions.  Each @code{CATCH_EXPR}
468
has a list of applicable exception types and handler code.  If the
469
thrown exception matches one of the caught types, the associated
470
handler code is executed.  If the handler code falls off the bottom,
471
execution continues after the original @code{TRY_CATCH_EXPR}.
472
 
473
@item An @code{EH_FILTER_EXPR} expression.  This has a list of
474
permitted exception types, and code to handle a match failure.  If the
475
thrown exception does not match one of the allowed types, the
476
associated match failure code is executed.  If the thrown exception
477
does match, it continues unwinding the stack looking for the next
478
handler.
479
 
480
@end enumerate
481
 
482
Currently throwing an exception is not directly represented in GIMPLE,
483
since it is implemented by calling a function.  At some point in the future
484
we will want to add some way to express that the call will throw an
485
exception of a known type.
486
 
487
Just before running the optimizers, the compiler lowers the high-level
488
EH constructs above into a set of @samp{goto}s, magic labels, and EH
489
regions.  Continuing to unwind at the end of a cleanup is represented
490
with a @code{RESX_EXPR}.
491
 
492
@node GIMPLE Example
493
@subsection GIMPLE Example
494
@cindex GIMPLE Example
495
 
496
@smallexample
497
struct A @{ A(); ~A(); @};
498
 
499
int i;
500
int g();
501
void f()
502
@{
503
  A a;
504
  int j = (--i, i ? 0 : 1);
505
 
506
  for (int x = 42; x > 0; --x)
507
    @{
508
      i += g()*4 + 32;
509
    @}
510
@}
511
@end smallexample
512
 
513
becomes
514
 
515
@smallexample
516
void f()
517
@{
518
  int i.0;
519
  int T.1;
520
  int iftmp.2;
521
  int T.3;
522
  int T.4;
523
  int T.5;
524
  int T.6;
525
 
526
  @{
527
    struct A a;
528
    int j;
529
 
530
    __comp_ctor (&a);
531
    try
532
      @{
533
        i.0 = i;
534
        T.1 = i.0 - 1;
535
        i = T.1;
536
        i.0 = i;
537
        if (i.0 == 0)
538
          iftmp.2 = 1;
539
        else
540
          iftmp.2 = 0;
541
        j = iftmp.2;
542
        @{
543
          int x;
544
 
545
          x = 42;
546
          goto test;
547
          loop:;
548
 
549
          T.3 = g ();
550
          T.4 = T.3 * 4;
551
          i.0 = i;
552
          T.5 = T.4 + i.0;
553
          T.6 = T.5 + 32;
554
          i = T.6;
555
          x = x - 1;
556
 
557
          test:;
558
          if (x > 0)
559
            goto loop;
560
          else
561
            goto break_;
562
          break_:;
563
        @}
564
      @}
565
    finally
566
      @{
567
        __comp_dtor (&a);
568
      @}
569
  @}
570
@}
571
@end smallexample
572
 
573
@node Rough GIMPLE Grammar
574
@subsection Rough GIMPLE Grammar
575
@cindex Rough GIMPLE Grammar
576
 
577
@smallexample
578
   function     : FUNCTION_DECL
579
                        DECL_SAVED_TREE -> compound-stmt
580
 
581
   compound-stmt: STATEMENT_LIST
582
                        members -> stmt
583
 
584
   stmt         : block
585
                | if-stmt
586
                | switch-stmt
587
                | goto-stmt
588
                | return-stmt
589
                | resx-stmt
590
                | label-stmt
591
                | try-stmt
592
                | modify-stmt
593
                | call-stmt
594
 
595
   block        : BIND_EXPR
596
                        BIND_EXPR_VARS -> chain of DECLs
597
                        BIND_EXPR_BLOCK -> BLOCK
598
                        BIND_EXPR_BODY -> compound-stmt
599
 
600
   if-stmt      : COND_EXPR
601
                        op0 -> condition
602
                        op1 -> compound-stmt
603
                        op2 -> compound-stmt
604
 
605
   switch-stmt  : SWITCH_EXPR
606
                        op0 -> val
607
                        op1 -> NULL
608
                        op2 -> TREE_VEC of CASE_LABEL_EXPRs
609
                            The CASE_LABEL_EXPRs are sorted by CASE_LOW,
610
                            and default is last.
611
 
612
   goto-stmt    : GOTO_EXPR
613
                        op0 -> LABEL_DECL | val
614
 
615
   return-stmt  : RETURN_EXPR
616
                        op0 -> return-value
617
 
618
   return-value : NULL
619
                | RESULT_DECL
620
                | MODIFY_EXPR
621
                        op0 -> RESULT_DECL
622
                        op1 -> lhs
623
 
624
   resx-stmt    : RESX_EXPR
625
 
626
   label-stmt   : LABEL_EXPR
627
                        op0 -> LABEL_DECL
628
 
629
   try-stmt     : TRY_CATCH_EXPR
630
                        op0 -> compound-stmt
631
                        op1 -> handler
632
                | TRY_FINALLY_EXPR
633
                        op0 -> compound-stmt
634
                        op1 -> compound-stmt
635
 
636
   handler      : catch-seq
637
                | EH_FILTER_EXPR
638
                | compound-stmt
639
 
640
   catch-seq    : STATEMENT_LIST
641
                        members -> CATCH_EXPR
642
 
643
   modify-stmt  : MODIFY_EXPR
644
                        op0 -> lhs
645
                        op1 -> rhs
646
 
647
   call-stmt    : CALL_EXPR
648
                        op0 -> val | OBJ_TYPE_REF
649
                        op1 -> call-arg-list
650
 
651
   call-arg-list: TREE_LIST
652
                        members -> lhs | CONST
653
 
654
   addr-expr-arg: ID
655
                | compref
656
 
657
   addressable  : addr-expr-arg
658
                | indirectref
659
 
660
   with-size-arg: addressable
661
                | call-stmt
662
 
663
   indirectref  : INDIRECT_REF
664
                        op0 -> val
665
 
666
   lhs          : addressable
667
                | bitfieldref
668
                | WITH_SIZE_EXPR
669
                        op0 -> with-size-arg
670
                        op1 -> val
671
 
672
   min-lval     : ID
673
                | indirectref
674
 
675
   bitfieldref  : BIT_FIELD_REF
676
                        op0 -> inner-compref
677
                        op1 -> CONST
678
                        op2 -> var
679
 
680
   compref      : inner-compref
681
                | TARGET_MEM_REF
682
                        op0 -> ID
683
                        op1 -> val
684
                        op2 -> val
685
                        op3 -> CONST
686
                        op4 -> CONST
687
                | REALPART_EXPR
688
                        op0 -> inner-compref
689
                | IMAGPART_EXPR
690
                        op0 -> inner-compref
691
 
692
   inner-compref: min-lval
693
                | COMPONENT_REF
694
                        op0 -> inner-compref
695
                        op1 -> FIELD_DECL
696
                        op2 -> val
697
                | ARRAY_REF
698
                        op0 -> inner-compref
699
                        op1 -> val
700
                        op2 -> val
701
                        op3 -> val
702
                | ARRAY_RANGE_REF
703
                        op0 -> inner-compref
704
                        op1 -> val
705
                        op2 -> val
706
                        op3 -> val
707
                | VIEW_CONVERT_EXPR
708
                        op0 -> inner-compref
709
 
710
   condition    : val
711
                | RELOP
712
                        op0 -> val
713
                        op1 -> val
714
 
715
   val          : ID
716
                | CONST
717
 
718
   rhs          : lhs
719
                | CONST
720
                | call-stmt
721
                | ADDR_EXPR
722
                        op0 -> addr-expr-arg
723
                | UNOP
724
                        op0 -> val
725
                | BINOP
726
                        op0 -> val
727
                        op1 -> val
728
                | RELOP
729
                        op0 -> val
730
                        op1 -> val
731
                | COND_EXPR
732
                        op0 -> condition
733
                        op1 -> val
734
                        op2 -> val
735
@end smallexample
736
 
737
@node Annotations
738
@section Annotations
739
@cindex annotations
740
 
741
The optimizers need to associate attributes with statements and
742
variables during the optimization process.  For instance, we need to
743
know what basic block a statement belongs to or whether a variable
744
has aliases.  All these attributes are stored in data structures
745
called annotations which are then linked to the field @code{ann} in
746
@code{struct tree_common}.
747
 
748
Presently, we define annotations for statements (@code{stmt_ann_t}),
749
variables (@code{var_ann_t}) and SSA names (@code{ssa_name_ann_t}).
750
Annotations are defined and documented in @file{tree-flow.h}.
751
 
752
 
753
@node Statement Operands
754
@section Statement Operands
755
@cindex operands
756
@cindex virtual operands
757
@cindex real operands
758
@findex update_stmt
759
 
760
Almost every GIMPLE statement will contain a reference to a variable
761
or memory location.  Since statements come in different shapes and
762
sizes, their operands are going to be located at various spots inside
763
the statement's tree.  To facilitate access to the statement's
764
operands, they are organized into lists associated inside each
765
statement's annotation.  Each element in an operand list is a pointer
766
to a @code{VAR_DECL}, @code{PARM_DECL} or @code{SSA_NAME} tree node.
767
This provides a very convenient way of examining and replacing
768
operands.
769
 
770
Data flow analysis and optimization is done on all tree nodes
771
representing variables.  Any node for which @code{SSA_VAR_P} returns
772
nonzero is considered when scanning statement operands.  However, not
773
all @code{SSA_VAR_P} variables are processed in the same way.  For the
774
purposes of optimization, we need to distinguish between references to
775
local scalar variables and references to globals, statics, structures,
776
arrays, aliased variables, etc.  The reason is simple, the compiler
777
can gather complete data flow information for a local scalar.  On the
778
other hand, a global variable may be modified by a function call, it
779
may not be possible to keep track of all the elements of an array or
780
the fields of a structure, etc.
781
 
782
The operand scanner gathers two kinds of operands: @dfn{real} and
783
@dfn{virtual}.  An operand for which @code{is_gimple_reg} returns true
784
is considered real, otherwise it is a virtual operand.  We also
785
distinguish between uses and definitions.  An operand is used if its
786
value is loaded by the statement (e.g., the operand at the RHS of an
787
assignment).  If the statement assigns a new value to the operand, the
788
operand is considered a definition (e.g., the operand at the LHS of
789
an assignment).
790
 
791
Virtual and real operands also have very different data flow
792
properties.  Real operands are unambiguous references to the
793
full object that they represent.  For instance, given
794
 
795
@smallexample
796
@{
797
  int a, b;
798
  a = b
799
@}
800
@end smallexample
801
 
802
Since @code{a} and @code{b} are non-aliased locals, the statement
803
@code{a = b} will have one real definition and one real use because
804
variable @code{b} is completely modified with the contents of
805
variable @code{a}.  Real definition are also known as @dfn{killing
806
definitions}.  Similarly, the use of @code{a} reads all its bits.
807
 
808
In contrast, virtual operands are used with variables that can have
809
a partial or ambiguous reference.  This includes structures, arrays,
810
globals, and aliased variables.  In these cases, we have two types of
811
definitions.  For globals, structures, and arrays, we can determine from
812
a statement whether a variable of these types has a killing definition.
813
If the variable does, then the statement is marked as having a
814
@dfn{must definition} of that variable.  However, if a statement is only
815
defining a part of the variable (i.e.@: a field in a structure), or if we
816
know that a statement might define the variable but we cannot say for sure,
817
then we mark that statement as having a @dfn{may definition}.  For
818
instance, given
819
 
820
@smallexample
821
@{
822
  int a, b, *p;
823
 
824
  if (...)
825
    p = &a;
826
  else
827
    p = &b;
828
  *p = 5;
829
  return *p;
830
@}
831
@end smallexample
832
 
833
The assignment @code{*p = 5} may be a definition of @code{a} or
834
@code{b}.  If we cannot determine statically where @code{p} is
835
pointing to at the time of the store operation, we create virtual
836
definitions to mark that statement as a potential definition site for
837
@code{a} and @code{b}.  Memory loads are similarly marked with virtual
838
use operands.  Virtual operands are shown in tree dumps right before
839
the statement that contains them.  To request a tree dump with virtual
840
operands, use the @option{-vops} option to @option{-fdump-tree}:
841
 
842
@smallexample
843
@{
844
  int a, b, *p;
845
 
846
  if (...)
847
    p = &a;
848
  else
849
    p = &b;
850
  # a = V_MAY_DEF <a>
851
  # b = V_MAY_DEF <b>
852
  *p = 5;
853
 
854
  # VUSE <a>
855
  # VUSE <b>
856
  return *p;
857
@}
858
@end smallexample
859
 
860
Notice that @code{V_MAY_DEF} operands have two copies of the referenced
861
variable.  This indicates that this is not a killing definition of
862
that variable.  In this case we refer to it as a @dfn{may definition}
863
or @dfn{aliased store}.  The presence of the second copy of the
864
variable in the @code{V_MAY_DEF} operand will become important when the
865
function is converted into SSA form.  This will be used to link all
866
the non-killing definitions to prevent optimizations from making
867
incorrect assumptions about them.
868
 
869
Operands are updated as soon as the statement is finished via a call
870
to @code{update_stmt}.  If statement elements are changed via
871
@code{SET_USE} or @code{SET_DEF}, then no further action is required
872
(ie, those macros take care of updating the statement).  If changes
873
are made by manipulating the statement's tree directly, then a call
874
must be made to @code{update_stmt} when complete.  Calling one of the
875
@code{bsi_insert} routines or @code{bsi_replace} performs an implicit
876
call to @code{update_stmt}.
877
 
878
@subsection Operand Iterators And Access Routines
879
@cindex Operand Iterators
880
@cindex Operand Access Routines
881
 
882
Operands are collected by @file{tree-ssa-operands.c}.  They are stored
883
inside each statement's annotation and can be accessed through either the
884
operand iterators or an access routine.
885
 
886
The following access routines are available for examining operands:
887
 
888
@enumerate
889
@item @code{SINGLE_SSA_@{USE,DEF,TREE@}_OPERAND}: These accessors will return
890
NULL unless there is exactly one operand matching the specified flags.  If
891
there is exactly one operand, the operand is returned as either a @code{tree},
892
@code{def_operand_p}, or @code{use_operand_p}.
893
 
894
@smallexample
895
tree t = SINGLE_SSA_TREE_OPERAND (stmt, flags);
896
use_operand_p u = SINGLE_SSA_USE_OPERAND (stmt, SSA_ALL_VIRTUAL_USES);
897
def_operand_p d = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_ALL_DEFS);
898
@end smallexample
899
 
900
@item @code{ZERO_SSA_OPERANDS}: This macro returns true if there are no
901
operands matching the specified flags.
902
 
903
@smallexample
904
if (ZERO_SSA_OPERANDS (stmt, SSA_OP_ALL_VIRTUALS))
905
  return;
906
@end smallexample
907
 
908
@item @code{NUM_SSA_OPERANDS}: This macro Returns the number of operands
909
matching 'flags'.  This actually executes a loop to perform the count, so
910
only use this if it is really needed.
911
 
912
@smallexample
913
int count = NUM_SSA_OPERANDS (stmt, flags)
914
@end smallexample
915
@end enumerate
916
 
917
 
918
If you wish to iterate over some or all operands, use the
919
@code{FOR_EACH_SSA_@{USE,DEF,TREE@}_OPERAND} iterator.  For example, to print
920
all the operands for a statement:
921
 
922
@smallexample
923
void
924
print_ops (tree stmt)
925
@{
926
  ssa_op_iter;
927
  tree var;
928
 
929
  FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_ALL_OPERANDS)
930
    print_generic_expr (stderr, var, TDF_SLIM);
931
@}
932
@end smallexample
933
 
934
 
935
How to choose the appropriate iterator:
936
 
937
@enumerate
938
@item Determine whether you are need to see the operand pointers, or just the
939
    trees, and choose the appropriate macro:
940
 
941
@smallexample
942
Need            Macro:
943
----            -------
944
use_operand_p   FOR_EACH_SSA_USE_OPERAND
945
def_operand_p   FOR_EACH_SSA_DEF_OPERAND
946
tree            FOR_EACH_SSA_TREE_OPERAND
947
@end smallexample
948
 
949
@item You need to declare a variable of the type you are interested
950
    in, and an ssa_op_iter structure which serves as the loop
951
    controlling variable.
952
 
953
@item Determine which operands you wish to use, and specify the flags of
954
    those you are interested in.  They are documented in
955
    @file{tree-ssa-operands.h}:
956
 
957
@smallexample
958
#define SSA_OP_USE              0x01    /* @r{Real USE operands.}  */
959
#define SSA_OP_DEF              0x02    /* @r{Real DEF operands.}  */
960
#define SSA_OP_VUSE             0x04    /* @r{VUSE operands.}  */
961
#define SSA_OP_VMAYUSE          0x08    /* @r{USE portion of V_MAY_DEFS.}  */
962
#define SSA_OP_VMAYDEF          0x10    /* @r{DEF portion of V_MAY_DEFS.}  */
963
#define SSA_OP_VMUSTDEF         0x20    /* @r{V_MUST_DEF definitions.}  */
964
 
965
/* @r{These are commonly grouped operand flags.}  */
966
#define SSA_OP_VIRTUAL_USES     (SSA_OP_VUSE | SSA_OP_VMAYUSE)
967
#define SSA_OP_VIRTUAL_DEFS     (SSA_OP_VMAYDEF | SSA_OP_VMUSTDEF)
968
#define SSA_OP_ALL_USES         (SSA_OP_VIRTUAL_USES | SSA_OP_USE)
969
#define SSA_OP_ALL_DEFS         (SSA_OP_VIRTUAL_DEFS | SSA_OP_DEF)
970
#define SSA_OP_ALL_OPERANDS     (SSA_OP_ALL_USES | SSA_OP_ALL_DEFS)
971
@end smallexample
972
@end enumerate
973
 
974
So if you want to look at the use pointers for all the @code{USE} and
975
@code{VUSE} operands, you would do something like:
976
 
977
@smallexample
978
  use_operand_p use_p;
979
  ssa_op_iter iter;
980
 
981
  FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, (SSA_OP_USE | SSA_OP_VUSE))
982
    @{
983
      process_use_ptr (use_p);
984
    @}
985
@end smallexample
986
 
987
The @code{TREE} macro is basically the same as the @code{USE} and
988
@code{DEF} macros, only with the use or def dereferenced via
989
@code{USE_FROM_PTR (use_p)} and @code{DEF_FROM_PTR (def_p)}.  Since we
990
aren't using operand pointers, use and defs flags can be mixed.
991
 
992
@smallexample
993
  tree var;
994
  ssa_op_iter iter;
995
 
996
  FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_VUSE | SSA_OP_VMUSTDEF)
997
    @{
998
       print_generic_expr (stderr, var, TDF_SLIM);
999
    @}
1000
@end smallexample
1001
 
1002
@code{V_MAY_DEF}s are broken into two flags, one for the
1003
@code{DEF} portion (@code{SSA_OP_VMAYDEF}) and one for the USE portion
1004
(@code{SSA_OP_VMAYUSE}).  If all you want to look at are the
1005
@code{V_MAY_DEF}s together, there is a fourth iterator macro for this,
1006
which returns both a def_operand_p and a use_operand_p for each
1007
@code{V_MAY_DEF} in the statement.  Note that you don't need any flags for
1008
this one.
1009
 
1010
@smallexample
1011
  use_operand_p use_p;
1012
  def_operand_p def_p;
1013
  ssa_op_iter iter;
1014
 
1015
  FOR_EACH_SSA_MAYDEF_OPERAND (def_p, use_p, stmt, iter)
1016
    @{
1017
      my_code;
1018
    @}
1019
@end smallexample
1020
 
1021
@code{V_MUST_DEF}s are broken into two flags, one for the
1022
@code{DEF} portion (@code{SSA_OP_VMUSTDEF}) and one for the kill portion
1023
(@code{SSA_OP_VMUSTKILL}).  If all you want to look at are the
1024
@code{V_MUST_DEF}s together, there is a fourth iterator macro for this,
1025
which returns both a def_operand_p and a use_operand_p for each
1026
@code{V_MUST_DEF} in the statement.  Note that you don't need any flags for
1027
this one.
1028
 
1029
@smallexample
1030
  use_operand_p kill_p;
1031
  def_operand_p def_p;
1032
  ssa_op_iter iter;
1033
 
1034
  FOR_EACH_SSA_MUSTDEF_OPERAND (def_p, kill_p, stmt, iter)
1035
    @{
1036
      my_code;
1037
    @}
1038
@end smallexample
1039
 
1040
 
1041
There are many examples in the code as well, as well as the
1042
documentation in @file{tree-ssa-operands.h}.
1043
 
1044
There are also a couple of variants on the stmt iterators regarding PHI
1045
nodes.
1046
 
1047
@code{FOR_EACH_PHI_ARG} Works exactly like
1048
@code{FOR_EACH_SSA_USE_OPERAND}, except it works over @code{PHI} arguments
1049
instead of statement operands.
1050
 
1051
@smallexample
1052
/* Look at every virtual PHI use.  */
1053
FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_VIRTUAL_USES)
1054
@{
1055
   my_code;
1056
@}
1057
 
1058
/* Look at every real PHI use.  */
1059
FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_USES)
1060
  my_code;
1061
 
1062
/* Look at every every PHI use.  */
1063
FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_ALL_USES)
1064
  my_code;
1065
@end smallexample
1066
 
1067
@code{FOR_EACH_PHI_OR_STMT_@{USE,DEF@}} works exactly like
1068
@code{FOR_EACH_SSA_@{USE,DEF@}_OPERAND}, except it will function on
1069
either a statement or a @code{PHI} node.  These should be used when it is
1070
appropriate but they are not quite as efficient as the individual
1071
@code{FOR_EACH_PHI} and @code{FOR_EACH_SSA} routines.
1072
 
1073
@smallexample
1074
FOR_EACH_PHI_OR_STMT_USE (use_operand_p, stmt, iter, flags)
1075
  @{
1076
     my_code;
1077
  @}
1078
 
1079
FOR_EACH_PHI_OR_STMT_DEF (def_operand_p, phi, iter, flags)
1080
  @{
1081
     my_code;
1082
  @}
1083
@end smallexample
1084
 
1085
@subsection Immediate Uses
1086
@cindex Immediate Uses
1087
 
1088
Immediate use information is now always available.  Using the immediate use
1089
iterators, you may examine every use of any @code{SSA_NAME}. For instance,
1090
to change each use of @code{ssa_var} to @code{ssa_var2}:
1091
 
1092
@smallexample
1093
  use_operand_p imm_use_p;
1094
  imm_use_iterator iterator;
1095
  tree ssa_var
1096
 
1097
  FOR_EACH_IMM_USE_SAFE (imm_use_p, iterator, ssa_var)
1098
    SET_USE (imm_use_p, ssa_var_2);
1099
@end smallexample
1100
 
1101
There are 2 iterators which can be used. @code{FOR_EACH_IMM_USE_FAST} is used
1102
when the immediate uses are not changed, ie. you are looking at the uses, but
1103
not setting them.
1104
 
1105
If they do get changed, then care must be taken that things are not changed
1106
under the iterators, so use the @code{FOR_EACH_IMM_USE_SAFE} iterator.  It
1107
attempts to preserve the sanity of the use list by moving an iterator element
1108
through the use list, preventing insertions and deletions in the list from
1109
resulting in invalid pointers.  This is a little slower since it adds a
1110
placeholder element and moves it through the list.  This element must be
1111
also be removed if the loop is terminated early.  A macro
1112
(@code{BREAK_FROM SAFE_IMM_USE}) is provided for this:
1113
 
1114
@smallexample
1115
  FOR_EACH_IMM_USE_SAFE (use_p, iter, var)
1116
    @{
1117
      if (var == last_var)
1118
        BREAK_FROM_SAFE_IMM_USE (iter);
1119
      else
1120
        SET_USE (use_p, var2);
1121
    @}
1122
@end smallexample
1123
 
1124
There are checks in @code{verify_ssa} which verify that the immediate use list
1125
is up to date, as well as checking that an optimization didn't break from the
1126
loop without using this macro.  It is safe to simply 'break'; from a
1127
@code{FOR_EACH_IMM_USE_FAST} traverse.
1128
 
1129
Some useful functions and macros:
1130
@enumerate
1131
@item  @code{has_zero_uses (ssa_var)} : Returns true if there are no uses of
1132
@code{ssa_var}.
1133
@item   @code{has_single_use (ssa_var)} : Returns true if there is only a
1134
single use of @code{ssa_var}.
1135
@item   @code{single_imm_use (ssa_var, use_operand_p *ptr, tree *stmt)} :
1136
Returns true if there is only a single use of @code{ssa_var}, and also returns
1137
the use pointer and statement it occurs in in the second and third parameters.
1138
@item   @code{num_imm_uses (ssa_var)} : Returns the number of immediate uses of
1139
@code{ssa_var}. It is better not to use this if possible since it simply
1140
utilizes a loop to count the uses.
1141
@item  @code{PHI_ARG_INDEX_FROM_USE (use_p)} : Given a use within a @code{PHI}
1142
node, return the index number for the use.  An assert is triggered if the use
1143
isn't located in a @code{PHI} node.
1144
@item  @code{USE_STMT (use_p)} : Return the statement a use occurs in.
1145
@end enumerate
1146
 
1147
Note that uses are not put into an immediate use list until their statement is
1148
actually inserted into the instruction stream via a @code{bsi_*} routine.
1149
 
1150
It is also still possible to utilize lazy updating of statements, but this
1151
should be used only when absolutely required.  Both alias analysis and the
1152
dominator optimizations currently do this.
1153
 
1154
When lazy updating is being used, the immediate use information is out of date
1155
and cannot be used reliably.  Lazy updating is achieved by simply marking
1156
statements modified via calls to @code{mark_stmt_modified} instead of
1157
@code{update_stmt}.  When lazy updating is no longer required, all the
1158
modified statements must have @code{update_stmt} called in order to bring them
1159
up to date.  This must be done before the optimization is finished, or
1160
@code{verify_ssa} will trigger an abort.
1161
 
1162
This is done with a simple loop over the instruction stream:
1163
@smallexample
1164
  block_stmt_iterator bsi;
1165
  basic_block bb;
1166
  FOR_EACH_BB (bb)
1167
    @{
1168
      for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
1169
        update_stmt_if_modified (bsi_stmt (bsi));
1170
    @}
1171
@end smallexample
1172
 
1173
@node SSA
1174
@section Static Single Assignment
1175
@cindex SSA
1176
@cindex static single assignment
1177
 
1178
Most of the tree optimizers rely on the data flow information provided
1179
by the Static Single Assignment (SSA) form.  We implement the SSA form
1180
as described in @cite{R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and
1181
K. Zadeck.  Efficiently Computing Static Single Assignment Form and the
1182
Control Dependence Graph.  ACM Transactions on Programming Languages
1183
and Systems, 13(4):451-490, October 1991}.
1184
 
1185
The SSA form is based on the premise that program variables are
1186
assigned in exactly one location in the program.  Multiple assignments
1187
to the same variable create new versions of that variable.  Naturally,
1188
actual programs are seldom in SSA form initially because variables
1189
tend to be assigned multiple times.  The compiler modifies the program
1190
representation so that every time a variable is assigned in the code,
1191
a new version of the variable is created.  Different versions of the
1192
same variable are distinguished by subscripting the variable name with
1193
its version number.  Variables used in the right-hand side of
1194
expressions are renamed so that their version number matches that of
1195
the most recent assignment.
1196
 
1197
We represent variable versions using @code{SSA_NAME} nodes.  The
1198
renaming process in @file{tree-ssa.c} wraps every real and
1199
virtual operand with an @code{SSA_NAME} node which contains
1200
the version number and the statement that created the
1201
@code{SSA_NAME}.  Only definitions and virtual definitions may
1202
create new @code{SSA_NAME} nodes.
1203
 
1204
Sometimes, flow of control makes it impossible to determine what is the
1205
most recent version of a variable.  In these cases, the compiler
1206
inserts an artificial definition for that variable called
1207
@dfn{PHI function} or @dfn{PHI node}.  This new definition merges
1208
all the incoming versions of the variable to create a new name
1209
for it.  For instance,
1210
 
1211
@smallexample
1212
if (...)
1213
  a_1 = 5;
1214
else if (...)
1215
  a_2 = 2;
1216
else
1217
  a_3 = 13;
1218
 
1219
# a_4 = PHI <a_1, a_2, a_3>
1220
return a_4;
1221
@end smallexample
1222
 
1223
Since it is not possible to determine which of the three branches
1224
will be taken at runtime, we don't know which of @code{a_1},
1225
@code{a_2} or @code{a_3} to use at the return statement.  So, the
1226
SSA renamer creates a new version @code{a_4} which is assigned
1227
the result of ``merging'' @code{a_1}, @code{a_2} and @code{a_3}.
1228
Hence, PHI nodes mean ``one of these operands.  I don't know
1229
which''.
1230
 
1231
The following macros can be used to examine PHI nodes
1232
 
1233
@defmac PHI_RESULT (@var{phi})
1234
Returns the @code{SSA_NAME} created by PHI node @var{phi} (i.e.,
1235
@var{phi}'s LHS)@.
1236
@end defmac
1237
 
1238
@defmac PHI_NUM_ARGS (@var{phi})
1239
Returns the number of arguments in @var{phi}.  This number is exactly
1240
the number of incoming edges to the basic block holding @var{phi}@.
1241
@end defmac
1242
 
1243
@defmac PHI_ARG_ELT (@var{phi}, @var{i})
1244
Returns a tuple representing the @var{i}th argument of @var{phi}@.
1245
Each element of this tuple contains an @code{SSA_NAME} @var{var} and
1246
the incoming edge through which @var{var} flows.
1247
@end defmac
1248
 
1249
@defmac PHI_ARG_EDGE (@var{phi}, @var{i})
1250
Returns the incoming edge for the @var{i}th argument of @var{phi}.
1251
@end defmac
1252
 
1253
@defmac PHI_ARG_DEF (@var{phi}, @var{i})
1254
Returns the @code{SSA_NAME} for the @var{i}th argument of @var{phi}.
1255
@end defmac
1256
 
1257
 
1258
@subsection Preserving the SSA form
1259
@findex update_ssa
1260
@cindex preserving SSA form
1261
Some optimization passes make changes to the function that
1262
invalidate the SSA property.  This can happen when a pass has
1263
added new symbols or changed the program so that variables that
1264
were previously aliased aren't anymore.  Whenever something like this
1265
happens, the affected symbols must be renamed into SSA form again.
1266
Transformations that emit new code or replicate existing statements
1267
will also need to update the SSA form@.
1268
 
1269
Since GCC implements two different SSA forms for register and virtual
1270
variables, keeping the SSA form up to date depends on whether you are
1271
updating register or virtual names.  In both cases, the general idea
1272
behind incremental SSA updates is similar: when new SSA names are
1273
created, they typically are meant to replace other existing names in
1274
the program@.
1275
 
1276
For instance, given the following code:
1277
 
1278
@smallexample
1279
     1  L0:
1280
     2  x_1 = PHI (0, x_5)
1281
     3  if (x_1 < 10)
1282
     4    if (x_1 > 7)
1283
     5      y_2 = 0
1284
     6    else
1285
     7      y_3 = x_1 + x_7
1286
     8    endif
1287
     9    x_5 = x_1 + 1
1288
     10   goto L0;
1289
     11 endif
1290
@end smallexample
1291
 
1292
Suppose that we insert new names @code{x_10} and @code{x_11} (lines
1293
@code{4} and @code{8})@.
1294
 
1295
@smallexample
1296
     1  L0:
1297
     2  x_1 = PHI (0, x_5)
1298
     3  if (x_1 < 10)
1299
     4    x_10 = ...
1300
     5    if (x_1 > 7)
1301
     6      y_2 = 0
1302
     7    else
1303
     8      x_11 = ...
1304
     9      y_3 = x_1 + x_7
1305
     10   endif
1306
     11   x_5 = x_1 + 1
1307
     12   goto L0;
1308
     13 endif
1309
@end smallexample
1310
 
1311
We want to replace all the uses of @code{x_1} with the new definitions
1312
of @code{x_10} and @code{x_11}.  Note that the only uses that should
1313
be replaced are those at lines @code{5}, @code{9} and @code{11}.
1314
Also, the use of @code{x_7} at line @code{9} should @emph{not} be
1315
replaced (this is why we cannot just mark symbol @code{x} for
1316
renaming)@.
1317
 
1318
Additionally, we may need to insert a PHI node at line @code{11}
1319
because that is a merge point for @code{x_10} and @code{x_11}.  So the
1320
use of @code{x_1} at line @code{11} will be replaced with the new PHI
1321
node.  The insertion of PHI nodes is optional.  They are not strictly
1322
necessary to preserve the SSA form, and depending on what the caller
1323
inserted, they may not even be useful for the optimizers@.
1324
 
1325
Updating the SSA form is a two step process.  First, the pass has to
1326
identify which names need to be updated and/or which symbols need to
1327
be renamed into SSA form for the first time.  When new names are
1328
introduced to replace existing names in the program, the mapping
1329
between the old and the new names are registered by calling
1330
@code{register_new_name_mapping} (note that if your pass creates new
1331
code by duplicating basic blocks, the call to @code{tree_duplicate_bb}
1332
will set up the necessary mappings automatically).  On the other hand,
1333
if your pass exposes a new symbol that should be put in SSA form for
1334
the first time, the new symbol should be registered with
1335
@code{mark_sym_for_renaming}.
1336
 
1337
After the replacement mappings have been registered and new symbols
1338
marked for renaming, a call to @code{update_ssa} makes the registered
1339
changes.  This can be done with an explicit call or by creating
1340
@code{TODO} flags in the @code{tree_opt_pass} structure for your pass.
1341
There are several @code{TODO} flags that control the behavior of
1342
@code{update_ssa}:
1343
 
1344
@itemize @bullet
1345
@item @code{TODO_update_ssa}.  Update the SSA form inserting PHI nodes
1346
      for newly exposed symbols and virtual names marked for updating.
1347
      When updating real names, only insert PHI nodes for a real name
1348
      @code{O_j} in blocks reached by all the new and old definitions for
1349
      @code{O_j}.  If the iterated dominance frontier for @code{O_j}
1350
      is not pruned, we may end up inserting PHI nodes in blocks that
1351
      have one or more edges with no incoming definition for
1352
      @code{O_j}.  This would lead to uninitialized warnings for
1353
      @code{O_j}'s symbol@.
1354
 
1355
@item @code{TODO_update_ssa_no_phi}.  Update the SSA form without
1356
      inserting any new PHI nodes at all.  This is used by passes that
1357
      have either inserted all the PHI nodes themselves or passes that
1358
      need only to patch use-def and def-def chains for virtuals
1359
      (e.g., DCE)@.
1360
 
1361
 
1362
@item @code{TODO_update_ssa_full_phi}.  Insert PHI nodes everywhere
1363
      they are needed.  No pruning of the IDF is done.  This is used
1364
      by passes that need the PHI nodes for @code{O_j} even if it
1365
      means that some arguments will come from the default definition
1366
      of @code{O_j}'s symbol (e.g., @code{pass_linear_transform})@.
1367
 
1368
      WARNING: If you need to use this flag, chances are that your
1369
      pass may be doing something wrong.  Inserting PHI nodes for an
1370
      old name where not all edges carry a new replacement may lead to
1371
      silent codegen errors or spurious uninitialized warnings@.
1372
 
1373
@item @code{TODO_update_ssa_only_virtuals}.  Passes that update the
1374
      SSA form on their own may want to delegate the updating of
1375
      virtual names to the generic updater.  Since FUD chains are
1376
      easier to maintain, this simplifies the work they need to do.
1377
      NOTE: If this flag is used, any OLD->NEW mappings for real names
1378
      are explicitly destroyed and only the symbols marked for
1379
      renaming are processed@.
1380
@end itemize
1381
 
1382
 
1383
@subsection Examining @code{SSA_NAME} nodes
1384
@cindex examining SSA_NAMEs
1385
 
1386
The following macros can be used to examine @code{SSA_NAME} nodes
1387
 
1388
@defmac SSA_NAME_DEF_STMT (@var{var})
1389
Returns the statement @var{s} that creates the @code{SSA_NAME}
1390
@var{var}.  If @var{s} is an empty statement (i.e., @code{IS_EMPTY_STMT
1391
(@var{s})} returns @code{true}), it means that the first reference to
1392
this variable is a USE or a VUSE@.
1393
@end defmac
1394
 
1395
@defmac SSA_NAME_VERSION (@var{var})
1396
Returns the version number of the @code{SSA_NAME} object @var{var}.
1397
@end defmac
1398
 
1399
 
1400
@subsection Walking use-def chains
1401
 
1402
@deftypefn {Tree SSA function} void walk_use_def_chains (@var{var}, @var{fn}, @var{data})
1403
 
1404
Walks use-def chains starting at the @code{SSA_NAME} node @var{var}.
1405
Calls function @var{fn} at each reaching definition found.  Function
1406
@var{FN} takes three arguments: @var{var}, its defining statement
1407
(@var{def_stmt}) and a generic pointer to whatever state information
1408
that @var{fn} may want to maintain (@var{data}).  Function @var{fn} is
1409
able to stop the walk by returning @code{true}, otherwise in order to
1410
continue the walk, @var{fn} should return @code{false}.
1411
 
1412
Note, that if @var{def_stmt} is a @code{PHI} node, the semantics are
1413
slightly different.  For each argument @var{arg} of the PHI node, this
1414
function will:
1415
 
1416
@enumerate
1417
@item   Walk the use-def chains for @var{arg}.
1418
@item   Call @code{FN (@var{arg}, @var{phi}, @var{data})}.
1419
@end enumerate
1420
 
1421
Note how the first argument to @var{fn} is no longer the original
1422
variable @var{var}, but the PHI argument currently being examined.
1423
If @var{fn} wants to get at @var{var}, it should call
1424
@code{PHI_RESULT} (@var{phi}).
1425
@end deftypefn
1426
 
1427
@subsection Walking the dominator tree
1428
 
1429
@deftypefn {Tree SSA function} void walk_dominator_tree (@var{walk_data}, @var{bb})
1430
 
1431
This function walks the dominator tree for the current CFG calling a
1432
set of callback functions defined in @var{struct dom_walk_data} in
1433
@file{domwalk.h}.  The call back functions you need to define give you
1434
hooks to execute custom code at various points during traversal:
1435
 
1436
@enumerate
1437
@item Once to initialize any local data needed while processing
1438
      @var{bb} and its children.  This local data is pushed into an
1439
      internal stack which is automatically pushed and popped as the
1440
      walker traverses the dominator tree.
1441
 
1442
@item Once before traversing all the statements in the @var{bb}.
1443
 
1444
@item Once for every statement inside @var{bb}.
1445
 
1446
@item Once after traversing all the statements and before recursing
1447
      into @var{bb}'s dominator children.
1448
 
1449
@item It then recurses into all the dominator children of @var{bb}.
1450
 
1451
@item After recursing into all the dominator children of @var{bb} it
1452
      can, optionally, traverse every statement in @var{bb} again
1453
      (i.e., repeating steps 2 and 3).
1454
 
1455
@item Once after walking the statements in @var{bb} and @var{bb}'s
1456
      dominator children.  At this stage, the block local data stack
1457
      is popped.
1458
@end enumerate
1459
@end deftypefn
1460
 
1461
@node Alias analysis
1462
@section Alias analysis
1463
@cindex alias
1464
@cindex flow-sensitive alias analysis
1465
@cindex flow-insensitive alias analysis
1466
 
1467
Alias analysis proceeds in 4 main phases:
1468
 
1469
@enumerate
1470
@item   Structural alias analysis.
1471
 
1472
This phase walks the types for structure variables, and determines which
1473
of the fields can overlap using offset and size of each field.  For each
1474
field, a ``subvariable'' called a ``Structure field tag'' (SFT)@ is
1475
created, which represents that field as a separate variable.  All
1476
accesses that could possibly overlap with a given field will have
1477
virtual operands for the SFT of that field.
1478
 
1479
@smallexample
1480
struct foo
1481
@{
1482
  int a;
1483
  int b;
1484
@}
1485
struct foo temp;
1486
int bar (void)
1487
@{
1488
  int tmp1, tmp2, tmp3;
1489
  SFT.0_2 = V_MUST_DEF <SFT.0_1>
1490
  temp.a = 5;
1491
  SFT.1_4 = V_MUST_DEF <SFT.1_3>
1492
  temp.b = 6;
1493
 
1494
  VUSE <SFT.1_4>
1495
  tmp1_5 = temp.b;
1496
  VUSE <SFT.0_2>
1497
  tmp2_6 = temp.a;
1498
 
1499
  tmp3_7 = tmp1_5 + tmp2_6;
1500
  return tmp3_7;
1501
@}
1502
@end smallexample
1503
 
1504
If you copy the type tag for a variable for some reason, you probably
1505
also want to copy the subvariables for that variable.
1506
 
1507
@item   Points-to and escape analysis.
1508
 
1509
This phase walks the use-def chains in the SSA web looking for
1510
three things:
1511
 
1512
        @itemize @bullet
1513
        @item   Assignments of the form @code{P_i = &VAR}
1514
        @item   Assignments of the form P_i = malloc()
1515
        @item   Pointers and ADDR_EXPR that escape the current function.
1516
        @end itemize
1517
 
1518
The concept of `escaping' is the same one used in the Java world.
1519
When a pointer or an ADDR_EXPR escapes, it means that it has been
1520
exposed outside of the current function.  So, assignment to
1521
global variables, function arguments and returning a pointer are
1522
all escape sites.
1523
 
1524
This is where we are currently limited.  Since not everything is
1525
renamed into SSA, we lose track of escape properties when a
1526
pointer is stashed inside a field in a structure, for instance.
1527
In those cases, we are assuming that the pointer does escape.
1528
 
1529
We use escape analysis to determine whether a variable is
1530
call-clobbered.  Simply put, if an ADDR_EXPR escapes, then the
1531
variable is call-clobbered.  If a pointer P_i escapes, then all
1532
the variables pointed-to by P_i (and its memory tag) also escape.
1533
 
1534
@item   Compute flow-sensitive aliases
1535
 
1536
We have two classes of memory tags.  Memory tags associated with
1537
the pointed-to data type of the pointers in the program.  These
1538
tags are called ``type memory tag'' (TMT)@.  The other class are
1539
those associated with SSA_NAMEs, called ``name memory tag'' (NMT)@.
1540
The basic idea is that when adding operands for an INDIRECT_REF
1541
*P_i, we will first check whether P_i has a name tag, if it does
1542
we use it, because that will have more precise aliasing
1543
information.  Otherwise, we use the standard type tag.
1544
 
1545
In this phase, we go through all the pointers we found in
1546
points-to analysis and create alias sets for the name memory tags
1547
associated with each pointer P_i.  If P_i escapes, we mark
1548
call-clobbered the variables it points to and its tag.
1549
 
1550
 
1551
@item   Compute flow-insensitive aliases
1552
 
1553
This pass will compare the alias set of every type memory tag and
1554
every addressable variable found in the program.  Given a type
1555
memory tag TMT and an addressable variable V@.  If the alias sets
1556
of TMT and V conflict (as computed by may_alias_p), then V is
1557
marked as an alias tag and added to the alias set of TMT@.
1558
@end enumerate
1559
 
1560
For instance, consider the following function:
1561
 
1562
@smallexample
1563
foo (int i)
1564
@{
1565
  int *p, *q, a, b;
1566
 
1567
  if (i > 10)
1568
    p = &a;
1569
  else
1570
    q = &b;
1571
 
1572
  *p = 3;
1573
  *q = 5;
1574
  a = b + 2;
1575
  return *p;
1576
@}
1577
@end smallexample
1578
 
1579
After aliasing analysis has finished, the type memory tag for
1580
pointer @code{p} will have two aliases, namely variables @code{a} and
1581
@code{b}.
1582
Every time pointer @code{p} is dereferenced, we want to mark the
1583
operation as a potential reference to @code{a} and @code{b}.
1584
 
1585
@smallexample
1586
foo (int i)
1587
@{
1588
  int *p, a, b;
1589
 
1590
  if (i_2 > 10)
1591
    p_4 = &a;
1592
  else
1593
    p_6 = &b;
1594
  # p_1 = PHI <p_4(1), p_6(2)>;
1595
 
1596
  # a_7 = V_MAY_DEF <a_3>;
1597
  # b_8 = V_MAY_DEF <b_5>;
1598
  *p_1 = 3;
1599
 
1600
  # a_9 = V_MAY_DEF <a_7>
1601
  # VUSE <b_8>
1602
  a_9 = b_8 + 2;
1603
 
1604
  # VUSE <a_9>;
1605
  # VUSE <b_8>;
1606
  return *p_1;
1607
@}
1608
@end smallexample
1609
 
1610
In certain cases, the list of may aliases for a pointer may grow
1611
too large.  This may cause an explosion in the number of virtual
1612
operands inserted in the code.  Resulting in increased memory
1613
consumption and compilation time.
1614
 
1615
When the number of virtual operands needed to represent aliased
1616
loads and stores grows too large (configurable with @option{--param
1617
max-aliased-vops}), alias sets are grouped to avoid severe
1618
compile-time slow downs and memory consumption.  The alias
1619
grouping heuristic proceeds as follows:
1620
 
1621
@enumerate
1622
@item Sort the list of pointers in decreasing number of contributed
1623
virtual operands.
1624
 
1625
@item Take the first pointer from the list and reverse the role
1626
of the memory tag and its aliases.  Usually, whenever an
1627
aliased variable Vi is found to alias with a memory tag
1628
T, we add Vi to the may-aliases set for T@.  Meaning that
1629
after alias analysis, we will have:
1630
 
1631
@smallexample
1632
may-aliases(T) = @{ V1, V2, V3, ..., Vn @}
1633
@end smallexample
1634
 
1635
This means that every statement that references T, will get
1636
@code{n} virtual operands for each of the Vi tags.  But, when
1637
alias grouping is enabled, we make T an alias tag and add it
1638
to the alias set of all the Vi variables:
1639
 
1640
@smallexample
1641
may-aliases(V1) = @{ T @}
1642
may-aliases(V2) = @{ T @}
1643
...
1644
may-aliases(Vn) = @{ T @}
1645
@end smallexample
1646
 
1647
This has two effects: (a) statements referencing T will only get
1648
a single virtual operand, and, (b) all the variables Vi will now
1649
appear to alias each other.  So, we lose alias precision to
1650
improve compile time.  But, in theory, a program with such a high
1651
level of aliasing should not be very optimizable in the first
1652
place.
1653
 
1654
@item Since variables may be in the alias set of more than one
1655
memory tag, the grouping done in step (2) needs to be extended
1656
to all the memory tags that have a non-empty intersection with
1657
the may-aliases set of tag T@.  For instance, if we originally
1658
had these may-aliases sets:
1659
 
1660
@smallexample
1661
may-aliases(T) = @{ V1, V2, V3 @}
1662
may-aliases(R) = @{ V2, V4 @}
1663
@end smallexample
1664
 
1665
In step (2) we would have reverted the aliases for T as:
1666
 
1667
@smallexample
1668
may-aliases(V1) = @{ T @}
1669
may-aliases(V2) = @{ T @}
1670
may-aliases(V3) = @{ T @}
1671
@end smallexample
1672
 
1673
But note that now V2 is no longer aliased with R@.  We could
1674
add R to may-aliases(V2), but we are in the process of
1675
grouping aliases to reduce virtual operands so what we do is
1676
add V4 to the grouping to obtain:
1677
 
1678
@smallexample
1679
may-aliases(V1) = @{ T @}
1680
may-aliases(V2) = @{ T @}
1681
may-aliases(V3) = @{ T @}
1682
may-aliases(V4) = @{ T @}
1683
@end smallexample
1684
 
1685
@item If the total number of virtual operands due to aliasing is
1686
still above the threshold set by max-alias-vops, go back to (2).
1687
@end enumerate

powered by: WebSVN 2.1.0

© copyright 1999-2024 OpenCores.org, equivalent to Oliscience, all rights reserved. OpenCores®, registered trademark.