OpenCores
URL https://opencores.org/ocsvn/openrisc/openrisc/trunk

Subversion Repositories openrisc

[/] [openrisc/] [trunk/] [gnu-old/] [gcc-4.2.2/] [gcc/] [doc/] [tree-ssa.texi] - Blame information for rev 847

Go to most recent revision | Details | Compare with Previous | View Log

Line No. Rev Author Line
1 38 julius
@c Copyright (c) 2004, 2005 Free Software Foundation, Inc.
2
@c Free Software Foundation, Inc.
3
@c This is part of the GCC manual.
4
@c For copying conditions, see the file gcc.texi.
5
 
6
@c ---------------------------------------------------------------------
7
@c Tree SSA
8
@c ---------------------------------------------------------------------
9
 
10
@node Tree SSA
11
@chapter Analysis and Optimization of GIMPLE Trees
12
@cindex Tree SSA
13
@cindex Optimization infrastructure for GIMPLE
14
 
15
GCC uses three main intermediate languages to represent the program
16
during compilation: GENERIC, GIMPLE and RTL@.  GENERIC is a
17
language-independent representation generated by each front end.  It
18
is used to serve as an interface between the parser and optimizer.
19
GENERIC is a common representation that is able to represent programs
20
written in all the languages supported by GCC@.
21
 
22
GIMPLE and RTL are used to optimize the program.  GIMPLE is used for
23
target and language independent optimizations (e.g., inlining,
24
constant propagation, tail call elimination, redundancy elimination,
25
etc).  Much like GENERIC, GIMPLE is a language independent, tree based
26
representation.  However, it differs from GENERIC in that the GIMPLE
27
grammar is more restrictive: expressions contain no more than 3
28
operands (except function calls), it has no control flow structures
29
and expressions with side-effects are only allowed on the right hand
30
side of assignments.  See the chapter describing GENERIC and GIMPLE
31
for more details.
32
 
33
This chapter describes the data structures and functions used in the
34
GIMPLE optimizers (also known as ``tree optimizers'' or ``middle
35
end'').  In particular, it focuses on all the macros, data structures,
36
functions and programming constructs needed to implement optimization
37
passes for GIMPLE@.
38
 
39
@menu
40
* GENERIC::             A high-level language-independent representation.
41
* GIMPLE::              A lower-level factored tree representation.
42
* Annotations::         Attributes for statements and variables.
43
* Statement Operands::  Variables referenced by GIMPLE statements.
44
* SSA::                 Static Single Assignment representation.
45
* Alias analysis::      Representing aliased loads and stores.
46
@end menu
47
 
48
@node GENERIC
49
@section GENERIC
50
@cindex GENERIC
51
 
52
The purpose of GENERIC is simply to provide a language-independent way of
53
representing an entire function in trees.  To this end, it was necessary to
54
add a few new tree codes to the back end, but most everything was already
55
there.  If you can express it with the codes in @code{gcc/tree.def}, it's
56
GENERIC@.
57
 
58
Early on, there was a great deal of debate about how to think about
59
statements in a tree IL@.  In GENERIC, a statement is defined as any
60
expression whose value, if any, is ignored.  A statement will always
61
have @code{TREE_SIDE_EFFECTS} set (or it will be discarded), but a
62
non-statement expression may also have side effects.  A
63
@code{CALL_EXPR}, for instance.
64
 
65
It would be possible for some local optimizations to work on the
66
GENERIC form of a function; indeed, the adapted tree inliner works
67
fine on GENERIC, but the current compiler performs inlining after
68
lowering to GIMPLE (a restricted form described in the next section).
69
Indeed, currently the frontends perform this lowering before handing
70
off to @code{tree_rest_of_compilation}, but this seems inelegant.
71
 
72
If necessary, a front end can use some language-dependent tree codes
73
in its GENERIC representation, so long as it provides a hook for
74
converting them to GIMPLE and doesn't expect them to work with any
75
(hypothetical) optimizers that run before the conversion to GIMPLE@.
76
The intermediate representation used while parsing C and C++ looks
77
very little like GENERIC, but the C and C++ gimplifier hooks are
78
perfectly happy to take it as input and spit out GIMPLE@.
79
 
80
@node GIMPLE
81
@section GIMPLE
82
@cindex GIMPLE
83
 
84
GIMPLE is a simplified subset of GENERIC for use in optimization.  The
85
particular subset chosen (and the name) was heavily influenced by the
86
SIMPLE IL used by the McCAT compiler project at McGill University,
87
though we have made some different choices.  For one thing, SIMPLE
88
doesn't support @code{goto}; a production compiler can't afford that
89
kind of restriction.
90
 
91
GIMPLE retains much of the structure of the parse trees: lexical
92
scopes are represented as containers, rather than markers.  However,
93
expressions are broken down into a 3-address form, using temporary
94
variables to hold intermediate values.  Also, control structures are
95
lowered to gotos.
96
 
97
In GIMPLE no container node is ever used for its value; if a
98
@code{COND_EXPR} or @code{BIND_EXPR} has a value, it is stored into a
99
temporary within the controlled blocks, and that temporary is used in
100
place of the container.
101
 
102
The compiler pass which lowers GENERIC to GIMPLE is referred to as the
103
@samp{gimplifier}.  The gimplifier works recursively, replacing complex
104
statements with sequences of simple statements.
105
 
106
@c Currently, the only way to
107
@c tell whether or not an expression is in GIMPLE form is by recursively
108
@c examining it; in the future there will probably be a flag to help avoid
109
@c redundant work.  FIXME FIXME
110
 
111
@menu
112
* Interfaces::
113
* Temporaries::
114
* GIMPLE Expressions::
115
* Statements::
116
* GIMPLE Example::
117
* Rough GIMPLE Grammar::
118
@end menu
119
 
120
@node Interfaces
121
@subsection Interfaces
122
@cindex gimplification
123
 
124
The tree representation of a function is stored in
125
@code{DECL_SAVED_TREE}.  It is lowered to GIMPLE by a call to
126
@code{gimplify_function_tree}.
127
 
128
If a front end wants to include language-specific tree codes in the tree
129
representation which it provides to the back end, it must provide a
130
definition of @code{LANG_HOOKS_GIMPLIFY_EXPR} which knows how to
131
convert the front end trees to GIMPLE@.  Usually such a hook will involve
132
much of the same code for expanding front end trees to RTL@.  This function
133
can return fully lowered GIMPLE, or it can return GENERIC trees and let the
134
main gimplifier lower them the rest of the way; this is often simpler.
135
GIMPLE that is not fully lowered is known as ``high GIMPLE'' and
136
consists of the IL before the pass @code{pass_lower_cf}.  High GIMPLE
137
still contains lexical scopes and nested expressions, while low GIMPLE
138
exposes all of the implicit jumps for control expressions like
139
@code{COND_EXPR}.
140
 
141
The C and C++ front ends currently convert directly from front end
142
trees to GIMPLE, and hand that off to the back end rather than first
143
converting to GENERIC@.  Their gimplifier hooks know about all the
144
@code{_STMT} nodes and how to convert them to GENERIC forms.  There
145
was some work done on a genericization pass which would run first, but
146
the existence of @code{STMT_EXPR} meant that in order to convert all
147
of the C statements into GENERIC equivalents would involve walking the
148
entire tree anyway, so it was simpler to lower all the way.  This
149
might change in the future if someone writes an optimization pass
150
which would work better with higher-level trees, but currently the
151
optimizers all expect GIMPLE@.
152
 
153
A front end which wants to use the tree optimizers (and already has
154
some sort of whole-function tree representation) only needs to provide
155
a definition of @code{LANG_HOOKS_GIMPLIFY_EXPR}, call
156
@code{gimplify_function_tree} to lower to GIMPLE, and then hand off to
157
@code{tree_rest_of_compilation} to compile and output the function.
158
 
159
You can tell the compiler to dump a C-like representation of the GIMPLE
160
form with the flag @option{-fdump-tree-gimple}.
161
 
162
@node Temporaries
163
@subsection Temporaries
164
@cindex Temporaries
165
 
166
When gimplification encounters a subexpression which is too complex, it
167
creates a new temporary variable to hold the value of the subexpression,
168
and adds a new statement to initialize it before the current statement.
169
These special temporaries are known as @samp{expression temporaries}, and are
170
allocated using @code{get_formal_tmp_var}.  The compiler tries to
171
always evaluate identical expressions into the same temporary, to simplify
172
elimination of redundant calculations.
173
 
174
We can only use expression temporaries when we know that it will not be
175
reevaluated before its value is used, and that it will not be otherwise
176
modified@footnote{These restrictions are derived from those in Morgan 4.8.}.
177
Other temporaries can be allocated using
178
@code{get_initialized_tmp_var} or @code{create_tmp_var}.
179
 
180
Currently, an expression like @code{a = b + 5} is not reduced any
181
further.  We tried converting it to something like
182
@smallexample
183
  T1 = b + 5;
184
  a = T1;
185
@end smallexample
186
but this bloated the representation for minimal benefit.  However, a
187
variable which must live in memory cannot appear in an expression; its
188
value is explicitly loaded into a temporary first.  Similarly, storing
189
the value of an expression to a memory variable goes through a
190
temporary.
191
 
192
@node GIMPLE Expressions
193
@subsection Expressions
194
@cindex GIMPLE Expressions
195
 
196
In general, expressions in GIMPLE consist of an operation and the
197
appropriate number of simple operands; these operands must either be a
198
GIMPLE rvalue (@code{is_gimple_val}), i.e.@: a constant or a register
199
variable.  More complex operands are factored out into temporaries, so
200
that
201
@smallexample
202
  a = b + c + d
203
@end smallexample
204
becomes
205
@smallexample
206
  T1 = b + c;
207
  a = T1 + d;
208
@end smallexample
209
 
210
The same rule holds for arguments to a @code{CALL_EXPR}.
211
 
212
The target of an assignment is usually a variable, but can also be an
213
@code{INDIRECT_REF} or a compound lvalue as described below.
214
 
215
@menu
216
* Compound Expressions::
217
* Compound Lvalues::
218
* Conditional Expressions::
219
* Logical Operators::
220
@end menu
221
 
222
@node Compound Expressions
223
@subsubsection Compound Expressions
224
@cindex Compound Expressions
225
 
226
The left-hand side of a C comma expression is simply moved into a separate
227
statement.
228
 
229
@node Compound Lvalues
230
@subsubsection Compound Lvalues
231
@cindex Compound Lvalues
232
 
233
Currently compound lvalues involving array and structure field references
234
are not broken down; an expression like @code{a.b[2] = 42} is not reduced
235
any further (though complex array subscripts are).  This restriction is a
236
workaround for limitations in later optimizers; if we were to convert this
237
to
238
 
239
@smallexample
240
  T1 = &a.b;
241
  T1[2] = 42;
242
@end smallexample
243
 
244
alias analysis would not remember that the reference to @code{T1[2]} came
245
by way of @code{a.b}, so it would think that the assignment could alias
246
another member of @code{a}; this broke @code{struct-alias-1.c}.  Future
247
optimizer improvements may make this limitation unnecessary.
248
 
249
@node Conditional Expressions
250
@subsubsection Conditional Expressions
251
@cindex Conditional Expressions
252
 
253
A C @code{?:} expression is converted into an @code{if} statement with
254
each branch assigning to the same temporary.  So,
255
 
256
@smallexample
257
  a = b ? c : d;
258
@end smallexample
259
becomes
260
@smallexample
261
  if (b)
262
    T1 = c;
263
  else
264
    T1 = d;
265
  a = T1;
266
@end smallexample
267
 
268
Tree level if-conversion pass re-introduces @code{?:} expression, if appropriate.
269
It is used to vectorize loops with conditions using vector conditional operations.
270
 
271
Note that in GIMPLE, @code{if} statements are also represented using
272
@code{COND_EXPR}, as described below.
273
 
274
@node Logical Operators
275
@subsubsection Logical Operators
276
@cindex Logical Operators
277
 
278
Except when they appear in the condition operand of a @code{COND_EXPR},
279
logical `and' and `or' operators are simplified as follows:
280
@code{a = b && c} becomes
281
 
282
@smallexample
283
  T1 = (bool)b;
284
  if (T1)
285
    T1 = (bool)c;
286
  a = T1;
287
@end smallexample
288
 
289
Note that @code{T1} in this example cannot be an expression temporary,
290
because it has two different assignments.
291
 
292
@node Statements
293
@subsection Statements
294
@cindex Statements
295
 
296
Most statements will be assignment statements, represented by
297
@code{MODIFY_EXPR}.  A @code{CALL_EXPR} whose value is ignored can
298
also be a statement.  No other C expressions can appear at statement level;
299
a reference to a volatile object is converted into a @code{MODIFY_EXPR}.
300
In GIMPLE form, type of @code{MODIFY_EXPR} is not meaningful.  Instead, use type
301
of LHS or RHS@.
302
 
303
There are also several varieties of complex statements.
304
 
305
@menu
306
* Blocks::
307
* Statement Sequences::
308
* Empty Statements::
309
* Loops::
310
* Selection Statements::
311
* Jumps::
312
* Cleanups::
313
* GIMPLE Exception Handling::
314
@end menu
315
 
316
@node Blocks
317
@subsubsection Blocks
318
@cindex Blocks
319
 
320
Block scopes and the variables they declare in GENERIC and GIMPLE are
321
expressed using the @code{BIND_EXPR} code, which in previous versions of
322
GCC was primarily used for the C statement-expression extension.
323
 
324
Variables in a block are collected into @code{BIND_EXPR_VARS} in
325
declaration order.  Any runtime initialization is moved out of
326
@code{DECL_INITIAL} and into a statement in the controlled block.  When
327
gimplifying from C or C++, this initialization replaces the
328
@code{DECL_STMT}.
329
 
330
Variable-length arrays (VLAs) complicate this process, as their size often
331
refers to variables initialized earlier in the block.  To handle this, we
332
currently split the block at that point, and move the VLA into a new, inner
333
@code{BIND_EXPR}.  This strategy may change in the future.
334
 
335
@code{DECL_SAVED_TREE} for a GIMPLE function will always be a
336
@code{BIND_EXPR} which contains declarations for the temporary variables
337
used in the function.
338
 
339
A C++ program will usually contain more @code{BIND_EXPR}s than there are
340
syntactic blocks in the source code, since several C++ constructs have
341
implicit scopes associated with them.  On the other hand, although the C++
342
front end uses pseudo-scopes to handle cleanups for objects with
343
destructors, these don't translate into the GIMPLE form; multiple
344
declarations at the same level use the same @code{BIND_EXPR}.
345
 
346
@node Statement Sequences
347
@subsubsection Statement Sequences
348
@cindex Statement Sequences
349
 
350
Multiple statements at the same nesting level are collected into a
351
@code{STATEMENT_LIST}.  Statement lists are modified and traversed
352
using the interface in @samp{tree-iterator.h}.
353
 
354
@node Empty Statements
355
@subsubsection Empty Statements
356
@cindex Empty Statements
357
 
358
Whenever possible, statements with no effect are discarded.  But if they
359
are nested within another construct which cannot be discarded for some
360
reason, they are instead replaced with an empty statement, generated by
361
@code{build_empty_stmt}.  Initially, all empty statements were shared,
362
after the pattern of the Java front end, but this caused a lot of trouble in
363
practice.
364
 
365
An empty statement is represented as @code{(void)0}.
366
 
367
@node Loops
368
@subsubsection Loops
369
@cindex Loops
370
 
371
At one time loops were expressed in GIMPLE using @code{LOOP_EXPR}, but
372
now they are lowered to explicit gotos.
373
 
374
@node Selection Statements
375
@subsubsection Selection Statements
376
@cindex Selection Statements
377
 
378
A simple selection statement, such as the C @code{if} statement, is
379
expressed in GIMPLE using a void @code{COND_EXPR}.  If only one branch is
380
used, the other is filled with an empty statement.
381
 
382
Normally, the condition expression is reduced to a simple comparison.  If
383
it is a shortcut (@code{&&} or @code{||}) expression, however, we try to
384
break up the @code{if} into multiple @code{if}s so that the implied shortcut
385
is taken directly, much like the transformation done by @code{do_jump} in
386
the RTL expander.
387
 
388
A @code{SWITCH_EXPR} in GIMPLE contains the condition and a
389
@code{TREE_VEC} of @code{CASE_LABEL_EXPR}s describing the case values
390
and corresponding @code{LABEL_DECL}s to jump to.  The body of the
391
@code{switch} is moved after the @code{SWITCH_EXPR}.
392
 
393
@node Jumps
394
@subsubsection Jumps
395
@cindex Jumps
396
 
397
Other jumps are expressed by either @code{GOTO_EXPR} or @code{RETURN_EXPR}.
398
 
399
The operand of a @code{GOTO_EXPR} must be either a label or a variable
400
containing the address to jump to.
401
 
402
The operand of a @code{RETURN_EXPR} is either @code{NULL_TREE},
403
@code{RESULT_DECL}, or a @code{MODIFY_EXPR} which sets the return value.  It
404
would be nice to move the @code{MODIFY_EXPR} into a separate statement, but the
405
special return semantics in @code{expand_return} make that difficult.  It may
406
still happen in the future, perhaps by moving most of that logic into
407
@code{expand_assignment}.
408
 
409
@node Cleanups
410
@subsubsection Cleanups
411
@cindex Cleanups
412
 
413
Destructors for local C++ objects and similar dynamic cleanups are
414
represented in GIMPLE by a @code{TRY_FINALLY_EXPR}.
415
@code{TRY_FINALLY_EXPR} has two operands, both of which are a sequence
416
of statements to execute.  The first sequence is executed.  When it
417
completes the second sequence is executed.
418
 
419
The first sequence may complete in the following ways:
420
 
421
@enumerate
422
 
423
@item Execute the last statement in the sequence and fall off the
424
end.
425
 
426
@item Execute a goto statement (@code{GOTO_EXPR}) to an ordinary
427
label outside the sequence.
428
 
429
@item Execute a return statement (@code{RETURN_EXPR}).
430
 
431
@item Throw an exception.  This is currently not explicitly represented in
432
GIMPLE.
433
 
434
@end enumerate
435
 
436
The second sequence is not executed if the first sequence completes by
437
calling @code{setjmp} or @code{exit} or any other function that does
438
not return.  The second sequence is also not executed if the first
439
sequence completes via a non-local goto or a computed goto (in general
440
the compiler does not know whether such a goto statement exits the
441
first sequence or not, so we assume that it doesn't).
442
 
443
After the second sequence is executed, if it completes normally by
444
falling off the end, execution continues wherever the first sequence
445
would have continued, by falling off the end, or doing a goto, etc.
446
 
447
@code{TRY_FINALLY_EXPR} complicates the flow graph, since the cleanup
448
needs to appear on every edge out of the controlled block; this
449
reduces the freedom to move code across these edges.  Therefore, the
450
EH lowering pass which runs before most of the optimization passes
451
eliminates these expressions by explicitly adding the cleanup to each
452
edge.  Rethrowing the exception is represented using @code{RESX_EXPR}.
453
 
454
 
455
@node GIMPLE Exception Handling
456
@subsubsection Exception Handling
457
@cindex GIMPLE Exception Handling
458
 
459
Other exception handling constructs are represented using
460
@code{TRY_CATCH_EXPR}.  @code{TRY_CATCH_EXPR} has two operands.  The
461
first operand is a sequence of statements to execute.  If executing
462
these statements does not throw an exception, then the second operand
463
is ignored.  Otherwise, if an exception is thrown, then the second
464
operand of the @code{TRY_CATCH_EXPR} is checked.  The second operand
465
may have the following forms:
466
 
467
@enumerate
468
 
469
@item A sequence of statements to execute.  When an exception occurs,
470
these statements are executed, and then the exception is rethrown.
471
 
472
@item A sequence of @code{CATCH_EXPR} expressions.  Each @code{CATCH_EXPR}
473
has a list of applicable exception types and handler code.  If the
474
thrown exception matches one of the caught types, the associated
475
handler code is executed.  If the handler code falls off the bottom,
476
execution continues after the original @code{TRY_CATCH_EXPR}.
477
 
478
@item An @code{EH_FILTER_EXPR} expression.  This has a list of
479
permitted exception types, and code to handle a match failure.  If the
480
thrown exception does not match one of the allowed types, the
481
associated match failure code is executed.  If the thrown exception
482
does match, it continues unwinding the stack looking for the next
483
handler.
484
 
485
@end enumerate
486
 
487
Currently throwing an exception is not directly represented in GIMPLE,
488
since it is implemented by calling a function.  At some point in the future
489
we will want to add some way to express that the call will throw an
490
exception of a known type.
491
 
492
Just before running the optimizers, the compiler lowers the high-level
493
EH constructs above into a set of @samp{goto}s, magic labels, and EH
494
regions.  Continuing to unwind at the end of a cleanup is represented
495
with a @code{RESX_EXPR}.
496
 
497
@node GIMPLE Example
498
@subsection GIMPLE Example
499
@cindex GIMPLE Example
500
 
501
@smallexample
502
struct A @{ A(); ~A(); @};
503
 
504
int i;
505
int g();
506
void f()
507
@{
508
  A a;
509
  int j = (--i, i ? 0 : 1);
510
 
511
  for (int x = 42; x > 0; --x)
512
    @{
513
      i += g()*4 + 32;
514
    @}
515
@}
516
@end smallexample
517
 
518
becomes
519
 
520
@smallexample
521
void f()
522
@{
523
  int i.0;
524
  int T.1;
525
  int iftmp.2;
526
  int T.3;
527
  int T.4;
528
  int T.5;
529
  int T.6;
530
 
531
  @{
532
    struct A a;
533
    int j;
534
 
535
    __comp_ctor (&a);
536
    try
537
      @{
538
        i.0 = i;
539
        T.1 = i.0 - 1;
540
        i = T.1;
541
        i.0 = i;
542
        if (i.0 == 0)
543
          iftmp.2 = 1;
544
        else
545
          iftmp.2 = 0;
546
        j = iftmp.2;
547
        @{
548
          int x;
549
 
550
          x = 42;
551
          goto test;
552
          loop:;
553
 
554
          T.3 = g ();
555
          T.4 = T.3 * 4;
556
          i.0 = i;
557
          T.5 = T.4 + i.0;
558
          T.6 = T.5 + 32;
559
          i = T.6;
560
          x = x - 1;
561
 
562
          test:;
563
          if (x > 0)
564
            goto loop;
565
          else
566
            goto break_;
567
          break_:;
568
        @}
569
      @}
570
    finally
571
      @{
572
        __comp_dtor (&a);
573
      @}
574
  @}
575
@}
576
@end smallexample
577
 
578
@node Rough GIMPLE Grammar
579
@subsection Rough GIMPLE Grammar
580
@cindex Rough GIMPLE Grammar
581
 
582
@smallexample
583
   function     : FUNCTION_DECL
584
                        DECL_SAVED_TREE -> compound-stmt
585
 
586
   compound-stmt: STATEMENT_LIST
587
                        members -> stmt
588
 
589
   stmt         : block
590
                | if-stmt
591
                | switch-stmt
592
                | goto-stmt
593
                | return-stmt
594
                | resx-stmt
595
                | label-stmt
596
                | try-stmt
597
                | modify-stmt
598
                | call-stmt
599
 
600
   block        : BIND_EXPR
601
                        BIND_EXPR_VARS -> chain of DECLs
602
                        BIND_EXPR_BLOCK -> BLOCK
603
                        BIND_EXPR_BODY -> compound-stmt
604
 
605
   if-stmt      : COND_EXPR
606
                        op0 -> condition
607
                        op1 -> compound-stmt
608
                        op2 -> compound-stmt
609
 
610
   switch-stmt  : SWITCH_EXPR
611
                        op0 -> val
612
                        op1 -> NULL
613
                        op2 -> TREE_VEC of CASE_LABEL_EXPRs
614
                            The CASE_LABEL_EXPRs are sorted by CASE_LOW,
615
                            and default is last.
616
 
617
   goto-stmt    : GOTO_EXPR
618
                        op0 -> LABEL_DECL | val
619
 
620
   return-stmt  : RETURN_EXPR
621
                        op0 -> return-value
622
 
623
   return-value : NULL
624
                | RESULT_DECL
625
                | MODIFY_EXPR
626
                        op0 -> RESULT_DECL
627
                        op1 -> lhs
628
 
629
   resx-stmt    : RESX_EXPR
630
 
631
   label-stmt   : LABEL_EXPR
632
                        op0 -> LABEL_DECL
633
 
634
   try-stmt     : TRY_CATCH_EXPR
635
                        op0 -> compound-stmt
636
                        op1 -> handler
637
                | TRY_FINALLY_EXPR
638
                        op0 -> compound-stmt
639
                        op1 -> compound-stmt
640
 
641
   handler      : catch-seq
642
                | EH_FILTER_EXPR
643
                | compound-stmt
644
 
645
   catch-seq    : STATEMENT_LIST
646
                        members -> CATCH_EXPR
647
 
648
   modify-stmt  : MODIFY_EXPR
649
                        op0 -> lhs
650
                        op1 -> rhs
651
 
652
   call-stmt    : CALL_EXPR
653
                        op0 -> val | OBJ_TYPE_REF
654
                        op1 -> call-arg-list
655
 
656
   call-arg-list: TREE_LIST
657
                        members -> lhs | CONST
658
 
659
   addr-expr-arg: ID
660
                | compref
661
 
662
   addressable  : addr-expr-arg
663
                | indirectref
664
 
665
   with-size-arg: addressable
666
                | call-stmt
667
 
668
   indirectref  : INDIRECT_REF
669
                        op0 -> val
670
 
671
   lhs          : addressable
672
                | bitfieldref
673
                | WITH_SIZE_EXPR
674
                        op0 -> with-size-arg
675
                        op1 -> val
676
 
677
   min-lval     : ID
678
                | indirectref
679
 
680
   bitfieldref  : BIT_FIELD_REF
681
                        op0 -> inner-compref
682
                        op1 -> CONST
683
                        op2 -> var
684
 
685
   compref      : inner-compref
686
                | TARGET_MEM_REF
687
                        op0 -> ID
688
                        op1 -> val
689
                        op2 -> val
690
                        op3 -> CONST
691
                        op4 -> CONST
692
                | REALPART_EXPR
693
                        op0 -> inner-compref
694
                | IMAGPART_EXPR
695
                        op0 -> inner-compref
696
 
697
   inner-compref: min-lval
698
                | COMPONENT_REF
699
                        op0 -> inner-compref
700
                        op1 -> FIELD_DECL
701
                        op2 -> val
702
                | ARRAY_REF
703
                        op0 -> inner-compref
704
                        op1 -> val
705
                        op2 -> val
706
                        op3 -> val
707
                | ARRAY_RANGE_REF
708
                        op0 -> inner-compref
709
                        op1 -> val
710
                        op2 -> val
711
                        op3 -> val
712
                | VIEW_CONVERT_EXPR
713
                        op0 -> inner-compref
714
 
715
   condition    : val
716
                | RELOP
717
                        op0 -> val
718
                        op1 -> val
719
 
720
   val          : ID
721
                | CONST
722
 
723
   rhs          : lhs
724
                | CONST
725
                | call-stmt
726
                | ADDR_EXPR
727
                        op0 -> addr-expr-arg
728
                | UNOP
729
                        op0 -> val
730
                | BINOP
731
                        op0 -> val
732
                        op1 -> val
733
                | RELOP
734
                        op0 -> val
735
                        op1 -> val
736
                | COND_EXPR
737
                        op0 -> condition
738
                        op1 -> val
739
                        op2 -> val
740
@end smallexample
741
 
742
@node Annotations
743
@section Annotations
744
@cindex annotations
745
 
746
The optimizers need to associate attributes with statements and
747
variables during the optimization process.  For instance, we need to
748
know what basic block a statement belongs to or whether a variable
749
has aliases.  All these attributes are stored in data structures
750
called annotations which are then linked to the field @code{ann} in
751
@code{struct tree_common}.
752
 
753
Presently, we define annotations for statements (@code{stmt_ann_t}),
754
variables (@code{var_ann_t}) and SSA names (@code{ssa_name_ann_t}).
755
Annotations are defined and documented in @file{tree-flow.h}.
756
 
757
 
758
@node Statement Operands
759
@section Statement Operands
760
@cindex operands
761
@cindex virtual operands
762
@cindex real operands
763
@findex update_stmt
764
 
765
Almost every GIMPLE statement will contain a reference to a variable
766
or memory location.  Since statements come in different shapes and
767
sizes, their operands are going to be located at various spots inside
768
the statement's tree.  To facilitate access to the statement's
769
operands, they are organized into lists associated inside each
770
statement's annotation.  Each element in an operand list is a pointer
771
to a @code{VAR_DECL}, @code{PARM_DECL} or @code{SSA_NAME} tree node.
772
This provides a very convenient way of examining and replacing
773
operands.
774
 
775
Data flow analysis and optimization is done on all tree nodes
776
representing variables.  Any node for which @code{SSA_VAR_P} returns
777
nonzero is considered when scanning statement operands.  However, not
778
all @code{SSA_VAR_P} variables are processed in the same way.  For the
779
purposes of optimization, we need to distinguish between references to
780
local scalar variables and references to globals, statics, structures,
781
arrays, aliased variables, etc.  The reason is simple, the compiler
782
can gather complete data flow information for a local scalar.  On the
783
other hand, a global variable may be modified by a function call, it
784
may not be possible to keep track of all the elements of an array or
785
the fields of a structure, etc.
786
 
787
The operand scanner gathers two kinds of operands: @dfn{real} and
788
@dfn{virtual}.  An operand for which @code{is_gimple_reg} returns true
789
is considered real, otherwise it is a virtual operand.  We also
790
distinguish between uses and definitions.  An operand is used if its
791
value is loaded by the statement (e.g., the operand at the RHS of an
792
assignment).  If the statement assigns a new value to the operand, the
793
operand is considered a definition (e.g., the operand at the LHS of
794
an assignment).
795
 
796
Virtual and real operands also have very different data flow
797
properties.  Real operands are unambiguous references to the
798
full object that they represent.  For instance, given
799
 
800
@smallexample
801
@{
802
  int a, b;
803
  a = b
804
@}
805
@end smallexample
806
 
807
Since @code{a} and @code{b} are non-aliased locals, the statement
808
@code{a = b} will have one real definition and one real use because
809
variable @code{b} is completely modified with the contents of
810
variable @code{a}.  Real definition are also known as @dfn{killing
811
definitions}.  Similarly, the use of @code{a} reads all its bits.
812
 
813
In contrast, virtual operands are used with variables that can have
814
a partial or ambiguous reference.  This includes structures, arrays,
815
globals, and aliased variables.  In these cases, we have two types of
816
definitions.  For globals, structures, and arrays, we can determine from
817
a statement whether a variable of these types has a killing definition.
818
If the variable does, then the statement is marked as having a
819
@dfn{must definition} of that variable.  However, if a statement is only
820
defining a part of the variable (i.e.@: a field in a structure), or if we
821
know that a statement might define the variable but we cannot say for sure,
822
then we mark that statement as having a @dfn{may definition}.  For
823
instance, given
824
 
825
@smallexample
826
@{
827
  int a, b, *p;
828
 
829
  if (...)
830
    p = &a;
831
  else
832
    p = &b;
833
  *p = 5;
834
  return *p;
835
@}
836
@end smallexample
837
 
838
The assignment @code{*p = 5} may be a definition of @code{a} or
839
@code{b}.  If we cannot determine statically where @code{p} is
840
pointing to at the time of the store operation, we create virtual
841
definitions to mark that statement as a potential definition site for
842
@code{a} and @code{b}.  Memory loads are similarly marked with virtual
843
use operands.  Virtual operands are shown in tree dumps right before
844
the statement that contains them.  To request a tree dump with virtual
845
operands, use the @option{-vops} option to @option{-fdump-tree}:
846
 
847
@smallexample
848
@{
849
  int a, b, *p;
850
 
851
  if (...)
852
    p = &a;
853
  else
854
    p = &b;
855
  # a = V_MAY_DEF <a>
856
  # b = V_MAY_DEF <b>
857
  *p = 5;
858
 
859
  # VUSE <a>
860
  # VUSE <b>
861
  return *p;
862
@}
863
@end smallexample
864
 
865
Notice that @code{V_MAY_DEF} operands have two copies of the referenced
866
variable.  This indicates that this is not a killing definition of
867
that variable.  In this case we refer to it as a @dfn{may definition}
868
or @dfn{aliased store}.  The presence of the second copy of the
869
variable in the @code{V_MAY_DEF} operand will become important when the
870
function is converted into SSA form.  This will be used to link all
871
the non-killing definitions to prevent optimizations from making
872
incorrect assumptions about them.
873
 
874
Operands are updated as soon as the statement is finished via a call
875
to @code{update_stmt}.  If statement elements are changed via
876
@code{SET_USE} or @code{SET_DEF}, then no further action is required
877
(i.e., those macros take care of updating the statement).  If changes
878
are made by manipulating the statement's tree directly, then a call
879
must be made to @code{update_stmt} when complete.  Calling one of the
880
@code{bsi_insert} routines or @code{bsi_replace} performs an implicit
881
call to @code{update_stmt}.
882
 
883
@subsection Operand Iterators And Access Routines
884
@cindex Operand Iterators
885
@cindex Operand Access Routines
886
 
887
Operands are collected by @file{tree-ssa-operands.c}.  They are stored
888
inside each statement's annotation and can be accessed through either the
889
operand iterators or an access routine.
890
 
891
The following access routines are available for examining operands:
892
 
893
@enumerate
894
@item @code{SINGLE_SSA_@{USE,DEF,TREE@}_OPERAND}: These accessors will return
895
NULL unless there is exactly one operand matching the specified flags.  If
896
there is exactly one operand, the operand is returned as either a @code{tree},
897
@code{def_operand_p}, or @code{use_operand_p}.
898
 
899
@smallexample
900
tree t = SINGLE_SSA_TREE_OPERAND (stmt, flags);
901
use_operand_p u = SINGLE_SSA_USE_OPERAND (stmt, SSA_ALL_VIRTUAL_USES);
902
def_operand_p d = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_ALL_DEFS);
903
@end smallexample
904
 
905
@item @code{ZERO_SSA_OPERANDS}: This macro returns true if there are no
906
operands matching the specified flags.
907
 
908
@smallexample
909
if (ZERO_SSA_OPERANDS (stmt, SSA_OP_ALL_VIRTUALS))
910
  return;
911
@end smallexample
912
 
913
@item @code{NUM_SSA_OPERANDS}: This macro Returns the number of operands
914
matching 'flags'.  This actually executes a loop to perform the count, so
915
only use this if it is really needed.
916
 
917
@smallexample
918
int count = NUM_SSA_OPERANDS (stmt, flags)
919
@end smallexample
920
@end enumerate
921
 
922
 
923
If you wish to iterate over some or all operands, use the
924
@code{FOR_EACH_SSA_@{USE,DEF,TREE@}_OPERAND} iterator.  For example, to print
925
all the operands for a statement:
926
 
927
@smallexample
928
void
929
print_ops (tree stmt)
930
@{
931
  ssa_op_iter;
932
  tree var;
933
 
934
  FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_ALL_OPERANDS)
935
    print_generic_expr (stderr, var, TDF_SLIM);
936
@}
937
@end smallexample
938
 
939
 
940
How to choose the appropriate iterator:
941
 
942
@enumerate
943
@item Determine whether you are need to see the operand pointers, or just the
944
    trees, and choose the appropriate macro:
945
 
946
@smallexample
947
Need            Macro:
948
----            -------
949
use_operand_p   FOR_EACH_SSA_USE_OPERAND
950
def_operand_p   FOR_EACH_SSA_DEF_OPERAND
951
tree            FOR_EACH_SSA_TREE_OPERAND
952
@end smallexample
953
 
954
@item You need to declare a variable of the type you are interested
955
    in, and an ssa_op_iter structure which serves as the loop
956
    controlling variable.
957
 
958
@item Determine which operands you wish to use, and specify the flags of
959
    those you are interested in.  They are documented in
960
    @file{tree-ssa-operands.h}:
961
 
962
@smallexample
963
#define SSA_OP_USE              0x01    /* @r{Real USE operands.}  */
964
#define SSA_OP_DEF              0x02    /* @r{Real DEF operands.}  */
965
#define SSA_OP_VUSE             0x04    /* @r{VUSE operands.}  */
966
#define SSA_OP_VMAYUSE          0x08    /* @r{USE portion of V_MAY_DEFS.}  */
967
#define SSA_OP_VMAYDEF          0x10    /* @r{DEF portion of V_MAY_DEFS.}  */
968
#define SSA_OP_VMUSTDEF         0x20    /* @r{V_MUST_DEF definitions.}  */
969
 
970
/* @r{These are commonly grouped operand flags.}  */
971
#define SSA_OP_VIRTUAL_USES     (SSA_OP_VUSE | SSA_OP_VMAYUSE)
972
#define SSA_OP_VIRTUAL_DEFS     (SSA_OP_VMAYDEF | SSA_OP_VMUSTDEF)
973
#define SSA_OP_ALL_USES         (SSA_OP_VIRTUAL_USES | SSA_OP_USE)
974
#define SSA_OP_ALL_DEFS         (SSA_OP_VIRTUAL_DEFS | SSA_OP_DEF)
975
#define SSA_OP_ALL_OPERANDS     (SSA_OP_ALL_USES | SSA_OP_ALL_DEFS)
976
@end smallexample
977
@end enumerate
978
 
979
So if you want to look at the use pointers for all the @code{USE} and
980
@code{VUSE} operands, you would do something like:
981
 
982
@smallexample
983
  use_operand_p use_p;
984
  ssa_op_iter iter;
985
 
986
  FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, (SSA_OP_USE | SSA_OP_VUSE))
987
    @{
988
      process_use_ptr (use_p);
989
    @}
990
@end smallexample
991
 
992
The @code{TREE} macro is basically the same as the @code{USE} and
993
@code{DEF} macros, only with the use or def dereferenced via
994
@code{USE_FROM_PTR (use_p)} and @code{DEF_FROM_PTR (def_p)}.  Since we
995
aren't using operand pointers, use and defs flags can be mixed.
996
 
997
@smallexample
998
  tree var;
999
  ssa_op_iter iter;
1000
 
1001
  FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_VUSE | SSA_OP_VMUSTDEF)
1002
    @{
1003
       print_generic_expr (stderr, var, TDF_SLIM);
1004
    @}
1005
@end smallexample
1006
 
1007
@code{V_MAY_DEF}s are broken into two flags, one for the
1008
@code{DEF} portion (@code{SSA_OP_VMAYDEF}) and one for the USE portion
1009
(@code{SSA_OP_VMAYUSE}).  If all you want to look at are the
1010
@code{V_MAY_DEF}s together, there is a fourth iterator macro for this,
1011
which returns both a def_operand_p and a use_operand_p for each
1012
@code{V_MAY_DEF} in the statement.  Note that you don't need any flags for
1013
this one.
1014
 
1015
@smallexample
1016
  use_operand_p use_p;
1017
  def_operand_p def_p;
1018
  ssa_op_iter iter;
1019
 
1020
  FOR_EACH_SSA_MAYDEF_OPERAND (def_p, use_p, stmt, iter)
1021
    @{
1022
      my_code;
1023
    @}
1024
@end smallexample
1025
 
1026
@code{V_MUST_DEF}s are broken into two flags, one for the
1027
@code{DEF} portion (@code{SSA_OP_VMUSTDEF}) and one for the kill portion
1028
(@code{SSA_OP_VMUSTKILL}).  If all you want to look at are the
1029
@code{V_MUST_DEF}s together, there is a fourth iterator macro for this,
1030
which returns both a def_operand_p and a use_operand_p for each
1031
@code{V_MUST_DEF} in the statement.  Note that you don't need any flags for
1032
this one.
1033
 
1034
@smallexample
1035
  use_operand_p kill_p;
1036
  def_operand_p def_p;
1037
  ssa_op_iter iter;
1038
 
1039
  FOR_EACH_SSA_MUSTDEF_OPERAND (def_p, kill_p, stmt, iter)
1040
    @{
1041
      my_code;
1042
    @}
1043
@end smallexample
1044
 
1045
 
1046
There are many examples in the code as well, as well as the
1047
documentation in @file{tree-ssa-operands.h}.
1048
 
1049
There are also a couple of variants on the stmt iterators regarding PHI
1050
nodes.
1051
 
1052
@code{FOR_EACH_PHI_ARG} Works exactly like
1053
@code{FOR_EACH_SSA_USE_OPERAND}, except it works over @code{PHI} arguments
1054
instead of statement operands.
1055
 
1056
@smallexample
1057
/* Look at every virtual PHI use.  */
1058
FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_VIRTUAL_USES)
1059
@{
1060
   my_code;
1061
@}
1062
 
1063
/* Look at every real PHI use.  */
1064
FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_USES)
1065
  my_code;
1066
 
1067
/* Look at every every PHI use.  */
1068
FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_ALL_USES)
1069
  my_code;
1070
@end smallexample
1071
 
1072
@code{FOR_EACH_PHI_OR_STMT_@{USE,DEF@}} works exactly like
1073
@code{FOR_EACH_SSA_@{USE,DEF@}_OPERAND}, except it will function on
1074
either a statement or a @code{PHI} node.  These should be used when it is
1075
appropriate but they are not quite as efficient as the individual
1076
@code{FOR_EACH_PHI} and @code{FOR_EACH_SSA} routines.
1077
 
1078
@smallexample
1079
FOR_EACH_PHI_OR_STMT_USE (use_operand_p, stmt, iter, flags)
1080
  @{
1081
     my_code;
1082
  @}
1083
 
1084
FOR_EACH_PHI_OR_STMT_DEF (def_operand_p, phi, iter, flags)
1085
  @{
1086
     my_code;
1087
  @}
1088
@end smallexample
1089
 
1090
@subsection Immediate Uses
1091
@cindex Immediate Uses
1092
 
1093
Immediate use information is now always available.  Using the immediate use
1094
iterators, you may examine every use of any @code{SSA_NAME}. For instance,
1095
to change each use of @code{ssa_var} to @code{ssa_var2} and call fold_stmt on
1096
each stmt after that is done:
1097
 
1098
@smallexample
1099
  use_operand_p imm_use_p;
1100
  imm_use_iterator iterator;
1101
  tree ssa_var, stmt;
1102
 
1103
 
1104
  FOR_EACH_IMM_USE_STMT (stmt, iterator, ssa_var)
1105
    @{
1106
      FOR_EACH_IMM_USE_ON_STMT (imm_use_p, iterator)
1107
        SET_USE (imm_use_p, ssa_var_2);
1108
      fold_stmt (stmt);
1109
    @}
1110
@end smallexample
1111
 
1112
There are 2 iterators which can be used. @code{FOR_EACH_IMM_USE_FAST} is
1113
used when the immediate uses are not changed, i.e., you are looking at the
1114
uses, but not setting them.
1115
 
1116
If they do get changed, then care must be taken that things are not changed
1117
under the iterators, so use the @code{FOR_EACH_IMM_USE_STMT} and
1118
@code{FOR_EACH_IMM_USE_ON_STMT} iterators.  They attempt to preserve the
1119
sanity of the use list by moving all the uses for a statement into
1120
a controlled position, and then iterating over those uses.  Then the
1121
optimization can manipulate the stmt when all the uses have been
1122
processed.  This is a little slower than the FAST version since it adds a
1123
placeholder element and must sort through the list a bit for each statement.
1124
This placeholder element must be also be removed if the loop is
1125
terminated early.  The macro @code{BREAK_FROM_IMM_USE_SAFE} is provided
1126
to do this :
1127
 
1128
@smallexample
1129
  FOR_EACH_IMM_USE_STMT (stmt, iterator, ssa_var)
1130
    @{
1131
      if (stmt == last_stmt)
1132
        BREAK_FROM_SAFE_IMM_USE (iter);
1133
 
1134
      FOR_EACH_IMM_USE_ON_STMT (imm_use_p, iterator)
1135
        SET_USE (imm_use_p, ssa_var_2);
1136
      fold_stmt (stmt);
1137
    @}
1138
@end smallexample
1139
 
1140
There are checks in @code{verify_ssa} which verify that the immediate use list
1141
is up to date, as well as checking that an optimization didn't break from the
1142
loop without using this macro.  It is safe to simply 'break'; from a
1143
@code{FOR_EACH_IMM_USE_FAST} traverse.
1144
 
1145
Some useful functions and macros:
1146
@enumerate
1147
@item  @code{has_zero_uses (ssa_var)} : Returns true if there are no uses of
1148
@code{ssa_var}.
1149
@item   @code{has_single_use (ssa_var)} : Returns true if there is only a
1150
single use of @code{ssa_var}.
1151
@item   @code{single_imm_use (ssa_var, use_operand_p *ptr, tree *stmt)} :
1152
Returns true if there is only a single use of @code{ssa_var}, and also returns
1153
the use pointer and statement it occurs in in the second and third parameters.
1154
@item   @code{num_imm_uses (ssa_var)} : Returns the number of immediate uses of
1155
@code{ssa_var}. It is better not to use this if possible since it simply
1156
utilizes a loop to count the uses.
1157
@item  @code{PHI_ARG_INDEX_FROM_USE (use_p)} : Given a use within a @code{PHI}
1158
node, return the index number for the use.  An assert is triggered if the use
1159
isn't located in a @code{PHI} node.
1160
@item  @code{USE_STMT (use_p)} : Return the statement a use occurs in.
1161
@end enumerate
1162
 
1163
Note that uses are not put into an immediate use list until their statement is
1164
actually inserted into the instruction stream via a @code{bsi_*} routine.
1165
 
1166
It is also still possible to utilize lazy updating of statements, but this
1167
should be used only when absolutely required.  Both alias analysis and the
1168
dominator optimizations currently do this.
1169
 
1170
When lazy updating is being used, the immediate use information is out of date
1171
and cannot be used reliably.  Lazy updating is achieved by simply marking
1172
statements modified via calls to @code{mark_stmt_modified} instead of
1173
@code{update_stmt}.  When lazy updating is no longer required, all the
1174
modified statements must have @code{update_stmt} called in order to bring them
1175
up to date.  This must be done before the optimization is finished, or
1176
@code{verify_ssa} will trigger an abort.
1177
 
1178
This is done with a simple loop over the instruction stream:
1179
@smallexample
1180
  block_stmt_iterator bsi;
1181
  basic_block bb;
1182
  FOR_EACH_BB (bb)
1183
    @{
1184
      for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
1185
        update_stmt_if_modified (bsi_stmt (bsi));
1186
    @}
1187
@end smallexample
1188
 
1189
@node SSA
1190
@section Static Single Assignment
1191
@cindex SSA
1192
@cindex static single assignment
1193
 
1194
Most of the tree optimizers rely on the data flow information provided
1195
by the Static Single Assignment (SSA) form.  We implement the SSA form
1196
as described in @cite{R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and
1197
K. Zadeck.  Efficiently Computing Static Single Assignment Form and the
1198
Control Dependence Graph.  ACM Transactions on Programming Languages
1199
and Systems, 13(4):451-490, October 1991}.
1200
 
1201
The SSA form is based on the premise that program variables are
1202
assigned in exactly one location in the program.  Multiple assignments
1203
to the same variable create new versions of that variable.  Naturally,
1204
actual programs are seldom in SSA form initially because variables
1205
tend to be assigned multiple times.  The compiler modifies the program
1206
representation so that every time a variable is assigned in the code,
1207
a new version of the variable is created.  Different versions of the
1208
same variable are distinguished by subscripting the variable name with
1209
its version number.  Variables used in the right-hand side of
1210
expressions are renamed so that their version number matches that of
1211
the most recent assignment.
1212
 
1213
We represent variable versions using @code{SSA_NAME} nodes.  The
1214
renaming process in @file{tree-ssa.c} wraps every real and
1215
virtual operand with an @code{SSA_NAME} node which contains
1216
the version number and the statement that created the
1217
@code{SSA_NAME}.  Only definitions and virtual definitions may
1218
create new @code{SSA_NAME} nodes.
1219
 
1220
Sometimes, flow of control makes it impossible to determine what is the
1221
most recent version of a variable.  In these cases, the compiler
1222
inserts an artificial definition for that variable called
1223
@dfn{PHI function} or @dfn{PHI node}.  This new definition merges
1224
all the incoming versions of the variable to create a new name
1225
for it.  For instance,
1226
 
1227
@smallexample
1228
if (...)
1229
  a_1 = 5;
1230
else if (...)
1231
  a_2 = 2;
1232
else
1233
  a_3 = 13;
1234
 
1235
# a_4 = PHI <a_1, a_2, a_3>
1236
return a_4;
1237
@end smallexample
1238
 
1239
Since it is not possible to determine which of the three branches
1240
will be taken at runtime, we don't know which of @code{a_1},
1241
@code{a_2} or @code{a_3} to use at the return statement.  So, the
1242
SSA renamer creates a new version @code{a_4} which is assigned
1243
the result of ``merging'' @code{a_1}, @code{a_2} and @code{a_3}.
1244
Hence, PHI nodes mean ``one of these operands.  I don't know
1245
which''.
1246
 
1247
The following macros can be used to examine PHI nodes
1248
 
1249
@defmac PHI_RESULT (@var{phi})
1250
Returns the @code{SSA_NAME} created by PHI node @var{phi} (i.e.,
1251
@var{phi}'s LHS)@.
1252
@end defmac
1253
 
1254
@defmac PHI_NUM_ARGS (@var{phi})
1255
Returns the number of arguments in @var{phi}.  This number is exactly
1256
the number of incoming edges to the basic block holding @var{phi}@.
1257
@end defmac
1258
 
1259
@defmac PHI_ARG_ELT (@var{phi}, @var{i})
1260
Returns a tuple representing the @var{i}th argument of @var{phi}@.
1261
Each element of this tuple contains an @code{SSA_NAME} @var{var} and
1262
the incoming edge through which @var{var} flows.
1263
@end defmac
1264
 
1265
@defmac PHI_ARG_EDGE (@var{phi}, @var{i})
1266
Returns the incoming edge for the @var{i}th argument of @var{phi}.
1267
@end defmac
1268
 
1269
@defmac PHI_ARG_DEF (@var{phi}, @var{i})
1270
Returns the @code{SSA_NAME} for the @var{i}th argument of @var{phi}.
1271
@end defmac
1272
 
1273
 
1274
@subsection Preserving the SSA form
1275
@findex update_ssa
1276
@cindex preserving SSA form
1277
Some optimization passes make changes to the function that
1278
invalidate the SSA property.  This can happen when a pass has
1279
added new symbols or changed the program so that variables that
1280
were previously aliased aren't anymore.  Whenever something like this
1281
happens, the affected symbols must be renamed into SSA form again.
1282
Transformations that emit new code or replicate existing statements
1283
will also need to update the SSA form@.
1284
 
1285
Since GCC implements two different SSA forms for register and virtual
1286
variables, keeping the SSA form up to date depends on whether you are
1287
updating register or virtual names.  In both cases, the general idea
1288
behind incremental SSA updates is similar: when new SSA names are
1289
created, they typically are meant to replace other existing names in
1290
the program@.
1291
 
1292
For instance, given the following code:
1293
 
1294
@smallexample
1295
     1  L0:
1296
     2  x_1 = PHI (0, x_5)
1297
     3  if (x_1 < 10)
1298
     4    if (x_1 > 7)
1299
     5      y_2 = 0
1300
     6    else
1301
     7      y_3 = x_1 + x_7
1302
     8    endif
1303
     9    x_5 = x_1 + 1
1304
     10   goto L0;
1305
     11 endif
1306
@end smallexample
1307
 
1308
Suppose that we insert new names @code{x_10} and @code{x_11} (lines
1309
@code{4} and @code{8})@.
1310
 
1311
@smallexample
1312
     1  L0:
1313
     2  x_1 = PHI (0, x_5)
1314
     3  if (x_1 < 10)
1315
     4    x_10 = ...
1316
     5    if (x_1 > 7)
1317
     6      y_2 = 0
1318
     7    else
1319
     8      x_11 = ...
1320
     9      y_3 = x_1 + x_7
1321
     10   endif
1322
     11   x_5 = x_1 + 1
1323
     12   goto L0;
1324
     13 endif
1325
@end smallexample
1326
 
1327
We want to replace all the uses of @code{x_1} with the new definitions
1328
of @code{x_10} and @code{x_11}.  Note that the only uses that should
1329
be replaced are those at lines @code{5}, @code{9} and @code{11}.
1330
Also, the use of @code{x_7} at line @code{9} should @emph{not} be
1331
replaced (this is why we cannot just mark symbol @code{x} for
1332
renaming)@.
1333
 
1334
Additionally, we may need to insert a PHI node at line @code{11}
1335
because that is a merge point for @code{x_10} and @code{x_11}.  So the
1336
use of @code{x_1} at line @code{11} will be replaced with the new PHI
1337
node.  The insertion of PHI nodes is optional.  They are not strictly
1338
necessary to preserve the SSA form, and depending on what the caller
1339
inserted, they may not even be useful for the optimizers@.
1340
 
1341
Updating the SSA form is a two step process.  First, the pass has to
1342
identify which names need to be updated and/or which symbols need to
1343
be renamed into SSA form for the first time.  When new names are
1344
introduced to replace existing names in the program, the mapping
1345
between the old and the new names are registered by calling
1346
@code{register_new_name_mapping} (note that if your pass creates new
1347
code by duplicating basic blocks, the call to @code{tree_duplicate_bb}
1348
will set up the necessary mappings automatically).  On the other hand,
1349
if your pass exposes a new symbol that should be put in SSA form for
1350
the first time, the new symbol should be registered with
1351
@code{mark_sym_for_renaming}.
1352
 
1353
After the replacement mappings have been registered and new symbols
1354
marked for renaming, a call to @code{update_ssa} makes the registered
1355
changes.  This can be done with an explicit call or by creating
1356
@code{TODO} flags in the @code{tree_opt_pass} structure for your pass.
1357
There are several @code{TODO} flags that control the behavior of
1358
@code{update_ssa}:
1359
 
1360
@itemize @bullet
1361
@item @code{TODO_update_ssa}.  Update the SSA form inserting PHI nodes
1362
      for newly exposed symbols and virtual names marked for updating.
1363
      When updating real names, only insert PHI nodes for a real name
1364
      @code{O_j} in blocks reached by all the new and old definitions for
1365
      @code{O_j}.  If the iterated dominance frontier for @code{O_j}
1366
      is not pruned, we may end up inserting PHI nodes in blocks that
1367
      have one or more edges with no incoming definition for
1368
      @code{O_j}.  This would lead to uninitialized warnings for
1369
      @code{O_j}'s symbol@.
1370
 
1371
@item @code{TODO_update_ssa_no_phi}.  Update the SSA form without
1372
      inserting any new PHI nodes at all.  This is used by passes that
1373
      have either inserted all the PHI nodes themselves or passes that
1374
      need only to patch use-def and def-def chains for virtuals
1375
      (e.g., DCE)@.
1376
 
1377
 
1378
@item @code{TODO_update_ssa_full_phi}.  Insert PHI nodes everywhere
1379
      they are needed.  No pruning of the IDF is done.  This is used
1380
      by passes that need the PHI nodes for @code{O_j} even if it
1381
      means that some arguments will come from the default definition
1382
      of @code{O_j}'s symbol (e.g., @code{pass_linear_transform})@.
1383
 
1384
      WARNING: If you need to use this flag, chances are that your
1385
      pass may be doing something wrong.  Inserting PHI nodes for an
1386
      old name where not all edges carry a new replacement may lead to
1387
      silent codegen errors or spurious uninitialized warnings@.
1388
 
1389
@item @code{TODO_update_ssa_only_virtuals}.  Passes that update the
1390
      SSA form on their own may want to delegate the updating of
1391
      virtual names to the generic updater.  Since FUD chains are
1392
      easier to maintain, this simplifies the work they need to do.
1393
      NOTE: If this flag is used, any OLD->NEW mappings for real names
1394
      are explicitly destroyed and only the symbols marked for
1395
      renaming are processed@.
1396
@end itemize
1397
 
1398
@subsection Preserving the virtual SSA form
1399
@cindex preserving virtual SSA form
1400
 
1401
The virtual SSA form is harder to preserve than the non-virtual SSA form
1402
mainly because the set of virtual operands for a statement may change at
1403
what some would consider unexpected times.  In general, any time you
1404
have modified a statement that has virtual operands, you should verify
1405
whether the list of virtual operands has changed, and if so, mark the
1406
newly exposed symbols by calling @code{mark_new_vars_to_rename}.
1407
 
1408
There is one additional caveat to preserving virtual SSA form.  When the
1409
entire set of virtual operands may be eliminated due to better
1410
disambiguation, a bare SMT will be added to the list of virtual
1411
operands, to signify the non-visible aliases that the are still being
1412
referenced.  If the set of bare SMT's may change,
1413
@code{TODO_update_smt_usage} should be added to the todo flags.
1414
 
1415
With the current pruning code, this can only occur when constants are
1416
propagated into array references that were previously non-constant, or
1417
address expressions are propagated into their uses.
1418
 
1419
@subsection Examining @code{SSA_NAME} nodes
1420
@cindex examining SSA_NAMEs
1421
 
1422
The following macros can be used to examine @code{SSA_NAME} nodes
1423
 
1424
@defmac SSA_NAME_DEF_STMT (@var{var})
1425
Returns the statement @var{s} that creates the @code{SSA_NAME}
1426
@var{var}.  If @var{s} is an empty statement (i.e., @code{IS_EMPTY_STMT
1427
(@var{s})} returns @code{true}), it means that the first reference to
1428
this variable is a USE or a VUSE@.
1429
@end defmac
1430
 
1431
@defmac SSA_NAME_VERSION (@var{var})
1432
Returns the version number of the @code{SSA_NAME} object @var{var}.
1433
@end defmac
1434
 
1435
 
1436
@subsection Walking use-def chains
1437
 
1438
@deftypefn {Tree SSA function} void walk_use_def_chains (@var{var}, @var{fn}, @var{data})
1439
 
1440
Walks use-def chains starting at the @code{SSA_NAME} node @var{var}.
1441
Calls function @var{fn} at each reaching definition found.  Function
1442
@var{FN} takes three arguments: @var{var}, its defining statement
1443
(@var{def_stmt}) and a generic pointer to whatever state information
1444
that @var{fn} may want to maintain (@var{data}).  Function @var{fn} is
1445
able to stop the walk by returning @code{true}, otherwise in order to
1446
continue the walk, @var{fn} should return @code{false}.
1447
 
1448
Note, that if @var{def_stmt} is a @code{PHI} node, the semantics are
1449
slightly different.  For each argument @var{arg} of the PHI node, this
1450
function will:
1451
 
1452
@enumerate
1453
@item   Walk the use-def chains for @var{arg}.
1454
@item   Call @code{FN (@var{arg}, @var{phi}, @var{data})}.
1455
@end enumerate
1456
 
1457
Note how the first argument to @var{fn} is no longer the original
1458
variable @var{var}, but the PHI argument currently being examined.
1459
If @var{fn} wants to get at @var{var}, it should call
1460
@code{PHI_RESULT} (@var{phi}).
1461
@end deftypefn
1462
 
1463
@subsection Walking the dominator tree
1464
 
1465
@deftypefn {Tree SSA function} void walk_dominator_tree (@var{walk_data}, @var{bb})
1466
 
1467
This function walks the dominator tree for the current CFG calling a
1468
set of callback functions defined in @var{struct dom_walk_data} in
1469
@file{domwalk.h}.  The call back functions you need to define give you
1470
hooks to execute custom code at various points during traversal:
1471
 
1472
@enumerate
1473
@item Once to initialize any local data needed while processing
1474
      @var{bb} and its children.  This local data is pushed into an
1475
      internal stack which is automatically pushed and popped as the
1476
      walker traverses the dominator tree.
1477
 
1478
@item Once before traversing all the statements in the @var{bb}.
1479
 
1480
@item Once for every statement inside @var{bb}.
1481
 
1482
@item Once after traversing all the statements and before recursing
1483
      into @var{bb}'s dominator children.
1484
 
1485
@item It then recurses into all the dominator children of @var{bb}.
1486
 
1487
@item After recursing into all the dominator children of @var{bb} it
1488
      can, optionally, traverse every statement in @var{bb} again
1489
      (i.e., repeating steps 2 and 3).
1490
 
1491
@item Once after walking the statements in @var{bb} and @var{bb}'s
1492
      dominator children.  At this stage, the block local data stack
1493
      is popped.
1494
@end enumerate
1495
@end deftypefn
1496
 
1497
@node Alias analysis
1498
@section Alias analysis
1499
@cindex alias
1500
@cindex flow-sensitive alias analysis
1501
@cindex flow-insensitive alias analysis
1502
 
1503
Alias analysis proceeds in 4 main phases:
1504
 
1505
@enumerate
1506
@item   Structural alias analysis.
1507
 
1508
This phase walks the types for structure variables, and determines which
1509
of the fields can overlap using offset and size of each field.  For each
1510
field, a ``subvariable'' called a ``Structure field tag'' (SFT)@ is
1511
created, which represents that field as a separate variable.  All
1512
accesses that could possibly overlap with a given field will have
1513
virtual operands for the SFT of that field.
1514
 
1515
@smallexample
1516
struct foo
1517
@{
1518
  int a;
1519
  int b;
1520
@}
1521
struct foo temp;
1522
int bar (void)
1523
@{
1524
  int tmp1, tmp2, tmp3;
1525
  SFT.0_2 = V_MUST_DEF <SFT.0_1>
1526
  temp.a = 5;
1527
  SFT.1_4 = V_MUST_DEF <SFT.1_3>
1528
  temp.b = 6;
1529
 
1530
  VUSE <SFT.1_4>
1531
  tmp1_5 = temp.b;
1532
  VUSE <SFT.0_2>
1533
  tmp2_6 = temp.a;
1534
 
1535
  tmp3_7 = tmp1_5 + tmp2_6;
1536
  return tmp3_7;
1537
@}
1538
@end smallexample
1539
 
1540
If you copy the symbol tag for a variable for some reason, you probably
1541
also want to copy the subvariables for that variable.
1542
 
1543
@item   Points-to and escape analysis.
1544
 
1545
This phase walks the use-def chains in the SSA web looking for
1546
three things:
1547
 
1548
        @itemize @bullet
1549
        @item   Assignments of the form @code{P_i = &VAR}
1550
        @item   Assignments of the form P_i = malloc()
1551
        @item   Pointers and ADDR_EXPR that escape the current function.
1552
        @end itemize
1553
 
1554
The concept of `escaping' is the same one used in the Java world.
1555
When a pointer or an ADDR_EXPR escapes, it means that it has been
1556
exposed outside of the current function.  So, assignment to
1557
global variables, function arguments and returning a pointer are
1558
all escape sites.
1559
 
1560
This is where we are currently limited.  Since not everything is
1561
renamed into SSA, we lose track of escape properties when a
1562
pointer is stashed inside a field in a structure, for instance.
1563
In those cases, we are assuming that the pointer does escape.
1564
 
1565
We use escape analysis to determine whether a variable is
1566
call-clobbered.  Simply put, if an ADDR_EXPR escapes, then the
1567
variable is call-clobbered.  If a pointer P_i escapes, then all
1568
the variables pointed-to by P_i (and its memory tag) also escape.
1569
 
1570
@item   Compute flow-sensitive aliases
1571
 
1572
We have two classes of memory tags.  Memory tags associated with
1573
the pointed-to data type of the pointers in the program.  These
1574
tags are called ``symbol memory tag'' (SMT)@.  The other class are
1575
those associated with SSA_NAMEs, called ``name memory tag'' (NMT)@.
1576
The basic idea is that when adding operands for an INDIRECT_REF
1577
*P_i, we will first check whether P_i has a name tag, if it does
1578
we use it, because that will have more precise aliasing
1579
information.  Otherwise, we use the standard symbol tag.
1580
 
1581
In this phase, we go through all the pointers we found in
1582
points-to analysis and create alias sets for the name memory tags
1583
associated with each pointer P_i.  If P_i escapes, we mark
1584
call-clobbered the variables it points to and its tag.
1585
 
1586
 
1587
@item   Compute flow-insensitive aliases
1588
 
1589
This pass will compare the alias set of every symbol memory tag and
1590
every addressable variable found in the program.  Given a symbol
1591
memory tag SMT and an addressable variable V@.  If the alias sets
1592
of SMT and V conflict (as computed by may_alias_p), then V is
1593
marked as an alias tag and added to the alias set of SMT@.
1594
@end enumerate
1595
 
1596
For instance, consider the following function:
1597
 
1598
@smallexample
1599
foo (int i)
1600
@{
1601
  int *p, *q, a, b;
1602
 
1603
  if (i > 10)
1604
    p = &a;
1605
  else
1606
    q = &b;
1607
 
1608
  *p = 3;
1609
  *q = 5;
1610
  a = b + 2;
1611
  return *p;
1612
@}
1613
@end smallexample
1614
 
1615
After aliasing analysis has finished, the symbol memory tag for
1616
pointer @code{p} will have two aliases, namely variables @code{a} and
1617
@code{b}.
1618
Every time pointer @code{p} is dereferenced, we want to mark the
1619
operation as a potential reference to @code{a} and @code{b}.
1620
 
1621
@smallexample
1622
foo (int i)
1623
@{
1624
  int *p, a, b;
1625
 
1626
  if (i_2 > 10)
1627
    p_4 = &a;
1628
  else
1629
    p_6 = &b;
1630
  # p_1 = PHI <p_4(1), p_6(2)>;
1631
 
1632
  # a_7 = V_MAY_DEF <a_3>;
1633
  # b_8 = V_MAY_DEF <b_5>;
1634
  *p_1 = 3;
1635
 
1636
  # a_9 = V_MAY_DEF <a_7>
1637
  # VUSE <b_8>
1638
  a_9 = b_8 + 2;
1639
 
1640
  # VUSE <a_9>;
1641
  # VUSE <b_8>;
1642
  return *p_1;
1643
@}
1644
@end smallexample
1645
 
1646
In certain cases, the list of may aliases for a pointer may grow
1647
too large.  This may cause an explosion in the number of virtual
1648
operands inserted in the code.  Resulting in increased memory
1649
consumption and compilation time.
1650
 
1651
When the number of virtual operands needed to represent aliased
1652
loads and stores grows too large (configurable with @option{--param
1653
max-aliased-vops}), alias sets are grouped to avoid severe
1654
compile-time slow downs and memory consumption.  The alias
1655
grouping heuristic proceeds as follows:
1656
 
1657
@enumerate
1658
@item Sort the list of pointers in decreasing number of contributed
1659
virtual operands.
1660
 
1661
@item Take the first pointer from the list and reverse the role
1662
of the memory tag and its aliases.  Usually, whenever an
1663
aliased variable Vi is found to alias with a memory tag
1664
T, we add Vi to the may-aliases set for T@.  Meaning that
1665
after alias analysis, we will have:
1666
 
1667
@smallexample
1668
may-aliases(T) = @{ V1, V2, V3, ..., Vn @}
1669
@end smallexample
1670
 
1671
This means that every statement that references T, will get
1672
@code{n} virtual operands for each of the Vi tags.  But, when
1673
alias grouping is enabled, we make T an alias tag and add it
1674
to the alias set of all the Vi variables:
1675
 
1676
@smallexample
1677
may-aliases(V1) = @{ T @}
1678
may-aliases(V2) = @{ T @}
1679
...
1680
may-aliases(Vn) = @{ T @}
1681
@end smallexample
1682
 
1683
This has two effects: (a) statements referencing T will only get
1684
a single virtual operand, and, (b) all the variables Vi will now
1685
appear to alias each other.  So, we lose alias precision to
1686
improve compile time.  But, in theory, a program with such a high
1687
level of aliasing should not be very optimizable in the first
1688
place.
1689
 
1690
@item Since variables may be in the alias set of more than one
1691
memory tag, the grouping done in step (2) needs to be extended
1692
to all the memory tags that have a non-empty intersection with
1693
the may-aliases set of tag T@.  For instance, if we originally
1694
had these may-aliases sets:
1695
 
1696
@smallexample
1697
may-aliases(T) = @{ V1, V2, V3 @}
1698
may-aliases(R) = @{ V2, V4 @}
1699
@end smallexample
1700
 
1701
In step (2) we would have reverted the aliases for T as:
1702
 
1703
@smallexample
1704
may-aliases(V1) = @{ T @}
1705
may-aliases(V2) = @{ T @}
1706
may-aliases(V3) = @{ T @}
1707
@end smallexample
1708
 
1709
But note that now V2 is no longer aliased with R@.  We could
1710
add R to may-aliases(V2), but we are in the process of
1711
grouping aliases to reduce virtual operands so what we do is
1712
add V4 to the grouping to obtain:
1713
 
1714
@smallexample
1715
may-aliases(V1) = @{ T @}
1716
may-aliases(V2) = @{ T @}
1717
may-aliases(V3) = @{ T @}
1718
may-aliases(V4) = @{ T @}
1719
@end smallexample
1720
 
1721
@item If the total number of virtual operands due to aliasing is
1722
still above the threshold set by max-alias-vops, go back to (2).
1723
@end enumerate

powered by: WebSVN 2.1.0

© copyright 1999-2024 OpenCores.org, equivalent to Oliscience, all rights reserved. OpenCores®, registered trademark.