@c Copyright (c) 2006, 2007, 2008 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c Loop Representation
@c ---------------------------------------------------------------------
 
@node Loop Analysis and Representation
@chapter Analysis and Representation of Loops

GCC provides extensive infrastructure for working with natural loops, i.e.,
strongly connected components of the CFG with only one entry block.  This
chapter describes the representation of loops in GCC, both on GIMPLE and in
RTL, as well as the interfaces to loop-related analyses (induction
variable analysis and number of iterations analysis).
 
@menu
* Loop representation::         Representation and analysis of loops.
* Loop querying::               Getting information about loops.
* Loop manipulation::           Loop manipulation functions.
* LCSSA::                       Loop-closed SSA form.
* Scalar evolutions::           Induction variables on GIMPLE.
* loop-iv::                     Induction variables on RTL.
* Number of iterations::        Number of iterations analysis.
* Dependency analysis::         Data dependency analysis.
* Lambda::                      Linear loop transformations framework.
* Omega::                       A solver for linear programming problems.
@end menu
 
@node Loop representation
@section Loop representation
@cindex Loop representation
@cindex Loop analysis

This section describes the representation of loops in GCC, and functions
that can be used to build, modify and analyze this representation.  Most
of the interfaces and data structures are declared in @file{cfgloop.h}.
At the moment, loop structures are analyzed and this information is
updated only by the optimization passes that deal with loops, but
efforts are being made to make it available throughout most of the
optimization passes.
 
In general, a natural loop has one entry block (header) and possibly
several back edges (latches) leading to the header from the inside of
the loop.  Loops with several latches may appear if several loops share
a single header, or if there is a branching in the middle of the loop.
The representation of loops in GCC, however, allows only loops with a
single latch.  During loop analysis, headers of loops with several
latches are split and forwarder blocks are created in order to
disambiguate their structures.  A heuristic based on profile information
and the structure of the induction variables in the loops is used to
determine whether the latches correspond to sub-loops or to control flow
in a single loop.  This means that the analysis sometimes changes the
CFG, and if you run it in the middle of an optimization pass, you must
be able to deal with the new blocks.  You may avoid CFG changes by
passing the @code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} flag to the loop
discovery.  Note, however, that most other loop manipulation functions
will not work correctly for loops with multiple latch edges (the
functions that only query membership of blocks to loops and subloop
relationships, or enumerate and test loop exits, can be expected to
work).
 
The body of a loop is the set of blocks that are dominated by its header
and reachable from its latch against the direction of edges in the CFG@.
The loops are organized in a containment hierarchy (tree) such that all the
loops immediately contained inside loop L are the children of L in the
tree.  This tree is represented by the @code{struct loops} structure.
The root of this tree is a fake loop that contains all blocks in the
function.  Each of the loops is represented in a @code{struct loop}
structure.  Each loop is assigned an index (@code{num} field of the
@code{struct loop} structure), and the pointer to the loop is stored in
the corresponding field of the @code{larray} vector in the loops
structure.  The indices do not have to be continuous; there may be
empty (@code{NULL}) entries in the @code{larray} created by deleting
loops.  Also, there is no guarantee on the relative order of a loop
and its subloops in the numbering.  The index of a loop never changes.
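
As an illustration of the definition of the loop body given above, the
following stand-alone C sketch collects the blocks of a natural loop by
walking the CFG backwards from the latch and stopping at the header.  It
is not GCC code: @code{struct toy_cfg}, @code{toy_loop_body} and
@code{MAX_BLOCKS} are hypothetical names used only here, the CFG is
modelled as a small adjacency matrix, and the header is assumed to
dominate the latch.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 16

/* Toy CFG: edge[s][d] is true when there is an edge from block s to d.  */
struct toy_cfg
{
  int n_blocks;
  bool edge[MAX_BLOCKS][MAX_BLOCKS];
};

/* Mark in IN_BODY the blocks of the natural loop with the given HEADER
   and LATCH, and return their count.  Starting from the latch, walk the
   edges backwards and never step past the header.  */
int
toy_loop_body (const struct toy_cfg *cfg, int header, int latch,
               bool in_body[MAX_BLOCKS])
{
  int stack[MAX_BLOCKS];
  int sp = 0, count = 1;

  assert (cfg->n_blocks <= MAX_BLOCKS);
  for (int i = 0; i < MAX_BLOCKS; i++)
    in_body[i] = false;

  in_body[header] = true;
  if (latch != header)
    {
      in_body[latch] = true;
      stack[sp++] = latch;
      count++;
    }

  /* Each block is pushed at most once, so the stack cannot overflow.  */
  while (sp > 0)
    {
      int b = stack[--sp];
      for (int pred = 0; pred < cfg->n_blocks; pred++)
        if (cfg->edge[pred][b] && !in_body[pred])
          {
            in_body[pred] = true;
            stack[sp++] = pred;
            count++;
          }
    }
  return count;
}
```

For a CFG with the edges 0->1, 1->2, 2->3, 2->4, 4->3, 3->1 and 1->5,
calling @code{toy_loop_body} with header 1 and latch 3 marks the blocks
1, 2, 3 and 4.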
 
The entries of the @code{larray} field should not be accessed directly.
The function @code{get_loop} returns the loop description for a loop with
the given index.  The @code{number_of_loops} function returns the number
of loops in the function.  To traverse all loops, use the
@code{FOR_EACH_LOOP} macro.  The @code{flags} argument of the macro is
used to determine the direction of traversal and the set of loops
visited.  Each loop is guaranteed to be visited exactly once, regardless
of the changes to the loop tree, and the loops may be removed during the
traversal.  Newly created loops are never traversed; if they need to be
visited, this must be done separately after their creation.  The
@code{FOR_EACH_LOOP} macro allocates temporary variables.  If the
@code{FOR_EACH_LOOP} loop were ended using @code{break} or @code{goto},
they would not be released; the @code{FOR_EACH_LOOP_BREAK} macro must be
used instead.
 
Each basic block contains a reference to the innermost loop it belongs
to (@code{loop_father}).  For this reason, it is only possible to have
one @code{struct loops} structure initialized at the same time for each
CFG@.  The global variable @code{current_loops} contains the
@code{struct loops} structure.  Many of the loop manipulation functions
assume that dominance information is up-to-date.

The loops are analyzed by the @code{loop_optimizer_init} function.  The
argument of this function is a set of flags represented in an integer
bitmask.  These flags specify what other properties of the loop
structures should be calculated/enforced and preserved later:
 
@itemize
@item @code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES}: If this flag is set, no
changes to the CFG will be performed in the loop analysis; in particular,
loops with multiple latch edges will not be disambiguated.  If a loop
has multiple latches, its latch block is set to @code{NULL}.  Most of
the loop manipulation functions will not work for loops in this shape.
No other flags that require CFG changes can be passed to
@code{loop_optimizer_init} together with this flag.
@item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in such
a way that each loop has only one entry edge, and additionally, the
source block of this entry edge has only one successor.  This creates a
natural place where the code can be moved out of the loop, and ensures
that the entry edge of the loop leads from its immediate super-loop.
@item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to
force the latch block of each loop to have only one successor.  This
ensures that the latch of the loop does not belong to any of its
sub-loops, and makes manipulation of the loops significantly easier.
Most of the loop manipulation functions assume that the loops are in
this shape.  Note that with this flag, a ``normal'' loop without any
control flow inside and with one exit consists of two basic blocks.
@item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
edges in the strongly connected components that are not natural loops
(have more than one entry block) are marked with the
@code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.  The
flag is not set for blocks and edges that belong to natural loops that
are in such an irreducible region (but it is set for the entry and exit
edges of such a loop, if they lead to/from this region).
@item @code{LOOPS_HAVE_RECORDED_EXITS}: The lists of exits are recorded
and updated for each loop.  This makes some functions (e.g.,
@code{get_loop_exit_edges}) more efficient.  Some functions (e.g.,
@code{single_exit}) can be used only if the lists of exits are
recorded.
@end itemize
 
These properties may also be computed/enforced later, using the functions
@code{create_preheaders}, @code{force_single_succ_latches},
@code{mark_irreducible_loops} and @code{record_loop_exits}.

The memory occupied by the loops structures should be freed with the
@code{loop_optimizer_finalize} function.

The CFG manipulation functions in general do not update loop structures.
Specialized versions that additionally do so are provided for the most
common tasks.  On GIMPLE, the @code{cleanup_tree_cfg_loop} function can be
used to clean up the CFG while updating the loops structures if
@code{current_loops} is set.
 
@node Loop querying
@section Loop querying
@cindex Loop querying

The functions to query the information about loops are declared in
@file{cfgloop.h}.  Some of the information can be taken directly from
the structures.  The @code{loop_father} field of each basic block contains
the innermost loop to which the block belongs.  The most useful fields of
the loop structure (that are kept up-to-date at all times) are:
 
@itemize
@item @code{header}, @code{latch}: Header and latch basic blocks of the
loop.
@item @code{num_nodes}: Number of basic blocks in the loop (including
the basic blocks of the sub-loops).
@item @code{depth}: The depth of the loop in the loops tree, i.e., the
number of super-loops of the loop.
@item @code{outer}, @code{inner}, @code{next}: The super-loop, the first
sub-loop, and the sibling of the loop in the loops tree.
@end itemize

There are other fields in the loop structures, many of them used only by
some of the passes, or not updated during CFG changes; in general, they
should not be accessed directly.

The most important functions to query loop structures are:
 
@itemize
@item @code{flow_loops_dump}: Dumps the information about loops to a
file.
@item @code{verify_loop_structure}: Checks consistency of the loop
structures.
@item @code{loop_latch_edge}: Returns the latch edge of a loop.
@item @code{loop_preheader_edge}: If loops have preheaders, returns
the preheader edge of a loop.
@item @code{flow_loop_nested_p}: Tests whether a loop is a sub-loop of
another loop.
@item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs
to a loop (including its sub-loops).
@item @code{find_common_loop}: Finds the common super-loop of two loops.
@item @code{superloop_at_depth}: Returns the super-loop of a loop with
the given depth.
@item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimates the
number of insns in the loop, on GIMPLE and on RTL.
@item @code{loop_exit_edge_p}: Tests whether an edge is an exit from a
loop.
@item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
with the @code{EDGE_LOOP_EXIT} flag.
@item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
@code{get_loop_body_in_bfs_order}: Enumerates the basic blocks in the
loop in depth-first search order in reversed CFG, ordered by dominance
relation, and breadth-first search order, respectively.
@item @code{single_exit}: Returns the single exit edge of the loop, or
@code{NULL} if the loop has more than one exit.  You can only use this
function if the @code{LOOPS_HAVE_RECORDED_EXITS} property is used.
@item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
@item @code{just_once_each_iteration_p}: Returns true if the basic block
is executed exactly once during each iteration of a loop (that is, it
does not belong to a sub-loop, and it dominates the latch of the loop).
@end itemize
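The tree-walking queries listed above can be illustrated on a toy model.
The following stand-alone C sketch is not GCC code: @code{struct
toy_loop} and the @code{toy_} functions are hypothetical, and only the
field names @code{num}, @code{depth}, @code{outer}, @code{inner} and
@code{next} are borrowed from the description above.  It shows one
plausible way to answer nesting and common-super-loop queries using just
the @code{outer} links and the @code{depth} field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy loop-tree node; the field names mirror the ones listed above.  */
struct toy_loop
{
  int num;                  /* Index of the loop.  */
  int depth;                /* Number of super-loops.  */
  struct toy_loop *outer;   /* Immediate super-loop, NULL for the root.  */
  struct toy_loop *inner;   /* First sub-loop.  */
  struct toy_loop *next;    /* Next sibling.  */
};

/* Link CHILD into the tree as the first sub-loop of PARENT.  */
void
toy_loop_add (struct toy_loop *parent, struct toy_loop *child)
{
  assert (parent != NULL && child != NULL);
  child->outer = parent;
  child->depth = parent->depth + 1;
  child->next = parent->inner;
  parent->inner = child;
}

/* Return the super-loop of LOOP at DEPTH (LOOP itself if depths match).  */
struct toy_loop *
toy_superloop_at_depth (struct toy_loop *loop, int depth)
{
  assert (depth >= 0 && depth <= loop->depth);
  while (loop->depth > depth)
    loop = loop->outer;
  return loop;
}

/* True if LOOP is strictly nested inside OUTER.  */
bool
toy_loop_nested_p (struct toy_loop *outer, struct toy_loop *loop)
{
  return (loop->depth > outer->depth
          && toy_superloop_at_depth (loop, outer->depth) == outer);
}

/* Return the innermost loop that contains both A and B.  Both loops
   must belong to the same tree.  */
struct toy_loop *
toy_find_common_loop (struct toy_loop *a, struct toy_loop *b)
{
  /* First bring both loops to the same depth, then climb together.  */
  if (a->depth > b->depth)
    a = toy_superloop_at_depth (a, b->depth);
  else
    b = toy_superloop_at_depth (b, a->depth);
  while (a != b)
    {
      a = a->outer;
      b = b->outer;
    }
  return a;
}
```

With a root loop containing loops 1 and 2, and loop 3 nested in loop 1,
@code{toy_find_common_loop} returns the root for loops 3 and 2, and
loop 1 for loops 3 and 1.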
 
@node Loop manipulation
@section Loop manipulation
@cindex Loop manipulation

The loops tree can be manipulated using the following functions:

@itemize
@item @code{flow_loop_tree_node_add}: Adds a node to the tree.
@item @code{flow_loop_tree_node_remove}: Removes a node from the tree.
@item @code{add_bb_to_loop}: Adds a basic block to a loop.
@item @code{remove_bb_from_loops}: Removes a basic block from loops.
@end itemize

Most low-level CFG functions update loops automatically.  The following
functions handle some more complicated cases of CFG manipulations:

@itemize
@item @code{remove_path}: Removes an edge and all blocks it dominates.
@item @code{split_loop_exit_edge}: Splits exit edge of the loop,
ensuring that PHI node arguments remain in the loop (this ensures that
loop-closed SSA form is preserved).  Only useful on GIMPLE.
@end itemize
 
Finally, there are some higher-level loop transformations implemented.
While some of them are written so that they should work on non-innermost
loops, they are mostly untested in that case, and at the moment, they
are only reliable for the innermost loops:

@itemize
@item @code{create_iv}: Creates a new induction variable.  Only works on
GIMPLE@.  @code{standard_iv_increment_position} can be used to find a
suitable place for the iv increment.
@item @code{duplicate_loop_to_header_edge},
@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL and
on GIMPLE) duplicate the body of the loop a prescribed number of times on
one of the edges entering the loop header, thus performing either loop
unrolling or loop peeling.  @code{can_duplicate_loop_p}
(@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated
loop.
@item @code{loop_version}, @code{tree_ssa_loop_version}: These functions
create a copy of a loop, and a branch before the two copies that selects
one of them depending on the prescribed condition.  This is useful for
optimizations that need to verify some assumptions at run time (one of
the copies of the loop is usually left unchanged, while the other one is
transformed in some way).
@item @code{tree_unroll_loop}: Unrolls the loop, including peeling the
extra iterations to make the number of iterations divisible by the unroll
factor, updating the exit condition, and removing the exits that now
cannot be taken.  Works only on GIMPLE.
@end itemize
 
@node LCSSA
@section Loop-closed SSA form
@cindex LCSSA
@cindex Loop-closed SSA form

Throughout the loop optimizations on the tree level, one extra condition is
enforced on the SSA form:  No SSA name is used outside of the loop in
which it is defined.  The SSA form satisfying this condition is called
``loop-closed SSA form'' -- LCSSA@.  To enforce LCSSA, PHI nodes must be
created at the exits of the loops for the SSA names that are used
outside of them.  Only the real operands (not virtual SSA names) are
held in LCSSA, in order to save memory.
 
There are various benefits of LCSSA:

@itemize
@item Many optimizations (value range analysis, final value
replacement) are interested in the values that are defined in the loop
and used outside of it, i.e., exactly those for which we create new PHI
nodes.
@item In induction variable analysis, it is not necessary to specify the
loop in which the analysis should be performed -- the scalar evolution
analysis always returns the results with respect to the loop in which the
SSA name is defined.
@item It makes updating of SSA form during loop transformations simpler.
Without LCSSA, operations like loop unrolling may force creation of PHI
nodes arbitrarily far from the loop, while in LCSSA, the SSA form can be
updated locally.  However, since we only keep real operands in LCSSA, we
cannot use this advantage (we could have local updating of real
operands, but it is not much more efficient than to use generic SSA form
updating for it as well; the amount of changes to SSA is the same).
@end itemize
 
However, it also means LCSSA must be updated.  This is usually
straightforward, unless you create a new value in a loop and use it
outside, or unless you manipulate loop exit edges (functions are
provided to make these manipulations simple).
@code{rewrite_into_loop_closed_ssa} is used to rewrite SSA form to
LCSSA, and @code{verify_loop_closed_ssa} to check that the invariant of
LCSSA is preserved.
 
@node Scalar evolutions
@section Scalar evolutions
@cindex Scalar evolutions
@cindex IV analysis on GIMPLE

Scalar evolutions (SCEV) are used to represent results of induction
variable analysis on GIMPLE@.  They enable us to represent variables with
complicated behavior in a simple and consistent way (we only use them to
express values of polynomial induction variables, but it is possible to
extend them).  The interfaces to SCEV analysis are declared in
@file{tree-scalar-evolution.h}.  To use scalar evolutions analysis,
@code{scev_initialize} must be called first.  To stop using SCEV,
@code{scev_finalize} should be called.  SCEV analysis caches results in
order to save time and memory.  This cache, however, is invalidated by
most of the loop transformations, including removal of code.  If such a
transformation is performed, @code{scev_reset} must be called to clean
the caches.
 
Given an SSA name, its behavior in loops can be analyzed using the
@code{analyze_scalar_evolution} function.  The returned SCEV, however,
does not have to be fully analyzed and it may contain references to
other SSA names defined in the loop.  To resolve these (potentially
recursive) references, the @code{instantiate_parameters} or
@code{resolve_mixers} functions must be used.
@code{instantiate_parameters} is useful when you use the results of SCEV
only for some analysis, and when you work with a whole nest of loops at
once.  It will try replacing all SSA names by their SCEV in all loops,
including the super-loops of the current loop, thus providing
complete information about the behavior of the variable in the loop nest.
@code{resolve_mixers} is useful if you work with only one loop at a
time, and if you possibly need to create code based on the value of the
induction variable.  It will only resolve the SSA names defined in the
current loop, leaving the SSA names defined outside unchanged, even if
their evolution in the outer loops is known.
 
The SCEV is a normal tree expression, except for the fact that it may
contain several special tree nodes.  One of them is
@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
expressed.  The other one is @code{POLYNOMIAL_CHREC}.  A polynomial chrec
has three arguments -- base, step and loop (both base and step may
contain further polynomial chrecs).  The type of the expression and of
base and step must be the same.  A variable has evolution
@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
loop) equivalent to @code{x_1} in the following example:

@smallexample
while (@dots{})
  @{
    x_1 = phi (base, x_2);
    x_2 = x_1 + step;
  @}
@end smallexample

Note that this includes the language restrictions on the operations.
For example, if we compile C code and @code{x} has signed type, then the
overflow in addition would cause undefined behavior, and we may assume
that this does not happen.  Hence, the value with this SCEV cannot
overflow (which restricts the number of iterations of such a loop).
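
To make the recurrence concrete, the following stand-alone C sketch
evaluates a chain of recurrences at a given iteration by replaying the
additions of the example loop.  It is only an illustration, not part of
GCC: @code{toy_chrec_eval} and @code{TOY_MAX_DEGREE} are hypothetical
names, the coefficients are plain @code{long} constants rather than
trees, only steps (not bases) may be nested, and the values are assumed
to be small enough not to overflow.

```c
#include <assert.h>

#define TOY_MAX_DEGREE 8

/* Evaluate the chain of recurrences
     {coef[0], +, {coef[1], +, ... coef[n_coef - 1]}}
   (all in the same loop) at iteration ITER, counting from zero, by
   replaying the additions performed by the loop in the example above.  */
long
toy_chrec_eval (const long *coef, int n_coef, unsigned iter)
{
  long v[TOY_MAX_DEGREE];

  assert (n_coef >= 1 && n_coef <= TOY_MAX_DEGREE);
  for (int j = 0; j < n_coef; j++)
    v[j] = coef[j];

  for (unsigned i = 0; i < iter; i++)
    /* Ascending order, so each level adds the not yet updated value of
       the next level, just like x_2 = x_1 + step.  */
    for (int j = 0; j + 1 < n_coef; j++)
      v[j] += v[j + 1];

  return v[0];
}
```

For instance, the coefficients 5, 3 describe the affine evolution with
values 5, 8, 11, @dots{}, and the coefficients 0, 1, 2 yield the squares
0, 1, 4, 9, @dots{}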
 
In many cases, one wants to restrict the attention just to affine
induction variables.  Then the extra expressive power of SCEV
is not useful, and may complicate the optimizations.  For such uses, the
@code{simple_iv} function may be used to analyze a value -- the result
is a loop-invariant base and step.
 
@node loop-iv
@section IV analysis on RTL
@cindex IV analysis on RTL

The induction variable analysis on RTL is simple and only allows analysis
of affine induction variables, and only in one loop at once.  The interface
is declared in @file{cfgloop.h}.  Before analyzing induction variables
in a loop L, the @code{iv_analysis_loop_init} function must be called on L.
After the analysis (possibly calling @code{iv_analysis_loop_init} for
several loops) is finished, @code{iv_analysis_done} should be called.
The following functions can be used to access the results of the
analysis:
 
@itemize
@item @code{iv_analyze}: Analyzes a single register used in the given
insn.  If no use of the register in this insn is found, the following
insns are scanned, so that this function can be called on the insn
returned by @code{get_condition}.
@item @code{iv_analyze_result}: Analyzes the result of the assignment in
the given insn.
@item @code{iv_analyze_expr}: Analyzes a more complicated expression.
All its operands are analyzed by @code{iv_analyze}, and hence they must
be used in the specified insn or one of the following insns.
@end itemize
 
The description of the induction variable is provided in @code{struct
rtx_iv}.  In order to handle subregs, the representation is a bit
complicated; if the value of the @code{extend} field is not
@code{UNKNOWN}, the value of the induction variable in the i-th
iteration is

@smallexample
delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
@end smallexample

with the following exception:  if @code{first_special} is true, then the
value in the first iteration (when @code{i} is zero) is @code{delta +
mult * base}.  However, if @code{extend} is equal to @code{UNKNOWN},
then @code{first_special} must be false, @code{delta} 0, @code{mult} 1
and the value in the i-th iteration is

@smallexample
subreg_@{mode@} (base + i * step)
@end smallexample

The function @code{get_iv_value} can be used to perform these
calculations.
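
The formulas above can be checked on small numbers with the following
stand-alone C sketch.  It is a model under stated assumptions, not the
implementation of @code{get_iv_value}: all @code{toy_} names are
hypothetical, machine modes are reduced to a width in bits, and values
are handled as 64-bit patterns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum toy_extend { TOY_UNKNOWN, TOY_ZERO_EXTEND, TOY_SIGN_EXTEND };

/* Toy counterpart of the description above.  Modes are modelled by
   their width in bits (1..63).  */
struct toy_iv
{
  int64_t base, step, delta, mult;
  enum toy_extend extend;
  int mode_bits;         /* Width of MODE.  */
  int extend_mode_bits;  /* Width of EXTEND_MODE.  */
  bool first_special;
};

/* Keep the low BITS bits of X (the subreg, or wrap-around in a mode).  */
static uint64_t
toy_truncate (uint64_t x, int bits)
{
  return x & ((UINT64_C (1) << bits) - 1);
}

/* Extend the BITS-bit value X according to KIND.  */
static uint64_t
toy_extend_value (uint64_t x, int bits, enum toy_extend kind)
{
  uint64_t sign = UINT64_C (1) << (bits - 1);

  x = toy_truncate (x, bits);
  if (kind == TOY_SIGN_EXTEND && (x & sign) != 0)
    x |= ~((UINT64_C (1) << bits) - 1);   /* Replicate the sign bit.  */
  return x;
}

/* Value of IV in iteration I, as an EXTEND_MODE-wide bit pattern
   (MODE-wide if no extension is involved).  */
uint64_t
toy_iv_value (const struct toy_iv *iv, uint64_t i)
{
  uint64_t inner = (uint64_t) iv->base + i * (uint64_t) iv->step;

  assert (iv->mode_bits >= 1 && iv->mode_bits <= 63);
  if (iv->extend == TOY_UNKNOWN)
    return toy_truncate (inner, iv->mode_bits);

  assert (iv->extend_mode_bits >= iv->mode_bits
          && iv->extend_mode_bits <= 63);
  if (iv->first_special && i == 0)
    inner = (uint64_t) iv->base;          /* The special first value.  */
  else
    inner = toy_extend_value (inner, iv->mode_bits, iv->extend);
  return toy_truncate ((uint64_t) iv->delta + (uint64_t) iv->mult * inner,
                       iv->extend_mode_bits);
}
```

With @code{base} 250, @code{step} 3, @code{delta} 10, @code{mult} 2, an
8-bit @code{mode} and a 32-bit @code{extend_mode}, iteration 1 gives 516
under zero extension (253 is kept as is) and 4 under sign extension (253
is read as -3).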
 
@node Number of iterations
@section Number of iterations analysis
@cindex Number of iterations analysis

Both on GIMPLE and on RTL, there are functions available to determine
the number of iterations of a loop, with a similar interface.  The
number of iterations of a loop in GCC is defined as the number of
executions of the loop latch.  In many cases, it is not possible to
determine the number of iterations unconditionally -- the determined
number is correct only if some assumptions are satisfied.  The analysis
tries to verify these conditions using the information contained in the
program; if it fails, the conditions are returned together with the
result.  The following information and conditions are provided by the
analysis:

@itemize
@item @code{assumptions}: If this condition is false, the rest of
the information is invalid.
@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: If
this condition is true, the loop exits in the first iteration.
@item @code{infinite}: If this condition is true, the loop is infinite.
This condition is only available on RTL@.  On GIMPLE, conditions for
finiteness of the loop are included in @code{assumptions}.
@item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression
that gives the number of iterations.  The number of iterations is defined
as the number of executions of the loop latch.
@end itemize
 
Both on GIMPLE and on RTL, it is necessary for the induction variable
analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL).
On GIMPLE, the results are stored in the @code{struct tree_niter_desc}
structure.  The number of iterations before the loop is exited through a
given exit can be determined using the @code{number_of_iterations_exit}
function.  On RTL, the results are returned in the @code{struct niter_desc}
structure.  The corresponding function is named
@code{check_simple_exit}.  There are also functions that pass through
all the exits of a loop and try to find one with an easy to determine
number of iterations -- @code{find_loop_niter} on GIMPLE and
@code{find_simple_exit} on RTL@.  Finally, there are functions that
provide the same information, but additionally cache it, so that
repeated calls to number of iterations are not so costly --
@code{number_of_latch_executions} on GIMPLE and @code{get_simple_loop_desc}
on RTL.

Note that some of these functions may behave slightly differently than
others -- some of them return only the expression for the number of
iterations, and fail if there are some assumptions.  The function
@code{number_of_latch_executions} works only for single-exit loops.
The function @code{number_of_cond_exit_executions} can be used to
determine the number of executions of the exit condition of a single-exit
loop (i.e., the @code{number_of_latch_executions} increased by one).
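
The relation between the two counts can be illustrated with a
stand-alone C sketch.  It is not GCC code; the @code{toy_} names are
hypothetical, and the loop is assumed to have the shape @code{for (i =
lo; i < hi; i += step)} with a positive step, a single exit tested in
the header, and bounds small enough that the arithmetic does not
overflow.  The first function counts by simulation, the second computes
the same number in closed form together with a @code{may_be_zero}-like
condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Result of the closed-form computation, loosely modelled on the
   conditions described above.  */
struct toy_niter
{
  bool may_be_zero;      /* The loop exits in the first iteration.  */
  unsigned long niter;   /* Number of executions of the latch.  */
};

/* Simulate  for (i = lo; i < hi; i += step) body;  and count how often
   the latch and the exit condition are executed.  */
void
toy_simulate (long lo, long hi, long step,
              unsigned long *latch, unsigned long *cond)
{
  assert (step > 0);
  *latch = 0;
  *cond = 0;
  for (long i = lo; ; i += step)
    {
      ++*cond;            /* The exit test in the header runs.  */
      if (!(i < hi))
        break;            /* Exit taken: the latch is not reached.  */
      ++*latch;           /* Body done, the latch jumps back.  */
    }
}

/* Closed form for the same loop shape.  */
struct toy_niter
toy_niter_closed_form (long lo, long hi, long step)
{
  struct toy_niter r = { lo >= hi, 0 };

  assert (step > 0);
  if (!r.may_be_zero)
    r.niter = ((unsigned long) hi - (unsigned long) lo
               + (unsigned long) step - 1) / (unsigned long) step;
  return r;
}
```

For @code{lo} 0, @code{hi} 10 and @code{step} 3 the latch is executed 4
times and the exit condition 5 times, in line with the ``increased by
one'' relation above.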
 
@node Dependency analysis
@section Data Dependency Analysis
@cindex Data Dependency Analysis

The code for the data dependence analysis can be found in
@file{tree-data-ref.c} and its interface and data structures are
described in @file{tree-data-ref.h}.  The function that computes the
data dependences for all the array and pointer references for a given
loop is @code{compute_data_dependences_for_loop}.  This function is
currently used by the linear loop transform and the vectorization
passes.  Before calling this function, one has to allocate two vectors:
the first vector will contain the set of data references that are
contained in the analyzed loop body, and the second vector will contain
the dependence relations between the data references.  Thus if the
vector of data references is of size @code{n}, the vector containing the
dependence relations will contain @code{n*n} elements.  However, if the
analyzed loop contains side effects, such as calls that potentially can
interfere with the data references in the current analyzed loop, the
analysis stops while scanning the loop body for data references, and
inserts a single @code{chrec_dont_know} in the dependence relation
array.

The data references are discovered in a particular order during the
scanning of the loop body: the loop body is analyzed in execution order,
and the data references of each statement are pushed at the end of the
data reference array.  Two data references syntactically occur in the
program in the same order as in the array of data references.  This
syntactic order is important in some classical data dependence tests,
and mapping this order to the elements of this array avoids costly
queries to the loop body representation.
 
Three types of data references are currently handled: @code{ARRAY_REF},
@code{INDIRECT_REF} and @code{COMPONENT_REF}.  The data structure for the
data reference is @code{data_reference}, where @code{data_reference_p} is
the name of a pointer to the data reference structure.  The structure
contains the following elements:

@itemize
@item @code{base_object_info}: Provides information about the base object
of the data reference and its access functions.  These access functions
represent the evolution of the data reference in the loop relative to
its base, in keeping with the classical meaning of the data reference
access function for the support of arrays.  For example, for a reference
@code{a.b[i][j]}, the base object is @code{a.b} and the access functions,
one for each array subscript, are:
@code{@{i_init, +, i_step@}_1, @{j_init, +, j_step@}_2}.

@item @code{first_location_in_loop}: Provides information about the first
location accessed by the data reference in the loop and about the access
function used to represent evolution relative to this location.  This data
is used to support pointers, and is not used for arrays (for which we
have base objects).  Pointer accesses are represented as a one-dimensional
access that starts from the first location accessed in the loop.  For
example:

@smallexample
      for1 i
         for2 j
          *((int *)p + i + j) = a[i][j];
@end smallexample

The access function of the pointer access is @code{@{0, +, 4B@}_for2}
relative to @code{p + i}.  The access functions of the array are
@code{@{i_init, +, i_step@}_for1} and @code{@{j_init, +, j_step@}_for2}
relative to @code{a}.

Usually, the object the pointer refers to is either unknown, or we can't
prove that the access is confined to the boundaries of a certain object.

Two data references can be compared only if at least one of these two
representations has all its fields filled for both data references.

The current strategy for data dependence tests is as follows.
If both @code{a} and @code{b} are represented as arrays, compare
@code{a.base_object} and @code{b.base_object};
if they are equal, apply dependence tests (use access functions based on
base objects).
Otherwise, if both @code{a} and @code{b} are represented as pointers,
compare @code{a.first_location} and @code{b.first_location};
if they are equal, apply dependence tests (use access functions based on
the first location).
However, if @code{a} and @code{b} are represented differently, only try
to prove that the bases are definitely different.

@item Aliasing information.
@item Alignment information.
@end itemize
 
The structure describing the relation between two data references is
@code{data_dependence_relation} and the shorter name for a pointer to
such a structure is @code{ddr_p}.  This structure contains:

@itemize
@item a pointer to each data reference,
@item a tree node @code{are_dependent} that is set to @code{chrec_known}
if the analysis has proved that there is no dependence between these two
data references, to @code{chrec_dont_know} if the analysis was not able to
determine any useful result and potentially there could exist a
dependence between these data references, and to @code{NULL_TREE} if
there exists a dependence relation between the data references, in which
case the description of this dependence relation is
given in the @code{subscripts}, @code{dir_vects}, and @code{dist_vects}
arrays,
@item a boolean that determines whether the dependence relation can be
represented by a classical distance vector,
@item an array @code{subscripts} that contains a description of each
subscript of the data references.  Given two array accesses, a
subscript is the tuple composed of the access functions for a given
dimension.  For example, given @code{A[f1][f2][f3]} and
@code{B[g1][g2][g3]}, there are three subscripts: @code{(f1, g1), (f2,
g2), (f3, g3)}.
@item two arrays @code{dir_vects} and @code{dist_vects} that contain
classical representations of the data dependences under the form of
direction and distance dependence vectors,
@item an array of loops @code{loop_nest} that contains the loops to
which the distance and direction vectors refer.
@end itemize
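As an illustration of how a distance can be derived from a pair of
access functions, the following stand-alone C sketch applies the
classical strong SIV test to two references @code{A[a*i + c1]} and
@code{A[a*i + c2]} in a single loop.  It is not the code used by GCC:
the @code{toy_} names are hypothetical, the result type only mimics the
distinction between a disproved dependence and a dependence with a known
distance, and the loop is assumed to run @code{n_iter} iterations
numbered from zero.

```c
#include <assert.h>

enum toy_dep { TOY_NO_DEPENDENCE, TOY_DEPENDENCE };

/* Outcome of the test: whether the two references can touch the same
   element, and if so, how many iterations apart.  */
struct toy_ddr
{
  enum toy_dep kind;
  long dist;
};

/* Strong SIV test for A[a*i + c1] and A[a*i + c2].  The element
   accessed by the first reference in iteration i1 is accessed by the
   second one in iteration i2 when a * (i2 - i1) = c1 - c2.  */
struct toy_ddr
toy_strong_siv (long a, long c1, long c2, long n_iter)
{
  struct toy_ddr r = { TOY_NO_DEPENDENCE, 0 };
  long diff = c1 - c2;

  assert (a != 0 && n_iter > 0);
  if (diff % a != 0)
    return r;             /* No integer solution.  */

  long d = diff / a;
  if (d >= n_iter || -d >= n_iter)
    return r;             /* The iterations would be too far apart.  */

  r.kind = TOY_DEPENDENCE;
  r.dist = d;
  return r;
}
```

For example, @code{A[i + 1]} and @code{A[i]} in a loop of 100 iterations
give a dependence with distance 1, while @code{A[2*i]} and
@code{A[2*i + 1]} never refer to the same element.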
 
Several functions for pretty printing the information extracted by the
data dependence analysis are available: @code{dump_ddrs} prints with a
maximum verbosity the details of a data dependence relations array,
@code{dump_dist_dir_vectors} prints only the classical distance and
direction vectors for a data dependence relations array, and
@code{dump_data_references} prints the details of the data references
contained in a data reference array.
 
@node Lambda
@section Linear loop transformations framework
@cindex Linear loop transformations framework

Lambda is a framework that allows transformations of loops using
non-singular matrix based transformations of the iteration space and
loop bounds.  This allows compositions of skewing, scaling, interchange,
and reversal transformations.  These transformations are often used to
improve cache behavior or remove inner loop dependencies to allow
parallelization and vectorization to take place.
 
To perform these transformations, Lambda requires that the loopnest be
converted into an internal form that can be matrix transformed easily.
To do this conversion, the function
@code{gcc_loopnest_to_lambda_loopnest} is provided.  If the loop cannot
be transformed using lambda, this function will return @code{NULL}.

Once a @code{lambda_loopnest} is obtained from the conversion function,
it can be transformed by using @code{lambda_loopnest_transform}, which
takes a transformation matrix to apply.  Note that it is up to the
caller to verify that the transformation matrix is legal to apply to the
loop (dependence respecting, etc.).  Lambda simply applies whatever
matrix it is given.  It can be extended to make legal matrices
out of any non-singular matrix, but this is not currently implemented.
Legality of a matrix for a given loopnest can be verified using
@code{lambda_transform_legal_p}.

Given a transformed loopnest, conversion back into GCC IR is done by
@code{lambda_loopnest_to_gcc_loopnest}.  This function will modify the
loops so that they match the transformed loopnest.
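
What it means for a matrix to be legal can be sketched with the
classical criterion that every dependence distance vector must stay
lexicographically positive after the transformation.  The stand-alone C
code below illustrates that criterion for a loop nest of depth two; it
is not the implementation of @code{lambda_transform_legal_p}, and all
@code{toy_} names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_DEPTH 2   /* Depth of the loop nest in this sketch.  */

/* OUT = T * IN.  */
void
toy_matrix_vector_mult (int t[TOY_DEPTH][TOY_DEPTH],
                        const int in[TOY_DEPTH], int out[TOY_DEPTH])
{
  for (int r = 0; r < TOY_DEPTH; r++)
    {
      out[r] = 0;
      for (int c = 0; c < TOY_DEPTH; c++)
        out[r] += t[r][c] * in[c];
    }
}

/* Sign of the first nonzero component of V, or 0 for the zero vector.  */
int
toy_lex_sign (const int v[TOY_DEPTH])
{
  for (int k = 0; k < TOY_DEPTH; k++)
    if (v[k] != 0)
      return v[k] > 0 ? 1 : -1;
  return 0;
}

/* True if applying T keeps every distance vector lexicographically
   positive, i.e. no dependence is reversed.  The input vectors must be
   lexicographically positive themselves.  */
bool
toy_transform_legal_p (int t[TOY_DEPTH][TOY_DEPTH],
                       int dist[][TOY_DEPTH], int n_dist)
{
  for (int i = 0; i < n_dist; i++)
    {
      int image[TOY_DEPTH];

      assert (toy_lex_sign (dist[i]) > 0);
      toy_matrix_vector_mult (t, dist[i], image);
      if (toy_lex_sign (image) <= 0)
        return false;
    }
  return true;
}
```

Under this criterion, interchanging the two loops (matrix rows 0 1 and
1 0) is accepted for the distance vectors (1, 0) and (0, 2), but rejected
for (1, -1), which the skewing matrix with rows 1 0 and 1 1 maps to
(1, 0).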
 
@node Omega
@section Omega a solver for linear programming problems
@cindex Omega a solver for linear programming problems

The data dependence analysis contains several solvers triggered
sequentially from the less complex ones to the more sophisticated.
For ensuring the consistency of the results of these solvers, a data
dependence check pass has been implemented based on two different
solvers.  The second method that has been integrated into GCC is based
on the Omega dependence solver, written in the 1990s by William Pugh
and David Wonnacott.  Data dependence tests can be formulated using a
subset of Presburger arithmetic that can be translated to linear
constraint systems.  These linear constraint systems can then be
solved using the Omega solver.
 
The Omega solver uses Fourier-Motzkin's algorithm for variable
elimination: a linear constraint system containing @code{n} variables
is reduced to a linear constraint system with @code{n-1} variables.
The Omega solver can also be used for solving other problems that can
be expressed under the form of a system of linear equalities and
inequalities.  The Omega solver is known to have an exponential worst
case, also known under the name of ``omega nightmare'' in the
literature, but in practice, the omega test is known to be efficient
for the common data dependence tests.
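
One elimination step can be sketched as follows.  The stand-alone C code
below performs plain Fourier-Motzkin elimination on inequalities of the
form @code{coef . x <= bound} with integer coefficients; it is not taken
from GCC, all @code{toy_} names are hypothetical, and it computes only
the projection over the rationals, without the extra work an integer
solver needs to keep the result exact.

```c
#include <assert.h>

#define TOY_VARS 3
#define TOY_MAX_ROWS 32

/* One inequality: coef[0]*x_0 + ... + coef[TOY_VARS-1]*x_last <= bound.  */
struct toy_ineq
{
  long coef[TOY_VARS];
  long bound;
};

/* Eliminate variable VAR from the N_IN inequalities IN, writing the
   projected system to OUT (room for TOY_MAX_ROWS rows) and returning
   its size.  */
int
toy_fm_eliminate (const struct toy_ineq *in, int n_in, int var,
                  struct toy_ineq *out)
{
  int n_out = 0;

  assert (var >= 0 && var < TOY_VARS);
  for (int p = 0; p < n_in; p++)
    {
      if (in[p].coef[var] == 0)
        {
          /* VAR does not occur: keep the inequality unchanged.  */
          assert (n_out < TOY_MAX_ROWS);
          out[n_out++] = in[p];
          continue;
        }
      if (in[p].coef[var] < 0)
        continue;   /* Lower bounds are consumed in the pairing below.  */

      /* in[p] is an upper bound on VAR: pair it with every lower bound.  */
      for (int q = 0; q < n_in; q++)
        {
          if (in[q].coef[var] >= 0)
            continue;
          long mp = -in[q].coef[var];   /* Both multipliers are positive, */
          long mq = in[p].coef[var];    /* so "<=" keeps its direction.  */
          assert (n_out < TOY_MAX_ROWS);
          for (int j = 0; j < TOY_VARS; j++)
            out[n_out].coef[j] = mp * in[p].coef[j] + mq * in[q].coef[j];
          out[n_out].bound = mp * in[p].bound + mq * in[q].bound;
          n_out++;
        }
    }
  return n_out;
}
```

For the system x0 - x1 <= 0, -x0 <= -2, x1 <= 5, eliminating x0 leaves
-x1 <= -2 and x1 <= 5, a system in one variable fewer.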
 
The interface used by the Omega solver for describing the linear
programming problems is described in @file{omega.h}, and the solver is
@code{omega_solve_problem}.
