/* Predictive commoning.
   Copyright (C) 2005, 2007, 2008, 2009, 2010
   Free Software Foundation, Inc.

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 3, or (at your option) any
later version.

GCC is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */

/* This file implements the predictive commoning optimization.  Predictive
   commoning can be viewed as CSE around a loop, and with some improvements,
   as generalized strength reduction -- i.e., reusing values computed in
   earlier iterations of a loop in the later ones.  So far, the pass only
   handles the most useful case, that is, reusing values of memory references.
   If you think this is all just a special case of PRE, you are sort of right;
   however, concentrating on loops is simpler, and makes it possible to
   incorporate data dependence analysis to detect the opportunities, perform
   loop unrolling to avoid copies together with renaming immediately,
   and if needed, we could also take register pressure into account.

   Let us demonstrate what is done on an example:

   for (i = 0; i < 100; i++)
     {
       a[i+2] = a[i] + a[i+1];
       b[10] = b[10] + i;
       c[i] = c[99 - i];
       d[i] = d[i + 1];
     }

   1) We find data references in the loop, and split them into mutually
      independent groups (i.e., we find components of a data dependence
      graph).  We ignore read-read dependences whose distance is not constant.
      (TODO -- we could also ignore antidependences).  In this example, we
      find the following groups:

      a[i]{read}, a[i+1]{read}, a[i+2]{write}
      b[10]{read}, b[10]{write}
      c[99 - i]{read}, c[i]{write}
      d[i + 1]{read}, d[i]{write}

   2) Inside each of the groups, we verify several conditions:
      a) all the references must differ in indices only, and the indices
         must all have the same step
      b) the references must dominate the loop latch (and thus, they must be
         ordered by the dominance relation).
      c) the distance of the indices must be a small multiple of the step
      We are then able to compute the difference of the references (# of
      iterations before they point to the same place as the first of them).
      Also, in case there are writes in the loop, we split the groups into
      chains whose head is the write whose values are used by the reads in
      the same chain.  The chains are then processed independently,
      making the further transformations simpler.  Also, the shorter chains
      need the same number of registers, but may require a lower unrolling
      factor in order to get rid of the copies on the loop latch.

      In our example, we get the following chains (the chain for c is invalid).

      a[i]{read,+0}, a[i+1]{read,-1}, a[i+2]{write,-2}
      b[10]{read,+0}, b[10]{write,+0}
      d[i + 1]{read,+0}, d[i]{write,+1}

   3) For each read, we determine the read or write whose value it reuses,
      together with the distance of this reuse.  I.e., we take the last
      reference before it with distance 0, or the last of the references
      with the smallest positive distance to the read.  Then, we remove
      the references that are not used in any of these chains, discard the
      empty groups, and propagate all the links so that they point to the
      single root reference of the chain (adjusting their distance
      appropriately).  Some extra care needs to be taken for references with
      step 0.  In our example (the numbers indicate the distance of the
      reuse),

      a[i] --> (*) 2, a[i+1] --> (*) 1, a[i+2] (*)
      b[10] --> (*) 1, b[10] (*)

   4) The chains are combined together if possible.  If the corresponding
      elements of two chains are always combined together with the same
      operator, we remember just the result of this combination, instead
      of remembering the values separately.  We may need to perform
      reassociation to enable combining, for example

      e[i] + f[i+1] + e[i+1] + f[i]

      can be reassociated as

      (e[i] + f[i]) + (e[i+1] + f[i+1])

      and we can combine the chains for e and f into one chain.

   5) For each root reference (end of the chain) R, let N be the maximum
      distance of a reference reusing its value.  Variables R0 up to RN are
      created, together with phi nodes that transfer values from R1 .. RN to
      R0 .. R(N-1).
      Initial values are loaded into R0..R(N-1) (in case not all references
      must necessarily be accessed and they may trap, we may fail here;
      TODO sometimes, the loads could be guarded by a check for the number
      of iterations).  Values loaded/stored in roots are also copied to
      RN.  Other reads are replaced with the appropriate variable Ri.
      Everything is put into SSA form.

      As a small improvement, if R0 is dead after the root (i.e., all uses of
      the value with the maximum distance dominate the root), we can avoid
      creating RN and use R0 instead of it.

      In our example, we get (only the parts concerning a and b are shown):
      for (i = 0; i < 100; i++)
        {
          f = phi (a[0], s);
          s = phi (a[1], f);
          x = phi (b[10], x);

          f = f + s;
          a[i+2] = f;
          x = x + i;
          b[10] = x;
        }

   6) Factor F for unrolling is determined as the least common multiple of
      (N + 1) for each root reference (N for references for which we avoided
      creating RN).  If F and the loop are small enough, the loop is unrolled
      F times.  The stores to RN (R0) in the copies of the loop body are
      periodically replaced with R0, R1, ... (R1, R2, ...), so that they can
      be coalesced and the copies can be eliminated.

      TODO -- copy propagation and other optimizations may change the live
      ranges of the temporary registers and prevent them from being coalesced;
      this may increase the register pressure.

      In our case, F = 2 and the (main loop of the) result is

      for (i = 0; i < ...; i += 2)
        {
          f = phi (a[0], f);
          s = phi (a[1], s);
          x = phi (b[10], x);

          f = f + s;
          a[i+2] = f;
          x = x + i;
          b[10] = x;

          s = s + f;
          a[i+3] = s;
          x = x + i;
          b[10] = x;
        }

   TODO -- stores killing other stores can be taken into account, e.g.,
   for (i = 0; i < n; i++)
     {
       a[i] = 1;
       a[i+2] = 2;
     }

   can be replaced with

   t0 = a[0];
   t1 = a[1];
   for (i = 0; i < n; i++)
     {
       a[i] = 1;
       t2 = 2;
       t0 = t1;
       t1 = t2;
     }
   a[n] = t0;
   a[n+1] = t1;

   The interesting part is that this would generalize store motion; still, since
   store motion is performed elsewhere, it does not seem that important.

   Predictive commoning can be generalized for arbitrary computations (not
   just memory loads), and also nontrivial transfer functions (e.g., replacing
   i * i with ii_last + 2 * i + 1), to generalize strength reduction.  */

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "tree.h"
#include "tm_p.h"
#include "cfgloop.h"
#include "tree-flow.h"
#include "ggc.h"
#include "tree-data-ref.h"
#include "tree-scalar-evolution.h"
#include "tree-chrec.h"
#include "params.h"
#include "diagnostic.h"
#include "tree-pass.h"
#include "tree-affine.h"
#include "tree-inline.h"

206
/* The maximum number of iterations between the considered memory
207
   references.  */
208
 
209
#define MAX_DISTANCE (target_avail_regs < 16 ? 4 : 8)
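
/* As a rough illustration (the array, bounds and register count below are
   assumed, not taken from a particular target): with target_avail_regs >= 16,
   MAX_DISTANCE is 8, so in

     for (i = 0; i < n; i++)
       s += a[i] * a[i + 9];

   the value read by a[i + 9] would only become reusable by a[i] nine
   iterations later; that exceeds MAX_DISTANCE, so the two reads end up in
   separate, trivial chains and no temporaries are carried for them.  */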
210
 
211
/* Data references (or phi nodes that carry data reference values across
212
   loop iterations).  */
213
 
214
typedef struct dref_d
215
{
216
  /* The reference itself.  */
217
  struct data_reference *ref;
218
 
219
  /* The statement in which the reference appears.  */
220
  gimple stmt;
221
 
222
  /* In case that STMT is a phi node, this field is set to the SSA name
223
     defined by it in replace_phis_by_defined_names (in order to avoid
224
     pointing to phi node that got reallocated in the meantime).  */
225
  tree name_defined_by_phi;
226
 
227
  /* Distance of the reference from the root of the chain (in number of
228
     iterations of the loop).  */
229
  unsigned distance;
230
 
231
  /* Number of iterations offset from the first reference in the component.  */
232
  double_int offset;
233
 
234
  /* Number of the reference in a component, in dominance ordering.  */
235
  unsigned pos;
236
 
237
  /* True if the memory reference is always accessed when the loop is
238
     entered.  */
239
  unsigned always_accessed : 1;
240
} *dref;
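
/* For the loop in the comment at the top of this file, the a[i] load would
   roughly be described by a dref whose OFFSET records how many iterations
   separate it from the first reference of its component, whose POS is its
   position in dominance order, and whose DISTANCE becomes 2 once it is
   chained to the a[i+2] store whose value it reuses two iterations
   later.  */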
241
 
242
DEF_VEC_P (dref);
243
DEF_VEC_ALLOC_P (dref, heap);
244
 
245
/* Type of the chain of the references.  */
246
 
247
enum chain_type
248
{
249
  /* The addresses of the references in the chain are constant.  */
250
  CT_INVARIANT,
251
 
252
  /* There are only loads in the chain.  */
253
  CT_LOAD,
254
 
255
  /* Root of the chain is store, the rest are loads.  */
256
  CT_STORE_LOAD,
257
 
258
  /* A combination of two chains.  */
259
  CT_COMBINATION
260
};
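
/* Roughly, mapping the introductory example onto these types: the chain
   built from the b[10] accesses (a loop-invariant address) is CT_INVARIANT,
   a chain consisting only of loads of an array would be CT_LOAD, the chain
   rooted at the a[i+2] store with the a[i] and a[i+1] loads is
   CT_STORE_LOAD, and combining the chains for e and f from the
   reassociation example yields CT_COMBINATION.  */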
261
 
262
/* Chains of data references.  */
263
 
264
typedef struct chain
265
{
266
  /* Type of the chain.  */
267
  enum chain_type type;
268
 
269
  /* For combination chains, the operator and the two chains that are
270
     combined, and the type of the result.  */
271
  enum tree_code op;
272
  tree rslt_type;
273
  struct chain *ch1, *ch2;
274
 
275
  /* The references in the chain.  */
276
  VEC(dref,heap) *refs;
277
 
278
  /* The maximum distance of the reference in the chain from the root.  */
279
  unsigned length;
280
 
281
  /* The variables used to copy the value throughout iterations.  */
282
  VEC(tree,heap) *vars;
283
 
284
  /* Initializers for the variables.  */
285
  VEC(tree,heap) *inits;
286
 
287
  /* True if there is a use of a variable with the maximal distance
288
     that comes after the root in the loop.  */
289
  unsigned has_max_use_after : 1;
290
 
291
  /* True if all the memory references in the chain are always accessed.  */
292
  unsigned all_always_accessed : 1;
293
 
294
  /* True if this chain was combined together with some other chain.  */
295
  unsigned combined : 1;
296
} *chain_p;
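
/* Continuing the example, the chain for a would have type CT_STORE_LOAD,
   length 2 (the largest reuse distance), and has_max_use_after clear,
   because the distance-2 load a[i] occurs before the root store a[i+2];
   that is what allows the transformed loop in the introductory comment to
   make do with just the two variables f and s.  */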
297
 
298
DEF_VEC_P (chain_p);
299
DEF_VEC_ALLOC_P (chain_p, heap);
300
 
301
/* Describes the knowledge about the step of the memory references in
302
   the component.  */
303
 
304
enum ref_step_type
305
{
306
  /* The step is zero.  */
307
  RS_INVARIANT,
308
 
309
  /* The step is nonzero.  */
310
  RS_NONZERO,
311
 
312
  /* The step may or may not be nonzero.  */
313
  RS_ANY
314
};
315
 
316
/* Components of the data dependence graph.  */
317
 
318
struct component
319
{
320
  /* The references in the component.  */
321
  VEC(dref,heap) *refs;
322
 
323
  /* What we know about the step of the references in the component.  */
324
  enum ref_step_type comp_step;
325
 
326
  /* Next component in the list.  */
327
  struct component *next;
328
};
329
 
330
/* Bitmap of ssa names defined by looparound phi nodes covered by chains.  */
331
 
332
static bitmap looparound_phis;
333
 
334
/* Cache used by tree_to_aff_combination_expand.  */
335
 
336
static struct pointer_map_t *name_expansions;
337
 
338
/* Dumps data reference REF to FILE.  */
339
 
340
extern void dump_dref (FILE *, dref);
341
void
342
dump_dref (FILE *file, dref ref)
343
{
344
  if (ref->ref)
345
    {
346
      fprintf (file, "    ");
347
      print_generic_expr (file, DR_REF (ref->ref), TDF_SLIM);
348
      fprintf (file, " (id %u%s)\n", ref->pos,
349
               DR_IS_READ (ref->ref) ? "" : ", write");
350
 
351
      fprintf (file, "      offset ");
352
      dump_double_int (file, ref->offset, false);
353
      fprintf (file, "\n");
354
 
355
      fprintf (file, "      distance %u\n", ref->distance);
356
    }
357
  else
358
    {
359
      if (gimple_code (ref->stmt) == GIMPLE_PHI)
360
        fprintf (file, "    looparound ref\n");
361
      else
362
        fprintf (file, "    combination ref\n");
363
      fprintf (file, "      in statement ");
364
      print_gimple_stmt (file, ref->stmt, 0, TDF_SLIM);
365
      fprintf (file, "\n");
366
      fprintf (file, "      distance %u\n", ref->distance);
367
    }
368
 
369
}
370
 
371
/* Dumps CHAIN to FILE.  */
372
 
373
extern void dump_chain (FILE *, chain_p);
374
void
375
dump_chain (FILE *file, chain_p chain)
376
{
377
  dref a;
378
  const char *chain_type;
379
  unsigned i;
380
  tree var;
381
 
382
  switch (chain->type)
383
    {
384
    case CT_INVARIANT:
385
      chain_type = "Load motion";
386
      break;
387
 
388
    case CT_LOAD:
389
      chain_type = "Loads-only";
390
      break;
391
 
392
    case CT_STORE_LOAD:
393
      chain_type = "Store-loads";
394
      break;
395
 
396
    case CT_COMBINATION:
397
      chain_type = "Combination";
398
      break;
399
 
400
    default:
401
      gcc_unreachable ();
402
    }
403
 
404
  fprintf (file, "%s chain %p%s\n", chain_type, (void *) chain,
405
           chain->combined ? " (combined)" : "");
406
  if (chain->type != CT_INVARIANT)
407
    fprintf (file, "  max distance %u%s\n", chain->length,
408
             chain->has_max_use_after ? "" : ", may reuse first");
409
 
410
  if (chain->type == CT_COMBINATION)
411
    {
412
      fprintf (file, "  equal to %p %s %p in type ",
413
               (void *) chain->ch1, op_symbol_code (chain->op),
414
               (void *) chain->ch2);
415
      print_generic_expr (file, chain->rslt_type, TDF_SLIM);
416
      fprintf (file, "\n");
417
    }
418
 
419
  if (chain->vars)
420
    {
421
      fprintf (file, "  vars");
422
      for (i = 0; VEC_iterate (tree, chain->vars, i, var); i++)
423
        {
424
          fprintf (file, " ");
425
          print_generic_expr (file, var, TDF_SLIM);
426
        }
427
      fprintf (file, "\n");
428
    }
429
 
430
  if (chain->inits)
431
    {
432
      fprintf (file, "  inits");
433
      for (i = 0; VEC_iterate (tree, chain->inits, i, var); i++)
434
        {
435
          fprintf (file, " ");
436
          print_generic_expr (file, var, TDF_SLIM);
437
        }
438
      fprintf (file, "\n");
439
    }
440
 
441
  fprintf (file, "  references:\n");
442
  for (i = 0; VEC_iterate (dref, chain->refs, i, a); i++)
443
    dump_dref (file, a);
444
 
445
  fprintf (file, "\n");
446
}
447
 
448
/* Dumps CHAINS to FILE.  */
449
 
450
extern void dump_chains (FILE *, VEC (chain_p, heap) *);
451
void
452
dump_chains (FILE *file, VEC (chain_p, heap) *chains)
453
{
454
  chain_p chain;
455
  unsigned i;
456
 
457
  for (i = 0; VEC_iterate (chain_p, chains, i, chain); i++)
458
    dump_chain (file, chain);
459
}
460
 
461
/* Dumps COMP to FILE.  */
462
 
463
extern void dump_component (FILE *, struct component *);
464
void
465
dump_component (FILE *file, struct component *comp)
466
{
467
  dref a;
468
  unsigned i;
469
 
470
  fprintf (file, "Component%s:\n",
471
           comp->comp_step == RS_INVARIANT ? " (invariant)" : "");
472
  for (i = 0; VEC_iterate (dref, comp->refs, i, a); i++)
473
    dump_dref (file, a);
474
  fprintf (file, "\n");
475
}
476
 
477
/* Dumps COMPS to FILE.  */
478
 
479
extern void dump_components (FILE *, struct component *);
480
void
481
dump_components (FILE *file, struct component *comps)
482
{
483
  struct component *comp;
484
 
485
  for (comp = comps; comp; comp = comp->next)
486
    dump_component (file, comp);
487
}
488
 
489
/* Frees a chain CHAIN.  */
490
 
491
static void
492
release_chain (chain_p chain)
493
{
494
  dref ref;
495
  unsigned i;
496
 
497
  if (chain == NULL)
498
    return;
499
 
500
  for (i = 0; VEC_iterate (dref, chain->refs, i, ref); i++)
501
    free (ref);
502
 
503
  VEC_free (dref, heap, chain->refs);
504
  VEC_free (tree, heap, chain->vars);
505
  VEC_free (tree, heap, chain->inits);
506
 
507
  free (chain);
508
}
509
 
510
/* Frees CHAINS.  */
511
 
512
static void
513
release_chains (VEC (chain_p, heap) *chains)
514
{
515
  unsigned i;
516
  chain_p chain;
517
 
518
  for (i = 0; VEC_iterate (chain_p, chains, i, chain); i++)
519
    release_chain (chain);
520
  VEC_free (chain_p, heap, chains);
521
}
522
 
523
/* Frees a component COMP.  */
524
 
525
static void
526
release_component (struct component *comp)
527
{
528
  VEC_free (dref, heap, comp->refs);
529
  free (comp);
530
}
531
 
532
/* Frees list of components COMPS.  */
533
 
534
static void
535
release_components (struct component *comps)
536
{
537
  struct component *act, *next;
538
 
539
  for (act = comps; act; act = next)
540
    {
541
      next = act->next;
542
      release_component (act);
543
    }
544
}
545
 
546
/* Finds the root of the tree given by FATHERS that contains A, and performs path
547
   shortening.  */
548
 
549
static unsigned
550
component_of (unsigned fathers[], unsigned a)
551
{
552
  unsigned root, n;
553
 
554
  for (root = a; root != fathers[root]; root = fathers[root])
555
    continue;
556
 
557
  for (; a != root; a = n)
558
    {
559
      n = fathers[a];
560
      fathers[a] = root;
561
    }
562
 
563
  return root;
564
}
565
 
566
/* Join operation for DFU (the union-find structure).  FATHERS gives the
   tree, SIZES are the sizes of the
567
   components, A and B are components to merge.  */
568
 
569
static void
570
merge_comps (unsigned fathers[], unsigned sizes[], unsigned a, unsigned b)
571
{
572
  unsigned ca = component_of (fathers, a);
573
  unsigned cb = component_of (fathers, b);
574
 
575
  if (ca == cb)
576
    return;
577
 
578
  if (sizes[ca] < sizes[cb])
579
    {
580
      sizes[cb] += sizes[ca];
581
      fathers[ca] = cb;
582
    }
583
  else
584
    {
585
      sizes[ca] += sizes[cb];
586
      fathers[cb] = ca;
587
    }
588
}
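
/* A minimal usage sketch of the two union-find helpers above (the array
   size and indices are made up for illustration):

     unsigned father[4] = { 0, 1, 2, 3 };
     unsigned size[4] = { 1, 1, 1, 1 };

     merge_comps (father, size, 0, 1);
     merge_comps (father, size, 1, 2);

   After these calls, component_of (father, 2) and component_of (father, 0)
   both return the same representative, and every lookup shortens the paths
   it traverses.  split_data_refs_to_components below uses this pattern,
   with one extra slot (index n) reserved for the "bad" references.  */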
589
 
590
/* Returns true if A is a reference that is suitable for predictive commoning
591
   in the innermost loop that contains it.  REF_STEP is set according to the
592
   step of the reference A.  */
593
 
594
static bool
595
suitable_reference_p (struct data_reference *a, enum ref_step_type *ref_step)
596
{
597
  tree ref = DR_REF (a), step = DR_STEP (a);
598
 
599
  if (!step
600
      || !is_gimple_reg_type (TREE_TYPE (ref))
601
      || tree_could_throw_p (ref))
602
    return false;
603
 
604
  if (integer_zerop (step))
605
    *ref_step = RS_INVARIANT;
606
  else if (integer_nonzerop (step))
607
    *ref_step = RS_NONZERO;
608
  else
609
    *ref_step = RS_ANY;
610
 
611
  return true;
612
}
613
 
614
/* Stores DR_OFFSET (DR) + DR_INIT (DR) to OFFSET.  */
615
 
616
static void
617
aff_combination_dr_offset (struct data_reference *dr, aff_tree *offset)
618
{
619
  aff_tree delta;
620
 
621
  tree_to_aff_combination_expand (DR_OFFSET (dr), sizetype, offset,
622
                                  &name_expansions);
623
  aff_combination_const (&delta, sizetype, tree_to_double_int (DR_INIT (dr)));
624
  aff_combination_add (offset, &delta);
625
}
626
 
627
/* Determines the number of iterations of the innermost enclosing loop before B
628
   refers to exactly the same location as A and stores it to OFF.  If A and
629
   B do not have the same step, they never meet, or anything else fails,
630
   returns false, otherwise returns true.  Both A and B are assumed to
631
   satisfy suitable_reference_p.  */
632
 
633
static bool
634
determine_offset (struct data_reference *a, struct data_reference *b,
635
                  double_int *off)
636
{
637
  aff_tree diff, baseb, step;
638
  tree typea, typeb;
639
 
640
  /* Check that both the references access the location in the same type.  */
641
  typea = TREE_TYPE (DR_REF (a));
642
  typeb = TREE_TYPE (DR_REF (b));
643
  if (!useless_type_conversion_p (typeb, typea))
644
    return false;
645
 
646
  /* Check whether the base address and the step of both references is the
647
     same.  */
648
  if (!operand_equal_p (DR_STEP (a), DR_STEP (b), 0)
649
      || !operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0))
650
    return false;
651
 
652
  if (integer_zerop (DR_STEP (a)))
653
    {
654
      /* If the references have loop invariant address, check that they access
655
         exactly the same location.  */
656
      *off = double_int_zero;
657
      return (operand_equal_p (DR_OFFSET (a), DR_OFFSET (b), 0)
658
              && operand_equal_p (DR_INIT (a), DR_INIT (b), 0));
659
    }
660
 
661
  /* Compare the offsets of the addresses, and check whether the difference
662
     is a multiple of step.  */
663
  aff_combination_dr_offset (a, &diff);
664
  aff_combination_dr_offset (b, &baseb);
665
  aff_combination_scale (&baseb, double_int_minus_one);
666
  aff_combination_add (&diff, &baseb);
667
 
668
  tree_to_aff_combination_expand (DR_STEP (a), sizetype,
669
                                  &step, &name_expansions);
670
  return aff_combination_constant_multiple_p (&diff, &step, off);
671
}
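
/* A worked example (4-byte elements assumed, so DR_STEP is 4): for
   A == a[i+2] and B == a[i], the difference of the offsets is 8 bytes,
   which is 2 times the step, so *OFF is set to 2 -- B touches the location
   of A two iterations later.  If the steps or base addresses differed, or
   the difference were not a constant multiple of the step, false would be
   returned.  */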
672
 
673
/* Returns the last basic block in LOOP for which we are sure that
674
   it is executed whenever the loop is entered.  */
675
 
676
static basic_block
677
last_always_executed_block (struct loop *loop)
678
{
679
  unsigned i;
680
  VEC (edge, heap) *exits = get_loop_exit_edges (loop);
681
  edge ex;
682
  basic_block last = loop->latch;
683
 
684
  for (i = 0; VEC_iterate (edge, exits, i, ex); i++)
685
    last = nearest_common_dominator (CDI_DOMINATORS, last, ex->src);
686
  VEC_free (edge, heap, exits);
687
 
688
  return last;
689
}
690
 
691
/* Splits the dependence graph on DATAREFS described by DEPENDS into
   components.  */
692
 
693
static struct component *
694
split_data_refs_to_components (struct loop *loop,
695
                               VEC (data_reference_p, heap) *datarefs,
696
                               VEC (ddr_p, heap) *depends)
697
{
698
  unsigned i, n = VEC_length (data_reference_p, datarefs);
699
  unsigned ca, ia, ib, bad;
700
  unsigned *comp_father = XNEWVEC (unsigned, n + 1);
701
  unsigned *comp_size = XNEWVEC (unsigned, n + 1);
702
  struct component **comps;
703
  struct data_reference *dr, *dra, *drb;
704
  struct data_dependence_relation *ddr;
705
  struct component *comp_list = NULL, *comp;
706
  dref dataref;
707
  basic_block last_always_executed = last_always_executed_block (loop);
708
 
709
  for (i = 0; VEC_iterate (data_reference_p, datarefs, i, dr); i++)
710
    {
711
      if (!DR_REF (dr))
712
        {
713
          /* A fake reference for call or asm_expr that may clobber memory;
714
             just fail.  */
715
          goto end;
716
        }
717
      dr->aux = (void *) (size_t) i;
718
      comp_father[i] = i;
719
      comp_size[i] = 1;
720
    }
721
 
722
  /* A component reserved for the "bad" data references.  */
723
  comp_father[n] = n;
724
  comp_size[n] = 1;
725
 
726
  for (i = 0; VEC_iterate (data_reference_p, datarefs, i, dr); i++)
727
    {
728
      enum ref_step_type dummy;
729
 
730
      if (!suitable_reference_p (dr, &dummy))
731
        {
732
          ia = (unsigned) (size_t) dr->aux;
733
          merge_comps (comp_father, comp_size, n, ia);
734
        }
735
    }
736
 
737
  for (i = 0; VEC_iterate (ddr_p, depends, i, ddr); i++)
738
    {
739
      double_int dummy_off;
740
 
741
      if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
742
        continue;
743
 
744
      dra = DDR_A (ddr);
745
      drb = DDR_B (ddr);
746
      ia = component_of (comp_father, (unsigned) (size_t) dra->aux);
747
      ib = component_of (comp_father, (unsigned) (size_t) drb->aux);
748
      if (ia == ib)
749
        continue;
750
 
751
      bad = component_of (comp_father, n);
752
 
753
      /* If both A and B are reads, we may ignore unsuitable dependences.  */
754
      if (DR_IS_READ (dra) && DR_IS_READ (drb)
755
          && (ia == bad || ib == bad
756
              || !determine_offset (dra, drb, &dummy_off)))
757
        continue;
758
 
759
      merge_comps (comp_father, comp_size, ia, ib);
760
    }
761
 
762
  comps = XCNEWVEC (struct component *, n);
763
  bad = component_of (comp_father, n);
764
  for (i = 0; VEC_iterate (data_reference_p, datarefs, i, dr); i++)
765
    {
766
      ia = (unsigned) (size_t) dr->aux;
767
      ca = component_of (comp_father, ia);
768
      if (ca == bad)
769
        continue;
770
 
771
      comp = comps[ca];
772
      if (!comp)
773
        {
774
          comp = XCNEW (struct component);
775
          comp->refs = VEC_alloc (dref, heap, comp_size[ca]);
776
          comps[ca] = comp;
777
        }
778
 
779
      dataref = XCNEW (struct dref_d);
780
      dataref->ref = dr;
781
      dataref->stmt = DR_STMT (dr);
782
      dataref->offset = double_int_zero;
783
      dataref->distance = 0;
784
 
785
      dataref->always_accessed
786
              = dominated_by_p (CDI_DOMINATORS, last_always_executed,
787
                                gimple_bb (dataref->stmt));
788
      dataref->pos = VEC_length (dref, comp->refs);
789
      VEC_quick_push (dref, comp->refs, dataref);
790
    }
791
 
792
  for (i = 0; i < n; i++)
793
    {
794
      comp = comps[i];
795
      if (comp)
796
        {
797
          comp->next = comp_list;
798
          comp_list = comp;
799
        }
800
    }
801
  free (comps);
802
 
803
end:
804
  free (comp_father);
805
  free (comp_size);
806
  return comp_list;
807
}
808
 
809
/* Returns true if the component COMP satisfies the conditions
810
   described in 2) at the beginning of this file.  LOOP is the current
811
   loop.  */
812
 
813
static bool
814
suitable_component_p (struct loop *loop, struct component *comp)
815
{
816
  unsigned i;
817
  dref a, first;
818
  basic_block ba, bp = loop->header;
819
  bool ok, has_write = false;
820
 
821
  for (i = 0; VEC_iterate (dref, comp->refs, i, a); i++)
822
    {
823
      ba = gimple_bb (a->stmt);
824
 
825
      if (!just_once_each_iteration_p (loop, ba))
826
        return false;
827
 
828
      gcc_assert (dominated_by_p (CDI_DOMINATORS, ba, bp));
829
      bp = ba;
830
 
831
      if (!DR_IS_READ (a->ref))
832
        has_write = true;
833
    }
834
 
835
  first = VEC_index (dref, comp->refs, 0);
836
  ok = suitable_reference_p (first->ref, &comp->comp_step);
837
  gcc_assert (ok);
838
  first->offset = double_int_zero;
839
 
840
  for (i = 1; VEC_iterate (dref, comp->refs, i, a); i++)
841
    {
842
      if (!determine_offset (first->ref, a->ref, &a->offset))
843
        return false;
844
 
845
#ifdef ENABLE_CHECKING
846
      {
847
        enum ref_step_type a_step;
848
        ok = suitable_reference_p (a->ref, &a_step);
849
        gcc_assert (ok && a_step == comp->comp_step);
850
      }
851
#endif
852
    }
853
 
854
  /* If there is a write inside the component, we must know whether the
855
     step is nonzero or not -- we would not otherwise be able to recognize
856
     whether the value accessed by reads comes from the OFFSET-th iteration
857
     or the previous one.  */
858
  if (has_write && comp->comp_step == RS_ANY)
859
    return false;
860
 
861
  return true;
862
}
863
 
864
/* Check the conditions on references inside each of components COMPS,
865
   and remove the unsuitable components from the list.  The new list
866
   of components is returned.  The conditions are described in 2) at
867
   the beginning of this file.  LOOP is the current loop.  */
868
 
869
static struct component *
870
filter_suitable_components (struct loop *loop, struct component *comps)
871
{
872
  struct component **comp, *act;
873
 
874
  for (comp = &comps; *comp; )
875
    {
876
      act = *comp;
877
      if (suitable_component_p (loop, act))
878
        comp = &act->next;
879
      else
880
        {
881
          dref ref;
882
          unsigned i;
883
 
884
          *comp = act->next;
885
          for (i = 0; VEC_iterate (dref, act->refs, i, ref); i++)
886
            free (ref);
887
          release_component (act);
888
        }
889
    }
890
 
891
  return comps;
892
}
893
 
894
/* Compares two drefs A and B by their offset and position.  Callback for
895
   qsort.  */
896
 
897
static int
898
order_drefs (const void *a, const void *b)
899
{
900
  const dref *const da = (const dref *) a;
901
  const dref *const db = (const dref *) b;
902
  int offcmp = double_int_scmp ((*da)->offset, (*db)->offset);
903
 
904
  if (offcmp != 0)
905
    return offcmp;
906
 
907
  return (*da)->pos - (*db)->pos;
908
}
909
 
910
/* Returns root of the CHAIN.  */
911
 
912
static inline dref
913
get_chain_root (chain_p chain)
914
{
915
  return VEC_index (dref, chain->refs, 0);
916
}
917
 
918
/* Adds REF to the chain CHAIN.  */
919
 
920
static void
921
add_ref_to_chain (chain_p chain, dref ref)
922
{
923
  dref root = get_chain_root (chain);
924
  double_int dist;
925
 
926
  gcc_assert (double_int_scmp (root->offset, ref->offset) <= 0);
927
  dist = double_int_add (ref->offset, double_int_neg (root->offset));
928
  if (double_int_ucmp (uhwi_to_double_int (MAX_DISTANCE), dist) <= 0)
929
    {
930
      free (ref);
931
      return;
932
    }
933
  gcc_assert (double_int_fits_in_uhwi_p (dist));
934
 
935
  VEC_safe_push (dref, heap, chain->refs, ref);
936
 
937
  ref->distance = double_int_to_uhwi (dist);
938
 
939
  if (ref->distance >= chain->length)
940
    {
941
      chain->length = ref->distance;
942
      chain->has_max_use_after = false;
943
    }
944
 
945
  if (ref->distance == chain->length
946
      && ref->pos > root->pos)
947
    chain->has_max_use_after = true;
948
 
949
  chain->all_always_accessed &= ref->always_accessed;
950
}
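
/* For instance, when the a[i] load from the introductory example is added
   to the chain rooted at the a[i+2] store, its distance from the root
   evaluates to 2, which becomes the new chain length; assuming the load is
   discovered before the store of the same statement, its position precedes
   the root's, so has_max_use_after stays false and initialize_root_vars can
   later reuse the first variable instead of creating an extra one.  */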
951
 
952
/* Returns the chain for invariant component COMP.  */
953
 
954
static chain_p
955
make_invariant_chain (struct component *comp)
956
{
957
  chain_p chain = XCNEW (struct chain);
958
  unsigned i;
959
  dref ref;
960
 
961
  chain->type = CT_INVARIANT;
962
 
963
  chain->all_always_accessed = true;
964
 
965
  for (i = 0; VEC_iterate (dref, comp->refs, i, ref); i++)
966
    {
967
      VEC_safe_push (dref, heap, chain->refs, ref);
968
      chain->all_always_accessed &= ref->always_accessed;
969
    }
970
 
971
  return chain;
972
}
973
 
974
/* Make a new chain rooted at REF.  */
975
 
976
static chain_p
977
make_rooted_chain (dref ref)
978
{
979
  chain_p chain = XCNEW (struct chain);
980
 
981
  chain->type = DR_IS_READ (ref->ref) ? CT_LOAD : CT_STORE_LOAD;
982
 
983
  VEC_safe_push (dref, heap, chain->refs, ref);
984
  chain->all_always_accessed = ref->always_accessed;
985
 
986
  ref->distance = 0;
987
 
988
  return chain;
989
}
990
 
991
/* Returns true if CHAIN is not trivial.  */
992
 
993
static bool
994
nontrivial_chain_p (chain_p chain)
995
{
996
  return chain != NULL && VEC_length (dref, chain->refs) > 1;
997
}
998
 
999
/* Returns the ssa name that contains the value of REF, or NULL_TREE if there
1000
   is no such name.  */
1001
 
1002
static tree
1003
name_for_ref (dref ref)
1004
{
1005
  tree name;
1006
 
1007
  if (is_gimple_assign (ref->stmt))
1008
    {
1009
      if (!ref->ref || DR_IS_READ (ref->ref))
1010
        name = gimple_assign_lhs (ref->stmt);
1011
      else
1012
        name = gimple_assign_rhs1 (ref->stmt);
1013
    }
1014
  else
1015
    name = PHI_RESULT (ref->stmt);
1016
 
1017
  return (TREE_CODE (name) == SSA_NAME ? name : NULL_TREE);
1018
}
1019
 
1020
/* Returns true if REF is a valid initializer for ROOT with given DISTANCE (in
1021
   iterations of the innermost enclosing loop).  */
1022
 
1023
static bool
1024
valid_initializer_p (struct data_reference *ref,
1025
                     unsigned distance, struct data_reference *root)
1026
{
1027
  aff_tree diff, base, step;
1028
  double_int off;
1029
 
1030
  /* Both REF and ROOT must be accessing the same object.  */
1031
  if (!operand_equal_p (DR_BASE_ADDRESS (ref), DR_BASE_ADDRESS (root), 0))
1032
    return false;
1033
 
1034
  /* The initializer is defined outside of loop, hence its address must be
1035
     invariant inside the loop.  */
1036
  gcc_assert (integer_zerop (DR_STEP (ref)));
1037
 
1038
  /* If the address of the reference is invariant, initializer must access
1039
     exactly the same location.  */
1040
  if (integer_zerop (DR_STEP (root)))
1041
    return (operand_equal_p (DR_OFFSET (ref), DR_OFFSET (root), 0)
1042
            && operand_equal_p (DR_INIT (ref), DR_INIT (root), 0));
1043
 
1044
  /* Verify that this index of REF is equal to the root's index at
1045
     -DISTANCE-th iteration.  */
1046
  aff_combination_dr_offset (root, &diff);
1047
  aff_combination_dr_offset (ref, &base);
1048
  aff_combination_scale (&base, double_int_minus_one);
1049
  aff_combination_add (&diff, &base);
1050
 
1051
  tree_to_aff_combination_expand (DR_STEP (root), sizetype, &step,
1052
                                  &name_expansions);
1053
  if (!aff_combination_constant_multiple_p (&diff, &step, &off))
1054
    return false;
1055
 
1056
  if (!double_int_equal_p (off, uhwi_to_double_int (distance)))
1057
    return false;
1058
 
1059
  return true;
1060
}
1061
 
1062
/* Finds the looparound phi node of LOOP that copies the value of REF, and if its
1063
   initial value is correct (equal to initial value of REF shifted by one
1064
   iteration), returns the phi node.  Otherwise, NULL_TREE is returned.  ROOT
1065
   is the root of the current chain.  */
1066
 
1067
static gimple
1068
find_looparound_phi (struct loop *loop, dref ref, dref root)
1069
{
1070
  tree name, init, init_ref;
1071
  gimple phi = NULL, init_stmt;
1072
  edge latch = loop_latch_edge (loop);
1073
  struct data_reference init_dr;
1074
  gimple_stmt_iterator psi;
1075
 
1076
  if (is_gimple_assign (ref->stmt))
1077
    {
1078
      if (DR_IS_READ (ref->ref))
1079
        name = gimple_assign_lhs (ref->stmt);
1080
      else
1081
        name = gimple_assign_rhs1 (ref->stmt);
1082
    }
1083
  else
1084
    name = PHI_RESULT (ref->stmt);
1085
  if (!name)
1086
    return NULL;
1087
 
1088
  for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
1089
    {
1090
      phi = gsi_stmt (psi);
1091
      if (PHI_ARG_DEF_FROM_EDGE (phi, latch) == name)
1092
        break;
1093
    }
1094
 
1095
  if (gsi_end_p (psi))
1096
    return NULL;
1097
 
1098
  init = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
1099
  if (TREE_CODE (init) != SSA_NAME)
1100
    return NULL;
1101
  init_stmt = SSA_NAME_DEF_STMT (init);
1102
  if (gimple_code (init_stmt) != GIMPLE_ASSIGN)
1103
    return NULL;
1104
  gcc_assert (gimple_assign_lhs (init_stmt) == init);
1105
 
1106
  init_ref = gimple_assign_rhs1 (init_stmt);
1107
  if (!REFERENCE_CLASS_P (init_ref)
1108
      && !DECL_P (init_ref))
1109
    return NULL;
1110
 
1111
  /* Analyze the behavior of INIT_REF with respect to LOOP (innermost
1112
     loop enclosing PHI).  */
1113
  memset (&init_dr, 0, sizeof (struct data_reference));
1114
  DR_REF (&init_dr) = init_ref;
1115
  DR_STMT (&init_dr) = phi;
1116
  if (!dr_analyze_innermost (&init_dr))
1117
    return NULL;
1118
 
1119
  if (!valid_initializer_p (&init_dr, ref->distance + 1, root->ref))
1120
    return NULL;
1121
 
1122
  return phi;
1123
}
1124
 
1125
/* Adds a reference for the looparound copy of REF in PHI to CHAIN.  */
1126
 
1127
static void
1128
insert_looparound_copy (chain_p chain, dref ref, gimple phi)
1129
{
1130
  dref nw = XCNEW (struct dref_d), aref;
1131
  unsigned i;
1132
 
1133
  nw->stmt = phi;
1134
  nw->distance = ref->distance + 1;
1135
  nw->always_accessed = 1;
1136
 
1137
  for (i = 0; VEC_iterate (dref, chain->refs, i, aref); i++)
1138
    if (aref->distance >= nw->distance)
1139
      break;
1140
  VEC_safe_insert (dref, heap, chain->refs, i, nw);
1141
 
1142
  if (nw->distance > chain->length)
1143
    {
1144
      chain->length = nw->distance;
1145
      chain->has_max_use_after = false;
1146
    }
1147
}
1148
 
1149
/* For references in CHAIN that are copied around the LOOP (created previously
1150
   by PRE, or by user), add the results of such copies to the chain.  This
1151
   enables us to remove the copies by unrolling, and may need fewer registers
1152
   (also, it may allow us to combine chains together).  */
1153
 
1154
static void
1155
add_looparound_copies (struct loop *loop, chain_p chain)
1156
{
1157
  unsigned i;
1158
  dref ref, root = get_chain_root (chain);
1159
  gimple phi;
1160
 
1161
  for (i = 0; VEC_iterate (dref, chain->refs, i, ref); i++)
1162
    {
1163
      phi = find_looparound_phi (loop, ref, root);
1164
      if (!phi)
1165
        continue;
1166
 
1167
      bitmap_set_bit (looparound_phis, SSA_NAME_VERSION (PHI_RESULT (phi)));
1168
      insert_looparound_copy (chain, ref, phi);
1169
    }
1170
}
1171
 
1172
/* Find roots of the values and determine distances in the component COMP.
1173
   The references are redistributed into CHAINS.  LOOP is the current
1174
   loop.  */
1175
 
1176
static void
1177
determine_roots_comp (struct loop *loop,
1178
                      struct component *comp,
1179
                      VEC (chain_p, heap) **chains)
1180
{
1181
  unsigned i;
1182
  dref a;
1183
  chain_p chain = NULL;
1184
  double_int last_ofs = double_int_zero;
1185
 
1186
  /* Invariants are handled specially.  */
1187
  if (comp->comp_step == RS_INVARIANT)
1188
    {
1189
      chain = make_invariant_chain (comp);
1190
      VEC_safe_push (chain_p, heap, *chains, chain);
1191
      return;
1192
    }
1193
 
1194
  qsort (VEC_address (dref, comp->refs), VEC_length (dref, comp->refs),
1195
         sizeof (dref), order_drefs);
1196
 
1197
  for (i = 0; VEC_iterate (dref, comp->refs, i, a); i++)
1198
    {
1199
      if (!chain || !DR_IS_READ (a->ref)
1200
          || double_int_ucmp (uhwi_to_double_int (MAX_DISTANCE),
1201
                              double_int_add (a->offset,
1202
                                              double_int_neg (last_ofs))) <= 0)
1203
        {
1204
          if (nontrivial_chain_p (chain))
1205
            {
1206
              add_looparound_copies (loop, chain);
1207
              VEC_safe_push (chain_p, heap, *chains, chain);
1208
            }
1209
          else
1210
            release_chain (chain);
1211
          chain = make_rooted_chain (a);
1212
          last_ofs = a->offset;
1213
          continue;
1214
        }
1215
 
1216
      add_ref_to_chain (chain, a);
1217
    }
1218
 
1219
  if (nontrivial_chain_p (chain))
1220
    {
1221
      add_looparound_copies (loop, chain);
1222
      VEC_safe_push (chain_p, heap, *chains, chain);
1223
    }
1224
  else
1225
    release_chain (chain);
1226
}
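
/* Roughly, applied to the introductory example: the component of a is
   sorted so that the a[i+2] store gets the smallest offset and roots a
   CT_STORE_LOAD chain that absorbs both loads, while in the component of d
   the d[i] write starts a chain of its own, leaving both of d's chains
   trivial, so they are released and d is not transformed.  */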
1227
 
1228
/* Find roots of the values and determine distances in components COMPS, and
1229
   separate the references into CHAINS.  LOOP is the current loop.  */
1230
 
1231
static void
1232
determine_roots (struct loop *loop,
1233
                 struct component *comps, VEC (chain_p, heap) **chains)
1234
{
1235
  struct component *comp;
1236
 
1237
  for (comp = comps; comp; comp = comp->next)
1238
    determine_roots_comp (loop, comp, chains);
1239
}
1240
 
1241
/* Replace the reference in statement STMT with temporary variable
1242
   NEW_TREE.  If SET is true, NEW_TREE is instead initialized to the value of
1243
   the reference in the statement.  IN_LHS is true if the reference
1244
   is in the lhs of STMT, false if it is in rhs.  */
1245
 
1246
static void
1247
replace_ref_with (gimple stmt, tree new_tree, bool set, bool in_lhs)
1248
{
1249
  tree val;
1250
  gimple new_stmt;
1251
  gimple_stmt_iterator bsi, psi;
1252
 
1253
  if (gimple_code (stmt) == GIMPLE_PHI)
1254
    {
1255
      gcc_assert (!in_lhs && !set);
1256
 
1257
      val = PHI_RESULT (stmt);
1258
      bsi = gsi_after_labels (gimple_bb (stmt));
1259
      psi = gsi_for_stmt (stmt);
1260
      remove_phi_node (&psi, false);
1261
 
1262
      /* Turn the phi node into GIMPLE_ASSIGN.  */
1263
      new_stmt = gimple_build_assign (val, new_tree);
1264
      gsi_insert_before (&bsi, new_stmt, GSI_NEW_STMT);
1265
      return;
1266
    }
1267
 
1268
  /* Since the reference is of gimple_reg type, it should only
1269
     appear as the lhs or rhs of a modify statement.  */
1270
  gcc_assert (is_gimple_assign (stmt));
1271
 
1272
  bsi = gsi_for_stmt (stmt);
1273
 
1274
  /* If we do not need to initialize NEW_TREE, just replace the use of OLD.  */
1275
  if (!set)
1276
    {
1277
      gcc_assert (!in_lhs);
1278
      gimple_assign_set_rhs_from_tree (&bsi, new_tree);
1279
      stmt = gsi_stmt (bsi);
1280
      update_stmt (stmt);
1281
      return;
1282
    }
1283
 
1284
  if (in_lhs)
1285
    {
1286
      /* We have statement
1287
 
1288
         OLD = VAL
1289
 
1290
         If OLD is a memory reference, then VAL is gimple_val, and we transform
1291
         this to
1292
 
1293
         OLD = VAL
1294
         NEW = VAL
1295
 
1296
         Otherwise, we are replacing a combination chain,
1297
         VAL is the expression that performs the combination, and OLD is an
1298
         SSA name.  In this case, we transform the assignment to
1299
 
1300
         OLD = VAL
1301
         NEW = OLD
1302
 
1303
         */
1304
 
1305
      val = gimple_assign_lhs (stmt);
1306
      if (TREE_CODE (val) != SSA_NAME)
1307
        {
1308
          gcc_assert (gimple_assign_copy_p (stmt));
1309
          val = gimple_assign_rhs1 (stmt);
1310
        }
1311
    }
1312
  else
1313
    {
1314
      /* VAL = OLD
1315
 
1316
         is transformed to
1317
 
1318
         VAL = OLD
1319
         NEW = VAL  */
1320
 
1321
      val = gimple_assign_lhs (stmt);
1322
    }
1323
 
1324
  new_stmt = gimple_build_assign (new_tree, unshare_expr (val));
1325
  gsi_insert_after (&bsi, new_stmt, GSI_NEW_STMT);
1326
}
1327
 
1328
/* Returns the reference REF rewritten for the ITER-th iteration of
1329
   LOOP, or NULL if we fail to determine it (ITER may be negative).  We
1330
   try to preserve the original shape of the reference (not rewrite it
1331
   as an indirect ref to the address), to make tree_could_trap_p in
1332
   prepare_initializers_chain return false more often.  */
1333
 
1334
static tree
1335
ref_at_iteration (struct loop *loop, tree ref, int iter)
1336
{
1337
  tree idx, *idx_p, type, val, op0 = NULL_TREE, ret;
1338
  affine_iv iv;
1339
  bool ok;
1340
 
1341
  if (handled_component_p (ref))
1342
    {
1343
      op0 = ref_at_iteration (loop, TREE_OPERAND (ref, 0), iter);
1344
      if (!op0)
1345
        return NULL_TREE;
1346
    }
1347
  else if (!INDIRECT_REF_P (ref))
1348
    return unshare_expr (ref);
1349
 
1350
  if (INDIRECT_REF_P (ref))
1351
    {
1352
      /* Take care for INDIRECT_REF and MISALIGNED_INDIRECT_REF at
1353
         the same time.  */
1354
      ret = copy_node (ref);
1355
      idx = TREE_OPERAND (ref, 0);
1356
      idx_p = &TREE_OPERAND (ret, 0);
1357
    }
1358
  else if (TREE_CODE (ref) == COMPONENT_REF)
1359
    {
1360
      /* Check that the offset is loop invariant.  */
1361
      if (TREE_OPERAND (ref, 2)
1362
          && !expr_invariant_in_loop_p (loop, TREE_OPERAND (ref, 2)))
1363
        return NULL_TREE;
1364
 
1365
      return build3 (COMPONENT_REF, TREE_TYPE (ref), op0,
1366
                     unshare_expr (TREE_OPERAND (ref, 1)),
1367
                     unshare_expr (TREE_OPERAND (ref, 2)));
1368
    }
1369
  else if (TREE_CODE (ref) == ARRAY_REF)
1370
    {
1371
      /* Check that the lower bound and the step are loop invariant.  */
1372
      if (TREE_OPERAND (ref, 2)
1373
          && !expr_invariant_in_loop_p (loop, TREE_OPERAND (ref, 2)))
1374
        return NULL_TREE;
1375
      if (TREE_OPERAND (ref, 3)
1376
          && !expr_invariant_in_loop_p (loop, TREE_OPERAND (ref, 3)))
1377
        return NULL_TREE;
1378
 
1379
      ret = build4 (ARRAY_REF, TREE_TYPE (ref), op0, NULL_TREE,
1380
                    unshare_expr (TREE_OPERAND (ref, 2)),
1381
                    unshare_expr (TREE_OPERAND (ref, 3)));
1382
      idx = TREE_OPERAND (ref, 1);
1383
      idx_p = &TREE_OPERAND (ret, 1);
1384
    }
1385
  else
1386
    return NULL_TREE;
1387
 
1388
  ok = simple_iv (loop, loop, idx, &iv, true);
1389
  if (!ok)
1390
    return NULL_TREE;
1391
  iv.base = expand_simple_operations (iv.base);
1392
  if (integer_zerop (iv.step))
1393
    *idx_p = unshare_expr (iv.base);
1394
  else
1395
    {
1396
      type = TREE_TYPE (iv.base);
1397
      if (POINTER_TYPE_P (type))
1398
        {
1399
          val = fold_build2 (MULT_EXPR, sizetype, iv.step,
1400
                             size_int (iter));
1401
          val = fold_build2 (POINTER_PLUS_EXPR, type, iv.base, val);
1402
        }
1403
      else
1404
        {
1405
          val = fold_build2 (MULT_EXPR, type, iv.step,
1406
                             build_int_cst_type (type, iter));
1407
          val = fold_build2 (PLUS_EXPR, type, iv.base, val);
1408
        }
1409
      *idx_p = unshare_expr (val);
1410
    }
1411
 
1412
  return ret;
1413
}
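
/* This is, roughly, how the a[0] and a[1] initializers from the
   introductory example are obtained: the root reference a[i+2], with i
   starting at 0 and stepping by 1, evaluated at ITER == -2 and ITER == -1
   yields a[0] and a[1], while a reference with an invariant index such as
   b[10] comes back essentially unchanged.  */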
1414
 
1415
/* Get the initialization expression for the INDEX-th temporary variable
1416
   of CHAIN.  */
1417
 
1418
static tree
1419
get_init_expr (chain_p chain, unsigned index)
1420
{
1421
  if (chain->type == CT_COMBINATION)
1422
    {
1423
      tree e1 = get_init_expr (chain->ch1, index);
1424
      tree e2 = get_init_expr (chain->ch2, index);
1425
 
1426
      return fold_build2 (chain->op, chain->rslt_type, e1, e2);
1427
    }
1428
  else
1429
    return VEC_index (tree, chain->inits, index);
1430
}
1431
 
1432
/* Marks all virtual operands of statement STMT for renaming.  */
1433
 
1434
void
1435
mark_virtual_ops_for_renaming (gimple stmt)
1436
{
1437
  tree var;
1438
 
1439
  if (gimple_code (stmt) == GIMPLE_PHI)
1440
    {
1441
      var = PHI_RESULT (stmt);
1442
      if (is_gimple_reg (var))
1443
        return;
1444
 
1445
      if (TREE_CODE (var) == SSA_NAME)
1446
        var = SSA_NAME_VAR (var);
1447
      mark_sym_for_renaming (var);
1448
      return;
1449
    }
1450
 
1451
  update_stmt (stmt);
1452
  if (gimple_vuse (stmt))
1453
    mark_sym_for_renaming (gimple_vop (cfun));
1454
}
1455
 
1456
/* Returns a new temporary variable used for the I-th variable carrying
1457
   value of REF.  The variable's uid is marked in TMP_VARS.  */
1458
 
1459
static tree
1460
predcom_tmp_var (tree ref, unsigned i, bitmap tmp_vars)
1461
{
1462
  tree type = TREE_TYPE (ref);
1463
  tree var = create_tmp_var (type, get_lsm_tmp_name (ref, i));
1464
 
1465
  /* We never access the components of the temporary variable in predictive
1466
     commoning.  */
1467
  if (TREE_CODE (type) == COMPLEX_TYPE
1468
      || TREE_CODE (type) == VECTOR_TYPE)
1469
    DECL_GIMPLE_REG_P (var) = 1;
1470
 
1471
  add_referenced_var (var);
1472
  bitmap_set_bit (tmp_vars, DECL_UID (var));
1473
  return var;
1474
}
1475
 
1476
/* Creates the variables for CHAIN, as well as phi nodes for them and
1477
   initialization on entry to LOOP.  Uids of the newly created
1478
   temporary variables are marked in TMP_VARS.  */
1479
 
1480
static void
1481
initialize_root_vars (struct loop *loop, chain_p chain, bitmap tmp_vars)
1482
{
1483
  unsigned i;
1484
  unsigned n = chain->length;
1485
  dref root = get_chain_root (chain);
1486
  bool reuse_first = !chain->has_max_use_after;
1487
  tree ref, init, var, next;
1488
  gimple phi;
1489
  gimple_seq stmts;
1490
  edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);
1491
 
1492
  /* If N == 0, then all the references are within the single iteration.  And
1493
     since this is a nonempty chain, reuse_first cannot be true.  */
1494
  gcc_assert (n > 0 || !reuse_first);
1495
 
1496
  chain->vars = VEC_alloc (tree, heap, n + 1);
1497
 
1498
  if (chain->type == CT_COMBINATION)
1499
    ref = gimple_assign_lhs (root->stmt);
1500
  else
1501
    ref = DR_REF (root->ref);
1502
 
1503
  for (i = 0; i < n + (reuse_first ? 0 : 1); i++)
1504
    {
1505
      var = predcom_tmp_var (ref, i, tmp_vars);
1506
      VEC_quick_push (tree, chain->vars, var);
1507
    }
1508
  if (reuse_first)
1509
    VEC_quick_push (tree, chain->vars, VEC_index (tree, chain->vars, 0));
1510
 
1511
  for (i = 0; VEC_iterate (tree, chain->vars, i, var); i++)
1512
    VEC_replace (tree, chain->vars, i, make_ssa_name (var, NULL));
1513
 
1514
  for (i = 0; i < n; i++)
1515
    {
1516
      var = VEC_index (tree, chain->vars, i);
1517
      next = VEC_index (tree, chain->vars, i + 1);
1518
      init = get_init_expr (chain, i);
1519
 
1520
      init = force_gimple_operand (init, &stmts, true, NULL_TREE);
1521
      if (stmts)
1522
        gsi_insert_seq_on_edge_immediate (entry, stmts);
1523
 
1524
      phi = create_phi_node (var, loop->header);
1525
      SSA_NAME_DEF_STMT (var) = phi;
1526
      add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
1527
      add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
1528
    }
1529
}
1530
 
1531
/* Create the variables and initialization statement for root of chain
1532
   CHAIN.  Uids of the newly created temporary variables are marked
1533
   in TMP_VARS.  */
1534
 
1535
static void
1536
initialize_root (struct loop *loop, chain_p chain, bitmap tmp_vars)
1537
{
1538
  dref root = get_chain_root (chain);
1539
  bool in_lhs = (chain->type == CT_STORE_LOAD
1540
                 || chain->type == CT_COMBINATION);
1541
 
1542
  initialize_root_vars (loop, chain, tmp_vars);
1543
  replace_ref_with (root->stmt,
1544
                    VEC_index (tree, chain->vars, chain->length),
1545
                    true, in_lhs);
1546
}
1547
 
1548
/* Initializes a variable for load motion for ROOT and prepares phi nodes and
1549
   initialization on entry to LOOP if necessary.  The ssa name for the variable
1550
   is stored in VARS.  If WRITTEN is true, also a phi node to copy its value
1551
   around the loop is created.  Uid of the newly created temporary variable
1552
   is marked in TMP_VARS.  INITS is the list containing the (single)
1553
   initializer.  */
1554
 
1555
static void
1556
initialize_root_vars_lm (struct loop *loop, dref root, bool written,
1557
                         VEC(tree, heap) **vars, VEC(tree, heap) *inits,
1558
                         bitmap tmp_vars)
1559
{
1560
  unsigned i;
1561
  tree ref = DR_REF (root->ref), init, var, next;
1562
  gimple_seq stmts;
1563
  gimple phi;
1564
  edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);
1565
 
1566
  /* Find the initializer for the variable, and check that it cannot
1567
     trap.  */
1568
  init = VEC_index (tree, inits, 0);
1569
 
1570
  *vars = VEC_alloc (tree, heap, written ? 2 : 1);
1571
  var = predcom_tmp_var (ref, 0, tmp_vars);
1572
  VEC_quick_push (tree, *vars, var);
1573
  if (written)
1574
    VEC_quick_push (tree, *vars, VEC_index (tree, *vars, 0));
1575
 
1576
  for (i = 0; VEC_iterate (tree, *vars, i, var); i++)
1577
    VEC_replace (tree, *vars, i, make_ssa_name (var, NULL));
1578
 
1579
  var = VEC_index (tree, *vars, 0);
1580
 
1581
  init = force_gimple_operand (init, &stmts, written, NULL_TREE);
1582
  if (stmts)
1583
    gsi_insert_seq_on_edge_immediate (entry, stmts);
1584
 
1585
  if (written)
1586
    {
1587
      next = VEC_index (tree, *vars, 1);
1588
      phi = create_phi_node (var, loop->header);
1589
      SSA_NAME_DEF_STMT (var) = phi;
1590
      add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
1591
      add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
1592
    }
1593
  else
1594
    {
1595
      gimple init_stmt = gimple_build_assign (var, init);
1596
      mark_virtual_ops_for_renaming (init_stmt);
1597
      gsi_insert_on_edge_immediate (entry, init_stmt);
1598
    }
1599
}
1600
 
1601
 
1602
/* Execute load motion for references in chain CHAIN.  Uids of the newly
1603
   created temporary variables are marked in TMP_VARS.  */
1604
 
1605
static void
1606
execute_load_motion (struct loop *loop, chain_p chain, bitmap tmp_vars)
1607
{
1608
  VEC (tree, heap) *vars;
1609
  dref a;
1610
  unsigned n_writes = 0, ridx, i;
1611
  tree var;
1612
 
1613
  gcc_assert (chain->type == CT_INVARIANT);
1614
  gcc_assert (!chain->combined);
1615
  for (i = 0; VEC_iterate (dref, chain->refs, i, a); i++)
1616
    if (!DR_IS_READ (a->ref))
1617
      n_writes++;
1618
 
1619
  /* If there are no reads in the loop, there is nothing to do.  */
1620
  if (n_writes == VEC_length (dref, chain->refs))
1621
    return;
1622
 
1623
  initialize_root_vars_lm (loop, get_chain_root (chain), n_writes > 0,
1624
                           &vars, chain->inits, tmp_vars);
1625
 
1626
  ridx = 0;
1627
  for (i = 0; VEC_iterate (dref, chain->refs, i, a); i++)
1628
    {
1629
      bool is_read = DR_IS_READ (a->ref);
1630
      mark_virtual_ops_for_renaming (a->stmt);
1631
 
1632
      if (!DR_IS_READ (a->ref))
1633
        {
1634
          n_writes--;
1635
          if (n_writes)
1636
            {
1637
              var = VEC_index (tree, vars, 0);
1638
              var = make_ssa_name (SSA_NAME_VAR (var), NULL);
1639
              VEC_replace (tree, vars, 0, var);
1640
            }
1641
          else
1642
            ridx = 1;
1643
        }
1644
 
1645
      replace_ref_with (a->stmt, VEC_index (tree, vars, ridx),
1646
                        !is_read, !is_read);
1647
    }
1648
 
1649
  VEC_free (tree, heap, vars);
1650
}
1651
 
1652
/* Returns the single statement in which NAME is used, excepting
1653
   the looparound phi nodes contained in one of the chains.  If there is no
1654
   such statement, or more statements, NULL is returned.  */
1655
 
1656
static gimple
1657
single_nonlooparound_use (tree name)
1658
{
1659
  use_operand_p use;
1660
  imm_use_iterator it;
1661
  gimple stmt, ret = NULL;
1662
 
1663
  FOR_EACH_IMM_USE_FAST (use, it, name)
1664
    {
1665
      stmt = USE_STMT (use);
1666
 
1667
      if (gimple_code (stmt) == GIMPLE_PHI)
1668
        {
1669
          /* Ignore uses in looparound phi nodes.  Uses in other phi nodes
1670
             could not be processed anyway, so just fail for them.  */
1671
          if (bitmap_bit_p (looparound_phis,
1672
                            SSA_NAME_VERSION (PHI_RESULT (stmt))))
1673
            continue;
1674
 
1675
          return NULL;
1676
        }
1677
      else if (ret != NULL)
1678
        return NULL;
1679
      else
1680
        ret = stmt;
1681
    }
1682
 
1683
  return ret;
1684
}
1685
 
1686
/* Remove statement STMT, as well as the chain of assignments in which it is
1687
   used.  */
1688
 
1689
static void
1690
remove_stmt (gimple stmt)
1691
{
1692
  tree name;
1693
  gimple next;
1694
  gimple_stmt_iterator psi;
1695
 
1696
  if (gimple_code (stmt) == GIMPLE_PHI)
1697
    {
1698
      name = PHI_RESULT (stmt);
1699
      next = single_nonlooparound_use (name);
1700
      psi = gsi_for_stmt (stmt);
1701
      remove_phi_node (&psi, true);
1702
 
1703
      if (!next
1704
          || !gimple_assign_ssa_name_copy_p (next)
1705
          || gimple_assign_rhs1 (next) != name)
1706
        return;
1707
 
1708
      stmt = next;
1709
    }
1710
 
1711
  while (1)
1712
    {
1713
      gimple_stmt_iterator bsi;
1714
 
1715
      bsi = gsi_for_stmt (stmt);
1716
 
1717
      name = gimple_assign_lhs (stmt);
1718
      gcc_assert (TREE_CODE (name) == SSA_NAME);
1719
 
1720
      next = single_nonlooparound_use (name);
1721
 
1722
      mark_virtual_ops_for_renaming (stmt);
1723
      gsi_remove (&bsi, true);
1724
      release_defs (stmt);
1725
 
1726
      if (!next
1727
          || !gimple_assign_ssa_name_copy_p (next)
1728
          || gimple_assign_rhs1 (next) != name)
1729
        return;
1730
 
1731
      stmt = next;
1732
    }
1733
}
1734
 
1735
/* Perform the predictive commoning optimization for a chain CHAIN.
   Uids of the newly created temporary variables are marked in TMP_VARS.  */

static void
execute_pred_commoning_chain (struct loop *loop, chain_p chain,
                              bitmap tmp_vars)
{
  unsigned i;
  dref a, root;
  tree var;

  if (chain->combined)
    {
      /* For combined chains, just remove the statements that are used to
         compute the values of the expression (except for the root one).  */
      for (i = 1; VEC_iterate (dref, chain->refs, i, a); i++)
        remove_stmt (a->stmt);
    }
  else
    {
      /* For non-combined chains, set up the variables that hold its value,
         and replace the uses of the original references by these
         variables.  */
      root = get_chain_root (chain);
      mark_virtual_ops_for_renaming (root->stmt);

      initialize_root (loop, chain, tmp_vars);
      for (i = 1; VEC_iterate (dref, chain->refs, i, a); i++)
        {
          mark_virtual_ops_for_renaming (a->stmt);
          var = VEC_index (tree, chain->vars, chain->length - a->distance);
          replace_ref_with (a->stmt, var, false, false);
        }
    }
}

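/* A minimal source-level sketch (hypothetical function and arrays, not
   part of GCC) of the effect of the non-combined case above: a chain of
   reads x[i] and x[i + 1] with distance 1 is carried in scalars across
   iterations, so each iteration issues a single load plus the register
   copy on the loop latch that later unrolling removes.  The sketch
   assumes the array is accessible before the first iteration; that is
   exactly what prepare_initializers_chain below checks.  */

static void
pc_sketch_before (int *y, const int *x, int n)
{
  int i;
  for (i = 0; i < n; i++)
    y[i] = x[i] * x[i + 1];
}

static void
pc_sketch_after (int *y, const int *x, int n)
{
  int i;
  int t0 = x[0];                /* initializer created in the preheader */
  for (i = 0; i < n; i++)
    {
      int t1 = x[i + 1];
      y[i] = t0 * t1;
      t0 = t1;                  /* register copy on the loop latch */
    }
}
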
/* Determines the unroll factor necessary to remove as many temporary variable
   copies as possible.  CHAINS is the list of chains that will be
   optimized.  */

static unsigned
determine_unroll_factor (VEC (chain_p, heap) *chains)
{
  chain_p chain;
  unsigned factor = 1, af, nfactor, i;
  unsigned max = PARAM_VALUE (PARAM_MAX_UNROLL_TIMES);

  for (i = 0; VEC_iterate (chain_p, chains, i, chain); i++)
    {
      if (chain->type == CT_INVARIANT || chain->combined)
        continue;

      /* The best unroll factor for this chain is equal to the number of
         temporary variables that we create for it.  */
      af = chain->length;
      if (chain->has_max_use_after)
        af++;

      nfactor = factor * af / gcd (factor, af);
      if (nfactor <= max)
        factor = nfactor;
    }

  return factor;
}

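/* A standalone sketch (hypothetical helper names, not GCC code) of how
   the per-chain factors are merged above: the result is the least
   common multiple of the individual factors, except that a chain is
   skipped when including it would push the factor beyond the
   PARAM_MAX_UNROLL_TIMES bound.  */

static unsigned
uf_sketch_gcd (unsigned a, unsigned b)
{
  while (b)
    {
      unsigned t = a % b;
      a = b;
      b = t;
    }
  return a;
}

static unsigned
uf_sketch (const unsigned *af, unsigned n, unsigned max)
{
  unsigned factor = 1, nfactor, i;

  for (i = 0; i < n; i++)
    {
      nfactor = factor * af[i] / uf_sketch_gcd (factor, af[i]);
      if (nfactor <= max)
        factor = nfactor;
    }
  return factor;
}

/* With per-chain factors {2, 3} and max == 8 this yields 6; with
   max == 4 the second chain is skipped and the result stays 2.  */
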
/* Perform the predictive commoning optimization for CHAINS.
   Uids of the newly created temporary variables are marked in TMP_VARS.  */

static void
execute_pred_commoning (struct loop *loop, VEC (chain_p, heap) *chains,
                        bitmap tmp_vars)
{
  chain_p chain;
  unsigned i;

  for (i = 0; VEC_iterate (chain_p, chains, i, chain); i++)
    {
      if (chain->type == CT_INVARIANT)
        execute_load_motion (loop, chain, tmp_vars);
      else
        execute_pred_commoning_chain (loop, chain, tmp_vars);
    }

  update_ssa (TODO_update_ssa_only_virtuals);
}

/* For each reference in CHAINS, if its defining statement is
   a phi node, record the ssa name that is defined by it.  */

static void
replace_phis_by_defined_names (VEC (chain_p, heap) *chains)
{
  chain_p chain;
  dref a;
  unsigned i, j;

  for (i = 0; VEC_iterate (chain_p, chains, i, chain); i++)
    for (j = 0; VEC_iterate (dref, chain->refs, j, a); j++)
      {
        if (gimple_code (a->stmt) == GIMPLE_PHI)
          {
            a->name_defined_by_phi = PHI_RESULT (a->stmt);
            a->stmt = NULL;
          }
      }
}

/* For each reference in CHAINS, if name_defined_by_phi is not
   NULL, use it to set the stmt field.  */

static void
replace_names_by_phis (VEC (chain_p, heap) *chains)
{
  chain_p chain;
  dref a;
  unsigned i, j;

  for (i = 0; VEC_iterate (chain_p, chains, i, chain); i++)
    for (j = 0; VEC_iterate (dref, chain->refs, j, a); j++)
      if (a->stmt == NULL)
        {
          a->stmt = SSA_NAME_DEF_STMT (a->name_defined_by_phi);
          gcc_assert (gimple_code (a->stmt) == GIMPLE_PHI);
          a->name_defined_by_phi = NULL_TREE;
        }
}

/* Wrapper over execute_pred_commoning, to pass it as a callback
   to tree_transform_and_unroll_loop.  */

struct epcc_data
{
  VEC (chain_p, heap) *chains;
  bitmap tmp_vars;
};

static void
execute_pred_commoning_cbck (struct loop *loop, void *data)
{
  struct epcc_data *const dta = (struct epcc_data *) data;

  /* Restore phi nodes that were replaced by ssa names before
     tree_transform_and_unroll_loop (see detailed description in
     tree_predictive_commoning_loop).  */
  replace_names_by_phis (dta->chains);
  execute_pred_commoning (loop, dta->chains, dta->tmp_vars);
}

/* Base NAME and all the names in the chain of phi nodes that use it
   on variable VAR.  The phi nodes are recognized by being in the copies of
   the header of the LOOP.  */

static void
base_names_in_chain_on (struct loop *loop, tree name, tree var)
{
  gimple stmt, phi;
  imm_use_iterator iter;

  SSA_NAME_VAR (name) = var;

  while (1)
    {
      phi = NULL;
      FOR_EACH_IMM_USE_STMT (stmt, iter, name)
        {
          if (gimple_code (stmt) == GIMPLE_PHI
              && flow_bb_inside_loop_p (loop, gimple_bb (stmt)))
            {
              phi = stmt;
              BREAK_FROM_IMM_USE_STMT (iter);
            }
        }
      if (!phi)
        return;

      name = PHI_RESULT (phi);
      SSA_NAME_VAR (name) = var;
    }
}

/* Given an unrolled LOOP after predictive commoning, remove the
   register copies arising from phi nodes by changing the base
   variables of SSA names.  TMP_VARS is the set of the temporary variables
   for which we want to do this.  */

static void
eliminate_temp_copies (struct loop *loop, bitmap tmp_vars)
{
  edge e;
  gimple phi, stmt;
  tree name, use, var;
  gimple_stmt_iterator psi;

  e = loop_latch_edge (loop);
  for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
    {
      phi = gsi_stmt (psi);
      name = PHI_RESULT (phi);
      var = SSA_NAME_VAR (name);
      if (!bitmap_bit_p (tmp_vars, DECL_UID (var)))
        continue;
      use = PHI_ARG_DEF_FROM_EDGE (phi, e);
      gcc_assert (TREE_CODE (use) == SSA_NAME);

      /* Base all the ssa names in the ud and du chain of NAME on VAR.  */
      stmt = SSA_NAME_DEF_STMT (use);
      while (gimple_code (stmt) == GIMPLE_PHI
             /* In case we could not unroll the loop enough to eliminate
                all copies, we may reach the loop header before the defining
                statement (in that case, some register copies will be present
                in loop latch in the final code, corresponding to the newly
                created looparound phi nodes).  */
             && gimple_bb (stmt) != loop->header)
        {
          gcc_assert (single_pred_p (gimple_bb (stmt)));
          use = PHI_ARG_DEF (stmt, 0);
          stmt = SSA_NAME_DEF_STMT (use);
        }

      base_names_in_chain_on (loop, use, var);
    }
}

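/* A source-level sketch (hypothetical, continuing the pc_sketch_*
   example above) of why unrolling together with the rebasing done in
   eliminate_temp_copies removes the latch copy: with the body
   duplicated, the two scalars simply swap roles between the copies, so
   no t0 = t1 assignment survives in the unrolled part.  */

static void
pc_sketch_after_unrolling (int *y, const int *x, int n)
{
  int i;
  int t0 = x[0], t1;

  for (i = 0; i + 2 <= n; i += 2)
    {
      t1 = x[i + 1];
      y[i] = t0 * t1;
      t0 = x[i + 2];
      y[i + 1] = t1 * t0;
    }
  for (; i < n; i++)            /* remainder iterations keep the copy */
    {
      t1 = x[i + 1];
      y[i] = t0 * t1;
      t0 = t1;
    }
}
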
/* Returns true if CHAIN is suitable to be combined.  */

static bool
chain_can_be_combined_p (chain_p chain)
{
  return (!chain->combined
          && (chain->type == CT_LOAD || chain->type == CT_COMBINATION));
}

/* Returns the modify statement that uses NAME.  Skips over copy
   assignments; NAME is replaced with the actual name used in the
   returned statement.  */

static gimple
find_use_stmt (tree *name)
{
  gimple stmt;
  tree rhs, lhs;

  /* Skip over assignments.  */
  while (1)
    {
      stmt = single_nonlooparound_use (*name);
      if (!stmt)
        return NULL;

      if (gimple_code (stmt) != GIMPLE_ASSIGN)
        return NULL;

      lhs = gimple_assign_lhs (stmt);
      if (TREE_CODE (lhs) != SSA_NAME)
        return NULL;

      if (gimple_assign_copy_p (stmt))
        {
          rhs = gimple_assign_rhs1 (stmt);
          if (rhs != *name)
            return NULL;

          *name = lhs;
        }
      else if (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
               == GIMPLE_BINARY_RHS)
        return stmt;
      else
        return NULL;
    }
}

/* Returns true if we may perform reassociation for operation CODE in TYPE.  */

static bool
may_reassociate_p (tree type, enum tree_code code)
{
  if (FLOAT_TYPE_P (type)
      && !flag_unsafe_math_optimizations)
    return false;

  return (commutative_tree_code (code)
          && associative_tree_code (code));
}

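/* A standalone reminder (hypothetical example, not GCC code) of why the
   check above refuses to reassociate floating-point operations unless
   -funsafe-math-optimizations is in effect: IEEE addition is not
   associative, so changing the association can change the result.  */

static double
reassoc_fp_demo (void)
{
  double a = 0.1, b = 0.2, c = 0.3;
  double left = (a + b) + c;    /* 0.6000000000000001 in IEEE double */
  double right = a + (b + c);   /* 0.6 */

  return left - right;          /* nonzero: the associations differ */
}
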
/* If the operation used in STMT is associative and commutative, walk up the
   tree of statements that use the same operation and return its root.  The
   distance to the root is stored in DISTANCE.  */

static gimple
find_associative_operation_root (gimple stmt, unsigned *distance)
{
  tree lhs;
  gimple next;
  enum tree_code code = gimple_assign_rhs_code (stmt);
  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
  unsigned dist = 0;

  if (!may_reassociate_p (type, code))
    return NULL;

  while (1)
    {
      lhs = gimple_assign_lhs (stmt);
      gcc_assert (TREE_CODE (lhs) == SSA_NAME);

      next = find_use_stmt (&lhs);
      if (!next
          || gimple_assign_rhs_code (next) != code)
        break;

      stmt = next;
      dist++;
    }

  if (distance)
    *distance = dist;
  return stmt;
}

/* Returns the common statement in which NAME1 and NAME2 have a use.  If there
   is no such statement, returns NULL_TREE.  In case the operation used on
   NAME1 and NAME2 is associative and commutative, returns the root of the
   tree formed by this operation instead of the statement that uses NAME1 or
   NAME2.  */

static gimple
find_common_use_stmt (tree *name1, tree *name2)
{
  gimple stmt1, stmt2;

  stmt1 = find_use_stmt (name1);
  if (!stmt1)
    return NULL;

  stmt2 = find_use_stmt (name2);
  if (!stmt2)
    return NULL;

  if (stmt1 == stmt2)
    return stmt1;

  stmt1 = find_associative_operation_root (stmt1, NULL);
  if (!stmt1)
    return NULL;
  stmt2 = find_associative_operation_root (stmt2, NULL);
  if (!stmt2)
    return NULL;

  return (stmt1 == stmt2 ? stmt1 : NULL);
}

/* Checks whether R1 and R2 are combined together using CODE, with the result
   in RSLT_TYPE, in order R1 CODE R2 if SWAP is false and in order R2 CODE R1
   if it is true.  If CODE is ERROR_MARK, set these values instead.  */

static bool
combinable_refs_p (dref r1, dref r2,
                   enum tree_code *code, bool *swap, tree *rslt_type)
{
  enum tree_code acode;
  bool aswap;
  tree atype;
  tree name1, name2;
  gimple stmt;

  name1 = name_for_ref (r1);
  name2 = name_for_ref (r2);
  gcc_assert (name1 != NULL_TREE && name2 != NULL_TREE);

  stmt = find_common_use_stmt (&name1, &name2);

  if (!stmt)
    return false;

  acode = gimple_assign_rhs_code (stmt);
  aswap = (!commutative_tree_code (acode)
           && gimple_assign_rhs1 (stmt) != name1);
  atype = TREE_TYPE (gimple_assign_lhs (stmt));

  if (*code == ERROR_MARK)
    {
      *code = acode;
      *swap = aswap;
      *rslt_type = atype;
      return true;
    }

  return (*code == acode
          && *swap == aswap
          && *rslt_type == atype);
}

/* Remove OP from the operation on rhs of STMT, and replace STMT with
   an assignment of the remaining operand.  */

static void
remove_name_from_operation (gimple stmt, tree op)
{
  tree other_op;
  gimple_stmt_iterator si;

  gcc_assert (is_gimple_assign (stmt));

  if (gimple_assign_rhs1 (stmt) == op)
    other_op = gimple_assign_rhs2 (stmt);
  else
    other_op = gimple_assign_rhs1 (stmt);

  si = gsi_for_stmt (stmt);
  gimple_assign_set_rhs_from_tree (&si, other_op);

  /* We should not have reallocated STMT.  */
  gcc_assert (gsi_stmt (si) == stmt);

  update_stmt (stmt);
}

/* Reassociates the expression in which NAME1 and NAME2 are used so that they
   are combined in a single statement, and returns this statement.  */

static gimple
reassociate_to_the_same_stmt (tree name1, tree name2)
{
  gimple stmt1, stmt2, root1, root2, s1, s2;
  gimple new_stmt, tmp_stmt;
  tree new_name, tmp_name, var, r1, r2;
  unsigned dist1, dist2;
  enum tree_code code;
  tree type = TREE_TYPE (name1);
  gimple_stmt_iterator bsi;

  stmt1 = find_use_stmt (&name1);
  stmt2 = find_use_stmt (&name2);
  root1 = find_associative_operation_root (stmt1, &dist1);
  root2 = find_associative_operation_root (stmt2, &dist2);
  code = gimple_assign_rhs_code (stmt1);

  gcc_assert (root1 && root2 && root1 == root2
              && code == gimple_assign_rhs_code (stmt2));

  /* Find the root of the nearest expression in which both NAME1 and NAME2
     are used.  */
  r1 = name1;
  s1 = stmt1;
  r2 = name2;
  s2 = stmt2;

  while (dist1 > dist2)
    {
      s1 = find_use_stmt (&r1);
      r1 = gimple_assign_lhs (s1);
      dist1--;
    }
  while (dist2 > dist1)
    {
      s2 = find_use_stmt (&r2);
      r2 = gimple_assign_lhs (s2);
      dist2--;
    }

  while (s1 != s2)
    {
      s1 = find_use_stmt (&r1);
      r1 = gimple_assign_lhs (s1);
      s2 = find_use_stmt (&r2);
      r2 = gimple_assign_lhs (s2);
    }

  /* Remove NAME1 and NAME2 from the statements in which they are used
     currently.  */
  remove_name_from_operation (stmt1, name1);
  remove_name_from_operation (stmt2, name2);

  /* Insert the new statement combining NAME1 and NAME2 before S1, and
     combine it with the rhs of S1.  */
  var = create_tmp_var (type, "predreastmp");
  if (TREE_CODE (type) == COMPLEX_TYPE
      || TREE_CODE (type) == VECTOR_TYPE)
    DECL_GIMPLE_REG_P (var) = 1;
  add_referenced_var (var);
  new_name = make_ssa_name (var, NULL);
  new_stmt = gimple_build_assign_with_ops (code, new_name, name1, name2);

  var = create_tmp_var (type, "predreastmp");
  if (TREE_CODE (type) == COMPLEX_TYPE
      || TREE_CODE (type) == VECTOR_TYPE)
    DECL_GIMPLE_REG_P (var) = 1;
  add_referenced_var (var);
  tmp_name = make_ssa_name (var, NULL);

  /* Rhs of S1 may now be either a binary expression with operation
     CODE, or gimple_val (in case that stmt1 == s1 or stmt2 == s1,
     so that name1 or name2 was removed from it).  */
  tmp_stmt = gimple_build_assign_with_ops (gimple_assign_rhs_code (s1),
                                           tmp_name,
                                           gimple_assign_rhs1 (s1),
                                           gimple_assign_rhs2 (s1));

  bsi = gsi_for_stmt (s1);
  gimple_assign_set_rhs_with_ops (&bsi, code, new_name, tmp_name);
  s1 = gsi_stmt (bsi);
  update_stmt (s1);

  gsi_insert_before (&bsi, new_stmt, GSI_SAME_STMT);
  gsi_insert_before (&bsi, tmp_stmt, GSI_SAME_STMT);

  return new_stmt;
}

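/* A source-level sketch (hypothetical scalars, not GIMPLE) of the
   reassociation performed above.  Before, the two chain values x and y
   only meet at the root of an associative tree of additions; after, a
   new statement combines them directly and the rest of the tree is
   rebuilt around it.  Both functions compute a + b + x + y.  */

static int
reassoc_sketch_before (int a, int b, int x, int y)
{
  int s1 = a + x;
  int s2 = s1 + b;
  int s3 = s2 + y;
  return s3;
}

static int
reassoc_sketch_after (int a, int b, int x, int y)
{
  int n = x + y;                /* x and y now share one statement */
  int s1 = a;                   /* x was removed from its old statement */
  int s2 = s1 + b;
  int s3 = n + s2;              /* the combined value folded back in */
  return s3;
}
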
/* Returns the statement that combines references R1 and R2.  In case R1
   and R2 are not used in the same statement, but they are used with an
   associative and commutative operation in the same expression, reassociate
   the expression so that they are used in the same statement.  */

static gimple
stmt_combining_refs (dref r1, dref r2)
{
  gimple stmt1, stmt2;
  tree name1 = name_for_ref (r1);
  tree name2 = name_for_ref (r2);

  stmt1 = find_use_stmt (&name1);
  stmt2 = find_use_stmt (&name2);
  if (stmt1 == stmt2)
    return stmt1;

  return reassociate_to_the_same_stmt (name1, name2);
}

/* Tries to combine chains CH1 and CH2 together.  If this succeeds, the
   description of the new chain is returned, otherwise we return NULL.  */

static chain_p
combine_chains (chain_p ch1, chain_p ch2)
{
  dref r1, r2, nw;
  enum tree_code op = ERROR_MARK;
  bool swap = false;
  chain_p new_chain;
  unsigned i;
  gimple root_stmt;
  tree rslt_type = NULL_TREE;

  if (ch1 == ch2)
    return NULL;
  if (ch1->length != ch2->length)
    return NULL;

  if (VEC_length (dref, ch1->refs) != VEC_length (dref, ch2->refs))
    return NULL;

  for (i = 0; (VEC_iterate (dref, ch1->refs, i, r1)
               && VEC_iterate (dref, ch2->refs, i, r2)); i++)
    {
      if (r1->distance != r2->distance)
        return NULL;

      if (!combinable_refs_p (r1, r2, &op, &swap, &rslt_type))
        return NULL;
    }

  if (swap)
    {
      chain_p tmp = ch1;
      ch1 = ch2;
      ch2 = tmp;
    }

  new_chain = XCNEW (struct chain);
  new_chain->type = CT_COMBINATION;
  new_chain->op = op;
  new_chain->ch1 = ch1;
  new_chain->ch2 = ch2;
  new_chain->rslt_type = rslt_type;
  new_chain->length = ch1->length;

  for (i = 0; (VEC_iterate (dref, ch1->refs, i, r1)
               && VEC_iterate (dref, ch2->refs, i, r2)); i++)
    {
      nw = XCNEW (struct dref_d);
      nw->stmt = stmt_combining_refs (r1, r2);
      nw->distance = r1->distance;

      VEC_safe_push (dref, heap, new_chain->refs, nw);
    }

  new_chain->has_max_use_after = false;
  root_stmt = get_chain_root (new_chain)->stmt;
  for (i = 1; VEC_iterate (dref, new_chain->refs, i, nw); i++)
    {
      if (nw->distance == new_chain->length
          && !stmt_dominates_stmt_p (nw->stmt, root_stmt))
        {
          new_chain->has_max_use_after = true;
          break;
        }
    }

  ch1->combined = true;
  ch2->combined = true;
  return new_chain;
}

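/* A source-level sketch (hypothetical arrays p and q, not GCC data
   structures) of what combining two chains buys.  If the loads from p
   and q are only ever used added together, the combined chain carries
   the sums, so half as many values need to be kept live across
   iterations.  */

static void
comb_sketch_before (int *out, const int *p, const int *q, int n)
{
  int i;
  for (i = 0; i < n; i++)
    out[i] = (p[i] + q[i]) + (p[i + 1] + q[i + 1]);
}

static void
comb_sketch_after (int *out, const int *p, const int *q, int n)
{
  int i;
  int c0 = p[0] + q[0];         /* combined value carried at distance 1 */
  for (i = 0; i < n; i++)
    {
      int c1 = p[i + 1] + q[i + 1];
      out[i] = c0 + c1;
      c0 = c1;
    }
}
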
/* Try to combine the CHAINS.  */

static void
try_combine_chains (VEC (chain_p, heap) **chains)
{
  unsigned i, j;
  chain_p ch1, ch2, cch;
  VEC (chain_p, heap) *worklist = NULL;

  for (i = 0; VEC_iterate (chain_p, *chains, i, ch1); i++)
    if (chain_can_be_combined_p (ch1))
      VEC_safe_push (chain_p, heap, worklist, ch1);

  while (!VEC_empty (chain_p, worklist))
    {
      ch1 = VEC_pop (chain_p, worklist);
      if (!chain_can_be_combined_p (ch1))
        continue;

      for (j = 0; VEC_iterate (chain_p, *chains, j, ch2); j++)
        {
          if (!chain_can_be_combined_p (ch2))
            continue;

          cch = combine_chains (ch1, ch2);
          if (cch)
            {
              VEC_safe_push (chain_p, heap, worklist, cch);
              VEC_safe_push (chain_p, heap, *chains, cch);
              break;
            }
        }
    }
}

/* Prepare initializers for CHAIN in LOOP.  Returns false if this is
   impossible because one of these initializers may trap, true otherwise.  */

static bool
prepare_initializers_chain (struct loop *loop, chain_p chain)
{
  unsigned i, n = (chain->type == CT_INVARIANT) ? 1 : chain->length;
  struct data_reference *dr = get_chain_root (chain)->ref;
  tree init;
  gimple_seq stmts;
  dref laref;
  edge entry = loop_preheader_edge (loop);

  /* Find the initializers for the variables, and check that they cannot
     trap.  */
  chain->inits = VEC_alloc (tree, heap, n);
  for (i = 0; i < n; i++)
    VEC_quick_push (tree, chain->inits, NULL_TREE);

  /* If we have replaced some looparound phi nodes, use their initializers
     instead of creating our own.  */
  for (i = 0; VEC_iterate (dref, chain->refs, i, laref); i++)
    {
      if (gimple_code (laref->stmt) != GIMPLE_PHI)
        continue;

      gcc_assert (laref->distance > 0);
      VEC_replace (tree, chain->inits, n - laref->distance,
                   PHI_ARG_DEF_FROM_EDGE (laref->stmt, entry));
    }

  for (i = 0; i < n; i++)
    {
      if (VEC_index (tree, chain->inits, i) != NULL_TREE)
        continue;

      init = ref_at_iteration (loop, DR_REF (dr), (int) i - n);
      if (!init)
        return false;

      if (!chain->all_always_accessed && tree_could_trap_p (init))
        return false;

      init = force_gimple_operand (init, &stmts, false, NULL_TREE);
      if (stmts)
        gsi_insert_seq_on_edge_immediate (entry, stmts);

      VEC_replace (tree, chain->inits, i, init);
    }

  return true;
}

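/* A source-level sketch (hypothetical arrays, not GCC code) of the
   initializers prepared above for a chain of length 2 over x[i],
   x[i + 1] and x[i + 2]: the two values the loop needs from before its
   first iteration are loaded on the preheader edge.  If such a load
   could trap and the reference is not accessed on every iteration, the
   whole chain is given up instead.  */

static void
init_sketch (int *out, const int *x, int n)
{
  int i;
  int t0 = x[0];                /* initializer for the oldest value */
  int t1 = x[1];                /* initializer for the next one */
  for (i = 0; i < n; i++)
    {
      int t2 = x[i + 2];
      out[i] = t0 + t1 + t2;
      t0 = t1;
      t1 = t2;
    }
}
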
/* Prepare initializers for CHAINS in LOOP, and free chains that cannot
   be used because the initializers might trap.  */

static void
prepare_initializers (struct loop *loop, VEC (chain_p, heap) *chains)
{
  chain_p chain;
  unsigned i;

  for (i = 0; i < VEC_length (chain_p, chains); )
    {
      chain = VEC_index (chain_p, chains, i);
      if (prepare_initializers_chain (loop, chain))
        i++;
      else
        {
          release_chain (chain);
          VEC_unordered_remove (chain_p, chains, i);
        }
    }
}

/* Performs predictive commoning for LOOP.  Returns true if LOOP was
   unrolled.  */

static bool
tree_predictive_commoning_loop (struct loop *loop)
{
  VEC (data_reference_p, heap) *datarefs;
  VEC (ddr_p, heap) *dependences;
  struct component *components;
  VEC (chain_p, heap) *chains = NULL;
  unsigned unroll_factor;
  struct tree_niter_desc desc;
  bool unroll = false;
  edge exit;
  bitmap tmp_vars;

  if (dump_file && (dump_flags & TDF_DETAILS))
    fprintf (dump_file, "Processing loop %d\n", loop->num);

  /* Find the data references and split them into components according to their
     dependence relations.  */
  datarefs = VEC_alloc (data_reference_p, heap, 10);
  dependences = VEC_alloc (ddr_p, heap, 10);
  compute_data_dependences_for_loop (loop, true, &datarefs, &dependences);
  if (dump_file && (dump_flags & TDF_DETAILS))
    dump_data_dependence_relations (dump_file, dependences);

  components = split_data_refs_to_components (loop, datarefs, dependences);
  free_dependence_relations (dependences);
  if (!components)
    {
      free_data_refs (datarefs);
      return false;
    }

  if (dump_file && (dump_flags & TDF_DETAILS))
    {
      fprintf (dump_file, "Initial state:\n\n");
      dump_components (dump_file, components);
    }

  /* Find the suitable components and split them into chains.  */
  components = filter_suitable_components (loop, components);

  tmp_vars = BITMAP_ALLOC (NULL);
  looparound_phis = BITMAP_ALLOC (NULL);
  determine_roots (loop, components, &chains);
  release_components (components);

  if (!chains)
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file,
                 "Predictive commoning failed: no suitable chains\n");
      goto end;
    }
  prepare_initializers (loop, chains);

  /* Try to combine the chains that are always used together.  */
  try_combine_chains (&chains);

  if (dump_file && (dump_flags & TDF_DETAILS))
    {
      fprintf (dump_file, "Before commoning:\n\n");
      dump_chains (dump_file, chains);
    }

  /* Determine the unroll factor, and if the loop should be unrolled, ensure
     that its number of iterations is divisible by the factor.  */
  unroll_factor = determine_unroll_factor (chains);
  scev_reset ();
  unroll = (unroll_factor > 1
            && can_unroll_loop_p (loop, unroll_factor, &desc));
  exit = single_dom_exit (loop);

  /* Execute the predictive commoning transformations, and possibly unroll the
     loop.  */
  if (unroll)
    {
      struct epcc_data dta;

      if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file, "Unrolling %u times.\n", unroll_factor);

      dta.chains = chains;
      dta.tmp_vars = tmp_vars;

      update_ssa (TODO_update_ssa_only_virtuals);

      /* Cfg manipulations performed in tree_transform_and_unroll_loop before
         execute_pred_commoning_cbck is called may cause phi nodes to be
         reallocated, which is a problem since CHAINS may point to these
         statements.  To fix this, we store the ssa names defined by the
         phi nodes here instead of the phi nodes themselves, and restore
         the phi nodes in execute_pred_commoning_cbck.  A bit hacky.  */
      replace_phis_by_defined_names (chains);

      tree_transform_and_unroll_loop (loop, unroll_factor, exit, &desc,
                                      execute_pred_commoning_cbck, &dta);
      eliminate_temp_copies (loop, tmp_vars);
    }
  else
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file,
                 "Executing predictive commoning without unrolling.\n");
      execute_pred_commoning (loop, chains, tmp_vars);
    }

end: ;
  release_chains (chains);
  free_data_refs (datarefs);
  BITMAP_FREE (tmp_vars);
  BITMAP_FREE (looparound_phis);

  free_affine_expand_cache (&name_expansions);

  return unroll;
}

/* Runs predictive commoning.  */

unsigned
tree_predictive_commoning (void)
{
  bool unrolled = false;
  struct loop *loop;
  loop_iterator li;
  unsigned ret = 0;

  initialize_original_copy_tables ();
  FOR_EACH_LOOP (li, loop, LI_ONLY_INNERMOST)
    if (optimize_loop_for_speed_p (loop))
      {
        unrolled |= tree_predictive_commoning_loop (loop);
      }

  if (unrolled)
    {
      scev_reset ();
      ret = TODO_cleanup_cfg;
    }
  free_original_copy_tables ();

  return ret;
}
