/* Predictive commoning.
   Copyright (C) 2005, 2007, 2008, 2009, 2010, 2011, 2012
   Free Software Foundation, Inc.

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 3, or (at your option) any
later version.

GCC is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */

/* This file implements the predictive commoning optimization.  Predictive
   commoning can be viewed as CSE around a loop, and with some improvements,
   as generalized strength reduction -- i.e., reusing values computed in
   earlier iterations of a loop in the later ones.  So far, the pass only
   handles the most useful case, that is, reusing values of memory references.
   If you think this is all just a special case of PRE, you are sort of right;
   however, concentrating on loops is simpler, and makes it possible to
   incorporate data dependence analysis to detect the opportunities, perform
   loop unrolling to avoid copies together with renaming immediately,
   and if needed, we could also take register pressure into account.

   Let us demonstrate what is done on an example:

   for (i = 0; i < 100; i++)
     {
       a[i+2] = a[i] + a[i+1];
       b[10] = b[10] + i;
       c[i] = c[99 - i];
       d[i] = d[i + 1];
     }

   1) We find data references in the loop, and split them to mutually
      independent groups (i.e., we find components of a data dependence
      graph).  We ignore read-read dependences whose distance is not constant.
      (TODO -- we could also ignore antidependences).  In this example, we
      find the following groups:

      a[i]{read}, a[i+1]{read}, a[i+2]{write}
      b[10]{read}, b[10]{write}
      c[99 - i]{read}, c[i]{write}
      d[i + 1]{read}, d[i]{write}

   2) Inside each of the groups, we verify several conditions:
      a) all the references must differ in indices only, and the indices
         must all have the same step
      b) the references must dominate loop latch (and thus, they must be
         ordered by dominance relation).
      c) the distance of the indices must be a small multiple of the step
      We are then able to compute the difference of the references (# of
      iterations before they point to the same place as the first of them).
      Also, in case there are writes in the loop, we split the groups into
      chains whose head is the write whose values are used by the reads in
      the same chain.  The chains are then processed independently,
      making the further transformations simpler.  Also, the shorter chains
      need the same number of registers, but may require lower unrolling
      factor in order to get rid of the copies on the loop latch.

      In our example, we get the following chains (the chain for c is invalid).

      a[i]{read,+0}, a[i+1]{read,-1}, a[i+2]{write,-2}
      b[10]{read,+0}, b[10]{write,+0}
      d[i + 1]{read,+0}, d[i]{write,+1}

   3) For each read, we determine the read or write whose value it reuses,
      together with the distance of this reuse.  I.e. we take the last
      reference before it with distance 0, or the last of the references
      with the smallest positive distance to the read.  Then, we remove
      the references that are not used in any of these chains, discard the
      empty groups, and propagate all the links so that they point to the
      single root reference of the chain (adjusting their distance
      appropriately).  Some extra care needs to be taken for references with
      step 0.  In our example (the numbers indicate the distance of the
      reuse),

      a[i] --> (*) 2, a[i+1] --> (*) 1, a[i+2] (*)
      b[10] --> (*) 1, b[10] (*)

   4) The chains are combined together if possible.  If the corresponding
      elements of two chains are always combined together with the same
      operator, we remember just the result of this combination, instead
      of remembering the values separately.  We may need to perform
      reassociation to enable combining, for example

      e[i] + f[i+1] + e[i+1] + f[i]

      can be reassociated as

      (e[i] + f[i]) + (e[i+1] + f[i+1])

      and we can combine the chains for e and f into one chain.

   5) For each root reference (end of the chain) R, let N be maximum distance
      of a reference reusing its value.  Variables R0 up to RN are created,
      together with phi nodes that transfer values from R1 .. RN to
      R0 .. R(N-1).
      Initial values are loaded to R0..R(N-1) (in case not all references
      must necessarily be accessed and they may trap, we may fail here;
      TODO sometimes, the loads could be guarded by a check for the number
      of iterations).  Values loaded/stored in roots are also copied to
      RN.  Other reads are replaced with the appropriate variable Ri.
      Everything is put to SSA form.

      As a small improvement, if R0 is dead after the root (i.e., all uses of
      the value with the maximum distance dominate the root), we can avoid
      creating RN and use R0 instead of it.

      In our example, we get (only the parts concerning a and b are shown):
      for (i = 0; i < 100; i++)
        {
          f = phi (a[0], s);
          s = phi (a[1], f);
          x = phi (b[10], x);

          f = f + s;
          a[i+2] = f;
          x = x + i;
          b[10] = x;
        }

   6) Factor F for unrolling is determined as the smallest common multiple of
      (N + 1) for each root reference (N for references for which we avoided
      creating RN).  If F and the size of the loop are small enough, the loop
      is unrolled F times.  The stores to RN (R0) in the copies of the loop
      body are periodically replaced with R0, R1, ... (R1, R2, ...), so that
      they can be coalesced and the copies can be eliminated.

      TODO -- copy propagation and other optimizations may change the live
      ranges of the temporary registers and prevent them from being coalesced;
      this may increase the register pressure.

      In our case, F = 2 and the (main loop of the) result is

      for (i = 0; i < ...; i += 2)
        {
          f = phi (a[0], f);
          s = phi (a[1], s);
          x = phi (b[10], x);

          f = f + s;
          a[i+2] = f;
          x = x + i;
          b[10] = x;

          s = s + f;
          a[i+3] = s;
          x = x + i;
          b[10] = x;
       }

   TODO -- stores killing other stores can be taken into account, e.g.,
   for (i = 0; i < n; i++)
     {
       a[i] = 1;
       a[i+2] = 2;
     }

   can be replaced with

   t0 = a[0];
   t1 = a[1];
   for (i = 0; i < n; i++)
     {
       a[i] = 1;
       t2 = 2;
       t0 = t1;
       t1 = t2;
     }
   a[n] = t0;
   a[n+1] = t1;

   The interesting part is that this would generalize store motion; still, since
   sm is performed elsewhere, it does not seem that important.

   Predictive commoning can be generalized for arbitrary computations (not
   just memory loads), and also nontrivial transfer functions (e.g., replacing
   i * i with ii_last + 2 * i + 1), to generalize strength reduction.  */

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "tree.h"
#include "tm_p.h"
#include "cfgloop.h"
#include "tree-flow.h"
#include "ggc.h"
#include "tree-data-ref.h"
#include "tree-scalar-evolution.h"
#include "tree-chrec.h"
#include "params.h"
#include "tree-pretty-print.h"
#include "gimple-pretty-print.h"
#include "tree-pass.h"
#include "tree-affine.h"
#include "tree-inline.h"

/* The maximum number of iterations between the considered memory
   references.  */

#define MAX_DISTANCE (target_avail_regs < 16 ? 4 : 8)
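
/* Note: with fewer than 16 available registers, only reuse between
   references at most 4 iterations apart is considered, otherwise up to 8;
   presumably this bounds the number of temporary variables (and thus the
   register pressure) that the transformation introduces.  */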

/* Data references (or phi nodes that carry data reference values across
   loop iterations).  */

typedef struct dref_d
{
  /* The reference itself.  */
  struct data_reference *ref;

  /* The statement in which the reference appears.  */
  gimple stmt;

  /* In case that STMT is a phi node, this field is set to the SSA name
     defined by it in replace_phis_by_defined_names (in order to avoid
     pointing to phi node that got reallocated in the meantime).  */
  tree name_defined_by_phi;

  /* Distance of the reference from the root of the chain (in number of
     iterations of the loop).  */
  unsigned distance;

  /* Number of iterations offset from the first reference in the component.  */
  double_int offset;

  /* Number of the reference in a component, in dominance ordering.  */
  unsigned pos;

  /* True if the memory reference is always accessed when the loop is
     entered.  */
  unsigned always_accessed : 1;
} *dref;

DEF_VEC_P (dref);
DEF_VEC_ALLOC_P (dref, heap);

/* Type of the chain of the references.  */

enum chain_type
{
  /* The addresses of the references in the chain are constant.  */
  CT_INVARIANT,

  /* There are only loads in the chain.  */
  CT_LOAD,

  /* Root of the chain is a store, the rest are loads.  */
  CT_STORE_LOAD,

  /* A combination of two chains.  */
  CT_COMBINATION
};

/* Chains of data references.  */

typedef struct chain
{
  /* Type of the chain.  */
  enum chain_type type;

  /* For combination chains, the operator and the two chains that are
     combined, and the type of the result.  */
  enum tree_code op;
  tree rslt_type;
  struct chain *ch1, *ch2;

  /* The references in the chain.  */
  VEC(dref,heap) *refs;

  /* The maximum distance of the reference in the chain from the root.  */
  unsigned length;

  /* The variables used to copy the value throughout iterations.  */
  VEC(tree,heap) *vars;

  /* Initializers for the variables.  */
  VEC(tree,heap) *inits;

  /* True if there is a use of a variable with the maximal distance
     that comes after the root in the loop.  */
  unsigned has_max_use_after : 1;

  /* True if all the memory references in the chain are always accessed.  */
  unsigned all_always_accessed : 1;

  /* True if this chain was combined together with some other chain.  */
  unsigned combined : 1;
} *chain_p;

DEF_VEC_P (chain_p);
DEF_VEC_ALLOC_P (chain_p, heap);

/* Describes the knowledge about the step of the memory references in
   the component.  */

enum ref_step_type
{
  /* The step is zero.  */
  RS_INVARIANT,

  /* The step is nonzero.  */
  RS_NONZERO,

  /* The step may or may not be nonzero.  */
  RS_ANY
};

/* Components of the data dependence graph.  */

struct component
{
  /* The references in the component.  */
  VEC(dref,heap) *refs;

  /* What we know about the step of the references in the component.  */
  enum ref_step_type comp_step;

  /* Next component in the list.  */
  struct component *next;
};

/* Bitmap of ssa names defined by looparound phi nodes covered by chains.  */

static bitmap looparound_phis;

/* Cache used by tree_to_aff_combination_expand.  */

static struct pointer_map_t *name_expansions;

/* Dumps data reference REF to FILE.  */

extern void dump_dref (FILE *, dref);
void
dump_dref (FILE *file, dref ref)
{
  if (ref->ref)
    {
      fprintf (file, "    ");
      print_generic_expr (file, DR_REF (ref->ref), TDF_SLIM);
      fprintf (file, " (id %u%s)\n", ref->pos,
               DR_IS_READ (ref->ref) ? "" : ", write");

      fprintf (file, "      offset ");
      dump_double_int (file, ref->offset, false);
      fprintf (file, "\n");

      fprintf (file, "      distance %u\n", ref->distance);
    }
  else
    {
      if (gimple_code (ref->stmt) == GIMPLE_PHI)
        fprintf (file, "    looparound ref\n");
      else
        fprintf (file, "    combination ref\n");
      fprintf (file, "      in statement ");
      print_gimple_stmt (file, ref->stmt, 0, TDF_SLIM);
      fprintf (file, "\n");
      fprintf (file, "      distance %u\n", ref->distance);
    }

}

/* Dumps CHAIN to FILE.  */

extern void dump_chain (FILE *, chain_p);
void
dump_chain (FILE *file, chain_p chain)
{
  dref a;
  const char *chain_type;
  unsigned i;
  tree var;

  switch (chain->type)
    {
    case CT_INVARIANT:
      chain_type = "Load motion";
      break;

    case CT_LOAD:
      chain_type = "Loads-only";
      break;

    case CT_STORE_LOAD:
      chain_type = "Store-loads";
      break;

    case CT_COMBINATION:
      chain_type = "Combination";
      break;

    default:
      gcc_unreachable ();
    }

  fprintf (file, "%s chain %p%s\n", chain_type, (void *) chain,
           chain->combined ? " (combined)" : "");
  if (chain->type != CT_INVARIANT)
    fprintf (file, "  max distance %u%s\n", chain->length,
             chain->has_max_use_after ? "" : ", may reuse first");

  if (chain->type == CT_COMBINATION)
    {
      fprintf (file, "  equal to %p %s %p in type ",
               (void *) chain->ch1, op_symbol_code (chain->op),
               (void *) chain->ch2);
      print_generic_expr (file, chain->rslt_type, TDF_SLIM);
      fprintf (file, "\n");
    }

  if (chain->vars)
    {
      fprintf (file, "  vars");
      FOR_EACH_VEC_ELT (tree, chain->vars, i, var)
        {
          fprintf (file, " ");
          print_generic_expr (file, var, TDF_SLIM);
        }
      fprintf (file, "\n");
    }

  if (chain->inits)
    {
      fprintf (file, "  inits");
      FOR_EACH_VEC_ELT (tree, chain->inits, i, var)
        {
          fprintf (file, " ");
          print_generic_expr (file, var, TDF_SLIM);
        }
      fprintf (file, "\n");
    }

  fprintf (file, "  references:\n");
  FOR_EACH_VEC_ELT (dref, chain->refs, i, a)
    dump_dref (file, a);

  fprintf (file, "\n");
}

/* Dumps CHAINS to FILE.  */

extern void dump_chains (FILE *, VEC (chain_p, heap) *);
void
dump_chains (FILE *file, VEC (chain_p, heap) *chains)
{
  chain_p chain;
  unsigned i;

  FOR_EACH_VEC_ELT (chain_p, chains, i, chain)
    dump_chain (file, chain);
}

/* Dumps COMP to FILE.  */

extern void dump_component (FILE *, struct component *);
void
dump_component (FILE *file, struct component *comp)
{
  dref a;
  unsigned i;

  fprintf (file, "Component%s:\n",
           comp->comp_step == RS_INVARIANT ? " (invariant)" : "");
  FOR_EACH_VEC_ELT (dref, comp->refs, i, a)
    dump_dref (file, a);
  fprintf (file, "\n");
}

/* Dumps COMPS to FILE.  */

extern void dump_components (FILE *, struct component *);
void
dump_components (FILE *file, struct component *comps)
{
  struct component *comp;

  for (comp = comps; comp; comp = comp->next)
    dump_component (file, comp);
}

/* Frees a chain CHAIN.  */

static void
release_chain (chain_p chain)
{
  dref ref;
  unsigned i;

  if (chain == NULL)
    return;

  FOR_EACH_VEC_ELT (dref, chain->refs, i, ref)
    free (ref);

  VEC_free (dref, heap, chain->refs);
  VEC_free (tree, heap, chain->vars);
  VEC_free (tree, heap, chain->inits);

  free (chain);
}

/* Frees CHAINS.  */

static void
release_chains (VEC (chain_p, heap) *chains)
{
  unsigned i;
  chain_p chain;

  FOR_EACH_VEC_ELT (chain_p, chains, i, chain)
    release_chain (chain);
  VEC_free (chain_p, heap, chains);
}

/* Frees a component COMP.  */

static void
release_component (struct component *comp)
{
  VEC_free (dref, heap, comp->refs);
  free (comp);
}

/* Frees list of components COMPS.  */

static void
release_components (struct component *comps)
{
  struct component *act, *next;

  for (act = comps; act; act = next)
    {
      next = act->next;
      release_component (act);
    }
}

/* Finds a root of tree given by FATHERS containing A, and performs path
   shortening.  */

static unsigned
component_of (unsigned fathers[], unsigned a)
{
  unsigned root, n;

  for (root = a; root != fathers[root]; root = fathers[root])
    continue;

  for (; a != root; a = n)
    {
      n = fathers[a];
      fathers[a] = root;
    }

  return root;
}

/* Join operation for DFU.  FATHERS gives the tree, SIZES are sizes of the
   components, A and B are components to merge.  */

static void
merge_comps (unsigned fathers[], unsigned sizes[], unsigned a, unsigned b)
{
  unsigned ca = component_of (fathers, a);
  unsigned cb = component_of (fathers, b);

  if (ca == cb)
    return;

  if (sizes[ca] < sizes[cb])
    {
      sizes[cb] += sizes[ca];
      fathers[ca] = cb;
    }
  else
    {
      sizes[ca] += sizes[cb];
      fathers[cb] = ca;
    }
}
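
/* A minimal illustration of the DFU (union-find) helpers above: starting
   from fathers = {0, 1, 2} and sizes = {1, 1, 1}, merge_comps (fathers,
   sizes, 0, 1) links element 1 under representative 0 and bumps sizes[0]
   to 2; a later component_of (fathers, 1) then returns 0 and shortens the
   path from 1 to the root as a side effect.  */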

/* Returns true if A is a reference that is suitable for predictive commoning
   in the innermost loop that contains it.  REF_STEP is set according to the
   step of the reference A.  */

static bool
suitable_reference_p (struct data_reference *a, enum ref_step_type *ref_step)
{
  tree ref = DR_REF (a), step = DR_STEP (a);

  if (!step
      || TREE_THIS_VOLATILE (ref)
      || !is_gimple_reg_type (TREE_TYPE (ref))
      || tree_could_throw_p (ref))
    return false;

  if (integer_zerop (step))
    *ref_step = RS_INVARIANT;
  else if (integer_nonzerop (step))
    *ref_step = RS_NONZERO;
  else
    *ref_step = RS_ANY;

  return true;
}

/* Stores DR_OFFSET (DR) + DR_INIT (DR) to OFFSET.  */

static void
aff_combination_dr_offset (struct data_reference *dr, aff_tree *offset)
{
  tree type = TREE_TYPE (DR_OFFSET (dr));
  aff_tree delta;

  tree_to_aff_combination_expand (DR_OFFSET (dr), type, offset,
                                  &name_expansions);
  aff_combination_const (&delta, type, tree_to_double_int (DR_INIT (dr)));
  aff_combination_add (offset, &delta);
}

/* Determines number of iterations of the innermost enclosing loop before B
   refers to exactly the same location as A and stores it to OFF.  If A and
   B do not have the same step, they never meet, or anything else fails,
   returns false, otherwise returns true.  Both A and B are assumed to
   satisfy suitable_reference_p.  */

static bool
determine_offset (struct data_reference *a, struct data_reference *b,
                  double_int *off)
{
  aff_tree diff, baseb, step;
  tree typea, typeb;

  /* Check that both the references access the location in the same type.  */
  typea = TREE_TYPE (DR_REF (a));
  typeb = TREE_TYPE (DR_REF (b));
  if (!useless_type_conversion_p (typeb, typea))
    return false;

  /* Check whether the base address and the step of both references is the
     same.  */
  if (!operand_equal_p (DR_STEP (a), DR_STEP (b), 0)
      || !operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0))
    return false;

  if (integer_zerop (DR_STEP (a)))
    {
      /* If the references have loop invariant address, check that they access
         exactly the same location.  */
      *off = double_int_zero;
      return (operand_equal_p (DR_OFFSET (a), DR_OFFSET (b), 0)
              && operand_equal_p (DR_INIT (a), DR_INIT (b), 0));
    }

  /* Compare the offsets of the addresses, and check whether the difference
     is a multiple of step.  */
  aff_combination_dr_offset (a, &diff);
  aff_combination_dr_offset (b, &baseb);
  aff_combination_scale (&baseb, double_int_minus_one);
  aff_combination_add (&diff, &baseb);

  tree_to_aff_combination_expand (DR_STEP (a), TREE_TYPE (DR_STEP (a)),
                                  &step, &name_expansions);
  return aff_combination_constant_multiple_p (&diff, &step, off);
}
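
/* As an illustration of the convention used above: for A = a[i] and
   B = a[i+1] with the same step, the difference of their offsets is minus
   one times the step, so -1 is stored to *OFF; this matches the -1
   annotation on a[i+1] in the overview comment at the top of this file.  */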

/* Returns the last basic block in LOOP for which we are sure that
   it is executed whenever the loop is entered.  */

static basic_block
last_always_executed_block (struct loop *loop)
{
  unsigned i;
  VEC (edge, heap) *exits = get_loop_exit_edges (loop);
  edge ex;
  basic_block last = loop->latch;

  FOR_EACH_VEC_ELT (edge, exits, i, ex)
    last = nearest_common_dominator (CDI_DOMINATORS, last, ex->src);
  VEC_free (edge, heap, exits);

  return last;
}

/* Splits the dependence graph on DATAREFS described by DEPENDS into
   components.  */

static struct component *
split_data_refs_to_components (struct loop *loop,
                               VEC (data_reference_p, heap) *datarefs,
                               VEC (ddr_p, heap) *depends)
{
  unsigned i, n = VEC_length (data_reference_p, datarefs);
  unsigned ca, ia, ib, bad;
  unsigned *comp_father = XNEWVEC (unsigned, n + 1);
  unsigned *comp_size = XNEWVEC (unsigned, n + 1);
  struct component **comps;
  struct data_reference *dr, *dra, *drb;
  struct data_dependence_relation *ddr;
  struct component *comp_list = NULL, *comp;
  dref dataref;
  basic_block last_always_executed = last_always_executed_block (loop);

  FOR_EACH_VEC_ELT (data_reference_p, datarefs, i, dr)
    {
      if (!DR_REF (dr))
        {
          /* A fake reference for call or asm_expr that may clobber memory;
             just fail.  */
          goto end;
        }
      dr->aux = (void *) (size_t) i;
      comp_father[i] = i;
      comp_size[i] = 1;
    }

  /* A component reserved for the "bad" data references.  */
  comp_father[n] = n;
  comp_size[n] = 1;

  FOR_EACH_VEC_ELT (data_reference_p, datarefs, i, dr)
    {
      enum ref_step_type dummy;

      if (!suitable_reference_p (dr, &dummy))
        {
          ia = (unsigned) (size_t) dr->aux;
          merge_comps (comp_father, comp_size, n, ia);
        }
    }

  FOR_EACH_VEC_ELT (ddr_p, depends, i, ddr)
    {
      double_int dummy_off;

      if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
        continue;

      dra = DDR_A (ddr);
      drb = DDR_B (ddr);
      ia = component_of (comp_father, (unsigned) (size_t) dra->aux);
      ib = component_of (comp_father, (unsigned) (size_t) drb->aux);
      if (ia == ib)
        continue;

      bad = component_of (comp_father, n);

      /* If both A and B are reads, we may ignore unsuitable dependences.  */
      if (DR_IS_READ (dra) && DR_IS_READ (drb)
          && (ia == bad || ib == bad
              || !determine_offset (dra, drb, &dummy_off)))
        continue;

      merge_comps (comp_father, comp_size, ia, ib);
    }

  comps = XCNEWVEC (struct component *, n);
  bad = component_of (comp_father, n);
  FOR_EACH_VEC_ELT (data_reference_p, datarefs, i, dr)
    {
      ia = (unsigned) (size_t) dr->aux;
      ca = component_of (comp_father, ia);
      if (ca == bad)
        continue;

      comp = comps[ca];
      if (!comp)
        {
          comp = XCNEW (struct component);
          comp->refs = VEC_alloc (dref, heap, comp_size[ca]);
          comps[ca] = comp;
        }

      dataref = XCNEW (struct dref_d);
      dataref->ref = dr;
      dataref->stmt = DR_STMT (dr);
      dataref->offset = double_int_zero;
      dataref->distance = 0;

      dataref->always_accessed
              = dominated_by_p (CDI_DOMINATORS, last_always_executed,
                                gimple_bb (dataref->stmt));
      dataref->pos = VEC_length (dref, comp->refs);
      VEC_quick_push (dref, comp->refs, dataref);
    }

  for (i = 0; i < n; i++)
    {
      comp = comps[i];
      if (comp)
        {
          comp->next = comp_list;
          comp_list = comp;
        }
    }
  free (comps);

end:
  free (comp_father);
  free (comp_size);
  return comp_list;
}

/* Returns true if the component COMP satisfies the conditions
   described in 2) at the beginning of this file.  LOOP is the current
   loop.  */

static bool
suitable_component_p (struct loop *loop, struct component *comp)
{
  unsigned i;
  dref a, first;
  basic_block ba, bp = loop->header;
  bool ok, has_write = false;

  FOR_EACH_VEC_ELT (dref, comp->refs, i, a)
    {
      ba = gimple_bb (a->stmt);

      if (!just_once_each_iteration_p (loop, ba))
        return false;

      gcc_assert (dominated_by_p (CDI_DOMINATORS, ba, bp));
      bp = ba;

      if (DR_IS_WRITE (a->ref))
        has_write = true;
    }

  first = VEC_index (dref, comp->refs, 0);
  ok = suitable_reference_p (first->ref, &comp->comp_step);
  gcc_assert (ok);
  first->offset = double_int_zero;

  for (i = 1; VEC_iterate (dref, comp->refs, i, a); i++)
    {
      if (!determine_offset (first->ref, a->ref, &a->offset))
        return false;

#ifdef ENABLE_CHECKING
      {
        enum ref_step_type a_step;
        ok = suitable_reference_p (a->ref, &a_step);
        gcc_assert (ok && a_step == comp->comp_step);
      }
#endif
    }

  /* If there is a write inside the component, we must know whether the
     step is nonzero or not -- we would not otherwise be able to recognize
     whether the value accessed by reads comes from the OFFSET-th iteration
     or the previous one.  */
  if (has_write && comp->comp_step == RS_ANY)
    return false;

  return true;
}

/* Check the conditions on references inside each of components COMPS,
   and remove the unsuitable components from the list.  The new list
   of components is returned.  The conditions are described in 2) at
   the beginning of this file.  LOOP is the current loop.  */

static struct component *
filter_suitable_components (struct loop *loop, struct component *comps)
{
  struct component **comp, *act;

  for (comp = &comps; *comp; )
    {
      act = *comp;
      if (suitable_component_p (loop, act))
        comp = &act->next;
      else
        {
          dref ref;
          unsigned i;

          *comp = act->next;
          FOR_EACH_VEC_ELT (dref, act->refs, i, ref)
            free (ref);
          release_component (act);
        }
    }

  return comps;
}

/* Compares two drefs A and B by their offset and position.  Callback for
   qsort.  */

static int
order_drefs (const void *a, const void *b)
{
  const dref *const da = (const dref *) a;
  const dref *const db = (const dref *) b;
  int offcmp = double_int_scmp ((*da)->offset, (*db)->offset);

  if (offcmp != 0)
    return offcmp;

  return (*da)->pos - (*db)->pos;
}

/* Returns root of the CHAIN.  */

static inline dref
get_chain_root (chain_p chain)
{
  return VEC_index (dref, chain->refs, 0);
}

/* Adds REF to the chain CHAIN.  */

static void
add_ref_to_chain (chain_p chain, dref ref)
{
  dref root = get_chain_root (chain);
  double_int dist;

  gcc_assert (double_int_scmp (root->offset, ref->offset) <= 0);
  dist = double_int_sub (ref->offset, root->offset);
  if (double_int_ucmp (uhwi_to_double_int (MAX_DISTANCE), dist) <= 0)
    {
      free (ref);
      return;
    }
  gcc_assert (double_int_fits_in_uhwi_p (dist));

  VEC_safe_push (dref, heap, chain->refs, ref);

  ref->distance = double_int_to_uhwi (dist);

  if (ref->distance >= chain->length)
    {
      chain->length = ref->distance;
      chain->has_max_use_after = false;
    }

  if (ref->distance == chain->length
      && ref->pos > root->pos)
    chain->has_max_use_after = true;

  chain->all_always_accessed &= ref->always_accessed;
}
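
/* For instance, in the 'a' group from the overview comment the chain is
   rooted at a[i+2] (the smallest offset, -2), so adding a[i] with offset 0
   gives it distance 0 - (-2) = 2, which also becomes the length of the
   chain.  */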

/* Returns the chain for invariant component COMP.  */

static chain_p
make_invariant_chain (struct component *comp)
{
  chain_p chain = XCNEW (struct chain);
  unsigned i;
  dref ref;

  chain->type = CT_INVARIANT;

  chain->all_always_accessed = true;

  FOR_EACH_VEC_ELT (dref, comp->refs, i, ref)
    {
      VEC_safe_push (dref, heap, chain->refs, ref);
      chain->all_always_accessed &= ref->always_accessed;
    }

  return chain;
}

/* Make a new chain rooted at REF.  */

static chain_p
make_rooted_chain (dref ref)
{
  chain_p chain = XCNEW (struct chain);

  chain->type = DR_IS_READ (ref->ref) ? CT_LOAD : CT_STORE_LOAD;

  VEC_safe_push (dref, heap, chain->refs, ref);
  chain->all_always_accessed = ref->always_accessed;

  ref->distance = 0;

  return chain;
}

/* Returns true if CHAIN is not trivial.  */

static bool
nontrivial_chain_p (chain_p chain)
{
  return chain != NULL && VEC_length (dref, chain->refs) > 1;
}

/* Returns the ssa name that contains the value of REF, or NULL_TREE if there
   is no such name.  */

static tree
name_for_ref (dref ref)
{
  tree name;

  if (is_gimple_assign (ref->stmt))
    {
      if (!ref->ref || DR_IS_READ (ref->ref))
        name = gimple_assign_lhs (ref->stmt);
      else
        name = gimple_assign_rhs1 (ref->stmt);
    }
  else
    name = PHI_RESULT (ref->stmt);

  return (TREE_CODE (name) == SSA_NAME ? name : NULL_TREE);
}

/* Returns true if REF is a valid initializer for ROOT with given DISTANCE (in
   iterations of the innermost enclosing loop).  */

static bool
valid_initializer_p (struct data_reference *ref,
                     unsigned distance, struct data_reference *root)
{
  aff_tree diff, base, step;
  double_int off;

  /* Both REF and ROOT must be accessing the same object.  */
  if (!operand_equal_p (DR_BASE_ADDRESS (ref), DR_BASE_ADDRESS (root), 0))
    return false;

  /* The initializer is defined outside of the loop, hence its address must be
     invariant inside the loop.  */
  gcc_assert (integer_zerop (DR_STEP (ref)));

  /* If the address of the reference is invariant, initializer must access
     exactly the same location.  */
  if (integer_zerop (DR_STEP (root)))
    return (operand_equal_p (DR_OFFSET (ref), DR_OFFSET (root), 0)
            && operand_equal_p (DR_INIT (ref), DR_INIT (root), 0));

  /* Verify that this index of REF is equal to the root's index at
     -DISTANCE-th iteration.  */
  aff_combination_dr_offset (root, &diff);
  aff_combination_dr_offset (ref, &base);
  aff_combination_scale (&base, double_int_minus_one);
  aff_combination_add (&diff, &base);

  tree_to_aff_combination_expand (DR_STEP (root), TREE_TYPE (DR_STEP (root)),
                                  &step, &name_expansions);
  if (!aff_combination_constant_multiple_p (&diff, &step, &off))
    return false;

  if (!double_int_equal_p (off, uhwi_to_double_int (distance)))
    return false;

  return true;
}

/* Finds looparound phi node of LOOP that copies the value of REF, and if its
   initial value is correct (equal to initial value of REF shifted by one
   iteration), returns the phi node.  Otherwise, NULL is returned.  ROOT
   is the root of the current chain.  */

static gimple
find_looparound_phi (struct loop *loop, dref ref, dref root)
{
  tree name, init, init_ref;
  gimple phi = NULL, init_stmt;
  edge latch = loop_latch_edge (loop);
  struct data_reference init_dr;
  gimple_stmt_iterator psi;

  if (is_gimple_assign (ref->stmt))
    {
      if (DR_IS_READ (ref->ref))
        name = gimple_assign_lhs (ref->stmt);
      else
        name = gimple_assign_rhs1 (ref->stmt);
    }
  else
    name = PHI_RESULT (ref->stmt);
  if (!name)
    return NULL;

  for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
    {
      phi = gsi_stmt (psi);
      if (PHI_ARG_DEF_FROM_EDGE (phi, latch) == name)
        break;
    }

  if (gsi_end_p (psi))
    return NULL;

  init = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
  if (TREE_CODE (init) != SSA_NAME)
    return NULL;
  init_stmt = SSA_NAME_DEF_STMT (init);
  if (gimple_code (init_stmt) != GIMPLE_ASSIGN)
    return NULL;
  gcc_assert (gimple_assign_lhs (init_stmt) == init);

  init_ref = gimple_assign_rhs1 (init_stmt);
  if (!REFERENCE_CLASS_P (init_ref)
      && !DECL_P (init_ref))
    return NULL;

  /* Analyze the behavior of INIT_REF with respect to LOOP (innermost
     loop enclosing PHI).  */
  memset (&init_dr, 0, sizeof (struct data_reference));
  DR_REF (&init_dr) = init_ref;
  DR_STMT (&init_dr) = phi;
  if (!dr_analyze_innermost (&init_dr, loop))
    return NULL;

  if (!valid_initializer_p (&init_dr, ref->distance + 1, root->ref))
    return NULL;

  return phi;
}

/* Adds a reference for the looparound copy of REF in PHI to CHAIN.  */

static void
insert_looparound_copy (chain_p chain, dref ref, gimple phi)
{
  dref nw = XCNEW (struct dref_d), aref;
  unsigned i;

  nw->stmt = phi;
  nw->distance = ref->distance + 1;
  nw->always_accessed = 1;

  FOR_EACH_VEC_ELT (dref, chain->refs, i, aref)
    if (aref->distance >= nw->distance)
      break;
  VEC_safe_insert (dref, heap, chain->refs, i, nw);

  if (nw->distance > chain->length)
    {
      chain->length = nw->distance;
      chain->has_max_use_after = false;
    }
}

/* For references in CHAIN that are copied around the LOOP (created previously
   by PRE, or by the user), add the results of such copies to the chain.  This
   enables us to remove the copies by unrolling, and may need fewer registers
   (also, it may allow us to combine chains together).  */

static void
add_looparound_copies (struct loop *loop, chain_p chain)
{
  unsigned i;
  dref ref, root = get_chain_root (chain);
  gimple phi;

  FOR_EACH_VEC_ELT (dref, chain->refs, i, ref)
    {
      phi = find_looparound_phi (loop, ref, root);
      if (!phi)
        continue;

      bitmap_set_bit (looparound_phis, SSA_NAME_VERSION (PHI_RESULT (phi)));
      insert_looparound_copy (chain, ref, phi);
    }
}

/* Find roots of the values and determine distances in the component COMP.
   The references are redistributed into CHAINS.  LOOP is the current
   loop.  */

static void
determine_roots_comp (struct loop *loop,
                      struct component *comp,
                      VEC (chain_p, heap) **chains)
{
  unsigned i;
  dref a;
  chain_p chain = NULL;
  double_int last_ofs = double_int_zero;

  /* Invariants are handled specially.  */
  if (comp->comp_step == RS_INVARIANT)
    {
      chain = make_invariant_chain (comp);
      VEC_safe_push (chain_p, heap, *chains, chain);
      return;
    }

  VEC_qsort (dref, comp->refs, order_drefs);

  FOR_EACH_VEC_ELT (dref, comp->refs, i, a)
    {
      if (!chain || DR_IS_WRITE (a->ref)
          || double_int_ucmp (uhwi_to_double_int (MAX_DISTANCE),
                              double_int_sub (a->offset, last_ofs)) <= 0)
        {
          if (nontrivial_chain_p (chain))
            {
              add_looparound_copies (loop, chain);
              VEC_safe_push (chain_p, heap, *chains, chain);
            }
          else
            release_chain (chain);
          chain = make_rooted_chain (a);
          last_ofs = a->offset;
          continue;
        }

      add_ref_to_chain (chain, a);
    }

  if (nontrivial_chain_p (chain))
    {
      add_looparound_copies (loop, chain);
      VEC_safe_push (chain_p, heap, *chains, chain);
    }
  else
    release_chain (chain);
}

/* Find roots of the values and determine distances in components COMPS, and
   separate the references into CHAINS.  LOOP is the current loop.  */

static void
determine_roots (struct loop *loop,
                 struct component *comps, VEC (chain_p, heap) **chains)
{
  struct component *comp;

  for (comp = comps; comp; comp = comp->next)
    determine_roots_comp (loop, comp, chains);
}

/* Replace the reference in statement STMT with temporary variable
   NEW_TREE.  If SET is true, NEW_TREE is instead initialized to the value of
   the reference in the statement.  IN_LHS is true if the reference
   is in the lhs of STMT, false if it is in rhs.  */

static void
replace_ref_with (gimple stmt, tree new_tree, bool set, bool in_lhs)
{
  tree val;
  gimple new_stmt;
  gimple_stmt_iterator bsi, psi;

  if (gimple_code (stmt) == GIMPLE_PHI)
    {
      gcc_assert (!in_lhs && !set);

      val = PHI_RESULT (stmt);
      bsi = gsi_after_labels (gimple_bb (stmt));
      psi = gsi_for_stmt (stmt);
      remove_phi_node (&psi, false);

      /* Turn the phi node into GIMPLE_ASSIGN.  */
      new_stmt = gimple_build_assign (val, new_tree);
      gsi_insert_before (&bsi, new_stmt, GSI_NEW_STMT);
      return;
    }

  /* Since the reference is of gimple_reg type, it should only
     appear as lhs or rhs of modify statement.  */
  gcc_assert (is_gimple_assign (stmt));

  bsi = gsi_for_stmt (stmt);

  /* If we do not need to initialize NEW_TREE, just replace the use of OLD.  */
  if (!set)
    {
      gcc_assert (!in_lhs);
      gimple_assign_set_rhs_from_tree (&bsi, new_tree);
      stmt = gsi_stmt (bsi);
      update_stmt (stmt);
      return;
    }

  if (in_lhs)
    {
      /* We have statement

         OLD = VAL

         If OLD is a memory reference, then VAL is gimple_val, and we transform
         this to

         OLD = VAL
         NEW = VAL

         Otherwise, we are replacing a combination chain,
         VAL is the expression that performs the combination, and OLD is an
         SSA name.  In this case, we transform the assignment to

         OLD = VAL
         NEW = OLD

         */

      val = gimple_assign_lhs (stmt);
      if (TREE_CODE (val) != SSA_NAME)
        {
          val = gimple_assign_rhs1 (stmt);
          gcc_assert (gimple_assign_single_p (stmt));
          if (TREE_CLOBBER_P (val))
            {
              val = gimple_default_def (cfun, SSA_NAME_VAR (new_tree));
              if (val == NULL_TREE)
                {
                  val = make_ssa_name (SSA_NAME_VAR (new_tree),
                                       gimple_build_nop ());
                  set_default_def (SSA_NAME_VAR (new_tree), val);
                }
            }
          else
            gcc_assert (gimple_assign_copy_p (stmt));
        }
    }
  else
    {
      /* VAL = OLD

         is transformed to

         VAL = OLD
         NEW = VAL  */

      val = gimple_assign_lhs (stmt);
    }

  new_stmt = gimple_build_assign (new_tree, unshare_expr (val));
  gsi_insert_after (&bsi, new_stmt, GSI_NEW_STMT);
}

/* Returns the reference to the address of REF in the ITER-th iteration of
   LOOP, or NULL if we fail to determine it (ITER may be negative).  We
   try to preserve the original shape of the reference (not rewrite it
   as an indirect ref to the address), to make tree_could_trap_p in
   prepare_initializers_chain return false more often.  */

static tree
ref_at_iteration (struct loop *loop, tree ref, int iter)
{
  tree idx, *idx_p, type, val, op0 = NULL_TREE, ret;
  affine_iv iv;
  bool ok;

  if (handled_component_p (ref))
    {
      op0 = ref_at_iteration (loop, TREE_OPERAND (ref, 0), iter);
      if (!op0)
        return NULL_TREE;
    }
  else if (!INDIRECT_REF_P (ref)
           && TREE_CODE (ref) != MEM_REF)
    return unshare_expr (ref);

  if (TREE_CODE (ref) == MEM_REF)
    {
      ret = unshare_expr (ref);
      idx = TREE_OPERAND (ref, 0);
      idx_p = &TREE_OPERAND (ret, 0);
    }
  else if (TREE_CODE (ref) == COMPONENT_REF)
    {
      /* Check that the offset is loop invariant.  */
      if (TREE_OPERAND (ref, 2)
          && !expr_invariant_in_loop_p (loop, TREE_OPERAND (ref, 2)))
        return NULL_TREE;

      return build3 (COMPONENT_REF, TREE_TYPE (ref), op0,
                     unshare_expr (TREE_OPERAND (ref, 1)),
                     unshare_expr (TREE_OPERAND (ref, 2)));
    }
  else if (TREE_CODE (ref) == ARRAY_REF)
    {
      /* Check that the lower bound and the step are loop invariant.  */
      if (TREE_OPERAND (ref, 2)
          && !expr_invariant_in_loop_p (loop, TREE_OPERAND (ref, 2)))
        return NULL_TREE;
      if (TREE_OPERAND (ref, 3)
          && !expr_invariant_in_loop_p (loop, TREE_OPERAND (ref, 3)))
        return NULL_TREE;

      ret = build4 (ARRAY_REF, TREE_TYPE (ref), op0, NULL_TREE,
                    unshare_expr (TREE_OPERAND (ref, 2)),
                    unshare_expr (TREE_OPERAND (ref, 3)));
      idx = TREE_OPERAND (ref, 1);
      idx_p = &TREE_OPERAND (ret, 1);
    }
  else
    return NULL_TREE;

  ok = simple_iv (loop, loop, idx, &iv, true);
  if (!ok)
    return NULL_TREE;
  iv.base = expand_simple_operations (iv.base);
  if (integer_zerop (iv.step))
    *idx_p = unshare_expr (iv.base);
  else
    {
      type = TREE_TYPE (iv.base);
      if (POINTER_TYPE_P (type))
        {
          val = fold_build2 (MULT_EXPR, sizetype, iv.step,
                             size_int (iter));
          val = fold_build_pointer_plus (iv.base, val);
        }
      else
        {
          val = fold_build2 (MULT_EXPR, type, iv.step,
                             build_int_cst_type (type, iter));
          val = fold_build2 (PLUS_EXPR, type, iv.base, val);
        }
      *idx_p = unshare_expr (val);
    }

  return ret;
}

/* Get the initialization expression for the INDEX-th temporary variable
   of CHAIN.  */

static tree
get_init_expr (chain_p chain, unsigned index)
{
  if (chain->type == CT_COMBINATION)
    {
      tree e1 = get_init_expr (chain->ch1, index);
      tree e2 = get_init_expr (chain->ch2, index);

      return fold_build2 (chain->op, chain->rslt_type, e1, e2);
    }
  else
    return VEC_index (tree, chain->inits, index);
}

/* Marks all virtual operands of statement STMT for renaming.  */

void
mark_virtual_ops_for_renaming (gimple stmt)
{
  tree var;

  if (gimple_code (stmt) == GIMPLE_PHI)
    {
      var = PHI_RESULT (stmt);
      if (is_gimple_reg (var))
        return;

      if (TREE_CODE (var) == SSA_NAME)
        var = SSA_NAME_VAR (var);
      mark_sym_for_renaming (var);
      return;
    }

  update_stmt (stmt);
  if (gimple_vuse (stmt))
    mark_sym_for_renaming (gimple_vop (cfun));
}

/* Returns a new temporary variable used for the I-th variable carrying
   value of REF.  The variable's uid is marked in TMP_VARS.  */

static tree
predcom_tmp_var (tree ref, unsigned i, bitmap tmp_vars)
{
  tree type = TREE_TYPE (ref);
  /* We never access the components of the temporary variable in predictive
     commoning.  */
  tree var = create_tmp_reg (type, get_lsm_tmp_name (ref, i));

  add_referenced_var (var);
  bitmap_set_bit (tmp_vars, DECL_UID (var));
  return var;
}

/* Creates the variables for CHAIN, as well as phi nodes for them and
   initialization on entry to LOOP.  Uids of the newly created
   temporary variables are marked in TMP_VARS.  */

static void
initialize_root_vars (struct loop *loop, chain_p chain, bitmap tmp_vars)
{
  unsigned i;
  unsigned n = chain->length;
  dref root = get_chain_root (chain);
  bool reuse_first = !chain->has_max_use_after;
  tree ref, init, var, next;
  gimple phi;
  gimple_seq stmts;
  edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);

  /* If N == 0, then all the references are within the single iteration.  And
     since this is a nonempty chain, reuse_first cannot be true.  */
  gcc_assert (n > 0 || !reuse_first);

  chain->vars = VEC_alloc (tree, heap, n + 1);

  if (chain->type == CT_COMBINATION)
    ref = gimple_assign_lhs (root->stmt);
  else
    ref = DR_REF (root->ref);

  for (i = 0; i < n + (reuse_first ? 0 : 1); i++)
    {
      var = predcom_tmp_var (ref, i, tmp_vars);
      VEC_quick_push (tree, chain->vars, var);
    }
  if (reuse_first)
    VEC_quick_push (tree, chain->vars, VEC_index (tree, chain->vars, 0));

  FOR_EACH_VEC_ELT (tree, chain->vars, i, var)
    VEC_replace (tree, chain->vars, i, make_ssa_name (var, NULL));

  for (i = 0; i < n; i++)
    {
      var = VEC_index (tree, chain->vars, i);
      next = VEC_index (tree, chain->vars, i + 1);
      init = get_init_expr (chain, i);

      init = force_gimple_operand (init, &stmts, true, NULL_TREE);
      if (stmts)
        gsi_insert_seq_on_edge_immediate (entry, stmts);

      phi = create_phi_node (var, loop->header);
      SSA_NAME_DEF_STMT (var) = phi;
      add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
      add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
    }
}

/* Create the variables and initialization statement for root of chain
   CHAIN.  Uids of the newly created temporary variables are marked
   in TMP_VARS.  */

static void
initialize_root (struct loop *loop, chain_p chain, bitmap tmp_vars)
{
  dref root = get_chain_root (chain);
  bool in_lhs = (chain->type == CT_STORE_LOAD
                 || chain->type == CT_COMBINATION);

  initialize_root_vars (loop, chain, tmp_vars);
  replace_ref_with (root->stmt,
                    VEC_index (tree, chain->vars, chain->length),
                    true, in_lhs);
}

/* Initializes a variable for load motion for ROOT and prepares phi nodes and
   initialization on entry to LOOP if necessary.  The ssa name for the variable
   is stored in VARS.  If WRITTEN is true, a phi node to copy its value
   around the loop is also created.  Uid of the newly created temporary
   variable is marked in TMP_VARS.  INITS is the list containing the (single)
   initializer.  */

static void
initialize_root_vars_lm (struct loop *loop, dref root, bool written,
                         VEC(tree, heap) **vars, VEC(tree, heap) *inits,
                         bitmap tmp_vars)
{
  unsigned i;
  tree ref = DR_REF (root->ref), init, var, next;
  gimple_seq stmts;
  gimple phi;
  edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);

  /* Find the initializer for the variable, and check that it cannot
     trap.  */
  init = VEC_index (tree, inits, 0);

  *vars = VEC_alloc (tree, heap, written ? 2 : 1);
  var = predcom_tmp_var (ref, 0, tmp_vars);
  VEC_quick_push (tree, *vars, var);
  if (written)
    VEC_quick_push (tree, *vars, VEC_index (tree, *vars, 0));

  FOR_EACH_VEC_ELT (tree, *vars, i, var)
    VEC_replace (tree, *vars, i, make_ssa_name (var, NULL));

  var = VEC_index (tree, *vars, 0);

  init = force_gimple_operand (init, &stmts, written, NULL_TREE);
  if (stmts)
    gsi_insert_seq_on_edge_immediate (entry, stmts);

  if (written)
    {
      next = VEC_index (tree, *vars, 1);
      phi = create_phi_node (var, loop->header);
      SSA_NAME_DEF_STMT (var) = phi;
      add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
      add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
    }
  else
    {
      gimple init_stmt = gimple_build_assign (var, init);
      mark_virtual_ops_for_renaming (init_stmt);
      gsi_insert_on_edge_immediate (entry, init_stmt);
    }
}


/* Execute load motion for references in chain CHAIN.  Uids of the newly
   created temporary variables are marked in TMP_VARS.  */

static void
execute_load_motion (struct loop *loop, chain_p chain, bitmap tmp_vars)
{
  VEC (tree, heap) *vars;
  dref a;
  unsigned n_writes = 0, ridx, i;
  tree var;

  gcc_assert (chain->type == CT_INVARIANT);
  gcc_assert (!chain->combined);
  FOR_EACH_VEC_ELT (dref, chain->refs, i, a)
    if (DR_IS_WRITE (a->ref))
      n_writes++;

  /* If there are no reads in the loop, there is nothing to do.  */
  if (n_writes == VEC_length (dref, chain->refs))
    return;

  initialize_root_vars_lm (loop, get_chain_root (chain), n_writes > 0,
                           &vars, chain->inits, tmp_vars);

  ridx = 0;
  FOR_EACH_VEC_ELT (dref, chain->refs, i, a)
    {
      bool is_read = DR_IS_READ (a->ref);
      mark_virtual_ops_for_renaming (a->stmt);

      if (DR_IS_WRITE (a->ref))
        {
          n_writes--;
          if (n_writes)
            {
              var = VEC_index (tree, vars, 0);
              var = make_ssa_name (SSA_NAME_VAR (var), NULL);
              VEC_replace (tree, vars, 0, var);
            }
          else
            ridx = 1;
        }

      replace_ref_with (a->stmt, VEC_index (tree, vars, ridx),
                        !is_read, !is_read);
    }

  VEC_free (tree, heap, vars);
}
1659
 
1660
/* Returns the single statement in that NAME is used, excepting
1661
   the looparound phi nodes contained in one of the chains.  If there is no
1662
   such statement, or more statements, NULL is returned.  */
1663
 
1664
static gimple
1665
single_nonlooparound_use (tree name)
1666
{
1667
  use_operand_p use;
1668
  imm_use_iterator it;
1669
  gimple stmt, ret = NULL;
1670
 
1671
  FOR_EACH_IMM_USE_FAST (use, it, name)
1672
    {
1673
      stmt = USE_STMT (use);
1674
 
1675
      if (gimple_code (stmt) == GIMPLE_PHI)
1676
        {
1677
          /* Ignore uses in looparound phi nodes.  Uses in other phi nodes
1678
             could not be processed anyway, so just fail for them.  */
1679
          if (bitmap_bit_p (looparound_phis,
1680
                            SSA_NAME_VERSION (PHI_RESULT (stmt))))
1681
            continue;
1682
 
1683
          return NULL;
1684
        }
1685
      else if (is_gimple_debug (stmt))
1686
        continue;
1687
      else if (ret != NULL)
1688
        return NULL;
1689
      else
1690
        ret = stmt;
1691
    }
1692
 
1693
  return ret;
1694
}
1695
 
1696
/* Remove statement STMT, as well as the chain of assignments in which it is
   used.  */

static void
remove_stmt (gimple stmt)
{
  tree name;
  gimple next;
  gimple_stmt_iterator psi;

  if (gimple_code (stmt) == GIMPLE_PHI)
    {
      name = PHI_RESULT (stmt);
      next = single_nonlooparound_use (name);
      psi = gsi_for_stmt (stmt);
      remove_phi_node (&psi, true);

      if (!next
          || !gimple_assign_ssa_name_copy_p (next)
          || gimple_assign_rhs1 (next) != name)
        return;

      stmt = next;
    }

  while (1)
    {
      gimple_stmt_iterator bsi;

      bsi = gsi_for_stmt (stmt);

      name = gimple_assign_lhs (stmt);
      gcc_assert (TREE_CODE (name) == SSA_NAME);

      next = single_nonlooparound_use (name);

      mark_virtual_ops_for_renaming (stmt);
      gsi_remove (&bsi, true);
      release_defs (stmt);

      if (!next
          || !gimple_assign_ssa_name_copy_p (next)
          || gimple_assign_rhs1 (next) != name)
        return;

      stmt = next;
    }
}

/* Perform the predictive commoning optimization for a chain CHAIN.
   Uids of the newly created temporary variables are marked in TMP_VARS.  */

static void
execute_pred_commoning_chain (struct loop *loop, chain_p chain,
                              bitmap tmp_vars)
{
  unsigned i;
  dref a, root;
  tree var;

  if (chain->combined)
    {
      /* For combined chains, just remove the statements that are used to
         compute the values of the expression (except for the root one).  */
      for (i = 1; VEC_iterate (dref, chain->refs, i, a); i++)
        remove_stmt (a->stmt);
    }
  else
    {
      /* For non-combined chains, set up the variables that hold the chain's
         value, and replace the uses of the original references by these
         variables.  */
      root = get_chain_root (chain);
      mark_virtual_ops_for_renaming (root->stmt);

      initialize_root (loop, chain, tmp_vars);
      for (i = 1; VEC_iterate (dref, chain->refs, i, a); i++)
        {
          mark_virtual_ops_for_renaming (a->stmt);
          var = VEC_index (tree, chain->vars, chain->length - a->distance);
          replace_ref_with (a->stmt, var, false, false);
        }
    }
}

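/* Illustrative sketch of the replacement performed for a non-combined chain
   (the loop below is hypothetical, not taken from this file): for a read-only
   chain over p[i] and p[i + 1] of length 1, the loop

     for (i = 0; i < n; i++)
       s += p[i] * p[i + 1];

   is conceptually rewritten so that the value loaded as p[i + 1] is reused
   as p[i] in the next iteration:

     t1 = p[0];
     for (i = 0; i < n; i++)
       {
         t0 = t1;
         t1 = p[i + 1];
         s += t0 * t1;
       }

   execute_pred_commoning_chain only replaces the memory references by the
   variables; the initializer of t1 and the rotation of the variables come
   from initialize_root and the looparound phi nodes.  */
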
/* Determines the unroll factor necessary to remove as many temporary variable
   copies as possible.  CHAINS is the list of chains that will be
   optimized.  */

static unsigned
determine_unroll_factor (VEC (chain_p, heap) *chains)
{
  chain_p chain;
  unsigned factor = 1, af, nfactor, i;
  unsigned max = PARAM_VALUE (PARAM_MAX_UNROLL_TIMES);

  FOR_EACH_VEC_ELT (chain_p, chains, i, chain)
    {
      if (chain->type == CT_INVARIANT || chain->combined)
        continue;

      /* The best unroll factor for this chain is equal to the number of
         temporary variables that we create for it.  */
      af = chain->length;
      if (chain->has_max_use_after)
        af++;

      nfactor = factor * af / gcd (factor, af);
      if (nfactor <= max)
        factor = nfactor;
    }

  return factor;
}

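/* A worked example for determine_unroll_factor (the chain sizes are made up
   for illustration).  The expression factor * af / gcd (factor, af) computes
   the least common multiple of FACTOR and AF.  With one chain of length 1
   that has has_max_use_after set (af = 2) and one chain of length 3 that does
   not (af = 3):

     factor = 1
     factor = lcm (1, 2) = 2
     factor = lcm (2, 3) = 6

   and 6 is used only if it does not exceed PARAM_MAX_UNROLL_TIMES.  */
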
/* Perform the predictive commoning optimization for CHAINS.
   Uids of the newly created temporary variables are marked in TMP_VARS.  */

static void
execute_pred_commoning (struct loop *loop, VEC (chain_p, heap) *chains,
                        bitmap tmp_vars)
{
  chain_p chain;
  unsigned i;

  FOR_EACH_VEC_ELT (chain_p, chains, i, chain)
    {
      if (chain->type == CT_INVARIANT)
        execute_load_motion (loop, chain, tmp_vars);
      else
        execute_pred_commoning_chain (loop, chain, tmp_vars);
    }

  update_ssa (TODO_update_ssa_only_virtuals);
}

/* For each reference in CHAINS, if its defining statement is
   a phi node, record the ssa name that is defined by it.  */

static void
replace_phis_by_defined_names (VEC (chain_p, heap) *chains)
{
  chain_p chain;
  dref a;
  unsigned i, j;

  FOR_EACH_VEC_ELT (chain_p, chains, i, chain)
    FOR_EACH_VEC_ELT (dref, chain->refs, j, a)
      {
        if (gimple_code (a->stmt) == GIMPLE_PHI)
          {
            a->name_defined_by_phi = PHI_RESULT (a->stmt);
            a->stmt = NULL;
          }
      }
}

/* For each reference in CHAINS, if name_defined_by_phi is not
   NULL, use it to set the stmt field.  */

static void
replace_names_by_phis (VEC (chain_p, heap) *chains)
{
  chain_p chain;
  dref a;
  unsigned i, j;

  FOR_EACH_VEC_ELT (chain_p, chains, i, chain)
    FOR_EACH_VEC_ELT (dref, chain->refs, j, a)
      if (a->stmt == NULL)
        {
          a->stmt = SSA_NAME_DEF_STMT (a->name_defined_by_phi);
          gcc_assert (gimple_code (a->stmt) == GIMPLE_PHI);
          a->name_defined_by_phi = NULL_TREE;
        }
}

/* Wrapper over execute_pred_commoning, to pass it as a callback
   to tree_transform_and_unroll_loop.  */

struct epcc_data
{
  VEC (chain_p, heap) *chains;
  bitmap tmp_vars;
};

static void
execute_pred_commoning_cbck (struct loop *loop, void *data)
{
  struct epcc_data *const dta = (struct epcc_data *) data;

  /* Restore phi nodes that were replaced by ssa names before
     tree_transform_and_unroll_loop (see detailed description in
     tree_predictive_commoning_loop).  */
  replace_names_by_phis (dta->chains);
  execute_pred_commoning (loop, dta->chains, dta->tmp_vars);
}

/* Base NAME and all the names in the chain of phi nodes that use it
   on variable VAR.  The phi nodes are recognized by being in the copies of
   the header of the LOOP.  */

static void
base_names_in_chain_on (struct loop *loop, tree name, tree var)
{
  gimple stmt, phi;
  imm_use_iterator iter;

  SSA_NAME_VAR (name) = var;

  while (1)
    {
      phi = NULL;
      FOR_EACH_IMM_USE_STMT (stmt, iter, name)
        {
          if (gimple_code (stmt) == GIMPLE_PHI
              && flow_bb_inside_loop_p (loop, gimple_bb (stmt)))
            {
              phi = stmt;
              BREAK_FROM_IMM_USE_STMT (iter);
            }
        }
      if (!phi)
        return;

      name = PHI_RESULT (phi);
      SSA_NAME_VAR (name) = var;
    }
}

/* Given an unrolled LOOP after predictive commoning, remove the
   register copies arising from phi nodes by changing the base
   variables of SSA names.  TMP_VARS is the set of the temporary variables
   for which we want to perform this.  */

static void
eliminate_temp_copies (struct loop *loop, bitmap tmp_vars)
{
  edge e;
  gimple phi, stmt;
  tree name, use, var;
  gimple_stmt_iterator psi;

  e = loop_latch_edge (loop);
  for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
    {
      phi = gsi_stmt (psi);
      name = PHI_RESULT (phi);
      var = SSA_NAME_VAR (name);
      if (!bitmap_bit_p (tmp_vars, DECL_UID (var)))
        continue;
      use = PHI_ARG_DEF_FROM_EDGE (phi, e);
      gcc_assert (TREE_CODE (use) == SSA_NAME);

      /* Base all the ssa names in the ud and du chain of NAME on VAR.  */
      stmt = SSA_NAME_DEF_STMT (use);
      while (gimple_code (stmt) == GIMPLE_PHI
             /* In case we could not unroll the loop enough to eliminate
                all copies, we may reach the loop header before the defining
                statement (in that case, some register copies will be present
                in loop latch in the final code, corresponding to the newly
                created looparound phi nodes).  */
             && gimple_bb (stmt) != loop->header)
        {
          gcc_assert (single_pred_p (gimple_bb (stmt)));
          use = PHI_ARG_DEF (stmt, 0);
          stmt = SSA_NAME_DEF_STMT (use);
        }

      base_names_in_chain_on (loop, use, var);
    }
}

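/* A simplified sketch of why the rebasing helps (the SSA names below are
   hypothetical): after unrolling, the value rotation between the temporaries
   shows up as a header phi node such as

     t0_5 = PHI <t0_init_4 (preheader), t1_9 (latch)>

   Giving t1_9 (and the names it was copied from) the same base variable as
   t0_5 lets the later out-of-SSA coalescing assign them the same location,
   so the register copy on the latch edge disappears.  */
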
/* Returns true if CHAIN is suitable to be combined.  */

static bool
chain_can_be_combined_p (chain_p chain)
{
  return (!chain->combined
          && (chain->type == CT_LOAD || chain->type == CT_COMBINATION));
}

/* Returns the modify statement that uses NAME.  Skips over copy assignments;
   NAME is replaced with the actual name used in the returned statement.  */

static gimple
find_use_stmt (tree *name)
{
  gimple stmt;
  tree rhs, lhs;

  /* Skip over assignments.  */
  while (1)
    {
      stmt = single_nonlooparound_use (*name);
      if (!stmt)
        return NULL;

      if (gimple_code (stmt) != GIMPLE_ASSIGN)
        return NULL;

      lhs = gimple_assign_lhs (stmt);
      if (TREE_CODE (lhs) != SSA_NAME)
        return NULL;

      if (gimple_assign_copy_p (stmt))
        {
          rhs = gimple_assign_rhs1 (stmt);
          if (rhs != *name)
            return NULL;

          *name = lhs;
        }
      else if (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
               == GIMPLE_BINARY_RHS)
        return stmt;
      else
        return NULL;
    }
}

/* Returns true if we may perform reassociation for operation CODE in TYPE.  */

static bool
may_reassociate_p (tree type, enum tree_code code)
{
  if (FLOAT_TYPE_P (type)
      && !flag_unsafe_math_optimizations)
    return false;

  return (commutative_tree_code (code)
          && associative_tree_code (code));
}

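/* Example of the restriction above: for floating-point types, (x + y) + z
   and x + (y + z) may round differently, so PLUS_EXPR on such a type is only
   reassociated under -funsafe-math-optimizations.  For integer types, codes
   that are both commutative and associative (e.g. PLUS_EXPR, MULT_EXPR or
   BIT_AND_EXPR) are accepted by this predicate unconditionally.  */
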
/* If the operation used in STMT is associative and commutative, goes through
   the tree of the same operations and returns its root.  The distance to the
   root is stored in DISTANCE.  */

static gimple
find_associative_operation_root (gimple stmt, unsigned *distance)
{
  tree lhs;
  gimple next;
  enum tree_code code = gimple_assign_rhs_code (stmt);
  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
  unsigned dist = 0;

  if (!may_reassociate_p (type, code))
    return NULL;

  while (1)
    {
      lhs = gimple_assign_lhs (stmt);
      gcc_assert (TREE_CODE (lhs) == SSA_NAME);

      next = find_use_stmt (&lhs);
      if (!next
          || gimple_assign_rhs_code (next) != code)
        break;

      stmt = next;
      dist++;
    }

  if (distance)
    *distance = dist;
  return stmt;
}

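/* Illustrative example (hypothetical SSA names): for the chain of additions

     x1_1 = a_2 + b_3;
     x2_4 = x1_1 + c_5;
     x3_6 = x2_4 + d_7;

   calling find_associative_operation_root on the statement defining x1_1
   returns the statement defining x3_6 and stores 2 in *DISTANCE (assuming
   the single use of x3_6, if any, is not another addition).  */
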
/* Returns the common statement in which NAME1 and NAME2 have a use.  If there
   is no such statement, returns NULL.  In case the operation used on
   NAME1 and NAME2 is associative and commutative, returns the root of the
   tree formed by this operation instead of the statement that uses NAME1 or
   NAME2.  */

static gimple
find_common_use_stmt (tree *name1, tree *name2)
{
  gimple stmt1, stmt2;

  stmt1 = find_use_stmt (name1);
  if (!stmt1)
    return NULL;

  stmt2 = find_use_stmt (name2);
  if (!stmt2)
    return NULL;

  if (stmt1 == stmt2)
    return stmt1;

  stmt1 = find_associative_operation_root (stmt1, NULL);
  if (!stmt1)
    return NULL;
  stmt2 = find_associative_operation_root (stmt2, NULL);
  if (!stmt2)
    return NULL;

  return (stmt1 == stmt2 ? stmt1 : NULL);
}

/* Checks whether R1 and R2 are combined together using CODE, with the result
   in RSLT_TYPE, in order R1 CODE R2 if SWAP is false and in order R2 CODE R1
   if it is true.  If CODE is ERROR_MARK, set these values instead.  */

static bool
combinable_refs_p (dref r1, dref r2,
                   enum tree_code *code, bool *swap, tree *rslt_type)
{
  enum tree_code acode;
  bool aswap;
  tree atype;
  tree name1, name2;
  gimple stmt;

  name1 = name_for_ref (r1);
  name2 = name_for_ref (r2);
  gcc_assert (name1 != NULL_TREE && name2 != NULL_TREE);

  stmt = find_common_use_stmt (&name1, &name2);

  if (!stmt)
    return false;

  acode = gimple_assign_rhs_code (stmt);
  aswap = (!commutative_tree_code (acode)
           && gimple_assign_rhs1 (stmt) != name1);
  atype = TREE_TYPE (gimple_assign_lhs (stmt));

  if (*code == ERROR_MARK)
    {
      *code = acode;
      *swap = aswap;
      *rslt_type = atype;
      return true;
    }

  return (*code == acode
          && *swap == aswap
          && *rslt_type == atype);
}

/* Remove OP from the operation on the rhs of STMT, and replace STMT with
   an assignment of the remaining operand.  */

static void
remove_name_from_operation (gimple stmt, tree op)
{
  tree other_op;
  gimple_stmt_iterator si;

  gcc_assert (is_gimple_assign (stmt));

  if (gimple_assign_rhs1 (stmt) == op)
    other_op = gimple_assign_rhs2 (stmt);
  else
    other_op = gimple_assign_rhs1 (stmt);

  si = gsi_for_stmt (stmt);
  gimple_assign_set_rhs_from_tree (&si, other_op);

  /* We should not have reallocated STMT.  */
  gcc_assert (gsi_stmt (si) == stmt);

  update_stmt (stmt);
}

/* Reassociates the expression in which NAME1 and NAME2 are used so that they
   are combined in a single statement, and returns this statement.  */

static gimple
reassociate_to_the_same_stmt (tree name1, tree name2)
{
  gimple stmt1, stmt2, root1, root2, s1, s2;
  gimple new_stmt, tmp_stmt;
  tree new_name, tmp_name, var, r1, r2;
  unsigned dist1, dist2;
  enum tree_code code;
  tree type = TREE_TYPE (name1);
  gimple_stmt_iterator bsi;

  stmt1 = find_use_stmt (&name1);
  stmt2 = find_use_stmt (&name2);
  root1 = find_associative_operation_root (stmt1, &dist1);
  root2 = find_associative_operation_root (stmt2, &dist2);
  code = gimple_assign_rhs_code (stmt1);

  gcc_assert (root1 && root2 && root1 == root2
              && code == gimple_assign_rhs_code (stmt2));

  /* Find the root of the nearest expression in which both NAME1 and NAME2
     are used.  */
  r1 = name1;
  s1 = stmt1;
  r2 = name2;
  s2 = stmt2;

  while (dist1 > dist2)
    {
      s1 = find_use_stmt (&r1);
      r1 = gimple_assign_lhs (s1);
      dist1--;
    }
  while (dist2 > dist1)
    {
      s2 = find_use_stmt (&r2);
      r2 = gimple_assign_lhs (s2);
      dist2--;
    }

  while (s1 != s2)
    {
      s1 = find_use_stmt (&r1);
      r1 = gimple_assign_lhs (s1);
      s2 = find_use_stmt (&r2);
      r2 = gimple_assign_lhs (s2);
    }

  /* Remove NAME1 and NAME2 from the statements in which they are currently
     used.  */
  remove_name_from_operation (stmt1, name1);
  remove_name_from_operation (stmt2, name2);

  /* Insert the new statement combining NAME1 and NAME2 before S1, and
     combine it with the rhs of S1.  */
  var = create_tmp_reg (type, "predreastmp");
  add_referenced_var (var);
  new_name = make_ssa_name (var, NULL);
  new_stmt = gimple_build_assign_with_ops (code, new_name, name1, name2);

  var = create_tmp_reg (type, "predreastmp");
  add_referenced_var (var);
  tmp_name = make_ssa_name (var, NULL);

  /* Rhs of S1 may now be either a binary expression with operation
     CODE, or a gimple_val (in case stmt1 == s1 or stmt2 == s1,
     so that name1 or name2 was removed from it).  */
  tmp_stmt = gimple_build_assign_with_ops (gimple_assign_rhs_code (s1),
                                           tmp_name,
                                           gimple_assign_rhs1 (s1),
                                           gimple_assign_rhs2 (s1));

  bsi = gsi_for_stmt (s1);
  gimple_assign_set_rhs_with_ops (&bsi, code, new_name, tmp_name);
  s1 = gsi_stmt (bsi);
  update_stmt (s1);

  gsi_insert_before (&bsi, new_stmt, GSI_SAME_STMT);
  gsi_insert_before (&bsi, tmp_stmt, GSI_SAME_STMT);

  return new_stmt;
}

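/* Illustrative example (hypothetical GIMPLE, for exposition only): combining
   x_1 and y_2 in

     a_3 = b_4 + x_1;
     c_5 = a_3 + y_2;

   removes x_1 and y_2 from their statements and rebuilds the expression as

     a_3 = b_4;
     n_6 = x_1 + y_2;
     t_7 = a_3;
     c_5 = n_6 + t_7;

   where the statement defining n_6 is the new statement returned to the
   caller, so that x_1 and y_2 are now combined in a single statement.  */
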
/* Returns the statement that combines references R1 and R2.  In case R1
   and R2 are not used in the same statement, but they are used with an
   associative and commutative operation in the same expression, reassociate
   the expression so that they are used in the same statement.  */

static gimple
stmt_combining_refs (dref r1, dref r2)
{
  gimple stmt1, stmt2;
  tree name1 = name_for_ref (r1);
  tree name2 = name_for_ref (r2);

  stmt1 = find_use_stmt (&name1);
  stmt2 = find_use_stmt (&name2);
  if (stmt1 == stmt2)
    return stmt1;

  return reassociate_to_the_same_stmt (name1, name2);
}

/* Tries to combine chains CH1 and CH2 together.  If this succeeds, the
   description of the new chain is returned, otherwise we return NULL.  */

static chain_p
combine_chains (chain_p ch1, chain_p ch2)
{
  dref r1, r2, nw;
  enum tree_code op = ERROR_MARK;
  bool swap = false;
  chain_p new_chain;
  unsigned i;
  gimple root_stmt;
  tree rslt_type = NULL_TREE;

  if (ch1 == ch2)
    return NULL;
  if (ch1->length != ch2->length)
    return NULL;

  if (VEC_length (dref, ch1->refs) != VEC_length (dref, ch2->refs))
    return NULL;

  for (i = 0; (VEC_iterate (dref, ch1->refs, i, r1)
               && VEC_iterate (dref, ch2->refs, i, r2)); i++)
    {
      if (r1->distance != r2->distance)
        return NULL;

      if (!combinable_refs_p (r1, r2, &op, &swap, &rslt_type))
        return NULL;
    }

  if (swap)
    {
      chain_p tmp = ch1;
      ch1 = ch2;
      ch2 = tmp;
    }

  new_chain = XCNEW (struct chain);
  new_chain->type = CT_COMBINATION;
  new_chain->op = op;
  new_chain->ch1 = ch1;
  new_chain->ch2 = ch2;
  new_chain->rslt_type = rslt_type;
  new_chain->length = ch1->length;

  for (i = 0; (VEC_iterate (dref, ch1->refs, i, r1)
               && VEC_iterate (dref, ch2->refs, i, r2)); i++)
    {
      nw = XCNEW (struct dref_d);
      nw->stmt = stmt_combining_refs (r1, r2);
      nw->distance = r1->distance;

      VEC_safe_push (dref, heap, new_chain->refs, nw);
    }

  new_chain->has_max_use_after = false;
  root_stmt = get_chain_root (new_chain)->stmt;
  for (i = 1; VEC_iterate (dref, new_chain->refs, i, nw); i++)
    {
      if (nw->distance == new_chain->length
          && !stmt_dominates_stmt_p (nw->stmt, root_stmt))
        {
          new_chain->has_max_use_after = true;
          break;
        }
    }

  ch1->combined = true;
  ch2->combined = true;
  return new_chain;
}

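/* Illustrative example (hypothetical references): if one load chain carries
   p[i] and p[i + 1], another carries q[i] and q[i + 1], and the loop only
   ever uses the corresponding elements as p[i] + q[i] and p[i + 1] + q[i + 1],
   the two chains can be combined into a single CT_COMBINATION chain whose
   elements are the sums, so that only the combined values have to be carried
   across iterations.  */
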
/* Try to combine the CHAINS.  */

static void
try_combine_chains (VEC (chain_p, heap) **chains)
{
  unsigned i, j;
  chain_p ch1, ch2, cch;
  VEC (chain_p, heap) *worklist = NULL;

  FOR_EACH_VEC_ELT (chain_p, *chains, i, ch1)
    if (chain_can_be_combined_p (ch1))
      VEC_safe_push (chain_p, heap, worklist, ch1);

  while (!VEC_empty (chain_p, worklist))
    {
      ch1 = VEC_pop (chain_p, worklist);
      if (!chain_can_be_combined_p (ch1))
        continue;

      FOR_EACH_VEC_ELT (chain_p, *chains, j, ch2)
        {
          if (!chain_can_be_combined_p (ch2))
            continue;

          cch = combine_chains (ch1, ch2);
          if (cch)
            {
              VEC_safe_push (chain_p, heap, worklist, cch);
              VEC_safe_push (chain_p, heap, *chains, cch);
              break;
            }
        }
    }
}

/* Prepare initializers for CHAIN in LOOP.  Returns false if this is
   impossible because one of these initializers may trap, true otherwise.  */

static bool
prepare_initializers_chain (struct loop *loop, chain_p chain)
{
  unsigned i, n = (chain->type == CT_INVARIANT) ? 1 : chain->length;
  struct data_reference *dr = get_chain_root (chain)->ref;
  tree init;
  gimple_seq stmts;
  dref laref;
  edge entry = loop_preheader_edge (loop);

  /* Find the initializers for the variables, and check that they cannot
     trap.  */
  chain->inits = VEC_alloc (tree, heap, n);
  for (i = 0; i < n; i++)
    VEC_quick_push (tree, chain->inits, NULL_TREE);

  /* If we have replaced some looparound phi nodes, use their initializers
     instead of creating our own.  */
  FOR_EACH_VEC_ELT (dref, chain->refs, i, laref)
    {
      if (gimple_code (laref->stmt) != GIMPLE_PHI)
        continue;

      gcc_assert (laref->distance > 0);
      VEC_replace (tree, chain->inits, n - laref->distance,
                   PHI_ARG_DEF_FROM_EDGE (laref->stmt, entry));
    }

  for (i = 0; i < n; i++)
    {
      if (VEC_index (tree, chain->inits, i) != NULL_TREE)
        continue;

      init = ref_at_iteration (loop, DR_REF (dr), (int) i - n);
      if (!init)
        return false;

      if (!chain->all_always_accessed && tree_could_trap_p (init))
        return false;

      init = force_gimple_operand (init, &stmts, false, NULL_TREE);
      if (stmts)
        gsi_insert_seq_on_edge_immediate (entry, stmts);

      VEC_replace (tree, chain->inits, i, init);
    }

  return true;
}

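/* Illustrative example (hypothetical chain): for a chain of length 2 whose
   root reference is q[i + 2] in a loop whose counter starts at 0, the
   initializers are the values of the root reference at iterations -2 and -1,
   i.e. q[0] and q[1]; they are computed on the preheader edge, except for the
   slots already provided by replaced looparound phi nodes.  */
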
/* Prepare initializers for CHAINS in LOOP, and free chains that cannot
   be used because the initializers might trap.  */

static void
prepare_initializers (struct loop *loop, VEC (chain_p, heap) *chains)
{
  chain_p chain;
  unsigned i;

  for (i = 0; i < VEC_length (chain_p, chains); )
    {
      chain = VEC_index (chain_p, chains, i);
      if (prepare_initializers_chain (loop, chain))
        i++;
      else
        {
          release_chain (chain);
          VEC_unordered_remove (chain_p, chains, i);
        }
    }
}

/* Performs predictive commoning for LOOP.  Returns true if LOOP was
   unrolled.  */

static bool
tree_predictive_commoning_loop (struct loop *loop)
{
  VEC (loop_p, heap) *loop_nest;
  VEC (data_reference_p, heap) *datarefs;
  VEC (ddr_p, heap) *dependences;
  struct component *components;
  VEC (chain_p, heap) *chains = NULL;
  unsigned unroll_factor;
  struct tree_niter_desc desc;
  bool unroll = false;
  edge exit;
  bitmap tmp_vars;

  if (dump_file && (dump_flags & TDF_DETAILS))
    fprintf (dump_file, "Processing loop %d\n", loop->num);

  /* Find the data references and split them into components according to their
     dependence relations.  */
  datarefs = VEC_alloc (data_reference_p, heap, 10);
  dependences = VEC_alloc (ddr_p, heap, 10);
  loop_nest = VEC_alloc (loop_p, heap, 3);
  if (! compute_data_dependences_for_loop (loop, true, &loop_nest, &datarefs,
                                           &dependences))
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file, "Cannot analyze data dependencies\n");
      VEC_free (loop_p, heap, loop_nest);
      free_data_refs (datarefs);
      free_dependence_relations (dependences);
      return false;
    }

  if (dump_file && (dump_flags & TDF_DETAILS))
    dump_data_dependence_relations (dump_file, dependences);

  components = split_data_refs_to_components (loop, datarefs, dependences);
  VEC_free (loop_p, heap, loop_nest);
  free_dependence_relations (dependences);
  if (!components)
    {
      free_data_refs (datarefs);
      return false;
    }

  if (dump_file && (dump_flags & TDF_DETAILS))
    {
      fprintf (dump_file, "Initial state:\n\n");
      dump_components (dump_file, components);
    }

  /* Find the suitable components and split them into chains.  */
  components = filter_suitable_components (loop, components);

  tmp_vars = BITMAP_ALLOC (NULL);
  looparound_phis = BITMAP_ALLOC (NULL);
  determine_roots (loop, components, &chains);
  release_components (components);

  if (!chains)
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file,
                 "Predictive commoning failed: no suitable chains\n");
      goto end;
    }
  prepare_initializers (loop, chains);

  /* Try to combine the chains that are always used together.  */
  try_combine_chains (&chains);

  if (dump_file && (dump_flags & TDF_DETAILS))
    {
      fprintf (dump_file, "Before commoning:\n\n");
      dump_chains (dump_file, chains);
    }

  /* Determine the unroll factor, and if the loop should be unrolled, ensure
     that its number of iterations is divisible by the factor.  */
  unroll_factor = determine_unroll_factor (chains);
  scev_reset ();
  unroll = (unroll_factor > 1
            && can_unroll_loop_p (loop, unroll_factor, &desc));
  exit = single_dom_exit (loop);

  /* Execute the predictive commoning transformations, and possibly unroll the
     loop.  */
  if (unroll)
    {
      struct epcc_data dta;

      if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file, "Unrolling %u times.\n", unroll_factor);

      dta.chains = chains;
      dta.tmp_vars = tmp_vars;

      update_ssa (TODO_update_ssa_only_virtuals);

      /* Cfg manipulations performed in tree_transform_and_unroll_loop before
         execute_pred_commoning_cbck is called may cause phi nodes to be
         reallocated, which is a problem since CHAINS may point to these
         statements.  To fix this, we store the ssa names defined by the
         phi nodes here instead of the phi nodes themselves, and restore
         the phi nodes in execute_pred_commoning_cbck.  A bit hacky.  */
      replace_phis_by_defined_names (chains);

      tree_transform_and_unroll_loop (loop, unroll_factor, exit, &desc,
                                      execute_pred_commoning_cbck, &dta);
      eliminate_temp_copies (loop, tmp_vars);
    }
  else
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file,
                 "Executing predictive commoning without unrolling.\n");
      execute_pred_commoning (loop, chains, tmp_vars);
    }

end: ;
  release_chains (chains);
  free_data_refs (datarefs);
  BITMAP_FREE (tmp_vars);
  BITMAP_FREE (looparound_phis);

  free_affine_expand_cache (&name_expansions);

  return unroll;
}

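/* Usage note (refers to the standard GCC options controlling this pass, not
   to anything defined in this file): the transformation is enabled by
   -fpredictive-commoning, which is on by default at -O3, and its decisions
   can be inspected with -fdump-tree-pcom-details.  */
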
/* Runs predictive commoning.  */

unsigned
tree_predictive_commoning (void)
{
  bool unrolled = false;
  struct loop *loop;
  loop_iterator li;
  unsigned ret = 0;

  initialize_original_copy_tables ();
  FOR_EACH_LOOP (li, loop, LI_ONLY_INNERMOST)
    if (optimize_loop_for_speed_p (loop))
      {
        unrolled |= tree_predictive_commoning_loop (loop);
      }

  if (unrolled)
    {
      scev_reset ();
      ret = TODO_cleanup_cfg;
    }
  free_original_copy_tables ();

  return ret;
}