| 1 | 711 | jeremybenn | @c markers: BUG TODO
 | 
      
         | 2 |  |  |  
 | 
      
         | 3 |  |  | @c Copyright (C) 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999,
 | 
      
         | 4 |  |  | @c 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
 | 
      
         | 5 |  |  | @c Free Software Foundation, Inc.
 | 
      
         | 6 |  |  | @c This is part of the GCC manual.
 | 
      
         | 7 |  |  | @c For copying conditions, see the file gcc.texi.
 | 
      
         | 8 |  |  |  
 | 
      
         | 9 |  |  | @node Passes
 | 
      
         | 10 |  |  | @chapter Passes and Files of the Compiler
 | 
      
         | 11 |  |  | @cindex passes and files of the compiler
 | 
      
         | 12 |  |  | @cindex files and passes of the compiler
 | 
      
         | 13 |  |  | @cindex compiler passes and files
 | 
      
         | 14 |  |  |  
 | 
      
         | 15 |  |  | This chapter is dedicated to giving an overview of the optimization and
 | 
      
         | 16 |  |  | code generation passes of the compiler.  In the process, it describes
 | 
      
         | 17 |  |  | some of the language front end interface, though this description is no
 | 
      
         | 18 |  |  | where near complete.
 | 
      
         | 19 |  |  |  
 | 
      
         | 20 |  |  | @menu
 | 
      
         | 21 |  |  | * Parsing pass::         The language front end turns text into bits.
 | 
      
         | 22 |  |  | * Gimplification pass::  The bits are turned into something we can optimize.
 | 
      
         | 23 |  |  | * Pass manager::         Sequencing the optimization passes.
 | 
      
         | 24 |  |  | * Tree SSA passes::      Optimizations on a high-level representation.
 | 
      
         | 25 |  |  | * RTL passes::           Optimizations on a low-level representation.
 | 
      
         | 26 |  |  | @end menu
 | 
      
         | 27 |  |  |  
 | 
      
         | 28 |  |  | @node Parsing pass
 | 
      
         | 29 |  |  | @section Parsing pass
 | 
      
         | 30 |  |  | @cindex GENERIC
 | 
      
         | 31 |  |  | @findex lang_hooks.parse_file
 | 
      
         | 32 |  |  | The language front end is invoked only once, via
 | 
      
         | 33 |  |  | @code{lang_hooks.parse_file}, to parse the entire input.  The language
 | 
      
         | 34 |  |  | front end may use any intermediate language representation deemed
 | 
      
         | 35 |  |  | appropriate.  The C front end uses GENERIC trees (@pxref{GENERIC}), plus
 | 
      
         | 36 |  |  | a double handful of language specific tree codes defined in
 | 
      
         | 37 |  |  | @file{c-common.def}.  The Fortran front end uses a completely different
 | 
      
         | 38 |  |  | private representation.
 | 
      
         | 39 |  |  |  
 | 
      
         | 40 |  |  | @cindex GIMPLE
 | 
      
         | 41 |  |  | @cindex gimplification
 | 
      
         | 42 |  |  | @cindex gimplifier
 | 
      
         | 43 |  |  | @cindex language-independent intermediate representation
 | 
      
         | 44 |  |  | @cindex intermediate representation lowering
 | 
      
         | 45 |  |  | @cindex lowering, language-dependent intermediate representation
 | 
      
         | 46 |  |  | At some point the front end must translate the representation used in the
 | 
      
         | 47 |  |  | front end to a representation understood by the language-independent
 | 
      
         | 48 |  |  | portions of the compiler.  Current practice takes one of two forms.
 | 
      
         | 49 |  |  | The C front end manually invokes the gimplifier (@pxref{GIMPLE}) on each function,
 | 
      
         | 50 |  |  | and uses the gimplifier callbacks to convert the language-specific tree
 | 
      
         | 51 |  |  | nodes directly to GIMPLE before passing the function off to be compiled.
 | 
      
         | 52 |  |  | The Fortran front end converts from a private representation to GENERIC,
 | 
      
         | 53 |  |  | which is later lowered to GIMPLE when the function is compiled.  Which
 | 
      
         | 54 |  |  | route to choose probably depends on how well GENERIC (plus extensions)
 | 
      
         | 55 |  |  | can be made to match up with the source language and necessary parsing
 | 
      
         | 56 |  |  | data structures.
 | 
      
         | 57 |  |  |  
 | 
      
         | 58 |  |  | BUG: Gimplification must occur before nested function lowering,
 | 
      
         | 59 |  |  | and nested function lowering must be done by the front end before
 | 
      
         | 60 |  |  | passing the data off to cgraph.
 | 
      
         | 61 |  |  |  
 | 
      
         | 62 |  |  | TODO: Cgraph should control nested function lowering.  It would
 | 
      
         | 63 |  |  | only be invoked when it is certain that the outer-most function
 | 
      
         | 64 |  |  | is used.
 | 
      
         | 65 |  |  |  
 | 
      
         | 66 |  |  | TODO: Cgraph needs a gimplify_function callback.  It should be
 | 
      
         | 67 |  |  | invoked when (1) it is certain that the function is used, (2)
 | 
      
         | 68 |  |  | warning flags specified by the user require some amount of
 | 
      
         | 69 |  |  | compilation in order to honor, (3) the language indicates that
 | 
      
         | 70 |  |  | semantic analysis is not complete until gimplification occurs.
 | 
      
         | 71 |  |  | Hum@dots{} this sounds overly complicated.  Perhaps we should just
 | 
      
         | 72 |  |  | have the front end gimplify always; in most cases it's only one
 | 
      
         | 73 |  |  | function call.
 | 
      
         | 74 |  |  |  
 | 
      
         | 75 |  |  | The front end needs to pass all function definitions and top level
 | 
      
         | 76 |  |  | declarations off to the middle-end so that they can be compiled and
 | 
      
         | 77 |  |  | emitted to the object file.  For a simple procedural language, it is
 | 
      
         | 78 |  |  | usually most convenient to do this as each top level declaration or
 | 
      
         | 79 |  |  | definition is seen.  There is also a distinction to be made between
 | 
      
         | 80 |  |  | generating functional code and generating complete debug information.
 | 
      
         | 81 |  |  | The only thing that is absolutely required for functional code is that
 | 
      
         | 82 |  |  | function and data @emph{definitions} be passed to the middle-end.  For
 | 
      
         | 83 |  |  | complete debug information, function, data and type declarations
 | 
      
         | 84 |  |  | should all be passed as well.
 | 
      
         | 85 |  |  |  
 | 
      
         | 86 |  |  | @findex rest_of_decl_compilation
 | 
      
         | 87 |  |  | @findex rest_of_type_compilation
 | 
      
         | 88 |  |  | @findex cgraph_finalize_function
 | 
      
         | 89 |  |  | In any case, the front end needs each complete top-level function or
 | 
      
         | 90 |  |  | data declaration, and each data definition should be passed to
 | 
      
         | 91 |  |  | @code{rest_of_decl_compilation}.  Each complete type definition should
 | 
      
         | 92 |  |  | be passed to @code{rest_of_type_compilation}.  Each function definition
 | 
      
         | 93 |  |  | should be passed to @code{cgraph_finalize_function}.
 | 
      
         | 94 |  |  |  
 | 
      
         | 95 |  |  | TODO: I know rest_of_compilation currently has all sorts of
 | 
      
         | 96 |  |  | RTL generation semantics.  I plan to move all code generation
 | 
      
         | 97 |  |  | bits (both Tree and RTL) to compile_function.  Should we hide
 | 
      
         | 98 |  |  | cgraph from the front ends and move back to rest_of_compilation
 | 
      
         | 99 |  |  | as the official interface?  Possibly we should rename all three
 | 
      
         | 100 |  |  | interfaces such that the names match in some meaningful way and
 | 
      
         | 101 |  |  | that is more descriptive than "rest_of".
 | 
      
         | 102 |  |  |  
 | 
      
         | 103 |  |  | The middle-end will, at its option, emit the function and data
 | 
      
         | 104 |  |  | definitions immediately or queue them for later processing.
 | 
      
         | 105 |  |  |  
 | 
      
         | 106 |  |  | @node Gimplification pass
 | 
      
         | 107 |  |  | @section Gimplification pass
 | 
      
         | 108 |  |  |  
 | 
      
         | 109 |  |  | @cindex gimplification
 | 
      
         | 110 |  |  | @cindex GIMPLE
 | 
      
         | 111 |  |  | @dfn{Gimplification} is a whimsical term for the process of converting
 | 
      
         | 112 |  |  | the intermediate representation of a function into the GIMPLE language
 | 
      
         | 113 |  |  | (@pxref{GIMPLE}).  The term stuck, and so words like ``gimplification'',
 | 
      
         | 114 |  |  | ``gimplify'', ``gimplifier'' and the like are sprinkled throughout this
 | 
      
         | 115 |  |  | section of code.
 | 
      
         | 116 |  |  |  
 | 
      
         | 117 |  |  | While a front end may certainly choose to generate GIMPLE directly if
 | 
      
         | 118 |  |  | it chooses, this can be a moderately complex process unless the
 | 
      
         | 119 |  |  | intermediate language used by the front end is already fairly simple.
 | 
      
         | 120 |  |  | Usually it is easier to generate GENERIC trees plus extensions
 | 
      
         | 121 |  |  | and let the language-independent gimplifier do most of the work.
 | 
      
         | 122 |  |  |  
 | 
      
         | 123 |  |  | @findex gimplify_function_tree
 | 
      
         | 124 |  |  | @findex gimplify_expr
 | 
      
         | 125 |  |  | @findex lang_hooks.gimplify_expr
 | 
      
         | 126 |  |  | The main entry point to this pass is @code{gimplify_function_tree}
 | 
      
         | 127 |  |  | located in @file{gimplify.c}.  From here we process the entire
 | 
      
         | 128 |  |  | function gimplifying each statement in turn.  The main workhorse
 | 
      
         | 129 |  |  | for this pass is @code{gimplify_expr}.  Approximately everything
 | 
      
         | 130 |  |  | passes through here at least once, and it is from here that we
 | 
      
         | 131 |  |  | invoke the @code{lang_hooks.gimplify_expr} callback.
 | 
      
         | 132 |  |  |  
 | 
      
         | 133 |  |  | The callback should examine the expression in question and return
 | 
      
         | 134 |  |  | @code{GS_UNHANDLED} if the expression is not a language specific
 | 
      
         | 135 |  |  | construct that requires attention.  Otherwise it should alter the
 | 
      
         | 136 |  |  | expression in some way to such that forward progress is made toward
 | 
      
         | 137 |  |  | producing valid GIMPLE@.  If the callback is certain that the
 | 
      
         | 138 |  |  | transformation is complete and the expression is valid GIMPLE, it
 | 
      
         | 139 |  |  | should return @code{GS_ALL_DONE}.  Otherwise it should return
 | 
      
         | 140 |  |  | @code{GS_OK}, which will cause the expression to be processed again.
 | 
      
         | 141 |  |  | If the callback encounters an error during the transformation (because
 | 
      
         | 142 |  |  | the front end is relying on the gimplification process to finish
 | 
      
         | 143 |  |  | semantic checks), it should return @code{GS_ERROR}.
 | 
      
         | 144 |  |  |  
 | 
      
         | 145 |  |  | @node Pass manager
 | 
      
         | 146 |  |  | @section Pass manager
 | 
      
         | 147 |  |  |  
 | 
      
         | 148 |  |  | The pass manager is located in @file{passes.c}, @file{tree-optimize.c}
 | 
      
         | 149 |  |  | and @file{tree-pass.h}.
 | 
      
         | 150 |  |  | Its job is to run all of the individual passes in the correct order,
 | 
      
         | 151 |  |  | and take care of standard bookkeeping that applies to every pass.
 | 
      
         | 152 |  |  |  
 | 
      
         | 153 |  |  | The theory of operation is that each pass defines a structure that
 | 
      
         | 154 |  |  | represents everything we need to know about that pass---when it
 | 
      
         | 155 |  |  | should be run, how it should be run, what intermediate language
 | 
      
         | 156 |  |  | form or on-the-side data structures it needs.  We register the pass
 | 
      
         | 157 |  |  | to be run in some particular order, and the pass manager arranges
 | 
      
         | 158 |  |  | for everything to happen in the correct order.
 | 
      
         | 159 |  |  |  
 | 
      
         | 160 |  |  | The actuality doesn't completely live up to the theory at present.
 | 
      
         | 161 |  |  | Command-line switches and @code{timevar_id_t} enumerations must still
 | 
      
         | 162 |  |  | be defined elsewhere.  The pass manager validates constraints but does
 | 
      
         | 163 |  |  | not attempt to (re-)generate data structures or lower intermediate
 | 
      
         | 164 |  |  | language form based on the requirements of the next pass.  Nevertheless,
 | 
      
         | 165 |  |  | what is present is useful, and a far sight better than nothing at all.
 | 
      
         | 166 |  |  |  
 | 
      
         | 167 |  |  | Each pass should have a unique name.
 | 
      
         | 168 |  |  | Each pass may have its own dump file (for GCC debugging purposes).
 | 
      
         | 169 |  |  | Passes with a name starting with a star do not dump anything.
 | 
      
         | 170 |  |  | Sometimes passes are supposed to share a dump file / option name.
 | 
      
         | 171 |  |  | To still give these unique names, you can use a prefix that is delimited
 | 
      
         | 172 |  |  | by a space from the part that is used for the dump file / option name.
 | 
      
         | 173 |  |  | E.g. When the pass name is "ud dce", the name used for dump file/options
 | 
      
         | 174 |  |  | is "dce".
 | 
      
         | 175 |  |  |  
 | 
      
         | 176 |  |  | TODO: describe the global variables set up by the pass manager,
 | 
      
         | 177 |  |  | and a brief description of how a new pass should use it.
 | 
      
         | 178 |  |  | I need to look at what info RTL passes use first@enddots{}
 | 
      
         | 179 |  |  |  
 | 
      
         | 180 |  |  | @node Tree SSA passes
 | 
      
         | 181 |  |  | @section Tree SSA passes
 | 
      
         | 182 |  |  |  
 | 
      
         | 183 |  |  | The following briefly describes the Tree optimization passes that are
 | 
      
         | 184 |  |  | run after gimplification and what source files they are located in.
 | 
      
         | 185 |  |  |  
 | 
      
         | 186 |  |  | @itemize @bullet
 | 
      
         | 187 |  |  | @item Remove useless statements
 | 
      
         | 188 |  |  |  
 | 
      
         | 189 |  |  | This pass is an extremely simple sweep across the gimple code in which
 | 
      
         | 190 |  |  | we identify obviously dead code and remove it.  Here we do things like
 | 
      
         | 191 |  |  | simplify @code{if} statements with constant conditions, remove
 | 
      
         | 192 |  |  | exception handling constructs surrounding code that obviously cannot
 | 
      
         | 193 |  |  | throw, remove lexical bindings that contain no variables, and other
 | 
      
         | 194 |  |  | assorted simplistic cleanups.  The idea is to get rid of the obvious
 | 
      
         | 195 |  |  | stuff quickly rather than wait until later when it's more work to get
 | 
      
         | 196 |  |  | rid of it.  This pass is located in @file{tree-cfg.c} and described by
 | 
      
         | 197 |  |  | @code{pass_remove_useless_stmts}.
 | 
      
         | 198 |  |  |  
 | 
      
         | 199 |  |  | @item Mudflap declaration registration
 | 
      
         | 200 |  |  |  
 | 
      
         | 201 |  |  | If mudflap (@pxref{Optimize Options,,-fmudflap -fmudflapth
 | 
      
         | 202 |  |  | -fmudflapir,gcc,Using the GNU Compiler Collection (GCC)}) is
 | 
      
         | 203 |  |  | enabled, we generate code to register some variable declarations with
 | 
      
         | 204 |  |  | the mudflap runtime.  Specifically, the runtime tracks the lifetimes of
 | 
      
         | 205 |  |  | those variable declarations that have their addresses taken, or whose
 | 
      
         | 206 |  |  | bounds are unknown at compile time (@code{extern}).  This pass generates
 | 
      
         | 207 |  |  | new exception handling constructs (@code{try}/@code{finally}), and so
 | 
      
         | 208 |  |  | must run before those are lowered.  In addition, the pass enqueues
 | 
      
         | 209 |  |  | declarations of static variables whose lifetimes extend to the entire
 | 
      
         | 210 |  |  | program.  The pass is located in @file{tree-mudflap.c} and is described
 | 
      
         | 211 |  |  | by @code{pass_mudflap_1}.
 | 
      
         | 212 |  |  |  
 | 
      
         | 213 |  |  | @item OpenMP lowering
 | 
      
         | 214 |  |  |  
 | 
      
         | 215 |  |  | If OpenMP generation (@option{-fopenmp}) is enabled, this pass lowers
 | 
      
         | 216 |  |  | OpenMP constructs into GIMPLE.
 | 
      
         | 217 |  |  |  
 | 
      
         | 218 |  |  | Lowering of OpenMP constructs involves creating replacement
 | 
      
         | 219 |  |  | expressions for local variables that have been mapped using data
 | 
      
         | 220 |  |  | sharing clauses, exposing the control flow of most synchronization
 | 
      
         | 221 |  |  | directives and adding region markers to facilitate the creation of the
 | 
      
         | 222 |  |  | control flow graph.  The pass is located in @file{omp-low.c} and is
 | 
      
         | 223 |  |  | described by @code{pass_lower_omp}.
 | 
      
         | 224 |  |  |  
 | 
      
         | 225 |  |  | @item OpenMP expansion
 | 
      
         | 226 |  |  |  
 | 
      
         | 227 |  |  | If OpenMP generation (@option{-fopenmp}) is enabled, this pass expands
 | 
      
         | 228 |  |  | parallel regions into their own functions to be invoked by the thread
 | 
      
         | 229 |  |  | library.  The pass is located in @file{omp-low.c} and is described by
 | 
      
         | 230 |  |  | @code{pass_expand_omp}.
 | 
      
         | 231 |  |  |  
 | 
      
         | 232 |  |  | @item Lower control flow
 | 
      
         | 233 |  |  |  
 | 
      
         | 234 |  |  | This pass flattens @code{if} statements (@code{COND_EXPR})
 | 
      
         | 235 |  |  | and moves lexical bindings (@code{BIND_EXPR}) out of line.  After
 | 
      
         | 236 |  |  | this pass, all @code{if} statements will have exactly two @code{goto}
 | 
      
         | 237 |  |  | statements in its @code{then} and @code{else} arms.  Lexical binding
 | 
      
         | 238 |  |  | information for each statement will be found in @code{TREE_BLOCK} rather
 | 
      
         | 239 |  |  | than being inferred from its position under a @code{BIND_EXPR}.  This
 | 
      
         | 240 |  |  | pass is found in @file{gimple-low.c} and is described by
 | 
      
         | 241 |  |  | @code{pass_lower_cf}.
 | 
      
         | 242 |  |  |  
 | 
      
         | 243 |  |  | @item Lower exception handling control flow
 | 
      
         | 244 |  |  |  
 | 
      
         | 245 |  |  | This pass decomposes high-level exception handling constructs
 | 
      
         | 246 |  |  | (@code{TRY_FINALLY_EXPR} and @code{TRY_CATCH_EXPR}) into a form
 | 
      
         | 247 |  |  | that explicitly represents the control flow involved.  After this
 | 
      
         | 248 |  |  | pass, @code{lookup_stmt_eh_region} will return a non-negative
 | 
      
         | 249 |  |  | number for any statement that may have EH control flow semantics;
 | 
      
         | 250 |  |  | examine @code{tree_can_throw_internal} or @code{tree_can_throw_external}
 | 
      
         | 251 |  |  | for exact semantics.  Exact control flow may be extracted from
 | 
      
         | 252 |  |  | @code{foreach_reachable_handler}.  The EH region nesting tree is defined
 | 
      
         | 253 |  |  | in @file{except.h} and built in @file{except.c}.  The lowering pass
 | 
      
         | 254 |  |  | itself is in @file{tree-eh.c} and is described by @code{pass_lower_eh}.
 | 
      
         | 255 |  |  |  
 | 
      
         | 256 |  |  | @item Build the control flow graph
 | 
      
         | 257 |  |  |  
 | 
      
         | 258 |  |  | This pass decomposes a function into basic blocks and creates all of
 | 
      
         | 259 |  |  | the edges that connect them.  It is located in @file{tree-cfg.c} and
 | 
      
         | 260 |  |  | is described by @code{pass_build_cfg}.
 | 
      
         | 261 |  |  |  
 | 
      
         | 262 |  |  | @item Find all referenced variables
 | 
      
         | 263 |  |  |  
 | 
      
         | 264 |  |  | This pass walks the entire function and collects an array of all
 | 
      
         | 265 |  |  | variables referenced in the function, @code{referenced_vars}.  The
 | 
      
         | 266 |  |  | index at which a variable is found in the array is used as a UID
 | 
      
         | 267 |  |  | for the variable within this function.  This data is needed by the
 | 
      
         | 268 |  |  | SSA rewriting routines.  The pass is located in @file{tree-dfa.c}
 | 
      
         | 269 |  |  | and is described by @code{pass_referenced_vars}.
 | 
      
         | 270 |  |  |  
 | 
      
         | 271 |  |  | @item Enter static single assignment form
 | 
      
         | 272 |  |  |  
 | 
      
         | 273 |  |  | This pass rewrites the function such that it is in SSA form.  After
 | 
      
         | 274 |  |  | this pass, all @code{is_gimple_reg} variables will be referenced by
 | 
      
         | 275 |  |  | @code{SSA_NAME}, and all occurrences of other variables will be
 | 
      
         | 276 |  |  | annotated with @code{VDEFS} and @code{VUSES}; PHI nodes will have
 | 
      
         | 277 |  |  | been inserted as necessary for each basic block.  This pass is
 | 
      
         | 278 |  |  | located in @file{tree-ssa.c} and is described by @code{pass_build_ssa}.
 | 
      
         | 279 |  |  |  
 | 
      
         | 280 |  |  | @item Warn for uninitialized variables
 | 
      
         | 281 |  |  |  
 | 
      
         | 282 |  |  | This pass scans the function for uses of @code{SSA_NAME}s that
 | 
      
         | 283 |  |  | are fed by default definition.  For non-parameter variables, such
 | 
      
         | 284 |  |  | uses are uninitialized.  The pass is run twice, before and after
 | 
      
         | 285 |  |  | optimization (if turned on).  In the first pass we only warn for uses that are
 | 
      
         | 286 |  |  | positively uninitialized; in the second pass we warn for uses that
 | 
      
         | 287 |  |  | are possibly uninitialized.  The pass is located in @file{tree-ssa.c}
 | 
      
         | 288 |  |  | and is defined by @code{pass_early_warn_uninitialized} and
 | 
      
         | 289 |  |  | @code{pass_late_warn_uninitialized}.
 | 
      
         | 290 |  |  |  
 | 
      
         | 291 |  |  | @item Dead code elimination
 | 
      
         | 292 |  |  |  
 | 
      
         | 293 |  |  | This pass scans the function for statements without side effects whose
 | 
      
         | 294 |  |  | result is unused.  It does not do memory life analysis, so any value
 | 
      
         | 295 |  |  | that is stored in memory is considered used.  The pass is run multiple
 | 
      
         | 296 |  |  | times throughout the optimization process.  It is located in
 | 
      
         | 297 |  |  | @file{tree-ssa-dce.c} and is described by @code{pass_dce}.
 | 
      
         | 298 |  |  |  
 | 
      
         | 299 |  |  | @item Dominator optimizations
 | 
      
         | 300 |  |  |  
 | 
      
         | 301 |  |  | This pass performs trivial dominator-based copy and constant propagation,
 | 
      
         | 302 |  |  | expression simplification, and jump threading.  It is run multiple times
 | 
      
         | 303 |  |  | throughout the optimization process.  It is located in @file{tree-ssa-dom.c}
 | 
      
         | 304 |  |  | and is described by @code{pass_dominator}.
 | 
      
         | 305 |  |  |  
 | 
      
         | 306 |  |  | @item Forward propagation of single-use variables
 | 
      
         | 307 |  |  |  
 | 
      
         | 308 |  |  | This pass attempts to remove redundant computation by substituting
 | 
      
         | 309 |  |  | variables that are used once into the expression that uses them and
 | 
      
         | 310 |  |  | seeing if the result can be simplified.  It is located in
 | 
      
         | 311 |  |  | @file{tree-ssa-forwprop.c} and is described by @code{pass_forwprop}.
 | 
      
         | 312 |  |  |  
 | 
      
         | 313 |  |  | @item Copy Renaming
 | 
      
         | 314 |  |  |  
 | 
      
         | 315 |  |  | This pass attempts to change the name of compiler temporaries involved in
 | 
      
         | 316 |  |  | copy operations such that SSA->normal can coalesce the copy away.  When compiler
 | 
      
         | 317 |  |  | temporaries are copies of user variables, it also renames the compiler
 | 
      
         | 318 |  |  | temporary to the user variable resulting in better use of user symbols.  It is
 | 
      
         | 319 |  |  | located in @file{tree-ssa-copyrename.c} and is described by
 | 
      
         | 320 |  |  | @code{pass_copyrename}.
 | 
      
         | 321 |  |  |  
 | 
      
         | 322 |  |  | @item PHI node optimizations
 | 
      
         | 323 |  |  |  
 | 
      
         | 324 |  |  | This pass recognizes forms of PHI inputs that can be represented as
 | 
      
         | 325 |  |  | conditional expressions and rewrites them into straight line code.
 | 
      
         | 326 |  |  | It is located in @file{tree-ssa-phiopt.c} and is described by
 | 
      
         | 327 |  |  | @code{pass_phiopt}.
 | 
      
         | 328 |  |  |  
 | 
      
         | 329 |  |  | @item May-alias optimization
 | 
      
         | 330 |  |  |  
 | 
      
         | 331 |  |  | This pass performs a flow sensitive SSA-based points-to analysis.
 | 
      
         | 332 |  |  | The resulting may-alias, must-alias, and escape analysis information
 | 
      
         | 333 |  |  | is used to promote variables from in-memory addressable objects to
 | 
      
         | 334 |  |  | non-aliased variables that can be renamed into SSA form.  We also
 | 
      
         | 335 |  |  | update the @code{VDEF}/@code{VUSE} memory tags for non-renameable
 | 
      
         | 336 |  |  | aggregates so that we get fewer false kills.  The pass is located
 | 
      
         | 337 |  |  | in @file{tree-ssa-alias.c} and is described by @code{pass_may_alias}.
 | 
      
         | 338 |  |  |  
 | 
      
         | 339 |  |  | Interprocedural points-to information is located in
 | 
      
         | 340 |  |  | @file{tree-ssa-structalias.c} and described by @code{pass_ipa_pta}.
 | 
      
         | 341 |  |  |  
 | 
      
         | 342 |  |  | @item Profiling
 | 
      
         | 343 |  |  |  
 | 
      
         | 344 |  |  | This pass rewrites the function in order to collect runtime block
 | 
      
         | 345 |  |  | and value profiling data.  Such data may be fed back into the compiler
 | 
      
         | 346 |  |  | on a subsequent run so as to allow optimization based on expected
 | 
      
         | 347 |  |  | execution frequencies.  The pass is located in @file{predict.c} and
 | 
      
         | 348 |  |  | is described by @code{pass_profile}.
 | 
      
         | 349 |  |  |  
 | 
      
         | 350 |  |  | @item Lower complex arithmetic
 | 
      
         | 351 |  |  |  
 | 
      
         | 352 |  |  | This pass rewrites complex arithmetic operations into their component
 | 
      
         | 353 |  |  | scalar arithmetic operations.  The pass is located in @file{tree-complex.c}
 | 
      
         | 354 |  |  | and is described by @code{pass_lower_complex}.
 | 
      
         | 355 |  |  |  
 | 
      
         | 356 |  |  | @item Scalar replacement of aggregates
 | 
      
         | 357 |  |  |  
 | 
      
         | 358 |  |  | This pass rewrites suitable non-aliased local aggregate variables into
 | 
      
         | 359 |  |  | a set of scalar variables.  The resulting scalar variables are
 | 
      
         | 360 |  |  | rewritten into SSA form, which allows subsequent optimization passes
 | 
      
         | 361 |  |  | to do a significantly better job with them.  The pass is located in
 | 
      
         | 362 |  |  | @file{tree-sra.c} and is described by @code{pass_sra}.
 | 
      
         | 363 |  |  |  
 | 
      
         | 364 |  |  | @item Dead store elimination
 | 
      
         | 365 |  |  |  
 | 
      
         | 366 |  |  | This pass eliminates stores to memory that are subsequently overwritten
 | 
      
         | 367 |  |  | by another store, without any intervening loads.  The pass is located
 | 
      
         | 368 |  |  | in @file{tree-ssa-dse.c} and is described by @code{pass_dse}.
 | 
      
         | 369 |  |  |  
 | 
      
         | 370 |  |  | @item Tail recursion elimination
 | 
      
         | 371 |  |  |  
 | 
      
         | 372 |  |  | This pass transforms tail recursion into a loop.  It is located in
 | 
      
         | 373 |  |  | @file{tree-tailcall.c} and is described by @code{pass_tail_recursion}.
 | 
      
         | 374 |  |  |  
 | 
      
         | 375 |  |  | @item Forward store motion
 | 
      
         | 376 |  |  |  
 | 
      
         | 377 |  |  | This pass sinks stores and assignments down the flowgraph closer to their
 | 
      
         | 378 |  |  | use point.  The pass is located in @file{tree-ssa-sink.c} and is
 | 
      
         | 379 |  |  | described by @code{pass_sink_code}.
 | 
      
         | 380 |  |  |  
 | 
      
         | 381 |  |  | @item Partial redundancy elimination
 | 
      
         | 382 |  |  |  
 | 
      
         | 383 |  |  | This pass eliminates partially redundant computations, as well as
 | 
      
         | 384 |  |  | performing load motion.  The pass is located in @file{tree-ssa-pre.c}
 | 
      
         | 385 |  |  | and is described by @code{pass_pre}.
 | 
      
         | 386 |  |  |  
 | 
      
         | 387 |  |  | Just before partial redundancy elimination, if
 | 
      
         | 388 |  |  | @option{-funsafe-math-optimizations} is on, GCC tries to convert
 | 
      
         | 389 |  |  | divisions to multiplications by the reciprocal.  The pass is located
 | 
      
         | 390 |  |  | in @file{tree-ssa-math-opts.c} and is described by
 | 
      
         | 391 |  |  | @code{pass_cse_reciprocal}.
 | 
      
         | 392 |  |  |  
 | 
      
         | 393 |  |  | @item Full redundancy elimination
 | 
      
         | 394 |  |  |  
 | 
      
         | 395 |  |  | This is a simpler form of PRE that only eliminates redundancies that
 | 
      
         | 396 |  |  | occur on all paths.  It is located in @file{tree-ssa-pre.c} and
 | 
      
         | 397 |  |  | described by @code{pass_fre}.
 | 
      
         | 398 |  |  |  
 | 
      
         | 399 |  |  | @item Loop optimization
 | 
      
         | 400 |  |  |  
 | 
      
         | 401 |  |  | The main driver of the pass is placed in @file{tree-ssa-loop.c}
 | 
      
         | 402 |  |  | and described by @code{pass_loop}.
 | 
      
         | 403 |  |  |  
 | 
      
         | 404 |  |  | The optimizations performed by this pass are:
 | 
      
         | 405 |  |  |  
 | 
      
         | 406 |  |  | Loop invariant motion.  This pass moves only invariants that
 | 
      
         | 407 |  |  | would be hard to handle on RTL level (function calls, operations that expand to
 | 
      
         | 408 |  |  | nontrivial sequences of insns).  With @option{-funswitch-loops} it also moves
 | 
      
         | 409 |  |  | operands of conditions that are invariant out of the loop, so that we can use
 | 
      
         | 410 |  |  | just trivial invariantness analysis in loop unswitching.  The pass also includes
 | 
      
         | 411 |  |  | store motion.  The pass is implemented in @file{tree-ssa-loop-im.c}.
 | 
      
         | 412 |  |  |  
 | 
      
         | 413 |  |  | Canonical induction variable creation.  This pass creates a simple counter
 | 
      
         | 414 |  |  | for number of iterations of the loop and replaces the exit condition of the
 | 
      
         | 415 |  |  | loop using it, in case when a complicated analysis is necessary to determine
 | 
      
         | 416 |  |  | the number of iterations.  Later optimizations then may determine the number
 | 
      
         | 417 |  |  | easily.  The pass is implemented in @file{tree-ssa-loop-ivcanon.c}.
 | 
      
         | 418 |  |  |  
 | 
      
         | 419 |  |  | Induction variable optimizations.  This pass performs standard induction
 | 
      
         | 420 |  |  | variable optimizations, including strength reduction, induction variable
 | 
      
         | 421 |  |  | merging and induction variable elimination.  The pass is implemented in
 | 
      
         | 422 |  |  | @file{tree-ssa-loop-ivopts.c}.
 | 
      
         | 423 |  |  |  
 | 
      
         | 424 |  |  | Loop unswitching.  This pass moves the conditional jumps that are invariant
 | 
      
         | 425 |  |  | out of the loops.  To achieve this, a duplicate of the loop is created for
 | 
      
         | 426 |  |  | each possible outcome of conditional jump(s).  The pass is implemented in
 | 
      
         | 427 |  |  | @file{tree-ssa-loop-unswitch.c}.  This pass should eventually replace the
 | 
      
         | 428 |  |  | RTL level loop unswitching in @file{loop-unswitch.c}, but currently
 | 
      
         | 429 |  |  | the RTL level pass is not completely redundant yet due to deficiencies
 | 
      
         | 430 |  |  | in tree level alias analysis.
 | 
      
         | 431 |  |  |  
 | 
      
         | 432 |  |  | The optimizations also use various utility functions contained in
 | 
      
         | 433 |  |  | @file{tree-ssa-loop-manip.c}, @file{cfgloop.c}, @file{cfgloopanal.c} and
 | 
      
         | 434 |  |  | @file{cfgloopmanip.c}.
 | 
      
         | 435 |  |  |  
 | 
      
         | 436 |  |  | Vectorization.  This pass transforms loops to operate on vector types
 | 
      
         | 437 |  |  | instead of scalar types.  Data parallelism across loop iterations is exploited
 | 
      
         | 438 |  |  | to group data elements from consecutive iterations into a vector and operate
 | 
      
         | 439 |  |  | on them in parallel.  Depending on available target support the loop is
 | 
      
         | 440 |  |  | conceptually unrolled by a factor @code{VF} (vectorization factor), which is
 | 
      
         | 441 |  |  | the number of elements operated upon in parallel in each iteration, and the
 | 
      
         | 442 |  |  | @code{VF} copies of each scalar operation are fused to form a vector operation.
 | 
      
         | 443 |  |  | Additional loop transformations such as peeling and versioning may take place
 | 
      
         | 444 |  |  | to align the number of iterations, and to align the memory accesses in the
 | 
      
         | 445 |  |  | loop.
 | 
      
         | 446 |  |  | The pass is implemented in @file{tree-vectorizer.c} (the main driver),
 | 
      
         | 447 |  |  | @file{tree-vect-loop.c} and @file{tree-vect-loop-manip.c} (loop specific parts
 | 
      
         | 448 |  |  | and general loop utilities), @file{tree-vect-slp} (loop-aware SLP
 | 
      
         | 449 |  |  | functionality), @file{tree-vect-stmts.c} and @file{tree-vect-data-refs.c}.
 | 
      
         | 450 |  |  | Analysis of data references is in @file{tree-data-ref.c}.
 | 
      
         | 451 |  |  |  
 | 
      
         | 452 |  |  | SLP Vectorization.  This pass performs vectorization of straight-line code. The
 | 
      
         | 453 |  |  | pass is implemented in @file{tree-vectorizer.c} (the main driver),
 | 
      
         | 454 |  |  | @file{tree-vect-slp.c}, @file{tree-vect-stmts.c} and
 | 
      
         | 455 |  |  | @file{tree-vect-data-refs.c}.
 | 
      
         | 456 |  |  |  
 | 
      
         | 457 |  |  | Autoparallelization.  This pass splits the loop iteration space to run
 | 
      
         | 458 |  |  | into several threads.  The pass is implemented in @file{tree-parloops.c}.
 | 
      
         | 459 |  |  |  
 | 
      
         | 460 |  |  | Graphite is a loop transformation framework based on the polyhedral
 | 
      
         | 461 |  |  | model.  Graphite stands for Gimple Represented as Polyhedra.  The
 | 
      
         | 462 |  |  | internals of this infrastructure are documented in
 | 
      
         | 463 |  |  | @w{@uref{http://gcc.gnu.org/wiki/Graphite}}.  The passes working on
 | 
      
         | 464 |  |  | this representation are implemented in the various @file{graphite-*}
 | 
      
         | 465 |  |  | files.
 | 
      
         | 466 |  |  |  
 | 
      
         | 467 |  |  | @item Tree level if-conversion for vectorizer
 | 
      
         | 468 |  |  |  
 | 
      
         | 469 |  |  | This pass applies if-conversion to simple loops to help vectorizer.
 | 
      
         | 470 |  |  | We identify if convertible loops, if-convert statements and merge
 | 
      
         | 471 |  |  | basic blocks in one big block.  The idea is to present loop in such
 | 
      
         | 472 |  |  | form so that vectorizer can have one to one mapping between statements
 | 
      
         | 473 |  |  | and available vector operations.  This pass is located in
 | 
      
         | 474 |  |  | @file{tree-if-conv.c} and is described by @code{pass_if_conversion}.
 | 
      
         | 475 |  |  |  
 | 
      
         | 476 |  |  | @item Conditional constant propagation
 | 
      
         | 477 |  |  |  
 | 
      
         | 478 |  |  | This pass relaxes a lattice of values in order to identify those
 | 
      
         | 479 |  |  | that must be constant even in the presence of conditional branches.
 | 
      
         | 480 |  |  | The pass is located in @file{tree-ssa-ccp.c} and is described
 | 
      
         | 481 |  |  | by @code{pass_ccp}.
 | 
      
         | 482 |  |  |  
 | 
      
         | 483 |  |  | A related pass that works on memory loads and stores, and not just
 | 
      
         | 484 |  |  | register values, is located in @file{tree-ssa-ccp.c} and described by
 | 
      
         | 485 |  |  | @code{pass_store_ccp}.
 | 
      
         | 486 |  |  |  
 | 
      
         | 487 |  |  | @item Conditional copy propagation
 | 
      
         | 488 |  |  |  
 | 
      
         | 489 |  |  | This is similar to constant propagation but the lattice of values is
 | 
      
         | 490 |  |  | the ``copy-of'' relation.  It eliminates redundant copies from the
 | 
      
         | 491 |  |  | code.  The pass is located in @file{tree-ssa-copy.c} and described by
 | 
      
         | 492 |  |  | @code{pass_copy_prop}.
 | 
      
         | 493 |  |  |  
 | 
      
         | 494 |  |  | A related pass that works on memory copies, and not just register
 | 
      
         | 495 |  |  | copies, is located in @file{tree-ssa-copy.c} and described by
 | 
      
         | 496 |  |  | @code{pass_store_copy_prop}.
 | 
      
         | 497 |  |  |  
 | 
      
         | 498 |  |  | @item Value range propagation
 | 
      
         | 499 |  |  |  
 | 
      
         | 500 |  |  | This transformation is similar to constant propagation but
 | 
      
         | 501 |  |  | instead of propagating single constant values, it propagates
 | 
      
         | 502 |  |  | known value ranges.  The implementation is based on Patterson's
 | 
      
         | 503 |  |  | range propagation algorithm (Accurate Static Branch Prediction by
 | 
      
         | 504 |  |  | Value Range Propagation, J. R. C. Patterson, PLDI '95).  In
 | 
      
         | 505 |  |  | contrast to Patterson's algorithm, this implementation does not
 | 
      
         | 506 |  |  | propagate branch probabilities nor it uses more than a single
 | 
      
         | 507 |  |  | range per SSA name. This means that the current implementation
 | 
      
         | 508 |  |  | cannot be used for branch prediction (though adapting it would
 | 
      
         | 509 |  |  | not be difficult).  The pass is located in @file{tree-vrp.c} and is
 | 
      
         | 510 |  |  | described by @code{pass_vrp}.
 | 
      
         | 511 |  |  |  
 | 
      
         | 512 |  |  | @item Folding built-in functions
 | 
      
         | 513 |  |  |  
 | 
      
         | 514 |  |  | This pass simplifies built-in functions, as applicable, with constant
 | 
      
         | 515 |  |  | arguments or with inferable string lengths.  It is located in
 | 
      
         | 516 |  |  | @file{tree-ssa-ccp.c} and is described by @code{pass_fold_builtins}.
 | 
      
         | 517 |  |  |  
 | 
      
         | 518 |  |  | @item Split critical edges
 | 
      
         | 519 |  |  |  
 | 
      
         | 520 |  |  | This pass identifies critical edges and inserts empty basic blocks
 | 
      
         | 521 |  |  | such that the edge is no longer critical.  The pass is located in
 | 
      
         | 522 |  |  | @file{tree-cfg.c} and is described by @code{pass_split_crit_edges}.
 | 
      
         | 523 |  |  |  
 | 
      
         | 524 |  |  | @item Control dependence dead code elimination
 | 
      
         | 525 |  |  |  
 | 
      
         | 526 |  |  | This pass is a stronger form of dead code elimination that can
 | 
      
         | 527 |  |  | eliminate unnecessary control flow statements.   It is located
 | 
      
         | 528 |  |  | in @file{tree-ssa-dce.c} and is described by @code{pass_cd_dce}.
 | 
      
         | 529 |  |  |  
 | 
      
         | 530 |  |  | @item Tail call elimination
 | 
      
         | 531 |  |  |  
 | 
      
         | 532 |  |  | This pass identifies function calls that may be rewritten into
 | 
      
         | 533 |  |  | jumps.  No code transformation is actually applied here, but the
 | 
      
         | 534 |  |  | data and control flow problem is solved.  The code transformation
 | 
      
         | 535 |  |  | requires target support, and so is delayed until RTL@.  In the
 | 
      
         | 536 |  |  | meantime @code{CALL_EXPR_TAILCALL} is set indicating the possibility.
 | 
      
         | 537 |  |  | The pass is located in @file{tree-tailcall.c} and is described by
 | 
      
         | 538 |  |  | @code{pass_tail_calls}.  The RTL transformation is handled by
 | 
      
         | 539 |  |  | @code{fixup_tail_calls} in @file{calls.c}.
 | 
      
         | 540 |  |  |  
 | 
      
         | 541 |  |  | @item Warn for function return without value
 | 
      
         | 542 |  |  |  
 | 
      
         | 543 |  |  | For non-void functions, this pass locates return statements that do
 | 
      
         | 544 |  |  | not specify a value and issues a warning.  Such a statement may have
 | 
      
         | 545 |  |  | been injected by falling off the end of the function.  This pass is
 | 
      
         | 546 |  |  | run last so that we have as much time as possible to prove that the
 | 
      
         | 547 |  |  | statement is not reachable.  It is located in @file{tree-cfg.c} and
 | 
      
         | 548 |  |  | is described by @code{pass_warn_function_return}.
 | 
      
         | 549 |  |  |  
 | 
      
         | 550 |  |  | @item Mudflap statement annotation
 | 
      
         | 551 |  |  |  
 | 
      
         | 552 |  |  | If mudflap is enabled, we rewrite some memory accesses with code to
 | 
      
         | 553 |  |  | validate that the memory access is correct.  In particular, expressions
 | 
      
         | 554 |  |  | involving pointer dereferences (@code{INDIRECT_REF}, @code{ARRAY_REF},
 | 
      
         | 555 |  |  | etc.) are replaced by code that checks the selected address range
 | 
      
         | 556 |  |  | against the mudflap runtime's database of valid regions.  This check
 | 
      
         | 557 |  |  | includes an inline lookup into a direct-mapped cache, based on
 | 
      
         | 558 |  |  | shift/mask operations of the pointer value, with a fallback function
 | 
      
         | 559 |  |  | call into the runtime.  The pass is located in @file{tree-mudflap.c} and
 | 
      
         | 560 |  |  | is described by @code{pass_mudflap_2}.
 | 
      
         | 561 |  |  |  
 | 
      
         | 562 |  |  | @item Leave static single assignment form
 | 
      
         | 563 |  |  |  
 | 
      
         | 564 |  |  | This pass rewrites the function such that it is in normal form.  At
 | 
      
         | 565 |  |  | the same time, we eliminate as many single-use temporaries as possible,
 | 
      
         | 566 |  |  | so the intermediate language is no longer GIMPLE, but GENERIC@.  The
 | 
      
         | 567 |  |  | pass is located in @file{tree-outof-ssa.c} and is described by
 | 
      
         | 568 |  |  | @code{pass_del_ssa}.
 | 
      
         | 569 |  |  |  
 | 
      
         | 570 |  |  | @item Merge PHI nodes that feed into one another
 | 
      
         | 571 |  |  |  
 | 
      
         | 572 |  |  | This is part of the CFG cleanup passes.  It attempts to join PHI nodes
 | 
      
         | 573 |  |  | from a forwarder CFG block into another block with PHI nodes.  The
 | 
      
         | 574 |  |  | pass is located in @file{tree-cfgcleanup.c} and is described by
 | 
      
         | 575 |  |  | @code{pass_merge_phi}.
 | 
      
         | 576 |  |  |  
 | 
      
         | 577 |  |  | @item Return value optimization
 | 
      
         | 578 |  |  |  
 | 
      
         | 579 |  |  | If a function always returns the same local variable, and that local
 | 
      
         | 580 |  |  | variable is an aggregate type, then the variable is replaced with the
 | 
      
         | 581 |  |  | return value for the function (i.e., the function's DECL_RESULT).  This
 | 
      
         | 582 |  |  | is equivalent to the C++ named return value optimization applied to
 | 
      
         | 583 |  |  | GIMPLE@.  The pass is located in @file{tree-nrv.c} and is described by
 | 
      
         | 584 |  |  | @code{pass_nrv}.
 | 
      
         | 585 |  |  |  
 | 
      
         | 586 |  |  | @item Return slot optimization
 | 
      
         | 587 |  |  |  
 | 
      
         | 588 |  |  | If a function returns a memory object and is called as @code{var =
 | 
      
         | 589 |  |  | foo()}, this pass tries to change the call so that the address of
 | 
      
         | 590 |  |  | @code{var} is sent to the caller to avoid an extra memory copy.  This
 | 
      
         | 591 |  |  | pass is located in @code{tree-nrv.c} and is described by
 | 
      
         | 592 |  |  | @code{pass_return_slot}.
 | 
      
         | 593 |  |  |  
 | 
      
         | 594 |  |  | @item Optimize calls to @code{__builtin_object_size}
 | 
      
         | 595 |  |  |  
 | 
      
         | 596 |  |  | This is a propagation pass similar to CCP that tries to remove calls
 | 
      
         | 597 |  |  | to @code{__builtin_object_size} when the size of the object can be
 | 
      
         | 598 |  |  | computed at compile-time.  This pass is located in
 | 
      
         | 599 |  |  | @file{tree-object-size.c} and is described by
 | 
      
         | 600 |  |  | @code{pass_object_sizes}.
 | 
      
         | 601 |  |  |  
 | 
      
         | 602 |  |  | @item Loop invariant motion
 | 
      
         | 603 |  |  |  
 | 
      
         | 604 |  |  | This pass removes expensive loop-invariant computations out of loops.
 | 
      
         | 605 |  |  | The pass is located in @file{tree-ssa-loop.c} and described by
 | 
      
         | 606 |  |  | @code{pass_lim}.
 | 
      
         | 607 |  |  |  
 | 
      
         | 608 |  |  | @item Loop nest optimizations
 | 
      
         | 609 |  |  |  
 | 
      
         | 610 |  |  | This is a family of loop transformations that works on loop nests.  It
 | 
      
         | 611 |  |  | includes loop interchange, scaling, skewing and reversal and they are
 | 
      
         | 612 |  |  | all geared to the optimization of data locality in array traversals
 | 
      
         | 613 |  |  | and the removal of dependencies that hamper optimizations such as loop
 | 
      
         | 614 |  |  | parallelization and vectorization.  The pass is located in
 | 
      
         | 615 |  |  | @file{tree-loop-linear.c} and described by
 | 
      
         | 616 |  |  | @code{pass_linear_transform}.
 | 
      
         | 617 |  |  |  
 | 
      
         | 618 |  |  | @item Removal of empty loops
 | 
      
         | 619 |  |  |  
 | 
      
         | 620 |  |  | This pass removes loops with no code in them.  The pass is located in
 | 
      
         | 621 |  |  | @file{tree-ssa-loop-ivcanon.c} and described by
 | 
      
         | 622 |  |  | @code{pass_empty_loop}.
 | 
      
         | 623 |  |  |  
 | 
      
         | 624 |  |  | @item Unrolling of small loops
 | 
      
         | 625 |  |  |  
 | 
      
         | 626 |  |  | This pass completely unrolls loops with few iterations.  The pass
 | 
      
         | 627 |  |  | is located in @file{tree-ssa-loop-ivcanon.c} and described by
 | 
      
         | 628 |  |  | @code{pass_complete_unroll}.
 | 
      
         | 629 |  |  |  
 | 
      
         | 630 |  |  | @item Predictive commoning
 | 
      
         | 631 |  |  |  
 | 
      
         | 632 |  |  | This pass makes the code reuse the computations from the previous
 | 
      
         | 633 |  |  | iterations of the loops, especially loads and stores to memory.
 | 
      
         | 634 |  |  | It does so by storing the values of these computations to a bank
 | 
      
         | 635 |  |  | of temporary variables that are rotated at the end of loop.  To avoid
 | 
      
         | 636 |  |  | the need for this rotation, the loop is then unrolled and the copies
 | 
      
         | 637 |  |  | of the loop body are rewritten to use the appropriate version of
 | 
      
         | 638 |  |  | the temporary variable.  This pass is located in @file{tree-predcom.c}
 | 
      
         | 639 |  |  | and described by @code{pass_predcom}.
 | 
      
         | 640 |  |  |  
 | 
      
         | 641 |  |  | @item Array prefetching
 | 
      
         | 642 |  |  |  
 | 
      
         | 643 |  |  | This pass issues prefetch instructions for array references inside
 | 
      
         | 644 |  |  | loops.  The pass is located in @file{tree-ssa-loop-prefetch.c} and
 | 
      
         | 645 |  |  | described by @code{pass_loop_prefetch}.
 | 
      
         | 646 |  |  |  
 | 
      
         | 647 |  |  | @item Reassociation
 | 
      
         | 648 |  |  |  
 | 
      
         | 649 |  |  | This pass rewrites arithmetic expressions to enable optimizations that
 | 
      
         | 650 |  |  | operate on them, like redundancy elimination and vectorization.  The
 | 
      
         | 651 |  |  | pass is located in @file{tree-ssa-reassoc.c} and described by
 | 
      
         | 652 |  |  | @code{pass_reassoc}.
 | 
      
         | 653 |  |  |  
 | 
      
         | 654 |  |  | @item Optimization of @code{stdarg} functions
 | 
      
         | 655 |  |  |  
 | 
      
         | 656 |  |  | This pass tries to avoid the saving of register arguments into the
 | 
      
         | 657 |  |  | stack on entry to @code{stdarg} functions.  If the function doesn't
 | 
      
         | 658 |  |  | use any @code{va_start} macros, no registers need to be saved.  If
 | 
      
         | 659 |  |  | @code{va_start} macros are used, the @code{va_list} variables don't
 | 
      
         | 660 |  |  | escape the function, it is only necessary to save registers that will
 | 
      
         | 661 |  |  | be used in @code{va_arg} macros.  For instance, if @code{va_arg} is
 | 
      
         | 662 |  |  | only used with integral types in the function, floating point
 | 
      
         | 663 |  |  | registers don't need to be saved.  This pass is located in
 | 
      
         | 664 |  |  | @code{tree-stdarg.c} and described by @code{pass_stdarg}.
 | 
      
         | 665 |  |  |  
 | 
      
         | 666 |  |  | @end itemize
 | 
      
         | 667 |  |  |  
 | 
      
         | 668 |  |  | @node RTL passes
 | 
      
         | 669 |  |  | @section RTL passes
 | 
      
         | 670 |  |  |  
 | 
      
         | 671 |  |  | The following briefly describes the RTL generation and optimization
 | 
      
         | 672 |  |  | passes that are run after the Tree optimization passes.
 | 
      
         | 673 |  |  |  
 | 
      
         | 674 |  |  | @itemize @bullet
 | 
      
         | 675 |  |  | @item RTL generation
 | 
      
         | 676 |  |  |  
 | 
      
         | 677 |  |  | @c Avoiding overfull is tricky here.
 | 
      
         | 678 |  |  | The source files for RTL generation include
 | 
      
         | 679 |  |  | @file{stmt.c},
 | 
      
         | 680 |  |  | @file{calls.c},
 | 
      
         | 681 |  |  | @file{expr.c},
 | 
      
         | 682 |  |  | @file{explow.c},
 | 
      
         | 683 |  |  | @file{expmed.c},
 | 
      
         | 684 |  |  | @file{function.c},
 | 
      
         | 685 |  |  | @file{optabs.c}
 | 
      
         | 686 |  |  | and @file{emit-rtl.c}.
 | 
      
         | 687 |  |  | Also, the file
 | 
      
         | 688 |  |  | @file{insn-emit.c}, generated from the machine description by the
 | 
      
         | 689 |  |  | program @code{genemit}, is used in this pass.  The header file
 | 
      
         | 690 |  |  | @file{expr.h} is used for communication within this pass.
 | 
      
         | 691 |  |  |  
 | 
      
         | 692 |  |  | @findex genflags
 | 
      
         | 693 |  |  | @findex gencodes
 | 
      
         | 694 |  |  | The header files @file{insn-flags.h} and @file{insn-codes.h},
 | 
      
         | 695 |  |  | generated from the machine description by the programs @code{genflags}
 | 
      
         | 696 |  |  | and @code{gencodes}, tell this pass which standard names are available
 | 
      
         | 697 |  |  | for use and which patterns correspond to them.
 | 
      
         | 698 |  |  |  
 | 
      
         | 699 |  |  | @item Generation of exception landing pads
 | 
      
         | 700 |  |  |  
 | 
      
         | 701 |  |  | This pass generates the glue that handles communication between the
 | 
      
         | 702 |  |  | exception handling library routines and the exception handlers within
 | 
      
         | 703 |  |  | the function.  Entry points in the function that are invoked by the
 | 
      
         | 704 |  |  | exception handling library are called @dfn{landing pads}.  The code
 | 
      
         | 705 |  |  | for this pass is located in @file{except.c}.
 | 
      
         | 706 |  |  |  
 | 
      
         | 707 |  |  | @item Control flow graph cleanup
 | 
      
         | 708 |  |  |  
 | 
      
         | 709 |  |  | This pass removes unreachable code, simplifies jumps to next, jumps to
 | 
      
         | 710 |  |  | jump, jumps across jumps, etc.  The pass is run multiple times.
 | 
      
         | 711 |  |  | For historical reasons, it is occasionally referred to as the ``jump
 | 
      
         | 712 |  |  | optimization pass''.  The bulk of the code for this pass is in
 | 
      
         | 713 |  |  | @file{cfgcleanup.c}, and there are support routines in @file{cfgrtl.c}
 | 
      
         | 714 |  |  | and @file{jump.c}.
 | 
      
         | 715 |  |  |  
 | 
      
         | 716 |  |  | @item Forward propagation of single-def values
 | 
      
         | 717 |  |  |  
 | 
      
         | 718 |  |  | This pass attempts to remove redundant computation by substituting
 | 
      
         | 719 |  |  | variables that come from a single definition, and
 | 
      
         | 720 |  |  | seeing if the result can be simplified.  It performs copy propagation
 | 
      
         | 721 |  |  | and addressing mode selection.  The pass is run twice, with values
 | 
      
         | 722 |  |  | being propagated into loops only on the second run.  The code is
 | 
      
         | 723 |  |  | located in @file{fwprop.c}.
 | 
      
         | 724 |  |  |  
 | 
      
         | 725 |  |  | @item Common subexpression elimination
 | 
      
         | 726 |  |  |  
 | 
      
         | 727 |  |  | This pass removes redundant computation within basic blocks, and
 | 
      
         | 728 |  |  | optimizes addressing modes based on cost.  The pass is run twice.
 | 
      
         | 729 |  |  | The code for this pass is located in @file{cse.c}.
 | 
      
         | 730 |  |  |  
 | 
      
         | 731 |  |  | @item Global common subexpression elimination
 | 
      
         | 732 |  |  |  
 | 
      
         | 733 |  |  | This pass performs two
 | 
      
         | 734 |  |  | different types of GCSE  depending on whether you are optimizing for
 | 
      
         | 735 |  |  | size or not (LCM based GCSE tends to increase code size for a gain in
 | 
      
         | 736 |  |  | speed, while Morel-Renvoise based GCSE does not).
 | 
      
         | 737 |  |  | When optimizing for size, GCSE is done using Morel-Renvoise Partial
 | 
      
         | 738 |  |  | Redundancy Elimination, with the exception that it does not try to move
 | 
      
         | 739 |  |  | invariants out of loops---that is left to  the loop optimization pass.
 | 
      
         | 740 |  |  | If MR PRE GCSE is done, code hoisting (aka unification) is also done, as
 | 
      
         | 741 |  |  | well as load motion.
 | 
      
         | 742 |  |  | If you are optimizing for speed, LCM (lazy code motion) based GCSE is
 | 
      
         | 743 |  |  | done.  LCM is based on the work of Knoop, Ruthing, and Steffen.  LCM
 | 
      
         | 744 |  |  | based GCSE also does loop invariant code motion.  We also perform load
 | 
      
         | 745 |  |  | and store motion when optimizing for speed.
 | 
      
         | 746 |  |  | Regardless of which type of GCSE is used, the GCSE pass also performs
 | 
      
         | 747 |  |  | global constant and  copy propagation.
 | 
      
         | 748 |  |  | The source file for this pass is @file{gcse.c}, and the LCM routines
 | 
      
         | 749 |  |  | are in @file{lcm.c}.
 | 
      
         | 750 |  |  |  
 | 
      
         | 751 |  |  | @item Loop optimization
 | 
      
         | 752 |  |  |  
 | 
      
         | 753 |  |  | This pass performs several loop related optimizations.
 | 
      
         | 754 |  |  | The source files @file{cfgloopanal.c} and @file{cfgloopmanip.c} contain
 | 
      
         | 755 |  |  | generic loop analysis and manipulation code.  Initialization and finalization
 | 
      
         | 756 |  |  | of loop structures is handled by @file{loop-init.c}.
 | 
      
         | 757 |  |  | A loop invariant motion pass is implemented in @file{loop-invariant.c}.
 | 
      
         | 758 |  |  | Basic block level optimizations---unrolling, peeling and unswitching loops---
 | 
      
         | 759 |  |  | are implemented in @file{loop-unswitch.c} and @file{loop-unroll.c}.
 | 
      
         | 760 |  |  | Replacing of the exit condition of loops by special machine-dependent
 | 
      
         | 761 |  |  | instructions is handled by @file{loop-doloop.c}.
 | 
      
         | 762 |  |  |  
 | 
      
         | 763 |  |  | @item Jump bypassing
 | 
      
         | 764 |  |  |  
 | 
      
         | 765 |  |  | This pass is an aggressive form of GCSE that transforms the control
 | 
      
         | 766 |  |  | flow graph of a function by propagating constants into conditional
 | 
      
         | 767 |  |  | branch instructions.  The source file for this pass is @file{gcse.c}.
 | 
      
         | 768 |  |  |  
 | 
      
         | 769 |  |  | @item If conversion
 | 
      
         | 770 |  |  |  
 | 
      
         | 771 |  |  | This pass attempts to replace conditional branches and surrounding
 | 
      
         | 772 |  |  | assignments with arithmetic, boolean value producing comparison
 | 
      
         | 773 |  |  | instructions, and conditional move instructions.  In the very last
 | 
      
         | 774 |  |  | invocation after reload, it will generate predicated instructions
 | 
      
         | 775 |  |  | when supported by the target.  The code is located in @file{ifcvt.c}.
 | 
      
         | 776 |  |  |  
 | 
      
         | 777 |  |  | @item Web construction
 | 
      
         | 778 |  |  |  
 | 
      
         | 779 |  |  | This pass splits independent uses of each pseudo-register.  This can
 | 
      
         | 780 |  |  | improve effect of the other transformation, such as CSE or register
 | 
      
         | 781 |  |  | allocation.  The code for this pass is located in @file{web.c}.
 | 
      
         | 782 |  |  |  
 | 
      
         | 783 |  |  | @item Instruction combination
 | 
      
         | 784 |  |  |  
 | 
      
         | 785 |  |  | This pass attempts to combine groups of two or three instructions that
 | 
      
         | 786 |  |  | are related by data flow into single instructions.  It combines the
 | 
      
         | 787 |  |  | RTL expressions for the instructions by substitution, simplifies the
 | 
      
         | 788 |  |  | result using algebra, and then attempts to match the result against
 | 
      
         | 789 |  |  | the machine description.  The code is located in @file{combine.c}.
 | 
      
         | 790 |  |  |  
 | 
      
         | 791 |  |  | @item Register movement
 | 
      
         | 792 |  |  |  
 | 
      
         | 793 |  |  | This pass looks for cases where matching constraints would force an
 | 
      
         | 794 |  |  | instruction to need a reload, and this reload would be a
 | 
      
         | 795 |  |  | register-to-register move.  It then attempts to change the registers
 | 
      
         | 796 |  |  | used by the instruction to avoid the move instruction.  The code is
 | 
      
         | 797 |  |  | located in @file{regmove.c}.
 | 
      
         | 798 |  |  |  
 | 
      
         | 799 |  |  | @item Mode switching optimization
 | 
      
         | 800 |  |  |  
 | 
      
         | 801 |  |  | This pass looks for instructions that require the processor to be in a
 | 
      
         | 802 |  |  | specific ``mode'' and minimizes the number of mode changes required to
 | 
      
         | 803 |  |  | satisfy all users.  What these modes are, and what they apply to are
 | 
      
         | 804 |  |  | completely target-specific.  The code for this pass is located in
 | 
      
         | 805 |  |  | @file{mode-switching.c}.
 | 
      
         | 806 |  |  |  
 | 
      
         | 807 |  |  | @cindex modulo scheduling
 | 
      
         | 808 |  |  | @cindex sms, swing, software pipelining
 | 
      
         | 809 |  |  | @item Modulo scheduling
 | 
      
         | 810 |  |  |  
 | 
      
         | 811 |  |  | This pass looks at innermost loops and reorders their instructions
 | 
      
         | 812 |  |  | by overlapping different iterations.  Modulo scheduling is performed
 | 
      
         | 813 |  |  | immediately before instruction scheduling.  The code for this pass is
 | 
      
         | 814 |  |  | located in @file{modulo-sched.c}.
 | 
      
         | 815 |  |  |  
 | 
      
         | 816 |  |  | @item Instruction scheduling
 | 
      
         | 817 |  |  |  
 | 
      
         | 818 |  |  | This pass looks for instructions whose output will not be available by
 | 
      
         | 819 |  |  | the time that it is used in subsequent instructions.  Memory loads and
 | 
      
         | 820 |  |  | floating point instructions often have this behavior on RISC machines.
 | 
      
         | 821 |  |  | It re-orders instructions within a basic block to try to separate the
 | 
      
         | 822 |  |  | definition and use of items that otherwise would cause pipeline
 | 
      
         | 823 |  |  | stalls.  This pass is performed twice, before and after register
 | 
      
         | 824 |  |  | allocation.  The code for this pass is located in @file{haifa-sched.c},
 | 
      
         | 825 |  |  | @file{sched-deps.c}, @file{sched-ebb.c}, @file{sched-rgn.c} and
 | 
      
         | 826 |  |  | @file{sched-vis.c}.
 | 
      
         | 827 |  |  |  
 | 
      
         | 828 |  |  | @item Register allocation
 | 
      
         | 829 |  |  |  
 | 
      
         | 830 |  |  | These passes make sure that all occurrences of pseudo registers are
 | 
      
         | 831 |  |  | eliminated, either by allocating them to a hard register, replacing
 | 
      
         | 832 |  |  | them by an equivalent expression (e.g.@: a constant) or by placing
 | 
      
         | 833 |  |  | them on the stack.  This is done in several subpasses:
 | 
      
         | 834 |  |  |  
 | 
      
         | 835 |  |  | @itemize @bullet
 | 
      
         | 836 |  |  | @item
 | 
      
         | 837 |  |  | Register move optimizations.  This pass makes some simple RTL code
 | 
      
         | 838 |  |  | transformations which improve the subsequent register allocation.  The
 | 
      
         | 839 |  |  | source file is @file{regmove.c}.
 | 
      
         | 840 |  |  |  
 | 
      
         | 841 |  |  | @item
 | 
      
         | 842 |  |  | The integrated register allocator (@acronym{IRA}).  It is called
 | 
      
         | 843 |  |  | integrated because coalescing, register live range splitting, and hard
 | 
      
         | 844 |  |  | register preferencing are done on-the-fly during coloring.  It also
 | 
      
         | 845 |  |  | has better integration with the reload pass.  Pseudo-registers spilled
 | 
      
         | 846 |  |  | by the allocator or the reload have still a chance to get
 | 
      
         | 847 |  |  | hard-registers if the reload evicts some pseudo-registers from
 | 
      
         | 848 |  |  | hard-registers.  The allocator helps to choose better pseudos for
 | 
      
         | 849 |  |  | spilling based on their live ranges and to coalesce stack slots
 | 
      
         | 850 |  |  | allocated for the spilled pseudo-registers.  IRA is a regional
 | 
      
         | 851 |  |  | register allocator which is transformed into Chaitin-Briggs allocator
 | 
      
         | 852 |  |  | if there is one region.  By default, IRA chooses regions using
 | 
      
         | 853 |  |  | register pressure but the user can force it to use one region or
 | 
      
         | 854 |  |  | regions corresponding to all loops.
 | 
      
         | 855 |  |  |  
 | 
      
         | 856 |  |  | Source files of the allocator are @file{ira.c}, @file{ira-build.c},
 | 
      
         | 857 |  |  | @file{ira-costs.c}, @file{ira-conflicts.c}, @file{ira-color.c},
 | 
      
         | 858 |  |  | @file{ira-emit.c}, @file{ira-lives}, plus header files @file{ira.h}
 | 
      
         | 859 |  |  | and @file{ira-int.h} used for the communication between the allocator
 | 
      
         | 860 |  |  | and the rest of the compiler and between the IRA files.
 | 
      
         | 861 |  |  |  
 | 
      
         | 862 |  |  | @cindex reloading
 | 
      
         | 863 |  |  | @item
 | 
      
         | 864 |  |  | Reloading.  This pass renumbers pseudo registers with the hardware
 | 
      
         | 865 |  |  | registers numbers they were allocated.  Pseudo registers that did not
 | 
      
         | 866 |  |  | get hard registers are replaced with stack slots.  Then it finds
 | 
      
         | 867 |  |  | instructions that are invalid because a value has failed to end up in
 | 
      
         | 868 |  |  | a register, or has ended up in a register of the wrong kind.  It fixes
 | 
      
         | 869 |  |  | up these instructions by reloading the problematical values
 | 
      
         | 870 |  |  | temporarily into registers.  Additional instructions are generated to
 | 
      
         | 871 |  |  | do the copying.
 | 
      
         | 872 |  |  |  
 | 
      
         | 873 |  |  | The reload pass also optionally eliminates the frame pointer and inserts
 | 
      
         | 874 |  |  | instructions to save and restore call-clobbered registers around calls.
 | 
      
         | 875 |  |  |  
 | 
      
         | 876 |  |  | Source files are @file{reload.c} and @file{reload1.c}, plus the header
 | 
      
         | 877 |  |  | @file{reload.h} used for communication between them.
 | 
      
         | 878 |  |  | @end itemize
 | 
      
         | 879 |  |  |  
 | 
      
         | 880 |  |  | @item Basic block reordering
 | 
      
         | 881 |  |  |  
 | 
      
         | 882 |  |  | This pass implements profile guided code positioning.  If profile
 | 
      
         | 883 |  |  | information is not available, various types of static analysis are
 | 
      
         | 884 |  |  | performed to make the predictions normally coming from the profile
 | 
      
         | 885 |  |  | feedback (IE execution frequency, branch probability, etc).  It is
 | 
      
         | 886 |  |  | implemented in the file @file{bb-reorder.c}, and the various
 | 
      
         | 887 |  |  | prediction routines are in @file{predict.c}.
 | 
      
         | 888 |  |  |  
 | 
      
         | 889 |  |  | @item Variable tracking
 | 
      
         | 890 |  |  |  
 | 
      
         | 891 |  |  | This pass computes where the variables are stored at each
 | 
      
         | 892 |  |  | position in code and generates notes describing the variable locations
 | 
      
         | 893 |  |  | to RTL code.  The location lists are then generated according to these
 | 
      
         | 894 |  |  | notes to debug information if the debugging information format supports
 | 
      
         | 895 |  |  | location lists.  The code is located in @file{var-tracking.c}.
 | 
      
         | 896 |  |  |  
 | 
      
         | 897 |  |  | @item Delayed branch scheduling
 | 
      
         | 898 |  |  |  
 | 
      
         | 899 |  |  | This optional pass attempts to find instructions that can go into the
 | 
      
         | 900 |  |  | delay slots of other instructions, usually jumps and calls.  The code
 | 
      
         | 901 |  |  | for this pass is located in @file{reorg.c}.
 | 
      
         | 902 |  |  |  
 | 
      
         | 903 |  |  | @item Branch shortening
 | 
      
         | 904 |  |  |  
 | 
      
         | 905 |  |  | On many RISC machines, branch instructions have a limited range.
 | 
      
         | 906 |  |  | Thus, longer sequences of instructions must be used for long branches.
 | 
      
         | 907 |  |  | In this pass, the compiler figures out what how far each instruction
 | 
      
         | 908 |  |  | will be from each other instruction, and therefore whether the usual
 | 
      
         | 909 |  |  | instructions, or the longer sequences, must be used for each branch.
 | 
      
         | 910 |  |  | The code for this pass is located in @file{final.c}.
 | 
      
         | 911 |  |  |  
 | 
      
         | 912 |  |  | @item Register-to-stack conversion
 | 
      
         | 913 |  |  |  
 | 
      
         | 914 |  |  | Conversion from usage of some hard registers to usage of a register
 | 
      
         | 915 |  |  | stack may be done at this point.  Currently, this is supported only
 | 
      
         | 916 |  |  | for the floating-point registers of the Intel 80387 coprocessor.  The
 | 
      
         | 917 |  |  | code for this pass is located in @file{reg-stack.c}.
 | 
      
         | 918 |  |  |  
 | 
      
         | 919 |  |  | @item Final
 | 
      
         | 920 |  |  |  
 | 
      
         | 921 |  |  | This pass outputs the assembler code for the function.  The source files
 | 
      
         | 922 |  |  | are @file{final.c} plus @file{insn-output.c}; the latter is generated
 | 
      
         | 923 |  |  | automatically from the machine description by the tool @file{genoutput}.
 | 
      
         | 924 |  |  | The header file @file{conditions.h} is used for communication between
 | 
      
         | 925 |  |  | these files.  If mudflap is enabled, the queue of deferred declarations
 | 
      
         | 926 |  |  | and any addressed constants (e.g., string literals) is processed by
 | 
      
         | 927 |  |  | @code{mudflap_finish_file} into a synthetic constructor function
 | 
      
         | 928 |  |  | containing calls into the mudflap runtime.
 | 
      
         | 929 |  |  |  
 | 
      
         | 930 |  |  | @item Debugging information output
 | 
      
         | 931 |  |  |  
 | 
      
         | 932 |  |  | This is run after final because it must output the stack slot offsets
 | 
      
         | 933 |  |  | for pseudo registers that did not get hard registers.  Source files
 | 
      
         | 934 |  |  | are @file{dbxout.c} for DBX symbol table format, @file{sdbout.c} for
 | 
      
         | 935 |  |  | SDB symbol table format, @file{dwarfout.c} for DWARF symbol table
 | 
      
         | 936 |  |  | format, files @file{dwarf2out.c} and @file{dwarf2asm.c} for DWARF2
 | 
      
         | 937 |  |  | symbol table format, and @file{vmsdbgout.c} for VMS debug symbol table
 | 
      
         | 938 |  |  | format.
 | 
      
         | 939 |  |  |  
 | 
      
         | 940 |  |  | @end itemize
 |