Constant folding and constant propagation are related compiler optimizations used by many modern compilers.[1] An advanced form of constant propagation known as sparse conditional constant propagation can more accurately propagate constants and simultaneously remove dead code.
Constant folding is the process of recognizing and evaluating constant expressions at compile time rather than computing them at runtime. Terms in constant expressions are typically simple literals, such as the integer literal 2, but they may also be variables whose values are known at compile time. Consider the statement:
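As an illustration, a statement of this shape might read as follows in Python. The variable name i and the operands 320, 200, and 32 are assumptions, chosen so that the product is 2,048,000.

```python
# Three integer literals: the whole right-hand side is a constant expression.
i = 320 * 200 * 32

# What a folding compiler would effectively emit instead: a single store.
i_folded = 2048000
```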
Most compilers would not actually generate two multiply instructions and a store for this statement. Instead, they identify constructs such as these and substitute the computed values at compile time (in this case, 2,048,000).
Constant folding can make use of arithmetic identities. If x is numeric, the value of 0 * x is zero even if the compiler does not know the value of x. Note that this is not valid for IEEE floats, since x could be Infinity or NaN. Still, some environments that favor performance, such as GLSL shaders, allow this for constants, which can occasionally cause bugs.
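The floating-point caveat can be checked directly. The following Python sketch relies only on the IEEE 754 double semantics of Python floats and shows inputs for which rewriting 0 * x to 0 would change the result. The helper name times_zero is hypothetical.

```python
import math

def times_zero(x):
    # The expression a compiler might be tempted to fold to 0.
    return 0.0 * x

# For an ordinary finite value the identity holds.
finite_result = times_zero(42.0)

# For infinity and NaN the product is NaN, not zero.
inf_result = times_zero(math.inf)
nan_result = times_zero(math.nan)

# The sign of zero is also observable: 0.0 * -1.0 is negative zero.
neg_zero_sign = math.copysign(1.0, times_zero(-1.0))
```

A folder that applies the identity to floats therefore has to prove that x is finite, or the target has to permit the relaxed semantics.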
Constant folding may apply to more than just numbers. Concatenation of string literals and constant strings can be constant folded. Code such as "abc" + "def" may be replaced with "abcdef".
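A folding pass over a syntax tree can be sketched in a few lines. The example below uses Python's standard ast module (ast.unparse requires Python 3.9 or later). It is a minimal illustration rather than how any particular compiler implements folding. The names ConstantFolder and fold_source are hypothetical, and only +, - and * are handled.

```python
import ast
import operator

# Operators this sketch knows how to fold.
_FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

class ConstantFolder(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first (bottom-up)
        fold = _FOLDABLE.get(type(node.op))
        if (fold is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            try:
                value = fold(node.left.value, node.right.value)
            except Exception:
                return node  # e.g. "abc" - 1: leave the error for runtime
            return ast.copy_location(ast.Constant(value=value), node)
        return node

def fold_source(source):
    # Parse, fold, and turn the tree back into source text.
    tree = ConstantFolder().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))
```

Under these assumptions, fold_source('s = "abc" + "def"') returns the text s = 'abcdef', and fold_source("y = x * (2 + 3)") returns y = x * 5, leaving the unknown x untouched.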
In implementing a cross compiler, care must be taken to ensure that the behaviour of the arithmetic operations on the host architecture matches that on the target architecture, as otherwise enabling constant folding will change the behaviour of the program. This is of particular importance in the case of floating point operations, whose precise implementation may vary widely.
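One way such a mismatch can arise is sketched below in Python. It assumes a hypothetical host compiler that folds in double precision and a hypothetical target that evaluates in single precision. Single-precision rounding is emulated with the standard struct module, and the function names are illustrative only.

```python
import struct

def to_float32(x):
    # Round a Python float (a double) to the nearest single-precision value.
    return struct.unpack("f", struct.pack("f", x))[0]

def fold_on_host(a, b):
    # Hypothetical host compiler: folds the addition in double precision.
    return a + b

def run_on_target(a, b):
    # Hypothetical target: operands and result are kept in single precision.
    return to_float32(to_float32(a) + to_float32(b))

host = fold_on_host(16777216.0, 1.0)     # 2**24 + 1 is exact in a double
target = run_on_target(16777216.0, 1.0)  # not representable in single precision
```

Here host is 16777217.0 while target is 16777216.0, so a folder that simply used host arithmetic would change the program's result. Folding with the target's precision and rounding avoids the discrepancy.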
Constant propagation is the process of substituting the values of known constants in expressions at compile time. Such constants include those defined above, as well as intrinsic functions applied to constant values. Consider the following pseudocode:
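A small example of the kind of code meant here, written in Python with // for integer division. The function name, the variables x and y, and the literal values are assumptions made for illustration.

```python
def original():
    x = 14
    y = 7 - x // 2
    return y * (28 // x + 2)
```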
Propagating x yields:
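Assuming an illustrative starting program that sets x = 14, computes y = 7 - x // 2, and returns y * (28 // x + 2) (values chosen for this sketch, not taken from the source), substituting the constant for each use of x might give the following Python:

```python
def after_propagating_x():
    x = 14                      # still defined, but no longer read
    y = 7 - 14 // 2             # x replaced by 14
    return y * (28 // 14 + 2)   # x replaced by 14
```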
Continuing to propagate yields the following, which would likely be further optimized by dead-code elimination of both x and y:
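Assuming an illustrative program in which x = 14, y = 7 - x // 2, and the result is y * (28 // x + 2), folding 7 - 14 // 2 to 0, propagating y, and folding once more might leave the following Python:

```python
def fully_propagated():
    x = 14    # dead: never read
    y = 0     # dead: never read
    return 0  # 0 * (28 // 14 + 2) folded
```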
Constant propagation is implemented in compilers using the results of reaching definition analysis. If every reaching definition of a variable at a given point assigns the same constant to it, then the variable has a constant value at that point and can be replaced with the constant.
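The test on reaching definitions can be sketched as follows in Python. The data model is an assumption made for illustration: each reaching definition is a (label, value) pair, and the sentinel NOT_CONSTANT marks a definition whose right-hand side is not a known constant.

```python
NOT_CONSTANT = object()  # sentinel: right-hand side is not a known constant

def constant_from_reaching_defs(reaching_defs):
    """Return the shared constant, or NOT_CONSTANT if the use cannot be replaced."""
    if not reaching_defs:
        return NOT_CONSTANT
    first = reaching_defs[0][1]
    if first is NOT_CONSTANT:
        return NOT_CONSTANT
    for _label, value in reaching_defs[1:]:
        # Every definition must assign the same constant, of the same type.
        if value is NOT_CONSTANT or type(value) is not type(first) or value != first:
            return NOT_CONSTANT
    return first
```

For example, definitions [("d1", 5), ("d2", 5)] yield 5, whereas [("d1", 5), ("d2", 6)] yield NOT_CONSTANT because two different constants reach the use.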
Constant propagation can also cause conditional branches to simplify to one or more unconditional statements, when the conditional expression can be evaluated to true or false at compile time to determine the only possible outcome.
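A sketch of this effect in Python, using a hypothetical flag named debug: once the flag's constant value is propagated into the test, the condition is known to be false and the guarded statement can be dropped.

```python
def before(n):
    debug = False
    if debug:          # after propagation the test reads: if False
        n = n + 1
    return n * 2

def after(n):
    # The branch can never be taken, so only the unconditional statement remains.
    return n * 2
```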
Constant folding and propagation are typically used together to achieve many simplifications and reductions, and their interleaved, iterative application continues until those effects cease.
Consider this unoptimized pseudocode, whose return value is not known until it has been analyzed:
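A Python rendering of such a function is sketched below, with // for integer division. The function name and the specific literals are assumptions for illustration. The variables a, b, and c follow the names used in the surrounding text.

```python
def unoptimized():
    a = 30
    b = 9 - (a // 5)
    c = b * 4
    if c > 10:
        c = c - 10
    return c * (60 // a)
```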
Applying constant propagation once, followed by constant folding, yields:
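Assuming an illustrative starting point of a = 30, b = 9 - (a // 5), c = b * 4, a test c > 10 guarding c = c - 10, and a result of c * (60 // a), one round of propagating a and folding might give the following Python:

```python
def after_one_round():
    a = 30
    b = 3            # 9 - (30 // 5) folded
    c = b * 4        # b not yet propagated
    if c > 10:
        c = c - 10
    return c * 2     # 60 // 30 folded
```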
Repeating both steps twice produces:
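Assuming an illustrative function in which a = 30 and b has already been folded to 3, with c = b * 4, a test c > 10 guarding c = c - 10, and a result of c * 2, two further rounds might leave the following Python:

```python
def after_three_rounds():
    a = 30           # no remaining uses
    b = 3            # no remaining uses
    c = 12           # 3 * 4 folded
    if True:         # 12 > 10 folded
        c = 2        # 12 - 10 folded
    return c * 2
```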
Having replaced all uses of variables a and b with constants, the compiler's dead-code elimination applies to those variables, leaving:
int c = 12;
if (true) {
    c = 2;
}
return c * 2;
(Boolean constructs vary among languages and compilers, but their details—such as the status, origin, and representation of true—do not affect these optimization principles.)
Traditional constant propagation produces no further optimization; it does not restructure programs.
However, a similar optimization, sparse conditional constant propagation, goes further by selecting the appropriate conditional branch, and removing the always-true conditional test. Thus, variable c becomes redundant, and only an operation on a constant remains:
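With the always-taken branch selected, c holds 2 at the return, so c * 2 folds to 4. In Python, the remaining body would amount to the following (the function name is hypothetical):

```python
def fully_optimized():
    return 4  # 2 * 2 folded; no variables remain
```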
If that pseudocode constitutes a function body, the compiler knows the function evaluates to integer constant 4, allowing replacement of calls to the function with 4, and further increasing program efficiency.