Live-variable analysis explained

In compilers, live variable analysis (or simply liveness analysis) is a classic data-flow analysis to calculate the variables that are live at each point in the program. A variable is live at some point if it holds a value that may be needed in the future, or equivalently if its value may be read before the next time the variable is written to.

Example

Consider the following program:

b = 3 c = 5 a = f(b * c)

The set of live variables between lines 2 and 3 is because both are used in the multiplication on line 3. But the set of live variables after line 1 is only, since variable c is updated later, on line 2. The value of variable a is not used in this code.

Note that the assignment to a may be eliminated as a is not used later, but there is insufficient information to justify removing all of line 3 as f may have side effects (printing b * c, perhaps).

Expression in terms of dataflow equations

Liveness analysis is a "backwards may" analysis. The analysis is done in a backwards order, and the dataflow confluence operator is set union. In other words, if applying liveness analysis to a function with a particular number of logical branches within it, the analysis is performed starting from the end of the function working towards the beginning (hence "backwards"), and a variable is considered live if any of the branches moving forward within the function might potentially (hence "may") need the variable's current value. This is in contrast to a "backwards must" analysis which would instead enforce this condition on all branches moving forward.

The dataflow equations used for a given basic block s and exiting block f in live variable analysis are the following:

{GEN

}[s] : The set of variables that are used in s before any assignment in the same basic block.

{KILL

}[s] : The set of variables that are assigned a value in s (in many books that discuss compiler design, KILL (s) is also defined as the set of variables assigned a value in s before any use, but this does not change the solution of the dataflow equation):

{LIVE

}_[s] = [s] \cup (_[s] - [s])

{LIVE

}_[final] =

{LIVE

}_[s] = \bigcup_ _[p]

{GEN

}[d : y \leftarrow f(x_1,\cdots,x_n)] = \

{KILL

}[d : y \leftarrow f(x_1,\cdots,x_n)] = \

The in-state of a block is the set of variables that are live at the start of the block. Its out-state is the set of variables that are live at the end of it. The out-state is the union of the in-states of the block's successors. The transfer function of a statement is applied by making the variables that are written dead, then making the variables that are read live.

Second example

// in: ; predecessor blocks: none b1: a = 3; b = 5; d = 4; x = 100; //x is never being used later thus not in the out set if a > b then // out: //union of all (in) successors of b1 => b2:, and b3: // in: ; predecessor blocks: b1 b2: c = a + b; d = 2; // out: // in: ; predecessor blocks: b1 and b2 b3: endif c = 4; return b * d + c; // out:

The in-state of b3 only contains b and d, since c has been written. The out-state of b1 is the union of the in-states of b2 and b3. The definition of c in b2 can be removed, since c is not live immediately after the statement.

Solving the data flow equations starts with initializing all in-states and out-states to the empty set. The work list is initialized by inserting the exit point (b3) in the work list (typical for backward flow). Its computed in-state differs from the previous one, so its predecessors b1 and b2 are inserted and the process continues. The progress is summarized in the table below.

processingout-stateold in-state new in-statework list
b3(b1,b2)
b1(b2)
b2(b1)
b1

Note that b1 was entered in the list before b2, which forced processing b1 twice (b1 was re-entered as predecessor of b2). Inserting b2 before b1 would have allowed earlier completion.

Initializing with the empty set is an optimistic initialization: all variables start out as dead. Note that the out-states cannot shrink from one iteration to the next, although the out-state can be smaller than the in-state. This can be seen from the fact that after the first iteration the out-state can only change by a change of the in-state. Since the in-state starts as the empty set, it can only grow in further iterations.

References

Book: Aho. Alfred. Lam. Monica. Sethi. Ravi. Ullman. Jeffrey. 2007. 2. Compilers: Principles, Techniques, and Tools. 608.