Ershov Number Explained

Ershov numbers, named after Andrey Petrovich Yershov, are used in code optimization to minimize the number of registers needed to evaluate an expression. They can be used to select registers optimally when a code block contains only a single expression. Given an expression E = E₁ op E₂, the goal is to generate code that either minimizes the number of registers used or, if too few registers are available, minimizes the number of nonregister temporaries required.

Definition

The Ershov number n of a node in a given expression tree is defined as follows:^[1] ^[2]

Every leaf has n = 1.
For a node with one child, n is the same as its child's.
For a node with two children, where child_1 and child_2 denote the Ershov numbers of the children, n is defined as:

n=\begin{cases} \max(child_1, child_2) & child_1 \ne child_2 \\ child_1 + 1 & child_1 = child_2 \end{cases}

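The Ershov number of a node is the minimum number of registers required to evaluate the subexpression rooted at that node without storing intermediate results in memory. The idea is to evaluate the child with the larger Ershov number first, then the other child, and then perform the operation at the root. While the second child is being evaluated, one register holds the result of the first, which is why two children with equal Ershov numbers need one register more than either child alone.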
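The definition above can be sketched as a short recursive function. This is a minimal illustration in Python, not a reference implementation: the `Node` class, the `ershov` function name, and the sample expression (a - b) + e * (c + d) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical expression-tree node: a leaf has no children."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def ershov(node: Node) -> int:
    """Return the Ershov number of the subtree rooted at node."""
    children = [c for c in (node.left, node.right) if c is not None]
    if not children:
        return 1                      # every leaf has n = 1
    if len(children) == 1:
        return ershov(children[0])    # one child: same number as the child
    a, b = ershov(children[0]), ershov(children[1])
    return max(a, b) if a != b else a + 1


# (a - b) + e * (c + d)
tree = Node("+",
            Node("-", Node("a"), Node("b")),
            Node("*", Node("e"), Node("+", Node("c"), Node("d"))))
print(ershov(tree))  # 3
```

Here both children of the root have Ershov number 2, so the root gets 2 + 1 = 3.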

Example

Suppose we have an expression tree with a '+' operation at the root, and the left and right subtrees have Ershov numbers of 3 and 4, respectively. Since the two numbers differ, the Ershov number of this node is max(3, 4) = 4, so we should be able to generate code for the expression using 4 registers, as follows:

Generate code to evaluate the right child using registers r1, r2, r3, and r4. Place the result in r1.
Generate code to evaluate the left child using registers r2, r3, and r4. Place the result in r2.
Issue the instruction ADD r1, r1, r2, which adds r1 and r2 and stores the result in r1.
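The three steps above can be sketched in Python for the case where enough registers are available. Everything named here is an assumption for illustration: the `Node` class, the `gen` and `full` helpers, the `LD` mnemonic for loading a leaf, and the leaf names x0, x1, and so on. The sketch emits operands in left, right order so that it would also work for non-commutative operations, which is why the final instruction reads `ADD r1, r2, r1` rather than `ADD r1, r1, r2`; for addition the two are equivalent.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary expression-tree node; a leaf has no children."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def ershov(node):
    if node.left is None:
        return 1
    l, r = ershov(node.left), ershov(node.right)
    return max(l, r) if l != r else l + 1


def gen(node, regs, out):
    """Append code to out that leaves node's value in regs[0].

    Assumes len(regs) >= ershov(node), so no memory temporaries are needed.
    """
    if node.left is None:
        out.append(f"LD {regs[0]}, {node.label}")
    elif ershov(node.right) > ershov(node.left):
        gen(node.right, regs, out)       # larger child first, result in regs[0]
        gen(node.left, regs[1:], out)    # other child, result in regs[1]
        out.append(f"{node.label} {regs[0]}, {regs[1]}, {regs[0]}")
    else:
        gen(node.left, regs, out)
        gen(node.right, regs[1:], out)
        out.append(f"{node.label} {regs[0]}, {regs[0]}, {regs[1]}")


names = count()


def full(depth):
    """Complete binary tree of ADD nodes; its Ershov number equals depth."""
    if depth == 1:
        return Node(f"x{next(names)}")
    return Node("ADD", full(depth - 1), full(depth - 1))


# Left subtree has Ershov number 3, right subtree has Ershov number 4.
root = Node("ADD", full(3), full(4))
code = []
gen(root, ["r1", "r2", "r3", "r4"], code)
print(ershov(root))  # 4
print(code[-1])      # ADD r1, r2, r1
```

The right subtree is evaluated first with all four registers, and the left subtree is then evaluated without touching r1.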

Code generation

The general procedure for generating code with a minimal number of loads from and stores to memory is as follows:

Generate code for the child with the larger Ershov number first
Keep the result in a register if one can be spared, or, if none is available, issue an instruction to store it in a temporary location in memory
Generate code for the child with the smaller Ershov number
If the first result was stored in memory, issue an instruction to load it back into a register
Issue an instruction to perform the operation at the root
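The procedure above can be sketched as follows. This is one possible reading of the steps, written in Python under stated assumptions: the `Node` class, the `gen` function, the `LD`/`ST` mnemonics, the `OPS` table, and the temporary names t0, t1, and so on are all hypothetical. The sketch handles leaves and two-child nodes only and expects at least two registers.

```python
from dataclasses import dataclass
from typing import Optional

OPS = {"+": "ADD", "-": "SUB", "*": "MUL"}  # assumed mnemonics


@dataclass
class Node:
    """Hypothetical binary expression-tree node; a leaf has no children."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def ershov(node):
    if node.left is None:
        return 1
    l, r = ershov(node.left), ershov(node.right)
    return max(l, r) if l != r else l + 1


def gen(node, regs, out, depth=0):
    """Append code to out that leaves node's value in regs[0].

    depth is the number of memory temporaries live on entry; the return
    value is the largest number of temporaries live at any one time.
    """
    if node.left is None:
        out.append(f"LD {regs[0]}, {node.label}")
        return depth
    if ershov(node.right) > ershov(node.left):
        first, second = node.right, node.left
    else:
        first, second = node.left, node.right
    used = gen(first, regs, out, depth)          # larger child first
    if ershov(second) < len(regs):
        # A register can be spared: keep the first result in regs[0].
        used = max(used, gen(second, regs[1:], out, depth))
        first_reg, second_reg = regs[0], regs[1]
    else:
        # No register to spare: store the first result in memory.
        out.append(f"ST t{depth}, {regs[0]}")
        used = max(used, gen(second, regs, out, depth + 1))
        out.append(f"LD {regs[1]}, t{depth}")
        first_reg, second_reg = regs[1], regs[0]
    if first is node.left:
        left_reg, right_reg = first_reg, second_reg
    else:
        left_reg, right_reg = second_reg, first_reg
    out.append(f"{OPS[node.label]} {regs[0]}, {left_reg}, {right_reg}")
    return used


# (a - b) + e * (c + d) has Ershov number 3; only two registers are given.
tree = Node("+",
            Node("-", Node("a"), Node("b")),
            Node("*", Node("e"), Node("+", Node("c"), Node("d"))))
code = []
temps = gen(tree, ["r1", "r2"], code)
print("\n".join(code))
print(temps)  # 1
```

With two registers the sketch stores the value of a - b in t0 before evaluating e * (c + d) and loads it back afterwards; with three registers the same call emits no `ST` instruction at all.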

In the ideal case, if there are n registers and the first subexpression requires n registers and the next subexpression requires n - 1 registers, a single register can be used to store the result of the first expression, and there will still be n - 1 registers available to compute the next subexpression, therefore requiring no loads or stores from memory at all.^[3]

If the Ershov number of the root of the expression tree is greater than the number of registers available, the Ershov number can also be used to determine the amount of additional temporary memory space required, for example on the stack.

Notes and References

1. Optimal Code Generation (for Expressions) and Data-Flow Analysis. Carleton University. 2022-05-30.
2. Code Generation, Chapter 8. Western Michigan University. 2022-05-30.
3. Notes on Code generation. Department of Computer Science, University of Calgary. September 14, 2007. 2022-05-30.
