In computer science, code motion, also known as code hoisting, code sinking, loop-invariant code motion, or code factoring, is a blanket term for any process that moves code within a program to improve performance or reduce size, and is a common optimization performed in most optimizing compilers. Because the terms surrounding it are used inconsistently, it can be difficult to distinguish one type of code motion from another.
Code motion has a variety of uses and benefits, many of which overlap in their implementation.
Code sinking, also known as lazy code motion, is a technique that reduces wasted instructions by moving instructions into the branches in which they are used.[1] If an operation is executed before a branch, and only one of the branch paths uses the result of that operation, then code sinking moves that operation into the branch where it will be used.
This technique is a form of dead code elimination in the sense that it removes code whose results are discarded or unused. In contrast to dead code elimination, however, it can remove pointless instructions even when some execution path may still use the instruction’s results.
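The effect of code sinking can be illustrated at the source level. The following is a minimal sketch in C, not the output of any particular compiler; the function names and the helper `expensive` are hypothetical and stand in for any side-effect-free computation.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a costly computation with no side effects. */
static int expensive(int x) {
    return x * x + 7;
}

/* Before sinking: t is computed on every call, even when it is unused. */
int before_sinking(int x, bool use_result) {
    int t = expensive(x);
    if (use_result) {
        return t + 1;
    }
    return 0;
}

/* After sinking: the computation runs only on the path that needs it. */
int after_sinking(int x, bool use_result) {
    if (use_result) {
        int t = expensive(x);
        return t + 1;
    }
    return 0;
}
```

Both functions return the same value for every input. The computation of `t` is not dead, since one path uses it, which is why dead code elimination alone would leave the first version unchanged.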
Code factoring is a size-optimization technique that merges common dependencies of several branches into the branch above them.[2] Just as integer factorization decomposes a number into its prime factors, code factoring reduces the code toward its smallest form by merging common "factors" until no duplicates remain.
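As a rough source-level sketch of factoring (the function names are hypothetical, and a real compiler would apply the transformation to its intermediate representation rather than to C source), a computation duplicated at the start of both branches can be merged into the code above the branch:

```c
/* Before factoring: both branches begin with the same computation of s. */
int before_factoring(int a, int b, int flag) {
    int r;
    if (flag) {
        int s = a * b + 1; /* duplicated */
        r = s + a;
    } else {
        int s = a * b + 1; /* duplicated */
        r = s - b;
    }
    return r;
}

/* After factoring: the common computation appears once, above the branch. */
int after_factoring(int a, int b, int flag) {
    int s = a * b + 1;
    return flag ? s + a : s - b;
}
```

The factored version does the same work at run time on either path; the benefit is that the shared computation is emitted only once in the compiled program.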
See main article: Instruction scheduling. Global code motion, local code motion, code scheduling, instruction scheduling, and code hoisting/sinking are all terms for a technique in which instructions are rearranged (or "scheduled") to improve the efficiency of execution within the CPU.[3] [4] Modern CPUs are able to schedule five or more instructions per clock cycle. However, a CPU cannot schedule an instruction that relies on data from a currently executing (or not yet executed) instruction. Compilers will interleave dependency chains in a manner that maximizes the number of instructions a CPU can process at any point in time.[5]
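The idea of interleaving can be sketched with C statements standing in for machine instructions. This is only an illustration under that assumption: real schedulers reorder machine instructions, a C compiler is free to reorder either version, and the function names are hypothetical.

```c
/* Unscheduled order: each statement depends on the one directly before it,
 * so adjacent operations cannot be issued together. */
int unscheduled(int a, int b) {
    int a1 = a * 3;
    int a2 = a1 + 5;
    int a3 = a2 * a2;
    int b1 = b * 7;
    int b2 = b1 + 2;
    int b3 = b2 * b2;
    return a3 + b3;
}

/* Scheduled order: the two independent chains are interleaved, so each
 * statement is independent of its immediate neighbor. */
int scheduled(int a, int b) {
    int a1 = a * 3;
    int b1 = b * 7;
    int a2 = a1 + 5;
    int b2 = b1 + 2;
    int a3 = a2 * a2;
    int b3 = b2 * b2;
    return a3 + b3;
}
```

The two functions compute the same result; only the order of the independent operations differs.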
On the defunct Intel Itanium architecture, the branch predict (BRP) instruction is explicitly hoisted above branches by the compiler to enable the branch to be immediately taken by the CPU. Itanium relies on code scheduling by the compiler, rather than by the CPU, to maximize efficiency in the processor.[6]
See main article: Loop-invariant code motion. Loop-invariant code motion is the process of moving loop-invariant code to a position outside the loop, which may reduce the execution time of the loop by preventing the same computation from being repeated on every iteration for the same result.
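A minimal sketch of loop-invariant code motion in C follows. The function names are hypothetical, and the example assumes the hoisted expression has no side effects, so evaluating it once before the loop is safe even when the loop body never runs.

```c
#include <stddef.h>

/* Before: k does not depend on i, yet it is recomputed on every iteration. */
void scale_before(int *out, const int *in, size_t n, int a, int b) {
    for (size_t i = 0; i < n; i++) {
        int k = a * b + 1; /* loop-invariant */
        out[i] = in[i] * k;
    }
}

/* After: the invariant computation is hoisted out and performed once. */
void scale_after(int *out, const int *in, size_t n, int a, int b) {
    int k = a * b + 1;
    for (size_t i = 0; i < n; i++) {
        out[i] = in[i] * k;
    }
}
```

For a loop of `n` iterations, the first version evaluates `a * b + 1` `n` times and the second evaluates it once, while both write identical values to `out`.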
LLVM has a sinking pass that operates on its static single-assignment form. LLVM 15.0 will not sink an operation if any of its code paths includes a store instruction, or if it may throw an error.[7] Additionally, LLVM will not sink an instruction into a loop.
The GNU Compiler Collection implements code motion under the name "code factoring", with the purpose of reducing the size of a compiled program.[8] GCC will move any code upwards or downwards if it "[does not] invalidate any existing dependences nor introduce new ones".[9]
LuaJIT uses code sinking under the name "allocation sinking", to reduce the amount of time compiled code spends allocating and collecting temporary objects within a loop.[10] Allocation sinking moves allocations to execution paths where the allocated object may escape the executing code, and will thus require heap allocation. All removed allocations are filled in with store-to-load forwarding over their fields.[11]
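The general shape of allocation sinking can be sketched in C, with `malloc` and `free` standing in for garbage-collected allocation. This is an assumption made for illustration only: LuaJIT applies the optimization to its own intermediate representation of Lua code, and the type and function names below are hypothetical.

```c
#include <stdlib.h>

typedef struct {
    double x, y;
} Point;

/* Before: a temporary Point is heap-allocated on every iteration, although
 * it only escapes (is returned to the caller) on one path. */
Point *first_over_before(int n, double limit) {
    for (int i = 0; i < n; i++) {
        Point *p = malloc(sizeof *p);
        if (p == NULL) abort();
        p->x = i;
        p->y = 2.0 * i;
        if (p->x + p->y > limit) {
            return p; /* the object escapes here */
        }
        free(p);
    }
    return NULL;
}

/* After: the fields live in locals, so reads of p->x and p->y are replaced
 * by the values that were stored; the allocation is sunk to the escape path. */
Point *first_over_after(int n, double limit) {
    for (int i = 0; i < n; i++) {
        double x = i;
        double y = 2.0 * i;
        if (x + y > limit) {
            Point *p = malloc(sizeof *p);
            if (p == NULL) abort();
            p->x = x;
            p->y = y;
            return p;
        }
    }
    return NULL;
}
```

In the second version, iterations in which the object does not escape perform no allocation at all; at most one allocation happens per call, on the path that returns the object.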