Bootstrappable builds are a process of compiling software that does not depend on (compiler) binaries that were not themselves built from source by this process.[1] [2] [3]
This process can protect against compiler backdoors: if the build process doesn't depend on binary code that is difficult to audit, then a compiler backdoor cannot be hidden in compiler binaries anymore.
One way for a software distribution to tackle the issue is to reduce the size of the binaries used to bootstrap the distribution until they are no longer needed, or until they are small enough to be easily reviewed by humans.[4]
Many compilers for various programming languages are written in the language they target. For instance, the official Go compiler (gc) is written in Go.
Without alternative compilers such as GCC, which is written in other programming languages (here C and C++), building the Go compiler would require a binary of a previous version of the Go compiler.
To have bootstrappable builds, it is often possible to find an older version of the compiler that can be built from source and, starting from it, write code that automatically builds each subsequent version until a recent version is reached. Identifying which version can build which later versions is often not trivial, and the resulting bootstrap procedure often has very long compilation times. Sometimes this also requires maintaining older compiler versions and backporting support for newer CPU architectures to them, so that those architectures can be bootstrapped. GCC 4.7, for example, is the last version that can be compiled using tcc and that can then go on to compile newer versions of GCC.[5]
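The search for a chain of compiler versions described above can be viewed as a path search in a graph whose edges mean "this compiler can build that one". The following is a minimal sketch in Python under that assumption. The compiler names and the `CAN_BUILD` table are hypothetical placeholders, not a verified compatibility list for any real compiler; in practice each edge has to be established by actually attempting the build.

```python
from collections import deque

# Hypothetical "can build" relation: builder -> compilers it can build.
# These entries are illustrative placeholders, not real compatibility data.
CAN_BUILD = {
    "seed-c-compiler": ["lang-1.0"],
    "lang-1.0": ["lang-1.1", "lang-1.2"],
    "lang-1.1": ["lang-1.2"],
    "lang-1.2": ["lang-2.0"],
    "lang-2.0": ["lang-3.0"],
}


def bootstrap_chain(can_build, seed, target):
    """Return the shortest build chain from seed to target, or None if none exists."""
    previous = {seed: None}  # records which builder first reached each compiler
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        if current == target:
            # Walk back through the recorded builders to recover the chain.
            chain = []
            while current is not None:
                chain.append(current)
                current = previous[current]
            return chain[::-1]
        for nxt in can_build.get(current, []):
            if nxt not in previous:
                previous[nxt] = current
                queue.append(nxt)
    return None


chain = bootstrap_chain(CAN_BUILD, "seed-c-compiler", "lang-3.0")
print(" -> ".join(chain))
```

A breadth-first search is used here because it returns a chain with the fewest intermediate compilers, which is one plausible way to keep the bootstrap short; a real procedure might instead weigh edges by build time.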
This process can also be replaced or combined with other ways to bootstrap compilers.
For instance, it is also possible to write a new compiler for a language in a different language.
These techniques can be used to reduce the size of the binaries used to bootstrap a distribution.
As for building the first compiler that can build the subsequent compilers, it is possible to reduce the starting point to a single 357-byte binary[6] and, from there, proceed through multiple bootstrapping stages to build a C compiler, which can then build the other compilers and software.[7]
Software can depend on itself to be compiled, and its first version may have been compiled in a way that is not bootstrappable.
Gradle is one such case, as it depends on Scala, which had a proprietary dependency in its first release,[8] and on Kotlin, which depends on itself and on Gradle to be compiled.[9]
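Circular build dependencies of this kind can be found mechanically once the build-time dependencies are written down as a graph. The sketch below is illustrative only: the `BUILD_DEPS` table encodes just the relationships stated in the paragraph above, not the full dependency lists of these projects, and the function name `depends_on_itself` is a hypothetical helper.

```python
# Build-time dependencies, simplified from the paragraph above.
# This is not a complete dependency list for any of these projects.
BUILD_DEPS = {
    "gradle": ["scala", "kotlin"],
    "kotlin": ["kotlin", "gradle"],
    "scala": [],
}


def depends_on_itself(deps, package):
    """Return True if `package` is reachable from its own build dependencies."""
    seen = set()
    stack = list(deps.get(package, []))
    while stack:
        current = stack.pop()
        if current == package:
            return True  # found a path leading back to the package
        if current not in seen:
            seen.add(current)
            stack.extend(deps.get(current, []))
    return False


circular = sorted(p for p in BUILD_DEPS if depends_on_itself(BUILD_DEPS, p))
print(circular)
```

Every package reported this way needs either an older version that can be built without the cycle or an independent implementation before it can be bootstrapped from source.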
The Bootstrappable Builds project was started in 2016 as a spin-off of the Reproducible Builds project.[3]
In 2022, Guix gained the ability to be built from the aforementioned 357-byte binary.