AlphaDev | |
AlphaDev | |
Genre: | Reinforcement learning |
AlphaDev is an artificial intelligence system developed by Google DeepMind to discover enhanced computer science algorithms using reinforcement learning. AlphaDev is based on AlphaZero, a system that mastered the games of chess, shogi and go by self-play. AlphaDev applies the same approach to finding faster algorithms for fundamental tasks such as sorting and hashing.[1] [2] [3]
On June 7, 2023, Google DeepMind published a paper in Nature introducing AlphaDev, which discovered new algorithms that outperformed the state-of-the-art methods for small sort algorithms.[1] For example, AlphaDev found a faster assembly language sequence for sorting 5-element sequences. Upon analysing the algorithms in-depth, AlphaDev discovered two unique sequences of assembly instructions called the AlphaDev swap and copy moves that avoid a single assembly instruction each time they are applied.[1] [3] For variable sort algorithms, AlphaDev discovered fundamentally different algorithm structures. For example, for VarSort4 (sort up to 4 elements) AlphaDev discovered an algorithm 29 assembly instructions shorter than the human benchmark.[1] AlphaDev also improved on the speed of hashing algorithms by up to 30% in certain cases.[2]
In January 2022, Google DeepMind submitted its new sorting algorithms to the organization that manages C++, one of the most popular programming languages in the world, and after independent vetting, AlphaDev's algorithms were added to the library.[4] This was the first change to the C++ Standard Library sorting algorithms in more than a decade and the first update to involve an algorithm discovered using AI.[4] In January 2023, DeepMind also added its hashing algorithm for inputs from 9 to 16 bytes to Abseil, an open-source collection of prewritten C++ algorithms that can be used by anyone coding with C++.[5] [4] Google estimates that these two algorithms are used trillions of times every day.[6]
AlphaDev is built on top of AlphaZero, the reinforcement-learning model that DeepMind trained to master games such as Go and chess.[4] The company's breakthrough was to treat the problem of finding a faster algorithm as a game and then train its AI to win it.[2] AlphaDev plays a single-player game where the objective is to iteratively build an algorithm in the assembly language that is both fast and correct.[1] AlphaDev uses a neural network to guide its search for optimal moves, and learns from its own experience and synthetic demonstrations.[1]
AlphaDev showcases the potential of AI to advance the foundations of computing and optimize code for different criteria. Google DeepMind hopes that AlphaDev will inspire further research on using AI to discover new algorithms and improve existing ones.[2]
The primary learning algorithm in AlphaDev is an extension of AlphaZero.
In order to use AlphaZero on assembly programming, the authors created a Transformer-based vector representation of assembly programs designed to capture their underlying structure. This finite representation allows a neural network to play assembly programming like a game with finitely many possible moves (like Go),
The representation uses the following components:
The game state is the assembly program generated up to a given point.
The game move is an extra instruction appended to the current assembly program.
The game's reward is a function of the assembly program's correctness and latency. To reduce cost, AlphaDev only computes actual measured latency on less than 0.002% of generated programs, as it does not evaluate latency during the search process. Instead, it uses two functions that estimate the correctness and latency by being trained via supervised learning using the real measured correctness and latency values.
AlphaDev developed hashing algorithms for inputs from 9 to 16 bytes to Abseil, an open-source collection of prewritten C++ algorithms.[7]
AlphaDev discovered new sorting algorithms, which led to up to 70% improvements in the LLVM libc++ sorting library for shorter sequences and about 1.7% improvements for sequences exceeding 250,000 elements. These improvements apply to the uint32, uint64 and float data types for ARMv8, Intel Skylake and AMD Zen 2 CPU architectures. AlphaDev's branchless conditional assembly and new swap move contributed to these performance improvements. The discovered algorithms were reverse-engineered from low-level assembly to C++, and have officially been included in the libc++ standard sorting library.
AlphaDev learned an optimized VarInt deserialization function in protobuf,[8] outperforming the human benchmark for single valued inputs by approximately three times in terms of speed. AlphaDev also discovered a new VarInt assignment move, combining two operations into a single instruction for latency savings.
The AlphaDev's performance was compared to stochastic superoptimization,[9] a logical AI approach. The latter was run with at least the same amount of resources and wall-clock time as AlphaDev. The results showed that AlphaDev-S requires a prohibitive amount of time to optimize directly for latency, as latency needs to be computed after every mutation. As such, AlphaDev-S optimizes for a latency proxy, specifically algorithm length, and, then, at the end of training, all correct programs generated by AlphaDev-S are searched through.