The Parallel Element Processing Ensemble (PEPE) was one of the very early parallel computing systems. Bell Labs began researching the concept in the mid-1960s as a way to provide high-performance computing support for anti-ballistic missile (ABM) systems. The goal was to build a computer system that could simultaneously track hundreds of incoming ballistic missile warheads.[1][2][3] A single PEPE system was built by Burroughs Corporation in the 1970s, by which time the US Army's ABM efforts were winding down. The design later evolved into the Burroughs Scientific Processor for commercial sales, but a lack of sales prospects led to it being withdrawn from the market.
PEPE came about as a result of predictions of the sorts of ICBM forces that would be expected in the event of an all-out Soviet attack during the 1970s. Missile fleets of both the US and USSR were growing through the 1960s, but a bigger issue was the rapid increase in the number of warheads as a result of the move to multiple independently targetable reentry vehicles (MIRV). Computers designed for the Nike-X system were largely similar to systems like the IBM 7030, and would have been able to handle attacks of perhaps a dozen warheads arriving simultaneously. With MIRV, hundreds of targets, both warheads and decoys, would arrive at the same time, and the CPUs being used simply did not have the performance needed to analyze their trajectories quickly enough to leave time to attack them.[1]
Bell Labs, which had been the primary industry partner in previous ABM systems, proposed development of a new system able to track 200 to 300 missiles at a time. The program officially started in 1969. Development was led by System Development Corporation (SDC), which had formed in 1955 to develop the software for the SAGE air defense computer system. PEPE was designed by a team led by George Mueller, formerly of NASA. He described the ultimate goal as producing 300 million instructions per second, far beyond the performance of contemporary systems.[4]
An initial testbed system, the "IC model", was built with 16 processors consisting of individual integrated circuits and connected to an IBM 360/65 host. It was completed in 1971 and proved successful, and between October 1971 and September 1972, SDC and Honeywell produced a final design. In November, Burroughs won the contract to build a 36-processor prototype of the full-sized 288-processor version. Burroughs delivered PEPE to the Ballistic Missile Defense Advanced Technology Center (part of the US Army's Strategic Defense Command) in Huntsville, Alabama in 1976.[2] Testing was apparently successful, but Bell Labs concluded that the machine was too expensive for the sorts of threats being addressed by the Safeguard Program that was being deployed in the 1970s.[1]
The system was eventually sent to McDonnell Douglas in Huntington Beach, CA. After it was retired, it was sent to Auburn University, which scrapped the system some time in the late 1980s or early 1990s.[1]
The PEPE system was based on a number of interconnected chassis. Each of the main Processing Element Bays could hold 36 Processing Elements (PEs), arranged in four rows of nine PEs. A separate, similar chassis held the Control Unit (CU) and a simple system console that displayed the system's status. The CU could control up to eight Bays, for a total of 288 PEs.[1]
The PE consisted of three main functional units: a floating-point processor (the Arithmetic Unit, AU) that could perform basic arithmetic including square roots, and separate input (Correlation Unit, CU) and output (Associative Output Unit, AOU) address generators. These determined the associative address of the next data element to be read and the address of the output, such that the results were ordered. The data was stored in a content-addressable memory (associative addressing),[5] and each unit had 2 k of 32-bit words (8 kB). A failed PE could have its duties switched in real time to any other PE, giving the system significant redundancy.[6]
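The memory figures above can be checked with a short calculation. This is only a sizing sketch: it assumes that "2 k" means 2,048 words, that the 8 kB figure applies to each PE, and that a full system holds the 288 PEs quoted for eight bays. The variable names are illustrative.

```python
# Sizing sketch under stated assumptions: "2 k" is taken as 2 * 1024 words,
# memory is counted per PE, and a full system has 288 PEs.
WORDS_PER_PE = 2 * 1024
BITS_PER_WORD = 32
FULL_SYSTEM_PES = 288

bytes_per_pe = WORDS_PER_PE * BITS_PER_WORD // 8
total_bytes = bytes_per_pe * FULL_SYSTEM_PES

print(f"per PE: {bytes_per_pe} bytes ({bytes_per_pe // 1024} KiB)")
print(f"full system: {total_bytes} bytes ({total_bytes / 1024**2:.2f} MiB)")
```

Under these assumptions the per-PE figure matches the 8 kB quoted in the text, and a fully populated system would hold a little over 2 MiB of data memory in total.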
Associative addressing was used in PEPE to allow it to quickly correlate new measurements with existing information. For instance, a particular radar may sweep a section of the sky every 2 seconds. On one such sweep it might see an object in a certain location, and the system has to quickly decide whether this is a new blip or an update of an existing one. The memory system was designed to produce a sort of hash code from this information, which was then used to retrieve the data directly, as opposed to searching through memory for possible matches based on the fields in the data.[6]
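The difference between content-based retrieval and a field-by-field search can be illustrated in software. The following is a minimal sketch rather than a model of PEPE's hardware: the cell size, the `content_key` function, the `TrackStore` class, and the record fields are all hypothetical, and a Python dictionary stands in for the associative memory.

```python
# Hypothetical sketch: a key derived from the measurement itself selects the
# stored record, so no scan over all tracks is needed. The cell size and
# field names are illustrative assumptions, not PEPE's actual design.

CELL_KM = 5.0  # assumed quantization step for positions, in kilometres


def content_key(x_km, y_km, z_km):
    """Derive a lookup key from the quantized position of a radar return."""
    return (int(x_km // CELL_KM), int(y_km // CELL_KM), int(z_km // CELL_KM))


class TrackStore:
    def __init__(self):
        self._tracks = {}  # content key -> track record

    def correlate(self, x_km, y_km, z_km, time_s):
        """Update the matching track if one exists, otherwise start a new one.

        Returns "update" or "new".
        """
        key = content_key(x_km, y_km, z_km)
        track = self._tracks.get(key)
        if track is None:
            self._tracks[key] = {
                "position": (x_km, y_km, z_km),
                "last_seen": time_s,
                "hits": 1,
            }
            return "new"
        track["position"] = (x_km, y_km, z_km)
        track["last_seen"] = time_s
        track["hits"] += 1
        return "update"


store = TrackStore()
print(store.correlate(102.0, 40.0, 252.0, time_s=0.0))  # first sighting
print(store.correlate(103.0, 41.0, 251.0, time_s=2.0))  # next sweep, same cell
print(store.correlate(300.0, 10.0, 180.0, time_s=2.0))  # a different object
```

The sketch ignores target motion between sweeps. A real tracker would presumably key on a predicted position rather than the last observed one, which is beyond what this example attempts.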
Each processing element contained a minimum of control logic, the bulk of the control being concentrated in the common control unit. The control unit read instructions from memory, decoded them, and issued them to all processing elements simultaneously, so that every element executed the same instruction at the same time. The elements were capable of executing a complete single-address instruction, including reading and writing the data.[1] The program as a whole was stored on and fed into PEPE from a front-end system, originally a CDC 7600.[6]
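This division of labour, in which one control unit decodes each instruction and every element applies it to its own data, can be sketched in a few lines. The example below is illustrative only: the opcode names, the single-accumulator PE model, and the memory layout are assumptions made for the sketch and do not reflect PEPE's actual instruction set.

```python
# Hypothetical sketch of lock-step control: one control unit, many simple PEs.
# Opcodes and the accumulator model are illustrative assumptions.


class ProcessingElement:
    def __init__(self, memory):
        self.memory = list(memory)  # local data, private to this PE
        self.acc = 0.0              # single accumulator register

    def execute(self, opcode, address):
        """Carry out one single-address instruction against local memory."""
        if opcode == "LOAD":
            self.acc = self.memory[address]
        elif opcode == "ADD":
            self.acc += self.memory[address]
        elif opcode == "MUL":
            self.acc *= self.memory[address]
        elif opcode == "STORE":
            self.memory[address] = self.acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")


class ControlUnit:
    def __init__(self, elements):
        self.elements = elements

    def run(self, program):
        """Step through the program, broadcasting each instruction to every PE."""
        for opcode, address in program:
            # Conceptually simultaneous; the loop only simulates the broadcast.
            for pe in self.elements:
                pe.execute(opcode, address)


# Each PE holds [position, velocity, time step, result slot] for one object.
pes = [
    ProcessingElement([100.0, -2.0, 0.5, 0.0]),
    ProcessingElement([250.0, -3.0, 0.5, 0.0]),
]

# Compute position + velocity * time step and write it to address 3.
program = [("LOAD", 1), ("MUL", 2), ("ADD", 0), ("STORE", 3)]
ControlUnit(pes).run(program)

for index, pe in enumerate(pes):
    print(f"PE {index}: extrapolated position = {pe.memory[3]}")
```

The same four instructions update every object at once, so the running time of the program does not depend on how many PEs are populated.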
The system as a whole operated in a lock-step fashion, with each PE able to perform one floating-point instruction per cycle. The system normally ran at 1 MHz, so each PE performed about 1 MFLOPS, and the system as a whole around 288 MFLOPS. The integer instructions were about 10 times faster, with the system as a whole running about 2,880 MIPS. This was much faster than any machine of the era.[6]
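The throughput figures follow from the clock rate and the PE count, as the short calculation below shows. It assumes one floating-point instruction per PE per cycle and a fully populated system of eight bays; the per-PE integer rate is simply what the quoted system-wide MIPS figure implies, not an independently sourced number.

```python
# Throughput sketch under stated assumptions: a 1 MHz clock, one
# floating-point instruction per PE per cycle, and eight bays of 36 PEs.
CLOCK_HZ = 1_000_000
PES_PER_BAY = 36
MAX_BAYS = 8
QUOTED_SYSTEM_MIPS = 2880  # system-wide integer figure quoted in the text

total_pes = PES_PER_BAY * MAX_BAYS
mflops_per_pe = CLOCK_HZ / 1e6
system_mflops = total_pes * mflops_per_pe
single_bay_mflops = PES_PER_BAY * mflops_per_pe  # derived, e.g. a 36-PE bay

# Integer rate per PE implied by the quoted system-wide figure.
implied_mips_per_pe = QUOTED_SYSTEM_MIPS / total_pes

print(f"PEs in a full system: {total_pes}")
print(f"floating point: {system_mflops:.0f} MFLOPS (one bay: {single_bay_mflops:.0f})")
print(f"implied integer rate per PE: {implied_mips_per_pe:.0f} MIPS")
```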
A Burroughs B1700 computer system was used as a test and diagnostic computer. A custom software package called TRANSET, which ran on the B1700, was used to debug and maintain PEPE's processing elements.[1]