Berknet Explained

The Berkeley Network, or Berknet, was an early local area network, developed at the University of California, Berkeley in 1978, primarily by Eric Schmidt as part of his master's thesis work.[1] The network continuously connected about a dozen computers running BSD[2] and provided email, file transfer, printing and remote command execution services to its users. It also connected to the two other major networks in use at the time, the ARPANET and UUCPNET.[3]

The network operated using what were then high-speed serial links, 1200 bit/s in the initial system. Its software implementation shipped with the Berkeley Software Distribution from version 2.0 onwards. It consisted of a line discipline within the Unix kernel,[4] a set of daemons that managed queues of commands to be sent to other machines, and a set of user-level programs that enqueued the actual commands. The Berkeley Network introduced the .netrc file.

The release of UUCP as part of Version 7 Unix in 1979 led to little external interest in the system; Mary Ann Horton noted in 1984 that "Berknets are gone now".[5] Support for Berknet's custom email addressing scheme remained in the Sendmail program until 1993.

History

The development of what became Berknet began with an initial concept from Bob Fabry, Eric Schmidt's advisor. The system was initially designed to connect only two machines, known as A and Q, both PDP-11s running Unix. Schmidt developed the system until leaving for a break at the end of term in May; by 1 May the initial system connecting A and Q was operational. As development took place mostly on the A machine, the network was also used to distribute the code to Q, and the menial task of moving the code by hand led to early efforts to automate the process.

As the code became functional it began to see use by other users, which led to A being connected to a new machine, C. This presented a new problem: the original implementation required the same user login on both machines, and trying to keep accounts in sync between several machines caused A's password file to grow too large. The problem was initially handled by having dedicated "free" accounts on both machines that were used by Berknet, along with a .netrc file used to log into them automatically when needed.

A third machine, Cory, was then added. To deal with the account issue, and to handle physical routing, A became a hub responsible for storing and forwarding any messages between C and Cory, controlled by a table-based routing map. The system was in this state when Schmidt completed the first version in May. While he was away, the Computing Center made several changes to the system, which led to several incompatible versions of Berknet emerging and to periodic breakage, including the sendmail system not working.

By the time Schmidt returned to the system in October, a VAX 11/780 had been added, bringing the total to three different versions of Unix being supported. make was then used to build and distribute the code to the machines, including using the older version of the code to bootstrap the Cory machine, which ran Version 6 Unix while the others ran Version 7. Further changes produced a stable version, documentation was begun, and several additional sites were added. Performance became a problem, especially for the machines tasked with forwarding, and keeping the system working required maintenance by hand.

Description

The system bore some resemblance to UUCP in that it used a batch-mode file transfer system based on a Unix daemon that performed the actual transfers, along with a suitably simple network protocol to move the data over its serial links. The protocol is built into the daemon, which watches for new files to appear in a series of defined locations representing queues. When a file appears, the daemon starts a terminal connection to the selected remote machine, issues commands and performs file transfers, and then disconnects and deletes the local file if the transfer succeeded.
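How the daemon and its queues interact can be sketched in a few lines of C. This is a minimal illustration only, not the original Berknet source: the spool directory path and the transfer routine are invented stand-ins, and the sketch simply polls for queued files and removes each one after a successful send.

/* Minimal sketch of a Berknet-style queue watcher (illustrative, not
 * the original implementation). It polls a spool directory and hands
 * each queued file to a transfer routine, deleting it on success. */
#include <dirent.h>
#include <stdio.h>
#include <unistd.h>

#define SPOOL_DIR "/usr/spool/berknet"   /* assumed queue location */

/* Stand-in for the real work: connect to the remote machine, send the file. */
static int transfer_file(const char *path)
{
    printf("sending %s to the remote machine\n", path);
    return 0;                            /* 0 = success */
}

int main(void)
{
    for (;;) {
        DIR *dp = opendir(SPOOL_DIR);
        if (dp != NULL) {
            struct dirent *ent;
            while ((ent = readdir(dp)) != NULL) {
                char path[1024];
                if (ent->d_name[0] == '.')
                    continue;            /* skip "." and ".." */
                snprintf(path, sizeof path, "%s/%s", SPOOL_DIR, ent->d_name);
                if (transfer_file(path) == 0)
                    unlink(path);        /* remove the queue entry on success */
            }
            closedir(dp);
        }
        sleep(60);                       /* poll again later */
    }
}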

From a user's perspective, the system is made up of a number of separate applications: ones for reading and writing mail, one for moving files between machines, and so on. These all work by placing files in the proper queues, which then automatically move the data to the target machine, where it is reassembled and moved back into user directories for use. The most-used application was netcp, which copied files over the network. It was supplied with two filenames, the first the path to the existing file and the second the path to the desired final location. Paths were a combination of a machine name followed by a colon and then a normal Unix slash-separated path. For instance:

netcp testfile.txt Cory:/usr/pascal/sh

would copy the file testfile.txt to the path /usr/pascal/sh on the Cory machine.
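The argument form can be illustrated with a short C sketch that splits an address on its first colon into a machine name and a path. This is illustrative code, not the original netcp source; treating a colon-free argument as a purely local path is an assumption.

/* Sketch of splitting a Berknet-style "machine:path" argument
 * (illustrative only; not the original netcp code). */
#include <stdio.h>
#include <string.h>

/* Splits "Cory:/usr/pascal/sh" into host "Cory" and path "/usr/pascal/sh".
 * An argument without a colon is treated as a local path (empty host). */
static void split_address(const char *arg, char *host, size_t hostsz,
                          char *path, size_t pathsz)
{
    const char *colon = strchr(arg, ':');
    if (colon == NULL) {
        host[0] = '\0';
        snprintf(path, pathsz, "%s", arg);
    } else {
        snprintf(host, hostsz, "%.*s", (int)(colon - arg), arg);
        snprintf(path, pathsz, "%s", colon + 1);
    }
}

int main(void)
{
    char host[64], path[256];
    split_address("Cory:/usr/pascal/sh", host, sizeof host, path, sizeof path);
    printf("machine '%s', path '%s'\n", host, path);
    return 0;
}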

Likewise, the existing mail application was modified to understand the same addressing scheme, supported by a utility that automatically added headers indicating where a message originated. On the receiving side, a new mail-retrieval program logs into a named remote machine over the network and reads any mail found there; when this completes, it copies the messages into the user's local mailbox, where they can be read and replied to as normal. Automation of these separate tasks was quickly introduced.

Protocol

The underlying transfer protocol was arranged in three layers. The uppermost was the command protocol, which defined the overall file structure; below this was the stream protocol, which broke the file into multiple packets; and at the bottom was the hardware layer, which used the serial ports.

The command layer consisted primarily of a file format that started with a length indicator, followed by a header, a command to run on the remote machine, and then any required data. The sending program assembles this file, prepends the length indicator, and places the result in the proper queue. The header contains the origin and destination machine names, the login names to use on both, a version number, time stamps, and the "pseudo-command". The header data is written as normal ASCII text with the fields separated by colons.
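As an illustration of that layout, and only that, the following C fragment assembles a queue entry in the spirit of the description: a colon-separated ASCII header, the remote command, and a prepended length count. The field order, the newline separators and the machine, login and command names are all assumptions, not the documented Berknet format.

/* Sketch of assembling a command-layer queue entry (illustrative only;
 * the real Berknet header differed in detail). */
#include <stdio.h>
#include <string.h>
#include <time.h>

int main(void)
{
    char header[512], entry[1024];

    /* origin : destination : local login : remote login : version : time : pseudo-command */
    snprintf(header, sizeof header, "A:Cory:schmidt:schmidt:1:%ld:write",
             (long)time(NULL));

    const char *command = "netcp testfile.txt /usr/pascal/sh";

    /* Prepend the length of the header and command, plus the separating newline. */
    int body_len = (int)(strlen(header) + 1 + strlen(command));
    snprintf(entry, sizeof entry, "%d\n%s\n%s\n", body_len, header, command);

    fputs(entry, stdout);   /* in practice this would be written into a queue file */
    return 0;
}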

During a transfer, the file is broken into packets for error correction purposes. The packets have a header consisting of a length byte that allows up to 255 bytes in a packet, followed by a two-byte packet number, a one-byte type indicator, a one-byte checksum, and then the data. The checksum is placed at the front because the packets are variable length, and this simplified the code. There are several packet types, which contain command information or data. Among these is the "reset" packet, which indicates the start of a new transfer from a remote machine. The protocol was very similar to XMODEM in complexity, lacking sliding windows or streaming acknowledgement, features that were considered important to add to future versions.
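A rough C sketch of that framing is below. The header layout follows the description (length byte, two-byte packet number, type, checksum, data), but the checksum algorithm, the byte order of the packet number and the type codes are assumptions made for illustration.

/* Sketch of the stream-layer packet framing (illustrative assumptions:
 * additive checksum, big-endian packet number, invented type codes). */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PKT_TYPE_DATA  0
#define PKT_TYPE_RESET 1   /* marks the start of a new transfer */

/* Builds a packet into out[] and returns its total size in bytes. */
static size_t build_packet(uint8_t *out, uint16_t seq, uint8_t type,
                           const uint8_t *data, uint8_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i];                /* simple byte-sum checksum */

    out[0] = len;                      /* data length, at most 255 bytes */
    out[1] = (uint8_t)(seq >> 8);      /* packet number, high byte       */
    out[2] = (uint8_t)(seq & 0xff);    /* packet number, low byte        */
    out[3] = type;                     /* data, reset, ...               */
    out[4] = sum;                      /* checksum ahead of the data     */
    memcpy(out + 5, data, len);
    return 5 + (size_t)len;
}

int main(void)
{
    uint8_t pkt[5 + 255];
    const uint8_t payload[] = "hello";
    size_t n = build_packet(pkt, 1, PKT_TYPE_DATA, payload, sizeof payload - 1);
    printf("built a %zu-byte packet\n", n);
    return 0;
}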

At the hardware level, the system was simply a serial port connecting two machines. However, the existing Unix terminal handling could not pass 8-bit binary data directly without considerable overhead, so the data being sent is encoded by packing three 8-bit bytes into four 6-bit characters in the printable range. This introduces an overhead of 33%, which was also considered an important area for possible improvement.
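The packing step can be sketched in C: three input bytes become four output characters carrying six bits each. The mapping from a 6-bit value to a printable character is not specified above; offsetting into the printable range starting at the space character, as uuencode later did, is an assumption here.

/* Sketch of packing three 8-bit bytes into four printable characters
 * (the value-to-character mapping is an assumption for illustration). */
#include <stdint.h>
#include <stdio.h>

/* Encodes three 8-bit bytes as four characters, six bits per character. */
static void encode3(const uint8_t in[3], char out[4])
{
    out[0] = (char)(' ' + (in[0] >> 2));
    out[1] = (char)(' ' + (((in[0] & 0x03) << 4) | (in[1] >> 4)));
    out[2] = (char)(' ' + (((in[1] & 0x0f) << 2) | (in[2] >> 6)));
    out[3] = (char)(' ' + (in[2] & 0x3f));
}

int main(void)
{
    const uint8_t raw[3] = { 0x42, 0x13, 0x37 };
    char text[5] = { 0 };
    encode3(raw, text);
    printf("3 raw bytes -> 4 printable characters: %s\n", text);
    return 0;
}

Sending four characters for every three bytes of payload is where the 33% overhead figure comes from.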

References

  1. Shacklette, Mark. "Unix Operating System". In The Internet Encyclopedia. Wiley, 2004, p. 497. ISBN 9780471222019. Retrieved April 28, 2020.
  2. Lerner, Josh; Tirole, Jean. "Some simple economics of open source". NBER Working Paper Series, National Bureau of Economic Research, 2000. Retrieved April 30, 2020.
  3. Hauben, Michael; Hauben, Ronda. Netizens: On the History and Impact of Usenet and the Internet. Wiley, 1997, p. 170. ISBN 978-0-8186-7706-9.
  4. Vixie, Paul A.; Avolio, Frederick M. Sendmail: Theory and Practice. Elsevier, 2002, p. 3. ISBN 9781555582296.
  5. Horton, Mark R. "What is a Domain?". Software Tools Users Group [Sof84], 1986, pp. 368–372. Retrieved April 28, 2020.