Burroughs B6x00-7x00 instruction set explained

Burroughs B6x00-7x00 instruction set should not be confused with B5000 instruction set.

The Burroughs B6x00-7x00 instruction set includes the set of valid operations for the Burroughs B6500,^[1] B7500 and later Burroughs large systems, including the current (as of 2006) Unisys Clearpath/MCP systems; it does not include the instruction for other Burroughs large systems including the B5000, B5500, B5700 and the B8500. These unique machines have a distinctive design and instruction set. Each word of data is associated with a type, and the effect of an operation on that word can depend on the type. Further, the machines are stack based to the point that they had no user-addressable registers.

Overview

As you would expect from the unique architecture used in these systems, they also have an interesting instruction set. Programs are made up of 8-bit syllables, which may be Name Call, be Value Call or form an operator, which may be from one to twelve syllables in length. There are less than 200 operators, all of which fit into 8-bit syllables. If we ignore the powerful string scanning, transfer, and edit operators, the basic set is only about 120 operators. If we remove the operators reserved for the operating system such as MVST and HALT, the set of operators commonly used by user-level programs is less than 100. The Name Call and Value Call syllables contain address couples; the Operator syllables either use no addresses or use control words and descriptors on the stack.

Since there are no programmer-addressable registers, most of the register manipulating operations required in other architectures are not needed, nor are variants for performing operations between pairs of registers, since all operations are applied to the top of the stack. This also makes code files very compact, since operators are zero-address and do not need to include the address of registers or memory locations in the code stream. Some of the code density was due to moving vital operand information elsewhere, to 'tags' on every data word or into tables of pointers. Many of the operators are generic or polymorphic depending on the kind of data being acted on as given by the tag. The generic opcodes required fewer opcode bits but made the hardware more like an interpreter, with less opportunity to pipeline the common cases.

For example, the instruction set has only one ADD operator. It had to fetch the operand to discover whether this was an integer add or floating point add. Typical architectures require multiple operators for each data type, for example add.i, add.f, add.d, add.l for integer, float, double, and long data types. The architecture only distinguishes single and double precision numbers - integers are just reals with a zero exponent. When one or both of the operands has a tag of 2, a double precision add is performed, otherwise tag 0 indicates single precision. Thus the tag itself is the equivalent of the operator .i, .f, .d, and .l extension. This also means that the code and data can never be mismatched.

Two operators are important in the handling of on-stack data - Value Call (VALC) and Name Call (NAMC). These are two-bit operators, 00 being VALC and 01 being NAMC. The following six bits of the syllable, concatenated with the following syllable, provide the address couple. Thus VALC covers syllable values 0000 to 3FFF and NAMC 4000 to 7FFF.

VALC is another polymorphic operator. If it hits a data word, that word is loaded to the top of stack. If it hits an IRW, that is followed, possibly in a chain of IRWs until a data word is found. If a PCW is found, then a function is entered to compute the value and the VALC does not complete until the function returns.

NAMC simply loads the address couple onto the top of the stack as an IRW (with the tag automatically set to 1).

Static branches (BRUN, BRFL, and BRTR) used two additional syllables of offset. Thus arithmetic operations occupied one syllable, addressing operations (NAMC and VALC) occupied two, branches three, and long literals (LT48) five. As a result, code was much denser (had better entropy) than a conventional RISC architecture in which each operation occupies four bytes. Better code density meant fewer instruction cache misses and hence better performance running large-scale code.

In the following operator explanations remember that A and B are the top two stack registers. Double precision extensions are provided by the X and Y registers; thus the top two double precision operands are given by AX and BY. (Mostly AX and BY is implied by just A and B.)

B6x00/7x00 Address Couple
Current LL	Lexical Level bits	Index bits
0-1	13	12-0
2-3	13-12	11-0
4-7	13-11	10-0
8-15	13-10	9-0
16-31	13-9	8-0

Tags and control words

In the B6500, a word has 48 bits of data and three tag bits. extended to three bits outside of the 48 bit word into a tag. The data bits are bits 0–47 and the tag is in bits 48–50. Bit 48 is the read-only bit, thus odd tags indicated control words that cannot be written by a user-level program. Code words are given tag 3. Here is a list of the tags and their function:

Tag	Word kind	Description
0	Data	All kinds of user and system data (text data and single precision numbers)
2	Double	Double Precision data
4	SIW	Step Index word (used in loops)
6		Uninitialized data
6		SCW	Software Control Word (used to cut back the stack)
1	IRW	Indirect Reference Word
1		SIRW	Stuffed Indirect Reference Word
3	Code	Program code word
		MSCW	Mark Stack Control Word
		RCW	Return Control Word
		TOSCW	Top of Stack Control Word
		SD	Segment Descriptor
5	Descriptor	Data block descriptors
7	PCW	Program Control Word

The current incarnation of these machines, the Unisys ClearPath has extended tags further into a four bit tag. The microcode level that specified four bit tags was referred to as level Gamma.

Even-tagged words are user data which can be modified by a user program as user state. Odd-tagged words are created and used directly by the hardware and represent a program's execution state. Since these words are created and consumed by specific instructions or the hardware, the exact format of these words can change between hardware implementation and user programs do not need to be recompiled, since the same code stream will produce the same results, even though system word format may have changed.

Tag 1 words represent on-stack data addresses. The normal IRW simply stores an address couple to data on the current stack. The SIRW references data on any stack by including a stack number in the address.

Tag 5 words are descriptors, which are more fully described in the next section. Tag 5 words represent off-stack data addresses.

Tag 7 is the program control word which describes a procedure entry point. When operators hit a PCW, the procedure is entered. The ENTR operator explicitly enters a procedure (non-value-returning routine). Functions (value-returning routines) are implicitly entered by operators such as value call (VALC). Global routines are stored in the D[2] environment as SIRWs that point to a PCW stored in the code segment dictionary in the D[1] environment. The D[1] environment is not stored on the current stack because it can be referenced by all processes sharing this code. Thus code is reentrant and shared.

Tag 3 represents code words themselves, which won't occur on the stack. Tag 3 is also used for the stack control words MSCW, RCW, TOSCW.

Display registers

A stack hardware optimization is the provision of D (or "display") registers. These are registers that point to the start of each called stack frame. These registers are updated automatically as procedures are entered and exited and are not accessible by any software. There are 32 D registers, which is what limits to 32 levels of lexical nesting.

Consider how we would access a lexical level 2 (D[2]) global variable from lexical level 5 (D[5]). Suppose the variable is 6 words away from the base of lexical level 2. It is thus represented by the address couple (2, 6). If we don't have D registers, we have to look at the control word at the base of the D[5] frame, which points to the frame containing the D[4] environment. We then look at the control word at the base of this environment to find the D[3] environment, and continue in this fashion until we have followed all the links back to the required lexical level. This is not the same path as the return path back through the procedures which have been called in order to get to this point. (The architecture keeps both the data stack and the call stack in the same structure, but uses control words to tell them apart.)

As you can see, this is quite inefficient just to access a variable. With D registers, the D[2] register points at the base of the lexical level 2 environment, and all we need to do to generate the address of the variable is to add its offset from the stack frame base to the frame base address in the D register. (There is an efficient linked list search operator LLLU, which could search the stack in the above fashion, but the D register approach is still going to be faster.) With D registers, access to entities in outer and global environments is just as efficient as local variable access.

D Tag Data — Address couple, Comments register

| 0 | n | (4, 1) The integer n (declared on entry to a block, not a procedure) |-----------------------| | D[4]

>3 | MSCW | (4, 0) The Mark Stack Control Word containing the link to D[3]. |

| | 0 | r2 | (3, 5) The real r2 |-----------------------| | 0 | r1 | (3, 4) The real r1 |-----------------------| | 1 | p2 | (3, 3) A SIRW reference to g at (2,6) |-----------------------| | 0 | p1 | (3, 2) The parameter p1 from value of f |-----------------------| | 3 | RCW | (3, 1) A return control word |-----------------------| | D[3]

>3 | MSCW | (3, 0) The Mark Stack Control Word containing the link to D[2]. |

| | 1 | a | (2, 7) The array a

=>[ten word memory block] |-----------------------| | 0 | g | (2, 6) The real g |-----------------------| | 0 | f | (2, 5) The real f |-----------------------| | 0 | k | (2, 4) The integer k |-----------------------| | 0 | j | (2, 3) The integer j |-----------------------| | 0 | i | (2, 2) The integer i |-----------------------| | 3 | RCW | (2, 1) A return control word |-----------------------| | D[2]

>3 | MSCW | (2, 0) The Mark Stack Control Word containing the link to the previous stack frame. |

| — Stack bottom

If we had invoked the procedure p as a coroutine, or a process instruction, the D[3] environment would have become a separate D[3]-based stack. This means that asynchronous processes still have access to the D[2] environment as implied in ALGOL program code. Taking this one step further, a totally different program could call another program's code, creating a D[3] stack frame pointing to another process' D[2] environment on top of its own process stack. At an instant the whole address space from the code's execution environment changes, making the D[2] environment on the own process stack not directly addressable and instead make the D[2] environment in another process stack directly addressable. This is how library calls are implemented. At such a cross-stack call, the calling code and called code could even originate from programs written in different source languages and be compiled by different compilers.

The D[1] and D[0] environments do not occur in the current process's stack. The D[1] environment is the code segment dictionary, which is shared by all processes running the same code. The D[0] environment represents entities exported by the operating system.

Stack frames actually don't even have to exist in a process stack. This feature was used early on for file I/O optimization, the FIB (file information block) was linked into the display registers at D[1] during I/O operations. In the early nineties, this ability was implemented as a language feature as STRUCTURE BLOCKs and – combined with library technology - as CONNECTION BLOCKs. The ability to link a data structure into the display register address scope implemented object orientation. Thus, the B6500 actually used a form of object orientation long before the term was ever used.

On other systems, the compiler might build its symbol table in a similar manner, but eventually the storage requirements would be collated and the machine code would be written to use flat memory addresses of 16-bits or 32-bits or even 64-bits. These addresses might contain anything so that a write to the wrong address could damage anything. Instead, the two-part address scheme was implemented by the hardware. At each lexical level, variables were placed at displacements up from the base of the level's stack, typically occupying one word - double precision or complex variables would occupy two. Arrays were not stored in this area, only a one word descriptor for the array was. Thus, at each lexical level the total storage requirement was not great: dozens, hundreds or a few thousand in extreme cases, certainly not a count requiring 32-bits or more. And indeed, this was reflected in the form of the VALC instruction (value call) that loaded an operand onto the stack. This op-code was two bits long and the rest of the byte's bits were concatenated with the following byte to give a fourteen-bit addressing field. The code being executed would be at some lexical level, say six: this meant that only lexical levels zero to six were valid, and so just three bits were needed to specify the lexical level desired. The address part of the VALC operation thus reserved just three bits for that purpose, with the remainder being available for referring to entities at that and lower levels. A deeply nested procedure (thus at a high lexical level) would have fewer bits available to identify entities: for level sixteen upwards five bits would be needed to specify the choice of levels 0–31 thus leaving nine bits to identify no more than the first 512 entities of any lexical level. This is much more compact than addressing entities by their literal memory address in a 32-bit addressing space. Further, only the VALC opcode loaded data: opcodes for ADD, MULT and so forth did no addressing, working entirely on the top elements of the stack.

Much more important is that this method meant that many errors available to systems employing flat addressing could not occur because they were simply unspeakable even at the machine code level. A task had no way to corrupt memory in use by another task, because it had no way to develop its address. Offsets from a specified D-register would be checked by the hardware against the stack frame bound: rogue values would be trapped. Similarly, within a task, an array descriptor contained information on the array's bounds, and so any indexing operation was checked by the hardware: put another way, each array formed its own address space. In any case, the tagging of all memory words provided a second level of protection: a misdirected assignment of a value could only go to a data-holding location, not to one holding a pointer or an array descriptor, etc. and certainly not to a location holding machine code.

Arithmetic operators

: Add top two stack operands (B := B + A or BY := BY + AX if double precision)

Comparison operators

: Is B < A?

Logical operators

: Logical bitwise and of all bits in operands

Branch and call operators

: Branch unconditional (offset given by following code syllables)

Bit and field operators

: Bit set (bit number given by syllable following instruction)

Literal operators

: Load following code word onto top of stack

Descriptor operators

: Index create a pointer (copy descriptor) from a base (MOM) descriptor

Stack operators

: Push down stack register

Store operators

: Store destructive (if the target word has an odd tag throw a memory protect interrupt, - store the value in the B register at the memory addressed by the A register. - Delete the value off the stack.

Load operators

The Load instruction could find itself tripping on an indirect address, or worse, a disguised call to a call-by-name thunk routine.

: Load the value given by the address (tag 5 or tag 1 word) on the top of stack. - Follow an address chain if necessary.

Transfer operators

These were used for string transfers usually until a certain character was detected in the source string.All these operators are protected from buffer overflows by being limited by the bounds in the descriptors.

: Transfer while false, destructive (forget pointer)

Scan operators

These were used for scanning strings useful in writing compilers.All these operators are protected from buffer overflows by being limited by the bounds in the descriptors.

: Scan while false, destructive

: Compare characters less, destructive

System

: Set interval timer

Other

: Escape to extended (variable instructions which were less frequent)

Edit operators

These were special operators for sophisticated string manipulation, particularly for business applications.

: Move with insert - insert characters in a string

Notes and References

Burroughs . Burroughs B6500 Information Processing System Reference Manual . 1043676 . September 1969 . cs2 .