LOADALL

LOADALL is the common name for two different, undocumented machine instructions of the Intel 80286 and Intel 80386 processors. They allow access to areas of the internal processor state that are normally outside the scope of the documented programming interface, such as the descriptor cache registers. The LOADALL for 286 processors is encoded as 0Fh 05h, while the LOADALL for 386 processors is encoded as 0Fh 07h.

Both variants – as the name implies – load all CPU internal registers in one operation. LOADALL had the unique ability to set up the visible part of the segment registers (selector) independently of their corresponding cached part, allowing the programmer to bring the CPU into states not otherwise allowed by the official programming model.
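
The split between the visible selector and the cached part can be illustrated with a small toy model. This is only a sketch: the class, its field names and the example values are hypothetical, and it models nothing beyond a base address and a limit.

```python
from dataclasses import dataclass


@dataclass
class SegmentRegister:
    """Toy model of one segment register (all names are hypothetical)."""
    selector: int = 0           # visible part, what software reads back
    cached_base: int = 0        # hidden part, used to form addresses
    cached_limit: int = 0xFFFF  # hidden part, highest valid offset

    def load_real_mode(self, value: int) -> None:
        # Ordinary real-mode load: the cached base is derived from the selector.
        self.selector = value & 0xFFFF
        self.cached_base = self.selector << 4

    def loadall(self, selector: int, base: int, limit: int) -> None:
        # LOADALL-style load: selector and cache are set independently,
        # and nothing checks that they agree with each other.
        self.selector = selector & 0xFFFF
        self.cached_base = base
        self.cached_limit = limit

    def address(self, offset: int) -> int:
        if offset > self.cached_limit:
            raise ValueError("offset beyond cached segment limit")
        return self.cached_base + offset


ds = SegmentRegister()
ds.load_real_mode(0x1234)
normal = ds.address(0x0010)      # base 0x12340 plus offset 0x10

ds.loadall(selector=0x1234, base=0x200000, limit=0xFFFF)
above_1mb = ds.address(0x0010)   # same selector, base now at 2 MiB
```

After the second load the selector still reads 0x1234, but addresses are formed from the independently chosen base, which is the kind of state the official programming model does not allow.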

Usage

As an example of the usefulness of these techniques, LOADALL can set up the CPU to allow access to all memory from real mode, without having to switch it into unreal mode (which requires switching into protected mode, accessing memory and finally switching back to real mode). The 286 LOADALL instruction was used by programs such as the pre-XMS versions of RAMDRIVE.SYS (1985) and SMARTDRV.SYS (1986), as well as the HIMEM.SYS (2.03, 1988-08-04; 2.04, 1988-08-17) drivers in MS-DOS; Uniform Software Systems' The Extender (1985) and The Connector (1985) for Lotus 1-2-3; Above Disk (1986), a LIMulator by Above Software (formerly Tele-Ware West, also known as Los Angeles Securities Group) that converted hard disk space or extended memory into expanded memory; and OS/2 1.0 and 1.1. DOS 3.3 and 4.0 reserved a 102-byte buffer at 0070:0100h (an area normally occupied by DOS BIOS data) so that it did not have to be saved and restored when LOADALL was used. Microsoft's EMM386.EXE special-cases both the 286 and 386 LOADALL instructions in its invalid opcode handler. Examination of the virtual-machine monitor code in Windows/386 2.10 shows that it uses both the 286 variant and the even lesser-known 386 variant. Microsoft's HIMEM.SYS version 2.06 also used LOADALL to quickly copy to and from extended memory on 286 systems.
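
The location of the reserved DOS buffer can be checked with ordinary real-mode address arithmetic (segment times 16 plus offset). The helper name below is hypothetical. The calculation shows that 0070:0100h is the same physical address, 0x00800, from which the 286 LOADALL reads its data.

```python
def real_mode_physical(segment: int, offset: int) -> int:
    """Real-mode address translation: segment * 16 + offset."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


buffer_start = real_mode_physical(0x0070, 0x0100)
buffer_end = buffer_start + 102 - 1   # last byte of the 102-byte buffer

print(f"{buffer_start:05X}h-{buffer_end:05X}h")
```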

Another interesting usage of LOADALL, laid out in the book The Design of OS/2, would have been to allow running former real-mode programs in 16-bit protected mode, as utilized by Digital Research's Concurrent DOS 286 since 1985, as well as FlexOS 286 and IBM 4680 OS since 1986. Marking all the descriptors in the GDT and LDTs "not present" would allow the operating system to trap segment-register reloads, as well as attempts at performing real-mode-specific "segment arithmetic", and to emulate the desired behavior by updating the segment descriptors (again using LOADALL). This "8086 emulation mode" for the 80286 was, however, too slow to be practical. The idea had to be mostly discarded due to errata in some early Intel 80286 processors before the E-2 stepping. As a result, OS/2 1.x, and Windows in "standard" mode as well, had to run DOS programs in real mode. Nevertheless, the idea was not lost; it led Intel to introduce the virtual 8086 mode of the 80386, which at last allowed "DOS boxes" to be implemented in a relatively efficient and documented way.

Because LOADALL did not perform any checks on the validity of the data loaded into processor registers, it was possible to load a processor state that could not normally be entered, such as real mode (PE=0) combined with paging (PG=1) on 386-class CPUs.
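
The PE/PG example can be made concrete with a small check on a CR0 value. This is a minimal sketch: the function name is hypothetical, and it encodes only the one rule mentioned here (paging requires protected mode), using the architectural bit positions of PE (bit 0) and PG (bit 31).

```python
CR0_PE = 1 << 0    # protection enable
CR0_PG = 1 << 31   # paging enable


def is_normally_reachable(cr0: int) -> bool:
    """Return False for the PE=0, PG=1 combination that ordinary
    control-register writes reject but LOADALL would load unchecked."""
    protected = bool(cr0 & CR0_PE)
    paging = bool(cr0 & CR0_PG)
    return protected or not paging


real_mode = 0
protected_paged = CR0_PE | CR0_PG
real_mode_paged = CR0_PG          # the state described above

for value in (real_mode, protected_paged, real_mode_paged):
    print(f"CR0={value:08X}h reachable={is_normally_reachable(value)}")
```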

An in-circuit emulator (ICE) is a tool used for low-level debugging. On the Intel 80386, asserting the undocumented pin at location B6 causes the microprocessor to halt execution and enter ICE mode. The microprocessor saves its entire state to an area of memory isolated from normal system memory. The layout of this area is suitable for the LOADALL instruction, which the ICE code uses to return to normal execution.

In later processors, this evolved into System Management Mode (SMM). In SMM, the RSM instruction loads a full CPU state from a memory area whose layout is similar to the one used by the LOADALL instruction. The 386-style LOADALL instruction can also be executed on the 486, but only in SMM. In later processors, its role was taken over by the RSM instruction, which has a different encoding.

Microsoft's CodeView 3.0 and Borland's Turbo Debugger 2.0 correctly decode the 286 and 386 LOADALL instructions.

As the two LOADALL instructions were never documented and do not exist on later processors, the opcodes were reused in the AMD64 architecture. The opcode for the 286 LOADALL instruction, 0F05, became the AMD64 instruction SYSCALL; the 386 LOADALL instruction, 0F07, became the SYSRET instruction. These definitions were implemented even on Intel CPUs with the introduction of the Intel 64 implementation of AMD64.
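
The reuse of the two opcodes can be summarized as a lookup keyed by processor family. This is only an illustrative sketch: the table, the family labels and the function name are hypothetical, and any combination not mentioned in the text is reported as not covered rather than guessed.

```python
# (CPU family, second opcode byte after 0Fh) -> mnemonic, as described above.
TWO_BYTE_OPCODES = {
    ("80286", 0x05): "LOADALL",
    ("80386", 0x07): "LOADALL",
    ("amd64", 0x05): "SYSCALL",
    ("amd64", 0x07): "SYSRET",
}


def mnemonic(cpu: str, code: bytes) -> str:
    """Name the instruction for a two-byte 0Fh-prefixed opcode."""
    if len(code) < 2 or code[0] != 0x0F:
        return "not a 0Fh-prefixed opcode"
    return TWO_BYTE_OPCODES.get((cpu, code[1]), "not covered by this sketch")


print(mnemonic("80286", bytes([0x0F, 0x05])))
print(mnemonic("amd64", bytes([0x0F, 0x05])))
```

The same byte sequence therefore has to be interpreted according to the processor it runs on, which is why a disassembler needs to know its target.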

80286

Opcode 0F05. The instruction reads 102 bytes of data starting at address 0x00800 (addresses 0x00800–0x00865), regardless of the contents of the segment registers.

Address  Number of bytes  Registers
00800    6                not used
00806    2                MSW (machine status word)
00808    14               not used
00816    2                TR (task register)
00818    2                flags
0081A    2                IP (instruction pointer)
0081C    2                LDTR (local descriptor table register)
0081E    4 × 2            DS (data segment), SS (stack segment), CS (code segment), ES (extra segment)
00826    4 × 2            DI (destination index), SI (source index), BP (base pointer), SP (stack pointer)
0082E    4 × 2            BX, DX, CX, AX
00836    4 × 6            ES segment descriptor, CS segment descriptor, SS segment descriptor, DS segment descriptor
0084E    4 × 6            GDT (global descriptor table), LDT (local descriptor table), IDT (interrupt descriptor table), TSS (task state segment)

The 80286 LOADALL instruction cannot be used to switch from protected mode back to real mode, because it cannot clear the PE bit in the MSW. However, use of the LOADALL instruction can avoid the need to switch to protected mode altogether.
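
The 80286 table can be cross-checked by deriving each address from the field sizes. The sketch below assumes that the registers within each row occupy ascending addresses in the order listed; the field names and the helper function are hypothetical.

```python
LOADALL_286_BASE = 0x00800

# (name, size in bytes), in the order the table lists them.
# Assumption: fields within a row are laid out in ascending address order.
FIELDS_286 = [
    ("unused_0", 6), ("MSW", 2), ("unused_1", 14),
    ("TR", 2), ("FLAGS", 2), ("IP", 2), ("LDTR", 2),
    ("DS", 2), ("SS", 2), ("CS", 2), ("ES", 2),
    ("DI", 2), ("SI", 2), ("BP", 2), ("SP", 2),
    ("BX", 2), ("DX", 2), ("CX", 2), ("AX", 2),
    ("ES_desc", 6), ("CS_desc", 6), ("SS_desc", 6), ("DS_desc", 6),
    ("GDT", 6), ("LDT", 6), ("IDT", 6), ("TSS", 6),
]


def field_addresses(base, fields):
    """Return ({name: address}, total size) for consecutive fields."""
    addresses = {}
    cursor = base
    for name, size in fields:
        addresses[name] = cursor
        cursor += size
    return addresses, cursor - base


ADDR_286, TOTAL_286 = field_addresses(LOADALL_286_BASE, FIELDS_286)

for name in ("MSW", "TR", "DS", "DI", "BX", "ES_desc", "GDT"):
    print(f"{name:8s} {ADDR_286[name]:05X}")
print("total bytes:", TOTAL_286)
```

The row start addresses computed this way match the table, and the total comes to 102 bytes, the size of the buffer that DOS 3.3 and 4.0 reserved.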

80386

Opcode 0F07. The instruction loads data from the address ES:EDI. Note that it uses ES itself, not the ES descriptor.

Address      Number of bytes  Registers
ES:EDI+00    4                CR0 (control register 0)
ES:EDI+04    4                EFLAGS
ES:EDI+08    4                EIP (instruction pointer)
ES:EDI+0C    4 × 4            EDI (destination index), ESI (source index), EBP (base pointer), ESP (stack pointer)
ES:EDI+1C    4 × 4            EBX, EDX, ECX, EAX
ES:EDI+2C    2 × 4            DR6, DR7
ES:EDI+34    4                TR (task state selector)
ES:EDI+38    4                LDTR (local descriptor table)
ES:EDI+3C    4 × 2            GS (extra segment), not used, FS (extra segment), not used
ES:EDI+44    4 × 2            DS (data segment), not used, SS (stack segment), not used
ES:EDI+4C    4 × 2            CS (code segment), not used, ES (extra segment), not used
ES:EDI+54    4 × 12           TSS descriptor (task state selector), IDT descriptor (interrupt descriptor table), GDT descriptor (global descriptor table), LDT descriptor (local descriptor table)
ES:EDI+84    4 × 12           GS segment descriptor, FS segment descriptor, DS segment descriptor, SS segment descriptor
ES:EDI+B4    2 × 12           CS segment descriptor, ES segment descriptor
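
As a cross-check of the 80386 table, the sketch below derives each offset from the field sizes and fills a blank state image with a few little-endian values. It assumes that the fields within each row occupy ascending addresses in the order listed. The field names, the helper function and the example values are hypothetical, and the 12-byte descriptor fields are left zeroed because their internal format is not described here.

```python
import struct

# (name, size in bytes), in table order.
# Assumption: fields within a row are laid out in ascending address order.
FIELDS_386 = [
    ("CR0", 4), ("EFLAGS", 4), ("EIP", 4),
    ("EDI", 4), ("ESI", 4), ("EBP", 4), ("ESP", 4),
    ("EBX", 4), ("EDX", 4), ("ECX", 4), ("EAX", 4),
    ("DR6", 4), ("DR7", 4), ("TR", 4), ("LDTR", 4),
    ("GS", 2), ("pad_GS", 2), ("FS", 2), ("pad_FS", 2),
    ("DS", 2), ("pad_DS", 2), ("SS", 2), ("pad_SS", 2),
    ("CS", 2), ("pad_CS", 2), ("ES", 2), ("pad_ES", 2),
    ("TSS_desc", 12), ("IDT_desc", 12), ("GDT_desc", 12), ("LDT_desc", 12),
    ("GS_desc", 12), ("FS_desc", 12), ("DS_desc", 12), ("SS_desc", 12),
    ("CS_desc", 12), ("ES_desc", 12),
]

SIZES_386 = dict(FIELDS_386)
OFFSETS_386 = {}
_cursor = 0
for _name, _size in FIELDS_386:
    OFFSETS_386[_name] = _cursor   # offset relative to ES:EDI
    _cursor += _size
IMAGE_SIZE_386 = _cursor


def build_image(values):
    """Return a zero-filled state image with the given 2- or 4-byte fields set."""
    image = bytearray(IMAGE_SIZE_386)
    for name, value in values.items():
        size = SIZES_386[name]
        if size == 4:
            struct.pack_into("<I", image, OFFSETS_386[name], value)
        elif size == 2:
            struct.pack_into("<H", image, OFFSETS_386[name], value)
        else:
            raise ValueError("descriptor fields are not modelled in this sketch")
    return image


image = build_image({"EIP": 0x00001000, "CS": 0x0008})
print(f"image size: {IMAGE_SIZE_386:#x} bytes")
```

The computed row offsets agree with the table, and the whole image is 0xCC (204) bytes long.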
