LOADALL is the common name for two different, undocumented machine instructions of Intel 80286 and Intel 80386 processors, which allow access to areas of the internal processor state that are normally outside of the IA-32 API scope, like descriptor cache registers. The LOADALL for 286 processors is encoded 0Fh 05h, while the LOADALL for 386 processors is 0Fh 07h.
Both variants – as the name implies – load all CPU internal registers in one operation. LOADALL had the unique ability to set up the visible part of the segment registers (selector) independently of their corresponding cached part, allowing the programmer to bring the CPU into states not otherwise allowed by the official programming model.
As an example of the usefulness of these techniques, LOADALL can set up the CPU to allow access to all memory from real mode, without having to switch it into unreal mode (which requires switching into protected mode, accessing memory and finally switching back to real mode). Programs such as the pre-XMS versions of RAMDRIVE.SYS (1985), SMARTDRV.SYS (1986) as well as HIMEM.SYS (2.03, 1988-08-04; 2.04, 1988-08-17) drivers in MS-DOS, Uniform Software Systems' The Extender (1985) and The Connector (1985) for Lotus 1-2-3, Above Disk (1986) (a LIMulator by Above Software (formerly Tele-Ware West aka Los Angeles Securities Group) that converted hard disk space or extended memory into expanded memory), and OS/2 1.0 and 1.1 used the 286 LOADALL instruction. DOS 3.3 and 4.0 reserved a 102-byte buffer at 0070:0100h (physical address 00800h, which was normally occupied by DOS BIOS data) so that there was no need to save and restore it for LOADALL. Microsoft's EMM386.EXE special-cases both the 286 and 386 LOADALL instructions in its invalid opcode handler. Examination of the virtual-machine monitor code in Windows/386 2.10 shows that it uses both the 286 variant and the even lesser-known 386 variant. Microsoft's HIMEM.SYS version 2.06 also used LOADALL to quickly copy to and from extended memory on 286 systems.
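The split between the visible selector and the cached part of a segment register can be illustrated with a small model. This is only a sketch: the class `SegmentRegister` and its method names are hypothetical, and it models nothing beyond the selector, cached base and cached limit. It assumes the usual real-mode rule that an ordinary segment load sets the cached base to the selector times 16, and that the 286 has a 24-bit physical address space.

```python
from dataclasses import dataclass


@dataclass
class SegmentRegister:
    """Toy model of one 286 segment register (hypothetical, for illustration)."""
    selector: int = 0     # visible 16-bit part
    base: int = 0         # hidden, cached base address
    limit: int = 0xFFFF   # hidden, cached limit

    def load_real_mode(self, selector: int) -> None:
        # Ordinary real-mode load: the cached base always follows the selector.
        self.selector = selector & 0xFFFF
        self.base = self.selector << 4

    def loadall(self, selector: int, base: int, limit: int) -> None:
        # LOADALL-style load: selector and cached part are set independently.
        self.selector = selector & 0xFFFF
        self.base = base & 0xFFFFFF   # 24-bit physical address on the 286
        self.limit = limit & 0xFFFF

    def linear(self, offset: int) -> int:
        # Address formation uses the cached base, never the selector itself.
        if offset > self.limit:
            raise ValueError("offset exceeds cached segment limit")
        return self.base + offset


es = SegmentRegister()
es.load_real_mode(0xFFFF)
print(hex(es.linear(0xFFFF)))   # highest address reachable the ordinary way

es.loadall(selector=0x0000, base=0x200000, limit=0xFFFF)
print(hex(es.linear(0x0010)))   # an extended-memory address, still in real mode
```

In this model the second access lands above the first megabyte even though the selector is zero, which is the kind of state the official programming model does not allow.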
Another interesting usage of LOADALL, laid out in the book The Design of OS/2, would have been to allow running former real-mode programs in 16-bit protected mode, as utilized by Digital Research's Concurrent DOS 286 since 1985, as well as FlexOS 286 and IBM 4680 OS since 1986. Marking all the descriptors in the GDT and LDTs "not present" would allow the operating system to trap segment-register reloads, as well as attempts at performing real-mode–specific "segment arithmetic", and to emulate the desired behavior by updating the cached segment descriptors (LOADALL again). This "8086 emulation mode" for the 80286 was, however, too slow to be practical. The idea had to be mostly discarded due to errata in some early Intel 80286 processors before the E-2 stepping. As a result, OS/2 1.x – and Windows in "standard" mode as well – had to run DOS programs in real mode. Nevertheless, the idea was not lost; it led Intel to introduce the virtual 8086 mode of the 80386, allowing the implementation of "DOS boxes" at last in a relatively efficient and documented way.
Because LOADALL did not perform any checks on the validity of the data loaded into processor registers, it was possible to load a processor state that could not be normally entered, such as using real mode (PE=0) together with paging (PG=1) on 386-class CPUs.
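The PE/PG combination mentioned above can be made concrete with a short sketch. The bit positions used here (PE in bit 0 and PG in bit 31 of CR0) are the standard 386 definitions. The function name `describe_cr0` and the wording of its results are illustrative only.

```python
CR0_PE = 1 << 0    # protection enable
CR0_PG = 1 << 31   # paging enable


def describe_cr0(cr0: int) -> str:
    """Classify a CR0 value by its PE and PG bits (illustrative helper)."""
    pe = bool(cr0 & CR0_PE)
    pg = bool(cr0 & CR0_PG)
    if pg and not pe:
        return "real mode with paging (not reachable by documented means)"
    if pg:
        return "protected mode with paging"
    if pe:
        return "protected mode without paging"
    return "real mode"


for value in (0x00000000, 0x00000001, 0x80000001, 0x80000000):
    print(f"{value:#010x}: {describe_cr0(value)}")
```

The last value in the loop is the one a 386 LOADALL image could carry in its CR0 field without any validity check rejecting it.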
An in-circuit emulator (ICE) is a tool used for low-level debugging. On Intel 80386, asserting the undocumented pin at location B6 causes the microprocessor to halt execution and enter ICE mode. The microprocessor saves its entire state to an area of memory isolated from normal system memory. The layout of this area is suitable for the LOADALL instruction, and this instruction is used by ICE code to return to normal execution.
In later processors, this evolved into System Management Mode (SMM). In SMM, the RSM instruction loads a full CPU state from a memory area whose layout is similar to the one used by the LOADALL instruction. The 386-style LOADALL instruction can also be executed on the 486, but only in SMM. In later processors, the RSM instruction, which has a different encoding, took over its role.
Microsoft's CodeView 3.0 and Borland's Turbo Debugger 2.0 correctly decode the 286 and 386 LOADALL instructions.
As the two LOADALL instructions were never documented and do not exist on later processors, the opcodes were reused in the AMD64 architecture. The opcode for the 286 LOADALL instruction, 0F05, became the AMD64 instruction SYSCALL; the 386 LOADALL instruction, 0F07, became the SYSRET instruction. These definitions were implemented even on Intel CPUs with the introduction of the Intel 64 implementation of AMD64.
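The opcode reuse described above means that a disassembler has to know the target processor before it can name these two byte sequences. A minimal sketch follows. The function `classify`, the CPU labels `"286"`, `"386"` and `"amd64"`, and the fallback strings are all hypothetical, and the table covers only the four cases stated in the text.

```python
# (second opcode byte, CPU label) -> mnemonic; only the cases from the text.
MEANING = {
    (0x05, "286"): "LOADALL (286 form)",
    (0x07, "386"): "LOADALL (386 form)",
    (0x05, "amd64"): "SYSCALL",
    (0x07, "amd64"): "SYSRET",
}


def classify(code: bytes, cpu: str) -> str:
    """Name a two-byte 0F-prefixed opcode for the given CPU label."""
    if len(code) < 2 or code[0] != 0x0F:
        return "not a two-byte 0F opcode"
    return MEANING.get((code[1], cpu), "not covered by this sketch")


for cpu in ("286", "386", "amd64"):
    print(cpu, classify(bytes([0x0F, 0x05]), cpu), "|",
          classify(bytes([0x0F, 0x07]), cpu))
```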
Opcode 0F05. The instruction reads the 102 bytes at addresses 0x00800–0x00865, regardless of the contents of the segment registers.
Address | Number of bytes | Register | Register | Register | Register |
---|---|---|---|---|---|
00800 | 6 | not used | | | |
00806 | 2 | MSW, machine status word | | | |
00808 | 14 | not used | | | |
00816 | 2 | TR (task register) | | | |
00818 | 2 | flags | | | |
0081A | 2 | IP (instruction pointer) | | | |
0081C | 2 | LDTR, local descriptor table register | | | |
0081E | 4× 2 | DS (data segment) | SS (stack segment) | CS (code segment) | ES (extra segment) |
00826 | 4× 2 | DI (destination index) | SI (source index) | BP (base pointer) | SP (stack pointer) |
0082E | 4× 2 | BX | DX | CX | AX |
00836 | 4× 6 | ES segment descriptor | CS segment descriptor | SS segment descriptor | DS segment descriptor |
0084E | 4× 6 | GDT, global descriptor table | LDT, local descriptor table | IDT, interrupt descriptor table | TSS, task state segment |
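The 286 table can be turned into a small layout description that derives each address from the field sizes, which is a convenient way to cross-check the offsets. This is a sketch only: the field names, the `layout` helper and `build_image_286` are hypothetical. Descriptor entries are kept as opaque 6-byte values, since their internal format is not given here, and integer registers are stored little-endian as on any x86 processor.

```python
BASE_286 = 0x800

# (name, size in bytes), in the order listed in the table above.
FIELDS_286 = [
    ("unused_0", 6), ("MSW", 2), ("unused_1", 14), ("TR", 2), ("FLAGS", 2),
    ("IP", 2), ("LDTR", 2),
    ("DS", 2), ("SS", 2), ("CS", 2), ("ES", 2),
    ("DI", 2), ("SI", 2), ("BP", 2), ("SP", 2),
    ("BX", 2), ("DX", 2), ("CX", 2), ("AX", 2),
    ("ES_desc", 6), ("CS_desc", 6), ("SS_desc", 6), ("DS_desc", 6),
    ("GDT", 6), ("LDT", 6), ("IDT", 6), ("TSS", 6),
]


def layout(fields, base):
    """Return ({name: (address, size)}, total size) for consecutive fields."""
    table, addr = {}, base
    for name, size in fields:
        table[name] = (addr, size)
        addr += size
    return table, addr - base


OFFSETS_286, SIZE_286 = layout(FIELDS_286, BASE_286)


def build_image_286(values):
    """Build the memory image; ints are registers, bytes are raw descriptors."""
    image = bytearray(SIZE_286)
    for name, value in values.items():
        addr, size = OFFSETS_286[name]
        raw = bytes(value) if isinstance(value, (bytes, bytearray)) \
            else value.to_bytes(size, "little")
        if len(raw) != size:
            raise ValueError(f"{name} needs exactly {size} bytes")
        start = addr - BASE_286
        image[start:start + size] = raw
    return bytes(image)


print(SIZE_286, hex(OFFSETS_286["ES_desc"][0]), hex(OFFSETS_286["GDT"][0]))
image = build_image_286({"IP": 0x1234, "CS_desc": bytes(6)})
print(len(image), image[0x1A:0x1C].hex())
```

The computed total matches the 102-byte buffer that DOS 3.3 and 4.0 reserved for LOADALL.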
Opcode 0F07. The instruction loads data from address ES:EDI. It actually uses ES, not the ES descriptor.
Address | Number of bytes | Register | Register | Register | Register |
---|---|---|---|---|---|
ES:EDI+00 | 4 | CR0, control register 0 | | | |
ES:EDI+04 | 4 | EFLAGS | | | |
ES:EDI+08 | 4 | EIP, instruction pointer | | | |
ES:EDI+0C | 4× 4 | EDI, destination index | ESI, source index | EBP, base pointer | ESP, stack pointer |
ES:EDI+1C | 4× 4 | EBX | EDX | ECX | EAX |
ES:EDI+2C | 2× 4 | DR6 | DR7 | | |
ES:EDI+34 | 4 | TR, task state selector | | | |
ES:EDI+38 | 4 | LDTR, local descriptor table | | | |
ES:EDI+3C | 4× 2 | GS, extra segment | not used | FS, extra segment | not used |
ES:EDI+44 | 4× 2 | DS, data segment | not used | SS, stack segment | not used |
ES:EDI+4C | 4× 2 | CS, code segment | not used | ES, extra segment | not used |
ES:EDI+54 | 4× 12 | TSS descriptor, task state selector | IDT descriptor, interrupt descriptor table | GDT descriptor, global descriptor table | LDT descriptor, local descriptor table |
ES:EDI+84 | 4× 12 | GS segment descriptor | FS segment descriptor | DS segment descriptor | SS segment descriptor |
ES:EDI+B4 | 2× 12 | CS segment descriptor | ES segment descriptor | | |
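The 386 table can be read back in the same spirit. The sketch below derives each offset (relative to ES:EDI) from the field sizes and decodes a raw image into named values. The names `FIELDS_386`, `OFFSETS_386` and `parse_image_386` are hypothetical. The 12-byte descriptor entries are returned as raw bytes because their internal format is not described here, and the "not used" halves next to each selector are skipped.

```python
REGS_32 = ("CR0", "EFLAGS", "EIP", "EDI", "ESI", "EBP", "ESP",
           "EBX", "EDX", "ECX", "EAX", "DR6", "DR7", "TR", "LDTR")
SELECTORS = ("GS", "FS", "DS", "SS", "CS", "ES")
DESCRIPTORS = ("TSS", "IDT", "GDT", "LDT", "GS", "FS", "DS", "SS", "CS", "ES")

# (name, size in bytes), in the order listed in the table above.
FIELDS_386 = (
    [(name, 4) for name in REGS_32]
    + [pair for seg in SELECTORS
       for pair in ((seg, 2), (f"unused_after_{seg}", 2))]
    + [(f"{name}_desc", 12) for name in DESCRIPTORS]
)

OFFSETS_386 = {}
_offset = 0
for _name, _size in FIELDS_386:
    OFFSETS_386[_name] = (_offset, _size)
    _offset += _size
SIZE_386 = _offset


def parse_image_386(image: bytes) -> dict:
    """Decode a raw image: ints for registers, raw bytes for descriptors."""
    if len(image) != SIZE_386:
        raise ValueError(f"expected {SIZE_386} bytes, got {len(image)}")
    state = {}
    for name, (offset, size) in OFFSETS_386.items():
        if name.startswith("unused_"):
            continue
        raw = image[offset:offset + size]
        state[name] = raw if name.endswith("_desc") \
            else int.from_bytes(raw, "little")
    return state


image = bytearray(SIZE_386)
image[0:4] = (0x80000000).to_bytes(4, "little")   # CR0 with only PG set
state = parse_image_386(bytes(image))
print(hex(SIZE_386), hex(OFFSETS_386["TSS_desc"][0]), hex(state["CR0"]))
```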