EVEX prefix explained

The EVEX prefix (enhanced vector extension) and its corresponding coding scheme are an extension to the 32-bit x86 (IA-32) and 64-bit x86-64 (AMD64) instruction set architectures. EVEX is based on, but should not be confused with, the MVEX prefix[1] used by the Knights Corner processor.

The EVEX scheme is a 4-byte extension of the VEX scheme. It supports the AVX-512 instruction set and allows addressing the new 512-bit ZMM registers and the new 64-bit operand mask registers.

With Advanced Performance Extensions, the Extended EVEX prefix redefines the semantics of several payload bits.[2]

Features

EVEX coding can address 8 operand mask registers, 16 general-purpose registers and 32 vector registers in 64-bit mode (otherwise, 8 general-purpose and 8 vector), and can support up to 4 operands.

Like the VEX coding scheme, the EVEX prefix unifies existing opcode prefixes and escape codes, memory addressing and operand length modifiers of the x86 instruction set.

The following features are carried over from the VEX scheme:

EVEX also extends VEX with additional capabilities:

For example, the EVEX encoding scheme allows conditional vector addition in the form of

VADDPS zmm1 {k1}{z}, zmm2, zmm3

where the {k1} modifier next to the destination operand encodes the use of opmask register k1 for conditional processing and updates to the destination, and the {z} modifier (encoded by EVEX.z) selects between the two types of masking (merging and zeroing), with merging as the default when no modifier is attached.
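The difference between merging and zeroing can be illustrated with a small model. The following Python sketch is not tied to any real instruction emulator; `masked_add` is a hypothetical helper that treats vectors as plain lists and the opmask as an integer whose bit i controls element i.

```python
def masked_add(dst, src1, src2, mask, zeroing=False):
    """Element-wise add under an opmask, modelling merging vs. zeroing."""
    out = []
    for i, (old, a, b) in enumerate(zip(dst, src1, src2)):
        if (mask >> i) & 1:
            out.append(a + b)   # mask bit set: element receives the sum
        elif zeroing:
            out.append(0.0)     # zeroing-masking: element is cleared
        else:
            out.append(old)     # merging-masking: old destination value kept
    return out

dst = [9.0, 9.0, 9.0, 9.0]
src1 = [1.0, 2.0, 3.0, 4.0]
src2 = [10.0, 20.0, 30.0, 40.0]
k1 = 0b0101                     # elements 0 and 2 are enabled

merged = masked_add(dst, src1, src2, k1)                # no {z}: merging
zeroed = masked_add(dst, src1, src2, k1, zeroing=True)  # with {z}: zeroing
```

Here `merged` keeps the old value 9.0 in the disabled elements, while `zeroed` clears them.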

Technical description

The EVEX coding scheme uses a code prefix consisting of 4 bytes; the first byte is always 62h and derives from an unused opcode of the 32-bit BOUND instruction, which is not supported in 64-bit mode.[3]

EVEX prefix in the AVX-512 instruction format
Field       [Prefixes]  EVEX  Opcode  ModR/M  [SIB]  [Disp32] / [Disp8 × N]  [Immediate]
# of bytes              4     1       1       1      4 / 1                   1
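The [Disp8 × N] slot in the layout above holds a single displacement byte that is scaled before use. A minimal Python sketch follows, assuming the byte is sign-extended first and that the scale factor N is supplied by the caller; the rules that select N are not reproduced here, and `compressed_disp8` is a hypothetical helper name.

```python
def compressed_disp8(disp8_byte: int, n: int) -> int:
    """Return the byte offset encoded by a one-byte displacement scaled by N."""
    if not 0 <= disp8_byte <= 0xFF:
        raise ValueError("disp8 must fit in one byte")
    # Interpret the byte as a signed 8-bit value (two's complement).
    signed = disp8_byte - 0x100 if disp8_byte & 0x80 else disp8_byte
    return signed * n

# N = 64 is only an illustrative value chosen for this example.
forward = compressed_disp8(0x01, 64)
backward = compressed_disp8(0xFF, 64)
```

With N = 64, a single byte reaches offsets from -8192 to +8128 in steps of 64, which is why the compressed form can often replace a 4-byte displacement.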

The ModR/M byte specifies one operand (always a register) with its reg field; the second operand is encoded with the mod and r/m fields, which specify either a register or a location in memory. Base-plus-index and scale-plus-index addressing require the SIB byte, which encodes a 2-bit scale factor as well as 3-bit index and 3-bit base registers. Depending on the addressing mode, a Disp8/Disp16/Disp32 field may follow, holding a displacement that is added to the address.
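The field boundaries described above can be sketched in Python as follows. The helper names `split_modrm` and `split_sib` are hypothetical, and the sketch assumes the conventional x86 layout (mod in bits 7:6, reg in bits 5:3, r/m in bits 2:0; scale, index and base in the same positions of the SIB byte, with the 2-bit scale selecting a multiplier of 1, 2, 4 or 8).

```python
def split_modrm(byte: int) -> dict:
    """Split a ModR/M byte into its mod (2-bit), reg (3-bit) and r/m (3-bit) fields."""
    byte &= 0xFF
    return {"mod": byte >> 6, "reg": (byte >> 3) & 0b111, "rm": byte & 0b111}

def split_sib(byte: int) -> dict:
    """Split a SIB byte into a scale multiplier plus index and base numbers."""
    byte &= 0xFF
    return {
        "scale": 1 << (byte >> 6),      # 2-bit field -> multiplier 1, 2, 4 or 8
        "index": (byte >> 3) & 0b111,
        "base": byte & 0b111,
    }

modrm = split_modrm(0xCB)   # binary 11 001 011
sib = split_sib(0x88)       # binary 10 001 000
```

A mod value of 3 (binary 11) means that r/m names a register rather than a memory location.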

The EVEX prefix retains fields introduced in the VEX prefix:

New functions of the existing fields:

There are several new bit fields:

The encoding of the EVEX prefix is as follows:

Bit           7    6    5    4    3    2    1    0
Byte 0 (62h)  0    1    1    0    0    0    1    0
Byte 1 (P0)   R̅    X̅    B̅    R̅’   0    0    m1   m0   P[7:0]
Byte 2 (P1)   W    v̅3   v̅2   v̅1   v̅0   1    p1   p0   P[15:8]
Byte 3 (P2)   z    L’   L    b    V̅’   a2   a1   a0   P[23:16]
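As an illustration of this layout, the Python sketch below splits a 4-byte EVEX prefix into its fields. It assumes that P0 bits 7 to 5 hold R, X and B, that P1 bits 6 to 3 hold v3 to v0, and that R, X, B, R’, V’ and v are stored inverted (the overlined names); the function name, dictionary keys and the sample byte sequence are assumptions of this example, the latter hand-assembled for VADDPS zmm1 {k1}{z}, zmm2, zmm3 and worth checking against an assembler.

```python
def decode_evex(prefix: bytes) -> dict:
    """Split a 4-byte EVEX prefix into logical (un-inverted) field values."""
    if len(prefix) != 4 or prefix[0] != 0x62:
        raise ValueError("not an EVEX prefix")
    p0, p1, p2 = prefix[1], prefix[2], prefix[3]

    def inverted(byte: int, pos: int) -> int:
        # Fields drawn with an overline are stored as their complement.
        return ((byte >> pos) & 1) ^ 1

    return {
        "R": inverted(p0, 7), "X": inverted(p0, 6), "B": inverted(p0, 5),
        "R_prime": inverted(p0, 4),
        "mm": p0 & 0b11,
        "W": (p1 >> 7) & 1,
        "vvvv": ~(p1 >> 3) & 0xF,       # four inverted bits v3..v0
        "pp": p1 & 0b11,
        "z": (p2 >> 7) & 1,
        "LL": (p2 >> 5) & 0b11,         # L' and L taken together
        "b": (p2 >> 4) & 1,
        "V_prime": inverted(p2, 3),
        "aaa": p2 & 0b111,
    }

# Assumed encoding: 62 F1 6C C9 (followed by opcode 58 and ModR/M CB).
fields = decode_evex(bytes([0x62, 0xF1, 0x6C, 0xC9]))
```

In this example `vvvv` decodes to 2 (zmm2), `aaa` to 1 (k1) and `z` to 1 (zeroing).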

The following table lists possible register addressing combinations (bit 4 is always zero when encoding the 16 general purpose registers):

Register addressing in 64-bit mode using EVEX prefix
Addressing mode        Bit 4     Bit 3     Bits [2:0]     Register type            Common usage
REG                    EVEX.R’   EVEX.R    ModRM.reg      General purpose, vector  Register operand
RM (if ModRM.mod=11)   EVEX.X    EVEX.B    ModRM.r/m      GPR, vector              Register operand
RM                     0         EVEX.B    ModRM.r/m      GPR                      Register memory address
BASE                   0         EVEX.B    SIB.base       GPR                      Base + index × scale memory address
INDEX                  0         EVEX.X    SIB.index      GPR                      Base + index × scale memory address
VIDX                   EVEX.V’   EVEX.X    SIB.index      Vector                   Base + vector index × scale memory address
NDS/NDD                EVEX.V’   EVEX.v3   EVEX.v2v1v0    GPR, vector              Register operand
K                      0         0         EVEX.a2a1a0    Mask                     Mask register operand
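Each row of the table describes how a 5-bit register number is assembled from separately encoded pieces. A minimal Python sketch, assuming the prefix bits have already been converted to their logical (un-inverted) values; `register_number` and `nds_register` are hypothetical helper names.

```python
def register_number(bit4: int, bit3: int, low3: int) -> int:
    """Combine bit 4, bit 3 and bits [2:0] into a register number 0..31."""
    return ((bit4 & 1) << 4) | ((bit3 & 1) << 3) | (low3 & 0b111)

def nds_register(v_prime: int, vvvv: int) -> int:
    """NDS/NDD row: V' supplies bit 4 and v3..v0 supply bits [3:0]."""
    return ((v_prime & 1) << 4) | (vvvv & 0xF)

# REG row: R' = 1, R = 1, ModRM.reg = 001 selects register 25 (e.g. zmm25).
reg = register_number(1, 1, 0b001)
# BASE row: bit 4 is always 0, so only B and SIB.base contribute.
base = register_number(0, 1, 0b010)
# NDS row: V' = 1, vvvv = 0010 selects register 18.
nds = nds_register(1, 0b0010)
```

Because bit 4 is fixed at zero in the GPR-only rows, those rows can reach only registers 0 to 15, in line with the note preceding the table.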

A few VEX-encoded AVX blending instructions have 4 operands. To accommodate this, VEX has the IS4 addressing mode, which encodes the 4th operand (a vector register) in bits Imm8[7:4] of the immediate constant. Similar EVEX-encoded blend instructions have their 4th operand in a mask register. No EVEX-encoded instruction uses the IS4 addressing mode.

Extended EVEX prefix

Intel Advanced Performance Extensions introduce several new variants of the 3-byte payload in the EVEX prefix, which are used to encode the extended general-purpose registers (EGPRs) R16-R31 and new conditional instructions.

EVEX extension of EVEX instructions:

Bit           7    6    5    4    3    2    1    0
Byte 0 (62h)  0    1    1    0    0    0    1    0
Byte 1 (P0)   R̅3   X̅3   B̅3   R̅4   B4   m2   m1   m0   P[7:0]
Byte 2 (P1)   W    v̅3   v̅2   v̅1   v̅0   X̅4   p1   p0   P[15:8]
Byte 3 (P2)   z    L’   L    b    V̅4   a2   a1   a0   P[23:16]
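The following Python sketch decodes the three payload bytes of this variant. It rests on assumptions that should be checked against the APX specification[2]: R3, X3, B3, R4, X4, V4 and v3 to v0 are taken to be stored inverted, while B4 is read as-is. The function names and dictionary keys are hypothetical.

```python
def decode_apx_evex_payload(p0: int, p1: int, p2: int) -> dict:
    """Split the extended EVEX payload bytes into logical field values."""
    def inverted(byte: int, pos: int) -> int:
        return ((byte >> pos) & 1) ^ 1   # assumed to be stored as a complement

    return {
        "R3": inverted(p0, 7), "X3": inverted(p0, 6), "B3": inverted(p0, 5),
        "R4": inverted(p0, 4),
        "B4": (p0 >> 3) & 1,             # assumed NOT inverted
        "mmm": p0 & 0b111,
        "W": (p1 >> 7) & 1,
        "vvvv": ~(p1 >> 3) & 0xF,
        "X4": inverted(p1, 2),
        "pp": p1 & 0b11,
        "z": (p2 >> 7) & 1,
        "LL": (p2 >> 5) & 0b11,
        "b": (p2 >> 4) & 1,
        "V4": inverted(p2, 3),
        "aaa": p2 & 0b111,
    }

def five_bit_register(bit4: int, bit3: int, low3: int) -> int:
    """Combine e.g. B4, B3 and ModRM.r/m into a register number 0..31."""
    return (bit4 << 4) | (bit3 << 3) | (low3 & 0b111)

# A payload that sets none of the new bits (B4 = 0, inverted X4 stored as 1).
legacy = decode_apx_evex_payload(0xF1, 0x6C, 0xC9)
# The same payload with P0 bit 3 set, i.e. B4 = 1.
extended = decode_apx_evex_payload(0xF9, 0x6C, 0xC9)
base = five_bit_register(extended["B4"], extended["B3"], 0b000)
```

Under these assumptions, a payload whose P0 bit 3 is 0 and whose P1 bit 2 is 1 decodes with every new register bit clear, and setting B4 moves the base register from R0 to R16.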

EVEX extension of VEX instructions:

Bit           7    6    5    4    3    2    1    0
Byte 0 (62h)  0    1    1    0    0    0    1    0
Byte 1 (P0)   R̅3   X̅3   B̅3   R̅4   B4   m2   m1   m0   P[7:0]
Byte 2 (P1)   W    v̅3   v̅2   v̅1   v̅0   X̅4   p1   p0   P[15:8]
Byte 3 (P2)   0    0    L    0    V̅4   NF   0    0    P[23:16]

EVEX extension for legacy instructions:

Bit           7    6    5    4    3    2    1    0
Byte 0 (62h)  0    1    1    0    0    0    1    0
Byte 1 (P0)   R̅3   X̅3   B̅3   R̅4   B4   1    0    0    P[7:0]
Byte 2 (P1)   W    v̅3   v̅2   v̅1   v̅0   X̅4   p1   p0   P[15:8]
Byte 3 (P2)   0    0    0    ND   V̅4   NF   0    0    P[23:16]

EVEX prefix for conditional CMP and TEST:

Bit           7    6    5    4    3    2    1    0
Byte 0 (62h)  0    1    1    0    0    0    1    0
Byte 1 (P0)   R̅3   X̅3   B̅3   R̅4   B4   1    0    0    P[7:0]
Byte 2 (P1)   W    OF   SF   ZF   CF   X̅4   p1   p0   P[15:8]
Byte 3 (P2)   0    0    0    ND=0 SC3  SC2  SC1  SC0  P[23:16]
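In this variant, the positions that otherwise hold the v and a fields carry the OF, SF, ZF and CF bits and the SC3 to SC0 bits. A minimal Python sketch of extracting them, assuming all of these bits are read as-is (not inverted); `decode_ccmp_fields` is a hypothetical helper, and the sample bytes are made up for illustration.

```python
def decode_ccmp_fields(p1: int, p2: int) -> dict:
    """Extract the flag bits from P1 and the SC bits from P2."""
    return {
        "OF": (p1 >> 6) & 1,
        "SF": (p1 >> 5) & 1,
        "ZF": (p1 >> 4) & 1,
        "CF": (p1 >> 3) & 1,
        "ND": (p2 >> 4) & 1,    # shown as ND=0 in the layout above
        "SC": p2 & 0xF,         # SC3..SC0 as one 4-bit value
    }

# P1 = 0 1010 1 00 (W=0, OF=1, SF=0, ZF=1, CF=0, bit 2 set, p1p0=00)
# P2 = 0000 0101   (ND=0, SC=0101)
ccmp = decode_ccmp_fields(0x54, 0x05)
```

The meaning assigned to each SC value is defined by the APX specification[2] and is not modelled here.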

When the new EGPRs and operand destinations can be encoded with either the extended EVEX prefix or the REX2 prefix, the latter is preferred.

Notes and References

  1. Intel® Xeon Phi™ Coprocessor Instruction Set Architecture Reference Manual. Sep 7, 2012. p. 42. Document number 327364-001. Archived Aug 4, 2021: https://web.archive.org/web/20210804022347/https://software.intel.com/content/dam/develop/external/us/en/documents/327364001en.pdf
  2. Intel® Advanced Performance Extensions (Intel® APX) Architecture Specification. Edition 2. August 2023. p. 21. Document number 355828-002US. Archived Sep 10, 2023: https://web.archive.org/web/20230910083914/https://cdrdv2-public.intel.com/786223/355828-intel-apx-spec.pdf
  3. Intel Architecture Instruction Set Extensions Programming Reference. Intel Corporation. July 2013.