Bus error explained

In computing, a bus error is a fault raised by hardware, notifying an operating system (OS) that a process is trying to access memory that the CPU cannot physically address: an invalid address for the address bus, hence the name. In modern use on most architectures these are much rarer than segmentation faults, which occur primarily due to memory access violations: problems in the logical address or permissions.

On POSIX-compliant platforms, bus errors usually result in the SIGBUS signal being sent to the process that caused the error. SIGBUS can also be caused by any general device fault that the computer detects, though a bus error rarely means that the computer hardware is physically broken—it is normally caused by a bug in software. Bus errors may also be raised for certain other paging errors; see below.

Causes

There are at least three main causes of bus errors:

Non-existent address

Software instructs the CPU to read or write a specific physical memory address. Accordingly, the CPU sets this physical address on its address bus and requests all other hardware connected to the CPU to respond with the results, if they answer for this specific address. If no other hardware responds, the CPU raises an exception, stating that the requested physical address is unrecognized by the whole computer system. Note that this only covers physical memory addresses. Trying to access an undefined virtual memory address is generally considered to be a segmentation fault rather than a bus error, though if the MMU is separate, the processor cannot tell the difference.

Unaligned access

Most CPUs are byte-addressable, where each unique memory address refers to an 8-bit byte. Most CPUs can access individual bytes from each memory address, but they generally cannot access larger units (16 bits, 32 bits, 64 bits and so on) without these units being "aligned" to a specific boundary (the x86 platform being a notable exception).

For example, if multi-byte accesses must be 16 bit-aligned, addresses (given in bytes) at 0, 2, 4, 6, and so on would be considered aligned and therefore accessible, while addresses 1, 3, 5, and so on would be considered unaligned. Similarly, if multi-byte accesses must be 32-bit aligned, addresses 0, 4, 8, 12, and so on would be considered aligned and therefore accessible, and all addresses in between would be considered unaligned. Attempting to access a unit larger than a byte at an unaligned address can cause a bus error.

Some systems may have a hybrid of these depending on the architecture being used. For example, for hardware based on the IBM System/360 mainframe, including the IBM System z, Fujitsu B8000, RCA Spectra, and UNIVAC Series 90, instructions must be on a 16-bit boundary, that is, execution addresses must start on an even byte. Attempts to branch to an odd address results in a specification exception.[1] Data, however, may be retrieved from any address in memory, and may be one byte or longer depending on the instruction.

CPUs generally access data at the full width of their data bus at all times. To address bytes, they access memory at the full width of their data bus, then mask and shift to address the individual byte. Systems tolerate this inefficient algorithm, as it is an essential feature for most software, especially string processing. Unlike bytes, larger units can span two aligned addresses and would thus require more than one fetch on the data bus.It is possible for CPUs to support this, but this functionality is rarely required directly at the machine code level, thus CPU designers normally avoid implementing it and instead issue bus errors for unaligned memory access.

Paging errors

FreeBSD, Linux and Solaris can signal a bus error when virtual memory pages cannot be paged in, e.g. because it has disappeared (e.g. accessing a memory-mapped file or executing a binary image which has been truncated while the program was running),[2] or because a just-created memory-mapped file cannot be physically allocated, because the disk is full.

Non-present segment (x86)

On x86 there exists an older memory management mechanism known as segmentation.If the application loads a segment register with the selector of a non-present segment (which under POSIX-compliant OSescan only be done with assembly language), the exceptionis generated. Some OSes used that for swapping, but under Linux this generates SIGBUS.

Example

This is an example of unaligned memory access, written in the C programming language with AT&T assembly syntax.

  1. include

int main(int argc, char **argv)

Compiling and running the example on a POSIX compliant OS on x86 demonstrates the error:$ gcc -ansi sigbus.c -o sigbus$ ./sigbus Bus error$ gdb ./sigbus(gdb) rProgram received signal SIGBUS, Bus error.0x080483ba in main (gdb) x/i $pc0x80483ba : mov DWORD PTR [eax],0x2a(gdb) p/x $eax$1 = 0x804a009(gdb) p/t $eax & (sizeof(int) - 1)$2 = 1The GDB debugger shows that the immediate value 0x2a is being stored at the location stored in the EAX register, using X86 assembly language. This is an example of register indirect addressing.

Printing the low order bits of the address shows that it is not aligned to a word boundary ("dword" using x86 terminology).

Notes and References

  1. z/Architecture Principles of Operation, SA22-7832-04, Page 6-6, Fifth Edition (September, 2005) IBM Corporation, Poukeepsie, NY, Retrievable from http://publibfp.dhe.ibm.com/epubs/pdf/a2278324.pdf (Retrieved December 31, 2015)
  2. Web site: What is SIGBUS - Object specific hardware error?.