A modified Harvard architecture is a variation of the Harvard computer architecture that, unlike the pure Harvard architecture, allows memory that contains instructions to be accessed as data. Most modern computers that are documented as Harvard architecture are, in fact, modified Harvard architecture.
See main article: article and Harvard architecture. The original Harvard architecture computer, the Harvard Mark I, employed entirely separate memory systems to store instructions and data. The CPU fetched the next instruction and loaded or stored data simultaneously[1] and independently. This is in contrast to a von Neumann architecture computer, in which both instructions and data are stored in the same memory system and (without the complexity of a CPU cache) must be accessed in turn.
The physical separation of instruction and data memory is sometimes held to be the distinguishing feature of modern Harvard architecture computers. With microcontrollers (entire computer systems integrated onto single chips), the use of different memory technologies for instructions (e.g. flash memory) and data (typically read/write memory) in von Neumann machines is becoming popular. The true distinction of a Harvard machine is that instruction and data memory occupy different address spaces. In other words, a memory address does not uniquely identify a storage location (as it does in a von Neumann machine); it is also necessary to know the memory space (instruction or data) to which the address belongs.
See main article: article and Von Neumann architecture. A computer with a von Neumann architecture has the advantage over Harvard machines as described above in that code can also be accessed and treated the same as data, and vice versa. This allows, for example, data to be read from disk storage into memory and then executed as code, or self-optimizing software systems using technologies such as just-in-time compilation to write machine code into their own memory and then later execute it. Another example is self-modifying code, which allows a program to modify itself.
A disadvantage of these methods are issues with executable space protection, which increase the risks from malware and software defects.
Accordingly, some pure Harvard machines are specialty products. Most modern computers instead implement a modified Harvard architecture. Those modifications are various ways to loosen the strict separation between code and data, while still supporting the higher performance concurrent data and instruction access of the Harvard architecture.
The most common modification builds a memory hierarchy with separate CPU caches for instructions and data at lower levels of the hierarchy. There is a single address space for instructions and data, providing the von Neumann model, but the CPU fetches instructions from the instruction cache and fetches data from the data cache. Most programmers never need to be aware of the fact that the processor core implements a (modified) Harvard architecture, although they benefit from its speed advantages. Only programmers who generate and store instructions into memory need to be aware of issues such as cache coherency, if the store doesn't modify or invalidate a cached copy of the instruction in an instruction cache.
Another change preserves the "separate address space" nature of a Harvard machine, but provides special machine operations to access the contents of the instruction memory as data. Because data is not directly executable as instructions, such machines are not always viewed as "modified" Harvard architecture:
A few Harvard architecture processors, such as the Maxim Integrated MAXQ, can execute instructions fetched from any memory segment - unlike the original Harvard processor, which can only execute instructions fetched from the program memory segment.Such processors, like other Harvard architecture processors - and unlike pure von Neumann architecture - can read an instruction and read a data value simultaneously, if they're in separate memory segments, since the processor has (at least) two separate memory segments with independent data buses.The most obvious programmer-visible difference between this kind of modified Harvard architecture and a pure von Neumann architecture is that - when executing an instruction from one memory segment - the same memory segment cannot be simultaneously accessed as data.[3] [4]
Three characteristics may be used to distinguish modified Harvard machines from pure Harvard and von Neumann machines:
Outside of applications where a cacheless DSP or microcontroller is required, most modern processors have a CPU cache which partitions instruction and data.
There are also processors which are Harvard machines by the most rigorous definition (that program and data memory occupy different address spaces), and are only modified in the weak sense that there are operations to read and/or write program memory as data. For example, LPM (Load Program Memory) and SPM (Store Program Memory) instructions in the Atmel AVR implement such a modification. Similar solutions are found in other microcontrollers such as the PIC and Z8Encore!, many families of digital signal processors such as the TI C55x cores, and more. Because instruction execution is still restricted to the program address space, these processors are very unlike von Neumann machines. External wiring can also convert a strictly Harvard CPU core into a modified Harvard one, for example by simply combining `PSEN#` (program space read) and `RD#` (external data space read) signals externally through an AND gate on an Intel 8051 family microcontroller, the microcontroller are said to be "von Neumann connected," as the external data and program address spaces become unified.
Having separate address spaces creates certain difficulties in programming with high-level languages that do not directly support the notion that tables of read-only data might be in a different address space from normal writable data (and thus need to be read using different instructions). The C programming language can support multiple address spaces either through non-standard extensions or through the now standardized extensions to support embedded processors.