The face inversion effect is a phenomenon in which identifying inverted (upside-down) faces, compared to upright faces, is much more difficult than making the same comparison for non-facial objects.[1][2]
A typical study of the face inversion effect presents participants with upright and inverted images of an object and times how long they take to recognise that object for what it is (e.g. a picture of a face as a face). The face inversion effect occurs when inverted faces take disproportionately longer to recognise than inverted versions of other objects, relative to their upright counterparts.[3][4]
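The word "disproportionately" matters here: the effect is an interaction between stimulus type and orientation, not merely a slowdown from inversion. The sketch below, a minimal illustration using invented reaction times (not data from any cited study), shows how such an interaction might be computed:

```python
# Illustrative only: hypothetical mean reaction times in milliseconds.
mean_rt = {
    ("face", "upright"): 600,
    ("face", "inverted"): 780,
    ("house", "upright"): 650,
    ("house", "inverted"): 690,
}

def inversion_cost(stimulus):
    """Extra time needed to recognise an inverted stimulus vs an upright one."""
    return mean_rt[(stimulus, "inverted")] - mean_rt[(stimulus, "upright")]

face_cost = inversion_cost("face")    # 180 ms
house_cost = inversion_cost("house")  # 40 ms

# The face inversion effect is the *disproportionate* cost for faces:
# inversion slows recognition of faces far more than of comparable objects.
print(f"Face inversion cost:  {face_cost} ms")
print(f"House inversion cost: {house_cost} ms")
print(f"Face-specific effect (interaction): {face_cost - house_cost} ms")
```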
Faces are normally processed in specialised face-selective regions of the brain, such as the fusiform face area.[5] However, processing inverted faces involves both face-selective regions and additional visual areas, including mid-level visual areas[6] and high-level scene-sensitive and object-sensitive regions such as the parahippocampal place area and the lateral occipital cortex. Something about inverted faces appears to recruit these mid-level and high-level scene- and object-processing mechanisms as well.[7]
The most supported explanation for why faces take longer to recognise when inverted is the configural information hypothesis. It states that faces are processed using configural information to form a holistic (whole) representation of a face. Objects, by contrast, are not processed configurally; they are processed featurally (in parts). Inverting a face disrupts configural processing, forcing the face to be processed featurally like other objects. This causes a delay, since forming a representation of a face from local information alone takes longer.
Faces are processed in separate areas of the brain from other stimuli, such as scenes or non-facial objects. For example, the fusiform face area (FFA) is a face-selective region of the brain that is dedicated to facial processing. The FFA responds more strongly to upright than to inverted faces, demonstrating that inverted faces are not detected in the same way as upright faces.[8]
The scene-selective parahippocampal place area (PPA) processes places, or scenes of the visual environment.[9] The object-recognition area in the lateral occipital cortex (LOC) is involved in the processing of objects.[10] Together, these regions are used to process inverted, but not upright, faces. This suggests that something about inverted faces, unlike upright ones, requires the involvement of object- and scene-processing regions.
Some activity remains in face recognition regions when viewing inverted faces.[11] Studies have found that a face-selective region of the brain known as the occipital face area (OFA) is involved in processing both upright and inverted faces.
Overall, face and object processing mechanisms seem to be separate in the brain. Recognising upright faces involves specialised facial recognition regions, whereas recognising inverted faces involves both face-selective regions and regions for recognising non-facial stimuli.
Configural information, also known as relational information, helps people to quickly recognise faces. It involves the arrangement of facial features, such as the eyes and nose. There are two types of configural information: first-order relational information and second-order relational information.[12]
First-order relational information consists of the spatial relationships between different features of the face. These relationships between facial features are common to most people; for example, the mouth is located under the nose. First-order relational information therefore helps to identify a face as a face and not some other object.
Second-order relational information is the size of the relationships between the features of the face, relative to a prototype (a model of what a face should look like). This type of information helps to distinguish one face from another because it differs between different faces.
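One way to picture the two kinds of relational information is to treat facial features as points. The following toy sketch, with invented coordinates rather than a model from the cited literature, checks the categorical layout shared by most faces (first-order) and measures feature distances relative to a prototype (second-order):

```python
import math

# Hypothetical 2D feature positions (x, y); y increases downward.
prototype = {"left_eye": (-30, 0), "right_eye": (30, 0),
             "nose": (0, 35), "mouth": (0, 60)}
face_a    = {"left_eye": (-28, 0), "right_eye": (28, 0),
             "nose": (0, 40), "mouth": (0, 68)}

def has_first_order_layout(face):
    """First-order relational information: the categorical layout shared
    by (almost) all faces, e.g. eyes above nose, nose above mouth."""
    return (face["left_eye"][1] < face["nose"][1] < face["mouth"][1]
            and face["left_eye"][0] < face["right_eye"][0])

def second_order_deviations(face):
    """Second-order relational information: metric distances between
    features, expressed relative to the prototype's distances."""
    def dist(f, a, b):
        return math.dist(f[a], f[b])
    pairs = [("left_eye", "right_eye"), ("nose", "mouth")]
    return {pair: dist(face, *pair) - dist(prototype, *pair) for pair in pairs}

print(has_first_order_layout(face_a))   # True -> detected as a face
print(second_order_deviations(face_a))  # small deviations identify *which* face
```

On this picture, first-order information answers "is this a face?" while second-order information answers "whose face is this?", matching the two stages described below.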
The holistic processing of faces describes the perception of faces as wholes, rather than as the sum of their parts. This means that facial features (such as the eyes or nose) are not explicitly represented in the brain on their own; rather, the entire face is represented.[13]
According to the configural information hypothesis of face recognition, recognising faces involves two stages that use configural information to form holistic representations of faces.[14]
A study demonstrated that face-selective activity in the brain was delayed when the configural information of faces was disrupted (for example, when faces were inverted).[15] This means it took participants longer to recognise the faces they were viewing as faces rather than as other (non-facial) objects. The presence of the face inversion effect (a delay when faces are inverted) therefore supports the configural information explanation of facial recognition.
The first stage of recognising faces in the configural information hypothesis is first-level information processing. This stage uses first-order relational information to detect a face (i.e. to determine that a face is actually a face and not another object). Building a holistic representation of a face occurs at this early stage of face processing, to allow faces to be detected quickly.
The next stage, second-level information processing, distinguishes one face from another with the use of second-order relational information.
Smaller inversion effects occur for non-facial objects, suggesting that faces and other objects are not processed in the same way.
Face recognition uses configural information to process faces holistically. Object recognition, however, does not use configural information to form a holistic representation. Instead, each part of the object is processed independently so that it can be recognised, a method known as featural recognition. An explicit representation is made of each part of the object, rather than of the object as a whole.
According to the configural information hypothesis, the face inversion effect occurs because configural information can no longer be used to build a holistic representation of a face. Inverted faces are instead processed like objects, using local information (i.e. the individual features of the face) instead of configural information.
A delay is caused when processing inverted faces compared to upright faces. This is because the specific holistic mechanism (see holistic processing) that allows faces to be quickly detected is absent when processing inverted faces. Only local information is available when viewing inverted faces, disrupting this early recognition stage and therefore preventing faces from being detected as quickly. Instead, independent features are put together piece-by-piece to form a representation of the object (a face) and allow the viewer to recognise what it is.[16]
Although the configural processing hypothesis is a popular explanation for the face inversion effect, it has faced some challenges. In particular, it has been suggested that faces and objects are both recognised using featural processing mechanisms, rather than holistic processing for faces and featural processing for objects.[17] On this view, the face inversion effect is not caused by a delay from faces being processed as objects; instead, another factor is involved. Two potential explanations follow.
Perceptual learning is a common alternative explanation to the configural processing hypothesis for the face inversion effect. According to the perceptual learning theory, being presented with a stimulus (for example, faces or cars) more often makes that stimulus easier to recognise in the future.[18]
Most people are highly familiar with viewing upright faces. It follows that highly efficient mechanisms have developed to support the quick detection and identification of upright faces.[19] The face inversion effect would therefore be caused by greater experience with perceiving and recognising upright faces compared to inverted faces.[20]
The face-scheme incompatibility model has been proposed to address elements missing from the configural information hypothesis. According to the model, faces are processed and assigned meaning through the use of schemes and prototypes.[21]
The model defines a scheme as an abstract representation of the general structure of a face, including characteristics common to most faces (i.e. the structure of and relationships between facial features). A prototype refers to an image of what an average face would look like for a particular group (e.g. humans or monkeys). After being recognised as a face with the use of a scheme, new faces are added to a group by being evaluated for their similarity to that group's prototype.
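As an informal illustration only (a caricature with invented feature values, not a computational model from the cited work), the model's two steps can be sketched as a structural check followed by a prototype comparison:

```python
# Hypothetical group prototypes as crude feature vectors
# (e.g. eye spacing, face width, snout length) -- invented values.
prototypes = {
    "human":  (60.0, 140.0, 20.0),
    "monkey": (45.0, 110.0, 45.0),
}

def similarity(face, prototype):
    """Negative squared distance: higher means more similar."""
    return -sum((f - p) ** 2 for f, p in zip(face, prototype))

def assign_group(face):
    """Step 2 of the caricature: once a scheme has identified the stimulus
    as a face, evaluate it against each group's prototype and add it to
    the group whose prototype it most resembles."""
    return max(prototypes, key=lambda g: similarity(face, prototypes[g]))

print(assign_group((58.0, 135.0, 22.0)))  # -> "human"
```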
There are different schemes for upright and inverted faces: upright faces are viewed more frequently and thus have more efficient schemes. The face inversion effect is therefore partly caused by less efficient schemes for processing the less familiar inverted form of faces. In this respect the face-scheme incompatibility model resembles the perceptual learning theory, since both consider experience important for the quick recognition of faces.
Rather than a single explanation for the face inversion effect, it is more likely that aspects of different theories apply. For example, faces could be processed with configural information, while experience may be important for quickly recognising a particular type of face (e.g. human or dog) by building schemes of that facial type.
The ability to quickly detect and recognise faces was important in early human life, and is still useful today. For example, facial expressions can provide various signals important for communication.[22][23] Highly efficient facial recognition mechanisms have therefore developed to support this ability.
As humans get older, they become more familiar with upright human faces and continuously refine the mechanisms used to recognise them.[24] This process allows people to quickly detect faces around them, which helps with social interaction.
By about one year of age, infants are familiar with upright faces and are thus prone to experiencing the face inversion effect. As they grow older, they become better at recognising faces, and the face inversion effect becomes stronger. This strengthening over time supports the perceptual learning hypothesis, since more experience with faces results in greater susceptibility to the effect.
The more familiar a particular type of face (e.g. human or dog) is, the more susceptible one is to the face inversion effect for that face type. This applies to both humans and other species. For example, older chimpanzees familiar with human faces experienced the face inversion effect when viewing human faces, but the same result did not occur for younger chimpanzees familiar with chimpanzee faces.[25] The face inversion effect was also stronger for dog faces when viewed by dog experts. This evidence demonstrates that familiarity with a particular type of face develops over time and appears to be necessary for the face inversion effect to occur.
There are a number of conditions that may reduce or even eliminate the face inversion effect. This is because the mechanism used to recognise faces by forming holistic representations is absent or disrupted. This can cause faces to be processed the same way as other (non-facial) objects.
Prosopagnosia is a condition marked by an inability to recognise faces.[26] When people with prosopagnosia view faces, the fusiform gyrus (a facial recognition area of the brain) activates differently from how it would in someone without the condition.[27] Additionally, non-facial object recognition areas (such as the ventral occipitotemporal extrastriate cortex) are activated when viewing faces, suggesting that faces and objects are processed similarly.
Individuals with prosopagnosia can be unaffected by, or even benefit from, face inversion in facial recognition tasks.[28] They normally process upright faces featurally, like objects, and inverted faces are likewise processed featurally rather than holistically.[29] Because there is no difference between the processing of upright and inverted faces, there is no disproportionate delay in recognising inverted faces.
Like those with prosopagnosia, individuals with autism spectrum disorder (ASD) do not use a configural processing mechanism to form a holistic representation of a face.[30] Instead, they tend to process faces using local or featural information.[31] The same featural mechanisms are therefore used to process upright faces, inverted faces, and objects. Consequently, the face inversion effect is less likely to occur in those with ASD.[32]
However, there is some evidence that the development of a holistic facial recognition mechanism in those with ASD is simply delayed, rather than missing. This would mean that there would actually be a difference between the processing of upright and inverted faces. Those with ASD may therefore eventually become susceptible to the face inversion effect.[33]