Eye movement in scene viewing explained

Eye movement in scene viewing refers to the visual processing of information presented in scenes. This phenomenon has been studied in areas such as cognitive psychology and psychophysics, where eye movements can be monitored under experimental conditions. A core aspect of these studies is the division of eye movements into saccades, the rapid movements of the eyes, and fixations, the periods in which the eyes rest on a point. Several factors influence eye movement in scene viewing: the task and knowledge of the viewer (top-down factors) and the properties of the image being viewed (bottom-up factors). The study of eye movement in scene viewing helps researchers understand visual processing in more natural environments.

Typically, when presented with a scene, viewers show short fixation durations and long saccade amplitudes in the earlier phases of viewing an image, reflecting ambient processing. This is followed by longer fixations and shorter saccades in the later phases of scene viewing, reflecting focal processing (Pannasch et al., 2008).
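The ambient/focal distinction described above can be illustrated with a minimal sketch that labels each fixation from its duration and the amplitude of the saccade that follows it. The thresholds (180 ms and 5 degrees), the `Fixation` type and the function names are assumptions chosen for illustration; they are not values or methods reported in this article.

```python
from dataclasses import dataclass

# Illustrative cut-offs only (assumptions, not taken from the article).
DURATION_THRESHOLD_MS = 180.0
AMPLITUDE_THRESHOLD_DEG = 5.0


@dataclass
class Fixation:
    duration_ms: float        # how long the eyes rested on the point
    next_saccade_deg: float   # amplitude of the saccade that follows


def classify_fixation(fix: Fixation) -> str:
    """Label a fixation as 'ambient', 'focal' or 'unclassified'."""
    short = fix.duration_ms < DURATION_THRESHOLD_MS
    large = fix.next_saccade_deg > AMPLITUDE_THRESHOLD_DEG
    if short and large:
        return "ambient"      # short fixation followed by a long saccade
    if not short and not large:
        return "focal"        # long fixation followed by a short saccade
    return "unclassified"     # mixed cases fit neither mode


def focal_proportion(fixations) -> float:
    """Share of focal fixations among those that could be classified."""
    labels = [classify_fixation(f) for f in fixations]
    classified = [label for label in labels if label != "unclassified"]
    if not classified:
        return 0.0
    return classified.count("focal") / len(classified)
```

Computing `focal_proportion` separately for early and late portions of a viewing period would be one way to check for the shift from ambient to focal processing in a data set.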

Eye movement behaviour in scene viewing differs across levels of cognitive development. Fixation durations shorten and saccade amplitudes lengthen with increasing age. In children, saccades reach the amplitudes normally found in adults earlier (at 4–6 years old) than fixation durations reach adult levels (at 6–8 years old). Yet the typical pattern of behaviour during scene viewing, a progression from ambient processing to focal processing, has been observed from the age of 2 years (Helo, Pannasch, Sirri & Rämä, 2014).

Spatial variation

Particular factors affect where the eyes fixate. These include bottom-up factors inherent to the stimulus and top-down factors inherent to the viewer. Even an initial glimpse of a scene has been found to generate an abstract representation of the image that can be stored in memory for use in subsequent eye movements (Castelhano & Henderson, 2007).

Among bottom-up factors, eye guidance can be affected by the local contrast or salience of features in an image (Itti & Koch, 2000). Examples include an area with a large difference in luminance (Parkhurst et al., 2002), a greater density of edges (Mannan, Ruddock & Wooding, 1996), or binocular disparity, which determines the distance of different objects in the scene (Jansen et al., 2009).
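As a rough illustration of what "local contrast" can mean computationally, the sketch below measures the root-mean-square (RMS) luminance contrast in a small patch around each pixel of a greyscale image stored as a list of rows. This is a simplified stand-in and not the saliency model of Itti & Koch (2000); the function names and the patch radius are assumptions made for the example.

```python
import math


def local_rms_contrast(image, row, col, radius=1):
    """Standard deviation of luminance in a square patch centred on (row, col).

    The patch is clipped at the image borders.
    """
    rows = range(max(0, row - radius), min(len(image), row + radius + 1))
    cols = range(max(0, col - radius), min(len(image[0]), col + radius + 1))
    values = [image[r][c] for r in rows for c in cols]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance)


def most_contrasting_location(image, radius=1):
    """Return the (row, col) whose surrounding patch has the highest contrast."""
    best_location = None
    best_value = -1.0
    for r in range(len(image)):
        for c in range(len(image[0])):
            value = local_rms_contrast(image, r, c, radius)
            if value > best_value:
                best_location, best_value = (r, c), value
    return best_location
```

On a uniform grey image every patch has zero contrast, whereas a single bright pixel raises the contrast of every patch that contains it. A purely contrast-driven account would predict more fixations near such regions.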

The top-down factors of scenes have more of an impact than bottom-up features in affecting fixation positions. Behaviourally relevant information that is more interesting in a scene is more salient than low-level features, drawing fixations more frequently and more quickly from scene onset (Onat, Açik, Schumann & König, 2014). Local scene colour at a fixation position also influences where fixations occur. The presence of colour can increase the likelihood of an item being processed as a semantic object, as it can aid discrimination of the object, making it more interesting to view (Amano & Foster, 2014). When viewers are semantically primed by being presented with consistently similar scenes, the density of fixations increases and fixation durations decrease (Henderson, Weeks Jr., & Hollingworth, 1999).

Information separate from what is presented in a scene also affects the area being fixated. Eye movements can be guided anticipatorily by linguistic input: if an item in the scene is mentioned verbally, the listener is more likely to move their visual focus to that object (Staub, Abbott & Bogartz, 2012).

With regard to factors relating to viewers rather than the scene, differences have been found in cross-cultural research. Westerners tend to concentrate on focal objects in a scene, looking at them more often and more quickly than East Asians, who attend more to contextual information and make more saccades to the background of the scene (Chua, Boland & Nisbett, 2002).

Temporal variation

Regarding the temporality of fixations, fixation durations last about 300 ms on average, although there is large variability around this value. Some of this variability can be explained by global properties of an image that affect both bottom-up and top-down processing.

During natural scene viewing, masking an image by replacing it with a grey field during fixations increases fixation durations (Henderson & Pierce, 2008). More subtle degradations of an image, such as a decrease in its luminance during fixations, also lengthen fixation durations (Henderson, Nuthmann & Luke, 2013). The effect is asymmetric, in that an increase in luminance also increases fixation durations (Walshe & Nuthmann, 2014). However, changes in factors affecting top-down processing, such as blurring or phase noise, increase fixation durations when used to degrade a scene and decrease fixation durations when used to enhance a scene (Henderson, Olejarczyk, Luke & Schmidt, 2014; Einhäuser et al., 2006).

Furthermore, temporal and spatial aspects interact in a complex manner. When a picture is first presented on the screen, fixations made within the first second are more likely to be directed toward the left side of the scene, whereas the opposite holds true for the remaining part of the presentation (Ossandón et al., 2014).
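One simple way to look for such a time-dependent horizontal bias in a recording is to compare the fraction of fixations landing left of the image midline in an early and a late time window. The sketch below assumes fixations are given as (onset in ms, horizontal position in pixels) pairs; the function name, the data layout and the sample values are hypothetical and serve only to show the calculation.

```python
def left_fraction(fixations, image_width, start_ms, end_ms):
    """Fraction of fixations with onset in [start_ms, end_ms) that fall
    left of the vertical midline. Returns None if the window is empty."""
    xs = [x for (onset_ms, x) in fixations if start_ms <= onset_ms < end_ms]
    if not xs:
        return None
    left_count = sum(1 for x in xs if x < image_width / 2)
    return left_count / len(xs)


# Hypothetical fixations on a 1024-pixel-wide image: (onset_ms, x_px).
sample_fixations = [
    (100, 200), (400, 300), (800, 700),     # first second of viewing
    (1500, 900), (2500, 800), (3500, 300),  # remainder of the presentation
]

early = left_fraction(sample_fixations, 1024, 0, 1000)
late = left_fraction(sample_fixations, 1024, 1000, float("inf"))
```

A higher value of `early` than `late` would be in line with the pattern described above, although real analyses would aggregate over many images and viewers.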

References

Amano, K. & Foster, D. H. (2014). Influence of local scene color on fixation position in visual search. Journal of the Optical Society of America A, 31, A254–A261.
Castelhano, M. S. & Henderson, J. M. (2007). Initial Scene Representations Facilitate Eye Movement Guidance in Visual Search. Journal of Experimental Psychology: Human Perception and Performance, 33, 753–763.
Chua, H. F., Boland, J. E. & Nisbett, R. E. (2002). Cultural variation in eye movement during scene perception. Proceedings of the National Academy of Sciences of the United States of America, 102, 12629–12633.
Einhäuser, W., Rutishauser, U., Frady, E. P., Nadler, S., König, P. & Koch, C. (2006). The relation of phase noise and luminance contrast to overt attention in complex visual stimuli. Journal of Vision, 6, 1148–1158.
Helo, A., Pannasch, S., Sirri, L. & Rämä, P. (2014). The maturation of eye movement behaviour: scene viewing characteristics in children and adults. Vision Research, 103, 83–91.
Henderson, J. M., Weeks, P. A., Jr. & Hollingworth, A. (1999). The Effects of Semantic Consistency on Eye Movements During Complex Scene Viewing. Journal of Experimental Psychology: Human Perception and Performance, 25, 210–228.
Henderson, J. M. & Pierce, G. L. (2008). Eye movements during scene viewing: Evidence for mixed control of fixation durations. Psychonomic Bulletin & Review, 15, 566–573.
Henderson, J. M., Nuthmann, A. & Luke, S. G. (2013). Eye Movement Control During Scene Viewing: Immediate Effects of Scene Luminance on Fixation Durations. Journal of Experimental Psychology: Human Perception and Performance, 39, 318–322.
Henderson, J. M., Olejarczyk, J., Luke, S. G. & Schmidt, J. (2014). Eye movement control during scene viewing: Immediate degradation and enhancement effects of spatial frequency filtering. Visual Cognition, 22, 486–502.
Itti, L. & Koch, C. (2000). A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research, 40, 1489–1506.
Jansen, L., Onat, S. & König, P. (2009). Influence of disparity on fixation and saccades in free viewing of natural scenes. Journal of Vision, 9(1):29, 1–19, http://journalofvision.org/9/1/29/, doi:10.1167/9.1.29.
Mannan, S. K., Ruddock, K. H. & Wooding, D. S. (1996). The relationship between the locations of spatial features and those of fixations made during visual examination of briefly presented images. Spatial Vision, 10, 165–188.
Onat, S., Açik, A., Schumann, F. & König, P. (2014). The Contributions of Image Content and Behavioural Relevancy to Overt Attention. PLOS ONE, 9, e93254.
Ossandón, J. P., Onat, S. & König, P. (2014). Spatial biases in viewing behavior. Journal of Vision, 14(2):20, 1–26.
Pannasch, S., Helmert, J. R., Roth, K., Herbold, A.-K. & Walter, H. (2008). Visual Fixation Durations and Saccade Amplitudes: Shifting Relationship in a Variety of Conditions. Journal of Eye Movement Research, 2, 1–19.
Parkhurst, D. J., Law, K. & Niebur, E. (2002). Modeling the role of salience in the allocation of overt visual attention. Vision Research, 42, 107–123.
Staub, A., Abbott, M. & Bogartz, R. S. (2012). Linguistically guided anticipatory eye movements in scene viewing. Visual Cognition, 20, 922–946.
Walshe, R. C. & Nuthmann, A. (2014). Asymmetrical control of fixation durations in scene viewing. Vision Research, 100, 38–46.
