Visual masking is a phenomenon of visual perception. It occurs when the visibility of one image, called a target, is reduced by the presence of another image, called a mask.[1] The target might be invisible or appear to have reduced contrast or lightness. There are three different timing arrangements for masking: forward masking, backward masking, and simultaneous masking. In forward masking, the mask precedes the target. In backward masking the mask follows the target. In simultaneous masking, the mask and target are shown together. There are two different spatial arrangements for masking: pattern masking and metacontrast. Pattern masking occurs when the target and mask locations overlap. Metacontrast masking occurs when the mask does not overlap with the target location.
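The two classifications above can be sketched in code. This is a minimal illustration, not a standard taxonomy API: the names `Timing`, `Spatial`, `timing_arrangement`, and `spatial_arrangement` are hypothetical, and treating "shown together" as identical onset times is a simplifying assumption.

```python
from enum import Enum


class Timing(Enum):
    FORWARD = "forward"            # mask precedes the target
    BACKWARD = "backward"          # mask follows the target
    SIMULTANEOUS = "simultaneous"  # mask and target shown together


class Spatial(Enum):
    PATTERN = "pattern"            # mask overlaps the target location
    METACONTRAST = "metacontrast"  # mask does not overlap the target location


def timing_arrangement(target_onset_ms: float, mask_onset_ms: float) -> Timing:
    """Classify the timing arrangement from the two onset times.

    Assumption: "simultaneous" is modelled as exactly equal onsets.
    """
    if mask_onset_ms < target_onset_ms:
        return Timing.FORWARD
    if mask_onset_ms > target_onset_ms:
        return Timing.BACKWARD
    return Timing.SIMULTANEOUS


def spatial_arrangement(mask_overlaps_target: bool) -> Spatial:
    """Classify the spatial arrangement from whether the locations overlap."""
    return Spatial.PATTERN if mask_overlaps_target else Spatial.METACONTRAST


# Example: a mask shown 50 ms after the target, at a non-overlapping location.
print(timing_arrangement(0.0, 50.0).value, spatial_arrangement(False).value)
```

Any timing arrangement can be combined with either spatial arrangement, which is why the two are modelled as independent enumerations.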
Suppression can be seen in both forward and backward masking when there is pattern masking, but not when there is metacontrast. Simultaneous masking, however, will produce facilitation of target visibility during pattern masking. Facilitation also comes about when metacontrast is combined with either simultaneous or forward masking.[2] This is because it takes time for the mask to reach the target's location through lateral propagation. As the target gets further from the mask, the time required for lateral propagation increases. Thus, the masking effect will increase as the mask gets closer to the target.
As the time difference between the target and the mask increases, the masking effect decreases. This is because the integration time of a target stimulus has an upper limit of 200 ms, based on physiological experiments,[3] [4] [5] and as the separation approaches this limit, the mask is able to produce less of an effect on the target, as the target has had more time to form a full neural representation in the brain. Polat, Sterkin, and Yehezkel explained in detail the effect of temporal matching between target input and lateral propagation of the mask. Based on data from previous single-unit recordings, they concluded that the time window for efficient interaction with target processing is 210 to 310 ms after the target's appearance. A mask response arriving outside this window would fail to cause a masking effect. This explains why there is a masking effect when the mask is presented 50 ms after the target, but not when the inter-stimulus interval (ISI) between mask and target is 150 ms. In the first case, the mask response would propagate to the target location and be processed with a delay of 260 to 310 ms, whereas an ISI of 150 ms would result in a delay of 410 to 460 ms.
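The timing argument above amounts to an interval-overlap check. The following is a minimal sketch using only the figures quoted in the text (the 210 to 310 ms window and the two delay ranges). The function names are hypothetical, and treating the intervals as closed is an assumption.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start_ms, end_ms), treated as a closed interval

# Window for efficient interaction with target processing, as quoted in the text
# (milliseconds after the target's appearance).
INTERACTION_WINDOW_MS: Interval = (210.0, 310.0)


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def predicts_masking(mask_delay_ms: Interval,
                     window_ms: Interval = INTERACTION_WINDOW_MS) -> bool:
    """Masking is predicted only if the mask response falls inside the window."""
    return overlaps(mask_delay_ms, window_ms)


# Delay ranges quoted in the text for the two cases.
delay_mask_50_ms_after_target: Interval = (260.0, 310.0)
delay_isi_150_ms: Interval = (410.0, 460.0)

print(predicts_masking(delay_mask_50_ms_after_target))  # True
print(predicts_masking(delay_isi_150_ms))               # False
```

The first delay range lies within the window and the second lies entirely after it, matching the presence and absence of masking described above.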
In dichoptic visual masking, the target is presented to one eye and the mask to the other, whereas in monoptic visual masking, both eyes are presented with the target and the mask. It was found that the masking effect was just as strong in dichoptic as it was in monoptic masking, and that it showed the same timing characteristics.[6] [7] [8]
There are multiple theories surrounding the neural correlates of masking, but most of them agree on a few key ideas. First, backward visual masking comes about from suppression of the target's “after-discharge”,[9] where the after-discharge can be thought of as the neural response to the target's termination. Impairments in backward masking have been consistently found in those with schizophrenia[10] as well as in their unaffected siblings,[11] [12] thus suggesting that the impairments might be an endophenotype for schizophrenia.[13]
Forward masking, on the other hand, is correlated with the suppression of the target's “onset-response”, which can be thought of as the neural response to the target's appearance.
As originally proposed by Breitmeyer and Ganz in 1976,[14] the two-channel model stated that there were two different visual information channels: one fast and transient, the other slow and sustained. The theory asserts that each stimulus travels up each channel, and that both channels are necessary for proper and full processing of any given stimulus. It explained backward masking by saying that the neural representation of the mask would travel up the transient channel and intercept the neural representation of the target as it travelled up the slower channel, suppressing the target's representation and decreasing its visibility. One problem with this model, as pointed out by Macknik and Martinez-Conde, is that it predicts masking to occur as a function of how far apart, temporally, the stimulus onsets are. However, Macknik and Martinez-Conde showed that backward masking actually depends more on how far apart the stimulus terminations are.
Breitmeyer and Ögmen modified the two-channel model in 2006,[15] renaming it the retino-cortical dynamics (RECOD) model in the process. Their main proposed modification was that the fast and slow channels were actually feed forward and feedback channels, instead of the magnocellular and parvocellular retino-geniculocortical pathways, which is what had previously been proposed. According to this new model, backward masking is caused when feed forward input from the mask interferes with the feedback coming from the higher visual areas’ response to the target, thus reducing visibility.
The model put forward by Lamme's group proposes that backward masking is caused by an interference with feedback from higher visual areas.[16] In this model, target duration is irrelevant because masking is supposed to occur as a function of feedback, which is generated when the target appears on screen. Lamme's group further supported their model when they reported that the surgical removal of the extrastriate cortex in monkeys leads to a reduction of late responses in area V1.[17]
Proposed by Macknik and Martinez-Conde in 2008, this theory holds that masking can be explained almost entirely by feed forward lateral inhibition circuits. The idea is that the edges of the mask, if positioned in close proximity to the target, may inhibit the responses caused by the edges of the target, thereby impairing perception of the target.
Haynes, Driver, and Rees proposed this theory in 2005,[18] stating that visibility derives from the feed forward and feedback interactions between V1 and the fusiform gyrus. In their experiment, they required subjects to attend actively to the target. Thus, as Macknik and Martinez-Conde point out, it is possible that their results were confounded by the attentional aspect of the trials, and that the results may not accurately reflect the effects of visual masking.
This was proposed by Thompson and Schall, based on experiments conducted in 1999[19] and 2000.[20] They concluded that visual masking is processed in the frontal eye fields, and that the neural correlate of masking lies not in the inhibition of the response to the target but in the “merging” of target and mask responses. One criticism of their experiment, however, is that their target was almost 300 times dimmer than the mask, so their results may have been confounded by the different response latencies one would expect from stimuli with such differences in brightness.
Macknik & Martinez-Conde[21] recorded from neurons in the lateral geniculate nucleus (LGN) and V1 while presenting monoptic and dichoptic stimuli. They found that monoptic masking occurred in all the LGN and V1 neurons that were recorded, but dichoptic masking only occurred in some of the binocular neurons in V1. This supports the hypothesis that visual masking in monoptic regions is not due to feedback from dichoptic regions: if there had been feedback from higher areas of the visual hierarchy, the early circuits would have “inherited” dichoptic masking from the feedback coming from higher levels, and so would exhibit both dichoptic and monoptic masking. Although monoptic masking is stronger in the early visual areas, monoptic and dichoptic masking are equivalent in magnitude. Thus, dichoptic masking must become stronger as it proceeds down the visual hierarchy if the preceding hypothesis is correct. In fact, dichoptic masking was shown to begin downstream of area V2.