Media naturalness theory is also known as the psychobiological model. The theory was developed by Ned Kock and attempts to apply Darwinian evolutionary principles to suggest which types of computer-mediated communication will best fit innate human communication capabilities. Media naturalness theory argues that natural selection has resulted in face-to-face communication becoming the most effective way for two people to exchange information.
The theory has been applied to human communication outcomes in various contexts, such as: education,[1] knowledge transfer,[2] communication in virtual environments,[3] e-negotiation,[4] business process improvement,[5] trust and leadership in virtual teamwork,[6] online learning,[7] [8] maintenance of distributed relationships,[9] performance in experimental tasks using various media,[10] [11] and modular production.[12] Its development is also consistent with ideas from the field of evolutionary psychology.[13]
Media naturalness theory builds on media richness theory's argument that face-to-face interaction is the richest type of communication medium[14] by providing an evolutionary explanation for the face-to-face medium's degree of richness. It argues that because ancient hominins communicated primarily face-to-face, evolutionary pressures have led to the development of a brain that is adapted for that form of communication.[13] [15] Kock points out that computer-mediated communication is far too recent a phenomenon to have shaped human cognition and language capabilities via natural selection. Kock therefore argues that communication media that suppress key elements found in face-to-face communication, as many electronic communication media do, pose cognitive obstacles to communication. These obstacles are particularly pronounced for complex tasks (e.g., business process redesign, new product development, online learning), because such tasks seem to require more intense communication over extended periods of time than simple tasks do.[13]
The naturalness of a communication medium is defined by Kock as the degree of similarity of the medium with the face-to-face medium. The face-to-face medium is presented as the medium enabling the highest possible level of communication naturalness, which is characterized by the following five key elements:[13] [15] (1) a high degree of co-location, which would allow the individuals engaged in a communication interaction to see and hear each other; (2) a high degree of synchronicity, which would allow the individuals to quickly exchange communicative stimuli; (3) the ability to convey and observe facial expressions; (4) the ability to convey and observe body language; and (5) the ability to convey and listen to speech.
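As a rough illustration of this definition, the five elements can be treated as a checklist against which a medium is compared with the face-to-face medium. The sketch below is only a toy under stated assumptions: the `Medium` class, the `suppressed_elements` helper, and the element profiles given for audio conferencing and e-mail are hypothetical and do not come from Kock's work. The sketch deliberately lists suppressed elements rather than computing a score, since the theory does not say that the five elements carry equal weight.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Medium:
    """Hypothetical checklist of the five naturalness elements."""
    name: str
    co_location: bool
    synchronicity: bool
    facial_expressions: bool
    body_language: bool
    speech: bool


def suppressed_elements(medium: Medium) -> list[str]:
    """Return the naturalness elements that the medium does not support."""
    return [
        f.name
        for f in fields(medium)
        if f.name != "name" and not getattr(medium, f.name)
    ]


# Face-to-face supports all five elements by definition.
face_to_face = Medium("face-to-face", True, True, True, True, True)

# Assumed profiles, for illustration only (not taken from the cited sources).
audio_conference = Medium("audio conferencing", False, True, False, False, True)
email = Medium("e-mail", False, False, False, False, False)

for m in (face_to_face, audio_conference, email):
    print(m.name, "suppresses:", suppressed_elements(m))
```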
Media naturalness theory predicts that any electronic communication medium allowing for the exchange of significantly fewer or significantly more communicative stimuli per unit of time than the face-to-face medium will pose cognitive obstacles to communication.[13] In other words, media naturalness theory places the face-to-face medium at the center of a one-dimensional scale of naturalness, where deviations to the left or right are associated with decreases in naturalness (see Figure 1).
Electronic media that enable the exchange of significantly more communicative stimuli per unit of time than the face-to-face medium are classified by media naturalness theory as having a lower degree of naturalness than the face-to-face medium. As such, those media are predicted to be associated with higher cognitive effort; in this case due primarily to a phenomenon known as information overload, which is characterized by individuals having more communicative stimuli to process than they are able to.[13]
Human beings possess specialized brain circuits that are adapted for the recognition of faces and the generation and recognition of facial expressions, which artificial intelligence research suggests require complex computations that are difficult to replicate even in powerful computers. The same situation is found in connection with speech generation and recognition. Generation and recognition of facial expressions, and speech generation and recognition, are performed effortlessly by humans.[13]
Cognitive effort is defined in media naturalness theory as the amount of mental activity, or, from a biological perspective, the amount of brain activity involved in a communication interaction.[13] It can be assessed directly, with the use of techniques such as magnetic resonance imaging. Cognitive effort can also be assessed indirectly, based on perceptions of levels of difficulty associated with communicative tasks, as well as through indirect measures such as fluency. Fluency is defined as the amount of time taken to convey a certain number of words through different communication media, which is assumed to correlate with (and serve as a surrogate measure of) the amount of time taken to convey a certain number of ideas through different media.[10] According to media naturalness theory, a decrease in the degree of naturalness of a communication medium leads to an increase in the amount of cognitive effort required to use the medium for communication.[13]
Individuals brought up in different cultural environments usually possess different information processing schemas that they have learned over their lifetimes. Different schemas make individuals interpret information in different ways, particularly when information is expected but not actually provided.[13] [15]
While different individuals are likely to look for the same types of communicative stimuli, their interpretation of the message being communicated in the absence of those stimuli will be largely based on their learned schemas, which are likely to differ from those held by other individuals (no two individuals, not even identical twins raised together, go through exactly the same experiences during their lives). According to media naturalness theory, a decrease in medium naturalness, caused by the selective suppression of media naturalness elements in a communication medium, leads to an increase in the probability of misinterpretations of communicative cues, and thus an increase in communication ambiguity.[15]
To say that our genes influence the formation of a phenotypic trait (i.e., a biological trait that defines a morphological, behavioral, physiological, etc. characteristic) does not mean the same as saying that the trait in question is innate. Very few phenotypic traits are innate (e.g., blood type); the vast majority, including most of those in connection with our biological communication apparatus, need interaction with the environment to be fully and properly developed.[15]
While there is substantial evidence suggesting that our biological communication apparatus is adapted for face-to-face communication, there is also ample evidence that such an apparatus (including the neural functional language system) cannot be fully developed without a significant amount of practice. Thus, according to media naturalness theory, evolution must have shaped brain mechanisms to compel human beings to practice the use of their biological communication apparatus; mechanisms that are similar to those compelling animals to practice those skills that play a key role in connection with survival and mating.[15] Among these mechanisms, one of the most important is that of physiological arousal, which is often associated with excitement and pleasure. Engaging in communication interactions, particularly in face-to-face situations, triggers physiological arousal in human beings. Suppression of media naturalness elements makes communication interactions duller than if those elements were present.[15]
Complex speech was enabled by the evolution of a larynx located relatively low in the neck, which considerably increased the variety of sounds that our species could generate; this is one of the most important landmarks in the evolution of the human species.[13] However, that adaptive design also significantly increased our ancestors' chances of choking on ingested food and liquids, and of suffering from aerodigestive tract diseases such as gastroesophageal reflux. This leads to the conclusion that complex speech must have been particularly important for effective communication in our evolutionary past; otherwise the related evolutionary costs would have prevented it from evolving through natural selection.[13] This argument is similar to that made by Amotz Zahavi in connection with evolutionary handicaps. If a trait evolves to improve effectiveness in a task in spite of imposing a survival handicap, then the trait should be a particularly strong determinant of performance in that task, in order to offset the survival cost it imposes.
Media naturalness theory builds on this evolutionary handicap conclusion to predict that the degree to which an electronic communication medium supports an individual's ability to convey and listen to speech is particularly significant in defining its naturalness.[13] Media naturalness theory predicts, through its speech imperative proposition, that speech enablement influences naturalness significantly more than a medium's degree of support for the use of facial expressions and body language.[13] This prediction is consistent with past research showing that removing speech from an electronic communication medium significantly increases the perceived mental effort associated with using the medium to perform knowledge-intensive tasks. According to this prediction, a medium such as audio conferencing is relatively close to the face-to-face medium in terms of naturalness (see Figure 2).
Increases in cognitive effort and communication ambiguity are usually accompanied by an interesting behavioral phenomenon, called compensatory adaptation.[10] The phenomenon is characterized by voluntary and involuntary attempts by the individuals involved in a communicative act to compensate for the obstacles posed by an unnatural communication medium. One of the key indications of compensatory adaptation is a decrease in communication fluency, which can be measured through the number of words conveyed per minute through a communication medium. That is, communication fluency is believed to go down as a result of individuals making an effort to adapt their behavior in a compensatory way.[10]
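The words-per-minute measure of fluency mentioned above can be computed directly from a transcript and the duration of the interaction. The following is a minimal sketch; the function name `fluency_wpm`, the sample sentence, and the whitespace-based word counting are assumptions made for illustration, not the procedure used in the cited studies.

```python
def fluency_wpm(transcript: str, duration_seconds: float) -> float:
    """Words conveyed per minute, counting whitespace-separated tokens."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    word_count = len(transcript.split())
    return word_count / (duration_seconds / 60.0)


# Hypothetical example: a 12-word message conveyed in 30 seconds.
sample = "we should redesign the ordering process before the next quarter starts ok"
print(fluency_wpm(sample, 30))
```

Comparing this value for the same task across two media gives the kind of fluency difference that the theory interprets as a sign of compensatory adaptation.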
For example, an empirical study suggests that when individuals used instant messaging and face-to-face media to perform complex and knowledge-intensive tasks, the use of the electronic (i.e., instant messaging) medium caused several effects. Those effects were consistent with media naturalness theory, and the compensatory adaptation notion.[16] Among those effects, the electronic medium increased perceived cognitive effort by approximately 40% and perceived communication ambiguity by approximately 80% – as predicted by media naturalness theory. The electronic medium also reduced actual fluency by approximately 80%, and the quality of the task outcomes was not affected, suggesting compensatory adaptation.
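The percentages reported above are relative changes with respect to the face-to-face condition. As a worked illustration of the arithmetic only, the sketch below computes such a change; the function name `percent_change` and the input values are hypothetical and are not data from the cited study.

```python
def percent_change(baseline: float, observed: float) -> float:
    """Relative change of `observed` with respect to `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (observed - baseline) / baseline * 100.0


# Hypothetical fluency values in words per minute (not study data).
face_to_face_wpm = 100.0
instant_messaging_wpm = 20.0

change = percent_change(face_to_face_wpm, instant_messaging_wpm)
print(f"Fluency change: {change:.0f}%")
```

A negative result indicates a reduction relative to the face-to-face baseline, and a positive result indicates an increase.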
Media compensation theory,[17] proposed in 2011 by Hantula, Kock, D'Arcy, and DeRosa, further refines Kock's media naturalness theory. The authors explain that media compensation theory was developed specifically to address two paradoxes:
The authors grapple with how humans "who have not changed much in many millennia" (Hantula et al., 2011, p. 358) are able to successfully embrace and employ lean media, such as texting, given the authors' assumption that human evolution has progressed toward, and produced an adeptness for, face-to-face communication.
Kock and Garza (2011)[18] continue Kock's research on media naturalness by studying whether taking a college course online, as opposed to in person, negatively affects students’ actual and perceived learning experiences because of differences in the media richness and media naturalness afforded by the two approaches. The findings show that the online cohort performed as well, statistically, as the in-person cohort. The authors suggest that the study’s findings support Carlson and Zmud’s channel expansion theory (1999)[19], which asserts that humans are capable of adapting to new communication media (Kock & Garza, 2011). Kock & Garza (2011) also argue that a portion of their findings support Kock’s (2004) earlier claim that people are not evolutionarily equipped to communicate through computer-mediated communication as well as they do through richer media such as face-to-face communication. That claim has since been disputed. For example, DeClerck and Holtzman dispute an overriding need for visual and auditory cues when they say that “experienced users are able to accurately convey their intended message via digitally-mediated communication, despite the lack of available verbal and non-verbal cues” (2018, p. 116).[20] DeClerck and Holtzman also suggest that text messaging may be more focused than face-to-face communication because it is not cluttered by additional verbal and non-verbal cues that can otherwise tie up “cognitive resources” (2018, p. 111). Additionally, Lisiecka, Rychwalska, Samson, Lucznik, Ziembowicz, Schostek, and Nowak (2016) point out that, although it has been generally accepted that “media other than face-to-face are considered an obstacle rather than an equally effective means of information transfer” (2016, p. 13), the results of their study suggest that computer-mediated communication “has become similarly natural and intuitive as face-to-face contacts” (2016, p. 13).[21]