Name: | David C. Marr |
Birth Date: | 19 January 1945 |
Birth Place: | Woodford, London, UK |
Death Place: | Cambridge, Massachusetts, U.S. |
Thesis Title: | A general theory for cerebral cortex |
Thesis Url: | https://idiscover.lib.cam.ac.uk/permalink/f/16u99e0/44CAM_ALMA21429706740003606 |
Thesis Year: | 1972 |
Field: | Computational neuroscience, artificial intelligence, psychology |
Work Institutions: | Massachusetts Institute of Technology |
Alma Mater: | Trinity College, Cambridge |
Doctoral Advisor: | Giles Brindley |
Doctoral Students: | Shimon Ullman, Eric Grimson, John M. Hollerbach |
Awards: | IJCAI Computers and Thought Award |
David Courtenay Marr (19 January 1945 – 17 November 1980)[1] was a British neuroscientist and physiologist. Marr integrated results from psychology, artificial intelligence, and neurophysiology into new models of visual processing. His work was influential in computational neuroscience and led to a resurgence of interest in the discipline.
Born in Woodford, Essex, and educated at Rugby School, he was admitted to Trinity College, Cambridge on 1 October 1963, having been awarded an Open Scholarship and the Lees Knowles Rugby Exhibition.
He was awarded the Coutts Trotter Scholarship in 1966 and obtained his BA in mathematics the same year. He was elected a Research Fellow of Trinity College, Cambridge, in 1968. His doctoral dissertation, supervised by Giles Brindley, was submitted in 1969 and described his model of the function of the cerebellum, based mainly on anatomical and physiological data drawn from a book by J. C. Eccles. His interest then turned from general brain theory to visual processing. He subsequently worked at the Massachusetts Institute of Technology, where he took up a faculty appointment in the Department of Psychology in 1977 and was made a tenured full professor in 1980. Marr proposed that understanding the brain requires an understanding of the problems it faces and the solutions it finds. He emphasised the need to avoid general theoretical debates and instead focus on understanding specific problems.
Marr died of leukemia in Cambridge, Massachusetts, at the age of 35. His findings are collected in the book Vision: A computational investigation into the human representation and processing of visual information, which was finished mainly in the summer of 1979, published posthumously in 1982, and re-issued in 2010 by The MIT Press. The book played a key role in the beginning and rapid growth of the field of computational neuroscience.[2] He was married to Lucia M. Vaina of Boston University's Department of Biomedical Engineering and Neurology.
Several academic awards and prizes are named in his honour: the Marr Prize, one of the most prestigious awards in computer vision; the David Marr Medal, awarded every two years by the Applied Vision Association in the UK;[3] and the Marr Prize awarded by the Cognitive Science Society for the best student paper at its annual conference.
Marr is best known for his work on vision, but before he began work on that topic he published three seminal papers proposing computational theories of the cerebellum (in 1969), neocortex (in 1970), and hippocampus (in 1971). Each of those papers presented important new ideas that continue to influence modern theoretical thinking.
The cerebellum theory[4] was motivated by two unique features of cerebellar anatomy: (1) the cerebellum contains vast numbers of tiny granule cells, each receiving only a few inputs from "mossy fibres"; (2) Purkinje cells in the cerebellar cortex each receive tens of thousands of inputs from "parallel fibres", but only one input from a single "climbing fibre", which is, however, extremely strong. Marr proposed that the granule cells encode combinations of mossy fibre inputs, and that the climbing fibres carry a "teaching" signal that instructs their Purkinje cell targets to modify the strength of their synaptic connections from parallel fibres.
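As a rough illustration of the kind of learning scheme this theory implies (a minimal perceptron-style sketch with assumed sizes and learning rate, not Marr's own formalism), the fragment below models a single Purkinje unit whose parallel-fibre weights change only when a hypothetical climbing-fibre teaching signal fires:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: many granule cells (parallel fibres) onto one Purkinje cell.
N_PARALLEL_FIBRES = 1000
LEARNING_RATE = 0.05

# Sparse binary granule-cell code for one mossy-fibre input pattern (illustrative).
parallel_fibre_activity = (rng.random(N_PARALLEL_FIBRES) < 0.02).astype(float)

# Parallel-fibre -> Purkinje-cell synaptic weights, initially silent.
weights = np.zeros(N_PARALLEL_FIBRES)

def purkinje_output(activity, weights, threshold=0.5):
    """Threshold the summed parallel-fibre drive (a crude stand-in for the Purkinje response)."""
    return float(activity @ weights > threshold)

def climbing_fibre_update(activity, weights, teaching_signal, lr=LEARNING_RATE):
    """Strengthen active parallel-fibre synapses only when the climbing fibre fires.

    This gating by a single, powerful "teacher" input is the core of Marr's proposal.
    """
    if teaching_signal:
        weights = weights + lr * activity
    return weights

# One teaching episode: the climbing fibre fires while this pattern is present...
weights = climbing_fibre_update(parallel_fibre_activity, weights, teaching_signal=True)
# ...after such episodes the Purkinje cell comes to respond to the learned pattern alone.
print(purkinje_output(parallel_fibre_activity, weights))
```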
The theory of neocortex[5] was primarily motivated by the discoveries of David Hubel and Torsten Wiesel, who found several types of "feature detectors" in the primary visual area of the cortex. Marr proposed, generalising from that observation, that cells in the neocortex are flexible categorizers: they learn the statistical structure of their input patterns and become sensitive to combinations that are frequently repeated.
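A hedged toy version of the "flexible categorizer" idea, using simple winner-take-all competitive learning (a later, standard formalism rather than Marr's own mathematics, with illustrative sizes and learning rate): units gradually become tuned to input combinations that recur often.

```python
import numpy as np

rng = np.random.default_rng(1)

N_INPUTS, N_UNITS = 50, 5        # illustrative sizes, not taken from Marr's paper
weights = rng.random((N_UNITS, N_INPUTS))
weights /= np.linalg.norm(weights, axis=1, keepdims=True)

def present(pattern, weights, lr=0.1):
    """Winner-take-all competitive learning: the best-matching unit moves toward the pattern."""
    winner = np.argmax(weights @ pattern)
    weights[winner] += lr * (pattern - weights[winner])
    weights[winner] /= np.linalg.norm(weights[winner])
    return winner

# Two frequently repeated input combinations (sparse binary feature patterns).
pattern_a = (rng.random(N_INPUTS) < 0.2).astype(float)
pattern_b = (rng.random(N_INPUTS) < 0.2).astype(float)

for _ in range(100):
    present(pattern_a, weights)
    present(pattern_b, weights)

# After repeated exposure, the winning units are tuned to the recurring combinations.
print(np.argmax(weights @ pattern_a), np.argmax(weights @ pattern_b))
```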
The theory of hippocampus[6] (which Marr called "archicortex") was motivated by the discovery by William Scoville and Brenda Milner that destruction of the hippocampus produced amnesia for memories of new or recent events but left intact memories of events that had occurred years earlier. Marr called his theory "simple memory": the basic idea was that the hippocampus could rapidly form memory traces of a simple type by strengthening connections between neurons. Remarkably, Marr's paper preceded by only two years a paper by Tim Bliss and Terje Lømo that provided the first clear report of long-term potentiation in the hippocampus, a type of synaptic plasticity very similar to what Marr hypothesized.[7] (Marr's paper contains a footnote mentioning a preliminary report of that discovery.[8]) The details of Marr's theory are no longer of great value because of errors in his understanding of hippocampal anatomy, but the basic concept of the hippocampus as a temporary memory system survives in a number of modern theories.[9] At the end of his paper Marr promised a follow-up paper on the relations between the hippocampus and neocortex, but none ever appeared.
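In the spirit of "simple memory", the sketch below uses a one-shot Hebbian auto-associative store (a standard textbook construction under assumed sizes, not Marr's exact model): connections between co-active neurons are strengthened in a single step, and a partial cue later retrieves the whole trace.

```python
import numpy as np

rng = np.random.default_rng(2)

N_NEURONS = 200                                    # illustrative size
memory = (rng.random(N_NEURONS) < 0.5) * 2.0 - 1   # a +/-1 activity pattern to store

# One-shot Hebbian strengthening between co-active neurons (outer product, no self-connections).
weights = np.outer(memory, memory)
np.fill_diagonal(weights, 0.0)

# Degrade the pattern: flip a fraction of the neurons to simulate a partial or noisy cue.
cue = memory.copy()
flipped = rng.choice(N_NEURONS, size=40, replace=False)
cue[flipped] *= -1

# Recall: one synchronous update recovers the stored trace from the partial cue.
recalled = np.sign(weights @ cue)
print(np.mean(recalled == memory))   # fraction of neurons correctly restored
```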
Marr treated vision as an information processing system. He put forth (in concert with Tomaso Poggio) the idea that one must understand information processing systems at three distinct, complementary levels of analysis.[10] This idea is known in cognitive science as Marr's Tri-Level Hypothesis:[11] the computational level (what problem the system solves and why), the level of representation and algorithm (what representations it uses and what processes operate on them), and the implementational level (how the representations and processes are physically realised).
Marr illustrates his tripartite analysis with the example of a device whose functioning is well understood: a cash register.[12]
At the computational level, the functioning of the register can be accounted for in terms of arithmetic, in particular the theory of addition: what is relevant at this level is the function being computed (addition) and its abstract properties, such as commutativity and associativity. The level of representation and algorithm specifies the form of the representations and the processes that operate on them: "we might choose Arabic numerals for the representations, and for the algorithm we could follow the usual rules about adding the least significant digits first and 'carrying' if the sum exceeds 9".[12] Finally, the level of implementation concerns how such representations and processes are physically realised; for example, the digits could be represented as positions on a metal wheel, or, alternatively, as binary numbers coded by the electrical states of digital circuitry. Notably, Marr pointed out that the most important level for the design of effective systems is the computational one.[12]
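To make the representation-and-algorithm level concrete, here is a minimal sketch that commits to one representation (Arabic numerals as digit lists) and one algorithm (adding the least significant digits first and carrying when a column exceeds 9). The function name and digit-list format are illustrative choices; the computational-level theory of addition is unchanged whatever representation is picked, and the implementation level is simply whatever hardware runs the code.

```python
def add_arabic(a_digits, b_digits):
    """Add two numbers given as lists of Arabic digits (most significant first),
    working from the least significant digit and carrying when a column exceeds 9."""
    result = []
    carry = 0
    # Walk both digit lists from the right (least significant column first).
    for i in range(1, max(len(a_digits), len(b_digits)) + 1):
        a = a_digits[-i] if i <= len(a_digits) else 0
        b = b_digits[-i] if i <= len(b_digits) else 0
        column = a + b + carry
        result.append(column % 10)   # digit kept in this column
        carry = column // 10         # carry passed to the next column
    if carry:
        result.append(carry)
    return list(reversed(result))

# 1945 + 35 = 1980, digit by digit with carrying.
print(add_arabic([1, 9, 4, 5], [3, 5]))
```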
Marr described vision as proceeding from a two-dimensional visual array (on the retina) to a three-dimensional description of the world as output. His stages of vision include a primal sketch of the scene, based on feature extraction of its fundamental components such as edges and regions; a 2.5D sketch, in which textures, depths, and surface orientations are made explicit from the viewer's perspective; and a 3D model, in which the scene is represented in a continuous, object-centered coordinate frame.[10]
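As a hedged stand-in for the first stage, the fragment below computes a crude primal-sketch-like map of intensity changes using finite-difference gradients on a synthetic image; the test image, threshold, and operator are illustrative choices, not Marr's actual filtering scheme.

```python
import numpy as np

# A tiny synthetic image: a bright square on a dark background (values in [0, 1]).
image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0

# Finite-difference gradients as a crude stand-in for the early filtering stage.
gy, gx = np.gradient(image)
edge_strength = np.hypot(gx, gy)

# A primal-sketch-like token map: locations where intensity changes sharply.
primal_sketch = edge_strength > 0.25
print(int(primal_sketch.sum()), "edge locations found")
```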
The 2.5D sketch is related to stereopsis, optic flow, and motion parallax. It reflects the fact that we do not see all of our surroundings directly but instead construct a viewer-centered, three-dimensional view of our environment. "2.5D sketch" is also the name of a so-called paraline drawing technique of data visualization, often referred to by the generic terms "axonometric" or "isometric" drawing, which is frequently used by modern architects and designers.[13]
Marr's three-stage framework does not capture well a central stage of visual processing: visual attention. A more recent, alternative framework proposed that vision is composed instead of three different stages: encoding, selection, and decoding.[14] Encoding samples and represents the visual inputs (e.g., representing them as neural activities in the retina).[15] Selection, or attentional selection, picks out a tiny fraction of the input information for further processing, e.g., by shifting gaze to an object or visual location so that the signals there can be processed better. Decoding infers or recognizes the selected input signals, e.g., recognizing the object at the center of gaze as somebody's face.
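A toy pipeline illustrating the three stages named above; the noise model, the saliency rule, and the nearest-template "decoder" are illustrative assumptions, not details from the cited work.

```python
import numpy as np

rng = np.random.default_rng(3)

# A toy "scene": ten local patches, each a small feature vector, with one conspicuous patch.
scene = 0.3 * rng.random((10, 16))
scene[7] = 0.8 + 0.2 * rng.random(16)              # the conspicuous patch (a "face", say)
templates = {"face": scene[7].copy(),              # known objects, purely illustrative
             "background": 0.3 * np.ones(16)}

def encode(scene, noise=0.05):
    """Encoding: sample and represent the input (noise added to mimic a retinal code)."""
    return scene + noise * rng.standard_normal(scene.shape)

def select(encoded):
    """Selection: attend to a tiny fraction of the input, here the highest-energy patch."""
    saliency = np.linalg.norm(encoded, axis=1)
    return encoded[np.argmax(saliency)]

def decode(selected, templates):
    """Decoding: infer what the selected signal is, by nearest-template matching."""
    return min(templates, key=lambda name: np.linalg.norm(templates[name] - selected))

attended = select(encode(scene))
print(decode(attended, templates))   # expected to print "face" in this toy setup
```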