Alexander G. Hauptmann is an American academic. He currently serves as a research professor in the Language Technologies Institute at the School of Computer Science at Carnegie Mellon University. He has led the Informedia Digital Library project, which made seminal strides in multimedia information retrieval and won best-paper awards at major conferences. He was also a founder of the international advisory committee for the Text Retrieval Conference (TREC) Video Retrieval Evaluation, known as TRECVID.
Hauptmann started at Johns Hopkins University in 1978 and received a BA and an MA in psychology in 1982. For two years, he studied computer science at the Technische Universitaet Berlin. In 1991, he received a PhD in computer science from Carnegie Mellon University (CMU).
From 1984, he was a researcher in the CMU speech group at Carnegie Mellon University. Two years later, he became a research associate at the School of Computer Science, where he has served as a system scientist since 1994 and as a senior system scientist since 1998.
In 2003, he received the Allen Newell Award for Research Excellence for the Informedia Digital Library, together with H. Wactlar, M. Christel, T. Kanade, and S. Stevens.
His research interests are in multimedia analysis and indexing, speech recognition, speech synthesis, speech interfaces, interfaces to multimedia systems, and language in general.[1] According to Hauptmann (2008), "Over the years his research interests have led him to pursue and combine several different areas of research: man-machine communication, natural language processing and speech understanding".[2]
In the area of man-machine communication, according to Hauptmann (2008), "he is interested in the tradeoffs between different modalities, including gestures and speech, and in the intuitiveness of interaction protocols. In natural language processing, his desire is to break through the bottlenecks that are currently preventing larger scale natural language applications. The latter theme was also the focus of my thesis, which investigated the use of machine learning on large text samples to acquire the knowledge needed for semantic natural language understanding".[2]