A temporal expression in a text is a sequence of tokens (words, numbers and characters) that denotes time, that is, expresses a point in time, a duration or a frequency. Examples:
A point in time:
He was born on <TIMEX>6 May, 1980</TIMEX>.
A duration:
The show lasted <TIMEX>7 minutes</TIMEX>.
A frequency:
The pump circulates the water <TIMEX>every 2 hours</TIMEX>.
Initially, temporal expressions were considered a type of named entity, and their identification was part of the named entity recognition task. Since the Automatic Content Extraction program in 2004, it has been treated as a separate task called Temporal Expression Recognition and Normalisation (TERN). Timex recognition and normalisation are now evaluated in two major temporal annotation challenges, TempEval and i2b2, both of which prefer the TimeML-level TIMEX3 standard.[1]
As with NER systems, temporal expression taggers have been built using either linguistic grammar-based techniques or statistical models. Hand-crafted grammar-based systems have typically obtained better results, but at the cost of months of work by experienced linguists. Many such systems are now available,[2][3][4] so creating a temporal expression recognizer from scratch is generally an undesirable duplication of effort. Instead, current approaches focus on novel subclasses of timex.[5]
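The grammar-based approach can be illustrated with a deliberately small sketch. The three regular-expression patterns below are hypothetical and cover only the example sentences given earlier; they are not taken from any of the cited systems, whose grammars are far larger.

```python
import re

# Hypothetical, minimal "grammar": one hand-written pattern per class of
# expression shown in the examples. Real systems use far richer rule sets.
MONTHS = (r"(?:January|February|March|April|May|June|July|August|"
          r"September|October|November|December)")
UNITS = r"(?:second|minute|hour|day|week|month|year)s?"

# Order matters: the frequency rule is tried before the duration rule so
# that "every 2 hours" is tagged as a whole rather than as "2 hours".
PATTERNS = [
    ("FREQUENCY", rf"\bevery\s+(?:\d+\s+)?{UNITS}\b"),
    ("DATE", rf"\b\d{{1,2}}\s+{MONTHS},?\s+\d{{4}}\b"),
    ("DURATION", rf"\b\d+\s+{UNITS}\b"),
]

TIMEX_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in PATTERNS),
    re.IGNORECASE,
)


def tag_timex(text: str) -> str:
    """Wrap every span matched by one of the patterns in <TIMEX> tags."""
    return TIMEX_RE.sub(lambda m: f"<TIMEX>{m.group(0)}</TIMEX>", text)


if __name__ == "__main__":
    for sentence in (
        "He was born on 6 May, 1980.",
        "The show lasted 7 minutes.",
        "The pump circulates the water every 2 hours.",
    ):
        print(tag_timex(sentence))
```

Even this toy version shows why hand-crafted grammars are costly: each new surface form ("last Tuesday", "the 1990s", "twice a week") needs its own rule and must be ordered correctly relative to the existing ones.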
Statistical systems typically require a large amount of manually annotated training data and are usually applied to the recognition task only, although machine learning algorithms have also been used to resolve certain ambiguities in the interpretation step.[6] [7]
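To give a sense of what such training data can look like, the sketch below converts text annotated with inline TIMEX tags into per-token labels. The B/I/O labelling scheme and the simple tokenizer are assumptions made for illustration; the systems cited here may use other encodings.

```python
import re

# Assumed input format: inline <TIMEX>...</TIMEX> tags, as in the examples.
TAG_RE = re.compile(r"<TIMEX>(.*?)</TIMEX>")
# Naive tokenizer (an assumption): runs of word characters, or single
# punctuation marks.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def to_bio(annotated: str) -> list[tuple[str, str]]:
    """Turn annotated text into (token, label) pairs.

    Labels: B-TIMEX for the first token of an expression, I-TIMEX for the
    following tokens of the same expression, O for everything else.
    """
    labelled = []
    pos = 0
    for match in TAG_RE.finditer(annotated):
        # Tokens before the current expression lie outside any timex.
        outside = TOKEN_RE.findall(annotated[pos:match.start()])
        labelled += [(tok, "O") for tok in outside]
        inside = TOKEN_RE.findall(match.group(1))
        labelled += [
            (tok, "B-TIMEX" if i == 0 else "I-TIMEX")
            for i, tok in enumerate(inside)
        ]
        pos = match.end()
    # Remaining tokens after the last expression.
    labelled += [(tok, "O") for tok in TOKEN_RE.findall(annotated[pos:])]
    return labelled


if __name__ == "__main__":
    for token, label in to_bio("The show lasted <TIMEX>7 minutes</TIMEX>."):
        print(token, label)
```

A sequence-labelling model would then be trained to predict the label column from the token column, which is why every training sentence has to be annotated by hand first.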