Reification (information retrieval) explained

In information retrieval and natural language processing, reification is the process by which an abstract idea about a person, place or thing is turned into an explicit data model or other object created in a programming language, such as a feature set of demographic[1] or psychographic[2] attributes, or both. By means of reification, something that was previously implicit, unexpressed, and possibly inexpressible is explicitly formulated and made available to conceptual (logical or computational) manipulation.
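As a minimal sketch of what such an explicit data model might look like, the following Python fragment reifies the idea of "a person" as an object holding demographic and psychographic attributes. The class name `ReifiedProfile`, its fields, and the attribute names and values are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ReifiedProfile:
    """Explicit stand-in for an otherwise implicit idea of a person (hypothetical model)."""
    entity_id: str
    demographic: dict = field(default_factory=dict)
    psychographic: dict = field(default_factory=dict)

    def feature_set(self) -> dict:
        """Flatten both attribute groups into one namespaced feature set."""
        features = {f"demographic.{k}": v for k, v in self.demographic.items()}
        features.update(
            {f"psychographic.{k}": v for k, v in self.psychographic.items()}
        )
        return features


# Illustrative values only; they are not taken from any data set.
profile = ReifiedProfile(
    "user-1",
    demographic={"age_band": "25-34"},
    psychographic={"openness": 0.7},
)
print(profile.feature_set())
```

Once the attributes are held in an object like this, they can be compared, stored, or passed to a model, which is the sense in which the idea becomes available to computational manipulation.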

Semantic parsing is the process by which a natural language statement is transformed so that the actions and events in it become variables that can be quantified over.[3] For example, "John chased the duck furiously" can be transformed into something like

(Exists e)(chasing(e) & past_tense(e) & actor(e,John) & furiously(e) & patient(e,duck)).

Another example would be "Sally said John is mean", which could be expressed as something like

(Exists u,v)(saying(u) & past_tense(u) & actor(u,Sally) & that(u,v) & is(v) & actor(v,John) & mean(v)).
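The two formulas above can be held as plain data and rendered back into the same notation. The sketch below assumes a representation in which each conjunct is a (predicate, arguments) pair. The function name `render` and the variable names `chase` and `say` are illustrative choices, and no actual parsing of the English sentences is attempted.

```python
def render(variables, conjuncts):
    """Render conjuncts in the (Exists ...)(p(...) & q(...)) notation."""
    body = " & ".join(f"{pred}({','.join(args)})" for pred, args in conjuncts)
    return f"(Exists {','.join(variables)})({body})"


# "John chased the duck furiously": one event variable e.
chase = [
    ("chasing", ("e",)),
    ("past_tense", ("e",)),
    ("actor", ("e", "John")),
    ("furiously", ("e",)),
    ("patient", ("e", "duck")),
]

# "Sally said John is mean": the saying u takes the state v as its content.
say = [
    ("saying", ("u",)),
    ("past_tense", ("u",)),
    ("actor", ("u", "Sally")),
    ("that", ("u", "v")),
    ("is", ("v",)),
    ("actor", ("v", "John")),
    ("mean", ("v",)),
]

print(render(["e"], chase))
# (Exists e)(chasing(e) & past_tense(e) & actor(e,John) & furiously(e) & patient(e,duck))
print(render(["u", "v"], say))
```

The point of the reified form is visible in the data: the adverb and the tense are ordinary one-place predicates of the event variable, and the embedded clause is just another variable related to the first by `that`.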

Such formal meaning representations allow one to use the tools of classical first-order predicate calculus even for statements which, due to their use of tense, modality, adverbial constructions, propositional arguments (e.g. "Sally said that X"), etc., would have seemed intractable. This is an advantage because predicate calculus is simpler and better understood than the alternatives (higher-order logics, modal logics, temporal logics, etc.), and better automated tools (e.g. automated theorem provers and model checkers) exist for manipulating it.
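A small illustration of this kind of first-order manipulation: because the representation of "John chased the duck furiously" is an existentially quantified conjunction, it entails any weaker conjunction obtained by dropping conjuncts, such as "John chased the duck". The sketch below is a brute-force check under the assumption that both premise and hypothesis are existential conjunctions of atoms; it is not a general theorem prover, and the function name `entails` is hypothetical.

```python
from itertools import product


def entails(premise, hypothesis, hyp_vars):
    """Return True if some mapping of hyp_vars to premise terms turns
    every hypothesis atom into an atom of the premise."""
    facts = set(premise)
    terms = sorted({t for _, args in premise for t in args})
    for choice in product(terms, repeat=len(hyp_vars)):
        binding = dict(zip(hyp_vars, choice))
        mapped = {
            (pred, tuple(binding.get(a, a) for a in args))
            for pred, args in hypothesis
        }
        if mapped <= facts:
            return True
    return False


# Premise: "John chased the duck furiously", with e as the event term.
premise = [
    ("chasing", ("e",)),
    ("past_tense", ("e",)),
    ("actor", ("e", "John")),
    ("furiously", ("e",)),
    ("patient", ("e", "duck")),
]

# Hypothesis: "John chased the duck" (the adverb is dropped).
john_chased = [
    ("chasing", ("x",)),
    ("actor", ("x", "John")),
    ("patient", ("x", "duck")),
]

# Hypothesis: "Sally chased the duck" (not supported by the premise).
sally_chased = [
    ("chasing", ("x",)),
    ("actor", ("x", "Sally")),
    ("patient", ("x", "duck")),
]

print(entails(premise, john_chased, ["x"]))   # True
print(entails(premise, sally_chased, ["x"]))  # False
```

Without the event variable, the adverb "furiously" would have to modify the predicate itself, and this inference would not reduce to simple conjunct elimination.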

Meaning representations can be used for other purposes besides the application of first-order logic; one example is the automatic discovery of synonymous phrases.[4][5]
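The intuition behind such discovery can be sketched with a toy example: two predicates that tend to occur with the same actor and patient fillers are candidates for synonymy. The code below is not the method of either cited paper; it substitutes a plain Jaccard overlap for their statistical models, and the event triples, function names, and predicate names are invented for illustration.

```python
from collections import defaultdict


def filler_sets(events):
    """Map each predicate to the set of (actor, patient) pairs seen with it."""
    fillers = defaultdict(set)
    for predicate, actor, patient in events:
        fillers[predicate].add((actor, patient))
    return fillers


def jaccard(a, b):
    """Overlap of two filler sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Invented toy events in (predicate, actor, patient) form.
events = [
    ("wrote", "Austen", "Emma"),
    ("authored", "Austen", "Emma"),
    ("wrote", "Orwell", "1984"),
    ("authored", "Orwell", "1984"),
    ("read", "Sally", "Emma"),
]

fillers = filler_sets(events)
print(jaccard(fillers["wrote"], fillers["authored"]))  # 1.0
print(jaccard(fillers["wrote"], fillers["read"]))      # 0.0
```

The comparison is only possible because the actor and patient of each event have been made explicit as separate arguments.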

The meaning representations are sometimes called quasi-logical forms, and the existential variables are sometimes treated as Skolem constants.
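Treating the existential variables as Skolem constants amounts to replacing each variable with a fresh constant symbol and dropping the quantifier, which leaves a set of ground atoms. A minimal sketch, assuming conjuncts are stored as (predicate, arguments) pairs; the function name `skolemize` and the `sk` naming scheme are hypothetical.

```python
from itertools import count


def skolemize(variables, conjuncts, counter, prefix="sk"):
    """Replace each existential variable with a fresh constant symbol."""
    mapping = {v: f"{prefix}{next(counter)}" for v in variables}
    return [
        (pred, tuple(mapping.get(a, a) for a in args))
        for pred, args in conjuncts
    ]


# A fragment of the "Sally said John is mean" representation.
say = [
    ("saying", ("u",)),
    ("actor", ("u", "Sally")),
    ("that", ("u", "v")),
    ("mean", ("v",)),
]

counter = count(1)  # shared counter keeps constants unique across sentences
ground = skolemize(["u", "v"], say, counter)
print(ground)
```

Sharing one counter across sentences ensures that events from different statements never receive the same constant by accident.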

Not all natural language constructs admit a uniform translation to first-order logic. See donkey sentence for examples and a discussion.

Notes and References

  1. http://cs.iit.edu/~culotta/pubs/culotta15predicting.pdf
  2. Francesco Salvi, Manoel Horta Ribeiro, Riccardo Gallotti and Robert West, "On the Conversational Persuasiveness of Large Language Models: A Randomized Controlled Trial" (2024), arXiv:2403.14380 [cs.CY]
  3. https://cs.stanford.edu/~pliang/papers/executable-cacm2016.pdf
  4. Dekang Lin and Patrick Pantel, "DIRT – Discovery of Inference Rules from Text", (2001) KDD01-Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
  5. Hoifung Poon and Pedro Domingos "Unsupervised Semantic Parsing" (2009) EMNLP09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing