Filtered-popping recursive transition network

A filtered-popping recursive transition network (FPRTN),[1] or simply filtered-popping network (FPN), is a recursive transition network (RTN)[2] extended with a map of states to keys, where returning from a subroutine jump requires the acceptor and return states to be mapped to the same key. RTNs are finite-state machines that can be seen as finite-state automata extended with a stack of return states; as well as consuming transitions and $\varepsilon$-transitions, RTNs may define call transitions. These transitions perform a subroutine jump by pushing the transition's target state onto the stack and bringing the machine to the called state. Each time an acceptor state is reached, the return state at the top of the stack is popped out, provided that the stack is not empty, and the machine is brought to this state.

Throughout this article we refer to filtered-popping recursive transition networks as FPNs, though this acronym is ambiguous (it is also used for fuzzy Petri nets, for example). "Filtered-popping network" and "FPRTN" are unambiguous alternatives.

Formal Definition

An FPN is a structure $(Q, K, \Sigma, \delta, \kappa, Q_I, F)$ where

$Q$ is a finite set of states,

$K$ is a finite set of keys,

$\Sigma$ is a finite input alphabet,

$\delta: Q \times (\Sigma \cup \{\varepsilon\} \cup Q) \to Q$ is a partial transition function, $\varepsilon$ being the empty symbol,

$\kappa: Q \to K$ is a map of states to keys,

$Q_I \subseteq Q$ is the set of initial states, and

$F \subseteq Q$ is the set of acceptance states.
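The seven components of the definition can be written down directly as a data structure. The following Python sketch is only an illustration: the class name `FPN`, its field names, the `check` method, and the use of `None` for the empty symbol are assumptions made here, not part of the definition. It also assumes that states and input symbols are disjoint, so that the label of a transition tells its type.

```python
from dataclasses import dataclass

EPSILON = None  # hypothetical stand-in for the empty symbol


@dataclass
class FPN:
    states: frozenset      # Q
    keys: frozenset        # K
    alphabet: frozenset    # Sigma
    delta: dict            # partial function: (source, label) -> target
    kappa: dict            # total map: state -> key
    initial: frozenset     # Q_I
    accepting: frozenset   # F

    def check(self):
        """Return True if the components fit the definition, else raise ValueError."""
        if self.states & self.alphabet:
            raise ValueError("this sketch assumes states and symbols are disjoint")
        # A label is an input symbol, the empty symbol, or a called state.
        labels = self.alphabet | self.states | {EPSILON}
        for (source, label), target in self.delta.items():
            if (source not in self.states or target not in self.states
                    or label not in labels):
                raise ValueError(f"bad transition {(source, label)!r} -> {target!r}")
        if set(self.kappa) != set(self.states):
            raise ValueError("kappa must be defined on every state")
        if not set(self.kappa.values()) <= self.keys:
            raise ValueError("kappa must map states to keys")
        if not (self.initial <= self.states and self.accepting <= self.states):
            raise ValueError("initial and acceptance states must be states")
        return True


# A small made-up network: state 0 calls state 2 and returns to state 1.
example = FPN(
    states=frozenset({0, 1, 2, 3, 4}),
    keys=frozenset({"x", "k", "other"}),
    alphabet=frozenset({"a"}),
    delta={(0, 2): 1, (2, "a"): 3, (3, "a"): 4},
    kappa={0: "x", 1: "k", 2: "x", 3: "k", 4: "other"},
    initial=frozenset({0}),
    accepting=frozenset({1, 3, 4}),
)
```

Storing $\delta$ as a dictionary keyed by (source, label) pairs mirrors its being a partial function: pairs that are absent from the dictionary are simply undefined.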

Transitions

Transitions represent the possibility of bringing the FPN from a source state $q_s$ to a target state $q_t$ by possibly performing an additional action. Depending on this action, we distinguish the following types of explicitly-defined transitions:

$\varepsilon$-transitions are transitions of the form $\delta(q_s, \varepsilon) \to q_t$ and perform no additional action,

consuming transitions are transitions of the form $\delta(q_s, \sigma) \to q_t$ and consume an input symbol $\sigma$, and

call transitions are transitions of the form $\delta(q_s, q_c) \to q_t$ and perform a subroutine jump to called state $q_c$ before reaching $q_t$.

The behaviour of call transitions is governed by two kinds of implicitly-defined transitions:

for each call transition $\delta(q_s, q_c) \to q_t$, the FPN implicitly defines a push transition that brings the machine from $q_s$ to $q_c$ by pushing $q_t$ onto the stack, and

for each pair of states $(q_f, q_r) \in F \times Q$, the FPN implicitly defines a pop transition that brings the machine from $q_f$ to $q_r$ by popping $q_r$ from the stack iff $q_r$ is the state at the top of the stack and $\kappa(q_f) = \kappa(q_r)$.

Push transitions initiate subroutine jumps and pop transitions are equivalent to return statements.
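The interplay of explicit and implicit transitions can be sketched as a small recognizer in Python. This is a minimal illustration under stated assumptions rather than a reference implementation: the function name `accepts`, its parameters, and the `max_depth` bound on the stack are hypothetical, states and input symbols are assumed to be disjoint, and the acceptance condition (an acceptance state reached with an empty stack once the whole input is consumed) is the usual one for RTNs and is assumed here, since the text above does not state it.

```python
from collections import deque

EPSILON = None  # hypothetical stand-in for the empty symbol


def accepts(delta, kappa, initial, accepting, alphabet, word, max_depth=50):
    """Breadth-first search over configurations (state, input position, stack)."""
    start = [(q, 0, ()) for q in initial]
    seen = set(start)
    queue = deque(start)
    while queue:
        state, pos, stack = queue.popleft()
        # Assumed acceptance condition: acceptance state, empty stack, input consumed.
        if state in accepting and not stack and pos == len(word):
            return True
        successors = []
        for (source, label), target in delta.items():
            if source != state:
                continue
            if label is EPSILON:
                # epsilon-transition: no additional action
                successors.append((target, pos, stack))
            elif label in alphabet:
                # consuming transition: read one input symbol
                if pos < len(word) and word[pos] == label:
                    successors.append((target, pos + 1, stack))
            elif len(stack) < max_depth:
                # call transition: implicit push of the target, jump to the called state
                successors.append((label, pos, stack + (target,)))
        if state in accepting and stack:
            # implicit pop, filtered by the key map
            return_state = stack[-1]
            if kappa[state] == kappa[return_state]:
                successors.append((return_state, pos, stack[:-1]))
        for configuration in successors:
            if configuration not in seen:
                seen.add(configuration)
                queue.append(configuration)
    return False


# Made-up network: state 0 calls state 2 and returns to state 1.
delta = {(0, 2): 1, (2, "a"): 3, (3, "a"): 4}
accepting = {1, 3, 4}
filtered = {0: "x", 1: "k", 2: "x", 3: "k", 4: "other"}
unfiltered = {q: "k" for q in range(5)}  # every pop allowed, as in a plain RTN

short_ok = accepts(delta, filtered, {0}, accepting, {"a"}, "a")
long_blocked = accepts(delta, filtered, {0}, accepting, {"a"}, "aa")
long_plain = accepts(delta, unfiltered, {0}, accepting, {"a"}, "aa")
```

In this example the called subroutine can finish at state 3 or at state 4. With the `filtered` key map only state 3 shares a key with the return state 1, so returning from state 4 is blocked and `"aa"` is rejected; with a constant key map the network behaves like a plain RTN and `"aa"` is accepted. The `max_depth` bound only keeps the search finite for recursive calls and is not part of the formalism.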

Purpose

A (natural language) text can be enriched with meta-information by the application of an RTN with output; for instance, an RTN inserting XML tags can be used for transforming a plain text into a structured XML document. An RTN with output representing a natural language grammar would delimit and add the syntactic structure of each text sentence (see parsing). Other RTNs with output could simply mark text segments containing relevant information (see information extraction). The application of an RTN with output representing an ambiguous grammar results in a set of possible translations or interpretations of the input. Computing this set has an exponential worst-case cost, even for an Earley parser for RTNs with output,[3] because the number of translations can grow exponentially with the input length; for instance, the number of interpretations of a natural language sentence grows exponentially with the number of unresolved prepositional phrase attachments.[4][5]

FPNs serve as a compact representation of this set of translations that can be computed in cubic time by means of an Earley-like parser.[1] FPN states correspond to execution states (see instruction steps) of an Earley parser for RTNs without output, and FPN transitions correspond to possible translations of input symbols. The $\kappa$ map of the resulting FPN gives the correspondence between the represented output segments and the recognized input segments: given a recognized input sequence $\sigma_1 \ldots \sigma_l$ and an FPN path $p$ starting at a state $q$ and ending at a state $q'$, $p$ represents a possible translation of the input segment $\sigma_{\kappa(q)+1} \ldots \sigma_{\kappa(q')}$. The filtered-popping feature is required to prevent FPN paths from representing translations of disconnected or overlapping input segments: an FPN call may contain several translation paths from the called state to an acceptor state, where the input segments they correspond to share the same start point but do not necessarily have the same length. Only return states corresponding to the same input point as the acceptor state finishing the call are valid return states.
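The index convention for input segments can be made concrete with a short Python sketch. It assumes, for illustration only, that keys are integers denoting input positions (0 before the first symbol, $l$ after the last); the function name `translated_segment` and the sample sentence are hypothetical.

```python
def translated_segment(symbols, key_start, key_end):
    """Return the input segment sigma_{key_start + 1} ... sigma_{key_end}.

    With 0-based Python lists, that 1-based range is the slice
    symbols[key_start:key_end].
    """
    return symbols[key_start:key_end]


sentence = ["the", "girl", "saw", "the", "monkey"]

# A path from a state with key 1 to a state with key 3 covers sigma_2 sigma_3.
middle = translated_segment(sentence, 1, 3)

# Two paths of the same call: same start point, different lengths.
shorter = translated_segment(sentence, 3, 4)
longer = translated_segment(sentence, 3, 5)
```

Here `shorter` and `longer` start at the same input point but end at positions 4 and 5. Under the assumption above, a return state with key 5 can only continue the path that produced `longer`, which is the situation the filtered-popping condition enforces.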

Notes and References

  1. Javier M. Sastre, "Efficient parsing using filtered-popping recursive transition networks", Lecture Notes in Artificial Intelligence, 5642:241-244, 2009
  2. William A. Woods, "Transition network grammars for natural language analysis", Communications of the ACM, ACM Press, 13:10:591-606, 1970
  3. Javier M. Sastre & Mikel L. Forcada, "Efficient parsing using recursive transition networks with output", Lecture Notes in Computer Science, 5603:192-204, 2009
  4. Adwait Ratnaparkhi, "Statistical models for unsupervised prepositional phrase attachment", ACL-36: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, pp. 1079-1085, 1998
  5. Miriam Butt, "Chunk/Shallow parsing", lecture notes, 2002