Generalized context-free grammar (GCFG) is a grammar formalism that expands on context-free grammars by adding potentially non-context-free composition functions to rewrite rules.[1] Head grammar (and its weak equivalents) is an instance of such a GCFG which is known to be especially adept at handling a wide variety of non-CF properties of natural language.
A GCFG consists of two components: a set of composition functions that combine string tuples, and a set of rewrite rules. The composition functions all have the form
f(\langlex1,...,xm\rangle,\langley1,...,yn\rangle,...)=\gamma
\gamma
X\tof(Y,Z,...)
Y
Z
The rewrite semantics of GCFGs is fairly straightforward. An occurrence of a non-terminal symbol is rewritten using rewrite rules as in a context-free grammar, eventually yielding just compositions (composition functions applied to string tuples or other compositions). The composition functions are then applied, successively reducing the tuples to a single tuple.
A simple translation of a context-free grammar into a GCFG can be performed in the following fashion. Given the grammar in, which generates the palindrome language
\{wwR:w\in\{a,b\}*\}
wR
w
The CF production of is
and the corresponding GCFG production is
S\toconc(\langlea\rangle,S,\langlea\rangle)
conc(\langlea\rangle,conc(\langleb\rangle,S,\langleb\rangle),\langlea\rangle)
conc(\langlea\rangle,conc(\langleb\rangle,conc(\langleb\rangle,S,\langleb\rangle),\langleb\rangle),\langlea\rangle)
conc(\langlea\rangle,conc(\langleb\rangle,conc(\langleb\rangle,conc(\langle\epsilon\rangle,\langle\epsilon\rangle,\langle\epsilon\rangle),\langleb\rangle),\langleb\rangle),\langlea\rangle)
conc(\langlea\rangle,conc(\langleb\rangle,conc(\langleb\rangle,\langle\epsilon\rangle,\langleb\rangle),\langleb\rangle),\langlea\rangle)
conc(\langlea\rangle,conc(\langleb\rangle,\langlebb\rangle,\langleb\rangle),\langlea\rangle)
conc(\langlea\rangle,\langlebbbb\rangle,\langlea\rangle)
\langleabbbba\rangle
Weir (1988) describes two properties of composition functions, linearity and regularity. A function defined as
f(x1,...,xn)=...
f(x)=g(x,y)
f(x)=g(x,x)
f(x1,...,xn)=...
f(x,y)=g(y,x)
f(x)=g(x,y)
f(x,y)=g(x)
A grammar in which all composition functions are both linear and regular is called a Linear Context-free Rewriting System (LCFRS). LCFRS is a proper subclass of the GCFGs, i.e. it has strictly less computational power than the GCFGs as a whole.
On the other hand, LCFRSs are strictly more expressive than linear-indexed grammars and their weakly equivalent variant tree adjoining grammars (TAGs).[2] Head grammar is another example of an LCFRS that is strictly less powerful than the class of LCFRSs as a whole.
LCFRS are weakly equivalent to (set-local) multicomponent TAGs (MCTAGs) and also with multiple context-free grammar (MCFGs http://www.labri.fr/perso/salvati/downloads/cours/esslli/course1.pdf).[3] and minimalist grammars (MGs). The languages generated by LCFRS (and their weakly equivalents) can be parsed in polynomial time.[4]