An operator precedence grammar is a kind of grammar for formal languages.
Technically, an operator precedence grammar is a context-free grammar that has the property (among others)that no production has either an empty right-hand side or two adjacent nonterminals in itsright-hand side. These properties allow precedence relations to bedefined between the terminals of the grammar. A parser that exploits these relations is considerably simpler than more general-purpose parsers, such as LALR parsers. Operator-precedence parsers can be constructed for a large class of context-free grammars.
Operator precedence grammars rely on the following three precedence relations between the terminals:
Relation | Meaning | ||
---|---|---|---|
a\lessdotb | yields precedence to | ||
a
b | has the same precedence as | ||
a\gtrdotb | takes precedence over |
These operator precedence relations allow to delimit the handles in the right sentential forms:
\lessdot
eq |
\gtrdot
a
eq |
b
b
eq |
a
b\gtrdota
a\lessdotb
a
eq |
a
a\gtrdota
Let us assume that between the terminals and there is always exactly one precedence relation. Suppose that $ is the end of the string.Then for all terminals we define:
\$\lessdotb
b\gtrdot\$
\lessdot
eq |
\gtrdot
For example, the following operator precedence relations canbe introduced for simple expressions:
\begin{array}{c|cccc} &id &+ &* &\$ \\ \hline id & &\gtrdot &\gtrdot &\gtrdot \\ + &\lessdot &\gtrdot &\lessdot &\gtrdot \\ * &\lessdot &\gtrdot &\gtrdot &\gtrdot \\ \$ &\lessdot &\lessdot &\lessdot & \end{array}
They follow from the following facts:
+\lessdot*
*\gtrdot+
+\gtrdot+
*\gtrdot*
The input string
id1+id2*id3
\$\lessdotid1\gtrdot+\lessdotid2\gtrdot*\lessdotid3\gtrdot\$
Having precedence relations allows to identify handles as follows:
\gtrdot
eq |
\lessdot
\lessdot
\gtrdot
It is generally not necessary to scan the entire sentential form to find the handle.
The algorithm below is from Aho et al.: If $ is on the top of the stack and ip points to $ then return else Let a be the top terminal on the stack, and b the symbol pointed to by ip if a
\lessdot
eq |
\gtrdot
\lessdot
An operator precedence parser usually does not store the precedence table with the relations, which can get rather large. Instead, precedence functions f and g are defined.They map terminal symbols to integers, and so the precedence relations between the symbols are implemented by numerical comparison: must hold if
a\lessdotb
Not every table of precedence relations has precedence functions, but in practice for most grammars such functions can be designed.
The below algorithm is from Aho et al.:
a
eq |
b
a\gtrdotb
Consider the following table (repeated from above):
\begin{array}{c|cccc} &id &+ &* &\$ \\ \hline id & &\gtrdot &\gtrdot &\gtrdot \\ + &\lessdot &\gtrdot &\lessdot &\gtrdot \\ * &\lessdot &\gtrdot &\gtrdot &\gtrdot \\ \$ &\lessdot &\lessdot &\lessdot & \end{array}
gid \ fid f* \ / g* / f+ | \ | g+ | | g$ f$
from which we extract the following precedence functions from the maximum heights in the directed acyclic graph:
id | + | $ | |||
---|---|---|---|---|---|
f | 4 | 2 | 4 | 0 | |
g | 5 | 1 | 3 | 0 |
The class of languages described by operator-precedence grammars, i.e., operator-precedence languages, is strictly contained in the class of deterministic context-free languages, and strictly contains visibly pushdown languages.
Operator-precedence languages enjoy many closure properties: union, intersection, complementation, concatenation, and they are the largest known class closed under all these operations and for which the emptiness problem is decidable. Another peculiar feature of operator-precedence languages is their local parsability, that enables efficient parallel parsing.
There are also characterizations based on an equivalent form of automata and monadic second-order logic.