Cross-serial dependencies explained

In linguistics, cross-serial dependencies (also called crossing dependencies by some authors[1]) occur when the lines representing the dependency relations between two series of words cross over each other.[2] They are of particular interest to linguists who wish to determine the syntactic structure of natural language; languages containing an arbitrary number of them are non-context-free. By this fact, Dutch[3] and Swiss-German[4] have been proven to be non-context-free.

Example

As Swiss-German allows verbs and their arguments to be ordered cross-serially, we have the following example, taken from Shieber:[4]

...mer em Hans es huus hälfed aastriiche.
...we Hans the house help paint.
That is, "we help Hans paint the house."

Notice that the sequential noun phrases em Hans (Hans) and es huus (the house), and the sequential verbs hälfed (help) and aastriiche (paint) both form two separate series of constituents. Notice also that the dative verb hälfed and the accusative verb aastriiche take the dative em Hans and accusative es huus as their arguments, respectively.
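Schematically, the two series pair up crosswise: the first verb takes the first noun phrase as its argument, and the second verb the second, so dependency lines drawn over the sentence must cross (the labels $NP_i$ and $V_i$ below are just for this illustration):

```latex
\underbrace{\text{em Hans}}_{NP_1}\;
\underbrace{\text{es huus}}_{NP_2}\;
\underbrace{\text{h\"alfed}}_{V_1}\;
\underbrace{\text{aastriiche}}_{V_2}
\qquad V_1 \rightarrow NP_1,\quad V_2 \rightarrow NP_2
```

The span linking $NP_1$ to $V_1$ overlaps the span linking $NP_2$ to $V_2$ without containing it, so the two links can only be drawn with crossing lines; in a nested (context-free-style) dependency the order would instead be $NP_1\, NP_2\, V_2\, V_1$.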

Non-context-freeness

Let $L_{SG}$ be the set of all Swiss-German sentences. We will prove mathematically that $L_{SG}$ is not context-free.

In Swiss-German sentences, the number of verbs of a grammatical case (dative or accusative) must match the number of objects of that case. Additionally, a sentence containing an arbitrary number of such objects is admissible (in principle). Hence, we can define the following formal language, a subset of $L_{SG}$:

$L = \{\text{Jan säit das mer (d'chind)}^m \text{ (em Hans)}^n \text{ es huus händ wele (laa)}^m \text{ (hälfe)}^n \text{ aastriiche} \mid m, n \geq 1\}$

(that is, "Jan says that we wanted to let the children help Hans paint the house"). Thus, we have $L = L_{SG} \cap L_r$, where $L_r$ is the regular language defined by

$L_r = \{\text{Jan säit das mer (d'chind)}^+ \text{ (em Hans)}^+ \text{ es huus händ wele (laa)}^+ \text{ (hälfe)}^+ \text{ aastriiche}\}$

where the superscript plus symbol means "one or more copies". Since the class of context-free languages is closed under intersection with regular languages, we need only prove that $L$ is not context-free ([5], pp. 130–135).

After a word substitution, $L$ is of the form $\{x\, a^m b^n\, y\, c^m d^n\, z \mid m, n \geq 1\}$. Since $L$ can be mapped to $L'$ by the homomorphism $x, y, z \mapsto \epsilon$; $a \mapsto a$; $b \mapsto b$; $c \mapsto c$; $d \mapsto d$, and since the context-free languages are closed under mappings from terminal symbols to terminal strings (that is, homomorphisms) ([5], pp. 130–135), we need only prove that $L'$ is not context-free.

$L' = \{a^m b^n c^m d^n \mid m, n \geq 1\}$ is a standard example of a non-context-free language ([5], p. 128). This can be shown by Ogden's lemma.
Suppose that $L'$ is generated by a context-free grammar, and let $p$ be the length required by Ogden's lemma. Consider the word $a^p b^p c^p d^p$ in the language, and mark the letters $b^p c^p$. Then the three conditions implied by Ogden's lemma cannot all be satisfied, a contradiction.
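In outline (a sketch of the standard case analysis, not a full proof), Ogden's lemma would yield a decomposition $a^p b^p c^p d^p = uvwxy$ in which $vx$ contains at least one marked letter, $vwx$ contains at most $p$ marked letters, and $u v^i w x^i y \in L'$ for every $i \geq 0$. Every placement of $v$ and $x$ then fails:

```latex
% If v or x straddled two blocks, pumping (i = 2) would destroy the
% shape a^* b^* c^* d^*, so v and x each lie within a single block.
\begin{itemize}
  \item $v \subseteq a^p$: preserving $|a| = |c|$ under pumping forces
        $x \subseteq c^p$, but then $vwx$ contains all $p$ marked $b$'s
        plus a marked $c$, exceeding $p$ marked letters.
  \item $v \subseteq b^p$: preserving $|b| = |d|$ forces $x \subseteq d^p$,
        but then $vwx$ contains all $p$ marked $c$'s plus a marked $b$,
        again exceeding $p$ marked letters.
  \item $v \subseteq c^p$ (or further right): the $a$-block lies entirely
        before $v$, so pumping changes $|c|$ but not $|a|$,
        breaking $|a| = |c|$.
\end{itemize}
```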
All known spoken languages which contain cross-serial dependencies can similarly be proved not to be context-free. This led to the abandonment of Generalized Phrase Structure Grammar once cross-serial dependencies were identified in natural languages in the 1980s.[6]

Treatment

Research on mildly context-sensitive languages has attempted to identify a narrower and more computationally tractable subclass of the context-sensitive languages that can capture the context sensitivity found in natural languages. For example, cross-serial dependencies can be expressed in linear context-free rewriting systems (LCFRS); one can write an LCFRS grammar for a language such as $L'$ above.[7] [8] [9]
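To illustrate the idea (a minimal sketch, not the grammar from the cited course notes), an LCFRS of fan-out 2 can derive $L'$ by letting one nonterminal yield the discontinuous string pair $(a^m, c^m)$ and another the pair $(b^n, d^n)$, which the start rule interleaves as $a^m b^n c^m d^n$. The following Python recognizer mimics that two-component derivation:

```python
import re

def recognize_L_prime(s: str) -> bool:
    """Recognize L' = {a^m b^n c^m d^n | m, n >= 1}.

    Mimics an LCFRS-style analysis of fan-out 2: one nonterminal
    contributes the pair (a^m, c^m), another the pair (b^n, d^n),
    and the start rule interleaves the four components.
    """
    match = re.fullmatch(r"(a+)(b+)(c+)(d+)", s)
    if not match:
        return False
    a_run, b_run, c_run, d_run = match.groups()
    # The cross-serial constraints: a's pair with c's, b's with d's.
    return len(a_run) == len(c_run) and len(b_run) == len(d_run)

print(recognize_L_prime("aabccd"))   # True:  m = 2, n = 1
print(recognize_L_prime("aabbccd"))  # False: two b's but only one d
```

A plain context-free grammar could enforce such matched pairs only when they nest (as in $a^m b^n c^n d^m$); the fan-out-2 nonterminals are what let an LCFRS carry each matched pair across the intervening material.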

Notes and References

  1. .
  2. Book: Jurafsky . Daniel . Martin . James H. . 978-0-13-095069-7 . Speech and Language Processing . 1st . 2000 . Prentice Hall . 473–495. .
  3. .
  4. .
  5. Book: Hopcroft . John E. . Ullman . Jeffrey D. . Introduction to Automata Theory, Languages, and Computation . Pearson Education . 1979 . 978-0-201-44124-6 . 1st . .
  6. Book: Gazdar . Gerald . Natural Language Parsing and Linguistic Theories . 1988 . 978-1-55608-056-2 . Studies in Linguistics and Philosophy . 35 . 69–94 . Applicability of Indexed Grammars to Natural Languages . 10.1007/978-94-009-1337-0_3.
  7. http://user.phil-fak.uni-duesseldorf.de/~kallmeyer/GrammarFormalisms/4nl-cfg.pdf
  8. http://user.phil-fak.uni-duesseldorf.de/~kallmeyer/GrammarFormalisms/4lcfrs-intro.pdf
  9. Book: Laura Kallmeyer. Parsing Beyond Context-Free Grammars. 2010. Springer Science & Business Media. 978-3-642-14846-0. 1–5.