In the theory of formal languages of computer science, mathematics, and linguistics, a Dyck word is a balanced string of brackets.The set of Dyck words forms a Dyck language. The simplest, D1, uses just two matching brackets, e.g. (and).
Dyck words and language are named after the mathematician Walther von Dyck. They have applications in the parsing of expressions that must have a correctly nested sequence of brackets, such as arithmetic or algebraic expressions.
Let
\Sigma=\{[,]\}
\Sigma*
\{u\in\Sigma*\vertallprefixesofucontainnomore]'sthan['sandthenumberof['sinuequalsthenumberof]'s\}.
It may be helpful to define the Dyck language via a context-free grammar in some situations.The Dyck language is generated by the context-free grammar with a single non-terminal, and the production:
That is, S is either the empty string or is "[", an element of the Dyck language, the matching "]", and an element of the Dyck language.
An alternative context-free grammar for the Dyck language is given by the production:
That is, S is zero or more occurrences of the combination of "[", an element of the Dyck language, and a matching "]", where multiple elements of the Dyck language on the right side of the production are free to differ from each other.
In yet other contexts it may instead be helpful to define the Dyck language by splitting
\Sigma*
u\in\Sigma*
|u|
\operatorname{insert}:\Sigma* x N → \Sigma*
\operatorname{delete}:\Sigma* x N → \Sigma*
\operatorname{insert}(u,j)
u
[]
j
\operatorname{delete}(u,j)
u
[]
j
with the understanding that
\operatorname{insert}(u,j)
j>|u|
\operatorname{delete}(u,j)
j>|u|-2
R
\Sigma*
a,b\in\Sigma*
(a,b)\inR
\operatorname{insert}
\operatorname{delete}
a
b
R
\operatorname{insert}
\operatorname{delete}
The equivalence relation partitions the language
\Sigma*
\epsilon
\operatorname{Cl}(\epsilon)
\Sigma*
\Sigma*/R
\operatorname{Cl}(\epsilon)
1
u=\operatorname{Cl}([)
v=\operatorname{Cl}(])
uv=\operatorname{Cl}([])=1\ne\operatorname{Cl}(][)=vu
uv=1
u
v
\Sigma*/R
\operatorname{Cl}([)
\operatorname{Cl}(])
TC0
[ ]
\operatorname{N}(n,k)
Cn
Cn=
n | |
\sum | |
k=1 |
\operatorname{N}(n,k)
We can define an equivalence relation
L
l{D}
u,v\inl{D}
(u,v)\inL
|u|=|v|
u
v
l{D}/L=\{l{D}0,l{D}1,\ldots\}
l{D}=l{D}0\cupl{D}2\cupl{D}4\cup\ldots=
infty | |
cup | |
n=0 |
l{D}n
l{D}n=\{u\inl{D}\mid|u|=n\}
l{D}n
n
Having introduced the Dyck words of length
n
n\inN
Sn
l{D}n
u,v\inl{D}n
(u,v)\inSn
v
u
u\inl{D}n
n\inN
Sn
l{D}n
Sn
u
u
u
v
v
w
u
w
Sn
\sigman:l{D}n → N
v
u
\sigman(u)=\sumvw=u((countof['sinv)-(countof]'sinv))
The following table illustrates that
\sigman
partial sums of \sigman(u) | P | P-1 | P | Q |
---|---|---|---|---|
u | \ldots | ] | [|| <math>\ldots</math>
|-
! <math>u'</math>
| <math>\ldots</math> || [ || ] || \ldots \sigman(u') P P+1 P Q Hence \sigman(u')-\sigman(u)=2>0 \sigman(u)<\sigman(u') u u' (u,v),(v,u)\inSn u\nev u v \sigman(u)<\sigman(v)<\sigman(u) (u,v) (v,u) Sn u=v Sn The partial ordered set D8 GeneralizationsThere exist variants of the Dyck language with multiple delimiters, e.g., D2 on the alphabet "(", ")", "[", and "]". The words of such a language are the ones which are well-parenthesized for all delimiters, i.e., one can read the word from left to right, push every opening delimiter on the stack, and whenever we reach a closing delimiter then we must be able to pop the matching opening delimiter from the top of the stack. (The counting algorithm above does not generalise). See also
References |