In combinatorics, the symbolic method is a technique for counting combinatorial objects. It uses the internal structure of the objects to derive formulas for their generating functions. The method is mostly associated with Philippe Flajolet and is detailed in Part A of his book with Robert Sedgewick, Analytic Combinatorics, while the rest of the book explains how to use complex analysis in order to get asymptotic and probabilistic results on the corresponding generating functions.
For two centuries, generating functions appeared via the corresponding recurrences on their coefficients (as can be seen in the seminal works of Bernoulli, Euler, Arthur Cayley, Schröder, Ramanujan, Riordan, Knuth, etc.). It was then slowly realized that generating functions capture many other facets of the initial discrete combinatorial objects, and that this can be done in a more direct formal way: the recursive nature of some combinatorial structures translates, via some isomorphisms, into noteworthy identities on the corresponding generating functions. Following the works of Pólya, further advances were made in this spirit in the 1970s with generic uses of languages for specifying combinatorial classes and their generating functions, as found in works by Foata and Schützenberger[1] on permutations, Bender and Goldman on prefabs,[2] and Joyal on combinatorial species.[3]
Note that this symbolic method in enumeration is unrelated to "Blissard's symbolic method", which is just another old name for umbral calculus.
The symbolic method in combinatorics constitutes the first step of many analyses of combinatorial structures. These analyses can then lead to fast computation schemes, to asymptotic properties and limit laws, and to random generation, all of which are amenable to automation via computer algebra.
Consider the problem of distributing objects, described by a generating function, into a set of n slots, where a permutation group G of degree n acts on the slots and so induces an equivalence relation on filled slot configurations. We ask for the generating function of the configurations by weight, counted up to this equivalence, where the weight of a configuration is the sum of the weights of the objects in the slots. We first explain how to solve this problem in the labelled and the unlabelled case, and then use the solution to motivate the definition of classes of combinatorial structures.
The Pólya enumeration theorem (PET) solves this problem in the unlabelled case. Let $f(z)$ be the ordinary generating function (OGF) of the objects. Then the OGF of the configurations is given by the substituted cycle index
$$Z(G)(f(z), f(z^2), \ldots, f(z^n)).$$
In the labelled case we use an exponential generating function (EGF) $g(z)$ of the objects and apply the labelled enumeration theorem, which says that the EGF of the configurations is given by
$$\frac{g(z)^n}{|G|}.$$
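We are able to enumerate filled slot configurations using either PET in the unlabelled case or the labelled enumeration theorem in the labelled case. We now ask about the generating function of configurations obtained when there is more than one set of slots, with a permutation group acting on each. Clearly the orbits do not intersect and we may add the respective generating functions. Suppose, for example, that we want to enumerate unlabelled sequences of length two or three of some objects contained in a set $X$. There are two sets of slots, the first containing two slots and the second containing three. The group acting on the first set is $E_2$, and the group acting on the second is $E_3$. We represent this by the following formal power series in $X$:
$$X^2/E_2 + X^3/E_3$$
where the term $X^n/G$ denotes the set of orbits under $G$, and $X^n = X \times \cdots \times X$ denotes the process of distributing the objects from $X$, with repetition, into the $n$ slots. Similarly, consider the labelled problem of creating cycles of arbitrary length from a set of labelled objects $X$. This yields the following series of actions of cyclic groups:
$$X/C_1 + X^2/C_2 + X^3/C_3 + X^4/C_4 + \cdots.$$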
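Clearly we can assign meaning to any such power series of quotients (orbits) with respect to permutation groups, where we restrict the groups of degree $n$ to the conjugacy classes $\operatorname{Cl}(S_n)$ of the symmetric group $S_n$, which form a unique factorization domain. (The orbits with respect to two groups from the same conjugacy class are isomorphic.) This motivates the following definition.
A class $\mathcal{C} \in \mathbb{N}[\mathfrak{A}]$ of combinatorial structures is a formal series
$$\mathcal{C} = \sum_{n \ge 1} \sum_{G \in \operatorname{Cl}(S_n)} c_G \, (X^n/G)$$
where $\mathfrak{A}$ (the "A" is for "atoms") is the set of primes of the UFD $\{\operatorname{Cl}(S_n)\}_{n \ge 1}$ and $c_G \in \mathbb{N}$.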
In the following we will simplify our notation a bit and write, e.g.,
$$E_2 + E_3 \quad\text{and}\quad C_1 + C_2 + C_3 + \cdots$$
for the classes mentioned above.
A theorem in the Flajolet–Sedgewick theory of symbolic combinatorics treats the enumeration problem of labelled and unlabelled combinatorial classes by creating symbolic operators that translate equations involving combinatorial structures directly (and automatically) into equations in the generating functions of these structures.
Let $\mathcal{C} \in \mathbb{N}[\mathfrak{A}]$ be a class of combinatorial structures. The OGF $F(z)$ of $\mathcal{C}(X)$, where $X$ has OGF $f(z)$, and the EGF $G(z)$ of $\mathcal{C}(X)$, where $X$ is labelled with EGF $g(z)$, are given by
$$F(z) = \sum_{n \ge 1} \sum_{G \in \operatorname{Cl}(S_n)} c_G \, Z(G)(f(z), f(z^2), \ldots, f(z^n))$$
and
$$G(z) = \sum_{n \ge 1} \left( \sum_{G \in \operatorname{Cl}(S_n)} \frac{c_G}{|G|} \right) g(z)^n.$$
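In the labelled case we have the additional requirement that $X$ not contain elements of size zero. It will sometimes prove convenient to add one to $G(z)$ to indicate the presence of one copy of the empty set. It is possible to assign meaning to both $\mathcal{C} \in \mathbb{Z}[\mathfrak{A}]$ (the most common example is the case of unlabelled sets) and $\mathcal{C} \in \mathbb{Q}[\mathfrak{A}]$.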
The power of this theorem lies in the fact that it makes it possible to construct operators on generating functions that represent combinatorial classes. A structural equation between combinatorial classes thus translates directly into an equation in the corresponding generating functions. Moreover, in the labelled case it is evident from the formula that we may replace $g(z)$ by the atom $z$ and compute the resulting operator, which may then be applied to EGFs. We now proceed to construct the most important operators.
The sequence operator corresponds to the class
$$1 + E_1 + E_2 + E_3 + \cdots$$
and represents sequences, i.e. the slots are not being permuted and there is exactly one empty sequence. We have
$$F(z) = 1 + \sum_{n \ge 1} Z(E_n)(f(z), f(z^2), \ldots, f(z^n)) = 1 + \sum_{n \ge 1} f(z)^n = \frac{1}{1 - f(z)}$$
and
$$G(z) = 1 + \sum_{n \ge 1} \left( \frac{1}{|E_n|} \right) g(z)^n = \frac{1}{1 - g(z)}.$$
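The sequence formula $1/(1 - f(z))$ can be checked numerically on truncated series. The following is a minimal sketch, not a definitive implementation: the helper name `seq_ogf` and the example class (the positive integers, one object of each size) are assumptions made for illustration only.

```python
def seq_ogf(f, n_terms):
    """Coefficients of 1/(1 - f(z)) up to z^(n_terms - 1); requires f[0] == 0."""
    if f[0] != 0:
        raise ValueError("f must not count any object of size zero")
    a = [0] * n_terms
    a[0] = 1  # the single empty sequence
    for n in range(1, n_terms):
        # A(z) = 1 + f(z) * A(z) gives a_n = sum_{k=1..n} f_k * a_{n-k}.
        a[n] = sum(f[k] * a[n - k] for k in range(1, min(n, len(f) - 1) + 1))
    return a


# Hypothetical example: one object of each size 1, 2, 3, ...
# Sequences of such objects are the compositions of an integer.
positive_integers = [0] + [1] * 9
compositions = seq_ogf(positive_integers, 10)
print(compositions)
```

With this input the coefficients count integer compositions, which gives a quick sanity check of the operator against a well-known sequence.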
The cycle operator corresponds to the class
$$C_1 + C_2 + C_3 + \cdots$$
i.e., cycles containing at least one object. We have
$$F(z) = \sum_{n \ge 1} Z(C_n)(f(z), f(z^2), \ldots, f(z^n)) = \sum_{n \ge 1} \frac{1}{n} \sum_{d \mid n} \varphi(d) \, f(z^d)^{n/d}$$
or
$$F(z) = \sum_{k \ge 1} \varphi(k) \sum_{m \ge 1} \frac{1}{km} f(z^k)^m = \sum_{k \ge 1} \frac{\varphi(k)}{k} \log \frac{1}{1 - f(z^k)}$$
and
$$G(z) = \sum_{n \ge 1} \left( \frac{1}{|C_n|} \right) g(z)^n = \log \frac{1}{1 - g(z)}.$$
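As an illustration of the unlabelled cycle formula, take the hypothetical object class $f(z) = cz$ (that is, $c$ colours of size one). Then $f(z^d)^{n/d} = c^{n/d} z^n$, so the coefficient of $z^n$ in $F(z)$ is $\frac{1}{n}\sum_{d \mid n} \varphi(d)\, c^{n/d}$, the number of necklaces of length $n$. The sketch below assumes this special case; the function names are illustrative.

```python
from math import gcd


def euler_phi(n):
    """Euler's totient, computed directly from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def necklaces(n, colors):
    """Coefficient of z^n in the cycle operator applied to f(z) = colors * z."""
    total = sum(
        euler_phi(d) * colors ** (n // d)
        for d in range(1, n + 1)
        if n % d == 0
    )
    return total // n  # the sum is always divisible by n


print([necklaces(n, 2) for n in range(1, 7)])
```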
The cycle operator, together with the set operator and their restrictions to specific degrees, is used to compute random permutation statistics. There are two useful restrictions of this operator, namely to even and odd cycles.
The labelled even cycle operator is
$$C_2 + C_4 + C_6 + \cdots$$
which yields
$$G(z) = \sum_{n \ge 1} \left( \frac{1}{|C_{2n}|} \right) g(z)^{2n} = \frac{1}{2} \log \frac{1}{1 - g(z)^2}.$$
This implies that the labelled odd cycle operator
$$C_1 + C_3 + C_5 + \cdots$$
is given by
$$G(z) = \log \frac{1}{1 - g(z)} - \frac{1}{2} \log \frac{1}{1 - g(z)^2} = \frac{1}{2} \log \frac{1 + g(z)}{1 - g(z)}.$$
The multiset/set operator corresponds to the series
$$1 + S_1 + S_2 + S_3 + \cdots$$
i.e., the symmetric group is applied to the slots. This creates multisets in the unlabelled case and sets in the labelled case (there are no multisets in the labelled case because the labels distinguish multiple instances of the same object from the set being put into different slots). We include the empty set in both the labelled and the unlabelled case.
The unlabelled case is done using the function
$$M(f(z), y) = \sum_{n \ge 0} y^n Z(S_n)(f(z), f(z^2), \ldots, f(z^n))$$
so that
$$\mathfrak{M}(f(z)) = M(f(z), 1).$$
Evaluating $M(f(z), 1)$ we obtain
$$F(z) = \exp\left( \sum_{\ell \ge 1} \frac{f(z^\ell)}{\ell} \right).$$
For the labelled case we have
$$G(z) = 1 + \sum_{n \ge 1} \left( \frac{1}{|S_n|} \right) g(z)^n = \sum_{n \ge 0} \frac{g(z)^n}{n!} = \exp g(z).$$
In the labelled case we denote the operator by $\operatorname{SET}$, and in the unlabelled case by $\operatorname{MSET}$. This is because in the labelled case there are no multisets (the labels distinguish the constituents of a compound combinatorial class), whereas in the unlabelled case there are both multisets and sets, with the latter being given by
$$F(z) = \exp\left( \sum_{\ell \ge 1} (-1)^{\ell - 1} \frac{f(z^\ell)}{\ell} \right).$$
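Typically, one starts with the neutral class $\mathcal{E}$, containing a single object of size 0 (the neutral object, often denoted by $\epsilon$), and one or more atomic classes $\mathcal{Z}$, each containing a single object of size 1. Next, set-theoretic relations involving various simple operations, such as disjoint unions, products, sets, sequences, and multisets, define more complex classes in terms of the already defined classes. These relations may be recursive. The elegance of symbolic combinatorics lies in the fact that the set-theoretic, or symbolic, relations translate directly into algebraic relations involving the generating functions.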
In this article, we will follow the convention of using script uppercase letters to denote combinatorial classes and the corresponding plain letters for the generating functions (so the class $\mathcal{A}$ has generating function $A(z)$).
There are two types of generating functions commonly used in symbolic combinatorics—ordinary generating functions, used for combinatorial classes of unlabelled objects, and exponential generating functions, used for classes of labelled objects.
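It is straightforward to show that the generating functions (either ordinary or exponential) for $\mathcal{E}$ and $\mathcal{Z}$ are $E(z) = 1$ and $Z(z) = z$, respectively. The disjoint union is also simple: for disjoint sets $\mathcal{B}$ and $\mathcal{C}$, $\mathcal{A} = \mathcal{B} \cup \mathcal{C}$ implies $A(z) = B(z) + C(z)$. The relations corresponding to other operations depend on whether we are talking about labelled or unlabelled structures (and ordinary or exponential generating functions).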
The restriction of unions to disjoint unions is an important one; however, in the formal specification of symbolic combinatorics, it is too much trouble to keep track of which sets are disjoint. Instead, we make use of a construction that guarantees there is no intersection (be careful, however; this affects the semantics of the operation as well). In defining the combinatorial sum of two sets $\mathcal{A}$ and $\mathcal{B}$, we mark members of each set with a distinct marker, for example $\circ$ for members of $\mathcal{A}$ and $\bullet$ for members of $\mathcal{B}$. The combinatorial sum is then:
$$\mathcal{A} + \mathcal{B} = (\mathcal{A} \times \{\circ\}) \cup (\mathcal{B} \times \{\bullet\})$$
This is the operation that formally corresponds to addition.
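The marker construction can be mimicked directly with tagged pairs. The sketch below is only an illustration: the marker strings `"circ"` and `"bullet"` and the two small word classes are hypothetical choices, not part of the formal definition.

```python
def combinatorial_sum(class_a, class_b):
    """Disjoint union: tag each member with a marker naming its origin."""
    return {(a, "circ") for a in class_a} | {(b, "bullet") for b in class_b}


# Hypothetical classes of words (size = word length); "c" lies in both.
class_a = {"ab", "c"}
class_b = {"c", "dd"}

plain_union = class_a | class_b                    # "c" is counted once
tagged_sum = combinatorial_sum(class_a, class_b)   # "c" is counted twice
print(len(plain_union), len(tagged_sum))
```

Because the shared element survives twice in the tagged sum, the counts of the two classes simply add, which is what makes the generating functions add as well.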
With unlabelled structures, an ordinary generating function (OGF) is used. The OGF of a sequence $A_n$ is defined as
$$A(x) = \sum_{n=0}^{\infty} A_n x^n.$$
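The product of two combinatorial classes $\mathcal{A}$ and $\mathcal{B}$ is specified by defining the size of an ordered pair as the sum of the sizes of the elements in the pair. Thus, for $a \in \mathcal{A}$ and $b \in \mathcal{B}$, we have $|(a, b)| = |a| + |b|$. It follows that the number of elements in $\mathcal{A} \times \mathcal{B}$ of size $n$ is
$$\sum_{k=0}^{n} A_k B_{n-k}.$$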
Using the definition of the OGF and some elementary algebra, we can show that $\mathcal{A} = \mathcal{B} \times \mathcal{C}$ implies
$$A(z) = B(z) \cdot C(z).$$
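The product rule can be verified on small finite classes by counting ordered pairs directly and comparing with the Cauchy product of the counting sequences. This is a sketch under stated assumptions: the two word classes and the helper names are hypothetical, and the size of a word is taken to be its length.

```python
from collections import Counter
from itertools import product


def ogf_product(b, c):
    """Cauchy product: a_n = sum_k b_k * c_(n-k)."""
    a = [0] * (len(b) + len(c) - 1)
    for i, b_i in enumerate(b):
        for j, c_j in enumerate(c):
            a[i + j] += b_i * c_j
    return a


def counting_sequence(objects, size):
    """Number of objects of each size 0, 1, 2, ... in a finite class."""
    counts = Counter(size(obj) for obj in objects)
    return [counts[n] for n in range(max(counts) + 1)]


# Hypothetical finite classes of words.
class_b = ["", "a", "aa"]
class_c = ["", "x", "y"]
pairs = list(product(class_b, class_c))

b_seq = counting_sequence(class_b, len)
c_seq = counting_sequence(class_c, len)
# The size of a pair is the sum of the sizes of its components.
pair_seq = counting_sequence(pairs, lambda p: len(p[0]) + len(p[1]))
print(pair_seq, ogf_product(b_seq, c_seq))
```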
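The sequence construction, denoted by $\mathcal{A} = \mathfrak{G}\{\mathcal{B}\}$, is defined as
$$\mathfrak{G}\{\mathcal{B}\} = \mathcal{E} + \mathcal{B} + (\mathcal{B} \times \mathcal{B}) + (\mathcal{B} \times \mathcal{B} \times \mathcal{B}) + \cdots.$$
In other words, a sequence is the neutral element, or an element of $\mathcal{B}$, or an ordered pair, ordered triple, etc. This leads to the relation
$$A(z) = 1 + B(z) + B(z)^2 + B(z)^3 + \cdots = \frac{1}{1 - B(z)}.$$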
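The set (or powerset) construction, denoted by $\mathcal{A} = \mathfrak{P}\{\mathcal{B}\}$, is defined as
$$\mathfrak{P}\{\mathcal{B}\} = \prod_{\beta \in \mathcal{B}} (\mathcal{E} + \{\beta\}),$$
which leads to the relation
$$\begin{align}
A(z) &= \prod_{\beta \in \mathcal{B}} (1 + z^{|\beta|}) \\
&= \prod_{n=1}^{\infty} (1 + z^n)^{B_n} \\
&= \exp\left( \ln \prod_{n=1}^{\infty} (1 + z^n)^{B_n} \right) \\
&= \exp\left( \sum_{n=1}^{\infty} B_n \ln(1 + z^n) \right) \\
&= \exp\left( \sum_{n=1}^{\infty} B_n \sum_{k=1}^{\infty} \frac{(-1)^{k-1} z^{nk}}{k} \right) \\
&= \exp\left( \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} \sum_{n=1}^{\infty} B_n z^{nk} \right) \\
&= \exp\left( B(z) - \frac{B(z^2)}{2} + \frac{B(z^3)}{3} - \cdots \right)
\end{align}$$
where the expansion
$$\ln(1 + u) = \sum_{k=1}^{\infty} \frac{(-1)^{k-1} u^k}{k}$$
was used to go from line 4 to line 5.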
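The product form $\prod_{n \ge 1} (1 + z^n)^{B_n}$ from the second line of the powerset derivation is easy to expand on truncated series. The sketch below assumes integer counts $B_n$ with $B_0 = 0$; the function name `pset_ogf` and the example class (one object of each positive size) are hypothetical.

```python
def pset_ogf(b, n_terms):
    """OGF coefficients of the powerset class: prod_n (1 + z^n)^(b[n])."""
    if b[0] != 0:
        raise ValueError("b must not count any object of size zero")
    a = [0] * n_terms
    a[0] = 1  # the empty set
    for n in range(1, min(len(b), n_terms)):
        for _ in range(b[n]):
            # Multiply by (1 + z^n) in place, highest degree first so that
            # a[m - n] still holds the value from before this factor.
            for m in range(n_terms - 1, n - 1, -1):
                a[m] += a[m - n]
    return a


# Hypothetical example: one object of each size 1..10, so sets of them
# are partitions of an integer into distinct parts.
distinct_partitions = pset_ogf([0] + [1] * 10, 11)
print(distinct_partitions)
```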
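The multiset construction, denoted $\mathcal{A} = \mathfrak{M}\{\mathcal{B}\}$, is a generalization of the set construction. In the set construction, each element can occur zero or one times; in a multiset, each element can appear an arbitrary number of times. Therefore,
$$\mathfrak{M}\{\mathcal{B}\} = \prod_{\beta \in \mathcal{B}} \mathfrak{G}\{\beta\}.$$
This leads to the relation
$$\begin{align}
A(z) &= \prod_{\beta \in \mathcal{B}} (1 - z^{|\beta|})^{-1} \\
&= \prod_{n=1}^{\infty} (1 - z^n)^{-B_n} \\
&= \exp\left( \ln \prod_{n=1}^{\infty} (1 - z^n)^{-B_n} \right) \\
&= \exp\left( -\sum_{n=1}^{\infty} B_n \ln(1 - z^n) \right) \\
&= \exp\left( B(z) + \frac{B(z^2)}{2} + \frac{B(z^3)}{3} + \cdots \right)
\end{align}$$
where, similar to the above set construction, we expand $\ln(1 - z^n)$, swap the sums, and substitute for the OGF of $\mathcal{B}$.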
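Other important elementary constructions are the cycle construction ($\mathfrak{C}\{\mathcal{B}\}$), which is like a sequence except that cyclic rotations are not considered distinct; pointing ($\Theta\mathcal{B}$), in which each member of $\mathcal{B}$ is augmented by a neutral (zero size) pointer to one of its atoms; and substitution ($\mathcal{B} \circ \mathcal{C}$), in which each atom in a member of $\mathcal{B}$ is replaced by a member of $\mathcal{C}$.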
The derivations for these constructions are too complicated to show here. Here are the results, where $\phi(k)$ is the Euler totient function:
Construction | Generating function |
---|---|
$\mathcal{A} = \mathfrak{C}\{\mathcal{B}\}$ | $A(z) = \sum_{k=1}^{\infty} \frac{\phi(k)}{k} \ln \frac{1}{1 - B(z^k)}$ |
$\mathcal{A} = \Theta\mathcal{B}$ | $A(z) = z \frac{d}{dz} B(z)$ |
$\mathcal{A} = \mathcal{B} \circ \mathcal{C}$ | $A(z) = B(C(z))$ |
Many combinatorial classes can be built using these elementary constructions. For example, the class of plane trees (that is, trees embedded in the plane, so that the order of the subtrees matters) is specified by the recursive relation
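$$\mathcal{G} = \mathcal{Z} \times \operatorname{SEQ}\{\mathcal{G}\}.$$
In other words, a tree is a root node of size 1 and a sequence of subtrees. This gives
$$G(z) = \frac{z}{1 - G(z)}.$$
We solve for $G(z)$ by multiplying both sides by $1 - G(z)$ to get
$$G(z) - G(z)^2 = z.$$
Subtracting $z$ and solving for $G(z)$ using the quadratic formula gives
$$G(z) = \frac{1 - \sqrt{1 - 4z}}{2},$$
where the minus sign is chosen so that $G(0) = 0$.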
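The quadratic relation $G(z) - G(z)^2 = z$ can also be read coefficient by coefficient: $g_1 = 1$ and $g_n = \sum_{k=1}^{n-1} g_k g_{n-k}$ for $n \ge 2$. The following sketch (with the illustrative function name `plane_trees`) computes the counts this way and compares them with the shifted Catalan numbers $\frac{1}{n}\binom{2n-2}{n-1}$ that the closed form is known to produce.

```python
from math import comb


def plane_trees(n_max):
    """g[n] = number of plane trees with n nodes, from G - G^2 = z."""
    g = [0] * (n_max + 1)
    if n_max >= 1:
        g[1] = 1  # the single-node tree
    for n in range(2, n_max + 1):
        # Coefficient of z^n in G^2 must equal g[n] when n >= 2.
        g[n] = sum(g[k] * g[n - k] for k in range(1, n))
    return g


counts = plane_trees(6)
catalan_shifted = [0] + [comb(2 * n - 2, n - 1) // n for n in range(1, 7)]
print(counts, catalan_shifted)
```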
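Another example (and a classic combinatorics problem) is integer partitions. First, define the class of positive integers $\mathcal{I}$, where the size of each integer is its value:
$$\mathcal{I} = \mathcal{Z} \times \operatorname{SEQ}\{\mathcal{Z}\}.$$
The OGF of $\mathcal{I}$ is then
$$I(z) = \frac{z}{1 - z}.$$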
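Now, define the set of partitions $\mathcal{P}$ as
$$\mathcal{P} = \operatorname{MSET}\{\mathcal{I}\}.$$
The OGF of $\mathcal{P}$ is
$$P(z) = \exp\left( I(z) + \frac{1}{2} I(z^2) + \frac{1}{3} I(z^3) + \cdots \right).$$
Unfortunately, there is no closed form for $P(z)$; however, the OGF can be used to derive a recurrence relation or, using more advanced methods of analytic combinatorics, to calculate the asymptotic behavior of the counting sequence.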
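Even without a closed form, the exponential formula for $P(z)$ can be expanded term by term. The sketch below is one possible way to do it with exact rational arithmetic; the helper names `series_exp` and `mset_ogf` are assumptions for illustration. It uses the identity $E' = h'E$ for $E = \exp(h)$, which gives $n\,e_n = \sum_{k=1}^{n} k\,h_k\,e_{n-k}$.

```python
from fractions import Fraction


def series_exp(h, n_terms):
    """exp of a power series h with h[0] == 0, truncated to n_terms terms."""
    e = [Fraction(0)] * n_terms
    e[0] = Fraction(1)
    for n in range(1, n_terms):
        # From E' = h' * E: n * e_n = sum_{k=1..n} k * h_k * e_{n-k}.
        e[n] = sum(k * h[k] * e[n - k] for k in range(1, n + 1)) / n
    return e


def mset_ogf(b, n_terms):
    """OGF of MSET{B} via exp(B(z) + B(z^2)/2 + B(z^3)/3 + ...), b[0] == 0."""
    h = [Fraction(0)] * n_terms
    for l in range(1, n_terms):
        for n in range(1, len(b)):
            if n * l >= n_terms:
                break
            h[n * l] += Fraction(b[n], l)  # term of B(z^l) / l
    return [int(c) for c in series_exp(h, n_terms)]


# I(z) = z / (1 - z): one positive integer of each size.
partition_numbers = mset_ogf([0] + [1] * 9, 10)
print(partition_numbers)
```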
The elementary constructions mentioned above allow us to define the notion of specification. This specification allows us to use a set of recursive equations, with multiple combinatorial classes.
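Formally, a specification for a set of combinatorial classes $(\mathcal{A}_1, \ldots, \mathcal{A}_r)$ is a set of $r$ equations $\mathcal{A}_i = \Phi_i(\mathcal{A}_1, \ldots, \mathcal{A}_r)$, where $\Phi_i$ is an expression whose atoms are $\mathcal{E}$, $\mathcal{Z}$ and the $\mathcal{A}_i$'s, and whose operators are the elementary constructions listed above.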
A class of combinatorial structures is said to be constructible or specifiable when it admits a specification.
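For example, the set of trees whose leaves' depth is even (respectively, odd) can be defined using a specification with two classes, $\mathcal{A}_\text{even}$ and $\mathcal{A}_\text{odd}$. These classes should satisfy the equations
$$\mathcal{A}_\text{odd} = \mathcal{Z} \times \operatorname{Seq}_{\ge 1} \mathcal{A}_\text{even}$$
and
$$\mathcal{A}_\text{even} = \mathcal{Z} \times \operatorname{Seq} \mathcal{A}_\text{odd}.$$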
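A two-class specification like this translates into a system of two OGF equations, $A_\text{odd}(z) = z\,\frac{A_\text{even}(z)}{1 - A_\text{even}(z)}$ and $A_\text{even}(z) = \frac{z}{1 - A_\text{odd}(z)}$, which can be solved on truncated series by repeated substitution. The sketch below assumes plane trees counted by number of nodes; all function names are hypothetical. Each pass fixes at least one more coefficient because of the factor $z$, so `n` passes suffice for `n` coefficients.

```python
def mul(a, b, n):
    """Product of two power series, truncated to n coefficients."""
    c = [0] * n
    for i, a_i in enumerate(a):
        if a_i == 0:
            continue
        for j, b_j in enumerate(b):
            if i + j >= n:
                break
            c[i + j] += a_i * b_j
    return c


def seq(a, n, at_least_one=False):
    """1/(1 - a(z)) truncated to n coefficients, assuming a[0] == 0."""
    s = [0] * n
    s[0] = 1
    for m in range(1, n):
        s[m] = sum(a[k] * s[m - k] for k in range(1, m + 1))
    if at_least_one:
        s[0] = 0  # drop the empty sequence
    return s


def even_odd_trees(n):
    """First n coefficients (n >= 2) of A_even(z) and A_odd(z)."""
    z = [0, 1] + [0] * (n - 2)
    even, odd = [0] * n, [0] * n
    for _ in range(n):
        even = mul(z, seq(odd, n), n)
        odd = mul(z, seq(even, n, at_least_one=True), n)
    return even, odd


even, odd = even_odd_trees(5)
print(even, odd)
```

For instance, among plane trees with four nodes, one has all leaves at even depth (a root whose only child has two leaf children) and two have all leaves at odd depth (the path and the root with three leaf children).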
An object is weakly labelled if each of its atoms has a nonnegative integer label, and each of these labels is distinct. An object is (strongly or well) labelled if, furthermore, these labels comprise the consecutive integers $[1 \ldots n]$.
With labelled structures, an exponential generating function (EGF) is used. The EGF of a sequence $A_n$ is defined as
$$A(x) = \sum_{n=0}^{\infty} A_n \frac{x^n}{n!}.$$
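For labelled structures, we must use a different definition for product than for unlabelled structures. In fact, if we simply used the cartesian product, the resulting structures would not even be well labelled. Instead, we use the so-called labelled product, denoted $\mathcal{A} \star \mathcal{B}$.
For a pair $\beta \in \mathcal{B}$ and $\gamma \in \mathcal{C}$, we wish to combine the two structures into a single structure. In order for the result to be well labelled, this requires some relabelling of the atoms in $\beta$ and $\gamma$. We will restrict our attention to relabellings that are consistent with the order of the original labels. Note that there are still multiple ways to do the relabelling; thus, each pair of members determines not a single member in the product, but a set of new members.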
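To aid this development, let us define a function, $\rho$, that takes as its argument a (possibly weakly) labelled object $\alpha$ and relabels its atoms in an order-consistent way so that $\rho(\alpha)$ is well labelled. We then define the labelled product for two objects $\alpha$ and $\beta$ as
$$\alpha \star \beta = \{ (\alpha', \beta') : (\alpha', \beta') \text{ is well-labelled}, \ \rho(\alpha') = \alpha, \ \rho(\beta') = \beta \}.$$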
Finally, the labelled product of two classes $\mathcal{A}$ and $\mathcal{B}$ is
$$\mathcal{A} \star \mathcal{B} = \bigcup_{\alpha \in \mathcal{A}, \ \beta \in \mathcal{B}} (\alpha \star \beta).$$
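The labelled product of two individual objects can be enumerated by choosing which labels go to the first component and relabelling each component in an order-consistent way. The sketch below assumes, purely for illustration, that a labelled object is a flat tuple of distinct labels (for example a labelled sequence); the function names are hypothetical.

```python
from itertools import combinations
from math import comb


def relabel(obj, labels):
    """Order-consistent relabelling: the i-th smallest label of obj is
    replaced by the i-th smallest element of labels."""
    mapping = dict(zip(sorted(obj), sorted(labels)))
    return tuple(mapping[x] for x in obj)


def reduce_labels(obj):
    """The function rho: relabel with 1..len(obj), preserving order."""
    return relabel(obj, range(1, len(obj) + 1))


def labelled_product(alpha, beta):
    """All well-labelled pairs (alpha', beta') that reduce to (alpha, beta)."""
    n = len(alpha) + len(beta)
    all_labels = set(range(1, n + 1))
    pairs = []
    for chosen in combinations(sorted(all_labels), len(alpha)):
        rest = all_labels - set(chosen)
        pairs.append((relabel(alpha, chosen), relabel(beta, rest)))
    return pairs


pairs = labelled_product((2, 1), (1,))
print(pairs)
```

Each choice of a label subset for the first component yields exactly one pair, which is where the binomial coefficient in the counting argument comes from.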
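The EGF can be derived by noting that for objects of size $k$ and $n - k$, there are $\binom{n}{k}$ ways to do the relabelling. Therefore, the total number of objects of size $n$ is
$$\sum_{k=0}^{n} \binom{n}{k} A_k B_{n-k}.$$
This binomial convolution relation for the terms is equivalent to multiplying the EGFs,
$$A(z) \cdot B(z).$$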
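The binomial convolution is straightforward to compute on counting sequences. As a hedged illustration, the sketch below takes both classes to be permutations (counts $n!$, EGF $1/(1-z)$), so the product has EGF $1/(1-z)^2$ and counts $(n+1)!$; the function name is an assumption.

```python
from math import comb, factorial


def labelled_product_counts(a, b):
    """Counts of the labelled product: c_n = sum_k C(n, k) * a_k * b_(n-k)."""
    n_terms = min(len(a), len(b))
    return [
        sum(comb(n, k) * a[k] * b[n - k] for k in range(n + 1))
        for n in range(n_terms)
    ]


# Hypothetical example: both factors are the class of permutations.
perms = [factorial(n) for n in range(6)]
product_counts = labelled_product_counts(perms, perms)
print(product_counts)
```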
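The sequence construction $\mathcal{A} = \mathfrak{G}\{\mathcal{B}\}$ is defined similarly to the unlabelled case:
$$\mathfrak{G}\{\mathcal{B}\} = \mathcal{E} + \mathcal{B} + (\mathcal{B} \star \mathcal{B}) + (\mathcal{B} \star \mathcal{B} \star \mathcal{B}) + \cdots$$
and again, as above,
$$A(z) = \frac{1}{1 - B(z)}.$$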
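In labelled structures, a set of $k$ elements corresponds to exactly $k!$ sequences. This is different from the unlabelled case, where some of the permutations may coincide. Thus for $\mathcal{A} = \mathfrak{P}\{\mathcal{B}\}$, we have
$$A(z) = \sum_{k=0}^{\infty} \frac{B(z)^k}{k!} = \exp(B(z)).$$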
Cycles are also easier than in the unlabelled case. A cycle of length $k$ corresponds to $k$ distinct sequences. Thus for $\mathcal{A} = \mathfrak{C}\{\mathcal{B}\}$, we have
$$A(z) = \sum_{k=1}^{\infty} \frac{B(z)^k}{k} = \ln\left( \frac{1}{1 - B(z)} \right).$$
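In labelled structures, the min-boxed product $\mathcal{A}_{\min} = \mathcal{B}^{\square} \star \mathcal{C}$ is a variation of the original product which requires the element of $\mathcal{B}$ in the product to carry the minimal label. Similarly, we can define a max-boxed product, denoted by $\mathcal{A}_{\max} = \mathcal{B}^{\blacksquare} \star \mathcal{C}$, in the same manner. Then we have
$$A_{\min}(z) = A_{\max}(z) = \int_0^z B'(t) \, C(t) \, dt$$
or equivalently,
$$A_{\min}'(t) = A_{\max}'(t) = B'(t) \, C(t).$$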
An increasing Cayley tree is a labelled non-plane and rooted tree whose labels along any branch stemming from the root form an increasing sequence. Let $\mathcal{L}$ be the class of such trees. The recursive specification is now
$$\mathcal{L} = \mathcal{Z}^{\square} \star \operatorname{SET}(\mathcal{L}).$$
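Applying the boxed-product rule $A'(z) = B'(z)\,C(z)$ with $B(z) = z$ and $C(z) = \exp(L(z))$ turns this specification into $L'(z) = \exp(L(z))$ with $L(0) = 0$. The sketch below solves this on counting sequences; it is an illustration under those assumptions, with hypothetical names. It uses the standard recurrence for the labelled set construction, $e_n = \sum_{k=1}^{n} \binom{n-1}{k-1} l_k\, e_{n-k}$ (split off the component containing a fixed label).

```python
from math import comb, factorial


def increasing_cayley_trees(n_max):
    """l[n] = number of increasing Cayley trees on n labelled nodes."""
    l = [0] * (n_max + 1)  # counts for the class L
    e = [0] * (n_max + 1)  # counts for SET(L), i.e. forests
    e[0] = 1
    for n in range(1, n_max + 1):
        # L' = exp(L): the root takes the smallest label, the remaining
        # n - 1 nodes form a set of increasing trees.
        l[n] = e[n - 1]
        # Labelled SET: choose the tree that contains one fixed label.
        e[n] = sum(comb(n - 1, k - 1) * l[k] * e[n - k] for k in range(1, n + 1))
    return l


tree_counts = increasing_cayley_trees(6)
print(tree_counts)
```

The counts agree with $(n-1)!$, consistent with the solution $L(z) = \ln\frac{1}{1-z}$ of the differential equation.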
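The operators $\operatorname{CYC}_{\text{even}}$, $\operatorname{CYC}_{\text{odd}}$, $\operatorname{SET}_{\text{even}}$, and $\operatorname{SET}_{\text{odd}}$ represent cycles of even and odd length, and sets of even and odd cardinality.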
Stirling numbers of the second kind may be derived and analyzed using the structural decomposition
$$\operatorname{SET}(\operatorname{SET}_{\ge 1}(\mathcal{Z})).$$
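As a hedged illustration of this decomposition: a non-empty set of atoms has EGF $e^z - 1$, so partitions into exactly $k$ blocks have EGF $(e^z - 1)^k / k!$, and the Stirling numbers of the second kind are $n!$ times the coefficient of $z^n$. The sketch below computes one row with exact rationals; the function names are assumptions.

```python
from fractions import Fraction
from math import factorial


def series_mul(a, b, n_terms):
    """Product of two power series, truncated to n_terms coefficients."""
    c = [Fraction(0)] * n_terms
    for i in range(n_terms):
        for j in range(n_terms - i):
            c[i + j] += a[i] * b[j]
    return c


def stirling2_row(n):
    """[S(n, 0), ..., S(n, n)] from the EGFs (e^z - 1)^k / k!."""
    n_terms = n + 1
    # EGF coefficients of a single non-empty block: e^z - 1.
    block = [Fraction(0)] + [Fraction(1, factorial(m)) for m in range(1, n_terms)]
    power = [Fraction(1)] + [Fraction(0)] * n  # block^0
    row = []
    for k in range(n + 1):
        row.append(int(power[n] * factorial(n) / factorial(k)))
        power = series_mul(power, block, n_terms)  # block^(k+1)
    return row


row4 = stirling2_row(4)
print(row4, sum(row4))
```

Summing a row gives the number of all set partitions of an $n$-element set, which corresponds to the full EGF $\exp(e^z - 1)$.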
The decomposition
$$\operatorname{SET}(\operatorname{CYC}(\mathcal{Z}))$$
is used to study unsigned Stirling numbers of the first kind, and in the derivation of the statistics of random permutations. A detailed examination of the exponential generating functions associated to Stirling numbers within symbolic combinatorics may be found on the page on Stirling numbers and exponential generating functions in symbolic combinatorics.
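The cycle decomposition can be checked by brute force for small $n$. A single cycle of atoms has EGF $\ln\frac{1}{1-z}$, so permutations with exactly $k$ cycles have EGF $\frac{1}{k!}\left(\ln\frac{1}{1-z}\right)^k$. The sketch below compares $n!$ times these coefficients with a direct count over all permutations; the function names are hypothetical and the code is an illustration, not a reference implementation.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple i -> perm[i]."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return cycles


def stirling1_row_bruteforce(n):
    """Count permutations of n elements by their number of cycles."""
    row = [0] * (n + 1)
    for perm in permutations(range(n)):
        row[cycle_count(perm)] += 1
    return row


def stirling1_row_egf(n):
    """Same row, from the EGFs (ln 1/(1-z))^k / k!."""
    n_terms = n + 1
    log_series = [Fraction(0)] + [Fraction(1, m) for m in range(1, n_terms)]
    power = [Fraction(1)] + [Fraction(0)] * n  # log_series^0
    row = []
    for k in range(n + 1):
        row.append(int(power[n] * factorial(n) / factorial(k)))
        # Truncated product power * log_series gives log_series^(k+1).
        power = [
            sum(power[i] * log_series[m - i] for i in range(m + 1))
            for m in range(n_terms)
        ]
    return row


print(stirling1_row_egf(4), stirling1_row_bruteforce(4))
```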