Unavoidable pattern explained

In mathematics and theoretical computer science, a pattern is an unavoidable pattern if it is unavoidable on any finite alphabet.

Definitions

Pattern

Like a word, a pattern (also called term) is a sequence of symbols over some alphabet.

The minimum multiplicity of the pattern

p

is
m(p)=min(countp(x):x

\inp)

where
countp(x)
is the number of occurrence of symbol

x

in pattern

p

. In other words, it is the number of occurrences in

p

of the least frequently occurring symbol in

p

.

Instance

Given finite alphabets

\Sigma

and

\Delta

, a word

x\in\Sigma*

is an instance of the pattern

p\in\Delta*

if there exists a non-erasing semigroup morphism

f:\Delta*\Sigma*

such that

f(p)=x

, where

\Sigma*

denotes the Kleene star of

\Sigma

. Non-erasing means that

f(a)\varepsilon

for all

a\in\Delta

, where

\varepsilon

denotes the empty string.

Avoidance / Matching

A word

w

is said to match, or encounter, a pattern

p

if a factor (also called subword or substring) of

w

is an instance of

p

. Otherwise,

w

is said to avoid

p

, or to be

p

-free. This definition can be generalized to the case of an infinite

w

, based on a generalized definition of "substring".

Avoidability / Unavoidability on a specific alphabet

A pattern

p

is unavoidable on a finite alphabet

\Sigma

if each sufficiently long word

x\in\Sigma*

must match

p

; formally: if

\existsn\in\Nu.\forallx\in\Sigma*.(|x|\geqn\impliesxmatchesp)

. Otherwise,

p

is avoidable on

\Sigma

, which implies there exist infinitely many words over the alphabet

\Sigma

that avoid

p

.

By Kőnig's lemma, pattern

p

is avoidable on

\Sigma

if and only if there exists an infinite word

w\in\Sigma\omega

that avoids

p

.

Maximal p-free word

Given a pattern

p

and an alphabet

\Sigma

. A

p

-free word

w\in\Sigma*

is a maximal

p

-free word over

\Sigma

if

aw

and

wa

match

p

\foralla\in\Sigma

.

Avoidable / Unavoidable pattern

A pattern

p

is an unavoidable pattern (also called blocking term) if

p

is unavoidable on any finite alphabet.

If a pattern is unavoidable and not limited to a specific alphabet, then it is unavoidable for any finite alphabet by default. Conversely, if a pattern is said to be avoidable and not limited to a specific alphabet, then it is avoidable on some finite alphabet by default.

k-avoidable / k-unavoidable

A pattern

p

is

k

-avoidable if

p

is avoidable on an alphabet

\Sigma

of size

k

. Otherwise,

p

is

k

-unavoidable, which means

p

is unavoidable on every alphabet of size

k

.[1]

If pattern

p

is

k

-avoidable, then

p

is

g

-avoidable for all

g\geqk

.

Given a finite set of avoidable patterns

S=\{p1,p2,...,pi\}

, there exists an infinite word

w\in\Sigma\omega

such that

w

avoids all patterns of

S

. Let

\mu(S)

denote the size of the minimal alphabet

\Sigma'

such that

\existw'\in{\Sigma'}\omega

avoiding all patterns of

S

.

Avoidability index

The avoidability index of a pattern

p

is the smallest

k

such that

p

is

k

-avoidable, and

infin

if

p

is unavoidable.

Properties

q

is avoidable if

q

is an instance of an avoidable pattern

p

.[2]

p

be a factor of pattern

q

, then

q

is also avoidable.

q

is unavoidable if and only if

q

is a factor of some unavoidable pattern

p

.

p

and a symbol

a

not in

p

, then

pap

is unavoidable.

p

, then the reversal

pR

is unavoidable.

p

, there exists a symbol

a

such that

a

occurs exactly once in

p

.

n\in\Nu

represent the number of distinct symbols of pattern

p

. If

|p|\geq2n

, then

p

is avoidable.

Zimin words

See main article: Sesquipower. Given alphabet

\Delta=\{x1,x2,...\}

, Zimin words (patterns) are defined recursively

Zn+1=Znxn+1Zn

for

n\in\Zeta+

and

Z1=x1

.

Unavoidability

All Zimin words are unavoidable.

A word

w

is unavoidable if and only if it is a factor of a Zimin word.[3]

Given a finite alphabet

\Sigma

, let

f(n,|\Sigma|)

represent the smallest

m\in\Zeta+

such that

w

matches

Zn

for all

w\in\Sigmam

. We have following properties:[4]

f(1,q)=1

f(2,q)=2q+1

f(3,2)=29

f(n,q)\leq{n-1

q
q
q}=\underbrace{q
}_

Zn

is the longest unavoidable pattern constructed by alphabet

\Delta=\{x1,x2,...,xn\}

since
n
|Z
n|=2

-1

.

Pattern reduction

Free letter

Given a pattern

p

over some alphabet

\Delta

, we say

x\in\Delta

is free for

p

if there exist subsets

A,B

of

\Delta

such that the following hold:

uv

is a factor of

p

and

u\inA

uv

is a factor of

p

and

v\inB

x\inA\backslashB\cupB\backslashA

For example, let

p=abcbab

, then

b

is free for

p

since there exist

A=ac,B=b

satisfying the conditions above.

Reduce

A pattern

p\in\Delta*

reduces to pattern

q

if there exists a symbol

x\in\Delta

such that

x

is free for

p

, and

q

can be obtained by removing all occurrence of

x

from

p

. Denote this relation by

p\stackrel{x}{}q

.

For example, let

p=abcbab

, then

p

can reduce to

q=aca

since

b

is free for

p

.

Locked

A word

w

is said to be locked if

w

has no free letter; hence

w

can not be reduced.[5]

Transitivity

Given patterns

p,q,r

, if

p

reduces to

q

and

q

reduces to

r

, then

p

reduces to

r

. Denote this relation by

p\stackrel{*}{}r

.

Unavoidability

A pattern

p

is unavoidable if and only if

p

reduces to a word of length one; hence

\existw

such that

|w|=1

and

p\stackrel{*}{}w

.[6]

Graph pattern avoidance[7]

Avoidance / Matching on a specific graph

Given a simple graph

G=(V,E)

, a edge coloring

c:E\Delta

matches pattern

p

if there exists a simple path

P=[e1,e2,...,er]

in

G

such that the sequence

c(P)=[c(e1),c(e2),...,c(er)]

matches

p

. Otherwise,

c

is said to avoid

p

or be

p

-free.

Similarly, a vertex coloring

c:V\Delta

matches pattern

p

if there exists a simple path

P=[c1,c2,...,cr]

in

G

such that the sequence

c(P)

matches

p

.

Pattern chromatic number

The pattern chromatic number

\pip(G)

is the minimal number of distinct colors needed for a

p

-free vertex coloring

c

over the graph

G

.

Let

\pip(n)=max\{\pip(G):G\inGn\}

where

Gn

is the set of all simple graphs with a maximum degree no more than

n

.

Similarly,

\pip'(G)

and

\pip'(n)

are defined for edge colorings.

Avoidability / Unavoidability on graphs

A pattern

p

is avoidable on graphs if

\pip(n)

is bounded by

cp

, where

cp

only depends on

p

.

p

is avoidable on any finite alphabet if and only if

\pip(Pn)\leqcp

for all

n\in\Zeta+

, where

Pn

is a graph of

n

vertices concatenated.

Probabilistic bound on

\pip(n)

There exists an absolute constant

c

, such that

\pip(n)\leq

m(p)
m(p)-1
cn

\leqcn2

for all patterns

p

with

m(p)\geq2

.

Given a pattern

p

, let

n

represent the number of distinct symbols of

p

. If

|p|\geq2n

, then

p

is avoidable on graphs.

Explicit colorings

Given a pattern

p

such that

countp(x)

is even for all

x\inp

, then

\pip'(K

k)\leq2
2

k-1

for all

k\geq1

, where

Kn

is the complete graph of

n

vertices.

Given a pattern

p

such that

m(p)\geq2

, and an arbitrary tree

T

, let

S

be the set of all avoidable subpatterns and their reflections of

p

. Then

\pip(T)\leq3\mu(S)

.

Given a pattern

p

such that

m(p)\geq2

, and a tree

T

with degree

n\geq2

. Let

S

be the set of all avoidable subpatterns and their reflections of

p

, then

\pip'(T)\leq2(n-1)\mu(S)

.

Examples

xxx

and

xyxyx

.

xx

. The word over the alphabet

\{0,\pm1\}

obtained by taking the first difference of the Thue–Morse sequence is an example of an infinite square-free word.[8] [9]

x

and

xyx

are unavoidable on any alphabet, since they are factors of the Zimin words.[10]

xn

for

n\geq3

are 2-avoidable.

\varepsilon,x,xyx

are unavoidable.

xx,xxy,xyy,xxyx,xxyy,xyxx,xyxy,xyyx,xxyxx,xxyxy,xyxyy

have avoidability index of 3.

abwbaxbcyaczca

has avoidability index of 4, as well as other locked words.

abvacwbaxbcycdazdcd

has avoidability index of 5.[12]

RT(n)

is the infimum of exponents

k

such that

xk

is avoidable on an alphabet of size

n

. Also see Dejean's theorem.

Open problems

p

such that the avoidability index of

p

is 6?

p

, is there an algorithm to determine the avoidability index of

p

?

References

Notes and References

  1. Book: Combinatorics on Words: Christoffel Words and Repetitions in Words. American Mathematical Soc.. 978-0-8218-7325-0. 127. en.
  2. Schmidt. Ursula. 1987-08-01. Long unavoidable patterns. Acta Informatica. en. 24. 4. 433–445. 10.1007/BF00292112. 7928450. 1432-0525.
  3. Zimin. A. I.. Blocking Sets of Terms. 1984. Mathematics of the USSR-Sbornik. en. 47. 2. 353–364. 10.1070/SM1984v047n02ABEH002647. 1984SbMat..47..353Z. 0025-5734.
  4. Book: Joshua. Cooper. Bounds on Zimin Word Avoidance. Rorabaugh. Danny. 2013. 1409.3080. 2014arXiv1409.3080C.
  5. Baker. Kirby A.. McNulty. George F.. Taylor. Walter. 1989-12-18. Growth problems for avoidable words. Theoretical Computer Science. en. 69. 3. 319–345. 10.1016/0304-3975(89)90071-6. 0304-3975. free.
  6. Bean. Dwight R.. Ehrenfeucht. Andrzej. McNulty. George F.. 1979. Avoidable patterns in strings of symbols.. Pacific Journal of Mathematics. en. 85. 2. 261–294. 10.2140/pjm.1979.85.261. 0030-8730. free.
  7. Grytczuk. Jarosław. 2007-05-28. Pattern avoidance on graphs. Discrete Mathematics. The Fourth Caracow Conference on Graph Theory. en. 307. 11. 1341–1346. 10.1016/j.disc.2005.11.071. 0012-365X. free.
  8. Book: Combinatorics on Words: Christoffel Words and Repetitions in Words. American Mathematical Soc.. 978-0-8218-7325-0. 97. en.
  9. Book: Fogg, N. Pytheas. Substitutions in Dynamics, Arithmetics and Combinatorics. 2002-09-23. Springer Science & Business Media. 978-3-540-44141-0. 104. en.
  10. Book: Allouche. Jean-Paul. Automatic Sequences: Theory, Applications, Generalizations. Shallit. Jeffrey. Shallit. Professor Jeffrey. 2003-07-21. Cambridge University Press. 978-0-521-82332-6. 24. en.
  11. Book: Lothaire. M.. Algebraic Combinatorics on Words. registration. Cambridge University Press. 2002. 9780521812207.
  12. Clark. Ronald J.. 2006-04-01. The existence of a pattern which is 5-avoidable but 4-unavoidable. International Journal of Algebra and Computation. 16. 2. 351–367. 10.1142/S0218196706002950. 0218-1967.