Myhill–Nerode theorem explained
In the theory of formal languages, the Myhill–Nerode theorem provides a necessary and sufficient condition for a language to be regular. The theorem is named for John Myhill and Anil Nerode, who proved it at the University of Chicago in 1957 .
Statement
Given a language
, and a pair of strings
and
, define a
distinguishing extension to be a string
such thatexactly one of the two strings
and
belongs to
.Define a relation
on strings as
if there is no distinguishing extension for
and
. It is easy to show that
is an
equivalence relation on strings, and thus it divides the set of all strings into
equivalence classes.
The Myhill–Nerode theorem states that a language
is regular if and only if
has a finite number of equivalence classes, and moreover, that this number is equal to the number of states in the minimal
deterministic finite automaton (DFA) accepting
. Furthermore, every minimal DFA for the language is isomorphic to the canonical one .
Generally, for any language, the constructed automaton is a state automaton acceptor. However, it does not necessarily have finitely many states. The Myhill–Nerode theorem shows that finiteness is necessary and sufficient for language regularity.
Some authors refer to the
relation as
Nerode congruence, in honor of
Anil Nerode.
Use and consequences
The Myhill–Nerode theorem may be used to show that a language
is
regular by proving that the number of equivalence classes of
is finite. This may be done by an exhaustive
case analysis in which, beginning from the
empty string, distinguishing extensions are used to find additional equivalence classes until no more can be found. For example, the language consisting of binary representations of numbers that can be divided by 3 is regular. Given the empty string,
(or
),
, and
are distinguishing extensions resulting in the three classes (corresponding to numbers that give remainders 0, 1 and 2 when divided by 3), but after this step there is no distinguishing extension anymore. The minimal automaton accepting our language would have three states corresponding to these three equivalence classes.
Another immediate corollary of the theorem is that if for a language
the relation
has infinitely many equivalence classes, it is regular. It is this corollary that is frequently used to prove that a language is not regular.
Generalizations
The Myhill–Nerode theorem can be generalized to tree automata.[1]
See also
References
- .
- .
- . ASTIA Document No. AD 155741.
- .
Further reading
Notes and References
- Book: Tree Automata Techniques and Applications (TATA) . Hubert Comon . Max Dauchet . Rémi Gilleron . Florent Jacquemard . Denis Lugiez . Christoph Löding . Sophie Tison . Marc Tommasi . Oct 2021 . Here: Sect. 1.5, p.35-36.