Comparison of parser generators explained

This is a list of notable lexer generators and parser generators for various language classes.

Regular languages

Regular languages are a category of languages (sometimes termed Chomsky Type 3) which can be matched by a state machine (more specifically, by a deterministic finite automaton or a nondeterministic finite automaton) constructed from a regular expression. In particular, a regular language can match constructs like "A follows B", "Either A or B", "A, followed by zero or more instances of B", but cannot match constructs which require consistency between non-adjacent elements, such as "some instances of A followed by the same number of instances of B", and also cannot express the concept of recursive "nesting" ("every A is eventually followed by a matching B"). A classic example of a problem which a regular grammar cannot handle is the question of whether a given string contains correctly nested parentheses. (This is typically handled by a Chomsky Type 2 grammar, also termed a context-free grammar.)

Name Lexer algorithm Grammar, code Development platform License
Alex , BSD
AnnoFlex , BSD
AstirDFA table driven, with branching , MIT
AustenX , BSD
C# Flex , GNU GPL
C# Lex ?
CookCC , Apache 2.0
DFA DFA compressed matrix Windows, Visual Studio BSD
Dolphin
DFA table driven , BSD
gelex , MIT
golex , BSD-style
gplex , BSD-like
JFlex , BSD
JLex , BSD-like
, proprietary, CDDL
lexertl ? , GNU LGPL
Quex DFA direct code , GNU LGPL
, GNU GPL, MIT[1] [2]
DFA direct code, DFA table driven, and NFA regex libraries , BSD
DFA direct code , public domain

Deterministic context-free languages

Context-free languages are a category of languages (sometimes termed Chomsky Type 2) which can be matched by a sequence of replacement rules, each of which essentially maps each non-terminal element to a sequence of terminal elements and/or other nonterminal elements. Grammars of this type can match anything that can be matched by a regular grammar, and furthermore, can handle the concept of recursive "nesting" ("every A is eventually followed by a matching B"), such as the question of whether a given string contains correctly nested parentheses. The rules of Context-free grammars are purely local, however, and therefore cannot handle questions that require non-local analysis such as "Does a declaration exist for every variable that is used in a function?". To do so technically would require a more sophisticated grammar, like a Chomsky Type 1 grammar, also termed a context-sensitive grammar. However, parser generators for context-free grammars often support the ability for user-written code to introduce limited amounts of context-sensitivity. (For example, upon encountering a variable declaration, user-written code could save the name and type of the variable into an external data structure, so that these could be checked against later variable references detected by the parser.)

The deterministic context-free languages are a proper subset of the context-free languages which can be efficiently parsed by deterministic pushdown automata.

Name Parsing algorithm Input grammar notation Grammar, code Development platform License
ANTLR4 Adaptive LL(*)[3] generated , BSD
ANTLR3 LL(*) generated , BSD
APG[4] none , BSD
Beaver[5] [6] LALR(1) EBNFJavaexternal , BSD
external , GNU GPL with exception
? external , public domain
LALR(1) external , public domain
CL-Yacc[7] [8] LALR(1) LispCommon Lispexternal , MIT
LL(1) generated , GNU GPL
CppCC[9] [10] LL(k) ? C++generated , GNU GPL
CUP[11] [12] LALR(1) ? Javaexternal , BSD-like
Eli[13] [14] LALR(1) ? generated , GNU GPL, GNU LGPL
Essence[15] LR(?) ? Scheme 48external , BSD
eyapp[16] LALR(1) ? Perlexternal or generated , Artistic
GOLD[17] LALR(1) BNF x86 assembly language, ANSI C, C#, D, Java, Pascal, Object Pascal, Python, Visual Basic 6, Visual Basic .NET, Visual C++generated , zlib modified
Hime Parser Generator[18] LALR(1), GLRBNF dialect C#, Java, Rustgenerated , GNU LGPL
Hyacc[19] LR(1), LALR(1), LR(0) YaccCexternal , GNU GPL
JavaCC[20] [21] LL(k) EBNFJava, C++, JavaScript (via GWT compiler)[22] generated , BSD
LL(1), LALR(1) ? ? ? ?
LL(k) ? generated , GNU GPL
JS/CC LALR(1) internal , BSD
KDevelop-PG-Qt ? generated or external , GNU LGPL
Kelbt Backtracking LALR(1) ? generated , GNU GPL
kmyacc LALR(1) ? external , GNU GPL
Lapg LALR(1) ? generated , GNU GPL
Lark generated , MIT
LALR(1) BNF dialect[23] external , public domain
Lezer[24] [25] [26] EBNF dialect generated , MIT
Lime LALR(1) ? external , GNU GPL
LISA LR(?), LL(?), LALR(?), SLR(?) ? generated , public domain
LLgen LL(1) ? external , BSD
LLnextgen LL(1) ? external , GNU GPL
LLLPG LL(k) + syntactic and semantic predicates ANTLR-like generated (?) , GNU LGPL
LPG Backtracking LALR(k) ? generated , EPL
LRSTAR[27] LALR(1), LALR(*) YACC, ANTLR, EBNF generated , BSD
Menhir LR(1) ? generated , QPL
ML-Yacc LALR(1) ? external ?
Monkey LR(1) ? generated , GNU GPL
Msta LALR(k), LR(k) external or generated , GNU GPL
MTP (More Than Parsing) LL(1) ? generated , GNU GPL
MyParser LL(*) internal , MIT
NLT C#/BNF-like mixed , MIT
ocamlyacc LALR(1) ? external , QPL
olex LL(1) ? generated , GNU GPL
Parsecnone , BSD
yappLALR(1) ? external , GNU GPL
Parser Objects LL(k) ? ? , zlib
PCCTS ? ? ? ?
LALR(1) BNF generated , MIT
PlyPlus LALR(1) EBNF generated , MIT
PRECC LL(k) ? generated , GNU GPL
racc[28] LALR(1)BNF-like, yacc-like[29] Ruby?
QLALR LALR(1) ? external , GNU GPL
LALR(1) ? generated , GNU LGPL
SLK[30] LL(k) LR(k) LALR(k) external SLK[31]
SLY[32] LALR(1) BNF generated , MIT
SP (Simple Parser) generated , GNU LGPL
? internal , Boost
Styx LALR(1) ? generated , GNU LGPL
Sweet Parser LALR(1) ? generated , zlib
Tap LL(1) ? generated , GNU GPL
TextTransformer LL(k) ? generated
TinyPG LL(1) ? ? ? , CPOL 1.0
? generated , GNU LGPL
TP Yacc LALR(1) ? external , GNU GPL
Tree-Sitter[33] C, bindings (Rust, WebAssembly, JavaScript, Python, many other) generated + external , MIT
Tunnel Grammar Studio generated
UltraGram C++, Java, C#, Visual Basic .NET external , public domain
UniCC LALR(1) generated , BSD
LL(1) ? Java ? generated ?
LALR(1) external , CPL & CDDL
Yacc++ LR(1), LALR(1) generated or external
Yapps LL(1) ? generated , MIT
yecc LALR(1) ? generated , Apache 2.0
LR(1), LALR(1) ? generated
YooParse LR(1), LALR(1) ? external , MIT
Parse[34] LR(1) BNF in C++ types ? ? none , MIT
GGLL LL(1) Graph generated , MIT
Product Parsing algorithm Input grammar notation Grammar, code Development platform License

Parsing expression grammars, deterministic boolean grammars

This table compares parser generators with parsing expression grammars, deterministic boolean grammars.

Name Parsing algorithm Grammar, code Development platform License
AustenX Packrat (modified) , BSD
Aurochs , GNU GPL
BNFlite Recursive descent , MIT
Canopy , GNU GPL
CL-peg , MIT
Drat! , GNU GPL
Frisby , BSD
, BSD
Grako Packrat + Cut + Left Recursion Python, C++ (beta) , BSD
IronMeta , BSD
Laja 2-phase scannerless top-down backtracking + runtime support , GNU GPL
lars::Parser Packrat (supporting left-recursion and grammar ambiguity) Identical , BSD
LPeg Parsing machine , MIT
lug Parsing machine , MIT
Mouse Recursive descent (modified, limited memoization and left-recursion) , Apache 2.0
Narwhal , BSD
Nearley , MIT
Nemerle.Peg Recursive descent + Pratt , BSD
neotoma , MIT
NPEG Recursive descent , MIT
Packrat (modified, partial memoization) , MIT
PackCC Packrat (modified, left-recursion support) , MIT
, MIT
, BSD
Recursive descent , Apache 2.0
Lambda PEG Recursive descent , Apache 2.0
parsepp Recursive descent , public domain
Parsnip , GNU GPL
Patterns Parsing machine Identical , MIT
peg Recursive descent , MIT
PEG.js Packrat (partial memoization) , MIT
Peggy[35] Packrat (partial memoization) , MIT
Pegasus Recursive descent, Packrat (selectively) , MIT
pegc Recursive descent , public domain
pest Recursive descent , MIT, Apache 2.0
PetitParser , MIT
PEGTLRecursive descent , Boost
Parser Grammar Engine (PGE) Hybrid recursive descent / operator precedence[36] , Artistic 2.0
PyPy rlib , MIT
Rats! , GNU LGPL
Spirit2 Recursive descent , Boost
Treetop Recursive descent , MIT
Yard Recursive descent , MIT or public domain
Waxeye Parsing machine , MIT
PHP PEG PEG Parser? , BSD

General context-free, conjunctive, or boolean languages

This table compares parser generator languages with a general context-free grammar, a conjunctive grammar, or a boolean grammar.

Name Parsing algorithm Input grammar notation Output languages !Grammar, code Development platform License
ACCENT Yacc variant external , GNU GPL
APaGeD GLR, LALR(1), LL(k) ? generated , Artistic
, except XML external , GNU GPL
? generated
DParser ? , BSD
Dypgen ? generated , CeCILL-B
E3 ? external, or scannerless ?
Elkhound ? external , BSD
GDK ? generated , MIT
Happy ? external , BSD
Hime Parser Generator ? generated , GNU LGPL
IronText Library generated or external , Apache 2.0
Jison LALR(1), LR(0), SLR(1) generated , MIT
Syntax LALR(1), LR(0), SLR(1) CLR(1) LL(1) generated , MIT
Laja Scannerless, two phase Laja , GNU GPL
ModelCC Annotated class model Generated generated , BSD
P3 Earley–combinators BNF-like external, or scannerless ?
P4 Earley–combinators, infinitary CFGs BNF-like external, or scannerless ?
Scannerless GLR (Boolean grammars) ? , BSD
SDF/SGLR , BSD
SmaCC GLR(1), LALR(1), LR(1) ? internal , MIT
SPARK ? external , MIT
Tom ? Generated none , "No licensing or copyright restrictions"
UltraGram ? generated
Wormhole ? , MIT
Whale Calf General tabular, SLL(k), Linear normal form (conjunctive grammars), LR, Binary normal form (Boolean grammars) ? external
yaep Yacc-like external , GNU LGPL

Context-sensitive grammars

This table compares parser generators with context-sensitive grammars.

See also

Notes

  1. Web site: Ragel State Machine Compiler.
  2. http://www.colm.net/open-source/ragel/
  3. Web site: Adaptive LL(*) Parsing: The Power of Dynamic Analysis . 2016-04-03 . Terence Parr.
  4. Web site: Survey on Various Syntax Analyzer Tools . 2023-09-16 . www.ijraset.com . en.
  5. Boyland . John . Spiewak . Daniel . 2010-09-17 . Tool Paper: ScalaBison Recursive Ascent-Descent Parser Generator . Electronic Notes in Theoretical Computer Science . Proceedings of the Ninth Workshop on Language Descriptions Tools and Applications (LDTA 2009) . 253 . 7 . 65–74 . 10.1016/j.entcs.2010.08.032 . 1571-0661. free .
  6. Web site: Beaver - a LALR Parser Generator . 2023-09-16 . beaver.sourceforge.net.
  7. Newton . Jim E. . Demaille . Akim . Verna . Didier . 2016-05-09 . Type-Checking of Heterogeneous Sequences in Common Lisp . Proceedings of the 9th European Lisp Symposium on European Lisp Symposium . ELS2016 . Kraków, Poland . European Lisp Scientific Activities Association . 13–20 . 978-2-9557474-0-7.
  8. Web site: CL-Yacc — a LALR(1) parser generator for Common Lisp . 2023-09-16 . www.irif.fr.
  9. Hosseinpour . Sahereh . Alavi Milani . Mir Mohammad Reza . Pehlivan . Hüseyin . July 2018 . A Step-by-Step Solution Methodology for Mathematical Expressions . Symmetry . en . 10 . 7 . 285 . 10.3390/sym10070285 . 2018Symm...10..285H . 2073-8994 . free .
  10. Web site: CppCC's Home Page . 2023-09-16 . cppcc.sourceforge.net.
  11. Web site: Java Cup . 2023-09-16 . pages.cs.wisc.edu.
  12. Web site: CUP . 2023-09-16 . www2.cs.tum.edu.
  13. Thiemann . Peter . Neubauer . Matthias . 2004-12-31 . Parameterized LR Parsing . Electronic Notes in Theoretical Computer Science . Proceedings of the Fourth Workshop on Language Descriptions, Tools, and Applications (LDTA 2004) . 110 . 115–132 . 10.1016/j.entcs.2004.06.007 . 1571-0661. free .
  14. Gray . Robert W. . Levi . Steven P. . Heuring . Vincent P. . Sloane . Anthony M. . Waite . William M. . Eli: a complete, flexible compiler construction system . Communications of the ACM . 1992 . en . 35 . 2 . 121–130 . 10.1145/129630.129637 . 5121773 . 0001-0782. free .
  15. Owens . Scott . Flatt . M. . Shivers . O. . McMullan . Benjamin . 2004-10-01 . Lexer and Parser Generators in Scheme . Scheme 2004: Proceedings of the Fifth Workshop on Scheme and Functional Programming.
  16. Areias . Hugo . Simões . Alberto . Henriques . P. . Cruz . Daniela Carneiro da . 2010-09-01 . Parser generation in Perl : an overview and available tools .
  17. Web site: Volkman . Victor . 2007-07-19 . Let Your Parser Go for the GOLD . 2023-11-04 . Developer.com . en-US.
  18. Web site: Parsing in C#: All the Tools and Libraries You Can Use (Part 2) - DZone . 2023-11-04 . dzone.com . en.
  19. Ortin . Francisco . Quiroga . Jose . Rodriguez-Prieto . Oscar . Garcia . Miguel . 2022-03-03 . An empirical evaluation of Lex/Yacc and ANTLR parser generation tools . PLOS ONE . en . 17 . 3 . e0264326 . 10.1371/journal.pone.0264326 . 1932-6203 . 8893623 . 35239695 . 2022PLoSO..1764326O . free .
  20. Web site: Enseling . Oliver . 2000-12-29 . Build your own languages with JavaCC . 2023-11-04 . InfoWorld . en.
  21. Web site: JavaCC . 2023-11-04 . JavaCC . en-US.
  22. Web site: Building parsers for the web with JavaCC & GWT (Part one) . 14 April 2014 . 2014-05-04 . Chris Ainsley.
  23. Web site: The Lemon Parser Generator . 2023-11-30 . sqlite.org.
  24. Web site: The Lezer Parser System.
  25. News: Building a ShopifyQL Code Editor . en . Shopify . 2023-12-06.
  26. Web site: 2022-03-11 . Sponsoring the Lezer parser system Tines . 2023-12-06 . www.tines.com.
  27. Web site: An LR(*) parser generator for C++.
  28. Web site: Racc. 2021-11-26. i.loveruby.net.
  29. Web site: Racc Grammar File Reference. 2021-11-26. i.loveruby.net.
  30. Web site: The SLK Parser Generator supports C, C++, Java, JavaScript, and C#, optional backtracking, free.
  31. http://www.slkpg.tech/license.txt
  32. Web site: SLY (Sly Lex Yacc).
  33. Web site: Tree-Sitter - An incremental parsing system for programming tools.
  34. Web site: Parse - Compile time (LR) type safe parser generator for C++. GitHub. 30 December 2021.
  35. Maintained fork of PEG.js
  36. Web site: Parrot: Grammar Engine . 2011 . The Parrot Foundation . PGE rules provide the full power of recursive descent parsing and operator precedence parsing..

External links