Smallest grammar problem explained
In data compression and the theory of formal languages, the smallest grammar problem is the problem of finding the smallest context-free grammar that generates a given string of characters (but no other string). The size of a grammar is defined by some authors as the number of symbols on the right side of the production rules.Others also add the number of rules to that.[1] A grammar that generates only a single string, as required for the solution to this problem, is called a straight-line grammar.[2]
Every binary string of length
has a grammar of length
, as expressed using
big O notation.
[2] For binary
de Bruijn sequences, no better length is possible.
[3] The (decision version of the) smallest grammar problem is NP-complete.[4] It can be approximated in polynomial time to within a logarithmic approximation ratio; more precisely, the ratio is
where
is the length of the given string and
is the size of its smallest grammar. It is hard to approximate to within a constant approximation ratio. An improvement of the approximation ratio to
would also improve certain algorithms for approximate
addition chains.
[5] See also
External links
Notes and References
- Florian Benz and Timo Kötzing, “An effective heuristic for the smallest grammar problem,” Proceedings of the fifteenth annual conference on Genetic and evolutionary computation conference - GECCO ’13, 2013.
- Lohrey . Markus . 10.1515/GCC-2012-0016 . 2 . Groups Complexity Cryptology . 241–299 . Algorithmics on SLP-compressed strings: A survey . 4 . 2012.
- Domaratzki . Michael . Pighizzini . Giovanni . Shallit . Jeffrey . 10.1016/S0020-0190(02)00316-2 . 6 . Information Processing Letters . 1937222 . 339–344 . Simulating finite automata with context-free grammars . 84 . 2002.
- The Smallest Grammar Problem. IEEE Transactions on Information Theory. 2005. 10.1109/TIT.2005.850116 . Charikar . Moses . Lehman . Eric . Liu . Ding . Panigrahy . Rina . Prabhakaran . Manoj . Sahai . Amit . Shelat . Abhi . 1296.68086 . 51 . 7 . 2554–2576 . 10.1.1.185.2130. 6900082.
- Book: Charikar . Moses . Lehman . Eric . Liu . Ding . Panigrahy . Rina . Prabhakaran . Manoj . Rasala . April . Sahai . Amit . Shelat . Abhi. Approximating the Smallest Grammar: Kolmogorov Complexity in Natural Models. http://theory.lcs.mit.edu/~arasala/papers/grammar.pdf . 1192.68397 . Proceedings of the thirty-fourth annual ACM symposium on theory of computing (STOC 2002), Montreal, Quebec, Canada, May 19–21, 2002 . New York, NY . ACM Press . 978-1-581-13495-7 . 792–801 . 2002 . 10.1145/509907.510021 . 282489 .