Levenshtein coding explained
Levenshtein coding is a universal code encoding the non-negative integers developed by Vladimir Levenshtein.[1] [2]
Encoding
The code of zero is "0"; to code a positive number:
- Initialize the step count variable C to 1.
- Write the binary representation of the number without the leading "1" to the beginning of the code.
- Let M be the number of bits written in step 2.
- If M is not 0, increment C, repeat from step 2 with M as the new number.
- Write C "1" bits and a "0" to the beginning of the code.
The code begins:
Number | Encoding | Implied probability |
---|
0 | 0 | 1/2 |
|
1 | 10 | 1/4 |
|
2 | 110 0 | 1/16 |
3 | 110 1 | 1/16 |
|
4 | 1110 0 00 | 1/128 |
5 | 1110 0 01 | 1/128 |
6 | 1110 0 10 | 1/128 |
7 | 1110 0 11 | 1/128 |
|
8 | 1110 1 000 | 1/256 |
9 | 1110 1 001 | 1/256 |
10 | 1110 1 010 | 1/256 |
11 | 1110 1 011 | 1/256 |
12 | 1110 1 100 | 1/256 |
13 | 1110 1 101 | 1/256 |
14 | 1110 1 110 | 1/256 |
15 | 1110 1 111 | 1/256 |
|
16 | 11110 0 00 0000 | 1/4096 |
17 | 11110 0 00 0001 | 1/4096 | |
To decode a Levenshtein-coded integer:
- Count the number of "1" bits until a "0" is encountered.
- If the count is zero, the value is zero, otherwise
- Discard the "1" bits just counted and the first "0" encountered
- Start with a variable N, set it to a value of 1 and repeat count minus 1 times:
- Read N bits (and remove them from the encoded integer), prepend "1", assign the resulting value to N
The Levenshtein code of a positive integer is always one bit longer than the Elias omega code of that integer. However, there is a Levenshtein code for zero, whereas Elias omega coding would require the numbers to be shifted so that a zero is represented by the code for one instead.
Example code
Encoding
void levenshteinEncode(char* source, char* dest)
Decoding
void levenshteinDecode(char* source, char* dest)
See also
Notes and References
- Web site: 1968 paper by V. I. Levenshtein (in Russian) .
- Book: Variable-length codes for data compression . David Salomon . Springer . 2007 . 978-1-84628-958-3 . 80 .