Non-cryptographic hash function explained
The non-cryptographic hash functions (NCHFs) are hash functions intended for applications that do not need the rigorous security requirements of the cryptographic hash functions (e.g., preimage resistance) and therefore can be faster and less resource-intensive. Typical examples of CPU-optimized non-cryptographic hashes include FNV-1a, Murmur3. Some of the non-cryptographic hash functions are used in the cryptographic applications (usually in combination with other cryptographic primitives), in this case they are described as universal hash functions.
Applications and requirements
Among the typical uses of the non-cryptographic hash functions are bloom filters, hash tables, and count sketches. These applications require, in addition to speed, uniform distribution and avalanche properties. Collision resistance is an additional feature that can be useful against hash flooding attacks; simple NCHFs, like the Cyclic redundancy check (CRC), have essentially no collision resistance and thus cannot be used with an input open to manipulation by an attacker.
NCHFs are used in diverse systems: lexical analyzers, compilers, databases, communication networks, videogames, DNS servers, filesystems - anywhere in computing where there is a need to find the information very quickly (preferably in the O(1) time, which will also achieve perfect scalability).
Estébanez et al. list the "most important" NCHFs:
- Fowler–Noll–Vo hash function (FNV) was created by Glenn Fowler and Phong Vo in 1991 with contributions from Landon Curt Noll. FNV with its two variants, FNV-1 and FNV-1a, is very widely used in software that ranges from Linux and FreeBSD OSes, DNS servers, NFS to Twitter, PlayStation2, and Xbox console;
- lookup3 was created by Robert Jenkins. Hash is also widely used and can be found in PostgreSQL, Linux, Perl, Ruby, and Infoseek;
- SuperFastHash was created by Paul Hsieh using ideas from FNV and lookup3, with one of the goals being the high degree of avalanche effect. The hash is used in WebKit (part of Safari and Google Chrome);
- MurmurHash2 was created by Austin Appleby in 2008 and is used in libmemcached, Maatkit, and Apache Hadoop;
- DJBX33A ("Daniel J. Bernstein, Times 33 with Addition"). This very simple multiplication and addition function was proposed by Daniel J. Bernstein. It is fast and efficient during initialization. Many programming environments based on PHP 5, Python, ASP.NET use variants of this hash. The hash is easy to flood, exposing the servers;
- BuzHash was created by Robert Uzgalis in 1992. It is designed around a substitution table and can tolerate extremely skewed distributions on the input;
- DEK is an early multiplicative hash based on a proposal by Donald E. Knuth, being one of the oldest hashes that are still being used.
Design
Non-cryptographic hash functions optimized for software frequently involve the multiplication operation. Since in hardware multiplication is resource-intensive and frequency-limiting, ASIC-friendlier designs had been proposed, including SipHash (that has an additional benefit of being able to use a secret key for message authentication), NSGAhash and XORhash. Although technically the lightweight cryptography can be used for the same applications, the latency of its algorithms is usually way too high due to a large number of rounds. Sateesan et al. propose using the reduced-round versions of the lightweight hashes and ciphers as non-cryptographic hash functions.
Many NCHFs have a relatively small result size (e.g., 64 bits for SipHash or even less): large result size does not increase the performance of the target applications, but slows down the calculation, as more bits need to be generated.
See also
- A list of non-cryptographic hash functions
Sources
- Sateesan . Arish . Biesmans . Jelle . Claesen . Thomas . Vliegen . Jo . Mentens . Nele . Optimized algorithms and architectures for fast non-cryptographic hash functions in hardware . Microprocessors and Microsystems . April 2023 . 98 . 104782 . 0141-9331 . 10.1016/j.micpro.2023.104782 .
- Estébanez . César . Saez . Yago . Recio . Gustavo . Isasi . Pedro . Performance of the most common non-cryptographic hash functions . Software: Practice and Experience . 28 January 2013 . 44 . 6 . 681–698 . 0038-0644 . 10.1002/spe.2179 .
- Book: Mark . Stamp . 8 November 2011 . Information Security: Principles and Practice . 2 . John Wiley & Sons . 978-1-118-02796-7 . 1039294381 . Non-Cryptographic Hashes . https://books.google.com/books?id=UW3SS9P9hdEC&pg=PT118.
- Book: Ripon . Patgiri . Sabuzima . Nayak . Naresh Babu . Muppalaneni . 25 April 2023 . Bloom Filter: A Data Structure for Computer Networking, Big Data, Cloud Computing, Internet of Things, Bioinformatics and Beyond . Academic Press . 37–38 . 978-0-12-823646-8 . 1377693258 .
- Book: Mittelbach, Arno . Fischlin . Marc . The Theory of Hash Functions and Random Oracles . Non-cryptographic Hashing . Springer International Publishing . Cham . 2021 . 978-3-030-63286-1 . 10.1007/978-3-030-63287-8_7 . 303-334.