The Rabin fingerprinting scheme (aka Polynomial fingerprinting) is a method for implementing fingerprints using polynomials over a finite field. It was proposed by Michael O. Rabin.[1]
Given an n-bit message m0,...,mn-1, we view it as a polynomial of degree n-1 over the finite field GF(2).
f(x)=m0+m1x+\ldots+mn-1xn-1
We then pick a random irreducible polynomial of degree k over GF(2), and we define the fingerprint of the message m to be the remainder
r(x)
f(x)
p(x)
Many implementations of the Rabin–Karp algorithm internally use Rabin fingerprints.
The Low Bandwidth Network Filesystem (LBFS) from MIT uses Rabin fingerprints to implement variable size shift-resistant blocks.[2] The basic idea is that the filesystem computes the cryptographic hash of each block in a file. To save on transfers between the client and server,they compare their checksums and only transfer blocks whose checksums differ. But one problem with this scheme is that a single insertion at the beginning of the file will cause every checksum to change if fixed-sized (e.g. 4 KB) blocks are used. So the idea is to select blocks not based on a specific offset but rather by some property of the block contents. LBFS does this by sliding a 48 byte window over the file and computing the Rabin fingerprint of each window. When the low 13 bits of the fingerprint are zero LBFS calls those 48 bytes a breakpoint and ends the current block and begins a new one. Since the output of Rabin fingerprints are pseudo-random the probability of any given 48 bytes being a breakpoint is
2-13
Note that this is a problem similar to that faced by rsync.