The census transform (CT) is an image operator that associates to each pixel of a grayscale image a binary string, encoding whether the pixel has smaller intensity than each of its neighbours, one for each bit. It is a non-parametric transform that depends only on relative ordering of intensities, and not on the actual values of intensity, making it invariant with respect to monotonic variations of illumination, and it behaves well in presence of multimodal distributions of intensity, e.g. along object boundaries.[1] It has applications in computer vision, and it is commonly used in visual correspondence problems such as optical flow calculation and disparity estimation.[2]
The census transform is related to the rank transform, that associates to each pixel the number of neighbouring pixels with higher intensity than the pixel itself, and was introduced in the same paper.[3]
The most common version of the census transform uses a 3x3 window, comparing each pixel
p
\xi
\xi(p,p')=\begin{cases} 0&ifp>p'\\ 1&ifp\lep'\\ \end{cases}.
The results of these comparisons are concatenated and the value of the transform is an 8-bit value, that can be easily encoded in a byte.
\begin{array}{|c|c||c|} \hline124&74&32\\ \hline124&64&18\\ \hline157&116&84\\ \hline \end{array} \longrightarrow \begin{array}{|c|c|c|} \hline1&1&0\\ \hline1&x&0\\ \hline1&1&1\\ \hline \end{array} \longrightarrow 11010111
Similarity between images is determined by comparing the values of the census transform for corresponding pixels, using the Hamming distance. Several variations of the algorithm exist, using different size of the window, order of the neighbours in the pattern (row-wise, clockwise, counterclockwise), comparison operator (greater, greater or equal, lesser, lesser or equal).[4]
An extension of the algorithm uses a three-way comparison that allows to represent similar pixels, whose intensity difference is smaller than a tolerance parameter
\epsilon
\xi(p,p')=\begin{cases} 0&ifp-p'>\epsilon\\ 1&if|p-p'|\le\epsilon\\ 2&ifp'-p>\epsilon \end{cases}
whose result can be encoded with two bits for each neighbour, thus doubling the size of the pattern for each pixel.
\begin{array}{|c|c||c|} \hline124&74&32\\ \hline124&64&18\\ \hline157&116&84\\ \hline \end{array} \longrightarrow \begin{array}{|c|c|c|} \hline2&1&0\\ \hline2&x&0\\ \hline2&2&2\\ \hline \end{array} \longrightarrow 21020222