In computer science, geometric hashing is a method for efficiently finding two-dimensional objects represented by discrete points that have undergone an affine transformation, though extensions exist to other object representations and transformations.
In an off-line step, the objects are encoded by treating each pair of points as a geometric basis. The remaining points can be represented in an invariant fashion with respect to this basis using two parameters. For each point, its quantized transformed coordinates are stored in the hash table as a key, and the indices of the basis points as a value. Then a new pair of basis points is selected, and the process is repeated.
In the on-line (recognition) step, randomly selected pairs of data points are considered as candidate bases. For each candidate basis, the remaining data points are encoded according to the basis, and possible correspondences from the object are found in the previously constructed table. The candidate basis is accepted if a sufficiently large number of the data points index a consistent object basis.
Geometric hashing was originally proposed in computer vision for object recognition in 2D and 3D,[1] and was later applied to other problems such as the structural alignment of proteins.[2][3]
Geometric hashing is a method used for object recognition. Suppose we want to check whether a model image can be seen in an input image; geometric hashing can accomplish this. The method can also be used to recognize one of multiple objects in a base. In this case the hash table should store not only the pose information but also the index of the object model in the base.
For simplicity, this example uses only a few point features and assumes that their descriptors are given by their coordinates alone (in practice, local descriptors such as SIFT could be used for indexing).
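Feature points of the model, P1 to P5: (12,17); (45,13); (40,46); (20,35); (35,25).
Their quantized coordinates (x', y') with respect to the basis (P2, P4): (-0.75,-1.25); (1.00,0.00); (-0.50,1.25); (-1.00,0.00); (0.00,0.25).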
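
The step from image coordinates to basis coordinates can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. It assumes that the origin lies at the midpoint of the two basis points, that the x' axis points toward the first basis point, that y' is x' rotated by 90 degrees counter-clockwise, that the unit length is half the distance between the basis points, and that coordinates are rounded to the nearest multiple of 0.25. These conventions were chosen because they reproduce the values listed above; the names `to_basis`, `quantize`, and `encode` are hypothetical.

```python
import math

# Model feature points P1..P5 from the example.
MODEL = [(12, 17), (45, 13), (40, 46), (20, 35), (35, 25)]
BIN = 0.25  # assumed bin size for quantization


def to_basis(p, b0, b1):
    """Coordinates of p in the frame defined by the basis pair (b0, b1).

    Assumed convention: origin at the midpoint of b0 and b1, x' axis
    toward b0, unit length equal to half the distance between b0 and b1.
    """
    ox, oy = (b0[0] + b1[0]) / 2, (b0[1] + b1[1]) / 2
    ux, uy = b0[0] - ox, b0[1] - oy   # x' direction, length = one unit
    n2 = ux * ux + uy * uy            # squared unit length
    dx, dy = p[0] - ox, p[1] - oy
    x = (dx * ux + dy * uy) / n2      # projection onto x'
    y = (-dx * uy + dy * ux) / n2     # projection onto y' (x' rotated 90 deg ccw)
    return x, y


def quantize(v, step=BIN):
    """Round v to the nearest multiple of step."""
    return math.floor(v / step + 0.5) * step


def encode(points, i, j):
    """Quantized basis coordinates of every point for basis (points[i], points[j])."""
    return [tuple(quantize(c) for c in to_basis(p, points[i], points[j]))
            for p in points]


if __name__ == "__main__":
    # Basis (P2, P4) corresponds to the zero-based indices 1 and 3.
    for name, q in zip(("P1", "P2", "P3", "P4", "P5"), encode(MODEL, 1, 3)):
        print(name, q)
```

Under these assumptions the two basis points always map to (1, 0) and (-1, 0), whichever pair is chosen.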
Hash Table:
Vector (x', y') | Basis
---|---
(-0.75,-1.25) | (P2,P4)
(1.00,0.00) | (P2,P4)
(-0.50,1.25) | (P2,P4)
(-1.00,0.00) | (P2,P4)
(0.00,0.25) | (P2,P4)
(1.00,0.00) | (P1,P3)
(0.00,1.25) | (P1,P3)
(-1.00,0.00) | (P1,P3)
(0.00,-0.25) | (P1,P3)
(0.00,0.50) | (P1,P3)
Most hash table implementations cannot map identical keys to different values. In practice, therefore, the keys (1.0, 0.0) and (-1.0, 0.0), which belong to the basis points themselves and recur for every basis, would not be encoded in the hash table.
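
One possible way to deal with repeated keys, shown here as an assumption rather than as the method prescribed by the example, is to let each key hold a list of bases and to skip the basis points themselves. The sketch below builds such a table for the five model points and then counts votes for one candidate basis in a scene. The function names (`key`, `build_table`, `vote`), the use of integer bin indices as keys, and the synthetic scene (the model rotated by 90 degrees, scaled by 2, and translated) are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict
from itertools import permutations

BIN = 0.25  # assumed bin size
MODEL = [(12, 17), (45, 13), (40, 46), (20, 35), (35, 25)]  # P1..P5


def key(p, b0, b1):
    """Integer bin indices of p in the frame of basis (b0, b1).

    Assumed convention: origin at the midpoint, x' toward b0,
    unit length equal to half the basis length.
    """
    ox, oy = (b0[0] + b1[0]) / 2, (b0[1] + b1[1]) / 2
    ux, uy = b0[0] - ox, b0[1] - oy
    n2 = ux * ux + uy * uy
    dx, dy = p[0] - ox, p[1] - oy
    x = (dx * ux + dy * uy) / n2
    y = (-dx * uy + dy * ux) / n2
    return (math.floor(x / BIN + 0.5), math.floor(y / BIN + 0.5))


def build_table(model):
    """Off-line step: map each quantized key to the list of bases producing it."""
    table = defaultdict(list)
    for i, j in permutations(range(len(model)), 2):  # ordered basis pairs
        for k, p in enumerate(model):
            if k in (i, j):
                continue  # basis points always land on (+1, 0) and (-1, 0)
            table[key(p, model[i], model[j])].append((i, j))
    return table


def vote(table, scene, a, b):
    """On-line step for one candidate basis (scene[a], scene[b])."""
    votes = defaultdict(int)
    for k, p in enumerate(scene):
        if k in (a, b):
            continue
        # set() ensures one scene point gives at most one vote per model basis
        for basis in set(table.get(key(p, scene[a], scene[b]), ())):
            votes[basis] += 1
    return dict(votes)


if __name__ == "__main__":
    table = build_table(MODEL)
    # Synthetic scene: model rotated by 90 degrees, scaled by 2, shifted by (100, 50).
    scene = [(-2 * y + 100, 2 * x + 50) for x, y in MODEL]
    votes = vote(table, scene, 1, 3)
    best = max(votes, key=votes.get)
    print(best, votes[best])
```

In this sketch a candidate basis would be accepted when its best vote count exceeds a threshold. For the synthetic scene, the model basis with indices (1, 3), that is (P2, P4), receives a vote from each of the three remaining points.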
As described so far, the method handles only scaling, translation, and rotation. However, the input image may contain a mirrored copy of the object, and geometric hashing should be able to find the object in that case, too. There are two ways to detect mirrored objects.
As in the example above, hashing also applies to higher-dimensional data. For three-dimensional data points, three points are needed for the basis. The first two points define the x-axis, and the third point defines the y-axis (together with the first point). The z-axis is perpendicular to the plane of these two axes, with its direction given by the right-hand rule. Note that the order of the points affects the resulting basis.
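
The construction of a three-point basis can be sketched as follows. This is only an illustration under stated assumptions. The origin is placed at the first point, the y-axis is obtained by removing the x-component from the direction toward the third point (a Gram-Schmidt step, which the description above does not spell out), and all axes are normalised to unit length. The helper names are hypothetical.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(v):
    n = math.sqrt(dot(v, v))
    return (v[0] / n, v[1] / n, v[2] / n)


def frame(p1, p2, p3):
    """Orthonormal axes (x, y, z) built from three non-collinear points.

    Assumption: p1 is the origin, p1 -> p2 gives the x-axis, and p3 fixes
    the y-axis after its x-component has been removed.
    """
    x = unit(sub(p2, p1))
    v = sub(p3, p1)
    d = dot(v, x)
    y = unit((v[0] - d * x[0], v[1] - d * x[1], v[2] - d * x[2]))
    z = cross(x, y)  # right-hand rule
    return x, y, z


if __name__ == "__main__":
    a, b, c = (0, 0, 0), (2, 0, 0), (1, 3, 0)
    print(frame(a, b, c))
    print(frame(b, a, c))  # swapping the first two points flips x and z
```

Swapping the first two points reverses the x-axis and therefore the z-axis as well, which illustrates why the order of the basis points matters.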