Closest pair of points problem explained
The closest pair of points problem or closest pair problem is a problem of computational geometry: given $n$ points in a metric space, find a pair of points with the smallest distance between them. The closest pair problem for points in the Euclidean plane[1] was among the first geometric problems treated at the origins of the systematic study of the computational complexity of geometric algorithms.
Time bounds
Randomized algorithms that solve the problem in linear time are known, in Euclidean spaces whose dimension is treated as a constant for the purposes of asymptotic analysis.[2][3][4] This is significantly faster than the $O(n^2)$ time (expressed here in big O notation) that would be obtained by a naive algorithm of finding distances between all pairs of points and selecting the smallest.
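For comparison, the naive quadratic approach can be sketched in a few lines of Python. This is an illustrative sketch rather than code from the cited sources; the function name `closest_pair_brute_force` and the representation of points as coordinate tuples are assumptions made here.

```python
from itertools import combinations
from math import dist, inf


def closest_pair_brute_force(points):
    """Return (distance, p, q) for a closest pair among `points`.

    `points` is a sequence of equal-length coordinate tuples.
    With fewer than two points the result is (inf, None, None).
    """
    best = (inf, None, None)
    # Examine every unordered pair once: n * (n - 1) / 2 distance computations.
    for p, q in combinations(points, 2):
        d = dist(p, q)
        if d < best[0]:
            best = (d, p, q)
    return best
```

Because every pair is examined, the sketch works in any dimension, but its running time grows quadratically with the number of points.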
It is also possible to solve the problem without randomization, in random-access machine models of computation with unlimited memory that allow the use of the floor function, in near-linear $O(n \log\log n)$ time.[5] In even more restricted models of computation, such as the algebraic decision tree, the problem can be solved in the somewhat slower $O(n \log n)$ time bound,[6] and this is optimal for this model, by a reduction from the element uniqueness problem. Both sweep line algorithms and divide-and-conquer algorithms with this slower time bound are commonly taught as examples of these algorithm design techniques.[7]
Linear-time randomized algorithms
A linear expected time randomized algorithm of Rabin (1976), modified slightly by Richard Lipton to make its analysis easier, proceeds as follows, on an input set $S$ consisting of $n$ points in a $k$-dimensional Euclidean space:
- Select $n$ pairs of points uniformly at random, with replacement, and let $d$ be the minimum distance of the selected pairs.
- Round the input points to a square grid of points whose size (the separation between adjacent grid points) is $d$, and use a hash table to collect together pairs of input points that round to the same grid point.
- For each input point, compute the distance to all other inputs that either round to the same grid point or to another grid point within the Moore neighborhood of $3^k-1$ surrounding grid points.
- Return the smallest of the distances computed throughout this process.
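The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `closest_pair_rabin_lipton`, the optional `rng` parameter, the use of `floor` to assign points to grid cells, and the early return for duplicate points are all choices made here for illustration.

```python
import random
from collections import defaultdict
from itertools import product
from math import dist, floor, inf


def closest_pair_rabin_lipton(points, rng=None):
    """Return the smallest distance between two of the given points.

    `points` is a sequence of at least two equal-length coordinate tuples.
    `rng` is an optional random.Random instance (an assumption of this sketch).
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    rng = rng or random.Random()
    k = len(points[0])

    # Step 1: sample n pairs with replacement; d is the smallest sampled distance.
    d = inf
    for _ in range(n):
        i, j = rng.sample(range(n), 2)  # two distinct indices
        d = min(d, dist(points[i], points[j]))
    if d == 0:
        return 0.0  # a sampled pair coincides, so no grid is needed

    # Step 2: hash each point index by its grid cell of side length d.
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        grid[tuple(floor(c / d) for c in p)].append(idx)

    # Step 3: compare each point with the points in its own cell and in the
    # 3^k - 1 surrounding cells (the Moore neighborhood).
    best = d
    offsets = list(product((-1, 0, 1), repeat=k))
    for cell, bucket in grid.items():
        for off in offsets:
            other = tuple(c + o for c, o in zip(cell, off))
            if other not in grid:
                continue
            for i in bucket:
                for j in grid[other]:
                    if i < j:
                        best = min(best, dist(points[i], points[j]))

    # Step 4: the smallest distance seen in sampling or in the grid scan.
    return best
```

Since $d$ is the distance of an actual pair, it is never smaller than the true minimum, and any pair closer than $d$ falls in the same or adjacent cells, so the scan in step 3 finds it. The sketch makes no attempt to verify the expected running time in practice.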
The algorithm will always correctly determine the closest pair, because it maps any pair closer than distance $d$ to the same grid point or to adjacent grid points. The uniform sampling of pairs in the first step of the algorithm (compared to a different method of Rabin for sampling a similar number of pairs) simplifies the proof that the expected number of distances computed by the algorithm is linear.[4]
A different algorithm, due to Khuller & Matias (1995), instead goes through two phases: a random iterated filtering process that approximates the closest distance to within an approximation ratio of $2\sqrt{k}$, together with a finishing step that turns this approximate distance into the exact closest distance. The filtering process repeats the following steps, until $S$ becomes empty:
- Choose a point $p$ uniformly at random from $S$.
- Compute the distances from $p$ to all the other points of $S$ and let $d$ be the minimum such distance.
- Round the input points to a square grid of size $d/(2\sqrt{k})$, and delete from $S$ all points whose Moore neighborhood has no other points.
The approximate distance found by this filtering process is the final value of $d$, computed in the step before $S$ becomes empty. Each step removes all points whose closest neighbor is at distance $d$ or greater, at least half of the points in expectation, from which it follows that the total expected time for filtering is linear. Once an approximate value of $d$ is known, it can be used for the final steps of Rabin's algorithm; in these steps each grid point has a constant number of inputs rounded to it, so again the time is linear.[3]
Dynamic closest-pair problem
The dynamic version of the closest-pair problem is stated as follows:
- Given a dynamic set of objects, find algorithms and data structures for efficient recalculation of the closest pair of objects each time the objects are inserted or deleted.
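To make the problem statement concrete, the following Python sketch shows the interface such a data structure exposes, using a deliberately naive baseline. It is not any of the data structures from the cited literature: the class name `NaiveDynamicClosestPair` and its methods are hypothetical, insertion costs time linear in the number of stored points, and a deletion that removes one endpoint of the current closest pair triggers a full quadratic recomputation.

```python
from itertools import combinations
from math import dist, inf


class NaiveDynamicClosestPair:
    """Naive baseline: maintains the closest pair of a changing point set."""

    def __init__(self):
        self._points = set()
        self._best = (inf, None, None)  # (distance, p, q)

    def insert(self, p):
        p = tuple(p)
        if p in self._points:
            return  # set semantics: duplicates are ignored
        # Only pairs involving the new point can improve the minimum.
        for q in self._points:
            d = dist(p, q)
            if d < self._best[0]:
                self._best = (d, p, q)
        self._points.add(p)

    def delete(self, p):
        p = tuple(p)
        if p not in self._points:
            return
        self._points.remove(p)
        # The cached pair stays valid unless one of its endpoints was removed.
        if p in self._best[1:]:
            self._best = min(
                ((dist(a, b), a, b) for a, b in combinations(self._points, 2)),
                key=lambda t: t[0],
                default=(inf, None, None),
            )

    def closest_pair(self):
        """Return (distance, p, q); distance is inf with fewer than two points."""
        return self._best
```

Queries take constant time in this baseline, but updates are far slower than the polylogarithmic bounds achieved by specialized structures.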
If the bounding box for all points is known in advance and the constant-time floor function is available, then an expected $O(n)$-space data structure has been suggested that supports expected-time $O(\log n)$ insertions and deletions and constant query time. When modified for the algebraic decision tree model, insertions and deletions would require $O(\log^2 n)$ expected time.[8] The complexity of the dynamic closest pair algorithm cited above is exponential in the dimension $d$, and therefore such an algorithm becomes less suitable for high-dimensional problems.
An algorithm for the dynamic closest-pair problem in $d$-dimensional space was developed by Sergey Bespamyatnikh in 1998.[9] Points can be inserted and deleted in $O(\log n)$ time per point (in the worst case).
Notes and References
- Shamos, Michael Ian; Hoey, Dan (1975). "Closest-point problems". 16th Annual Symposium on Foundations of Computer Science, Berkeley, California, USA, October 13-15, 1975. IEEE Computer Society. pp. 151–162. doi:10.1109/SFCS.1975.8.
- Rabin, M. (1976). "Probabilistic algorithms". Algorithms and Complexity: Recent Results and New Directions. Academic Press. pp. 21–39. As cited by .
- Khuller, Samir; Matias, Yossi (1995). "A simple randomized sieve algorithm for the closest-pair problem". Vol. 118, no. 1, pp. 34–37. doi:10.1006/inco.1995.1049. MR 1329236. S2CID 206566076.
- Lipton, Richard (24 September 2011). "Rabin Flips a Coin". Gödel's Lost Letter and P=NP (web site).
- Fortune, Steve; Hopcroft, John (1979). "A note on Rabin's nearest-neighbor algorithm". Vol. 8, no. 1, pp. 20–23. doi:10.1016/0020-0190(79)90085-1. hdl:1813/7460. MR 515507.
- Clarkson, Kenneth L. (1983). "Fast algorithms for the all nearest neighbors problem". 24th Annual Symposium on Foundations of Computer Science, Tucson, Arizona, USA, 7-9 November 1983. IEEE Computer Society. pp. 226–232. doi:10.1109/SFCS.1983.16.
- Kleinberg, Jon M.; Tardos, Éva (2006). "5.4 Finding the closest pair of points". Algorithm Design. Addison-Wesley. pp. 225–231. ISBN 978-0-321-37291-8.
- Golin, Mordecai; Raman, Rajeev; Schwarz, Christian; Smid, Michiel (1998). "Randomized data structures for the dynamic closest-pair problem". Vol. 27, no. 4, pp. 1036–1072. doi:10.1137/S0097539794277718. MR 1622005. S2CID 1242364.
- Bespamyatnikh, S. N. (1998). "An optimal algorithm for closest-pair maintenance". Vol. 19, no. 2, pp. 175–195. doi:10.1007/PL00009340. MR 1600047.