Introselect Explained

Introselect (Musser)
Class:Selection algorithm
Data:Array
Best-Time:O(n)
Time:O(n)
Introselect (Quickselect - Heapselect)
Class:Selection algorithm
Data:Array
Best-Time:O(n)
Time:O(n log n)

In computer science, introselect (short for "introspective selection") is a selection algorithm that is a hybrid of quickselect and median of medians which has fast average performance and optimal worst-case performance. Introselect is related to the introsort sorting algorithm: these are analogous refinements of the basic quickselect and quicksort algorithms, in that they both start with the quick algorithm, which has good average performance and low overhead, but fall back to an optimal worst-case algorithm (with higher overhead) if the quick algorithm does not progress rapidly enough. Both algorithms were introduced by David Musser in, with the purpose of providing generic algorithms for the C++ Standard Library that have both fast average performance and optimal worst-case performance, thus allowing the performance requirements to be tightened.[1]

However, in most C++ Standard Library implementations, a different "introselect" algorithm is used, which combines quickselect and heapselect, and has a worst-case running time of O(n log n).[2] The C++ draft standard, as of 2022, does not have requirements on the worst-case performance, therefore allowing such choice.[3]

Algorithms

Introsort achieves practical performance comparable to quicksort while preserving O(n log n) worst-case behavior by creating a hybrid of quicksort and heapsort. Introsort starts with quicksort, so it achieves performance similar to quicksort if quicksort works, and falls back to heapsort (which has optimal worst-case performance) if quicksort does not progress quickly enough. Similarly, introselect combines quickselect with median of medians to achieve worst-case linear selection with performance similar to quickselect.

Introselect works by optimistically starting out with quickselect and only switching to a worst-case linear-time selection algorithm (the Blum-Floyd-Pratt-Rivest-Tarjan median of medians algorithm) if it recurses too many times without making sufficient progress. The switching strategy is the main technical content of the algorithm. Simply limiting the recursion to constant depth is not good enough, since this would make the algorithm switch on all sufficiently large lists. Musser discusses a couple of simple approaches:

Both approaches limit the recursion depth to k ⌈log n⌉ = O(log n) and the total running time to O(n).

The paper suggested that more research on introselect was forthcoming, but the author retired in 2007 without having published any such further research.

See also

References

Notes and References

  1. "Generic Algorithms", David Musser
  2. Web site: 35968 – nth_element fails to meet its complexity requirements.
  3. Web site: 27.8.3 Nth element [alg.nth.element] ]. Working Draft, Standard for Programming Language C++, eel.is.