Binary search tree explained
In computer science, a binary search tree (BST), also called an ordered or sorted binary tree, is a rooted binary tree data structure with the key of each internal node being greater than all the keys in the respective node's left subtree and less than the ones in its right subtree. The time complexity of operations on the binary search tree is linear with respect to the height of the tree.
Binary search trees allow binary search for fast lookup, addition, and removal of data items. Since the nodes in a BST are laid out so that each comparison skips about half of the remaining tree, the lookup time is proportional to the binary logarithm of the number of nodes. BSTs were devised in the 1960s for the problem of efficient storage of labeled data and are attributed to Conway Berners-Lee and David Wheeler.
The performance of a binary search tree is dependent on the order of insertion of the nodes into the tree since arbitrary insertions may lead to degeneracy; several variations of the binary search tree can be built with guaranteed worst-case performance. The basic operations include: search, traversal, insert and delete. BSTs with guaranteed worst-case complexities perform better than an unsorted array, which would require linear search time.
The complexity analysis of BST shows that, on average, insert, delete and search take $O(\log n)$ time for $n$ nodes. In the worst case, they degrade to that of a singly linked list: $O(n)$. To address the boundless increase of the tree height with arbitrary insertions and deletions, self-balancing variants of BSTs are introduced to bound the worst lookup complexity to that of the binary logarithm.
AVL trees were the first self-balancing binary search trees, invented in 1962 by Georgy Adelson-Velsky and Evgenii Landis.
Binary search trees can be used to implement abstract data types such as dynamic sets, lookup tables and priority queues, and used in sorting algorithms such as tree sort.
History
The binary search tree algorithm was discovered independently by several researchers, including P.F. Windley, Andrew Donald Booth, Andrew Colin, and Thomas N. Hibbard.[1] [2] The algorithm is attributed to Conway Berners-Lee and David Wheeler, who used it for storing labeled data on magnetic tapes in 1960.[3] One of the earliest and most popular binary search tree algorithms is that of Hibbard.
The time complexities of a binary search tree increase boundlessly with the tree height if the nodes are inserted in an arbitrary order; therefore, self-balancing binary search trees were introduced to bound the height of the tree to $O(\log n)$.[4] Various height-balanced binary search trees were introduced to confine the tree height, such as AVL trees, treaps, and red–black trees.[5]
The AVL tree was invented by Georgy Adelson-Velsky and Evgenii Landis in 1962 for the efficient organization of information.[6] [7] It was the first self-balancing binary search tree to be invented.[8]
Overview
A binary search tree is a rooted binary tree in which the nodes are arranged in strict total order: the nodes with keys greater than that of any particular node A are stored in the right subtree of A, and the nodes with keys equal to or less than that of A are stored in the left subtree of A, satisfying the binary search property.[9] [10]
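The ordering property can be checked mechanically. The following Python sketch is only an illustration: the names `Node` and `is_bst` are hypothetical, and it assumes the convention stated above, in which keys equal to a node's key belong to its left subtree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical node type used only for this illustration."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_bst(node, low=None, high=None):
    """Return True if every key in the subtree lies in the range (low, high].

    A bound of None means "unbounded" on that side.
    """
    if node is None:
        return True
    if low is not None and node.key <= low:
        return False
    if high is not None and node.key > high:
        return False
    # Keys in the left subtree must be <= node.key,
    # keys in the right subtree must be > node.key.
    return (is_bst(node.left, low, node.key)
            and is_bst(node.right, node.key, high))


valid = Node(8, Node(3, Node(1), Node(6)), Node(10))
# 9 sits in the left subtree of 8, which violates the property.
invalid = Node(8, Node(3, None, Node(9)), Node(10))
```

Passing the bounds down the recursion matters: comparing each node only with its direct children would wrongly accept the second tree.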
Binary search trees are also effective in sorting and search algorithms. However, the search complexity of a BST depends on the order in which the nodes are inserted and deleted: in the worst case, successive operations may cause the tree to degenerate into a structure resembling a singly linked list (an "unbalanced tree"), which has the same worst-case complexity as a linked list.[11]
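The effect of insertion order on the shape of the tree can be observed with a small experiment. The sketch below is an assumption-laden illustration (the helper names `insert`, `height` and `build` are hypothetical): it inserts the same keys once in sorted order and once in shuffled order into a plain, unbalanced BST and compares the resulting heights.

```python
import random


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key as a new leaf (iteratively) and return the root."""
    new = Node(key)
    if root is None:
        return new
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = new
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = new
                return root
            cur = cur.right


def height(root):
    """Number of nodes on the longest root-to-leaf path (0 for an empty tree)."""
    best, stack = 0, [(root, 1)]
    while stack:
        node, depth = stack.pop()
        if node is not None:
            best = max(best, depth)
            stack.append((node.left, depth + 1))
            stack.append((node.right, depth + 1))
    return best


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


n = 200
sorted_height = height(build(range(n)))   # every node has only a right child
keys = list(range(n))
random.Random(0).shuffle(keys)
shuffled_height = height(build(keys))     # typically far smaller than n
```

With sorted input every new key becomes the right child of the previous one, so the height equals the number of nodes; the shuffled input usually produces a much shallower tree.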
Binary search trees are also a fundamental data structure used in construction of abstract data structures such as sets, multisets, and associative arrays.
Operations
Searching
Searching in a binary search tree for a specific key can be programmed recursively or iteratively.
Searching begins by examining the root node. If the tree is NIL, the key being searched for does not exist in the tree. Otherwise, if the key equals that of the root, the search is successful and the node is returned. If the key is less than that of the root, the search proceeds by examining the left subtree. Similarly, if the key is greater than that of the root, the search proceeds by examining the right subtree. This process is repeated until the key is found or the remaining subtree is NIL. If the searched key is not found after a NIL subtree is reached, then the key is not present in the tree.
Recursive search
The following pseudocode implements the BST search procedure through recursion.
    Recursive-Tree-Search(x, key)
        if x = NIL or key = x.key then
            return x
        if key < x.key then
            return Recursive-Tree-Search(x.left, key)
        else
            return Recursive-Tree-Search(x.right, key)
        end if
The recursive procedure continues until a NIL or the key being searched for is encountered.
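As a runnable counterpart to the pseudocode, the recursive search might be written in Python as follows. This is a minimal sketch: the `Node` class is an assumption introduced for the example, and `None` plays the role of NIL.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def recursive_tree_search(x, key):
    """Return the node holding key in the subtree rooted at x, or None."""
    if x is None or key == x.key:
        return x
    if key < x.key:
        return recursive_tree_search(x.left, key)
    return recursive_tree_search(x.right, key)


root = Node(8, Node(3, Node(1), Node(6)), Node(10, None, Node(14)))
found = recursive_tree_search(root, 6)     # the node with key 6
missing = recursive_tree_search(root, 7)   # None: 7 is not in the tree
```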
Iterative search
The recursive version of the search can be "unrolled" into a while loop. On most machines, the iterative version is found to be more efficient.
    Iterative-Tree-Search(x, key)
        while x ≠ NIL and key ≠ x.key do
            if key < x.key then
                x := x.left
            else
                x := x.right
            end if
        repeat
        return x
Since the search may proceed till some leaf node, the running time complexity of BST search is $O(h)$, where $h$ is the height of the tree. However, the worst case for BST search is $O(n)$, where $n$ is the total number of nodes in the BST, because an unbalanced BST may degenerate to a linked list. If the BST is height-balanced, however, the height is $O(\log n)$.
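The iterative search and its dependence on the tree height can be illustrated in Python. The sketch below is not a definitive implementation; `Node` and the returned step count are assumptions added to make the $O(h)$ bound visible: the loop visits at most one node per level.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def iterative_tree_search(x, key):
    """Return (node, steps): the matching node or None, and the nodes visited."""
    steps = 0
    while x is not None:
        steps += 1
        if key == x.key:
            break
        x = x.left if key < x.key else x.right
    return x, steps


# A balanced tree of height 3 and a degenerate chain holding the same keys.
balanced = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
chain = None
for k in range(7, 0, -1):          # builds 1 -> 2 -> ... -> 7 via right links
    chain = Node(k, None, chain)

node_b, steps_b = iterative_tree_search(balanced, 7)
node_c, steps_c = iterative_tree_search(chain, 7)
```

Searching for the largest key visits 3 nodes in the balanced tree and all 7 nodes in the chain, matching the height of each tree.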
Successor and predecessor
For certain operations, given a node x, finding the successor or predecessor of x is crucial. Assuming all the keys of a BST are distinct, the successor of a node x in a BST is the node with the smallest key greater than x's key. On the other hand, the predecessor of a node x in a BST is the node with the largest key smaller than x's key. The following pseudocode finds the successor and predecessor of a node x in a BST.[12] [13]

    BST-Successor(x)
        if x.right ≠ NIL then
            return BST-Minimum(x.right)
        end if
        y := x.parent
        while y ≠ NIL and x = y.right do
            x := y
            y := y.parent
        repeat
        return y

    BST-Predecessor(x)
        if x.left ≠ NIL then
            return BST-Maximum(x.left)
        end if
        y := x.parent
        while y ≠ NIL and x = y.left do
            x := y
            y := y.parent
        repeat
        return y
Operations such as finding a node in a BST whose key is the maximum or minimum are critical in certain operations, such as determining the successor and predecessor of nodes. Following is the pseudocode for the operations.
    BST-Maximum(x)
        while x.right ≠ NIL do
            x := x.right
        repeat
        return x

    BST-Minimum(x)
        while x.left ≠ NIL do
            x := x.left
        repeat
        return x
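The four procedures translate almost line by line into Python. The sketch below assumes nodes that carry a `parent` reference, as the pseudocode does; the `Node` class and the `attach` helper are hypothetical and exist only to build a test tree.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def attach(parent, left=None, right=None):
    """Hypothetical helper: link children to parent and set parent pointers."""
    parent.left, parent.right = left, right
    for child in (left, right):
        if child is not None:
            child.parent = parent
    return parent


def bst_minimum(x):
    while x.left is not None:
        x = x.left
    return x


def bst_maximum(x):
    while x.right is not None:
        x = x.right
    return x


def bst_successor(x):
    if x.right is not None:
        return bst_minimum(x.right)
    y = x.parent
    # Climb until we leave a left subtree; that ancestor is the successor.
    while y is not None and x is y.right:
        x, y = y, y.parent
    return y


def bst_predecessor(x):
    if x.left is not None:
        return bst_maximum(x.left)
    y = x.parent
    # Mirror image: climb until we leave a right subtree.
    while y is not None and x is y.left:
        x, y = y, y.parent
    return y


n = {k: Node(k) for k in (8, 3, 10, 1, 6, 4, 7, 14)}
attach(n[8], n[3], n[10])
attach(n[3], n[1], n[6])
attach(n[6], n[4], n[7])
attach(n[10], None, n[14])
```

In this tree the successor of 7 is the root 8 (found by climbing), while the successor of 3 is 4 (the minimum of its right subtree).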
Insertion
Operations such as insertion and deletion cause the BST representation to change dynamically. The data structure must be modified in such a way that the properties of BST continue to hold. New nodes are inserted as leaf nodes in the BST. Following is an iterative implementation of the insertion operation.
    1   BST-Insert(T, z)
    2       y := NIL
    3       x := T.root
    4       while x ≠ NIL do
    5           y := x
    6           if z.key < x.key then
    7               x := x.left
    8           else
    9               x := x.right
    10          end if
    11      repeat
    12      z.parent := y
    13      if y = NIL then
    14          T.root := z
    15      else if z.key < y.key then
    16          y.left := z
    17      else
    18          y.right := z
    19      end if
The procedure maintains a "trailing pointer" y as a parent of x. After initialization on line 2, the while loop along lines 4-11 causes the pointers to be updated. If y is NIL, the BST is empty, thus z is inserted as the root node of the binary search tree T; if it is not NIL, insertion proceeds by comparing the keys to that of y on lines 15-19 and the node is inserted accordingly.
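A Python rendering of the insertion procedure could look like the sketch below. The `Tree` and `Node` classes are assumptions made for the example; the variable names `x`, `y` and `z` follow the pseudocode.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


class Tree:
    def __init__(self):
        self.root = None


def bst_insert(T, z):
    """Insert node z into tree T as a new leaf."""
    y = None            # trailing pointer: parent of x
    x = T.root
    while x is not None:
        y = x
        x = x.left if z.key < x.key else x.right
    z.parent = y
    if y is None:
        T.root = z      # the tree was empty
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z


T = Tree()
for key in (8, 3, 10, 1, 6):
    bst_insert(T, Node(key))
```

After these insertions the root holds 8, its left child holds 3, and 6 hangs as the right child of 3.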
Deletion
The deletion of a node, say D, from the binary search tree BST has three cases:
- If D is a leaf node, the reference to D in its parent node gets replaced by NIL and consequently D is removed from the BST, as shown in (a).
- If D has only one child, the child node of D gets elevated by modifying the parent node of D to point to the child node, consequently taking D's position in the tree, as shown in (b) and (c).
- If D has both left and right children, the successor of D, say E, displaces D by following the two cases:
  - If E is D's right child, as shown in (d), E displaces D and E's right child remains unchanged.
  - If E lies within D's right subtree but is not D's right child, as shown in (e), E first gets replaced by its own right child, and then it displaces D's position in the tree.
The following pseudocode implements the deletion operation in a binary search tree.
    1   BST-Delete(BST, D)
    2       if D.left = NIL then
    3           Shift-Nodes(BST, D, D.right)
    4       else if D.right = NIL then
    5           Shift-Nodes(BST, D, D.left)
    6       else
    7           E := BST-Successor(D)
    8           if E.parent ≠ D then
    9               Shift-Nodes(BST, E, E.right)
    10              E.right := D.right
    11              E.right.parent := E
    12          end if
    13          Shift-Nodes(BST, D, E)
    14          E.left := D.left
    15          E.left.parent := E
    16      end if

    Shift-Nodes(BST, u, v)
        if u.parent = NIL then
            BST.root := v
        else if u = u.parent.left then
            u.parent.left := v
        else
            u.parent.right := v
        end if
        if v ≠ NIL then
            v.parent := u.parent
        end if
The BST-Delete procedure deals with the 3 special cases mentioned above. Lines 2-5 deal with cases 1 and 2: lines 2-3 cover a leaf node or a node with only a right child, and lines 4-5 cover a node with only a left child. Lines 6-16 deal with case 3. The helper function Shift-Nodes is used within the deletion algorithm for the purpose of replacing the node u with v in the binary search tree BST. This procedure handles the deletion (and substitution) of u from BST.
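The deletion procedure and its helper can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: `Node`, `Tree`, `bst_insert` and `inorder` are hypothetical scaffolding, and the successor is obtained as the minimum of the right subtree, which is equivalent here because the node being deleted has a right child in that branch.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


class Tree:
    def __init__(self):
        self.root = None


def bst_insert(T, key):
    """Scaffolding: insert key as a leaf and return the new node."""
    z, y, x = Node(key), None, T.root
    while x is not None:
        y = x
        x = x.left if key < x.key else x.right
    z.parent = y
    if y is None:
        T.root = z
    elif key < y.key:
        y.left = z
    else:
        y.right = z
    return z


def shift_nodes(T, u, v):
    """Replace the subtree rooted at u with the subtree rooted at v."""
    if u.parent is None:
        T.root = v
    elif u is u.parent.left:
        u.parent.left = v
    else:
        u.parent.right = v
    if v is not None:
        v.parent = u.parent


def bst_delete(T, d):
    if d.left is None:
        shift_nodes(T, d, d.right)       # leaf, or only a right child
    elif d.right is None:
        shift_nodes(T, d, d.left)        # only a left child
    else:
        e = d.right                      # successor: minimum of right subtree
        while e.left is not None:
            e = e.left
        if e.parent is not d:
            shift_nodes(T, e, e.right)   # detach e from deeper in the subtree
            e.right = d.right
            e.right.parent = e
        shift_nodes(T, d, e)
        e.left = d.left
        e.left.parent = e


def inorder(x):
    return [] if x is None else inorder(x.left) + [x.key] + inorder(x.right)


T = Tree()
nodes = {k: bst_insert(T, k) for k in (8, 3, 10, 1, 6, 14, 4, 7, 13)}
bst_delete(T, nodes[3])    # two children; successor 4 is not its right child
after_first = inorder(T.root)
bst_delete(T, nodes[8])    # root; successor 10 is its right child
after_second = inorder(T.root)
```

The in-order key sequence stays sorted after each deletion, which is a quick way to confirm that the binary search property has been preserved.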
Traversal
See main article: Tree traversal.
See also: Threaded binary tree.
A BST can be traversed through three basic algorithms: inorder, preorder, and postorder tree walks.
- Inorder tree walk: Nodes from the left subtree get visited first, followed by the root node and right subtree. Such a traversal visits all the nodes in the order of non-decreasing key sequence.
- Preorder tree walk: The root node gets visited first, followed by left and right subtrees.
- Postorder tree walk: Nodes from the left subtree get visited first, followed by the right subtree, and finally, the root.
Following is a recursive implementation of the tree walks.
    Inorder-Tree-Walk(x)
        if x ≠ NIL then
            Inorder-Tree-Walk(x.left)
            visit node
            Inorder-Tree-Walk(x.right)
        end if

    Preorder-Tree-Walk(x)
        if x ≠ NIL then
            visit node
            Preorder-Tree-Walk(x.left)
            Preorder-Tree-Walk(x.right)
        end if

    Postorder-Tree-Walk(x)
        if x ≠ NIL then
            Postorder-Tree-Walk(x.left)
            Postorder-Tree-Walk(x.right)
            visit node
        end if
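The three walks can be expressed compactly as Python generators. In this sketch "visiting" a node is assumed to mean yielding its key, and the `Node` class is introduced only for the example.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def inorder(x):
    if x is not None:
        yield from inorder(x.left)
        yield x.key
        yield from inorder(x.right)


def preorder(x):
    if x is not None:
        yield x.key
        yield from preorder(x.left)
        yield from preorder(x.right)


def postorder(x):
    if x is not None:
        yield from postorder(x.left)
        yield from postorder(x.right)
        yield x.key


root = Node(8, Node(3, Node(1), Node(6)), Node(10))
print(list(inorder(root)))    # [1, 3, 6, 8, 10]
print(list(preorder(root)))   # [8, 3, 1, 6, 10]
print(list(postorder(root)))  # [1, 6, 3, 10, 8]
```

Only the inorder walk produces the keys in sorted order; the other two differ in where the root appears relative to its subtrees.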
Balanced binary search trees
See main article: Self-balancing binary search tree.
Without rebalancing, insertions or deletions in a binary search tree may lead to degeneration, resulting in a height $n$ of the tree (where $n$ is the number of items in the tree), so that the lookup performance deteriorates to that of a linear search.[14] Keeping the search tree balanced and its height bounded by $O(\log n)$ is key to the usefulness of the binary search tree. This can be achieved by "self-balancing" mechanisms applied during update operations, designed to keep the tree height logarithmic in the number of nodes.[15]
Height-balanced trees
A tree is height-balanced if the heights of the left sub-tree and right sub-tree are guaranteed to be related by a constant factor. This property was introduced by the AVL tree and continued by the red–black tree. The heights of all the nodes on the path from the root to the modified leaf node have to be observed and possibly corrected on every insert and delete operation to the tree.
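As one concrete instance of a height-balance condition, the AVL tree requires that the heights of the two subtrees of every node differ by at most 1. The sketch below checks that specific condition; it is an illustration only (it does no rebalancing), and the names `Node` and `is_height_balanced` are hypothetical.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def _height_if_balanced(node):
    """Return the height of the subtree, or -1 if some node violates the condition."""
    if node is None:
        return 0
    lh = _height_if_balanced(node.left)
    if lh < 0:
        return -1
    rh = _height_if_balanced(node.right)
    if rh < 0 or abs(lh - rh) > 1:
        return -1
    return 1 + max(lh, rh)


def is_height_balanced(root):
    """True if every node's subtree heights differ by at most 1 (AVL condition)."""
    return _height_if_balanced(root) >= 0


balanced = Node(2, Node(1), Node(3))
chain = Node(1, None, Node(2, None, Node(3)))   # degenerate right-leaning chain
```

A self-balancing tree performs a check of this kind only along the path from the modified leaf to the root and repairs violations with rotations, rather than rescanning the whole tree.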
Weight-balanced trees
See main article: Weight-balanced tree.
In a weight-balanced tree, the criterion of a balanced tree is the number of leaves of the subtrees. The weights of the left and right subtrees differ at most by $1$.[16] However, the difference is bound by a ratio $\alpha$ of the weights, since a strong balance condition of $1$ cannot be maintained with $O(\log n)$ rebalancing work during insert and delete operations. The $\alpha$-weight-balanced trees give an entire family of balance conditions, where each left and right subtree has at least a fraction $\alpha$ of the total weight of the subtree.
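The weight-balance condition can be illustrated with a small checker. The sketch below makes two assumptions that are not spelled out in the text: the weight of a subtree is taken to be its number of NIL leaves (one more than its number of keys), and the parameter `alpha` is the required minimum fraction. All names are hypothetical.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def _weight_if_balanced(node, alpha):
    """Return the weight of the subtree, or None if the condition fails somewhere."""
    if node is None:
        return 1                      # assumed: an empty subtree counts as one leaf
    lw = _weight_if_balanced(node.left, alpha)
    rw = _weight_if_balanced(node.right, alpha)
    if lw is None or rw is None:
        return None
    total = lw + rw
    # Each side must carry at least a fraction alpha of the total weight.
    if min(lw, rw) < alpha * total:
        return None
    return total


def is_weight_balanced(root, alpha):
    return _weight_if_balanced(root, alpha) is not None


perfect = Node(2, Node(1), Node(3))
chain = Node(1, None, Node(2, None, Node(3, None, Node(4))))
```

With `alpha = 0.25`, the three-node tree passes, while the four-node chain fails at its root, where the left side has weight 1 out of a total of 5.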
Types
There are several self-balancing binary search trees, including the T-tree,[17] treap, red–black tree,[18] B-tree, 2–3 tree,[19] and splay tree.[20]
Examples of applications
Sort
See main article: Tree sort.
Binary search trees are used in sorting algorithms such as tree sort, where all the elements are inserted at once and the tree is then traversed in an in-order fashion.[21] BSTs are also used in quicksort.[22]
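Tree sort can be sketched in a few lines of Python. The version below is a minimal illustration using a plain, unbalanced BST, so its worst case is quadratic on already sorted input; the function and class names are assumptions, and both phases are written iteratively to avoid deep recursion.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def tree_sort(items):
    """Return a sorted list by inserting all items into a BST and walking it in order."""
    root = None
    for item in items:
        new = Node(item)
        if root is None:
            root = new
            continue
        cur = root
        while True:
            if item < cur.key:
                if cur.left is None:
                    cur.left = new
                    break
                cur = cur.left
            else:                      # equal keys go to the right
                if cur.right is None:
                    cur.right = new
                    break
                cur = cur.right

    # Iterative in-order traversal using an explicit stack.
    out, stack, cur = [], [], root
    while stack or cur is not None:
        while cur is not None:
            stack.append(cur)
            cur = cur.left
        cur = stack.pop()
        out.append(cur.key)
        cur = cur.right
    return out


print(tree_sort([5, 2, 8, 2, 1]))  # [1, 2, 2, 5, 8]
```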
Priority queue operations
See main article: Priority queue.
Binary search trees are used in implementing priority queues, using the nodes' keys as priorities. Adding new elements to the queue follows the regular BST insertion operation, but the removal operation depends on the type of priority queue:[23]
- If it is an ascending order priority queue, removal of an element with the lowest priority is done through leftward traversal of the BST.
- If it is a descending order priority queue, removal of an element with the highest priority is done through rightward traversal of the BST.
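The ascending case can be sketched as follows: the element with the lowest priority is the leftmost node, which has no left child, so it can be removed by splicing in its right child. The class and method names below are hypothetical, and the tree is a plain, unbalanced BST, so no running-time guarantee is implied.

```python
class _Node:
    def __init__(self, priority, item):
        self.priority = priority
        self.item = item
        self.left = None
        self.right = None


class BSTPriorityQueue:
    """Ascending priority queue backed by an unbalanced BST (illustrative sketch)."""

    def __init__(self):
        self.root = None

    def push(self, priority, item):
        new = _Node(priority, item)
        if self.root is None:
            self.root = new
            return
        cur = self.root
        while True:
            if priority < cur.priority:
                if cur.left is None:
                    cur.left = new
                    return
                cur = cur.left
            else:
                if cur.right is None:
                    cur.right = new
                    return
                cur = cur.right

    def pop_min(self):
        """Remove and return (priority, item) of the leftmost node."""
        if self.root is None:
            raise IndexError("pop from an empty priority queue")
        parent, node = None, self.root
        while node.left is not None:       # leftward traversal
            parent, node = node, node.left
        # The leftmost node has no left child: replace it by its right child.
        if parent is None:
            self.root = node.right
        else:
            parent.left = node.right
        return node.priority, node.item


pq = BSTPriorityQueue()
for priority, item in [(3, "c"), (1, "a"), (2, "b")]:
    pq.push(priority, item)
first = pq.pop_min()    # (1, "a")
second = pq.pop_min()   # (2, "b")
```

A descending queue would mirror this with a rightward traversal and a splice of the left child.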
Further reading
- Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). Introduction to Algorithms (2nd ed.). MIT Press. pp. 253–272, 356–363. ISBN 0-262-03293-7. Chapter 12: Binary search trees; Section 15.5: Optimal binary search trees.
- Jarc, Duane J. (3 December 2005). "Binary Tree Traversals". Interactive Data Structure Visualizations. University of Maryland. Archived from the original on 27 February 2014: https://web.archive.org/web/20140227082917/http://nova.umuc.edu/~jarc/idsv/lesson1.html. Retrieved 30 April 2006.
- Knuth, Donald (1997). The Art of Computer Programming. Vol. 3: "Sorting and Searching" (3rd ed.). Addison-Wesley. pp. 426–458. ISBN 0-201-89685-0. Section 6.2.2: Binary Tree Searching.
- Long, Sean. "Binary Search Tree". Data Structures and Algorithms Visualization-A PowerPoint Slides Based Approach. SUNY Oneonta. (PPT)
- Parlante, Nick (2001). "Binary Trees". CS Education Library. Stanford University. Archived on 2022-01-30: https://ghostarchive.org/archive/20220130/http://cslibrary.stanford.edu/110/BinaryTrees.html
Notes and References
- Culberson, J.; Munro, J. I. (1 January 1989). "Explaining the Behaviour of Binary Search Trees Under Prolonged Updates: A Model and Simulations". The Computer Journal. 32 (1): 68–69. doi:10.1093/comjnl/32.1.68.
- Culberson, J.; Munro, J. I. (28 July 1986). "Analysis of the standard deletion algorithms in exact fit domain binary search trees". Algorithmica. Springer Publishing, University of Waterloo. 5 (1–4): 297. doi:10.1007/BF01840390. S2CID 971813.
- Windley, P. F. (1 January 1960). "Trees, Forests and Rearranging". The Computer Journal. 3 (2): 84. doi:10.1093/comjnl/3.2.84.
- Knuth, Donald (1998). "Section 6.2.3: Balanced Trees". The Art of Computer Programming. Vol. 3 (2nd ed.). Addison-Wesley. pp. 458–481. ISBN 978-0201896855. Archived on 2022-10-09: https://ghostarchive.org/archive/20221009/https://ia801604.us.archive.org/17/items/B-001-001-250/B-001-001-250.pdf
- Paul E. Black, "red-black tree", in Dictionary of Algorithms and Data Structures [online], Paul E. Black, ed. 12 November 2019. (accessed May 19 2022) from: https://www.nist.gov/dads/HTML/redblack.html
- Myers, Andrew. "CS 312 Lecture: AVL Trees". Cornell University, Department of Computer Science. Archived on 27 April 2021: https://web.archive.org/web/20210427195749/http://www.cs.cornell.edu/courses/cs312/2008sp/lectures/lec_avl.html. Retrieved 19 May 2022.
- Adelson-Velsky, Georgy; Landis, Evgenii (1962). "An algorithm for the organization of information". Proceedings of the USSR Academy of Sciences (in Russian). 146: 263–266. English translation by Myron J. Ricci in Soviet Mathematics - Doklady, 3:1259–1263, 1962.
- Pitassi, Toniann (2015). "CSC263: Balanced BSTs, AVL tree". University of Toronto, Department of Computer Science. p. 6. Archived on 14 February 2019: https://web.archive.org/web/20190214212633/http://www.cs.toronto.edu/~toni/Courses/263-2015/lectures/lec04-balanced-augmentation.pdf. Retrieved 19 May 2022.
- Thareja, Reema (13 October 2018). "Hashing and Collision". Data Structures Using C (2nd ed.). Oxford University Press. ISBN 9780198099307.
- Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). Introduction to Algorithms (2nd ed.). MIT Press. ISBN 0-262-03293-7.
- Frost, R. A.; Peterson, M. M. (1 February 1982). "A Short Note on Binary Search Trees". The Computer Journal. Oxford University Press. 25 (1): 158. doi:10.1093/comjnl/25.1.158.
- Huang, Junzhou. "Design and Analysis of Algorithms". University of Texas at Arlington. p. 12. Archived on 13 April 2021: https://web.archive.org/web/20210413045057/http://ranger.uta.edu/~huang/teaching/CSE5311/CSE5311_Lecture10.pdf. Retrieved 17 May 2021.
- Ray, Ray. "Binary Search Tree". Loyola Marymount University, Department of Computer Science. Retrieved 17 May 2022.
- Thornton, Alex (2021). "ICS 46: Binary Search Trees". University of California, Irvine. Archived on 4 July 2021: https://web.archive.org/web/20210704141729/https://www.ics.uci.edu/~thornton/ics46/Notes/BinarySearchTrees/. Retrieved 21 October 2021.
- Brass, Peter (January 2011). Advanced Data Structure. Cambridge University Press. doi:10.1017/CBO9780511800191. ISBN 9780511800191.
- Blum, Norbert; Mehlhorn, Kurt (1978). "On the Average Number of Rebalancing Operations in Weight-Balanced Trees". Theoretical Computer Science. 11 (3): 303–320. doi:10.1016/0304-3975(80)90018-3. Archived on 2022-10-09: https://ghostarchive.org/archive/20221009/http://scidok.sulb.uni-saarland.de/volltexte/2011/4019/pdf/fb14_1978_06.pdf
- Lehman, Tobin J.; Carey, Michael J. (25–28 August 1986). A Study of Index Structures for Main Memory Database Management Systems. Twelfth International Conference on Very Large Databases (VLDB 1986). Kyoto. ISBN 0-934613-18-4.
- Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). "Red-Black Trees". Introduction to Algorithms (second ed.). MIT Press. pp. 273–301. ISBN 978-0-262-03293-3.
- Knuth, Donald M (1998). "6.2.4". The Art of Computer Programming. Vol. 3 (2nd ed.). Addison Wesley. ISBN 9780201896855. "The 2–3 trees defined at the close of Section 6.2.3 are equivalent to B-Trees of order 3."
- Sleator, Daniel D.; Tarjan, Robert E. (1985). "Self-Adjusting Binary Search Trees". 32 (3): 652–686. doi:10.1145/3828.3835. S2CID 1165848.
- Narayanan, Arvind (2019). "COS226: Binary search trees". Princeton University School of Engineering and Applied Science. cs.princeton.edu. Archived on 22 March 2021: https://web.archive.org/web/20210322040843/https://www.cs.princeton.edu/courses/archive/spring19/cos226/lectures/study/32BinarySearchTrees.html. Retrieved 21 October 2021.
- Xiong, Li. "A Connection Between Binary Search Trees and Quicksort". Oxford College of Emory University, The Department of Mathematics and Computer Science. Archived on 26 February 2021: https://web.archive.org/web/20210226103159/http://mathcenter.oxford.emory.edu/site/cs171/bstQuicksortConnection/. Retrieved 4 June 2022.
- Myers, Andrew. "CS 2112 Lecture and Recitation Notes: Priority Queues and Heaps". Cornell University, Department of Computer Science. Archived on 21 October 2021: https://web.archive.org/web/20211021202727/https://www.cs.cornell.edu/courses/cs4120/2016sp/lectures/lec_heaps/. Retrieved 21 October 2021.