In computer science, join-based tree algorithms are a class of algorithms for self-balancing binary search trees. The framework aims at designing highly parallel algorithms for various balanced binary search trees, and it is built on a single operation, join. Under this framework, the join operation captures all the balancing criteria of a given balancing scheme, and all other functions are implemented generically on top of join, independent of the scheme. Join-based algorithms can be applied to at least four balancing schemes: AVL trees, red–black trees, weight-balanced trees and treaps.
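The join$(L,k,R)$ operation takes as input two balanced binary trees $L$ and $R$ of the same balancing scheme and a key $k$, and outputs a new balanced binary tree $t$ whose in-order traversal is the in-order traversal of $L$, then $k$, then the in-order traversal of $R$. In particular, if the trees are search trees, all keys in $L$ must be smaller than $k$ and all keys in $R$ must be greater than $k$.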
The join operation was first defined by Tarjan on red–black trees, where it runs in worst-case logarithmic time. Later, Sleator and Tarjan described a join algorithm for splay trees that runs in amortized logarithmic time. Adams [1] then extended join to weight-balanced trees and used it for fast set–set functions including union, intersection and set difference. In 1998, Blelloch and Reid-Miller extended join to treaps and proved the bound of the set functions to be $O\left(m\log\left(1+\tfrac{n}{m}\right)\right)$ for two trees of sizes $m$ and $n\ (\ge m)$.
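The function join$(t_1,k,t_2)$ has to rebalance the resulting tree, so its implementation depends on the balancing scheme of the input trees.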
The following are the join algorithms for different balancing schemes.
The join algorithm for AVL trees:

    function joinRightAVL(TL, k, TR)
        (l, k', c) := expose(TL)
        if h(c) ≤ h(TR) + 1
            T' := Node(c, k, TR)
            if h(T') ≤ h(l) + 1
                return Node(l, k', T')
            else
                return rotateLeft(Node(l, k', rotateRight(T')))
        else
            T' := joinRightAVL(c, k, TR)
            T := Node(l, k', T')
            if h(T') ≤ h(l) + 1
                return T
            else
                return rotateLeft(T)

    function joinLeftAVL(TL, k, TR)
        /* symmetric to joinRightAVL */

    function join(TL, k, TR)
        if h(TL) > h(TR) + 1
            return joinRightAVL(TL, k, TR)
        else if h(TR) > h(TL) + 1
            return joinLeftAVL(TL, k, TR)
        else
            return Node(TL, k, TR)

Where:
$h(v)$ is the height of node $v$.
$\mathrm{expose}(v)$ extracts the left child $l$, key $k$ and right child $r$ of node $v$ into a tuple $(l,k,r)$.
$\mathrm{Node}(l,k,r)$ creates a node with left child $l$, key $k$ and right child $r$.
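As an illustration, the AVL pseudocode above can be transcribed into Python roughly as follows. This is a minimal sketch, not a reference implementation: the `Node` dataclass, the cached `height` field, the convention that an empty tree has height 0, and the helpers `from_sorted`, `inorder` and `is_avl` are all assumptions introduced here. Trees are treated as immutable, so rotations build new nodes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    left: Optional["Node"]
    key: int
    right: Optional["Node"]
    height: int


def h(t):
    """Height of a possibly empty tree (assumption: empty tree has height 0)."""
    return t.height if t is not None else 0


def node(l, k, r):
    """Node(l, k, r) from the pseudocode, with the height cached."""
    return Node(l, k, r, 1 + max(h(l), h(r)))


def rotate_left(t):
    r = t.right
    return node(node(t.left, t.key, r.left), r.key, r.right)


def rotate_right(t):
    l = t.left
    return node(l.left, l.key, node(l.right, t.key, t.right))


def join_right(tl, k, tr):
    l, kp, c = tl.left, tl.key, tl.right            # expose(TL)
    if h(c) <= h(tr) + 1:
        t1 = node(c, k, tr)
        if h(t1) <= h(l) + 1:
            return node(l, kp, t1)
        return rotate_left(node(l, kp, rotate_right(t1)))
    t1 = join_right(c, k, tr)
    t = node(l, kp, t1)
    return t if h(t1) <= h(l) + 1 else rotate_left(t)


def join_left(tl, k, tr):
    c, kp, r = tr.left, tr.key, tr.right            # mirror image of join_right
    if h(c) <= h(tl) + 1:
        t1 = node(tl, k, c)
        if h(t1) <= h(r) + 1:
            return node(t1, kp, r)
        return rotate_right(node(rotate_left(t1), kp, r))
    t1 = join_left(tl, k, c)
    t = node(t1, kp, r)
    return t if h(t1) <= h(r) + 1 else rotate_right(t)


def join(tl, k, tr):
    if h(tl) > h(tr) + 1:
        return join_right(tl, k, tr)
    if h(tr) > h(tl) + 1:
        return join_left(tl, k, tr)
    return node(tl, k, tr)


# Hypothetical helpers used only to exercise the sketch.
def from_sorted(keys):
    if not keys:
        return None
    mid = len(keys) // 2
    return node(from_sorted(keys[:mid]), keys[mid], from_sorted(keys[mid + 1:]))


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


def is_avl(t):
    if t is None:
        return True
    ok = abs(h(t.left) - h(t.right)) <= 1
    return ok and is_avl(t.left) and is_avl(t.right)


joined = join(from_sorted(list(range(100))), 100, from_sorted([101, 102]))
```

Here `joined` holds the keys 0 to 102 in order, and `is_avl(joined)` can be used to check that the height invariant survives joining a tall tree with a much shorter one.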
The join algorithm for red–black trees:

    function joinRightRB(TL, k, TR)
        if TL.color = black and ĥ(TL) = ĥ(TR)
            return Node(TL, ⟨k, red⟩, TR)
        else
            (L', ⟨k', c'⟩, R') := expose(TL)
            T' := Node(L', ⟨k', c'⟩, joinRightRB(R', k, TR))
            if c' = black and T'.right.color = T'.right.right.color = red
                T'.right.right.color := black
                return rotateLeft(T')
            else
                return T'

    function joinLeftRB(TL, k, TR)
        /* symmetric to joinRightRB */

    function join(TL, k, TR)
        if ĥ(TL) > ĥ(TR)
            T' := joinRightRB(TL, k, TR)
            if (T'.color = red) and (T'.right.color = red)
                T'.color := black
            return T'
        else if ĥ(TR) > ĥ(TL)
            /* symmetric */
        else if TL.color = black and TR.color = black
            return Node(TL, ⟨k, red⟩, TR)
        else
            return Node(TL, ⟨k, black⟩, TR)

Where:
ĥ$(v)$ is the black height of node $v$.
$\mathrm{expose}(v)$ extracts the left child $l$, key $k$, color $c$ and right child $r$ of node $v$ into a tuple $(l,\langle k,c\rangle,r)$.
$\mathrm{Node}(l,\langle k,c\rangle,r)$ creates a node with left child $l$, key $k$, color $c$ and right child $r$.
The join algorithm for weight-balanced trees:

    function joinRightWB(TL, k, TR)
        (l, k', c) := expose(TL)
        if w(TL) =α w(TR)
            return Node(TL, k, TR)
        else
            T' := joinRightWB(c, k, TR)
            (l1, k1, r1) := expose(T')
            if w(l) =α w(T')
                return Node(l, k', T')
            else if w(l) =α w(l1) and w(l)+w(l1) =α w(r1)
                return rotateLeft(Node(l, k', T'))
            else
                return rotateLeft(Node(l, k', rotateRight(T')))

    function joinLeftWB(TL, k, TR)
        /* symmetric to joinRightWB */

    function join(TL, k, TR)
        if w(TL) >α w(TR)
            return joinRightWB(TL, k, TR)
        else if w(TR) >α w(TL)
            return joinLeftWB(TL, k, TR)
        else
            return Node(TL, k, TR)

Where:
$w(v)$ is the weight of node $v$.
$w_1 =_\alpha w_2$ means that weights $w_1$ and $w_2$ are α-weight-balanced.
$w_1 >_\alpha w_2$ means that weight $w_1$ is heavier than weight $w_2$ with respect to the α-weight-balance.
$\mathrm{expose}(v)$ extracts the left child $l$, key $k$ and right child $r$ of node $v$ into a tuple $(l,k,r)$.
$\mathrm{Node}(l,k,r)$ creates a node with left child $l$, key $k$ and right child $r$.
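The text does not spell out how the two weight comparisons are evaluated. One common convention, assumed here and not taken from this article, is to let the weight of a tree be its size plus one and to call two weights α-balanced when each is at least an α fraction of their sum. Under that assumption the comparisons might be sketched as below; `ALPHA`, `weight`, `balanced` and `heavier` are hypothetical names, and the value 0.29 is only illustrative.

```python
from fractions import Fraction

# Illustrative balance parameter (an assumption, not prescribed by the text).
ALPHA = Fraction(29, 100)


def weight(size):
    """Assumed convention: the weight of a tree is its number of keys plus one."""
    return size + 1


def balanced(w1, w2, alpha=ALPHA):
    """w1 =α w2: each weight is at least an alpha fraction of the total."""
    total = w1 + w2
    return w1 >= alpha * total and w2 >= alpha * total


def heavier(w1, w2, alpha=ALPHA):
    """w1 >α w2: w1 is the larger weight and the pair is not alpha-balanced."""
    return w1 > w2 and not balanced(w1, w2, alpha)


# Two empty trees are balanced; a 9-key tree is too heavy for an empty one.
print(balanced(weight(0), weight(0)))   # True
print(heavier(weight(9), weight(0)))    # True
```

With these definitions exactly one of "left heavier", "right heavier" and "balanced" holds for any pair of weights, which is what the three-way branch in join relies on.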
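In the following, $\mathrm{expose}(v)$ extracts the left child $l$, key $k$ and right child $r$ of node $v$ into a tuple $(l,k,r)$. $\mathrm{Node}(l,k,r)$ creates a node with left child $l$, key $k$ and right child $r$. The notation $s_1 \| s_2$ means that the two statements $s_1$ and $s_2$ can run in parallel.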
To split a tree into two trees, one holding the keys smaller than key x and one holding the keys larger than x, we first trace a path from the root by searching for x as if inserting it into the tree. All values less than x are then found to the left of the path, and all values greater than x are found to the right. By applying join, all the subtrees on the left side are merged bottom-up, using the keys on the path as intermediate nodes, to form the left tree; the right tree is built symmetrically. For some applications, split also returns a boolean value denoting whether x appears in the tree. The cost of split is $O(\log n)$, the order of the height of the tree.
The split algorithm is as follows:

    function split(T, k)
        if (T = nil)
            return (nil, false, nil)
        else
            (L, m, R) := expose(T)
            if k < m
                (L', b, R') := split(L, k)
                return (L', b, join(R', m, R))
            else if k > m
                (L', b, R') := split(R, k)
                return (join(L, m, L'), b, R')
            else
                return (L, true, R)
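The split pseudocode can be tried out with a small Python sketch. To keep it short, trees are plain `(left, key, right)` tuples with `None` as the empty tree, and `join` is a stand-in that only builds a node without rebalancing. Both choices are assumptions made for illustration; with a real join from one of the balancing schemes the body of `split` would stay the same.

```python
def join(left, key, right):
    # Stand-in only: a real join would also restore the balance invariant.
    return (left, key, right)


def split(tree, key):
    """Return (keys < key, whether key was present, keys > key)."""
    if tree is None:
        return None, False, None
    left, mid, right = tree                 # expose(T)
    if key < mid:
        lo, found, hi = split(left, key)
        return lo, found, join(hi, mid, right)
    if key > mid:
        lo, found, hi = split(right, key)
        return join(left, mid, lo), found, hi
    return left, True, right


# Hypothetical helpers used only to exercise the sketch.
def from_sorted(keys):
    if not keys:
        return None
    mid = len(keys) // 2
    return join(from_sorted(keys[:mid]), keys[mid], from_sorted(keys[mid + 1:]))


def inorder(tree):
    if tree is None:
        return []
    left, key, right = tree
    return inorder(left) + [key] + inorder(right)


tree = from_sorted([1, 3, 5, 7, 9])
lo, found, hi = split(tree, 5)
print(inorder(lo), found, inorder(hi))      # [1, 3] True [7, 9]
lo, found, hi = split(tree, 4)
print(inorder(lo), found, inorder(hi))      # [1, 3] False [5, 7, 9]
```

The second call shows the case where the key is absent: the flag is false and every key still ends up on exactly one side.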
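The join2 function is defined similarly to join but without the middle key. It first splits out the last key $k$ of the left tree, and then joins the rest of the left tree with the right tree using $k$ as the middle key.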
    function splitLast(T)
        (L, k, R) := expose(T)
        if R = nil
            return (L, k)
        else
            (T', k') := splitLast(R)
            return (join(L, k, T'), k')

    function join2(L, R)
        if L = nil
            return R
        else
            (L', k) := splitLast(L)
            return join(L', k, R)
The cost is $O(\log n)$ for a tree of size $n$.
When built on join, the insertion and deletion algorithms can be independent of the balancing scheme. To insert a key, the algorithm compares it with the key in the root, inserts it into the left or right subtree if it is smaller or greater than the root key, respectively, and joins the two subtrees back with the root. To delete a key, the algorithm compares it with the key in the root. If they are equal, it returns join2 of the two subtrees. Otherwise, it deletes the key from the corresponding subtree and joins the two subtrees back with the root. The algorithms are as follows:
    function insert(T, k)
        if T = nil
            return Node(nil, k, nil)
        else
            (L, k', R) := expose(T)
            if k < k'
                return join(insert(L, k), k', R)
            else if k > k'
                return join(L, k', insert(R, k))
            else
                return T

    function delete(T, k)
        if T = nil
            return nil
        else
            (L, k', R) := expose(T)
            if k < k'
                return join(delete(L, k), k', R)
            else if k > k'
                return join(L, k', delete(R, k))
            else
                return join2(L, R)
Both insertion and deletion require $O(\log n)$ time if $|T|=n$.
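A compact Python sketch of join2, insertion and deletion is given below. As an assumption made for brevity, trees are `(left, key, right)` tuples and `join` is a non-rebalancing stand-in, so the sketch shows only the scheme-independent logic and not the logarithmic cost, which depends on a real join.

```python
def join(left, key, right):
    # Stand-in only: a real join would also restore the balance invariant.
    return (left, key, right)


def split_last(tree):
    """Remove the largest key; return (remaining tree, that key)."""
    left, key, right = tree
    if right is None:
        return left, key
    rest, last = split_last(right)
    return join(left, key, rest), last


def join2(left, right):
    """Join two trees with no middle key (all keys in left < all in right)."""
    if left is None:
        return right
    rest, last = split_last(left)
    return join(rest, last, right)


def insert(tree, key):
    if tree is None:
        return (None, key, None)
    left, mid, right = tree                 # expose(T)
    if key < mid:
        return join(insert(left, key), mid, right)
    if key > mid:
        return join(left, mid, insert(right, key))
    return tree                             # key already present


def delete(tree, key):
    if tree is None:
        return None
    left, mid, right = tree                 # expose(T)
    if key < mid:
        return join(delete(left, key), mid, right)
    if key > mid:
        return join(left, mid, delete(right, key))
    return join2(left, right)


def inorder(tree):
    if tree is None:
        return []
    left, key, right = tree
    return inorder(left) + [key] + inorder(right)


tree = None
for key in [5, 2, 8, 1, 9, 3]:
    tree = insert(tree, key)
print(inorder(tree))                # [1, 2, 3, 5, 8, 9]
print(inorder(delete(tree, 5)))     # [1, 2, 3, 8, 9]
```

Deleting the root key 5 exercises join2: the largest key of the left subtree, 3, is pulled out by `split_last` and becomes the new middle key.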
Several set operations have been defined on weight-balanced trees: union, intersection and set difference. The union of two weight-balanced trees $t_1$ and $t_2$ representing sets $A$ and $B$ is a tree $t$ that represents $A \cup B$. The following recursive function computes this union:
    function union(t1, t2)
        if t1 = nil
            return t2
        else if t2 = nil
            return t1
        else
            (l1, k1, r1) := expose(t1)
            (t<, b, t>) := split(t2, k1)
            l' := union(l1, t<) || r' := union(r1, t>)
            return join(l', k1, r')
Similarly, the algorithms for intersection and set difference are as follows:

    function intersection(t1, t2)
        if t1 = nil or t2 = nil
            return nil
        else
            (l1, k1, r1) := expose(t1)
            (t<, b, t>) := split(t2, k1)
            l' := intersection(l1, t<) || r' := intersection(r1, t>)
            if b
                return join(l', k1, r')
            else
                return join2(l', r')

    function difference(t1, t2)
        if t1 = nil
            return nil
        else if t2 = nil
            return t1
        else
            (l1, k1, r1) := expose(t1)
            (t<, b, t>) := split(t2, k1)
            l' := difference(l1, t<) || r' := difference(r1, t>)
            if b
                return join2(l', r')
            else
                return join(l', k1, r')
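The three set functions can be sketched in Python as follows. This is a sequential illustration under stated assumptions: trees are `(left, key, right)` tuples, `join` is a non-rebalancing stand-in, and the two recursive calls that the pseudocode marks with `||` simply run one after the other. The helper names are hypothetical.

```python
def join(left, key, right):
    # Stand-in only: a real join would also restore the balance invariant.
    return (left, key, right)


def split(tree, key):
    if tree is None:
        return None, False, None
    left, mid, right = tree
    if key < mid:
        lo, found, hi = split(left, key)
        return lo, found, join(hi, mid, right)
    if key > mid:
        lo, found, hi = split(right, key)
        return join(left, mid, lo), found, hi
    return left, True, right


def join2(left, right):
    if left is None:
        return right
    l, k, r = left
    if r is None:                           # k is the last key of left
        return join(l, k, right)
    return join(l, k, join2(r, right))      # equivalent result for the stand-in


def union(t1, t2):
    if t1 is None:
        return t2
    if t2 is None:
        return t1
    l1, k1, r1 = t1
    lo, _, hi = split(t2, k1)
    return join(union(l1, lo), k1, union(r1, hi))


def intersection(t1, t2):
    if t1 is None or t2 is None:
        return None
    l1, k1, r1 = t1
    lo, found, hi = split(t2, k1)
    l, r = intersection(l1, lo), intersection(r1, hi)
    return join(l, k1, r) if found else join2(l, r)


def difference(t1, t2):
    if t1 is None:
        return None
    if t2 is None:
        return t1
    l1, k1, r1 = t1
    lo, found, hi = split(t2, k1)
    l, r = difference(l1, lo), difference(r1, hi)
    return join2(l, r) if found else join(l, k1, r)


def from_sorted(keys):
    if not keys:
        return None
    mid = len(keys) // 2
    return join(from_sorted(keys[:mid]), keys[mid], from_sorted(keys[mid + 1:]))


def inorder(tree):
    if tree is None:
        return []
    left, key, right = tree
    return inorder(left) + [key] + inorder(right)


a, b = from_sorted([1, 3, 5, 7]), from_sorted([3, 4, 5, 6])
print(inorder(union(a, b)))         # [1, 3, 4, 5, 6, 7]
print(inorder(intersection(a, b)))  # [3, 5]
print(inorder(difference(a, b)))    # [1, 7]
```

Note that `join2` here is written as a direct recursion rather than through splitLast; with the stand-in join both forms yield the same key order, but a balanced implementation would follow the splitLast formulation so that join receives a middle key.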
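The complexity of each of union, intersection and difference is $O\left(m\log\left(\tfrac{n}{m}+1\right)\right)$ for two weight-balanced trees of sizes $m$ and $n\ (\ge m)$. Because the two recursive calls in each function are independent of each other, they can be executed in parallel, which gives a parallel depth of $O(\log m\log n)$. When $m=1$, the join-based implementation performs the same computation as a single-element insertion or deletion if the root of the larger tree is used to split the smaller tree.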
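As a quick sanity check of the work bound, the two extreme cases can be worked out by substituting into the formula stated above; nothing beyond that formula is assumed.

```latex
\begin{align*}
m = 1:&\quad O\!\left(1\cdot\log\left(\tfrac{n}{1}+1\right)\right) = O(\log n) \\
m = n:&\quad O\!\left(n\log\left(\tfrac{n}{n}+1\right)\right) = O(n\log 2) = O(n)
\end{align*}
```

The first case agrees with the cost of a single insertion or deletion, and the second with the linear cost of merging two sorted sequences of equal length.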
The algorithm for building a tree can make use of the union algorithm in a divide-and-conquer scheme:

    function build(A[], n)
        if n = 0
            return nil
        else if n = 1
            return Node(nil, A[0], nil)
        else
            l' := build(A, n/2) || r' := build(A+n/2, n-n/2)
            return union(l', r')
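This algorithm costs $O(n\log n)$ work and has $O(\log^3 n)$ depth. A more efficient algorithm first sorts the input with a parallel sorting algorithm and then builds the tree from the sorted array: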
    function buildSorted(A[], n)
        if n = 0
            return nil
        else if n = 1
            return Node(nil, A[0], nil)
        else
            l' := buildSorted(A, n/2) || r' := buildSorted(A+n/2+1, n-n/2-1)
            return join(l', A[n/2], r')

    function build(A[], n)
        A' := sort(A, n)
        return buildSorted(A', n)
This algorithm costs $O(n\log n)$ work and has $O(\log n)$ depth, assuming the sorting algorithm has $O(n\log n)$ work and $O(\log n)$ depth.
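The sort-then-join construction might look like this in Python. The sketch is sequential and rests on several assumptions: trees are `(left, key, right)` tuples, `join` is a non-rebalancing stand-in, Python's built-in `sorted` takes the place of a parallel sort, and duplicates are removed with `set` before building, a step the pseudocode does not include.

```python
def join(left, key, right):
    # Stand-in only: a real join would also restore the balance invariant.
    return (left, key, right)


def build_sorted(keys, lo=0, hi=None):
    """Build a tree from the sorted slice keys[lo:hi], using the middle key as root."""
    if hi is None:
        hi = len(keys)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    # The two recursive calls are independent and could run in parallel.
    left = build_sorted(keys, lo, mid)
    right = build_sorted(keys, mid + 1, hi)
    return join(left, keys[mid], right)


def build(items):
    # Assumption: duplicates are dropped so the result represents a set.
    return build_sorted(sorted(set(items)))


def inorder(tree):
    if tree is None:
        return []
    left, key, right = tree
    return inorder(left) + [key] + inorder(right)


def height(tree):
    if tree is None:
        return 0
    left, _, right = tree
    return 1 + max(height(left), height(right))


tree = build([5, 1, 4, 1, 3])
print(inorder(tree), height(tree))   # [1, 3, 4, 5] 3
```

Because the root is always the middle element, the two halves differ in size by at most one, so the joins at each level combine trees of nearly equal height.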
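The filter function selects all entries in a tree that satisfy a predicate $p$ and returns a tree containing the selected entries. It recursively filters the two subtrees, then joins them with the root if the root satisfies $p$, and otherwise applies join2 to the two subtrees.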
    function filter(T, p)
        if T = nil
            return nil
        else
            (l, k, r) := expose(T)
            l' := filter(l, p) || r' := filter(r, p)
            if p(k)
                return join(l', k, r')
            else
                return join2(l', r')
This algorithm costs $O(n)$ work and $O(\log^2 n)$ depth on a tree of size $n$, assuming $p$ has constant cost.
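A sequential Python sketch of filter is shown below, again under the assumptions that trees are `(left, key, right)` tuples and that `join` is a non-rebalancing stand-in. The name `filter_tree` is chosen here only to avoid shadowing Python's built-in `filter`.

```python
def join(left, key, right):
    # Stand-in only: a real join would also restore the balance invariant.
    return (left, key, right)


def split_last(tree):
    left, key, right = tree
    if right is None:
        return left, key
    rest, last = split_last(right)
    return join(left, key, rest), last


def join2(left, right):
    if left is None:
        return right
    rest, last = split_last(left)
    return join(rest, last, right)


def filter_tree(tree, pred):
    """Keep only the keys for which pred(key) is true."""
    if tree is None:
        return None
    left, key, right = tree                 # expose(T)
    # The two recursive calls are independent and could run in parallel.
    kept_left = filter_tree(left, pred)
    kept_right = filter_tree(right, pred)
    if pred(key):
        return join(kept_left, key, kept_right)
    return join2(kept_left, kept_right)


def from_sorted(keys):
    if not keys:
        return None
    mid = len(keys) // 2
    return join(from_sorted(keys[:mid]), keys[mid], from_sorted(keys[mid + 1:]))


def inorder(tree):
    if tree is None:
        return []
    left, key, right = tree
    return inorder(left) + [key] + inorder(right)


tree = from_sorted(list(range(10)))
evens = filter_tree(tree, lambda k: k % 2 == 0)
print(inorder(evens))                       # [0, 2, 4, 6, 8]
```

Every node is visited once, which matches the linear work bound when the predicate is constant-time; the depth bound additionally depends on a balanced join.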
The join-based algorithms are used to implement the interfaces for sets, maps and augmented maps in libraries such as Hackage, SML/NJ and PAM.