Top tree explained

A top tree is a data structure based on a binary tree for unrooted dynamic trees that is used mainly for various path-related operations. It allows simple divide-and-conquer algorithms. It has since been augmented to dynamically maintain various properties of a tree such as diameter, center and median.

A top tree $\Re$ is defined for an underlying tree $\mathcal{T}$ and a set $\partial{T}$ of at most two vertices called External Boundary Vertices.

Glossary

Boundary Node

See Boundary Vertex

Boundary Vertex

A vertex in a connected subtree is a Boundary Vertex if it is connected to a vertex outside the subtree by an edge.

External Boundary Vertices

Up to a pair of vertices in the top tree $\Re$ can be called External Boundary Vertices; they can be thought of as Boundary Vertices of the cluster which represents the entire top tree.

Cluster

A cluster is a connected subtree with at most two Boundary Vertices. The set of Boundary Vertices of a given cluster $\mathcal{C}$ is denoted as $\partial{C}$. With each cluster $\mathcal{C}$ the user may associate some meta information $I(\mathcal{C})$, and give methods to maintain it under the various internal operations.

Path Cluster

If $\pi(\mathcal{C})$ contains at least one edge then $\mathcal{C}$ is called a Path Cluster.

Point Cluster

See Leaf Cluster

Leaf Cluster

If $\pi(\mathcal{C})$ does not contain any edge, i.e. $\mathcal{C}$ has only one Boundary Vertex, then $\mathcal{C}$ is called a Leaf Cluster.

Edge Cluster

A Cluster containing a single edge is called an Edge Cluster.

Leaf Edge Cluster

A Leaf in the original Cluster is represented by a Cluster with just a single Boundary Vertex and is called a Leaf Edge Cluster.

Path Edge Cluster

Edge Clusters with two Boundary Nodes are called Path Edge Clusters.

Internal Node

A node in $\mathcal{C} \setminus \partial{C}$ is called an Internal Node of $\mathcal{C}$.

Cluster Path

The path between the Boundary Vertices of $\mathcal{C}$ is called the cluster path of $\mathcal{C}$ and it is denoted by $\pi(\mathcal{C})$.

Mergeable Clusters

Two Clusters $\mathcal{A}$ and $\mathcal{B}$ are Mergeable if $\mathcal{A} \cap \mathcal{B}$ is a singleton set (they have exactly one node in common) and $\mathcal{A} \cup \mathcal{B}$ is a Cluster.
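The glossary definitions can be checked mechanically on a small tree. The sketch below is illustrative only: it represents a cluster as a set of edges, and the helper names (`vertices`, `boundary`, `is_connected`, `is_cluster`, `mergeable`) are assumptions of this example, not part of any top tree library. External Boundary Vertices are passed in explicitly, since they count as boundary vertices of any cluster containing them.

```python
from collections import defaultdict


def vertices(edges):
    """Return the set of vertices touched by a collection of edges."""
    return {v for e in edges for v in e}


def boundary(tree_edges, cluster_edges, external=frozenset()):
    """Boundary vertices of a cluster given as a set of edges.

    A vertex is a boundary vertex if it has an incident tree edge outside
    the cluster, or if it is one of the external boundary vertices.
    """
    cluster = {frozenset(e) for e in cluster_edges}
    inside = vertices(cluster)
    result = {v for v in inside if v in external}
    for e in tree_edges:
        if frozenset(e) not in cluster:
            result |= inside & set(e)
    return result


def is_connected(edges):
    """True if the edges form one connected subtree (at least one edge)."""
    edges = [tuple(e) for e in edges]
    if not edges:
        return False
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = edges[0][0]
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj[x] - seen:
            seen.add(y)
            stack.append(y)
    return seen == set(adj)


def is_cluster(tree_edges, cluster_edges, external=frozenset()):
    """A cluster is a connected subtree with at most two boundary vertices."""
    return (is_connected(cluster_edges)
            and len(boundary(tree_edges, cluster_edges, external)) <= 2)


def mergeable(tree_edges, a, b, external=frozenset()):
    """Two clusters are mergeable if they share exactly one vertex
    and their union is again a cluster."""
    shared = vertices(a) & vertices(b)
    union = {frozenset(e) for e in a} | {frozenset(e) for e in b}
    return len(shared) == 1 and is_cluster(tree_edges, union, external)
```

For the tree with edges (1,2), (2,3), (3,4), (4,5), (3,6), the edge clusters {(1,2)} and {(2,3)} are mergeable, while {(2,3)} and {(3,4)} are not, because their union would have the three boundary vertices 2, 3 and 4.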

Introduction

Top trees are used for maintaining a dynamic forest (set of trees) under link and cut operations. The basic idea is to maintain a balanced binary tree $\Re$ of logarithmic height in the number of nodes $n$ of the original tree $\mathcal{T}$ (i.e. of height $\mathcal{O}(\log n)$); the top tree essentially represents the recursive subdivision of the original tree $\mathcal{T}$ into clusters.

In general the tree $\mathcal{T}$ may have weights on its edges.

There is a one-to-one correspondence between the edges of the original tree $\mathcal{T}$ and the leaf nodes of the top tree $\Re$, and each internal node of $\Re$ represents a cluster formed by the union of the clusters that are its children.

The top tree data structure can be initialized in $\mathcal{O}(n)$ time.

Therefore the top tree $\Re$ over $(\mathcal{T}, \partial{T})$ is a binary tree such that the nodes of $\Re$ are clusters of $(\mathcal{T}, \partial{T})$; the leaves of $\Re$ are the edges of $\mathcal{T}$; and the root of $\Re$ is the tree $\mathcal{T}$ itself, with a set of at most two External Boundary Vertices.

A tree with a single vertex has an empty top tree, and a tree with just one edge has a top tree consisting of a single node.
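The structural properties above can be made concrete for the simplest possible input. The following sketch assumes the underlying tree is a single path given as a vertex list, with the two path ends acting as External Boundary Vertices; it is a static illustration of the leaf/edge correspondence and the logarithmic height, not the dynamic structure itself. The names `Node`, `build`, `leaves` and `height` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Node:
    ends: Tuple[int, int]            # endpoints of the covered sub-path
    left: Optional["Node"] = None    # both children are None for an edge cluster
    right: Optional["Node"] = None


def build(path: Sequence[int], lo: int = 0,
          hi: Optional[int] = None) -> Optional[Node]:
    """Build a balanced binary tree of clusters over path[lo..hi]."""
    if hi is None:
        hi = len(path) - 1
    if hi <= lo:
        return None                  # single vertex: empty top tree
    if hi - lo == 1:
        return Node((path[lo], path[hi]))   # leaf = one edge
    mid = (lo + hi) // 2
    # The two halves share exactly the vertex path[mid], so they are mergeable.
    return Node((path[lo], path[hi]),
                build(path, lo, mid), build(path, mid, hi))


def leaves(node: Optional[Node]):
    """List the edges stored at the leaves, left to right."""
    if node is None:
        return []
    if node.left is None:
        return [node.ends]
    return leaves(node.left) + leaves(node.right)


def height(node: Optional[Node]) -> int:
    """Number of edges on the longest root-to-leaf path."""
    if node is None or node.left is None:
        return 0
    return 1 + max(height(node.left), height(node.right))
```

For a path on nine vertices this yields eight leaves, one per edge, and a tree of height three.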

Top trees are freely augmentable: the user gains considerable flexibility without having to deal with the internal workings of the data structure, which is treated as a black box.

Dynamic Operations

The following three are the Forest Updates available to the user.

Link(v, w): Here $v$ and $w$ are vertices in different trees $\mathcal{T}_1$ and $\mathcal{T}_2$. It returns a single top tree representing $\Re_v \cup \Re_w \cup \{(v,w)\}$.

Cut(v, w): Removes the edge $(v,w)$ from a tree $\mathcal{T}$ with top tree $\Re$, thereby turning it into two trees $\mathcal{T}_v$ and $\mathcal{T}_w$ and returning two top trees $\Re_v$ and $\Re_w$.

Expose(S): $S$ contains at most 2 vertices. It turns the original external vertices into normal vertices and makes the vertices of $S$ the new External Boundary Vertices of the top tree. If $S$ is nonempty it returns the new Root cluster $\mathcal{C}$ with $\partial{C} = S$. Expose fails if the vertices are from different trees.
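The semantics of the three updates can be pinned down with a deliberately naive reference model. The sketch below is an assumption-laden illustration: the class `NaiveForest` and its methods are hypothetical, every call costs time linear in the tree size rather than logarithmic, and the "root cluster" returned by `expose` is just the pair (vertices of the tree, boundary set). Such a model can be useful for testing a real implementation against.

```python
class NaiveForest:
    """Reference model of link / cut / expose; O(n) per call, not O(log n)."""

    def __init__(self, vertices):
        self.adj = {v: set() for v in vertices}

    def _component(self, v):
        """Vertices of the tree containing v (depth-first search)."""
        seen, stack = {v}, [v]
        while stack:
            x = stack.pop()
            for y in self.adj[x] - seen:
                seen.add(y)
                stack.append(y)
        return seen

    def link(self, v, w):
        """Join two different trees by the new edge (v, w)."""
        if w in self._component(v):
            raise ValueError("link: v and w are already in the same tree")
        self.adj[v].add(w)
        self.adj[w].add(v)

    def cut(self, v, w):
        """Remove the edge (v, w), splitting its tree in two."""
        if w not in self.adj[v]:
            raise ValueError("cut: (v, w) is not an edge")
        self.adj[v].discard(w)
        self.adj[w].discard(v)

    def expose(self, s=()):
        """Make the vertices of s the external boundary vertices.

        Returns None for an empty s, otherwise a stand-in for the root
        cluster: (vertex set of the tree, boundary set s).
        """
        s = set(s)
        if len(s) > 2:
            raise ValueError("expose: at most two vertices allowed")
        if not s:
            return None
        comp = self._component(next(iter(s)))
        if not s <= comp:
            raise ValueError("expose: vertices are in different trees")
        return frozenset(comp), frozenset(s)
```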

Internal Operations

The Forest updates are all carried out by a sequence of at most $\mathcal{O}(\log n)$ Internal Operations, the sequence of which is computed in a further $\mathcal{O}(\log n)$ time. During a tree update, a leaf cluster may change to a path cluster and vice versa. Updates to the top tree are done exclusively by these internal operations.

The information $I(\mathcal{C})$ is updated by calling a user-defined function associated with each internal operation.

Merge($\mathcal{A}$, $\mathcal{B}$): Here $\mathcal{A}$ and $\mathcal{B}$ are Mergeable Clusters; it returns $\mathcal{C}$ as the parent cluster of $\mathcal{A}$ and $\mathcal{B}$, whose boundary vertices are the boundary vertices of $\mathcal{A} \cup \mathcal{B}$. Computes $I(\mathcal{C})$ using $I(\mathcal{A})$ and $I(\mathcal{B})$.

Split($\mathcal{C}$): Here $\mathcal{C}$ is the root cluster $\mathcal{A} \cup \mathcal{B}$. It updates $I(\mathcal{A})$ and $I(\mathcal{B})$ using $I(\mathcal{C})$ and then deletes the cluster $\mathcal{C}$ from $\Re$.

Split is usually implemented using a Clean($\mathcal{C}$) method, which calls the user method for updating $I(\mathcal{A})$ and $I(\mathcal{B})$ using $I(\mathcal{C})$ and updates $I(\mathcal{C})$ so that it is known that no pending update is needed in its children. Then $\mathcal{C}$ is discarded without calling user-defined functions. Clean is often required for queries that do not need to Split. If Split does not use a Clean subroutine and Clean is required, its effect can be achieved, with some overhead, by combining Merge and Split.

The next two functions are analogous to the above two and are used for base clusters.

Create($v$, $w$): Creates a cluster $\mathcal{C}$ for the edge $(v,w)$. Sets $\partial{C} = \partial(v,w)$. $I(\mathcal{C})$ is computed from scratch.

Eradicate($\mathcal{C}$): $\mathcal{C}$ is the edge cluster $(v,w)$. The user-defined function is called to process $I(\mathcal{C})$ and then the cluster $\mathcal{C}$ is deleted from the top tree.
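One way to picture the user-defined functions attached to the four internal operations is as a small callback interface. The sketch below is hypothetical: the class names `UserInfo` and `EdgeCount`, the method signatures, and the `weight` argument are assumptions of this example and do not reproduce any particular implementation. The concrete subclass maintains, as $I(\mathcal{C})$, the number of edges in the cluster.

```python
from abc import ABC, abstractmethod


class UserInfo(ABC):
    """Hooks a top tree implementation would call (illustrative names)."""

    @abstractmethod
    def create(self, v, w, weight):
        """Compute I(C) from scratch for the edge cluster (v, w)."""

    @abstractmethod
    def merge(self, info_a, info_b):
        """Compute I(C) for the parent from I(A) and I(B)."""

    @abstractmethod
    def split(self, info_c, info_a, info_b):
        """Push pending information from I(C) down; return (I(A), I(B))."""

    @abstractmethod
    def eradicate(self, info_c):
        """Process I(C) before the edge cluster is deleted."""


class EdgeCount(UserInfo):
    """I(C) = number of edges in the cluster C."""

    def create(self, v, w, weight):
        return 1

    def merge(self, info_a, info_b):
        return info_a + info_b

    def split(self, info_c, info_a, info_b):
        return info_a, info_b      # nothing is delayed, so nothing to push down

    def eradicate(self, info_c):
        return None


# Simulating Create twice followed by one Merge:
hooks = EdgeCount()
a = hooks.create(1, 2, weight=5.0)
b = hooks.create(2, 3, weight=1.0)
c = hooks.merge(a, b)              # c == 2
```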

Non local search

The user can define a Choose($\mathcal{C}$) operation which, for a root (nonleaf) cluster, selects one of its child clusters. The top tree black box provides a Search($\mathcal{C}$) routine, which organizes Choose queries and the reorganization of the top tree (using the Internal operations) such that it locates the only edge in the intersection of all selected clusters. Sometimes the search should be limited to a path, and there is a variant of nonlocal search for such purposes. If there are two external boundary vertices in the root cluster $\mathcal{C}$, the edge is searched for only on the path $\pi(\mathcal{C})$. The following modification suffices: if only one of the root cluster's children is a path cluster, it is selected by default without calling the Choose operation.

Examples of non local search

Finding the i-th edge on a longer path from $v$ to $w$ can be done by $\mathcal{C}$ = Expose($\{v,w\}$) followed by Search($\mathcal{C}$) with an appropriate Choose. To implement Choose we use a global variable representing $v$ and a global variable representing $i$. Choose selects the cluster $\mathcal{A}$ with $v \in \partial{A}$ iff the length of $\pi(\mathcal{A})$ is at least $i$. To support the operation the length must be maintained in $I$.
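The Choose rule for the i-th edge can be sketched on a fixed binary tree of path clusters. This is a simplification under stated assumptions: a real Search also reorganizes the top tree between Choose calls, which is omitted here, and the state for $v$ and $i$ is kept in local variables instead of globals. The names `Cluster`, `leaf`, `merge` and `search` are hypothetical, and every cluster stores the number of edges on its cluster path as `length`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Cluster:
    length: int                           # number of edges on the cluster path
    edge: Optional[Tuple[int, int]] = None    # set only for edge clusters
    near: Optional["Cluster"] = None      # child whose boundary contains v
    far: Optional["Cluster"] = None       # the other child


def leaf(u, w):
    """Edge cluster for the path edge (u, w)."""
    return Cluster(1, (u, w))


def merge(near, far):
    """Parent path cluster; its path length is the sum of its path children."""
    return Cluster(near.length + far.length, None, near, far)


def search(root, i):
    """Return the i-th edge (1-based) on the cluster path, counted from v."""
    if not 1 <= i <= root.length:
        raise IndexError("i is outside the cluster path")
    c = root
    while c.edge is None:
        if c.near.length >= i:
            c = c.near                    # Choose: near child iff length >= i
        else:
            i -= c.near.length            # v moves to the shared boundary vertex
            c = c.far
    return c.edge
```

For the path 0, 1, 2, 3, 4 built as `merge(merge(leaf(0, 1), leaf(1, 2)), merge(leaf(2, 3), leaf(3, 4)))`, `search(root, 3)` descends into the far child with $i$ reduced to 1 and returns the edge (2, 3).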

A similar task can be formulated for a graph whose edges have nonunit lengths. In that case the distance could address an edge or a vertex between two edges. We could define Choose such that the edge leading to the vertex is returned in the latter case. An update could be defined that increases all edge lengths along a path by a constant. In such a scenario these updates are done in constant time just in the root cluster. Clean is required to distribute the delayed update to the children, and it should be called before Choose is invoked. Maintaining the length in $I$ would in that case require maintaining the unit length in $I$ as well.

Finding the center of the tree containing vertex $v$ can be done by finding either a bicenter edge or an edge with the center as one endpoint. The edge can be found by $\mathcal{C}$ = Expose($\{v\}$) followed by Search($\mathcal{C}$) with an appropriate Choose. Choose selects, between the children $\mathcal{A}$, $\mathcal{B}$ with $a \in \partial{A} \cap \partial{B}$, the child with the higher maxdistance($a$). To support the operation, the maximal distance in the cluster subtree from a boundary vertex should be maintained in $I$. That requires maintenance of the cluster path length as well.

Interesting Results and Applications

A number of interesting applications originally implemented by other methods have been easily implemented using the top tree's interface. Some of them include:

Maintaining a dynamic collection of weighted trees in $\mathcal{O}(\log n)$ time per link and cut, supporting queries about the maximum edge weight between any two vertices in $O(\log n)$ time.

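Each cluster $\mathcal{C}$ stores the maximum weight max_wt($\mathcal{C}$) on its cluster path; for a point cluster, max_wt($\mathcal{C}$) is initialised as $-\infty$. When a cluster is a union of two clusters, it is the maximum value of the two merged clusters. To find the maximum weight between $v$ and $w$, we set $\mathcal{C}$ = Expose($\{v,w\}$) and report max_wt($\mathcal{C}$).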

In the same setting, a common weight $x$ can be added to all edges on a given path $v \cdots w$ in $\mathcal{O}(\log n)$ time.

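We introduce a weight extra($\mathcal{C}$) to be added to all the edges in $\pi(\mathcal{C})$, which is maintained appropriately: Split($\mathcal{C}$) requires that, for each path child $\mathcal{A}$ of $\mathcal{C}$, we set max_wt($\mathcal{A}$) := max_wt($\mathcal{A}$) + extra($\mathcal{C}$) and extra($\mathcal{A}$) := extra($\mathcal{A}$) + extra($\mathcal{C}$). For $\mathcal{C}$ := join($\mathcal{A}$, $\mathcal{B}$), we set max_wt($\mathcal{C}$) := max{max_wt($\mathcal{A}$), max_wt($\mathcal{B}$)} and extra($\mathcal{C}$) := 0. Finally, to find the maximum weight on the path $v \cdots w$, we set $\mathcal{C}$ := Expose($\{v,w\}$) and return max_wt($\mathcal{C}$).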
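The max_wt / extra bookkeeping can be written out as plain user functions. The sketch below follows the rules stated above under a few assumptions of its own: the `Info` record, the `is_path` flag (marking whether a cluster is a path cluster), and the helper `add_to_path` (applying the update at the exposed root) are hypothetical names, and a point cluster is given max_wt $= -\infty$ so that taking the maximum over both children only picks up path children.

```python
import math
from dataclasses import dataclass


@dataclass
class Info:
    max_wt: float = -math.inf   # maximum weight on the cluster path
    extra: float = 0.0          # pending amount to add to every cluster-path edge
    is_path: bool = True        # False for a point (leaf) cluster


def create(weight, is_path=True):
    """I(C) for an edge cluster; a point cluster has no cluster-path edge."""
    return Info(weight if is_path else -math.inf, 0.0, is_path)


def join(a, b, is_path=True):
    """I(C) for C = join(A, B): max over the children, no pending extra."""
    best = max(a.max_wt, b.max_wt) if is_path else -math.inf
    return Info(best, 0.0, is_path)


def split(c, a, b):
    """Push the pending extra of C down to each path child before C is deleted."""
    for child in (a, b):
        if c.is_path and child.is_path:
            child.max_wt += c.extra
            child.extra += c.extra


def add_to_path(root, x):
    """Add x to every edge on the root's cluster path in constant time."""
    root.max_wt += x
    root.extra += x
```

With `a = create(3.0)`, `b = create(5.0)` and `c = join(a, b)`, calling `add_to_path(c, 2.0)` raises `c.max_wt` from 5.0 to 7.0 without touching the children; a later `split(c, a, b)` brings `a.max_wt` to 5.0 and `b.max_wt` to 7.0.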

[...] $v$ in $\mathcal{O}(\log n)$ time.

The distance between two vertices $v$ and $w$ can be found in $\mathcal{O}(\log n)$ time as length(Expose($\{v,w\}$)).

We maintain the length length($\mathcal{C}$) of the cluster path. The length is maintained like the maximum weight except that, if $\mathcal{C}$ is created by a join (Merge), length($\mathcal{C}$) is the sum of the lengths stored with its path children.
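The length rule differs from the maximum-weight rule only in the join step, which a few lines can show. In this sketch the record `LengthInfo`, the `is_path` flag and the function names are assumptions made for illustration; a point cluster is given length 0 because its cluster path has no edge.

```python
from dataclasses import dataclass


@dataclass
class LengthInfo:
    length: float    # length of the cluster path; 0 for a point cluster
    is_path: bool


def create(weight, is_path=True):
    """I(C) for an edge cluster of the given edge weight."""
    return LengthInfo(weight if is_path else 0.0, is_path)


def join(a, b, is_path=True):
    """length(C) is the sum of the lengths stored with the path children."""
    if not is_path:
        return LengthInfo(0.0, False)
    return LengthInfo(sum(c.length for c in (a, b) if c.is_path), True)


# Distance along the two-edge path with weights 4 and 6:
root = join(create(4.0), create(6.0))
distance = root.length    # 10.0
```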

[...] $\mathcal{O}(\log n)$ time.

[...] $\mathcal{O}(\log n)$ time.

[...] $O(\log^4 n)$, and $O(\log n/\log\log n)$ query time. Subsequent work by Holm, Rotenberg, and Thorup improves this to an amortized update time of $O(\log^2 n \log^2 \log n)$, also using top trees.

[...] $O(\log^5 n)$. Queries could be implemented even faster. The algorithm is not trivial; $I(\mathcal{C})$ uses $\Theta(\log^2 n)$ space.[2]

Implementation

Top trees have been implemented in a variety of ways. These include an implementation using a Multilevel Partition (Jacob Holm and Kristian de Lichtenberg, "Top-trees and dynamic graph algorithms", Technical Report), and implementations using Sleator-Tarjan s-t trees (typically with amortized time bounds) or Frederickson's Topology Trees (with worst-case time bounds) (Alstrup et al., "Maintaining Information in Fully Dynamic Trees with Top Trees").

Amortized implementations are simpler and have small multiplicative factors in their time complexity. By contrast, worst-case implementations allow speeding up queries by switching off unneeded info updates during the query (implemented by persistence techniques). After the query is answered, the original state of the top tree is used and the query version is discarded.

Using Multilevel Partitioning

Any partitioning of clusters of a tree $\mathcal{T}$ can be represented by a Cluster Partition Tree CPT($\mathcal{T}$), by replacing each cluster in the tree $\mathcal{T}$ by an edge. If we use a strategy P for partitioning $\mathcal{T}$ then the CPT would be CPT$_P$ $\mathcal{T}$. This is done recursively until only one edge remains.

Note that all the nodes of the corresponding top tree $\Re$ are uniquely mapped to edges of this multilevel partition. There may be some edges in the multilevel partition that do not correspond to any node in the top tree; these are the edges which represent only a single child in the level below, i.e. a simple cluster. Only the edges that correspond to composite clusters correspond to nodes in the top tree $\Re$.

The partitioning strategy matters when we partition the tree $\mathcal{T}$ into clusters. Only a careful strategy ensures that we end up with a Multilevel Partition (and therefore a top tree) of height $\mathcal{O}(\log n)$.

Such a strategy ensures the maintenance of the top tree in $\mathcal{O}(\log n)$ time.
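A multilevel partition is easiest to see on a path. The sketch below uses a deliberately simple strategy, pairing adjacent edges level by level, which is an assumption of this example and not the strategy of the cited report; partitioning a general tree needs more care. It shows both kinds of edges mentioned above: pairs replaced by one edge (composite clusters) and an unpaired edge carried up unchanged (a simple cluster with no top tree node of its own). The function names are hypothetical.

```python
def partition_level(edges):
    """Replace each pair of adjacent path edges by a single edge.

    An unpaired last edge is carried to the next level unchanged.
    """
    nxt = []
    for k in range(0, len(edges) - 1, 2):
        (a, _), (_, d) = edges[k], edges[k + 1]
        nxt.append((a, d))            # composite cluster: a top tree node
    if len(edges) % 2:
        nxt.append(edges[-1])         # simple cluster: single child only
    return nxt


def multilevel_partition(path):
    """Return all levels, from the original edges down to a single edge."""
    level = list(zip(path, path[1:]))
    levels = [level]
    while len(level) > 1:
        level = partition_level(level)
        levels.append(level)
    return levels
```

For the path 0, 1, 2, 3, 4, 5 the levels are the five original edges, then (0,2), (2,4), (4,5), then (0,4), (4,5), and finally the single edge (0,5). Since each level roughly halves the number of edges, the number of levels grows logarithmically with the path length.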

Notes and References

  1. Holm, J.; De Lichtenberg, K.; Thorup, M. (2001). "Poly-logarithmic deterministic fully-dynamic algorithms for connectivity, minimum spanning tree, 2-edge, and biconnectivity". Journal of the ACM. 48 (4): 723. doi:10.1145/502090.502095. S2CID 7273552.
  2. Holm, J.; De Lichtenberg, K.; Thorup, M. (2001). "Poly-logarithmic deterministic fully-dynamic algorithms for connectivity, minimum spanning tree, 2-edge, and biconnectivity". Journal of the ACM. 48 (4): 723. doi:10.1145/502090.502095. S2CID 7273552.
  3. Bille, Philip; Gørtz, Inge Li; Landau, Gad M.; Weimann, Oren (2015). "Tree Compression with Top Trees". Inf. Comput. 243: 166–177. arXiv:1304.5702. doi:10.1016/j.ic.2014.12.012.