Generalized tree alignment

In computational phylogenetics, generalized tree alignment is the problem of producing a multiple sequence alignment and a phylogenetic tree on a set of sequences simultaneously, as opposed to separately.[1]

Formally, generalized tree alignment is the following optimization problem.

Input: A set $S$ of sequences and an edit distance function $d$ between sequences.

Output: A tree $T$ leaf-labeled by $S$ and labeled with sequences at the internal nodes, such that $\sum_{e \in T} d(e)$ is minimized, where the sum ranges over the edges $e$ of $T$ and $d(e)$ is the edit distance between the sequences at the endpoints of $e$.[2]
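To make the objective concrete, the following minimal sketch evaluates the cost of one candidate solution: a tree given as an edge list, with a sequence assigned to every node. It assumes unit-cost Levenshtein distance for the edit distance function, which is only one possible choice. The names `edit_distance` and `tree_cost` and the toy sequences are hypothetical and chosen for illustration. The sketch scores a single labeled tree; it does not search over trees or internal labels, which is the hard part of the problem.

```python
from typing import Callable, Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance (an assumed choice for d)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def tree_cost(
    edges: List[Tuple[str, str]],
    labels: Dict[str, str],
    d: Callable[[str, str], int] = edit_distance,
) -> int:
    """Sum of d over the sequences at the endpoints of every edge."""
    return sum(d(labels[u], labels[v]) for u, v in edges)


# Hypothetical toy instance: three leaves a, b, c joined to one internal node x.
labels = {"a": "ACGT", "b": "ACGA", "c": "TCGT", "x": "ACGT"}
edges = [("x", "a"), ("x", "b"), ("x", "c")]

print(tree_cost(edges, labels))  # prints 2
```

Changing the internal label changes the cost: with `x` set to `"ACGA"` the same tree costs 3, so both the tree shape and the internal sequences affect the objective.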

This contrasts with tree alignment, where the tree is provided as input and only the sequences at the internal nodes remain to be chosen.
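The difference between the two problems can be written compactly. In the sketch below, $\lambda$ (a labeling that assigns a sequence to every node of $T$ and agrees with $S$ on the leaves) and $\mathcal{T}(S)$ (the set of trees leaf-labeled by $S$) are notation introduced here for illustration; they are not taken from the cited sources.

```latex
\begin{align*}
\text{Tree alignment ($T$ given):} \quad
  & \min_{\lambda} \sum_{(u,v) \in E(T)} d\bigl(\lambda(u), \lambda(v)\bigr) \\
\text{Generalized tree alignment:} \quad
  & \min_{T \in \mathcal{T}(S)} \; \min_{\lambda} \sum_{(u,v) \in E(T)} d\bigl(\lambda(u), \lambda(v)\bigr)
\end{align*}
```

The only difference is the outer minimization over trees: the generalized problem optimizes the tree and the internal sequences together.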

Notes and References

  1. Schwikowski, Benno; Vingron, Martin (1997). "The Deferred Path Heuristic for the Generalized Tree Alignment Problem". Journal of Computational Biology. 4 (3): 415–431. ISSN 1066-5277. doi:10.1089/cmb.1997.4.415. PMID 9278068.
  2. Aluru, Srinivas, ed. (21 December 2005). Handbook of Computational Molecular Biology. CRC Press. ISBN 978-1-4200-3627-5. 19–26.