Generalized tree alignment

In computational phylogenetics, generalized tree alignment is the problem of producing a multiple sequence alignment and a phylogenetic tree on a set of sequences simultaneously, as opposed to separately.^[1]

Formally, generalized tree alignment is the following optimization problem.

Input: A set $S$ of sequences and an edit distance function $d$ between sequences.

Output: A tree $T$ leaf-labeled by $S$ and labeled with sequences at the internal nodes, such that $\sum_{e \in T} d(e)$ is minimized, where $d(e)$ is the edit distance between the endpoints of $e$.^[2]
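As an illustration of the objective, the following minimal sketch evaluates the sum of d(e) over the edges of one fully labeled candidate tree. It assumes unit-cost Levenshtein distance as the edit distance function; the names `edit_distance` and `tree_cost`, the node names, and the example sequences are hypothetical and not taken from the cited sources. It only scores a given tree and does not search for the optimal one.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance (an assumed choice for d)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def tree_cost(labels: Dict[str, str], edges: List[Tuple[str, str]]) -> int:
    """Sum of d(e) over all edges e, where d(e) compares the two endpoint labels."""
    return sum(edit_distance(labels[u], labels[v]) for u, v in edges)


# Hypothetical example: three leaves attached to one internal node "x".
labels = {"a": "ACGT", "b": "ACGA", "c": "TCGT", "x": "ACGT"}
edges = [("x", "a"), ("x", "b"), ("x", "c")]
print(tree_cost(labels, edges))  # 0 + 1 + 1 = 2
```

Solving the optimization problem itself would require searching over both tree shapes and internal labels, which this sketch does not attempt.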

Note that this is in contrast to tree alignment, where the tree is provided as input.
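The difference between the two problems can be sketched on four sequences. In the toy code below, `fixed_topology_cost` plays the role of tree alignment: the tree shape is given and only the two internal labels are chosen. `generalized_cost` additionally tries all three unrooted shapes on four leaves. As a loudly stated simplification, internal labels are restricted to the input sequences themselves, whereas the actual problems allow arbitrary sequences; the function names and example data are hypothetical, and unit-cost Levenshtein distance is an assumed choice of edit distance.

```python
from itertools import product
from typing import List, Tuple

Split = Tuple[Tuple[int, int], Tuple[int, int]]


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance (an assumed choice of edit distance)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def fixed_topology_cost(seqs: List[str], split: Split) -> Tuple[int, str, str]:
    """Tree given as input: leaves p, q hang off internal node u, leaves r, s
    hang off internal node v, and u is joined to v. Only the labels of u and v
    are chosen (here, simplistically, from the input sequences)."""
    (p, q), (r, s) = split
    best = None
    for u, v in product(seqs, repeat=2):
        cost = (edit_distance(u, seqs[p]) + edit_distance(u, seqs[q])
                + edit_distance(u, v)
                + edit_distance(v, seqs[r]) + edit_distance(v, seqs[s]))
        if best is None or cost < best[0]:
            best = (cost, u, v)
    return best


def generalized_cost(seqs: List[str]) -> Tuple[int, Split]:
    """Tree not given: also minimize over the three unrooted four-leaf shapes."""
    splits: List[Split] = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]
    return min((fixed_topology_cost(seqs, sp)[0], sp) for sp in splits)


seqs = ["AAAA", "AAAT", "GGGG", "GGGT"]  # hypothetical input set
print(fixed_topology_cost(seqs, ((0, 2), (1, 3)))[0])  # cost on a poor fixed tree
print(generalized_cost(seqs))                          # best cost and its shape
```

On this input the fixed shape that pairs "AAAA" with "GGGG" costs more than the shape found when the topology is also optimized, which is the distinction the note draws.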

Notes and References

1. Schwikowski, Benno; Vingron, Martin (1997). "The Deferred Path Heuristic for the Generalized Tree Alignment Problem". Journal of Computational Biology. 4 (3): 415–431. doi:10.1089/cmb.1997.4.415. ISSN 1066-5277. PMID 9278068.
2. Aluru, Srinivas (21 December 2005). Handbook of Computational Molecular Biology. CRC Press. 19–26. ISBN 978-1-4200-3627-5.