Algorithms of Phylogenetic Trees Construction
Generation Description
UPGMA stands for Unweighted Pair Group Method using Arithmetic average, which is a sequential clustering algorithm. It works by clustering the sequences, at each stage amalgamating two OTUs and at the same time creating a new node in the tree. The edge lengths are determined by the difference in the heights of the nodes at the top and bottom of an edge. The distance di,j between two composite OTUs Ci and Cj, is the arithmetic mean of the pairwise distances between constituent OTUs of composite OTUs:
1 ___ di,j = ----------- \ dp,q |Ci||Cj| /__ p in Ci, q in Cjwhere |Ci| and |Cj| denote the number of constituent OTUs in the composite OTUs Ci and Cj, respectively.
The construction of the tree is bottom-up, from the leaves to the root node.
Algorithm
Assign each sequence i to its own OTUs Ci
When only two clusters i, j remain, place the root at height di,j / 2
The time and space complexity of UPGMA is O(n2), since there are n-1 iterations, with O(n) work in each one.
Send comments and suggestions about PTC to khuri@cs.sjsu.edu or cherryyang@yahoo.com