Robinson and Foulds topological distance

Vladimir Makarenkov
Département de Sciences Biologiques
Université de Montréal
C.P. 6128, succursale Centre-ville
Montréal, Québec H3C 3J7, Canada
This program allows to compute in the optimal time the Robinson and Foulds topological distance between two or more additive trees given their distance matrices. The program is based on the algorithm proposed by Makarenkov and Leclerc (1999, b) for calculation of this distance. The algorithm implemented in this program uses the notion of circular orders to compare the topology of two trees. The algorithm described in Makarenkov and Leclerc (1999, b) has optimal time complexity, requiring O(n2) time when performed on two nxn distance matrices. The Robinson and Foulds topological distance is an important and frequently used tool to compare additive (phylogenetic) tree structures (see for instance Robinson and Foulds (1981) or Makarenkov and Leclerc (1999, a,b)). This distance is equal to the minimum number of elementary operations, consisting of merging or splitting nodes, necessary to transform one tree into the other. As proved in Robinson and Foulds (1981), it is also the number of bipartitions, or Buneman's splits (1971), which belong to exactly one of the two trees. If we deal with two unrooted trees having no internal vertices labeled according to the elements of the set X, the Robinson and Foulds distance of on the set X of n elements varies between 0 (when the trees are isomorphic) and 2n-6 (when all non-trivial bipartitions in two trees are different; a trivial bipartition corresponds to an edge incident to a leaf). The performed algorithm is also part of the computer package T-REX (Tree Reconstruction) including some popular methods of tree reconstruction such as ADDTREE by Sattath and Tversky (1977), the Neighbor Joining method by Saitou and Nei (1987), the Unweighted Neighbor Joining method by Gascuel (1997), the Method of Weights by Makarenkov and Leclerc (1999, a), and others. References:
Buneman, P. 1971. The Recovery of Trees from Measures of Dissimilarity, 387-395. In Mathematics in Archaeological and Historical Sciences, eds. F.R. Hodson, D.G. Kendall and P. Tautu, Edinburgh : Edinburgh University Press. Day, W.H.E. 1985. Optimal Algorithms for Comparing Trees with Labelled Leaves. Journal of Classification 2: 7-28. Gascuel, O. 1997. Concerning the NJ algorithm and its unweighted version UNJ, 149-170 In Mathematical hierarchies and Biology (B. Mirkin et al., eds.), DIMACS Series in Discrete Mathematics and Theoretical Computer Science 37, Amer. Math. Soc., Providence, R.I. Makarenkov, V., and Leclerc, B. 1997. Tree metrics and their circular orders: some uses for the reconstruction and fitting of phylogenetic trees, 183-208. In Mathematical hierarchies and Biology (B. Mirkin et al., eds.), DIMACS Series in Discrete Mathematics and Theoretical Computer Science 37, Amer. Math. Soc., Providence, R.I. Makarenkov, V., and Leclerc, B. 1999 (a). The fitting of a tree metric to a given dissimilarity with the weighted least squares criterion. Journal of Classification 16: 3-26. Makarenkov, V., and Leclerc, B. 2000. An optimal way to compare additive trees using circular orders. Journal of Computational Biology 7: 731-744. Robinson, D.R., and Foulds, L.R. 1981. Comparison of phylogenetic trees. Mathematical Biosciences 53: 131-147. Saitou, N., and Nei, M. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology Evolution 4: 406-425. Sattath, S., and Tversky, A. 1977. Additive similarity trees. Psychometrika 42: 319-345.