From Gene Trees to Species Trees: Algorithms for Parsimonious Reconciliation
Nakhleh, Luay K.
Master of Science
One of the criteria for inferring a species tree from a collection of gene trees, when gene tree incongruence is assumed to be due to incomplete lineage sorting (ILS), is minimize deep coalescence, or MDC. Exact algorithms for inferring the species tree from rooted, binary gene trees under MDC were recently introduced. Nevertheless, in phylogenetic analyses of biological data sets, estimated gene trees may differ from the true gene trees, may be incompletely resolved, and are not necessarily rooted. Further, the MDC criterion considers only the topologies of the gene trees. The contributions of this work are threefold: 1. We propose new MDC formulations for the cases in which the gene trees are unrooted/binary, rooted/non-binary, and unrooted/non-binary, and prove structural theorems that allow us to extend the algorithms for the rooted/binary gene tree case to these cases in a straightforward manner. 2. We propose an algorithm for inferring a species tree from a collection of gene trees with coalescence times, one that takes into account the coalescence times as well as the topologies of the gene trees. 3. We devise MDC-based algorithms for cases in which multiple alleles per species may be sampled. We have implemented all of the algorithms in the PhyloNet software package and studied their performance in coalescent-based simulation studies, in comparison with other methods including democratic vote, greedy consensus, STEM, and GLASS.
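To make the MDC criterion concrete, the following is a minimal sketch of how the number of extra lineages might be counted for one rooted gene tree against a candidate species tree, with one allele per species. It is not the PhyloNet implementation. The nested-tuple tree encoding and the names `leaves`, `count_lineages`, `extra_lineages`, and `mdc_score` are assumptions made for illustration. The sketch only scores a given species tree; it does not search for the optimal one.

```python
# Trees are nested tuples of leaf names, e.g. (("A", "B"), "C").
# Assumption: gene tree and species tree share the same leaf set,
# with exactly one allele sampled per species.

def leaves(tree):
    """Return the set of leaf names under a (sub)tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def count_lineages(gene_tree, cluster):
    """Count maximal gene tree clades whose leaves all lie in `cluster`.

    This is the number of gene lineages leaving the top of the species
    tree branch whose descendant leaf set is `cluster`.
    """
    if leaves(gene_tree) <= cluster:
        return 1  # whole clade fits: one lineage, do not descend further
    if isinstance(gene_tree, str):
        return 0  # a single leaf outside the cluster
    return sum(count_lineages(child, cluster) for child in gene_tree)

def extra_lineages(species_tree, gene_tree):
    """Sum (lineages - 1) over every species tree node's branch."""
    total = 0
    stack = [species_tree]
    while stack:
        node = stack.pop()
        total += count_lineages(gene_tree, leaves(node)) - 1
        if not isinstance(node, str):
            stack.extend(node)
    return total

def mdc_score(species_tree, gene_trees):
    """Total extra lineages over a collection of gene trees."""
    return sum(extra_lineages(species_tree, gt) for gt in gene_trees)

if __name__ == "__main__":
    st = ((("A", "B"), "C"), "D")
    gts = [((("A", "B"), "C"), "D"), (("A", "D"), ("B", "C"))]
    print(mdc_score(st, gts))
```

Under MDC, the preferred species tree is the candidate with the smallest such score over all gene trees. The sketch handles non-binary rooted trees as written, but it does not cover the unrooted, coalescence-time, or multiple-allele settings described in the abstract.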
Applied sciences; Biological sciences; Bioinformatics; Computer science