Extracting conflict-free information from multi-labeled trees

Persistent Link:
http://hdl.handle.net/10150/609998
Title:
Extracting conflict-free information from multi-labeled trees
Author:
Deepak, Akshay; Fernandez-Baca, David; McMahon, Michelle
Affiliation:
Department of Computer Science, Iowa State University, Ames, Iowa, USA; School of Plant Sciences, University of Arizona, Tucson, Arizona, USA
Issue Date:
2013
Publisher:
BioMed Central
Citation:
Deepak et al. Algorithms for Molecular Biology 2013, 8:18 http://www.almob.org/content/8/18
Journal:
Algorithms for Molecular Biology
Rights:
© 2013 Deepak et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0)
Collection Information:
This item is part of the UA Faculty Publications collection. For more information about this item or other items in the UA Campus Repository, contact the University of Arizona Libraries at repository@u.library.arizona.edu.
Abstract:
BACKGROUND: A multi-labeled tree, or MUL-tree, is a phylogenetic tree where two or more leaves share a label, e.g., a species name. A MUL-tree can imply multiple conflicting phylogenetic relationships for the same set of taxa, but can also contain conflict-free information that is of interest and yet is not obvious.
RESULTS: We define the information content of a MUL-tree T as the set of all conflict-free quartet topologies implied by T, and define the maximal reduced form of T as the smallest tree that can be obtained from T by pruning leaves and contracting edges while retaining the same information content. We show that any two MUL-trees with the same information content exhibit the same reduced form. This introduces an equivalence relation among MUL-trees with potential applications to comparing MUL-trees. We present an efficient algorithm to reduce a MUL-tree to its maximally reduced form and evaluate its performance on empirical datasets in terms of both quality of the reduced tree and the degree of data reduction achieved.
CONCLUSIONS: Our measure of conflict-free information content based on quartets is simple and topologically appealing. In the experiments, the maximally reduced form is often much smaller than the original tree, yet retains most of the taxa. The reduction algorithm is quadratic in the number of leaves and its complexity is unaffected by the multiplicity of leaf labels or the degree of the nodes.
EISSN:
1748-7188
DOI:
10.1186/1748-7188-8-18
Keywords:
Phylogenetic trees; Evolutionary trees; Multi-labeled trees; Reduction; Singly-labeled trees
Version:
Final published version
Additional Links:
http://www.almob.org/content/8/1/18

Full metadata record

DC Field | Value | Language
dc.contributor.author | Deepak, Akshay | en
dc.contributor.author | Fernandez-Baca, David | en
dc.contributor.author | McMahon, Michelle | en
dc.date.accessioned | 2016-05-20T08:55:46Z | -
dc.date.available | 2016-05-20T08:55:46Z | -
dc.date.issued | 2013 | en
dc.identifier.citation | Deepak et al. Algorithms for Molecular Biology 2013, 8:18 http://www.almob.org/content/8/18 | en
dc.identifier.doi | 10.1186/1748-7188-8-18 | en
dc.identifier.uri | http://hdl.handle.net/10150/609998 | -
dc.description.abstract | BACKGROUND: A multi-labeled tree, or MUL-tree, is a phylogenetic tree where two or more leaves share a label, e.g., a species name. A MUL-tree can imply multiple conflicting phylogenetic relationships for the same set of taxa, but can also contain conflict-free information that is of interest and yet is not obvious. RESULTS: We define the information content of a MUL-tree T as the set of all conflict-free quartet topologies implied by T, and define the maximal reduced form of T as the smallest tree that can be obtained from T by pruning leaves and contracting edges while retaining the same information content. We show that any two MUL-trees with the same information content exhibit the same reduced form. This introduces an equivalence relation among MUL-trees with potential applications to comparing MUL-trees. We present an efficient algorithm to reduce a MUL-tree to its maximally reduced form and evaluate its performance on empirical datasets in terms of both quality of the reduced tree and the degree of data reduction achieved. CONCLUSIONS: Our measure of conflict-free information content based on quartets is simple and topologically appealing. In the experiments, the maximally reduced form is often much smaller than the original tree, yet retains most of the taxa. The reduction algorithm is quadratic in the number of leaves and its complexity is unaffected by the multiplicity of leaf labels or the degree of the nodes. | en
dc.language.iso | en | en
dc.publisher | BioMed Central | en
dc.relation.url | http://www.almob.org/content/8/1/18 | en
dc.rights | © 2013 Deepak et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0) | en
dc.subject | Phylogenetic trees | en
dc.subject | Evolutionary trees | en
dc.subject | Multi-labeled trees | en
dc.subject | Reduction | en
dc.subject | Singly-labeled trees | en
dc.title | Extracting conflict-free information from multi-labeled trees | en
dc.type | Article | en
dc.identifier.eissn | 1748-7188 | en
dc.contributor.department | Department of Computer Science, Iowa State University, Ames, Iowa, USA | en
dc.contributor.department | School of Plant Sciences, University of Arizona, Tucson, Arizona, USA | en
dc.identifier.journal | Algorithms for Molecular Biology | en
dc.description.collectioninformation | This item is part of the UA Faculty Publications collection. For more information about this item or other items in the UA Campus Repository, contact the University of Arizona Libraries at repository@u.library.arizona.edu. | en
dc.eprint.version | Final published version | en