Accurate genome relative abundance estimation for closely related species in a metagenomic sample

Persistent Link:
http://hdl.handle.net/10150/610276
Title:
Accurate genome relative abundance estimation for closely related species in a metagenomic sample
Author:
Sohn, Michael; An, Lingling; Pookhao, Naruekamol; Li, Qike
Affiliation:
Interdisciplinary Program in Statistics, University of Arizona, Tucson AZ 85721, USA; Department of Agricultural and Biosystems Engineering, University of Arizona, Tucson AZ 85721, USA
Issue Date:
2014
Publisher:
BioMed Central
Citation:
Sohn et al. BMC Bioinformatics 2014, 15:242 http://www.biomedcentral.com/1471-2105/15/242
Journal:
BMC Bioinformatics
Rights:
© 2014 Sohn et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0)
Collection Information:
This item is part of the UA Faculty Publications collection. For more information about this item or other items in the UA Campus Repository, contact the University of Arizona Libraries at repository@u.library.arizona.edu.
Abstract:
BACKGROUND: Metagenomics has great potential to discover previously unattainable information about microbial communities. An important prerequisite for such discoveries is to accurately estimate the composition of microbial communities. Most prevalent homology-based approaches rely solely on the results of an alignment tool such as BLAST, which limits their estimation accuracy to high ranks of the taxonomy tree.
RESULTS: We developed a new homology-based approach called Taxonomic Analysis by Elimination and Correction (TAEC), which utilizes the similarity in the genomic sequence in addition to the result of an alignment tool. The proposed method is comprehensively tested on various simulated benchmark datasets with diverse complexity of microbial structure. Compared with other available methods designed for estimating taxonomic composition at a relatively low taxonomic rank, TAEC demonstrates greater accuracy in quantifying genomes in a given microbial sample. We also applied TAEC to two real metagenomic datasets, an oral cavity dataset and a Crohn's disease dataset. Our results, while agreeing with previous findings at higher ranks of the taxonomy tree, provide accurate estimation of taxonomic compositions at the species/strain level, narrowing down which species/strains need more attention in the study of the oral cavity and Crohn's disease.
CONCLUSIONS: By taking account of the similarity in the genomic sequence, TAEC outperforms other available tools in estimating taxonomic composition at a very low rank, especially when closely related species/strains exist in a metagenomic sample.
EISSN:
1471-2105
DOI:
10.1186/1471-2105-15-242
Keywords:
Metagenomics; Alignment similarity; Genomic similarity; Closely related species
Version:
Final published version
Additional Links:
http://www.biomedcentral.com/1471-2105/15/242

Full metadata record

DC Field | Value | Language
dc.contributor.author | Sohn, Michael | en
dc.contributor.author | An, Lingling | en
dc.contributor.author | Pookhao, Naruekamol | en
dc.contributor.author | Li, Qike | en
dc.date.accessioned | 2016-05-20T09:02:54Z | -
dc.date.available | 2016-05-20T09:02:54Z | -
dc.date.issued | 2014 | en
dc.identifier.citation | Sohn et al. BMC Bioinformatics 2014, 15:242 http://www.biomedcentral.com/1471-2105/15/242 | en
dc.identifier.doi | 10.1186/1471-2105-15-242 | en
dc.identifier.uri | http://hdl.handle.net/10150/610276 | -
dc.description.abstract | BACKGROUND: Metagenomics has great potential to discover previously unattainable information about microbial communities. An important prerequisite for such discoveries is to accurately estimate the composition of microbial communities. Most prevalent homology-based approaches rely solely on the results of an alignment tool such as BLAST, which limits their estimation accuracy to high ranks of the taxonomy tree. RESULTS: We developed a new homology-based approach called Taxonomic Analysis by Elimination and Correction (TAEC), which utilizes the similarity in the genomic sequence in addition to the result of an alignment tool. The proposed method is comprehensively tested on various simulated benchmark datasets with diverse complexity of microbial structure. Compared with other available methods designed for estimating taxonomic composition at a relatively low taxonomic rank, TAEC demonstrates greater accuracy in quantifying genomes in a given microbial sample. We also applied TAEC to two real metagenomic datasets, an oral cavity dataset and a Crohn's disease dataset. Our results, while agreeing with previous findings at higher ranks of the taxonomy tree, provide accurate estimation of taxonomic compositions at the species/strain level, narrowing down which species/strains need more attention in the study of the oral cavity and Crohn's disease. CONCLUSIONS: By taking account of the similarity in the genomic sequence, TAEC outperforms other available tools in estimating taxonomic composition at a very low rank, especially when closely related species/strains exist in a metagenomic sample. | en
dc.language.iso | en | en
dc.publisher | BioMed Central | en
dc.relation.url | http://www.biomedcentral.com/1471-2105/15/242 | en
dc.rights | © 2014 Sohn et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0) | en
dc.subject | Metagenomics | en
dc.subject | Alignment similarity | en
dc.subject | Genomic similarity | en
dc.subject | Closely related species | en
dc.title | Accurate genome relative abundance estimation for closely related species in a metagenomic sample | en
dc.type | Article | en
dc.identifier.eissn | 1471-2105 | en
dc.contributor.department | Interdisciplinary Program in Statistics, University of Arizona, Tucson AZ 85721, USA | en
dc.contributor.department | Department of Agricultural and Biosystems Engineering, University of Arizona, Tucson AZ 85721, USA | en
dc.identifier.journal | BMC Bioinformatics | en
dc.description.collectioninformation | This item is part of the UA Faculty Publications collection. For more information about this item or other items in the UA Campus Repository, contact the University of Arizona Libraries at repository@u.library.arizona.edu. | en
dc.eprint.version | Final published version | en