Fizzy: feature subset selection for metagenomics

Persistent Link:
http://hdl.handle.net/10150/610268
Title:
Fizzy: feature subset selection for metagenomics
Author:
Ditzler, Gregory; Morrison, J. Calvin; Lan, Yemin; Rosen, Gail L.
Affiliation:
Department of Electrical & Computer Engineering, The University of Arizona; Department of Electrical & Computer Engineering, Drexel University; School of Biomedical Engineering, Science and Health, Drexel University
Issue Date:
2015
Publisher:
BioMed Central
Citation:
Ditzler et al. BMC Bioinformatics (2015) 16:358 DOI 10.1186/s12859-015-0793-8
Journal:
BMC Bioinformatics
Rights:
© 2015 Ditzler et al. Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/)
Collection Information:
This item is part of the UA Faculty Publications collection. For more information about this item or other items in the UA Campus Repository, contact the University of Arizona Libraries at repository@u.library.arizona.edu.
Abstract:
BACKGROUND: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- & β-diversity. Feature subset selection - a sub-field of machine learning - can also provide a unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can obtain the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome. RESULTS: We have developed a new Python command line tool for microbial ecologists, compatible with the widely adopted BIOM format, that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets. CONCLUSIONS: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.
EISSN:
1471-2105
DOI:
10.1186/s12859-015-0793-8
Keywords:
Feature subset selection; Comparative metagenomics; Open-source software
Version:
Final published version
Additional Links:
http://www.biomedcentral.com/1471-2105/16/358
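
The abstract refers to information-theoretic feature subset selection without showing what such a method computes. As a rough illustration only, the sketch below ranks features by their empirical mutual information with a class label and keeps the top few. It is not Fizzy's implementation or command line interface; the function names, the OTU names, and the toy presence/absence data are all hypothetical and chosen for this example.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y), in bits, of two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(table, labels, n_select):
    """Score each feature by I(feature; label) and return the top n_select names."""
    scores = {name: mutual_information(values, labels)
              for name, values in table.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_select], scores


# Hypothetical toy data: presence (1) or absence (0) of three made-up OTUs
# in six samples, with a made-up two-class phenotype label per sample.
toy_table = {
    "OTU_a": [1, 1, 1, 0, 0, 0],  # tracks the label exactly
    "OTU_b": [1, 0, 1, 0, 1, 0],  # only weakly related to the label
    "OTU_c": [1, 1, 1, 1, 1, 1],  # constant, so it carries no information
}
toy_labels = ["young", "young", "young", "old", "old", "old"]

selected, scores = rank_features(toy_table, toy_labels, n_select=2)
print(selected)
```

Scoring each feature independently is the simplest information-theoretic criterion; it ignores redundancy between features, which more elaborate subset selection criteria try to account for. Real abundance data would also need to be discretized before counting.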

Full metadata record

DC Field | Value | Language
dc.contributor.author | Ditzler, Gregory | en
dc.contributor.author | Morrison, J. Calvin | en
dc.contributor.author | Lan, Yemin | en
dc.contributor.author | Rosen, Gail L. | en
dc.date.accessioned | 2016-05-20T09:02:42Z | -
dc.date.available | 2016-05-20T09:02:42Z | -
dc.date.issued | 2015 | en
dc.identifier.citation | Ditzler et al. BMC Bioinformatics (2015) 16:358 DOI 10.1186/s12859-015-0793-8 | en
dc.identifier.doi | 10.1186/s12859-015-0793-8 | en
dc.identifier.uri | http://hdl.handle.net/10150/610268 | -
dc.description.abstract | BACKGROUND: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- & β-diversity. Feature subset selection - a sub-field of machine learning - can also provide a unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can obtain the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome. RESULTS: We have developed a new Python command line tool for microbial ecologists, compatible with the widely adopted BIOM format, that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets. CONCLUSIONS: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy. | en
dc.language.iso | en | en
dc.publisher | BioMed Central | en
dc.relation.url | http://www.biomedcentral.com/1471-2105/16/358 | en
dc.rights | © 2015 Ditzler et al. Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/) | en
dc.subject | Feature subset selection | en
dc.subject | Comparative metagenomics | en
dc.subject | Open-source software | en
dc.title | Fizzy: feature subset selection for metagenomics | en
dc.type | Article | en
dc.identifier.eissn | 1471-2105 | en
dc.contributor.department | Department of Electrical & Computer Engineering, The University of Arizona | en
dc.contributor.department | Department of Electrical & Computer Engineering, Drexel University | en
dc.contributor.department | School of Biomedical Engineering, Science and Health, Drexel University | en
dc.identifier.journal | BMC Bioinformatics | en
dc.description.collectioninformation | This item is part of the UA Faculty Publications collection. For more information about this item or other items in the UA Campus Repository, contact the University of Arizona Libraries at repository@u.library.arizona.edu. | en
dc.eprint.version | Final published version | en