The taxonomic name resolution service: an online tool for automated standardization of plant names

Persistent Link:
http://hdl.handle.net/10150/610265
Title:
The taxonomic name resolution service: an online tool for automated standardization of plant names
Author:
Boyle, Brad; Hopkins, Nicole; Lu, Zhenyuan; Raygoza Garay, Juan Antonio; Mozzherin, Dmitry; Rees, Tony; Matasci, Naim; Narro, Martha; Piel, William; Mckay, Sheldon; Lowry, Sonya; Freeland, Chris; Peet, Robert; Enquist, Brian
Affiliation:
Department of Ecology and Evolutionary Biology, University of Arizona Tucson, P.O. Box 210088, Tucson, AZ, 85721, USA; The iPlant Collaborative, Thomas W. Keating Bioresearch Building, 1657 East Helen Street, Tucson, AZ, 85721, USA; BIO5 Institute, 1657 East Helen Street, PO Box 210240, Tucson, AZ, 85721-0240, USA; Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY, 11724-2202, USA; Center for Library and Informatics, Marine Biological Laboratory, 7 MBL Street, Woods Hole, MA, 02543, USA; Divisional Data Centre, CSIRO Marine and Atmospheric Research, GPO Box 1538, Hobart, Tasmania, 7001, Australia; Yale-NUS College, 6 College Avenue East, Singapore, 138614, Singapore; Missouri Botanical Garden, 4344 Shaw Blvd., St. Louis, MO, 63110, USA; Department of Biology, CB 3280, University of North Carolina, Chapel Hill, NC, 27599-3280, USA
Issue Date:
2013
Publisher:
BioMed Central
Citation:
Boyle et al. BMC Bioinformatics 2013, 14:16 http://www.biomedcentral.com/1471-2105/14/16
Journal:
BMC Bioinformatics
Rights:
© 2013 Boyle et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0)
Collection Information:
This item is part of the UA Faculty Publications collection. For more information about this item or other items in the UA Campus Repository, contact the University of Arizona Libraries at repository@u.library.arizona.edu.
Abstract:
BACKGROUND: The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this 'names problem' has become a fundamental obstacle to integrating disparate data sources and advancing the progress of biodiversity science.
RESULTS: The TNRS, or Taxonomic Name Resolution Service, is an online application for automated and user-supervised standardization of plant scientific names. The TNRS builds upon and extends existing open-source applications for name parsing and fuzzy matching. Names are standardized against multiple reference taxonomies, including the Missouri Botanical Garden's Tropicos database. Capable of processing thousands of names in a single operation, the TNRS parses and corrects misspelled names and authorities, standardizes variant spellings, and converts nomenclatural synonyms to accepted names. Family names can be included to increase match accuracy and resolve many types of homonyms. Partial matching of higher taxa combined with extraction of annotations, accession numbers and morphospecies allows the TNRS to standardize taxonomy across a broad range of active and legacy datasets.
CONCLUSIONS: We show how the TNRS can resolve many forms of taxonomic semantic heterogeneity, correct spelling errors and eliminate spurious names. As a result, the TNRS can aid the integration of disparate biological datasets. Although the TNRS was developed to aid in standardizing plant names, its underlying algorithms and design can be extended to all organisms and nomenclatural codes. The TNRS is accessible via a web interface at http://tnrs.iplantcollaborative.org/ and as a RESTful web service and application programming interface. Source code is available at https://github.com/iPlantCollaborativeOpenSource/TNRS/.
EISSN:
1471-2105
DOI:
10.1186/1471-2105-14-16
Keywords:
Biodiversity informatics; Database integration; Taxonomy; Plants
Version:
Final published version
Additional Links:
http://www.biomedcentral.com/1471-2105/14/16
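
Illustrative example:
The abstract describes correcting misspelled names by fuzzy matching and converting synonyms to accepted names. The Python sketch below illustrates that general idea only. It is not the TNRS's actual algorithm or API: the reference list, the synonym map, the `resolve` function, and the similarity cutoff are all hypothetical, and the matching uses the standard library's `difflib` rather than the open-source tools the TNRS builds on.

```python
import difflib

# Hypothetical miniature reference taxonomy (an assumption for illustration,
# not data from Tropicos or the TNRS).
ACCEPTED = ["Quercus alba", "Quercus rubra", "Acer saccharum", "Pinus ponderosa"]
SYNONYMS = {"Acer saccharophorum": "Acer saccharum"}  # synonym -> accepted name


def resolve(name, cutoff=0.8):
    """Fuzzy-match a submitted name and map it to an accepted name.

    Returns (matched_name, accepted_name), or None when no reference
    name is at least `cutoff` similar to the submitted name.
    """
    candidates = ACCEPTED + list(SYNONYMS)
    # Normalize whitespace and capitalization: "Genus epithet".
    cleaned = " ".join(name.split()).capitalize()
    # Keep only the single best candidate above the similarity cutoff.
    matches = difflib.get_close_matches(cleaned, candidates, n=1, cutoff=cutoff)
    if not matches:
        return None
    matched = matches[0]
    # Convert a synonym to its accepted name; accepted names map to themselves.
    return matched, SYNONYMS.get(matched, matched)


if __name__ == "__main__":
    for submitted in ["quercus  albba", "Acer saccharophorum", "Zea mays"]:
        print(submitted, "->", resolve(submitted))
```

In this sketch a misspelling such as "quercus  albba" resolves to "Quercus alba", the synonym "Acer saccharophorum" resolves to "Acer saccharum", and a name absent from the reference list returns `None` so that it can be flagged for manual review rather than silently mismatched.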

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Boyle, Brad | en |
| dc.contributor.author | Hopkins, Nicole | en |
| dc.contributor.author | Lu, Zhenyuan | en |
| dc.contributor.author | Raygoza Garay, Juan Antonio | en |
| dc.contributor.author | Mozzherin, Dmitry | en |
| dc.contributor.author | Rees, Tony | en |
| dc.contributor.author | Matasci, Naim | en |
| dc.contributor.author | Narro, Martha | en |
| dc.contributor.author | Piel, William | en |
| dc.contributor.author | Mckay, Sheldon | en |
| dc.contributor.author | Lowry, Sonya | en |
| dc.contributor.author | Freeland, Chris | en |
| dc.contributor.author | Peet, Robert | en |
| dc.contributor.author | Enquist, Brian | en |
| dc.date.accessioned | 2016-05-20T09:02:37Z | - |
| dc.date.available | 2016-05-20T09:02:37Z | - |
| dc.date.issued | 2013 | en |
| dc.identifier.citation | Boyle et al. BMC Bioinformatics 2013, 14:16 http://www.biomedcentral.com/1471-2105/14/16 | en |
| dc.identifier.doi | 10.1186/1471-2105-14-16 | en |
| dc.identifier.uri | http://hdl.handle.net/10150/610265 | - |
| dc.description.abstract | BACKGROUND: The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this 'names problem' has become a fundamental obstacle to integrating disparate data sources and advancing the progress of biodiversity science. RESULTS: The TNRS, or Taxonomic Name Resolution Service, is an online application for automated and user-supervised standardization of plant scientific names. The TNRS builds upon and extends existing open-source applications for name parsing and fuzzy matching. Names are standardized against multiple reference taxonomies, including the Missouri Botanical Garden's Tropicos database. Capable of processing thousands of names in a single operation, the TNRS parses and corrects misspelled names and authorities, standardizes variant spellings, and converts nomenclatural synonyms to accepted names. Family names can be included to increase match accuracy and resolve many types of homonyms. Partial matching of higher taxa combined with extraction of annotations, accession numbers and morphospecies allows the TNRS to standardize taxonomy across a broad range of active and legacy datasets. CONCLUSIONS: We show how the TNRS can resolve many forms of taxonomic semantic heterogeneity, correct spelling errors and eliminate spurious names. As a result, the TNRS can aid the integration of disparate biological datasets. Although the TNRS was developed to aid in standardizing plant names, its underlying algorithms and design can be extended to all organisms and nomenclatural codes. The TNRS is accessible via a web interface at http://tnrs.iplantcollaborative.org/ and as a RESTful web service and application programming interface. Source code is available at https://github.com/iPlantCollaborativeOpenSource/TNRS/. | en |
| dc.language.iso | en | en |
| dc.publisher | BioMed Central | en |
| dc.relation.url | http://www.biomedcentral.com/1471-2105/14/16 | en |
| dc.rights | © 2013 Boyle et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0) | en |
| dc.subject | Biodiversity informatics | en |
| dc.subject | Database integration | en |
| dc.subject | Taxonomy | en |
| dc.subject | Plants | en |
| dc.title | The taxonomic name resolution service: an online tool for automated standardization of plant names | en |
| dc.type | Article | en |
| dc.identifier.eissn | 1471-2105 | en |
| dc.contributor.department | Department of Ecology and Evolutionary Biology, University of Arizona Tucson, P.O. Box 210088, Tucson, AZ, 85721, USA | en |
| dc.contributor.department | The iPlant Collaborative, Thomas W. Keating Bioresearch Building, 1657 East Helen Street, Tucson, AZ, 85721, USA | en |
| dc.contributor.department | BIO5 Institute, 1657 East Helen Street, PO Box 210240, Tucson, AZ, 85721-0240, USA | en |
| dc.contributor.department | Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY, 11724-2202, USA | en |
| dc.contributor.department | Center for Library and Informatics, Marine Biological Laboratory, 7 MBL Street, Woods Hole, MA, 02543, USA | en |
| dc.contributor.department | Divisional Data Centre, CSIRO Marine and Atmospheric Research, GPO Box 1538, Hobart, Tasmania, 7001, Australia | en |
| dc.contributor.department | Yale-NUS College, 6 College Avenue East, Singapore, 138614, Singapore | en |
| dc.contributor.department | Missouri Botanical Garden, 4344 Shaw Blvd., St. Louis, MO, 63110, USA | en |
| dc.contributor.department | Department of Biology, CB 3280, University of North Carolina, Chapel Hill, NC, 27599-3280, USA | en |
| dc.identifier.journal | BMC Bioinformatics | en |
| dc.description.collectioninformation | This item is part of the UA Faculty Publications collection. For more information about this item or other items in the UA Campus Repository, contact the University of Arizona Libraries at repository@u.library.arizona.edu. | en |
| dc.eprint.version | Final published version | en |