An Adaptive Data Structure for Nearest Neighbors Search in a General Metric Space

Persistent Link:
http://hdl.handle.net/10150/146680
Title:
An Adaptive Data Structure for Nearest Neighbors Search in a General Metric Space
Author:
Thomas, Joseph Scott
Issue Date:
May-2010
Publisher:
The University of Arizona.
Rights:
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract:
We consider the problem of computing nearest neighbors in an arbitrary metric space, particularly a metric space that cannot be easily embedded in Rn. We present a data structure, the Partition Tree, that can be constructed in O(n log n) time, where n is the size of the set of points to be searched, and has been experimentally shown to have an average query time that is a sublinear function of n. Our experiments show that this data structure could have applications in bioinformatics, particularly protein secondary structure prediction, where it can be used for similarity search among short sequences of proteins' primary structure.
Type:
text; Electronic Thesis
Degree Name:
B.S.
Degree Level:
bachelors
Degree Program:
Honors College; Computer Science
Degree Grantor:
University of Arizona

Full metadata record

DC FieldValue Language
dc.language.isoenen_US
dc.titleAn Adaptive Data Structure for Nearest Neighbors Search in a General Metric Spaceen_US
dc.creatorThomas, Joseph Scotten_US
dc.contributor.authorThomas, Joseph Scotten_US
dc.date.issued2010-05-
dc.publisherThe University of Arizona.en_US
dc.rightsCopyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.en_US
dc.description.abstractWe consider the problem of computing nearest neighbors in an arbitrary metric space, particularly a metric space that cannot be easily embedded in Rn. We present a data structure, the Partition Tree, that can be constructed in O(n log n) time, where n is the size of the set of points to be searched, and has been experimentally shown to have an average query time that is a sublinear function of n. Our experiments show that this data structure could have applications in bioinformatics, particularly protein secondary structure prediction, where it can be used for similarity search among short sequences of proteins' primary structure.en_US
dc.typetexten_US
dc.typeElectronic Thesisen_US
thesis.degree.nameB.S.en_US
thesis.degree.levelbachelorsen_US
thesis.degree.disciplineHonors Collegeen_US
thesis.degree.disciplineComputer Scienceen_US
thesis.degree.grantorUniversity of Arizonaen_US
All Items in UA Campus Repository are protected by copyright, with all rights reserved, unless otherwise indicated.