Persistent Link:
http://hdl.handle.net/10150/185914
Title:
Approximate pattern matching and its applications.
Author:
Wu, Sun.
Issue Date:
1992
Publisher:
The University of Arizona.
Rights:
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract:
In this thesis, we study approximate pattern matching problems. Our study is based on the Levenshtein distance model, where errors considered are 'insertions', 'deletions', and 'substitutions'. In general, given a text string, a pattern, and an integer k, we want to find substrings in the text that match the pattern with no more than k errors. The pattern can be a fixed string, a limited expression, or a regular expression. The problem has different variations with different levels of difficulties depending on the types of the pattern as well as the constraint imposed on the matching. We present new results both of theoretical interest and practical value. We present a new algorithm for approximate regular expression matching, which is the first to achieve a subquadratic asymptotic time for this problem. For the practical side, we present new algorithms for approximate pattern matching that are very efficient and flexible. Based on these algorithms, we developed a new software tool called 'agrep', which is the first general purpose approximate pattern matching tool in the UNIX system. 'agrep' is not only usually faster than the UNIX 'grep/egrep/fgrep' family, it also provides many new features such as searching with errors allowed, record-oriented search, AND/OR combined patterns, and mixed exact/approximate matching. 'agrep' has been made publicly available through anonymous ftp from cs.arizona.edu since June 1991.
Type:
text; Dissertation-Reproduction (electronic)
Keywords:
Computer science.
Degree Name:
Ph.D.
Degree Level:
doctoral
Degree Program:
Computer Science; Graduate College
Degree Grantor:
University of Arizona
Advisor:
Manber, Udi

Full metadata record

DC FieldValue Language
dc.language.isoenen_US
dc.titleApproximate pattern matching and its applications.en_US
dc.creatorWu, Sun.en_US
dc.contributor.authorWu, Sun.en_US
dc.date.issued1992en_US
dc.publisherThe University of Arizona.en_US
dc.rightsCopyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.en_US
dc.description.abstractIn this thesis, we study approximate pattern matching problems. Our study is based on the Levenshtein distance model, where errors considered are 'insertions', 'deletions', and 'substitutions'. In general, given a text string, a pattern, and an integer k, we want to find substrings in the text that match the pattern with no more than k errors. The pattern can be a fixed string, a limited expression, or a regular expression. The problem has different variations with different levels of difficulties depending on the types of the pattern as well as the constraint imposed on the matching. We present new results both of theoretical interest and practical value. We present a new algorithm for approximate regular expression matching, which is the first to achieve a subquadratic asymptotic time for this problem. For the practical side, we present new algorithms for approximate pattern matching that are very efficient and flexible. Based on these algorithms, we developed a new software tool called 'agrep', which is the first general purpose approximate pattern matching tool in the UNIX system. 'agrep' is not only usually faster than the UNIX 'grep/egrep/fgrep' family, it also provides many new features such as searching with errors allowed, record-oriented search, AND/OR combined patterns, and mixed exact/approximate matching. 'agrep' has been made publicly available through anonymous ftp from cs.arizona.edu since June 1991.en_US
dc.typetexten_US
dc.typeDissertation-Reproduction (electronic)en_US
dc.subjectComputer science.en_US
thesis.degree.namePh.D.en_US
thesis.degree.leveldoctoralen_US
thesis.degree.disciplineComputer Scienceen_US
thesis.degree.disciplineGraduate Collegeen_US
thesis.degree.grantorUniversity of Arizonaen_US
dc.contributor.advisorManber, Udien_US
dc.contributor.committeememberMyers, Eugene W.en_US
dc.contributor.committeememberKannan, Sampathen_US
dc.contributor.committeememberMyers, Donalden_US
dc.identifier.proquest9234909en_US
dc.identifier.oclc701908329en_US
All Items in UA Campus Repository are protected by copyright, with all rights reserved, unless otherwise indicated.