Domain-independent semantic concept extraction using corpus linguistics, statistics and artificial intelligence techniques

Persistent Link:
http://hdl.handle.net/10150/280502
Title:
Domain-independent semantic concept extraction using corpus linguistics, statistics and artificial intelligence techniques
Author:
Tolle, Kristin M.
Issue Date:
2003
Publisher:
The University of Arizona.
Rights:
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract:
For this dissertation two software applications were developed and three experiments were conducted to evaluate the viability of a unique approach to medical information extraction. The first system, the AZ Noun Phraser, was designed as a concept extraction tool. The second application, ANNEE, is a neural net-based entity extraction (EE) system. These two systems were combined to perform concept extraction and semantic classification specifically for use in medical document retrieval systems. The goal of this research was to create a system that automatically (without human interaction) enabled semantic type assignment, such as gene name and disease, to concepts extracted from unstructured medical text documents. Improving conceptual analysis of search phrases has been shown to improve the precision of information retrieval systems. Enabling this capability in the field of medicine can aid medical researchers, doctors and librarians in locating information, potentially improving healthcare decision-making. Due to the flexibility and non-domain specificity of the implementation, these applications have also been successfully deployed in other text retrieval experimentation for law enforcement (Atabakhsh et al., 2001; Hauck, Atabakhsh, Ongvasith, Gupta, & Chen, 2002), medicine (Tolle & Chen, 2000), query expansion (Leroy, Tolle, & Chen, 2000), web document categorization (Chen, Fan, Chau, & Zeng, 2001), Internet spiders (Chau, Zeng, & Chen, 2001), collaborative agents (Chau, Zeng, Chen, Huang, & Hendriawan, 2002), competitive intelligence (Chen, Chau, & Zeng, 2002), and Internet chat-room data visualization (Zhu & Chen, 2001).
Type:
text; Dissertation-Reproduction (electronic)
Keywords:
Information Science; Artificial Intelligence; Computer Science
Degree Name:
Ph.D.
Degree Level:
doctoral
Degree Program:
Graduate College; Business Administration
Degree Grantor:
University of Arizona
Advisor:
Dror, Moshe

Full metadata record

DC Field | Value | Language
dc.language.iso | en_US | en_US
dc.title | Domain-independent semantic concept extraction using corpus linguistics, statistics and artificial intelligence techniques | en_US
dc.creator | Tolle, Kristin M. | en_US
dc.contributor.author | Tolle, Kristin M. | en_US
dc.date.issued | 2003 | en_US
dc.publisher | The University of Arizona. | en_US
dc.rights | Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author. | en_US
dc.description.abstract | For this dissertation two software applications were developed and three experiments were conducted to evaluate the viability of a unique approach to medical information extraction. The first system, the AZ Noun Phraser, was designed as a concept extraction tool. The second application, ANNEE, is a neural net-based entity extraction (EE) system. These two systems were combined to perform concept extraction and semantic classification specifically for use in medical document retrieval systems. The goal of this research was to create a system that automatically (without human interaction) enabled semantic type assignment, such as gene name and disease, to concepts extracted from unstructured medical text documents. Improving conceptual analysis of search phrases has been shown to improve the precision of information retrieval systems. Enabling this capability in the field of medicine can aid medical researchers, doctors and librarians in locating information, potentially improving healthcare decision-making. Due to the flexibility and non-domain specificity of the implementation, these applications have also been successfully deployed in other text retrieval experimentation for law enforcement (Atabakhsh et al., 2001; Hauck, Atabakhsh, Ongvasith, Gupta, & Chen, 2002), medicine (Tolle & Chen, 2000), query expansion (Leroy, Tolle, & Chen, 2000), web document categorization (Chen, Fan, Chau, & Zeng, 2001), Internet spiders (Chau, Zeng, & Chen, 2001), collaborative agents (Chau, Zeng, Chen, Huang, & Hendriawan, 2002), competitive intelligence (Chen, Chau, & Zeng, 2002), and Internet chat-room data visualization (Zhu & Chen, 2001). | en_US
dc.type | text | en_US
dc.type | Dissertation-Reproduction (electronic) | en_US
dc.subject | Information Science. | en_US
dc.subject | Artificial Intelligence. | en_US
dc.subject | Computer Science. | en_US
thesis.degree.name | Ph.D. | en_US
thesis.degree.level | doctoral | en_US
thesis.degree.discipline | Graduate College | en_US
thesis.degree.discipline | Business Administration | en_US
thesis.degree.grantor | University of Arizona | en_US
dc.contributor.advisor | Dror, Moshe | en_US
dc.identifier.proquest | 3119988 | en_US
dc.identifier.bibrecord | .b45646715 | en_US