Language- and domain-independent knowledge maps: A statistical phrase indexing approach

Persistent Link:
http://hdl.handle.net/10150/290042
Title:
Language- and domain-independent knowledge maps: A statistical phrase indexing approach
Author:
Ong, Thian-Huat
Issue Date:
2004
Publisher:
The University of Arizona.
Rights:
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract:
The global economy increases the need for multilingual systems, while each domain has a large repository of knowledge, particularly explicit knowledge usually captured in text. The rate at which textual information is produced has exceeded the rate at which a person can process it, so an automated approach to alleviate the information overload problem is needed. Unlike structured data in databases, unstructured text cannot be readily understood and processed by computers. This dissertation aims to create a language- and domain-independent approach to automatically generating hierarchical knowledge maps that enable users to browse and understand the concepts hidden in the underlying knowledge sources. A system development research methodology was adopted to build and evaluate prototype systems to study the research questions. In order to process textual knowledge, a statistical phrase indexing algorithm was proposed and applied to the Chinese language. Next, the algorithm was extended to process multiple languages and domains. Lastly, the results of the algorithm were applied in a case study using the dissertation's proposed automated framework for generating hierarchical knowledge maps for a Chinese news collection. This dissertation has two main contributions. First, it demonstrated that an automated approach is effective in creating knowledge maps for users to browse the underlying knowledge. The approach combines a statistical phrase extraction algorithm for representing textual knowledge with neural networks for clustering related concepts and for visualization. Second, it provided a set of language- and domain-independent tools to extract phrases from textual knowledge in order to support text mining applications.
Type:
text; Dissertation-Reproduction (electronic)
Keywords:
Business Administration, Management.; Computer Science.
Degree Name:
Ph.D.
Degree Level:
doctoral
Degree Program:
Graduate College; Business Administration
Degree Grantor:
University of Arizona
Advisor:
Chen, Hsinchun

Full metadata record

DC Field | Value | Language
dc.language.iso | en_US | en_US
dc.title | Language- and domain-independent knowledge maps: A statistical phrase indexing approach | en_US
dc.creator | Ong, Thian-Huat | en_US
dc.contributor.author | Ong, Thian-Huat | en_US
dc.date.issued | 2004 | en_US
dc.publisher | The University of Arizona. | en_US
dc.rights | Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author. | en_US
dc.description.abstract | The global economy increases the need for multilingual systems, while each domain has a large repository of knowledge, particularly explicit knowledge usually captured in text. The rate at which textual information is produced has exceeded the rate at which a person can process it, so an automated approach to alleviate the information overload problem is needed. Unlike structured data in databases, unstructured text cannot be readily understood and processed by computers. This dissertation aims to create a language- and domain-independent approach to automatically generating hierarchical knowledge maps that enable users to browse and understand the concepts hidden in the underlying knowledge sources. A system development research methodology was adopted to build and evaluate prototype systems to study the research questions. In order to process textual knowledge, a statistical phrase indexing algorithm was proposed and applied to the Chinese language. Next, the algorithm was extended to process multiple languages and domains. Lastly, the results of the algorithm were applied in a case study using the dissertation's proposed automated framework for generating hierarchical knowledge maps for a Chinese news collection. This dissertation has two main contributions. First, it demonstrated that an automated approach is effective in creating knowledge maps for users to browse the underlying knowledge. The approach combines a statistical phrase extraction algorithm for representing textual knowledge with neural networks for clustering related concepts and for visualization. Second, it provided a set of language- and domain-independent tools to extract phrases from textual knowledge in order to support text mining applications. | en_US
dc.type | text | en_US
dc.type | Dissertation-Reproduction (electronic) | en_US
dc.subject | Business Administration, Management. | en_US
dc.subject | Computer Science. | en_US
thesis.degree.name | Ph.D. | en_US
thesis.degree.level | doctoral | en_US
thesis.degree.discipline | Graduate College | en_US
thesis.degree.discipline | Business Administration | en_US
thesis.degree.grantor | University of Arizona | en_US
dc.contributor.advisor | Chen, Hsinchun | en_US
dc.identifier.proquest | 3131626 | en_US
dc.identifier.bibrecord | .b46708121 | en_US