An Automatic Indexing and Neural Network Approach to Concept Retrieval and Classification of Multilingual (Chinese-English) Documents

Persistent Link:
http://hdl.handle.net/10150/105797
Title:
An Automatic Indexing and Neural Network Approach to Concept Retrieval and Classification of Multilingual (Chinese-English) Documents
Author:
Lin, Chung-hsin; Chen, Hsinchun
Citation:
An Automatic Indexing and Neural Network Approach to Concept Retrieval and Classification of Multilingual (Chinese-English) Documents 1996-02, 26(1):1-14 IEEE Transactions on Systems, Man, and Cybernetics
Publisher:
IEEE
Journal:
IEEE Transactions on Systems, Man, and Cybernetics
Issue Date:
Feb-1996
Description:
Artificial Intelligence Lab, Department of MIS, University of Arizona
URI:
http://hdl.handle.net/10150/105797
Submitted date:
2004-10-01
Abstract:
An automatic indexing and concept classification approach to a multilingual (Chinese and English) bibliographic database is presented. We introduced a multi-linear term-phrasing technique to extract concept descriptors (terms or keywords) from a Chinese-English bibliographic database. A concept space of related descriptors was then generated using a co-occurrence analysis technique. Like a man-made thesaurus, the system-generated concept space can be used to generate additional semantically relevant terms for search. For concept classification and clustering, a variant of a Hopfield neural network was developed to cluster similar concept descriptors and to generate a small number of concept groups to represent (summarize) the subject matter of the database. The concept space approach to information classification and retrieval has been adopted by the authors in other scientific databases and business applications, but multilingual information retrieval presents a unique challenge. This research reports our experiment on multilingual databases. Our system was initially developed in the MS-DOS environment, running the ETEN Chinese operating system. For performance reasons, it was then tested on a UNIX-based system. Due to the unique ideographic nature of the Chinese language, a Chinese term-phrase indexing paradigm considering the ideographic characteristics of Chinese was developed as a multilingual information classification model. By applying the neural-network-based concept classification technique, the model presents a novel way of organizing unstructured multilingual information.
Type:
Journal Article (Paginated)
Language:
en
Keywords:
Indexing; Classification
Local subject classification:
National Science Digital Library; NSDL; Artificial intelligence lab; AI lab; Information retrieval

Full metadata record

DC Field | Value | Language
dc.contributor.author | Lin, Chung-hsin | en_US
dc.contributor.author | Chen, Hsinchun | en_US
dc.date.accessioned | 2004-10-01T00:00:01Z | -
dc.date.available | 2010-06-18T23:34:36Z | -
dc.date.issued | 1996-02 | en_US
dc.date.submitted | 2004-10-01 | en_US
dc.identifier.citation | An Automatic Indexing and Neural Network Approach to Concept Retrieval and Classification of Multilingual (Chinese-English) Documents 1996-02, 26(1):1-14 IEEE Transactions on Systems, Man, and Cybernetics | en_US
dc.identifier.uri | http://hdl.handle.net/10150/105797 | -
dc.description | Artificial Intelligence Lab, Department of MIS, University of Arizona | en_US
dc.description.abstract | An automatic indexing and concept classification approach to a multilingual (Chinese and English) bibliographic database is presented. We introduced a multi-linear term-phrasing technique to extract concept descriptors (terms or keywords) from a Chinese-English bibliographic database. A concept space of related descriptors was then generated using a co-occurrence analysis technique. Like a man-made thesaurus, the system-generated concept space can be used to generate additional semantically relevant terms for search. For concept classification and clustering, a variant of a Hopfield neural network was developed to cluster similar concept descriptors and to generate a small number of concept groups to represent (summarize) the subject matter of the database. The concept space approach to information classification and retrieval has been adopted by the authors in other scientific databases and business applications, but multilingual information retrieval presents a unique challenge. This research reports our experiment on multilingual databases. Our system was initially developed in the MS-DOS environment, running the ETEN Chinese operating system. For performance reasons, it was then tested on a UNIX-based system. Due to the unique ideographic nature of the Chinese language, a Chinese term-phrase indexing paradigm considering the ideographic characteristics of Chinese was developed as a multilingual information classification model. By applying the neural-network-based concept classification technique, the model presents a novel way of organizing unstructured multilingual information. | en_US
dc.format.mimetype | application/pdf | en_US
dc.language.iso | en | en_US
dc.publisher | IEEE | en_US
dc.subject | Indexing | en_US
dc.subject | Classification | en_US
dc.subject.other | National Science Digital Library | en_US
dc.subject.other | NSDL | en_US
dc.subject.other | Artificial intelligence lab | en_US
dc.subject.other | AI lab | en_US
dc.subject.other | Information retrieval | en_US
dc.title | An Automatic Indexing and Neural Network Approach to Concept Retrieval and Classification of Multilingual (Chinese-English) Documents | en_US
dc.type | Journal Article (Paginated) | en_US
dc.identifier.journal | IEEE Transactions on Systems, Man, and Cybernetics | en_US
All Items in UA Campus Repository are protected by copyright, with all rights reserved, unless otherwise indicated.