Persistent Link:
http://hdl.handle.net/10150/289879
Title:
An ontology for linguistics on the Semantic Web
Author:
Farrar, Scott O.
Issue Date:
2003
Publisher:
The University of Arizona.
Rights:
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract:
The current research presents an ontology for linguistics useful for an implementation on the Semantic Web. By adhering to this model, it is shown that data of the kind routinely collected by field linguists may be represented so as to facilitate automatic analysis and semantic search. The literature concerning typological databases, knowledge engineering, and the Semantic Web is reviewed. It is argued that the time is right for the integration of these three areas of research. Linguistic knowledge is discussed in the overall context of common-sense knowledge representation. A three-layer approach to meaning is assumed, one that includes conceptual, semantic, and linguistic levels of knowledge. In particular, the level of semantics is shown to be crucial for a notional account of grammatical categories such as tense, aspect, and case. The level of semantics is viewed as an encoding of common-sense reality. To develop the ontology, an upper model based on the Suggested Upper Merged Ontology (SUMO) is adopted, though elements from other ontologies are utilized as well. A brief comparison of available upper models is presented. It is argued that any ontology for linguistics should provide an account of at least (1) linguistic expressions, (2) mental linguistic units, (3) linguistic categories, and (4) discrete semantic units. The concepts and relations concerning these four domains are motivated as part of the ontology. Finally, an implementation for the Semantic Web is given by discussing the various data constructs necessary for markup (interlinear text, lexicons, paradigms, grammatical descriptions). It is argued that a characterization of the data constructs should not be included in the general ontology, but should be left up to the individual data provider to implement in XML Schema. A search scenario for linguistic data is discussed. It is shown that an ontology for linguistics provides the machinery for pure semantic search, that is, an advanced search framework whereby the user may use linguistic concepts, not just simple strings, as the search query.
Type:
text; Dissertation-Reproduction (electronic)
Keywords:
Language, Linguistics.; Information Science.; Computer Science.
Degree Name:
Ph.D.
Degree Level:
doctoral
Degree Program:
Graduate College; Linguistics
Degree Grantor:
University of Arizona
Advisor:
Langendoen, D. Terence

Full metadata record

DC Field | Value | Language
dc.language.iso | en_US | en_US
dc.title | An ontology for linguistics on the Semantic Web | en_US
dc.creator | Farrar, Scott O. | en_US
dc.contributor.author | Farrar, Scott O. | en_US
dc.date.issued | 2003 | en_US
dc.publisher | The University of Arizona. | en_US
dc.rights | Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author. | en_US
dc.description.abstract | The current research presents an ontology for linguistics useful for an implementation on the Semantic Web. By adhering to this model, it is shown that data of the kind routinely collected by field linguists may be represented so as to facilitate automatic analysis and semantic search. The literature concerning typological databases, knowledge engineering, and the Semantic Web is reviewed. It is argued that the time is right for the integration of these three areas of research. Linguistic knowledge is discussed in the overall context of common-sense knowledge representation. A three-layer approach to meaning is assumed, one that includes conceptual, semantic, and linguistic levels of knowledge. In particular, the level of semantics is shown to be crucial for a notional account of grammatical categories such as tense, aspect, and case. The level of semantics is viewed as an encoding of common-sense reality. To develop the ontology, an upper model based on the Suggested Upper Merged Ontology (SUMO) is adopted, though elements from other ontologies are utilized as well. A brief comparison of available upper models is presented. It is argued that any ontology for linguistics should provide an account of at least (1) linguistic expressions, (2) mental linguistic units, (3) linguistic categories, and (4) discrete semantic units. The concepts and relations concerning these four domains are motivated as part of the ontology. Finally, an implementation for the Semantic Web is given by discussing the various data constructs necessary for markup (interlinear text, lexicons, paradigms, grammatical descriptions). It is argued that a characterization of the data constructs should not be included in the general ontology, but should be left up to the individual data provider to implement in XML Schema. A search scenario for linguistic data is discussed. It is shown that an ontology for linguistics provides the machinery for pure semantic search, that is, an advanced search framework whereby the user may use linguistic concepts, not just simple strings, as the search query. | en_US
dc.type | text | en_US
dc.type | Dissertation-Reproduction (electronic) | en_US
dc.subject | Language, Linguistics. | en_US
dc.subject | Information Science. | en_US
dc.subject | Computer Science. | en_US
thesis.degree.name | Ph.D. | en_US
thesis.degree.level | doctoral | en_US
thesis.degree.discipline | Graduate College | en_US
thesis.degree.discipline | Linguistics | en_US
thesis.degree.grantor | University of Arizona | en_US
dc.contributor.advisor | Langendoen, D. Terence | en_US
dc.identifier.proquest | 3089940 | en_US
dc.identifier.bibrecord | .b44419892 | en_US