Modeling semantic coherence from corpus data: the fact and the frequency of a co-occurrence

Persistent Link:
http://hdl.handle.net/10150/126619
Title:
Modeling semantic coherence from corpus data: the fact and the frequency of a co-occurrence
Author:
Pekar, Viktor
Affiliation:
Bashkir State University
Publisher:
University of Arizona Linguistics Circle
Journal:
Coyote Papers: Working Papers in Linguistics, Language in Cognitive Science
Issue Date:
2001
URI:
http://hdl.handle.net/10150/126619
Abstract:
The paper presents a preliminary evaluation of a corpus-based representation of individual words and a method to generalize over these representations. The vector space is represented in a way that gives weight to the fact that words co-occur rather than to the frequency of their co-occurrence. This format is hypothesized to allow for reducing the vector space, minimizing the negative effects of data sparseness and enhancing the ability of the model to generalize words to novel contexts. The model is assessed by comparing computer-calculated probabilities of different verb-argument combinations with human subjects' judgements about the appropriateness of these combinations. The results indicate that there is a correlation between the probabilities calculated by the model and the subjects' evaluations.
Type:
text; Article
Language:
en_US
ISSN:
0894-4539
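
Illustration (not part of the repository record): the abstract contrasts the fact of a co-occurrence with its frequency. The minimal Python sketch below shows one way that contrast could look for verb-argument pairs. The toy pairs, the function names, and the use of cosine similarity are all assumptions made for illustration; the record does not describe the paper's actual representation, probability model, or evaluation procedure.

```python
import math
from collections import Counter

# Toy verb-argument pairs, invented for illustration (not data from the paper).
pairs = [
    ("drink", "water"), ("drink", "water"), ("drink", "water"),
    ("drink", "tea"),
    ("pour", "water"),
    ("pour", "tea"), ("pour", "tea"),
]


def cooccurrence_vectors(pairs, binary):
    """Map each verb to a vector over arguments.

    binary=False keeps raw co-occurrence counts (the frequency);
    binary=True keeps only 0/1 indicators (the fact of co-occurrence).
    """
    args = sorted({arg for _, arg in pairs})
    verbs = sorted({verb for verb, _ in pairs})
    counts = Counter(pairs)
    vectors = {}
    for verb in verbs:
        row = [counts[(verb, arg)] for arg in args]
        if binary:
            row = [1 if c > 0 else 0 for c in row]
        vectors[verb] = row
    return args, vectors


def cosine(u, v):
    """Cosine similarity; an assumed similarity measure, not the paper's."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


args, freq = cooccurrence_vectors(pairs, binary=False)
_, fact = cooccurrence_vectors(pairs, binary=True)

print(args)
print("frequency:", freq)
print("fact:     ", fact)
print("cosine (frequency):", cosine(freq["drink"], freq["pour"]))
print("cosine (fact):     ", cosine(fact["drink"], fact["pour"]))
```

In this toy setting the two verbs share the same set of arguments, so their binary vectors coincide even though their frequency vectors differ. This is only meant to make the distinction concrete, not to reproduce the paper's results.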

Full metadata record

DC Field | Value | Language
dc.contributor.author | Pekar, Viktor | en_US
dc.date.accessioned | 2011-03-31T18:02:32Z | -
dc.date.available | 2011-03-31T18:02:32Z | -
dc.date.issued | 2001 | -
dc.identifier.issn | 0894-4539 | -
dc.identifier.uri | http://hdl.handle.net/10150/126619 | -
dc.description.abstract | The paper presents a preliminary evaluation of a corpus-based representation of individual words and a method to generalize over these representations. The vector space is represented in a way that gives weight to the fact that words co-occur rather than to the frequency of their co-occurrence. This format is hypothesized to allow for reducing the vector space, minimizing the negative effects of data sparseness and enhancing the ability of the model to generalize words to novel contexts. The model is assessed by comparing computer-calculated probabilities of different verb-argument combinations with human subjects' judgements about the appropriateness of these combinations. The results indicate that there is a correlation between the probabilities calculated by the model and the subjects' evaluations. | en_US
dc.language.iso | en_US | en_US
dc.publisher | University of Arizona Linguistics Circle | en_US
dc.title | Modeling semantic coherence from corpus data: the fact and the frequency of a co-occurrence | en_US
dc.type | text | en_US
dc.type | Article | en_US
dc.contributor.department | Bashkir State University | en_US
dc.identifier.journal | Coyote Papers: Working Papers in Linguistics, Language in Cognitive Science | en_US