Indexing and retrieving images in a multilingual world (extended abstract)

Persistent Link:
http://hdl.handle.net/10150/105900
Title:
Indexing and retrieving images in a multilingual world (extended abstract)
Author:
Ménard, Elaine
Editors:
Tennis, Joseph T.
Citation:
Indexing and retrieving images in a multilingual world (extended abstract) 2007, 1:105-106
Publisher:
dLIST
Issue Date:
2007
URI:
http://hdl.handle.net/10150/105900
Submitted date:
2007-06-05
Abstract:
The Internet constitutes a vast universe of knowledge and human culture, allowing the dissemination of ideas and information without borders. The Web has also become an important medium for the diffusion of multilingual resources. However, linguistic differences still form a major obstacle to scientific, cultural, and educational exchange. With the ever-increasing size of the Web and the availability of more and more documents in various languages, this problem becomes all the more pervasive. Besides this linguistic diversity, a multitude of databases and collections now contain documents in various formats, which may also adversely affect the retrieval process. This paper presents the context, the problem statement, and the experiment carried out in a research project that aims to examine the relationship between two indexing approaches, (1) traditional image indexing, which recommends the use of controlled vocabularies, and (2) free image indexing, which uses uncontrolled vocabulary, and their respective performance for image retrieval in a multilingual context. The use of either controlled or uncontrolled vocabularies raises a number of difficulties for the indexing process, and these difficulties necessarily have consequences at the time of image retrieval. Indexing with controlled or uncontrolled vocabularies is a question extensively discussed in the literature. However, it is clear that many searchers recognize the advantages of either form of vocabulary according to circumstances (Arsenault, 2006). It appears that the many difficulties associated with free indexing using uncontrolled vocabularies can only be understood via a comparative analysis with controlled vocabulary indexing (Macgregor & McCulloch, 2006).
This research compares image retrieval within two contexts: a monolingual context, where the language of the query is the same as the indexing language, and a multilingual context, where the language of the query is different from the indexing language. This research will indicate whether one of these indexing approaches surpasses the other in terms of effectiveness, efficiency, and satisfaction of the image searchers. For this research, three data collection methods are used: (1) the analysis of the vocabularies used for image indexing, in order to examine the multiplicity of term types applied to images (generic description, identification, and interpretation) and the degree of indexing difficulty due to the subject and the nature of the image; (2) the simulation of the retrieval process with a subset of images indexed according to each indexing approach studied; and finally, (3) the administration of a questionnaire to gather information on searcher satisfaction during and after the retrieval process. The quantification of the retrieval performance of each indexing approach is based on the usability measures recommended by the standard ISO 9241-11, i.e. effectiveness, efficiency, and satisfaction of the user (AFNOR, 1998). The need to retrieve a particular image from a collection is shared by several user communities all over the world, including teachers, artists, journalists, scientists, historians, filmmakers, and librarians. Image collections also have many areas of application: commercial, scientific, educational, and cultural. Until recently, image collections were difficult to access due to limitations in dissemination and duplication procedures. This research underlines the pressing necessity to optimize the methods used for image processing, in order to facilitate the retrieval of images and their dissemination in multilingual environments.
The results of this study will offer preliminary information to deepen our understanding of the influence of the vocabulary used in image indexing. In turn, these results can be used to enhance access to digital collections of visual material in multilingual environments.
Type:
Conference Paper
Language:
en
Keywords:
Indexing; Information Retrieval; Linguistics
Local subject classification:
French; English; multilingual indexing and retrieval

Full metadata record

DC Field | Value | Language
dc.contributor.author | Ménard, Elaine | en_US
dc.contributor.editor | Tennis, Joseph T. | en_US
dc.date.accessioned | 2007-06-05T00:00:01Z | -
dc.date.available | 2010-06-18T23:36:25Z | -
dc.date.issued | 2007 | en_US
dc.date.submitted | 2007-06-05 | en_US
dc.identifier.citation | Indexing and retrieving images in a multilingual world (extended abstract) 2007, 1:105-106 | en_US
dc.identifier.uri | http://hdl.handle.net/10150/105900 | -
dc.description.abstract | (text identical to the Abstract field above) | en_US
dc.format.mimetype | application/pdf | en_US
dc.language.iso | en | en_US
dc.publisher | dLIST | en_US
dc.subject | Indexing | en_US
dc.subject | Information Retrieval | en_US
dc.subject | Linguistics | en_US
dc.subject.other | French | en_US
dc.subject.other | English | en_US
dc.subject.other | multilingual indexing and retrieval | en_US
dc.title | Indexing and retrieving images in a multilingual world (extended abstract) | en_US
dc.type | Conference Paper | en_US
All Items in UA Campus Repository are protected by copyright, with all rights reserved, unless otherwise indicated.