Persistent Link:
http://hdl.handle.net/10150/106337
Title:
Tagging for Health Information Organisation and Retrieval
Author:
Kipp, Margaret E. I.
Citation:
Tagging for Health Information Organisation and Retrieval, 2007
Issue Date:
2007
URI:
http://hdl.handle.net/10150/106337
Submitted date:
2007-09-18
Abstract:
INTRODUCTION
Medical professionals seek to capture papers which can be located via keyword or free text search in digital libraries or on the web, but are also interested in finding material that has not yet been indexed in on-line databases. Search engines provide a multitude of results [1]. Social bookmarking, where users tag items for their own use, offers a way to locate new and relevant information. CiteULike (citeulike.org), a social bookmarking service, allows articles to be tagged with useful keywords for later retrieval.

RELATED STUDIES
A previous study [2] compared social bookmarking to existing information organisation structures and found similarities in terminology use and intriguing differences. A sample of articles tagged on CiteULike was examined for contextual differences in keyword usage between users of social bookmarking sites, authors and indexers. Many tags were related to thesaurus terms (descriptors), but were not formally in the thesaurus [2]. This study examines how term usage patterns in tags, keywords and descriptors suggest a similar (or differing) context between users, authors and intermediaries.

METHODOLOGY
This study examines the use of tags on CiteULike for articles from three medical or biology journals (JAMA, Proteins, and Journal of Molecular Biology) indexed in PubMed. 1299 unique articles were retrieved from CiteULike; Medical Subject Headings (MeSH) were collected from PubMed. Articles were analysed using standard informetric techniques to examine the use of user-assigned tags and their PubMed-assigned MeSH index terms. Data was analysed for term usage and categorised to see what contextual clues users expose in their tag use.

RESULTS
Articles were tagged by up to 14 users (average 2-4). 1449 unique tags were used in the data set. Some articles were heavily tagged by users (max. 29, min. 1, median 2). Descriptors were more heavily assigned to articles (2746 unique descriptors). Articles had, on average, 10 descriptors assigned (max. 40, min. 2). Some tags occurred frequently: protein_structure (140), no-tag (134), and protein (114). By journal, frequent tags included: docking (Proteins, 85), no-tag (JAMA, 20), and protein_structure (J Mol Biol, 52). The no-tag label is system-assigned and indicates that no tag was assigned. Descriptors were more heavily reused than tags, for example: 'Models, Molecular' (550), 'Protein Conformation' (363), and 'Humans' (341). By journal, frequent descriptors included: 'Models, Molecular' (Proteins, 252), 'Models, Molecular' (J Mol Biol, 235), and 'Humans' (JAMA, 137).

DISCUSSIONS AND CONCLUSIONS
Comparison of tag and descriptor lists shows many of the same similarities and differences as the previous study [2]. Many user terms were related to the author and intermediary terms but were not in the thesaurus (e.g. 'diet' and 'fat' were used separately in the tag lists, where they were linked as 'dietary fats' in the thesaurus). Terms such as 'human' and 'family-studies' show that users tagging biology articles are interested in the methodology and user groups associated with articles. This study has system design implications for accessing, indexing and searching document spaces. Users express frustration when trying to narrow search results. Controlled vocabularies help narrow a search to a manageable size but can be expensive. User tagging could provide additional access points to traditional controlled vocabularies and the associative classifications necessary to tie documents and articles to time and task relationships, among other novel items.

REFERENCES
[1] Tang H, Ng JHK. 2006. Googling for a diagnosis -- use of Google as a diagnostic aid: internet based study. BMJ 333 (2 Dec), 1143-1145.
[2] Kipp MEI. 2006. Complementary or discrete contexts in online indexing: A comparison of user, creator, and intermediary keywords. Canadian Journal of Information and Library Science (in press). http://dlist.sir.arizona.edu/1533/
Type:
Conference Poster
Language:
en
Keywords:
Classification; Knowledge Organization
Local subject classification:
Tagging; Health information; Social bookmarking; Informetrics

Full metadata record

DC Field                 Value                                                            Language
dc.contributor.author    Kipp, Margaret E. I.                                             en_US
dc.date.accessioned      2007-09-18T00:00:01Z                                             -
dc.date.available        2010-06-18T23:44:42Z                                             -
dc.date.issued           2007                                                             en_US
dc.date.submitted        2007-09-18                                                       en_US
dc.identifier.citation   Tagging for Health Information Organisation and Retrieval, 2007  en_US
dc.identifier.uri        http://hdl.handle.net/10150/106337                               -
dc.description.abstract  (same text as the Abstract field above)                          en_US
dc.format.mimetype       application/pdf                                                  en_US
dc.language.iso          en                                                               en_US
dc.subject               Classification                                                   en_US
dc.subject               Knowledge Organization                                           en_US
dc.subject.other         Tagging                                                          en_US
dc.subject.other         Health information                                               en_US
dc.subject.other         Social bookmarking                                               en_US
dc.subject.other         Informetrics                                                     en_US
dc.title                 Tagging for Health Information Organisation and Retrieval        en_US
dc.type                  Conference Poster                                                en_US
All Items in UA Campus Repository are protected by copyright, with all rights reserved, unless otherwise indicated.