Co-occurrence Matrices and their Applications in Information Science: Extending ACA to the Web Environment

Persistent Link:
http://hdl.handle.net/10150/106219
Title:
Co-occurrence Matrices and their Applications in Information Science: Extending ACA to the Web Environment
Author:
Leydesdorff, Loet; Vaughan, Liwen
Citation:
Co-occurrence Matrices and their Applications in Information Science: Extending ACA to the Web Environment 2006,
Issue Date:
2006
Description:
Journal of the American Society for Information Science and Technology [JASIST] (forthcoming)
URI:
http://hdl.handle.net/10150/106219
Submitted date:
2006-09-22
Abstract:
To be published in Journal of the American Society for Information Science & Technology 57(12) (2006) 1616-1628. Abstract: Co-occurrence matrices, such as co-citation, co-word, and co-link matrices, have been used widely in the information sciences. However, confusion and controversy have hindered the proper statistical analysis of this data. The underlying problem, in our opinion, involved understanding the nature of various types of matrices. This paper discusses the difference between a symmetrical co-citation matrix and an asymmetrical citation matrix as well as the appropriate statistical techniques that can be applied to each of these matrices, respectively. Similarity measures (like the Pearson correlation coefficient or the cosine) should not be applied to the symmetrical co-citation matrix, but can be applied to the asymmetrical citation matrix to derive the proximity matrix. The argument is illustrated with examples. The study then extends the application of co-occurrence matrices to the Web environment where the nature of the available data and thus data collection methods are different from those of traditional databases such as the Science Citation Index. A set of data collected with the Google Scholar search engine is analyzed using both the traditional methods of multivariate analysis and the new visualization software Pajek that is based on social network analysis and graph theory.
Type:
Preprint
Language:
en
Keywords:
Bibliometrics; Information Science; Informetrics; Citation Analysis; Science Technology Studies
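
Illustration (not part of the repository record): the distinction drawn in the abstract can be sketched with a small example. This is a minimal sketch, not the authors' procedure or data. The toy matrix, the variable names, and the helper functions below are all hypothetical. Rows stand for citing documents and columns for cited authors, so the asymmetrical occurrence matrix is rectangular, and the symmetrical co-occurrence matrix is derived from it by multiplying its transpose with itself. Following the abstract's argument, the cosine and the Pearson correlation are computed from the columns of the occurrence matrix rather than from the co-occurrence matrix.

```python
import numpy as np

# Hypothetical toy data: 5 citing documents (rows) x 4 cited authors (columns).
# A 1 means the document cites the author. This is the asymmetrical
# (rectangular) occurrence matrix.
occurrence = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)


def cooccurrence(occ):
    """Symmetrical co-occurrence matrix: entry (i, j) counts the documents
    that cite both author i and author j. The diagonal holds each author's
    total number of citing documents."""
    return occ.T @ occ


def cosine_between_columns(occ):
    """Cosine similarity between author profiles (columns of the occurrence
    matrix). Assumes every column has at least one non-zero entry."""
    norms = np.linalg.norm(occ, axis=0)
    return (occ.T @ occ) / np.outer(norms, norms)


def pearson_between_columns(occ):
    """Pearson correlation between author profiles (columns of the
    occurrence matrix). Assumes no column is constant."""
    return np.corrcoef(occ, rowvar=False)


co = cooccurrence(occurrence)               # symmetrical, 4 x 4
cos = cosine_between_columns(occurrence)    # proximity matrix, 4 x 4
r = pearson_between_columns(occurrence)     # proximity matrix, 4 x 4

print(co)
print(np.round(cos, 3))
print(np.round(r, 3))
```

In this sketch the co-occurrence matrix is already a proximity matrix of sorts, since its off-diagonal cells count shared citing documents. Applying a similarity measure to it a second time would therefore measure something different from applying the measure once to the occurrence matrix, which is the point the abstract raises.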

Full metadata record

DC Field | Value | Language
dc.contributor.author | Leydesdorff, Loet | en_US
dc.contributor.author | Vaughan, Liwen | en_US
dc.date.accessioned | 2006-09-22T00:00:01Z | -
dc.date.available | 2010-06-18T23:42:44Z | -
dc.date.issued | 2006 | en_US
dc.date.submitted | 2006-09-22 | en_US
dc.identifier.citation | Co-occurrence Matrices and their Applications in Information Science: Extending ACA to the Web Environment 2006, | en_US
dc.identifier.uri | http://hdl.handle.net/10150/106219 | -
dc.description | Journal of the American Society for Information Science and Technology [JASIST] (forthcoming) | en_US
dc.description.abstract | To be published in Journal of the American Society for Information Science & Technology 57(12) (2006) 1616-1628. Abstract: Co-occurrence matrices, such as co-citation, co-word, and co-link matrices, have been used widely in the information sciences. However, confusion and controversy have hindered the proper statistical analysis of this data. The underlying problem, in our opinion, involved understanding the nature of various types of matrices. This paper discusses the difference between a symmetrical co-citation matrix and an asymmetrical citation matrix as well as the appropriate statistical techniques that can be applied to each of these matrices, respectively. Similarity measures (like the Pearson correlation coefficient or the cosine) should not be applied to the symmetrical co-citation matrix, but can be applied to the asymmetrical citation matrix to derive the proximity matrix. The argument is illustrated with examples. The study then extends the application of co-occurrence matrices to the Web environment where the nature of the available data and thus data collection methods are different from those of traditional databases such as the Science Citation Index. A set of data collected with the Google Scholar search engine is analyzed using both the traditional methods of multivariate analysis and the new visualization software Pajek that is based on social network analysis and graph theory. | en_US
dc.format.mimetype | htm | en_US
dc.language.iso | en | en_US
dc.subject | Bibliometrics | en_US
dc.subject | Information Science | en_US
dc.subject | Informetrics | en_US
dc.subject | Citation Analysis | en_US
dc.subject | Science Technology Studies | en_US
dc.title | Co-occurrence Matrices and their Applications in Information Science: Extending ACA to the Web Environment | en_US
dc.type | Preprint | en_US