Persistent Link:
http://hdl.handle.net/10150/105302
Title:
The freshness of Web search engine databases
Author:
Lewandowski, Dirk; Wahlig, Henry; Meyer-Bautor, Gunnar
Citation:
The freshness of Web search engine databases 2005,
Issue Date:
2005
URI:
http://hdl.handle.net/10150/105302
Submitted date:
2006-05-25
Abstract:
This is a preprint of an article published in the Journal of Information Science Vol. 32, No. 2, 131-148 (2006). This study measures the frequency with which search engines update their indices. To this end, 38 websites that are updated daily were analysed over a period of six weeks. The search engines analysed were Google, Yahoo and MSN. We find that Google performs best overall, with the most pages updated on a daily basis, but only MSN is able to update all pages within less than 20 days. Both other engines have outliers that are considerably older. In terms of indexing patterns, we find different approaches at the different engines: while MSN shows clear update patterns, Google shows some outliers, and the update process of the Yahoo index seems to be quite chaotic. The implications are that the quality of search engine indices varies and that more than one engine should be used when searching for current content.
Type:
Preprint
Language:
en
Keywords:
World Wide Web; Information Science; Information Retrieval; Internet; Information Systems
Local subject classification:
Search engines; Online information retrieval; Index freshness

Full metadata record

DC Field | Value | Language
dc.contributor.author | Lewandowski, Dirk | en_US
dc.contributor.author | Wahlig, Henry | en_US
dc.contributor.author | Meyer-Bautor, Gunnar | en_US
dc.date.accessioned | 2006-05-25T00:00:01Z | -
dc.date.available | 2010-06-18T23:23:21Z | -
dc.date.issued | 2005 | en_US
dc.date.submitted | 2006-05-25 | en_US
dc.identifier.citation | The freshness of Web search engine databases 2005, | en_US
dc.identifier.uri | http://hdl.handle.net/10150/105302 | -
dc.description.abstract | This is a preprint of an article published in the Journal of Information Science Vol. 32, No. 2, 131-148 (2006). This study measures the frequency with which search engines update their indices. To this end, 38 websites that are updated daily were analysed over a period of six weeks. The search engines analysed were Google, Yahoo and MSN. We find that Google performs best overall, with the most pages updated on a daily basis, but only MSN is able to update all pages within less than 20 days. Both other engines have outliers that are considerably older. In terms of indexing patterns, we find different approaches at the different engines: while MSN shows clear update patterns, Google shows some outliers, and the update process of the Yahoo index seems to be quite chaotic. The implications are that the quality of search engine indices varies and that more than one engine should be used when searching for current content. | en_US
dc.format.mimetype | application/pdf | en_US
dc.language.iso | en | en_US
dc.subject | World Wide Web | en_US
dc.subject | Information Science | en_US
dc.subject | Information Retrieval | en_US
dc.subject | Internet | en_US
dc.subject | Information Systems | en_US
dc.subject.other | Search engines | en_US
dc.subject.other | Online information retrieval | en_US
dc.subject.other | Index freshness | en_US
dc.title | The freshness of Web search engine databases | en_US
dc.type | Preprint | en_US
All Items in UA Campus Repository are protected by copyright, with all rights reserved, unless otherwise indicated.