1st International ICST Conference on Scalable Information Systems

Research Article

Single-pass clustering for peer-to-peer information retrieval: the effect of document ordering

  • @INPROCEEDINGS{10.1145/1146847.1146883,
        author={Iraklis A. Klampanos and Joemon M. Jose and C. J. “Keith” van Rijsbergen},
        title={Single-pass clustering for peer-to-peer information retrieval: the effect of document ordering},
        proceedings={1st International ICST Conference on Scalable Information Systems},
        publisher={ACM},
        proceedings_a={INFOSCALE},
        year={2006},
        month={6},
        keywords={},
        doi={10.1145/1146847.1146883}
    }
    
  • Iraklis A. Klampanos
    Joemon M. Jose
    C. J. “Keith” van Rijsbergen
    Year: 2006
    Single-pass clustering for peer-to-peer information retrieval: the effect of document ordering
    INFOSCALE
    ACM
    DOI: 10.1145/1146847.1146883
Iraklis A. Klampanos1,*, Joemon M. Jose1,*, C. J. “Keith” van Rijsbergen1,*
  • 1: Department of Computing Science, University of Glasgow, Scotland.
*Contact email: iraklis@dcs.gla.ac.uk, jj@dcs.gla.ac.uk, keith@dcs.gla.ac.uk

Abstract

Document clustering has been a particularly active research field within the Information Retrieval (IR) community. Among the numerous clustering algorithms proposed, single-pass clustering stands out in terms of both time and space efficiency. However, it is generally acknowledged that single-pass clustering has a major defect, namely that its output depends on the order in which documents are presented. Building on our previous work, and having identified single-pass clustering as potentially useful for Peer-to-Peer (P2P) IR, we study the extent to which this order dependence holds in practical terms. We do so by experimenting with two large web-based testbeds that are suitable for P2P IR evaluation. Our results show that document ordering does not matter in practice for single-pass clustering.
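
To make the order dependence mentioned in the abstract concrete, the following is a minimal sketch of a generic, textbook-style single-pass clustering loop. It is not the authors' implementation: the abstract does not specify a variant, so the whitespace tokenisation, the cosine similarity, the summed term-frequency centroids, the `threshold` value, and the names `cosine` and `single_pass` are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(docs, threshold=0.4):
    """Assign each document, in arrival order, to the most similar existing
    cluster, or open a new cluster if no similarity reaches the threshold.
    Returns a list of clusters, each a list of document indices."""
    centroids = []  # assumed representation: summed term frequencies per cluster
    clusters = []
    for i, text in enumerate(docs):
        vec = Counter(text.lower().split())
        best, best_sim = None, 0.0
        for c, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim >= threshold and sim > best_sim:
                best, best_sim = c, sim
        if best is None:
            centroids.append(vec)
            clusters.append([i])
        else:
            centroids[best].update(vec)  # fold the document into the centroid
            clusters[best].append(i)
    return clusters


# Toy data (hypothetical): the same three documents in two arrival orders.
order_a = ["x y", "y z", "z w"]
order_b = ["y z", "z w", "x y"]
for docs in (order_a, order_b):
    groups = single_pass(docs, threshold=0.4)
    print([[docs[i] for i in cluster] for cluster in groups])
```

Each document is read once and only the cluster centroids are kept in memory, which is where the time and space efficiency comes from. On this toy input the two arrival orders give different partitions ("y z" joins "x y" in the first order and "z w" in the second). This illustrates the acknowledged defect in principle only and says nothing about how large the effect is on realistic collections, which is the question the paper studies.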