Research Article
Single-pass clustering for peer-to-peer information retrieval: the effect of document ordering
@INPROCEEDINGS{10.1145/1146847.1146883,
  author={Iraklis A. Klampanos and Joemon M. Jose and C. J. “Keith” van Rijsbergen},
  title={Single-pass clustering for peer-to-peer information retrieval: the effect of document ordering},
  proceedings={1st International ICST Conference on Scalable Information Systems},
  publisher={ACM},
  proceedings_a={INFOSCALE},
  year={2006},
  month={6},
  keywords={},
  doi={10.1145/1146847.1146883}
}
- Iraklis A. Klampanos
- Joemon M. Jose
- C. J. “Keith” van Rijsbergen
Year: 2006
Conference: INFOSCALE
Publisher: ACM
DOI: 10.1145/1146847.1146883
Abstract
Document clustering has been a particularly active research field within the Information Retrieval (IR) community. Among the numerous clustering algorithms proposed, single-pass clustering stands out in terms of both time and space efficiency. However, it is generally acknowledged that single-pass clustering has a major defect, namely that its output depends on the order in which documents are presented. Building on our previous work, and having identified single-pass clustering as potentially useful for Peer-to-Peer (P2P) IR, we study the extent to which this order dependence matters in practice. We do so by experimenting with two large web-based testbeds that are suitable for P2P IR evaluation. Our results show that, in practice, document ordering does not matter for single-pass clustering.
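To make the order dependence mentioned in the abstract concrete, the following is a minimal sketch of a generic single-pass clustering loop. It is not the authors' implementation: the cosine similarity measure, the summed-vector centroids, the threshold of 0.6, the toy term vectors, and the names `cosine` and `single_pass` are all assumptions chosen for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def single_pass(docs, threshold):
    """Cluster (doc_id, vector) pairs in one pass, in the order given.

    Each document joins the most similar existing cluster if that
    similarity reaches the threshold; otherwise it opens a new cluster.
    """
    centroids = []  # summed member vectors (cosine ignores the scale)
    clusters = []   # member ids, parallel to centroids
    for doc_id, vec in docs:
        best, best_sim = None, 0.0
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= threshold:
            for term, weight in vec.items():
                centroids[best][term] = centroids[best].get(term, 0.0) + weight
            clusters[best].append(doc_id)
        else:
            centroids.append(dict(vec))
            clusters.append([doc_id])
    return clusters


# Hypothetical toy documents: "b" overlaps equally with "a" and "c".
a = ("a", {"x": 1.0})
b = ("b", {"x": 1.0, "y": 1.0})
c = ("c", {"y": 1.0})

forward = single_pass([a, b, c], threshold=0.6)
reverse = single_pass([c, b, a], threshold=0.6)
print(forward)  # [['a', 'b'], ['c']]
print(reverse)  # [['c', 'b'], ['a']]
```

In this toy case the same three documents yield different partitions depending on presentation order, because whichever neighbour arrives first absorbs the shared document and shifts the centroid. The abstract's claim is that, on the large web-based testbeds studied, this effect is not significant in practice.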