1st International ICST Conference on Scalable Information Systems

Research Article

M-Chord: a scalable distributed similarity search structure

  • @INPROCEEDINGS{10.1145/1146847.1146866,
        author={David Novak and Pavel Zezula},
        title={M-Chord: a scalable distributed similarity search structure},
        proceedings={1st International ICST Conference on Scalable Information Systems},
        publisher={ACM},
        proceedings_a={INFOSCALE},
        year={2006},
        month={6},
        keywords={},
        doi={10.1145/1146847.1146866}
    }
    
  • David Novak
    Pavel Zezula
    Year: 2006
    M-Chord: a scalable distributed similarity search structure
    INFOSCALE
    ACM
    DOI: 10.1145/1146847.1146866
David Novak1,*, Pavel Zezula1,*
  • 1: Masaryk University, Brno, Czech Republic
*Contact email: xnovak8@fi.muni.cz, zezula@fi.muni.cz

Abstract

The need for retrieval based not on attribute values but on the data content itself has recently led to the rise of metric-based similarity search. The computational complexity of such retrieval and the large volumes of data to be processed call for distributed processing, which makes scalability achievable. In this paper, we propose M-Chord, a distributed data structure for metric-based similarity search. The structure adopts the idea behind iDistance, a vector index method, to transform similarity search into the problem of interval search in one dimension. The proposed peer-to-peer organization, based on the Chord protocol, distributes the storage space and parallelizes the execution of similarity queries. The promising features of the structure are validated by experiments with a prototype implementation on two real-life datasets.
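
The transformation of similarity search into one-dimensional interval search can be illustrated with a minimal sketch of an iDistance-style mapping. This is not the authors' code: the function names `idistance_key` and `range_query_intervals`, the separation constant `c`, and the Euclidean metric are assumptions chosen for illustration. The sketch also leaves out how M-Chord normalizes keys and routes intervals to peers through Chord.

```python
import math


def euclidean(a, b):
    # Assumed metric for this sketch; any metric distance function would do.
    return math.dist(a, b)


def idistance_key(obj, pivots, c, dist=euclidean):
    """Map obj to a 1-D key: (index of nearest pivot) * c + distance to that pivot.

    c is assumed to be larger than any distance that occurs in the data,
    so the key ranges of different pivots do not overlap.
    """
    distances = [dist(p, obj) for p in pivots]
    i = min(range(len(pivots)), key=distances.__getitem__)
    return i * c + distances[i]


def range_query_intervals(query, radius, pivots, c, dist=euclidean):
    """Return the 1-D key intervals that may contain objects within radius of query.

    By the triangle inequality, an object x assigned to pivot p_i with
    d(query, x) <= radius satisfies |d(p_i, x) - d(p_i, query)| <= radius.
    The intervals are a superset of the answer: candidates found in them
    must still be checked against the real distance.
    """
    intervals = []
    for i, p in enumerate(pivots):
        d = dist(p, query)
        low = max(i * c, i * c + d - radius)
        high = min((i + 1) * c, i * c + d + radius)
        if low <= high:  # otherwise the query ball cannot reach this pivot's range
            intervals.append((low, high))
    return intervals


if __name__ == "__main__":
    pivots = [(0.0, 0.0), (10.0, 0.0)]
    c = 100.0
    print(idistance_key((1.0, 0.0), pivots, c))
    print(range_query_intervals((2.0, 0.0), 1.5, pivots, c))
```

In a distributed setting such as the one the abstract describes, each resulting interval would be forwarded to the peers responsible for that part of the key space, which is where the parallelism comes from.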