2nd International ICST Conference on Scalable Information Systems

Research Article

A Distributed Incremental Nearest Neighbor Algorithm

Download549 downloads
  • @INPROCEEDINGS{10.4108/infoscale.2007.196,
        author={Fabrizio Falchi and Fausto Rabitti and Claudio Gennaro and Pavel Zezula},
        title={A Distributed Incremental Nearest Neighbor Algorithm},
        proceedings={2nd International ICST Conference on Scalable Information Systems},
        proceedings_a={INFOSCALE},
        year={2010},
        month={5},
        keywords={distributed systems algorithms performance},
        doi={10.4108/infoscale.2007.196}
    }
    
  • Fabrizio Falchi
    Fausto Rabitti
    Claudio Gennaro
    Pavel Zezula
    Year: 2010
    A Distributed Incremental Nearest Neighbor Algorithm
    INFOSCALE
    ICST
    DOI: 10.4108/infoscale.2007.196
Fabrizio Falchi1,*, Fausto Rabitti1,*, Claudio Gennaro1,*, Pavel Zezula2,*
  • 1: ISTI-CNR Pisa, Italy
  • 2: Masaryk University Brno, Czech Republic
*Contact email: fabrizio.falchi@isti.cnr.it, fausto.rabitti@isti.cnr.it, claudio.gennaro@isti.cnr.it, zezula@fi.muni.cz

Abstract

Searching for non-text data (e.g., images) is mostly done by means of metadata annotations or by extracting the text close to the data. However, supporting real content-based audio-visual search, based on similarity search on features, is significantly more expensive than searching for text. Moreover, the search exhibits linear scalability with respect to the data set size. In this paper, we present a Distributed Incremental Nearest Neighbor algorithm (DINN) for finding nearest neighbor in an incremental fashion over data distributed between nodes which are able to perform a local Incremental Nearest Neighbor (local-INN). We prove that our algorithm is optimal with respect to both number of involved nodes and number of local-INN invocations. An implementation of our DINN algorithm, on a real P2P system called MCAN, was used for conducting an extensive experimental evaluation on a real-life dataset.