2nd International ICST Conference on Collaborative Computing: Networking, Applications and Worksharing

Research Article

Managing and Recovering High Data Availability in a DHT under Churn

  • @INPROCEEDINGS{10.1109/COLCOM.2006.361874,
        author={Predrag Knežević and Andreas Wombacher and Thomas Risse},
        title={Managing and Recovering High Data Availability in a DHT under Churn},
        proceedings={2nd International ICST Conference on Collaborative Computing: Networking, Applications and Worksharing},
        publisher={IEEE},
        proceedings_a={COLLABORATECOM},
        year={2007},
        month={5},
        keywords={Availability Computer science Costs Current measurement Measurement errors Peer to peer computing Protocols Runtime Testing Time measurement},
        doi={10.1109/COLCOM.2006.361874}
    }
    
  • Predrag Knežević
    Andreas Wombacher
    Thomas Risse
    Year: 2007
    Managing and Recovering High Data Availability in a DHT under Churn
    COLLABORATECOM
    IEEE
    DOI: 10.1109/COLCOM.2006.361874
Predrag Knežević (1,*), Andreas Wombacher (2,*), Thomas Risse (1,*)
  • 1: Fraunhofer IPSI, Dolivostrasse 15, 64293 Darmstadt, Germany
  • 2: University of Twente, Department of Computer Science, Enschede, The Netherlands
*Contact email: knezevic@ipsi.fhg.de, a.wombacher@cs.utwente.nl, risse@ipsi.fhg.de

Abstract

An essential issue in peer-to-peer data management is keeping data highly available at all times. A common approach is to replicate data in the hope that at least one replica is available when needed. However, due to churn, the number of replicas created may not be sufficient to guarantee the intended data availability. If the number of replicas is computed from the lowest expected peer availability (the classical case) but that expectation is too high, then peer availability after a churn may be too low, and the system may be unable to recover the requested data availability. This paper continues previous work (Knezevic et al., 2006) and presents a replication protocol that delivers a configured data availability guarantee and is resistant to churn or recovers quickly from it. The protocol is based on a distributed hash table (DHT), measurement of the peer online probability in the system, and corresponding adjustment of the number of replicas. The evaluation shows that the protocol maintains or recovers the requested data availability during or shortly after churns of varying strength, while keeping the storage overhead close to the theoretical minimum.
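
The relationship between measured peer availability and the number of replicas can be illustrated with a minimal sketch. It is not the protocol from the paper. It assumes that peers are online independently with the same probability p, and that a data item is available when at least one of its R replicas is online, so that availability is 1 - (1 - p)^R. The function name `replicas_needed` and its parameters are hypothetical and chosen only for illustration.

```python
def replicas_needed(peer_availability: float,
                    target_availability: float,
                    max_replicas: int = 10_000) -> int:
    """Return the smallest R such that 1 - (1 - p)^R >= target.

    Assumption (not taken from the paper): peers are online independently,
    each with probability `peer_availability`.
    """
    if not 0.0 < peer_availability < 1.0:
        raise ValueError("peer_availability must be in (0, 1)")
    if not 0.0 < target_availability < 1.0:
        raise ValueError("target_availability must be in (0, 1)")

    all_offline = 1.0  # probability that every replica holder is offline
    for replicas in range(1, max_replicas + 1):
        all_offline *= 1.0 - peer_availability
        if 1.0 - all_offline >= target_availability:
            return replicas
    raise ValueError("target not reachable within max_replicas")


if __name__ == "__main__":
    # Re-evaluate the replica count as the measured availability drops.
    for p in (0.8, 0.5, 0.2):
        print(p, replicas_needed(p, 0.99))
```

Under these assumptions, a target availability of 0.99 needs 7 replicas when the measured peer availability is 0.5, but 21 when a churn reduces it to 0.2. This shows why a replica count fixed in advance from an overly optimistic estimate can fall short, and why adjusting it from measurements matters.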