5th International ICST Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services

Research Article

Context-Aware Fault Tolerance in Migratory Services

Keywords: Context-aware; Fault Tolerance; Migratory Services; Mobile Ad Hoc Networks
Year: 2010
DOI: 10.4108/ICST.MOBIQUITOUS2008.3564
Oriana Riva (1,*), Josiane Nzouonta (2,*), Cristian Borcea (2,*)
  • 1: Department of Computer Science, ETH Zürich, 8092 Zürich, Switzerland
  • 2: Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, USA
*Contact email: oriva@inf.ethz.ch, jn62@cs.njit.edu, borcea@cs.njit.edu


Mobile ad hoc networks can be leveraged to provide ubiquitous services capable of acquiring, processing, and sharing real-time information from the physical world. Unlike Internet services, these services have to survive frequent and unpredictable faults such as disconnections, crashes, or users turning off their devices. This paper describes a context-aware fault tolerance mechanism for our migratory services model. In this model, a per-client service instance transparently migrates to different nodes in the network to provide a continuous and semantically correct interaction with its client. The proposed fault tolerance mechanism extends the primary-backup approach with a context-aware checkpointing process. The backup node is dynamically selected based on its distance from the client and service, the similarity of its mobility pattern with those of the client and service, the frequency of the checkpointing process, and the size of the checkpointing state. We demonstrate the feasibility of our approach through a prototype implementation tested in a small-scale ad hoc network of smart phones. Additionally, we simulate our mechanism in a realistic urban environment with 300 pedestrians, cyclists, and cars. We compare it against two baselines: one where the backup node is a neighbor of the service node, and one where the backup node is the client node itself. Our mechanism performs as much as 80% better than the former for recovery ratio, and three times better than the latter for network overhead, while achieving better or similar recovery latency.
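
The backup selection criteria listed in the abstract (distance to client and service, mobility similarity, checkpointing frequency, and checkpoint state size) can be illustrated with a small scoring sketch. This is not the authors' algorithm: the abstract does not give a formula, so the linear score, the weights, the units, and every name below (`Candidate`, `checkpoint_load`, `backup_score`, `select_backup`) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """A node that could host the backup (hypothetical structure)."""
    node_id: str
    hops_to_client: int         # network distance to the client
    hops_to_service: int        # network distance to the primary service node
    mobility_similarity: float  # assumed to be in [0, 1]; 1 = moves like client/service


def checkpoint_load(c: Candidate, checkpoints_per_min: float, state_kb: float) -> float:
    """Assumed cost model: checkpoint traffic grows with frequency, state size,
    and the number of hops each checkpoint travels from the service to the backup."""
    return checkpoints_per_min * state_kb * c.hops_to_service


def backup_score(
    c: Candidate,
    checkpoints_per_min: float,
    state_kb: float,
    w_similarity: float = 100.0,  # illustrative weights, not taken from the paper
    w_load: float = 1.0,
    w_client: float = 5.0,
) -> float:
    """Higher is better: reward similar mobility, penalize checkpoint traffic
    and distance from the client."""
    return (
        w_similarity * c.mobility_similarity
        - w_load * checkpoint_load(c, checkpoints_per_min, state_kb)
        - w_client * c.hops_to_client
    )


def select_backup(
    candidates: Sequence[Candidate],
    checkpoints_per_min: float,
    state_kb: float,
) -> Optional[Candidate]:
    """Return the best-scoring candidate, or None if there are no candidates."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: backup_score(c, checkpoints_per_min, state_kb))


# Two made-up candidates: one close to the service, one close to the client.
near_service = Candidate("A", hops_to_client=4, hops_to_service=1, mobility_similarity=0.5)
near_client = Candidate("B", hops_to_client=1, hops_to_service=4, mobility_similarity=0.9)

light = select_backup([near_service, near_client], checkpoints_per_min=1, state_kb=2)
heavy = select_backup([near_service, near_client], checkpoints_per_min=6, state_kb=10)
```

Under these assumed weights, infrequent checkpoints of small state favor the candidate near the client (`light` is node B), while frequent checkpoints of larger state favor the candidate near the service (`heavy` is node A). This is the kind of trade-off a context-aware selection is meant to capture, in contrast to always choosing a neighbor of the service node or always choosing the client node.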