3rd International ICST Conference on Scalable Information Systems

Research Article

Modeling Replication Strategies in Data Grid Systems with Arbitrary Clustered Demands

  • @INPROCEEDINGS{10.4108/ICST.INFOSCALE2008.3519,
        author={Jianjin Jiang and Guangwen Yang and Dingxing Wang},
        title={Modeling Replication Strategies in Data Grid Systems with Arbitrary Clustered Demands},
        booktitle={3rd International ICST Conference on Scalable Information Systems},
        publisher={ICST},
        proceedings_a={INFOSCALE},
        year={2010},
        month={5},
        keywords={Data grid, replication strategy, clustered demands, access latency, optimized distribution},
        doi={10.4108/ICST.INFOSCALE2008.3519}
    }
    
  • Jianjin Jiang
    Guangwen Yang
    Dingxing Wang
    Year: 2010
    Modeling Replication Strategies in Data Grid Systems with Arbitrary Clustered Demands
    INFOSCALE
    ICST
    DOI: 10.4108/ICST.INFOSCALE2008.3519
Jianjin Jiang1,*, Guangwen Yang1,*, Dingxing Wang1,*
  • 1: Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, China
*Contact email: jiangjj02@mails.tsinghua.edu.cn, ygw@tsinghua.edu.cn, dxwang@tsinghua.edu.cn

Abstract

This paper considers the relationship between request distribution and replica distribution in a data grid when requests exhibit arbitrary clustered demands. We first give a formal model of replication strategies in a data grid system. Second, we investigate the optimal way to replicate data, with the objective of minimizing average access latency, when requests exhibit arbitrary clustered demands. We explain why, under the optimal strategy, replicas should be distributed uniformly when requests are uniformly distributed within a sub grid. We then investigate the relationship between different files in a sub grid. Furthermore, we analyze the case in which all sub grids are equal-sized and conclude that when requests are uniformly distributed across the system, replicas should be uniformly distributed across the system too. Finally, we give an optimal strategy for the case in which sub grids are not equal-sized and different sub grids exhibit different request clustering patterns. Compared with some popular strategies, the optimal strategy requires less wide area network bandwidth and achieves lower average access latency. Simulation results validate the effectiveness of the optimal strategy.