Machine Learning and Intelligent Communications. First International Conference, MLICOM 2016, Shanghai, China, August 27-28, 2016, Revised Selected Papers

Research Article

Research on Decentralized Group Replication Strategy Based on Correlated Patterns Mining in Data Grids

  • @INPROCEEDINGS{10.1007/978-3-319-52730-7_30,
        author={Danyang Qin and Ruixue Liu and Jiaqi Zhen and Songxiang Yang and Erfu Wang},
        title={Research on Decentralized Group Replication Strategy Based on Correlated Patterns Mining in Data Grids},
        proceedings={Machine Learning and Intelligent Communications. First International Conference, MLICOM 2016, Shanghai, China, August 27-28, 2016, Revised Selected Papers},
        proceedings_a={MLICOM},
        year={2017},
        month={2},
        keywords={Data mining, Correlated patterns, Data replication, Distributed groups},
        doi={10.1007/978-3-319-52730-7_30}
    }
    
  • Danyang Qin
    Ruixue Liu
    Jiaqi Zhen
    Songxiang Yang
    Erfu Wang
    Year: 2017
    Research on Decentralized Group Replication Strategy Based on Correlated Patterns Mining in Data Grids
    MLICOM
    Springer
    DOI: 10.1007/978-3-319-52730-7_30
Danyang Qin1,*, Ruixue Liu2,*, Jiaqi Zhen1,*, Songxiang Yang1,*, Erfu Wang1,*
  • 1: Heilongjiang University
  • 2: Harbin Institute of Technology Shenzhen Graduate School
*Contact email: qindanyang@hlju.edu.cn, liuruixue@hlju.edu.cn, zhenjiaqi@hlju.edu.cn, yangsongxiang@hlju.edu.cn, wangerfu@hlju.edu.cn

Abstract

To address the problem that most existing data-mining-based replication strategies cannot extract correlations between files effectively, a new decentralized replication strategy based on maximal frequent correlated pattern mining, called RSMFCP, is proposed. RSMFCP translates the file access history into a binary access history, mines maximal frequent correlated patterns from it, and then performs replication, which greatly reduces redundancy and improves replication performance. Data analysis and simulation results show that, compared with other strategies such as no replication, PRA, DR2 and PDDRA, RSMFCP extracts correlations more effectively and achieves a lower mean job execution time under different access patterns, offering a new option for reducing transmission delay in data grids.
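
The pipeline named in the abstract (file access history, then binary access history, then maximal frequent correlated patterns) can be illustrated with a small sketch. This is not the RSMFCP algorithm itself, which the abstract does not specify. The function names, the brute-force enumeration, the thresholds, and the use of all-confidence as the correlation measure are all assumptions made for illustration only.

```python
from itertools import combinations


def to_binary_history(access_log, files):
    """One row per job: 1 if the job accessed the file, else 0."""
    return [[1 if f in accessed else 0 for f in files] for accessed in access_log]


def support(rows, cols):
    """Fraction of jobs that accessed every file in `cols`."""
    return sum(all(row[c] for c in cols) for row in rows) / len(rows)


def maximal_frequent_correlated(rows, files, min_support, min_corr):
    """Brute-force search, suitable only for tiny illustrative inputs.

    Assumption: correlation is measured by all-confidence, i.e.
    support(pattern) / max(support of each single file in the pattern).
    The paper may use a different measure.
    """
    n = len(files)
    kept = []
    for size in range(2, n + 1):
        for cols in combinations(range(n), size):
            s = support(rows, cols)
            if s < min_support:
                continue  # not frequent
            corr = s / max(support(rows, (c,)) for c in cols)
            if corr >= min_corr:
                kept.append(frozenset(cols))
    # A pattern is maximal if no other kept pattern strictly contains it.
    maximal = [p for p in kept if not any(p < q for q in kept)]
    return [sorted(files[c] for c in p) for p in maximal]


# Hypothetical access log: each set lists the files one job requested.
files = ["a", "b", "c", "d"]
access_log = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"d"}, {"c", "d"}]

rows = to_binary_history(access_log, files)
patterns = maximal_frequent_correlated(rows, files, min_support=0.3, min_corr=0.5)
print(patterns)
```

In this toy log, files a, b and c form the only maximal pattern, so a replication step built on this idea would treat them as one group and replicate them together. Keeping only maximal patterns is what removes the redundant sub-patterns such as (a, b) or (a, c).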