Cloud Computing, Security, Privacy in New Computing Environments. 7th International Conference, CloudComp 2016, and First International Conference, SPNCE 2016, Guangzhou, China, November 25–26, and December 15–16, 2016, Proceedings

Research Article

Efficient Graph Mining on Heterogeneous Platforms in the Cloud

  • @INPROCEEDINGS{10.1007/978-3-319-69605-8_2,
        author={Tao Zhang and Weiqin Tong and Wenfeng Shen and Junjie Peng and Zhihua Niu},
        title={Efficient Graph Mining on Heterogeneous Platforms in the Cloud},
        proceedings={Cloud Computing, Security, Privacy in New Computing Environments. 7th International Conference, CloudComp 2016, and First International Conference, SPNCE 2016, Guangzhou, China, November 25--26, and December 15--16, 2016, Proceedings},
        proceedings_a={CLOUDCOMP},
        year={2017},
        month={11},
        keywords={Graph mining, GPGPU, Graph partitioning, Load balancing, Cloud computing},
        doi={10.1007/978-3-319-69605-8_2}
    }
    
  • Tao Zhang
    Weiqin Tong
    Wenfeng Shen
    Junjie Peng
    Zhihua Niu
    Year: 2017
    Efficient Graph Mining on Heterogeneous Platforms in the Cloud
    CLOUDCOMP
    Springer
    DOI: 10.1007/978-3-319-69605-8_2
Tao Zhang1,*, Weiqin Tong1,*, Wenfeng Shen1,*, Junjie Peng1,*, Zhihua Niu1,*
  • 1: Shanghai University
*Contact email: taozhang@shu.edu.cn, wqtong@shu.edu.cn, wfshen@shu.edu.cn, jjie.peng@shu.edu.cn, zhniu@staff.shu.edu.cn

Abstract

In the Big Data era, the rapid growth of novel Internet applications and new experimental data collection methods in biology and chemistry has produced many large-scale, complex graphs. As the scale and complexity of graph data grow explosively, it has become both urgent and challenging to develop more efficient graph processing frameworks capable of executing general graph algorithms. In this paper, we propose leveraging GPUs to accelerate large-scale graph mining in the cloud. To achieve good performance and scalability, we propose a graph summary method and runtime system optimization techniques for load balancing and message handling. Experimental results show that the prototype framework outperforms two state-of-the-art distributed frameworks, GPS and GraphLab, in both performance and scalability.