Collaborative Computing: Networking, Applications and Worksharing. 13th International Conference, CollaborateCom 2017, Edinburgh, UK, December 11–13, 2017, Proceedings

Research Article

A Load Balancing Method Based on Node Features in a Heterogeneous Hadoop Cluster

  • @INPROCEEDINGS{10.1007/978-3-030-00916-8_32,
        author={Pengcheng Yang and Honghao Gao and Huahu Xu and Minjie Bian and Danqi Chu},
        title={A Load Balancing Method Based on Node Features in a Heterogeneous Hadoop Cluster},
        proceedings={Collaborative Computing: Networking, Applications and Worksharing. 13th International Conference, CollaborateCom 2017, Edinburgh, UK, December 11--13, 2017, Proceedings},
        proceedings_a={COLLABORATECOM},
        year={2018},
        month={10},
        keywords={Cloud computing, Hadoop, Heterogeneous cluster, Load balancing, Relative load},
        doi={10.1007/978-3-030-00916-8_32}
    }
    
  • Pengcheng Yang
    Honghao Gao
    Huahu Xu
    Minjie Bian
    Danqi Chu
    Year: 2018
    A Load Balancing Method Based on Node Features in a Heterogeneous Hadoop Cluster
    COLLABORATECOM
    Springer
    DOI: 10.1007/978-3-030-00916-8_32
Pengcheng Yang1, Honghao Gao1, Huahu Xu1,*, Minjie Bian1, Danqi Chu1
  • 1: Shanghai University
*Contact email: huahuxu@163.com

Abstract

Load balancing is a pressing problem in heterogeneous clusters. This paper proposes a load balancing method based on node features. The method first analyses the main indexes that determine node performance, then defines a formula that describes node performance in terms of the contributions of those indexes. Node performance is combined with the node's busy status to calculate a relative load value. From the relative load value of each node and the cluster storage utilization rate, a recommended storage utilization rate is calculated for each node. Finally, the balancer threshold is generated dynamically from the cluster’s current disk load. Experimental results show that the proposed method provides a more reasonable equilibrium for heterogeneous clusters, improves efficiency and substantially reduces execution time.
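
The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything concrete here is an assumption made for illustration: the three indexes (CPU, memory, disk I/O), their weights, the ratio used for relative load, the inverse-proportional scaling of the recommended utilization, and the linear threshold rule. The threshold is expressed in percent because the HDFS balancer's `-threshold` option takes a percentage of disk capacity.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node description; index scores are normalized to (0, 1]."""
    name: str
    cpu: float
    memory: float
    disk_io: float
    busy: float  # fraction of time the node is busy, in [0, 1]


# Assumed contribution of each index to node performance (not from the paper).
WEIGHTS = {"cpu": 0.4, "memory": 0.3, "disk_io": 0.3}


def performance(node: Node) -> float:
    """Weighted sum of the performance indexes."""
    return (WEIGHTS["cpu"] * node.cpu
            + WEIGHTS["memory"] * node.memory
            + WEIGHTS["disk_io"] * node.disk_io)


def relative_load(node: Node) -> float:
    """Busy status scaled by performance: weak, busy nodes score highest."""
    # A small floor avoids a zero load (and a later division by zero).
    return max(node.busy, 1e-6) / performance(node)


def recommended_utilization(nodes: list[Node], cluster_utilization: float) -> dict[str, float]:
    """Give lightly loaded nodes a larger share of storage than heavily loaded ones.

    Values are scaled so that their mean equals the cluster utilization
    (assuming equal disk capacity per node), then capped at 1.0.
    """
    inverse = {n.name: 1.0 / relative_load(n) for n in nodes}
    mean_inverse = sum(inverse.values()) / len(inverse)
    return {name: min(1.0, cluster_utilization * value / mean_inverse)
            for name, value in inverse.items()}


def dynamic_threshold(cluster_utilization: float, low: float = 5.0, high: float = 20.0) -> float:
    """Balancer threshold in percent: tighter as the cluster's disks fill up."""
    return high - (high - low) * cluster_utilization


if __name__ == "__main__":
    nodes = [
        Node("fast", cpu=1.0, memory=1.0, disk_io=1.0, busy=0.5),
        Node("slow", cpu=0.5, memory=0.5, disk_io=0.5, busy=0.5),
    ]
    print(recommended_utilization(nodes, cluster_utilization=0.6))
    print(dynamic_threshold(0.6))
```

In this sketch, two equally busy nodes receive different storage targets because the faster node has the lower relative load. A uniform target would treat both nodes identically regardless of capability.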