Collaborative Computing: Networking, Applications and Worksharing. 13th International Conference, CollaborateCom 2017, Edinburgh, UK, December 11–13, 2017, Proceedings

Research Article

HSAStore: A Hierarchical Storage Architecture for Computing Systems Containing Large-Scale Intermediate Data

  • @INPROCEEDINGS{10.1007/978-3-030-00916-8_54,
        author={Zhoujie Zhang and Limin Xiao and Shubin Su and Li Ruan and Bing Wei and Nan Zhou and Xi Liu and Haitao Wang and Zipeng Wei},
        title={HSAStore: A Hierarchical Storage Architecture for Computing Systems Containing Large-Scale Intermediate Data},
        proceedings={Collaborative Computing: Networking, Applications and Worksharing. 13th International Conference, CollaborateCom 2017, Edinburgh, UK, December 11--13, 2017, Proceedings},
        proceedings_a={COLLABORATECOM},
        year={2018},
        month={10},
        keywords={Hierarchical storage architecture Bandwidth bottleneck Large-scale intermediate data Distributed file system},
        doi={10.1007/978-3-030-00916-8_54}
    }
    
  • Zhoujie Zhang, Limin Xiao, Shubin Su, Li Ruan, Bing Wei, Nan Zhou, Xi Liu, Haitao Wang, Zipeng Wei (2018). HSAStore: A Hierarchical Storage Architecture for Computing Systems Containing Large-Scale Intermediate Data. COLLABORATECOM. Springer. DOI: 10.1007/978-3-030-00916-8_54
Zhoujie Zhang*, Limin Xiao*, Shubin Su*, Li Ruan, Bing Wei, Nan Zhou, Xi Liu, Haitao Wang1,*, Zipeng Wei1,*
  • 1: Space Star Technology Co., Ltd
*Contact email: jokerzhang@buaa.edu.cn, xiaolm@buaa.edu.cn, dreamsu@buaa.edu.cn, wanghaitao@spacestar.com.cn, wei.zipeng@outlook.com

Abstract

This paper introduces HSAStore, a newly designed storage system that aims to provide efficient storage for computing systems that contain large-scale intermediate data. HSAStore combines three subsystems: a distributed file system that manages the intermediate data, a centralized Network Attached Storage (NAS) that stores the raw input data and the result data, and a local file system that serves local data. HSAStore takes full advantage of the network bandwidth of the computing cluster. Moreover, because HSAStore adopts a distributed file system, it supports efficient execution of parallel programs. Experiments show that HSAStore significantly improves efficiency in computing systems that contain large-scale intermediate data.
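
The placement rule stated in the abstract (raw input and results on the NAS, intermediate data on the distributed file system, local data on the local file system) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `DataKind`, `Tier`, `PLACEMENT`, `MOUNTS`, and `resolve_path`, as well as the mount points, are hypothetical and chosen only for illustration.

```python
import posixpath
from enum import Enum


class DataKind(Enum):
    """Categories of data named in the abstract."""
    RAW_INPUT = "raw_input"
    INTERMEDIATE = "intermediate"
    RESULT = "result"
    LOCAL = "local"


class Tier(Enum):
    """The three storage subsystems named in the abstract."""
    NAS = "nas"            # centralized Network Attached Storage
    DFS = "dfs"            # distributed file system
    LOCAL_FS = "local_fs"  # node-local file system


# Placement as described in the abstract: each data kind maps to one tier.
PLACEMENT = {
    DataKind.RAW_INPUT: Tier.NAS,
    DataKind.RESULT: Tier.NAS,
    DataKind.INTERMEDIATE: Tier.DFS,
    DataKind.LOCAL: Tier.LOCAL_FS,
}

# Assumption: hypothetical mount points, not taken from the paper.
MOUNTS = {
    Tier.NAS: "/mnt/nas",
    Tier.DFS: "/mnt/dfs",
    Tier.LOCAL_FS: "/tmp/local",
}


def resolve_path(kind: DataKind, name: str) -> str:
    """Return the path under the tier that holds data of the given kind."""
    tier = PLACEMENT[kind]
    return posixpath.join(MOUNTS[tier], name)


if __name__ == "__main__":
    print(resolve_path(DataKind.INTERMEDIATE, "stage1/part-0003"))
    print(resolve_path(DataKind.RESULT, "final/output.dat"))
```

Keeping the mapping in a single table means an application only declares what kind of data it is writing, and the tier decision stays in one place.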