Security and Privacy in Communication Networks. 13th International Conference, SecureComm 2017, Niagara Falls, ON, Canada, October 22–25, 2017, Proceedings

Research Article

Outsourced k-Means Clustering over Encrypted Data Under Multiple Keys in Spark Framework

  • @INPROCEEDINGS{10.1007/978-3-319-78813-5_4,
        author={Hong Rong and Huimei Wang and Jian Liu and Jialu Hao and Ming Xian},
        title={Outsourced k-Means Clustering over Encrypted Data Under Multiple Keys in Spark Framework},
        proceedings={Security and Privacy in Communication Networks. 13th International Conference, SecureComm 2017, Niagara Falls, ON, Canada, October 22--25, 2017, Proceedings},
        proceedings_a={SECURECOMM},
        year={2018},
        month={4},
        keywords={Outsourced k-means clustering, Multiple keys, Cloud environment, Spark framework},
        doi={10.1007/978-3-319-78813-5_4}
    }
    
  • Hong Rong
    Huimei Wang
    Jian Liu
    Jialu Hao
    Ming Xian
    Year: 2018
    Outsourced k-Means Clustering over Encrypted Data Under Multiple Keys in Spark Framework
    SECURECOMM
    Springer
    DOI: 10.1007/978-3-319-78813-5_4
Hong Rong1,*, Huimei Wang1,*, Jian Liu1,*, Jialu Hao1,*, Ming Xian1,*
  • 1: National University of Defense Technology
*Contact email: r.hong_nudt@hotmail.com, freshcdwhm@163.com, ljabc730@gmail.com, haojialupb@163.com, qwertmingx@sina.com

Abstract

As the quantity of data produced has risen rapidly in recent years, clients lacking computational and storage resources tend to outsource data mining tasks to cloud service providers in order to improve efficiency and save costs. It’s also increasingly common for clients to perform collaborative mining to maximize profits. However, due to the rise of privacy leakage issues, the data contributed by clients should be encrypted under their own keys. This paper focuses on privacy-preserving k-means clustering over the joint datasets from multiple sources. Unfortunately, existing secure outsourcing protocols are either restricted to a single-key setting or quite inefficient because of frequent client-to-server interactions, making them impractical for wide application. To address these issues, we propose a set of secure building blocks and an outsourced clustering protocol under the Spark framework. Theoretical analysis shows that our scheme protects the confidentiality of the joint database and mining results in the standard threat model with small computation and communication overhead. Experimental results also demonstrate its significant efficiency improvements compared with existing methods.