Research Article
Outsourced k-Means Clustering over Encrypted Data Under Multiple Keys in Spark Framework
@INPROCEEDINGS{10.1007/978-3-319-78813-5_4,
  author={Hong Rong and Huimei Wang and Jian Liu and Jialu Hao and Ming Xian},
  title={Outsourced k-Means Clustering over Encrypted Data Under Multiple Keys in Spark Framework},
  proceedings={Security and Privacy in Communication Networks. 13th International Conference, SecureComm 2017, Niagara Falls, ON, Canada, October 22--25, 2017, Proceedings},
  proceedings_a={SECURECOMM},
  year={2018},
  month={4},
  keywords={Outsourced k-means clustering, Multiple keys, Cloud environment, Spark framework},
  doi={10.1007/978-3-319-78813-5_4}
}
Hong Rong
Huimei Wang
Jian Liu
Jialu Hao
Ming Xian
Year: 2018
SECURECOMM
Springer
DOI: 10.1007/978-3-319-78813-5_4
Abstract
As the quantity of data produced has risen rapidly in recent years, clients lacking computational and storage resources tend to outsource data mining tasks to cloud service providers in order to improve efficiency and save costs. It is also increasingly common for clients to perform collaborative mining to maximize profits. However, due to the rise of privacy leakage issues, the data contributed by clients should be encrypted under their own keys. This paper focuses on privacy-preserving k-means clustering over the joint datasets from multiple sources. Unfortunately, existing secure outsourcing protocols are either restricted to a single-key setting or quite inefficient because of frequent client-to-server interactions, making them impractical for wide application. To address these issues, we propose a set of secure building blocks and an outsourced clustering protocol under the Spark framework. Theoretical analysis shows that our scheme protects the confidentiality of the joint database and mining results in the standard threat model with small computation and communication overhead. Experimental results also demonstrate its significant efficiency improvements compared with existing methods.