
Research Article
A Hybrid Cloud Deployment Architecture for Privacy-Preserving Collaborative Genome-Wide Association Studies
@INPROCEEDINGS{10.1007/978-3-031-06365-7_21,
  author={Fatima-zahra Boujdad and David Niyitegeka and Reda Bellafqira and Gouenou Coatrieux and Emmanuelle Genin and Mario S\"{u}dholt},
  title={A Hybrid Cloud Deployment Architecture for Privacy-Preserving Collaborative Genome-Wide Association Studies},
  proceedings={Digital Forensics and Cyber Crime. 12th EAI International Conference, ICDF2C 2021, Virtual Event, Singapore, December 6-9, 2021, Proceedings},
  proceedings_a={ICDF2C},
  year={2022},
  month={6},
  keywords={Secure GWAS, Privacy, Data watermarking, Homomorphic encryption, Integrity, Intel SGX, Anonymization},
  doi={10.1007/978-3-031-06365-7_21}
}
Fatima-zahra Boujdad
David Niyitegeka
Reda Bellafqira
Gouenou Coatrieux
Emmanuelle Genin
Mario Südholt
Year: 2022
ICDF2C
Springer
DOI: 10.1007/978-3-031-06365-7_21
Abstract
The increasing availability of sequenced human genomes is enabling health professionals and genomics researchers to better understand the role of genetic variants in the development of common diseases, notably by means of genome-wide association studies (GWAS), which are very promising for personalized medicine and diagnostic testing. However, the ever-present need to handle genetic data from different sources in order to conduct large studies raises multiple privacy and security issues. In particular, classical anonymization methods are inapplicable to genetic data, which are now known to be identifying per se. In this paper, we propose a novel framework for privacy-preserving collaborative GWAS performed in the cloud. Our proposal is the first framework that combines a hybrid cloud deployment with a set of four security mechanisms (digital watermarking, homomorphic encryption, meta-data de-identification, and the Intel Software Guard Extensions technology) in order to ensure the confidentiality and integrity of genetic data. Furthermore, our approach describes meta-data management, which has rarely been considered in state-of-the-art proposals despite its importance to genetic analyses. In addition, the new deployment model we suggest fits existing infrastructures, which makes its integration straightforward. Experimental results of a prototype implementation on typical data sizes demonstrate that our protocol is feasible and that the framework is practical for real-world scenarios.