Research Article
How data-sharing nudges influence people's privacy preferences: A machine learning-based analysis
@ARTICLE{10.4108/eai.21-12-2021.172440,
  author={Yang Lu and Shujun Li and Alex Freitas and Athina Ioannou},
  title={How data-sharing nudges influence people's privacy preferences: A machine learning-based analysis},
  journal={EAI Endorsed Transactions on Security and Safety},
  volume={8},
  number={30},
  publisher={EAI},
  journal_a={SESA},
  year={2022},
  month={8},
  keywords={Privacy, Nudging, Persuasive Technology, Data Sharing, User Segmentation, User Profiling, Machine Learning},
  doi={10.4108/eai.21-12-2021.172440}
}
Yang Lu
Shujun Li
Alex Freitas
Athina Ioannou
Year: 2022
SESA
EAI
DOI: 10.4108/eai.21-12-2021.172440
Abstract
INTRODUCTION: Many online services use data-sharing nudges to solicit personal data from their customers for personalized services.
OBJECTIVES: This study examines people’s privacy preferences in sharing different types of personal data under different nudging conditions, how digital nudging can change their willingness to share data, and whether their data-sharing preferences can be predicted from their responses to a questionnaire.
METHODS: This paper reports a machine learning-based analysis of people’s privacy preference patterns under four data-sharing nudging conditions (without nudging, monetary incentives, non-monetary incentives, and privacy assurance). The analysis is based on data collected from 685 UK residents who participated in a panel survey. Their self-reported willingness to share 23 different types of personal data was analyzed using both unsupervised (clustering) and supervised (classification) machine learning algorithms.
RESULTS: The results led to a better understanding of people’s privacy preference patterns across the data-sharing nudging conditions. For example, our participants’ preferences are distributed more sparsely than we expected in a space of 48 possible profiles. Unexpectedly, all three data-sharing nudging strategies had an overall negative effect: they led to a reduced level of self-reported willingness for more participants, compared with the case of no nudging at all. Our experiments with supervised machine learning models also showed that people’s privacy (data-sharing) preference profiles can be automatically predicted with good accuracy, even when a small questionnaire with just seven questions is used.
CONCLUSION: Our work revealed a more complicated structure of people’s privacy preference profiles, which depends to some extent on the type of data-sharing nudge and the type of personal data shared. Such complicated privacy preference profiles can be effectively analyzed using machine learning methods, including automatic prediction based on a small questionnaire. The negative results on the overall effect of different data-sharing nudges imply that service providers should consider whether and how to use such mechanisms to incentivise their consumers to share personal data. We believe that more consumer-centric and transparent methods and tools should be used to help improve trust between consumers and service providers.
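The comparison between a nudging condition and the no-nudging baseline can be illustrated with a minimal Python sketch. It is not the authors' analysis: the abstract does not state how a change in willingness was measured, so the sketch assumes each participant rates every data type on the same numeric scale in both conditions and compares the summed ratings. The function name `nudge_effect`, the participant IDs, and all ratings are hypothetical.

```python
def nudge_effect(baseline, nudged):
    """Count participants whose total willingness fell, rose, or stayed the same.

    baseline, nudged: dict mapping participant id -> list of integer
    willingness ratings, one per data type, in the same order and on the
    same scale in both conditions (an assumption of this sketch).
    """
    counts = {"decreased": 0, "increased": 0, "unchanged": 0}
    for pid, base_ratings in baseline.items():
        # Same number of data types in both conditions, so sums are comparable.
        diff = sum(nudged[pid]) - sum(base_ratings)
        if diff < 0:
            counts["decreased"] += 1
        elif diff > 0:
            counts["increased"] += 1
        else:
            counts["unchanged"] += 1
    return counts


# Made-up ratings for three participants and three data types (illustrative only).
baseline = {"p1": [4, 3, 5], "p2": [2, 2, 3], "p3": [1, 1, 1]}
monetary = {"p1": [3, 3, 4], "p2": [2, 3, 3], "p3": [1, 1, 1]}

effect = nudge_effect(baseline, monetary)
print(effect)
```

Under this reading, a nudge would count as having an overall negative effect when the "decreased" count exceeds the "increased" count. The study itself covers 685 participants and 23 data types, and its exact criterion may differ.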
Copyright © 2021 Yang Lu et al., licensed to EAI. This is an open access article distributed under the terms of the Creative Commons Attribution license, which permits unlimited use, distribution and reproduction in any medium so long as the original work is properly cited.