7th International Conference on Collaborative Computing: Networking, Applications and Worksharing

Research Article

m-Privacy for Collaborative Data Publishing

  • @INPROCEEDINGS{10.4108/icst.collaboratecom.2011.247094,
        author={Slawomir Goryczka and Li Xiong and Benjamin Fung},
        title={m-Privacy for Collaborative Data Publishing},
        proceedings={7th International Conference on Collaborative Computing: Networking, Applications and Worksharing},
        publisher={IEEE},
        proceedings_a={COLLABORATECOM},
        year={2012},
        month={4},
        keywords={data anonymization, distributed data anonymization, data privacy, collaborative data publishing},
        doi={10.4108/icst.collaboratecom.2011.247094}
    }
    
  • Slawomir Goryczka
    Li Xiong
    Benjamin Fung
    Year: 2012
    m-Privacy for Collaborative Data Publishing
    COLLABORATECOM
    ICST
    DOI: 10.4108/icst.collaboratecom.2011.247094
Slawomir Goryczka¹,*, Li Xiong¹, Benjamin Fung²
  • 1: Emory University
  • 2: Concordia University
*Contact email: sgorycz@emory.edu

Abstract

In this paper, we consider the collaborative data publishing problem for anonymizing horizontally partitioned data held by multiple data providers. We consider a new type of “insider attack” by colluding data providers, who may use their own data records (a subset of the overall data), in addition to external background knowledge, to infer the data records contributed by other data providers. The paper addresses this new threat and makes several contributions. First, we introduce the notion of m-privacy, which guarantees that the anonymized data satisfies a given privacy constraint against any group of up to m colluding data providers. Second, we present heuristic algorithms that exploit the equivalence group monotonicity of privacy constraints and adaptive ordering techniques to efficiently check m-privacy for a given set of records. Finally, we present a data provider-aware anonymization algorithm with adaptive m-privacy checking strategies that efficiently ensures high utility and m-privacy of the anonymized data. Experiments on real-life datasets suggest that our approach achieves utility and efficiency better than or comparable to existing and baseline algorithms while providing an m-privacy guarantee.
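
To illustrate the m-privacy notion described in the abstract, here is a minimal brute-force sketch in Python. It is not the paper's heuristic or adaptive checking algorithm. It assumes k-anonymity as the privacy constraint and models each record as a hypothetical `(provider, group)` pair, where `group` stands for the record's generalized quasi-identifier value. The function names are illustrative and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def is_k_anonymous(records, k):
    """Assumed privacy constraint: every equivalence group holds at least k records."""
    group_sizes = Counter(group for _provider, group in records)
    return all(size >= k for size in group_sizes.values())


def is_m_private(records, m, constraint):
    """Naive check: the constraint must hold after removing the records of
    any coalition of up to m providers (exponential in m, illustration only)."""
    providers = sorted({provider for provider, _group in records})
    for size in range(m + 1):
        for coalition in combinations(providers, size):
            # Records the coalition cannot attribute to itself.
            remaining = [r for r in records if r[0] not in coalition]
            if not constraint(remaining):
                return False
    return True


# Three providers each contribute one record to the same equivalence group.
records = [("A", "30-39"), ("B", "30-39"), ("C", "30-39")]
two_anonymous = lambda recs: is_k_anonymous(recs, 2)

print(is_m_private(records, 1, two_anonymous))  # True: any single insider leaves 2 records
print(is_m_private(records, 2, two_anonymous))  # False: two insiders leave only 1 record
```

The sketch enumerates every coalition explicitly, which is exactly the cost that the heuristic algorithms mentioned in the abstract aim to reduce by exploiting monotonicity and adaptive ordering.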