8th International Conference on Communications and Networking in China

Research Article

Content Knowledge Based Privacy Estimation Model for Anonymous OSN Data Publishing

  • @INPROCEEDINGS{10.1109/ChinaCom.2013.6694636,
        author={Cheng Cheng and Chunhong Zhang and Qingyuan Hu},
        title={Content Knowledge Based Privacy Estimation Model for Anonymous OSN Data Publishing},
        booktitle={8th International Conference on Communications and Networking in China},
        publisher={IEEE},
        proceedings_a={CHINACOM},
        year={2013},
        month={11},
        keywords={privacy estimation, background knowledge, user generated content, principal component analysis},
        doi={10.1109/ChinaCom.2013.6694636}
    }
    
Cheng Cheng1,*, Chunhong Zhang1, Qingyuan Hu1
  • 1: Beijing University of Posts and Telecommunications
*Contact email: chengcheng20090901@gmail.com

Abstract

Online Social Network (OSN) data is often collected by third parties for various purposes. One problem in such practices is how to measure the privacy breach so that users can be assured of their security. However, recent work on privacy estimation is not systematic enough and focuses mainly on traditional datasets, such as bank and hospital data. Compared with these closed environments, open APIs and low registration barriers make OSNs an open environment. This openness makes more User Generated Content (UGC), such as blogs and remarks, easily obtainable by adversaries. In this paper, we analyzed the background knowledge available in OSNs and proposed a general privacy estimation model for OSN data based on linear regression. In particular, our model takes the adversary's content knowledge into consideration. Because the high dimensionality of content knowledge could cause high computational overhead, we optimized our model with Principal Component Analysis (PCA).
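
The general pattern the abstract describes, a linear-regression estimator over high-dimensional content features that are first reduced with PCA, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic feature matrix `X`, the synthetic privacy scores `y`, the choice of three components, and the helper names `fit_pca`, `project`, `fit_linear` and `predict`. It shows only a generic PCA-then-least-squares pipeline, not the authors' model.

```python
import numpy as np


def fit_pca(X, n_components):
    """Return the feature mean and the top principal directions of X."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal directions,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(X, mean, components):
    """Map samples into the reduced principal-component space."""
    return (X - mean) @ components.T


def fit_linear(Z, y):
    """Ordinary least squares with an intercept (stored as the last coefficient)."""
    A = np.hstack([Z, np.ones((Z.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(Z, coef):
    """Evaluate the fitted linear model on reduced features."""
    return Z @ coef[:-1] + coef[-1]


# Hypothetical data: 200 users, 50 content features driven by 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 50))
# Hypothetical privacy score: a linear function of the latent factors.
y = latent @ np.array([0.5, -1.0, 2.0]) + 0.3

mean, components = fit_pca(X, n_components=3)
Z = project(X, mean, components)   # 50 features reduced to 3
coef = fit_linear(Z, y)            # regression in the reduced space
mse = float(np.mean((predict(Z, coef) - y) ** 2))
print("reduced shape:", Z.shape, "training MSE:", mse)
```

In this sketch the regression fits 4 coefficients instead of 51, which is the kind of cost reduction that motivates applying PCA before the regression; in practice the number of components would be chosen from the explained variance of real data rather than fixed in advance.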