Future Internet Technologies and Trends. First International Conference, ICFITT 2017, Surat, India, August 31 - September 2, 2017, Proceedings

Research Article

HiMod-Pert: Histogram Modification Based Perturbation Approach for Privacy Preserving Data Mining

  • @INPROCEEDINGS{10.1007/978-3-319-73712-6_3,
        author={Alpa Shah and Ravi Gulati},
        title={HiMod-Pert: Histogram Modification Based Perturbation Approach for Privacy Preserving Data Mining},
        proceedings={Future Internet Technologies and Trends. First International Conference, ICFITT 2017, Surat, India, August 31 - September 2, 2017, Proceedings},
        proceedings_a={ICFITT},
        year={2018},
        month={2},
        keywords={Privacy preserving data mining, Histogram Modification, Additive white Gaussian noise, Multiplicative perturbation, Geometric Data Perturbation},
        doi={10.1007/978-3-319-73712-6_3}
    }
    
  • Alpa Shah
    Ravi Gulati
    Year: 2018
    HiMod-Pert: Histogram Modification Based Perturbation Approach for Privacy Preserving Data Mining
    ICFITT
    Springer
    DOI: 10.1007/978-3-319-73712-6_3
Alpa Shah1,*, Ravi Gulati2,*
  • 1: Sarvajanik College of Engineering and Technology
  • 2: Veer Narmad South Gujarat University
*Contact email: alpa.shah@scet.ac.in, rmgulati@vnsgu.ac.in

Abstract

Privacy Preserving Data Mining (PPDM) guards against the disclosure of sensitive quasi-identifiers in a dataset during mining by perturbing the data. The perturbed dataset is then used by a trusted third party to derive association rules effectively. Many PPDM algorithms destroy the original data in order to generate the mining results. It is essential that the perturbed data supports the same statistical inference on the sensitive attributes and minimizes information loss. Existing techniques based on additive, multiplicative, and geometric transformations incur minimal information loss but are vulnerable to reconstruction attacks. We propose a histogram-modification-based method, HiMod-Pert, for preserving the sensitive numeric attributes of the perturbed dataset. Our method uses the difference between neighboring values to determine the perturbation factor. Experiments were performed to implement the proposed technique and test its applicability. Evaluation using descriptive statistical metrics shows that the information loss is minimal.
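
The abstract does not give the details of HiMod-Pert, so the following Python sketch is only an illustration of the ideas it names, not the authors' algorithm. It contrasts an additive Gaussian noise baseline with a hypothetical neighbor-difference perturbation, and measures information loss as the change in mean and standard deviation. All function names, the `scale` parameter, and the sample values are assumptions introduced here.

```python
import random
import statistics


def additive_noise(values, sigma, seed=0):
    """Baseline: add zero-mean Gaussian noise with std. dev. sigma to each value."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in values]


def neighbor_difference_perturb(values, scale=0.5):
    """Hypothetical sketch, NOT the published HiMod-Pert algorithm.

    Each value is shifted by `scale` times the gap to its next-larger
    neighbor in sorted order; the largest value is left unchanged.
    With 0 <= scale < 1 the rank order of the values is preserved.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = list(values)
    for pos, idx in enumerate(order[:-1]):
        gap = values[order[pos + 1]] - values[idx]
        out[idx] = values[idx] + scale * gap
    return out


def summary(values):
    """Descriptive statistics used to compare original and perturbed data."""
    return {"mean": statistics.mean(values), "stdev": statistics.stdev(values)}


def information_loss(original, perturbed):
    """Absolute change in each descriptive statistic after perturbation."""
    a, b = summary(original), summary(perturbed)
    return {key: abs(a[key] - b[key]) for key in a}


if __name__ == "__main__":
    ages = [23, 35, 41, 52, 67]  # made-up sensitive numeric attribute
    shifted = neighbor_difference_perturb(ages)
    print(shifted)  # [29.0, 38.0, 46.5, 59.5, 67]
    print(information_loss(ages, shifted))
    print(information_loss(ages, additive_noise(ages, sigma=2.0)))
```

Comparing the two `information_loss` results on the same attribute gives a rough sense of how a perturbation scheme can be judged with descriptive statistics, which is the kind of evaluation the abstract reports.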