Machine Learning and Intelligent Communications. 4th International Conference, MLICOM 2019, Nanjing, China, August 24–25, 2019, Proceedings

Research Article

A Data Quality Improvement Method Based on the Greedy Algorithm

  • @INPROCEEDINGS{10.1007/978-3-030-32388-2_22,
        author={Zhongfeng Wang and Yatong Fu and Chunhe Song and Weichun Ge and Lin Qiao and Hongyu Zhang},
        title={A Data Quality Improvement Method Based on the Greedy Algorithm},
        proceedings={Machine Learning and Intelligent Communications. 4th International Conference, MLICOM 2019, Nanjing, China, August 24--25, 2019, Proceedings},
        proceedings_a={MLICOM},
        year={2019},
        month={10},
        keywords={Data quality, Improvement order, Greedy algorithm},
        doi={10.1007/978-3-030-32388-2_22}
    }
    
  • Zhongfeng Wang
    Yatong Fu
    Chunhe Song
    Weichun Ge
    Lin Qiao
    Hongyu Zhang
    Year: 2019
    A Data Quality Improvement Method Based on the Greedy Algorithm
    MLICOM
    Springer
    DOI: 10.1007/978-3-030-32388-2_22
Zhongfeng Wang, Yatong Fu, Chunhe Song*, Weichun Ge1, Lin Qiao1, Hongyu Zhang1
  • 1: State Grid Liaoning Electric Power Co., Ltd.
*Contact email: songchunhe@sia.cn

Abstract

High-quality data is essential for data analysis and mining. Data quality can be measured by many indicators, and several methods have been proposed that improve data quality by improving one or more of these indicators. However, few works discuss the impact of the processing order of data quality indicators on the overall data quality. In this paper, first, some data quality indicators and their improvement methods are given; second, the impact of the processing order of data quality indicators on the overall data quality is discussed; then a novel data quality improvement method based on the greedy algorithm is proposed. Experiments show that the proposed method can improve data quality while reducing time and computational costs.
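
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading of "greedy" ordering: at each step, apply whichever remaining improvement step raises the overall quality score the most. The two indicators (completeness, uniqueness), their improvement steps, the equal-weight overall score, and all function names are assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Tuple[Optional[float], ...]
Step = Callable[[List[Row]], List[Row]]


def completeness(rows: List[Row]) -> float:
    """Fraction of cells that are not missing (hypothetical indicator)."""
    cells = [v for row in rows for v in row]
    return sum(v is not None for v in cells) / len(cells) if cells else 1.0


def uniqueness(rows: List[Row]) -> float:
    """Fraction of rows that are distinct (hypothetical indicator)."""
    return len(set(rows)) / len(rows) if rows else 1.0


def overall_quality(rows: List[Row]) -> float:
    """Assumed overall score: unweighted mean of the two indicators."""
    return (completeness(rows) + uniqueness(rows)) / 2


def fill_missing(rows: List[Row]) -> List[Row]:
    """Improve completeness: replace missing cells with the column mean."""
    means = []
    for col in zip(*rows):
        present = [v for v in col if v is not None]
        means.append(sum(present) / len(present) if present else None)
    return [tuple(m if v is None else v for v, m in zip(row, means))
            for row in rows]


def drop_duplicates(rows: List[Row]) -> List[Row]:
    """Improve uniqueness: keep the first occurrence of each row."""
    return list(dict.fromkeys(rows))


def greedy_order(rows: List[Row], steps: Dict[str, Step]):
    """Repeatedly apply the remaining step with the largest score gain."""
    remaining = dict(steps)
    order: List[str] = []
    while remaining:
        best_name, best_rows, best_score = None, rows, overall_quality(rows)
        for name, step in remaining.items():
            candidate = step(rows)
            score = overall_quality(candidate)
            if score > best_score:
                best_name, best_rows, best_score = name, candidate, score
        if best_name is None:  # no remaining step improves the score
            break
        rows = best_rows
        order.append(best_name)
        del remaining[best_name]
    return order, rows


# Toy data: one duplicated row and one missing value.
data = [(1, 10.0), (1, 10.0), (2, None), (3, 30.0)]
steps = {"fill_missing": fill_missing, "drop_duplicates": drop_duplicates}
order, cleaned = greedy_order(data, steps)
print(order, overall_quality(cleaned))
```

On this toy data the order matters: filling first would compute the column mean over the duplicated row, whereas the greedy choice removes duplicates first because that step yields the larger immediate gain. Stopping as soon as no step improves the score is one way such a scheme could save time and computation, though whether the paper does this is not stated in the abstract.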