
Research Article
Time Series Data Imputation Using Expectation-Maximization with Principal Component Analysis
@INPROCEEDINGS{10.1007/978-3-030-94182-6_26,
    author={Renkang Geng and Jing Cao and Qinjun Zhao and Yujie Wang},
    title={Time Series Data Imputation Using Expectation-Maximization with Principal Component Analysis},
    proceedings={IoT and Big Data Technologies for Health Care. Second EAI International Conference, IoTCare 2021, Virtual Event, October 18-19, 2021, Proceedings, Part II},
    proceedings_a={IOTCARE PART 2},
    year={2022},
    month={6},
    keywords={Traffic flow data, Missing value, PCA-EM},
    doi={10.1007/978-3-030-94182-6_26}
}
Renkang Geng
Jing Cao
Qinjun Zhao
Yujie Wang
Year: 2022
IOTCARE PART 2
Springer
DOI: 10.1007/978-3-030-94182-6_26
Abstract
Data quality underpins data analysis and determines how effective and how deep that analysis can be. Missing values are one of the most important factors affecting data quality. As machine learning algorithms have come to be widely used for data processing, the handling of missing values has become an important topic in machine learning. For the vast majority of datasets, the first step of analysis is to fill in the missing values. As traffic conditions grow more complex and urban road congestion becomes more serious, traffic data, the most direct reflection of urban road conditions, has great application value and potential. The quality of traffic data directly determines whether traffic flow can be predicted accurately and road conditions judged correctly, and therefore whether urban traffic problems can be managed effectively. It is thus very important to choose appropriate algorithms and methods that fill in the missing values in traffic data quickly and accurately. In this paper, taking traffic flow data as an example, we remove values from the data at different missing-value ratios and use the PCA-EM algorithm to fill them in. From the experimental results, we give a preliminary evaluation of the overall performance of the PCA-EM algorithm for imputing missing values in traffic data.
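The experiment described in the abstract (mask values at several ratios, impute, compare with the held-out truth) can be illustrated with a minimal sketch. The abstract does not give the details of the PCA-EM algorithm, so the code below assumes one common EM-style formulation: fill the gaps with column means, fit a low-rank PCA model, replace the missing entries with the PCA reconstruction, and repeat until the imputed values stop changing. The function names `mask_at_random`, `pca_em_impute`, and `rmse`, the parameters `n_components`, `max_iter`, and `tol`, and the synthetic low-rank matrix standing in for traffic flow data are all assumptions made for illustration; none of them come from the paper.

```python
import numpy as np


def mask_at_random(X, ratio, rng):
    """Return a copy of X with roughly `ratio` of its entries set to NaN, plus the mask."""
    X_missing = X.astype(float).copy()
    mask = rng.random(X.shape) < ratio
    X_missing[mask] = np.nan
    return X_missing, mask


def pca_em_impute(X_missing, n_components=2, max_iter=200, tol=1e-6):
    """Iterative PCA imputation (an assumed EM-style variant, not the paper's exact method)."""
    missing = np.isnan(X_missing)
    # Initial guess: column means computed from the observed entries only.
    col_means = np.nanmean(X_missing, axis=0)
    X_filled = np.where(missing, col_means, X_missing)
    for _ in range(max_iter):
        # Fit a rank-`n_components` PCA model to the currently completed data.
        mu = X_filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(X_filled - mu, full_matrices=False)
        X_hat = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components] + mu
        # Overwrite only the missing entries; observed values are never changed.
        X_new = np.where(missing, X_hat, X_missing)
        change = np.sqrt(np.mean((X_new[missing] - X_filled[missing]) ** 2)) if missing.any() else 0.0
        X_filled = X_new
        if change < tol:
            break
    return X_filled


def rmse(estimate, truth, mask):
    """Root-mean-square error restricted to the masked (artificially removed) entries."""
    return float(np.sqrt(np.mean((estimate[mask] - truth[mask]) ** 2)))


# Synthetic stand-in for a (time steps x sensors) traffic flow matrix: rank 2 plus noise.
rng = np.random.default_rng(0)
X_true = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(200, 6))

for ratio in (0.1, 0.2, 0.3):
    X_missing, mask = mask_at_random(X_true, ratio, rng)
    X_imputed = pca_em_impute(X_missing, n_components=2)
    print(f"missing ratio {ratio:.0%}: RMSE on removed entries = {rmse(X_imputed, X_true, mask):.4f}")
```

On real traffic data the number of components would have to be chosen, for example by cross-validation on artificially removed entries. The missing-completely-at-random mask used here is also a simplification, since sensor outages often produce contiguous gaps.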