Wireless Internet. 10th International Conference, WiCON 2017, Tianjin, China, December 16-17, 2017, Proceedings

Research Article

Optimization of Density-Based K-means Algorithm in Trajectory Data Clustering

  • @INPROCEEDINGS{10.1007/978-3-319-90802-1_39,
        author={Mei-Wei Hao and Hua-Lin Dai and Kun Hao and Cheng Li and Yun-Jie Zhang and Hao-Nan Song},
        title={Optimization of Density-Based K-means Algorithm in Trajectory Data Clustering},
        proceedings={Wireless Internet. 10th International Conference, WiCON 2017, Tianjin, China, December 16-17, 2017, Proceedings},
        proceedings_a={WICON},
        year={2018},
        month={5},
        keywords={K-means algorithm, Based on density, Characteristics of vehicle activity, Weighted density, Initial clustering center, Between-Within Proportion (BWP) index},
        doi={10.1007/978-3-319-90802-1_39}
    }
    
  • Mei-Wei Hao
    Hua-Lin Dai
    Kun Hao
    Cheng Li
    Yun-Jie Zhang
    Hao-Nan Song
    Year: 2018
    Optimization of Density-Based K-means Algorithm in Trajectory Data Clustering
    WICON
    Springer
    DOI: 10.1007/978-3-319-90802-1_39
Mei-Wei Hao (1,*), Hua-Lin Dai (1,*), Kun Hao (1,*), Cheng Li (1,*), Yun-Jie Zhang (1,*), Hao-Nan Song (2,*)
  • 1: Tianjin Chengjian University
  • 2: Tsinghua University
*Contact email: angelsamle@126.com, 99871382@qq.com, littlehao@126.com, licheng.mum@gmail.com, zhangyunjietj@163.com, shn14@mails.tsinghua.edu.com

Abstract

Because trajectory data are large in volume and complex in structure, an improved density-based K-means algorithm was proposed. First, the density weight of important points was increased and, based on the resulting density, high-density trajectory data points were selected as the initial cluster centers for K-means clustering. Second, the clustering results were evaluated with the Between-Within Proportion (BWP) index. Finally, the optimal number of clusters and the best clustering were determined from this evaluation. Theoretical analysis and experimental results showed that the improved algorithm extracts trajectory key points more effectively. Its clustering accuracy was 24 percentage points higher than that of the traditional K-means algorithm and 16 percentage points higher than that of the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. The proposed algorithm offers better stability and higher accuracy in trajectory data clustering.
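
The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the neighbourhood `radius`, the per-point `weights` (standing in for the "density weight of important points"), and all function names are assumptions made for illustration. The BWP score uses a commonly cited definition, (b - w) / (b + w), where w is a point's mean distance to the other members of its own cluster and b is its smallest mean distance to any other cluster; the paper may define it differently.

```python
import math


def density_init(points, k, radius, weights=None):
    """Pick up to k initial centers, densest points first (assumed scheme).

    Density of a point = sum of the weights of all points within `radius`.
    A candidate is skipped if it lies within `radius` of a chosen center.
    """
    n = len(points)
    weights = weights or [1.0] * n
    density = [
        sum(weights[j] for j in range(n)
            if math.dist(points[i], points[j]) <= radius)
        for i in range(n)
    ]
    centers = []
    for i in sorted(range(n), key=lambda i: -density[i]):
        if all(math.dist(points[i], c) > radius for c in centers):
            centers.append(points[i])
        if len(centers) == k:
            break
    return centers


def kmeans(points, centers, iters=100):
    """Plain Lloyd iterations starting from the given centers."""
    for _ in range(iters):
        labels = [min(range(len(centers)),
                      key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == c]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(
                tuple(sum(x) / len(members) for x in zip(*members))
                if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def avg_bwp(points, labels):
    """Mean BWP over all points; needs at least two clusters."""
    clusters = sorted(set(labels))
    total = 0.0
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        w = sum(own) / len(own) if own else 0.0
        b = min(
            sum(math.dist(p, q) for q, l in zip(points, labels) if l == c)
            / labels.count(c)
            for c in clusters if c != labels[i])
        total += (b - w) / (b + w) if b + w > 0 else 0.0
    return total / len(points)


def best_k(points, k_range, radius, weights=None):
    """Return the k with the highest average BWP, plus all scores."""
    scores = {}
    for k in k_range:
        centers = density_init(points, k, radius, weights)
        if len(centers) < k:
            continue  # not enough well-separated dense points for this k
        labels, _ = kmeans(points, centers)
        if len(set(labels)) >= 2:
            scores[k] = avg_bwp(points, labels)
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Toy data: three tight groups of 2-D points (not real trajectory data).
    offsets = [(0, 0), (0.5, 0), (0, 0.5), (0.5, 0.5)]
    pts = [(cx + dx, cy + dy)
           for cx, cy in [(0, 0), (10, 0), (0, 10)] for dx, dy in offsets]
    k, scores = best_k(pts, range(2, 5), radius=1.0)
    print(k, scores)
```

Because the initial centers are chosen deterministically from density rather than at random, repeated runs on the same data give the same clustering, which is one plausible reading of the stability claim in the abstract.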