Quality, Reliability, Security and Robustness in Heterogeneous Systems. 19th EAI International Conference, QShine 2023, Shenzhen, China, October 8 – 9, 2023, Proceedings, Part I

Research Article

Short Text Data Mining Based on Incremental AP Clustering

Cite
BibTeX Plain Text
  • @INPROCEEDINGS{10.1007/978-3-031-65126-7_31,
        author={Fuyu Lu and Ying Guo and Peiyi Qu and Yonglin Leng},
        title={Short Text Data Mining Based on Incremental AP Clustering},
        booktitle={Quality, Reliability, Security and Robustness in Heterogeneous Systems. 19th EAI International Conference, QShine 2023, Shenzhen, China, October 8 -- 9, 2023, Proceedings, Part I},
        proceedings_a={QSHINE},
        year={2024},
        month={8},
        keywords={Short Text, Vector Representation Model, Incremental AP Clustering},
        doi={10.1007/978-3-031-65126-7_31}
    }
    
  • Fuyu Lu
    Ying Guo
    Peiyi Qu
    Yonglin Leng
    Year: 2024
    Short Text Data Mining Based on Incremental AP Clustering
    QSHINE
    Springer
    DOI: 10.1007/978-3-031-65126-7_31
Fuyu Lu¹, Ying Guo¹, Peiyi Qu¹, Yonglin Leng¹,*
  • ¹ College of Information Science and Technology, Bohai University
*Contact email: lengyonglin@qq.com

Abstract

The rapid development of mobile internet technology generates large volumes of short text data that contain many hot topics. Clustering short text data makes it possible to identify these hot topics in a timely manner, which is crucial for discovering public opinion and analyzing user emotions. This paper proposes a hybrid vector representation model (HVRM) that combines weight and topic features to address the feature information loss caused by relying on a single short text vector representation model and by short text sparsity. First, HVRM mines local features using Word2Vec and TF-IDF to obtain a weighted vector for each short text. Next, it uses BTM to obtain global feature vectors. It then concatenates the two feature vectors to form the short text vector. Finally, KNN is used to initialize the responsibility and availability matrices of incremental AP clustering (IAPC). The experimental results show that the proposed hybrid vector representation model can effectively improve the clustering effect.
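
The vector construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the weighting formula, so the sketch assumes a smoothed TF-IDF and a weighted average of word vectors. The toy `embeddings` stand in for trained Word2Vec vectors, and `topic_dists` stands in for per-document topic distributions that a BTM model would produce. All function and variable names are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {word: tf-idf weight} dict per tokenized document.

    Assumption: smoothed IDF, ln((1 + N) / (1 + df)) + 1. The paper may
    use a different TF-IDF variant.
    """
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            word: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1.0)
            for word, count in counts.items()
        })
    return weights


def weighted_local_vector(doc_weights, embeddings, dim):
    """TF-IDF weighted average of word vectors (the local features)."""
    total = [0.0] * dim
    weight_sum = 0.0
    for word, weight in doc_weights.items():
        vec = embeddings.get(word)
        if vec is None:  # skip words without an embedding
            continue
        for i in range(dim):
            total[i] += weight * vec[i]
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # all-zero vector when no word is in the vocabulary
    return [x / weight_sum for x in total]


def hybrid_vector(local_vec, topic_vec):
    """Concatenate local (weighted) and global (topic) features."""
    return list(local_vec) + list(topic_vec)


# Toy corpus of tokenized short texts (illustrative data only).
docs = [
    ["phone", "battery", "life"],
    ["phone", "screen"],
    ["football", "match", "score"],
]

# Hypothetical 2-dimensional stand-ins for Word2Vec embeddings.
embeddings = {
    "phone": [1.0, 0.0], "battery": [0.8, 0.2], "life": [0.5, 0.5],
    "screen": [0.9, 0.1], "football": [0.0, 1.0], "match": [0.1, 0.9],
    "score": [0.2, 0.8],
}

# Hypothetical per-document topic distributions (what BTM would supply).
topic_dists = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]

weights = tfidf_weights(docs)
hybrid = [
    hybrid_vector(weighted_local_vector(w, embeddings, 2), t)
    for w, t in zip(weights, topic_dists)
]
```

Each entry of `hybrid` has the embedding dimension plus the number of topics as its length, and these concatenated vectors are what a clustering step such as IAPC would take as input. In this toy corpus, "battery" receives a higher weight than "phone" in the first document because "phone" also occurs in the second.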

Keywords
Short Text, Vector Representation Model, Incremental AP Clustering
Published
2024-08-20
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-031-65126-7_31