Multimedia Technology and Enhanced Learning. Third EAI International Conference, ICMTEL 2021, Virtual Event, April 8–9, 2021, Proceedings, Part I

Research Article

Personal Name Disambiguation for Chinese Documents in Online Medium

Cite
BibTeX Plain Text
  • @INPROCEEDINGS{10.1007/978-3-030-82562-1_23,
        author={Chao Fan and Yu Li},
        title={Personal Name Disambiguation for Chinese Documents in Online Medium},
        proceedings={Multimedia Technology and Enhanced Learning. Third EAI International Conference, ICMTEL 2021, Virtual Event, April 8--9, 2021, Proceedings, Part I},
        proceedings_a={ICMTEL},
        year={2021},
        month={7},
        keywords={Personal name disambiguation, Chinese personal names, Agglomerative clustering},
        doi={10.1007/978-3-030-82562-1_23}
    }
    
  • Chao Fan
    Yu Li
    Year: 2021
    Personal Name Disambiguation for Chinese Documents in Online Medium
    ICMTEL
    Springer
    DOI: 10.1007/978-3-030-82562-1_23
Chao Fan1,*, Yu Li1
  • 1: The School of Artificial Intelligence and Computer Science, Jiangnan University
*Contact email: fanchao@jiangnan.edu.cn

Abstract

Disambiguating different people who share the same name is a critical issue for analyzing content in online media. This paper develops a framework for disambiguating personal names in a Chinese dataset. Web pages containing a personal name are first crawled from the web and standardized. The documents are then parsed with lexical analysis techniques such as word segmentation, part-of-speech tagging, and named entity recognition. We extract several groups of words as features and test different weighting schemes (e.g., Boolean term frequency, absolute term frequency, tf-idf, and entropy weights). For agglomerative clustering, we propose a measure of interdependence within clusters and independence between clusters that automatically determines the number of clusters. In addition, a technique that merges noise clusters is used to improve the clustering results. Experiments are performed on six groups of Chinese personal names, and the final results support the proposed approach.
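
The pipeline described in the abstract (term weighting followed by agglomerative clustering) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy English tokens, the average-link criterion, and the fixed similarity threshold used to stop merging are all assumptions made for illustration. The abstract does not specify the paper's cluster-number measure, its entropy weighting, or its noise-cluster merging step, so none of those are reproduced here.

```python
import math
from collections import Counter


def weight_vectors(docs, scheme="tfidf"):
    """Turn tokenized documents into sparse {term: weight} vectors."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        if scheme == "boolean":      # 1 if the term occurs, regardless of count
            vec = {t: 1.0 for t in tf}
        elif scheme == "tf":         # absolute term frequency
            vec = {t: float(c) for t, c in tf.items()}
        elif scheme == "tfidf":      # term frequency times log inverse document frequency
            vec = {t: c * math.log(n / df[t]) for t, c in tf.items()}
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        vectors.append(vec)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def agglomerate(vectors, threshold=0.2):
    """Average-link agglomerative clustering.

    Merging stops when the most similar pair of clusters falls below
    `threshold` (a hypothetical stand-in for the paper's own measure).
    """
    clusters = [[i] for i in range(len(vectors))]

    def link(c1, c2):
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = link(clusters[x], clusters[y])
                if s > best:
                    best, pair = s, (x, y)
        if best < threshold:
            break
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return sorted(sorted(c) for c in clusters)


# Toy documents mentioning the same (hypothetical) name in two contexts.
docs = [
    ["professor", "university", "research", "paper"],
    ["professor", "research", "laboratory", "paper"],
    ["football", "team", "goal", "league"],
    ["football", "league", "coach", "goal"],
]
clusters = agglomerate(weight_vectors(docs, scheme="tfidf"))
print(clusters)
```

In this toy setup the two academic documents and the two football documents share no terms across groups, so the sketch separates them into two clusters, one per person. Swapping `scheme` between `"boolean"`, `"tf"`, and `"tfidf"` is one simple way to compare weighting schemes on the same input.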

Keywords
Personal name disambiguation, Chinese personal names, Agglomerative clustering
Published
2021-07-22
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-030-82562-1_23
Copyright © 2021–2025 ICST