Applied Cryptography in Computer and Communications. Second EAI International Conference, AC3 2022, Virtual Event, May 14-15, 2022, Proceedings

Research Article

ALFLAT: Chinese NER Using ALBERT, Flat-Lattice Transformer, Word Segmentation and Entity Dictionary

Cite (BibTeX)
@INPROCEEDINGS{10.1007/978-3-031-17081-2_14,
    author={Haifeng Lv and Yong Ding},
    title={ALFLAT: Chinese NER Using ALBERT, Flat-Lattice Transformer, Word Segmentation and Entity Dictionary},
    proceedings={Applied Cryptography in Computer and Communications. Second EAI International Conference, AC3 2022, Virtual Event, May 14-15, 2022, Proceedings},
    proceedings_a={AC3},
    year={2022},
    month={10},
    keywords={NER, ALBERT, Lattice transformer, CRF, Word segmentation},
    doi={10.1007/978-3-031-17081-2_14}
}
Haifeng Lv1,*, Yong Ding2
  • 1: School of Data Science and Software Engineering
  • 2: Guangxi Key Laboratory of Cryptography and Information Security, School of Computer Science and Information Security
*Contact email: 421538806@qq.com

Abstract

Recently, the character-word lattice structure has proved effective for Chinese named entity recognition (NER) by incorporating word information. However, on the one hand, although some existing lattice-based models effectively exploit the parallel computation of GPUs despite the dynamic and complex lattice structure, they do not fully utilize word segmentation boundary tags, which are helpful features for the NER task. On the other hand, the character-word vectors need to be trained, and a user-defined entity dictionary cannot be used effectively. In this paper, we propose ALFLAT, a model based on the flat-lattice Transformer that incorporates the ALBERT pre-trained model, word segmentation information, and a user-defined entity dictionary for Chinese NER. ALFLAT converts the lattice structure into a flat structure consisting of spans, integrates word segmentation embeddings with the output of the flat-lattice Transformer, modifies the emission scores according to the user-defined entity dictionary, and finally applies Viterbi decoding in the CRF layer to obtain the final entity results. Each span corresponds to a character or latent word and its position in the original lattice. With the power of the ALBERT pre-trained model, the Transformer, and position encoding, ALFLAT can fully leverage the lattice, word segmentation, and user-defined entity dictionary information. Experiments on the MSRA dataset show that ALFLAT outperforms other lexicon-based models in both performance and efficiency.
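To make the decoding pipeline described above concrete, below is a minimal Python sketch (not the authors' code) of the last two steps: raising CRF emission scores for character spans matched in a user-defined entity dictionary, then Viterbi-decoding the best tag path. The BIO tag set, boost value, and toy inputs are illustrative assumptions; in the paper the emissions would come from the ALBERT + flat-lattice Transformer encoder.

import numpy as np

TAGS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]  # assumed BIO tag set
TAG2ID = {t: i for i, t in enumerate(TAGS)}

def boost_emissions(emissions, chars, entity_dict, boost=2.0):
    """Raise emission scores of tags that agree with dictionary matches.

    emissions: (seq_len, n_tags) scores from the encoder (toy data here)
    entity_dict: maps entity surface strings to their type, e.g. {"北京": "LOC"}
    boost: additive score bonus (an assumed hyperparameter)
    """
    emissions = emissions.copy()
    text = "".join(chars)
    for entity, etype in entity_dict.items():
        start = text.find(entity)
        while start != -1:
            emissions[start, TAG2ID["B-" + etype]] += boost
            for i in range(start + 1, start + len(entity)):
                emissions[i, TAG2ID["I-" + etype]] += boost
            start = text.find(entity, start + 1)
    return emissions

def viterbi(emissions, transitions):
    """Standard Viterbi decoding over emission and transition scores."""
    seq_len, n_tags = emissions.shape
    score = emissions[0].copy()               # best score ending in each tag
    back = np.zeros((seq_len, n_tags), dtype=int)
    for t in range(1, seq_len):
        # total[i, j] = best path through tag i at t-1, then tag j at t
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    path = [int(score.argmax())]
    for t in range(seq_len - 1, 0, -1):       # follow backpointers
        path.append(int(back[t, path[-1]]))
    return [TAGS[i] for i in reversed(path)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    chars = list("我住在北京")                 # toy sentence: "I live in Beijing"
    emissions = rng.normal(size=(len(chars), len(TAGS)))
    transitions = rng.normal(scale=0.1, size=(len(TAGS), len(TAGS)))
    emissions = boost_emissions(emissions, chars, {"北京": "LOC"})
    print(list(zip(chars, viterbi(emissions, transitions))))

An additive boost keeps the dictionary as soft evidence: the CRF transition scores can still overrule a dictionary match that would produce an inconsistent tag sequence.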

Keywords
NER, ALBERT, Lattice transformer, CRF, Word segmentation
Published
2022-10-06
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-031-17081-2_14