
Research Article
ALFLAT: Chinese NER Using ALBERT, Flat-Lattice Transformer, Word Segmentation and Entity Dictionary
@INPROCEEDINGS{10.1007/978-3-031-17081-2_14,
  author        = {Haifeng Lv and Yong Ding},
  title         = {ALFLAT: Chinese NER Using ALBERT, Flat-Lattice Transformer, Word Segmentation and Entity Dictionary},
  proceedings   = {Applied Cryptography in Computer and Communications. Second EAI International Conference, AC3 2022, Virtual Event, May 14-15, 2022, Proceedings},
  proceedings_a = {AC3},
  year          = {2022},
  month         = {10},
  keywords      = {NER; ALBERT; Lattice transformer; CRF; Word segmentation},
  doi           = {10.1007/978-3-031-17081-2_14}
}
- Haifeng Lv
- Yong Ding
Year: 2022
AC3
Springer
DOI: 10.1007/978-3-031-17081-2_14
Abstract
Recently, the character-word lattice structure has been shown to be effective for Chinese named entity recognition (NER) by incorporating word information. However, on the one hand, because the lattice structure is dynamic and complex, some existing lattice-based models, although they effectively exploit the parallel computation of GPUs, do not fully utilize word segmentation boundary tags, which are helpful features for the NER task. On the other hand, the character-word vectors must be trained, and a user-defined entity dictionary cannot be used effectively. In this paper, we propose ALFLAT, a model based on a flat-lattice Transformer that incorporates the ALBERT pre-trained model, word segmentation information and a user-defined entity dictionary for Chinese NER. ALFLAT converts the lattice structure into a flat structure consisting of spans, where each span corresponds to a character or a latent word together with its position in the original lattice. It then integrates word segmentation embeddings with the output of the flat-lattice Transformer, modifies the emission scores according to the user-defined entity dictionary, and finally applies Viterbi decoding in the CRF layer to obtain the final entity results. With the power of the ALBERT pre-trained model, the Transformer and position encoding, ALFLAT can fully leverage the lattice, word segmentation and user-defined entity dictionary information. Experiments on the MSRA dataset show that ALFLAT outperforms other lexicon-based models in both performance and efficiency.
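
As a rough illustration of the pipeline the abstract describes, the Python sketch below flattens a toy character-word lattice into head/tail spans, boosts CRF emission scores where a user-defined dictionary entry matches, and runs Viterbi decoding. It is a minimal sketch, not the authors' implementation: the function names (flatten_lattice, boost_emissions), the toy tag set and the sample dictionary are all illustrative assumptions, and in ALFLAT itself the emission scores come from ALBERT and the flat-lattice Transformer combined with segmentation embeddings rather than from zeros.

import numpy as np

# Toy BIO tag set over a single entity type and a hypothetical
# user-defined entity dictionary (both illustrative assumptions).
TAGS = ["O", "B-LOC", "I-LOC"]
ENTITY_DICT = {"重庆": "LOC"}

def flatten_lattice(chars, words):
    # Flatten a character-word lattice into spans (token, head, tail).
    # Characters occupy a single position (head == tail); each matched
    # lexicon word keeps the indices of its first and last character,
    # preserving its position in the original lattice.
    spans = [(c, i, i) for i, c in enumerate(chars)]
    for word, start in words:  # words: list of (word, start index)
        spans.append((word, start, start + len(word) - 1))
    return spans

def boost_emissions(emissions, chars, dictionary, bonus=2.0):
    # Raise the emission scores of tags consistent with dictionary hits,
    # steering Viterbi decoding toward user-defined entities.
    text = "".join(chars)
    for entity, etype in dictionary.items():
        start = text.find(entity)
        while start != -1:
            emissions[start, TAGS.index("B-" + etype)] += bonus
            for k in range(start + 1, start + len(entity)):
                emissions[k, TAGS.index("I-" + etype)] += bonus
            start = text.find(entity, start + 1)
    return emissions

def viterbi(emissions, transitions):
    # Standard Viterbi decoding over emission and transition scores.
    n, _ = emissions.shape
    score = emissions[0].copy()
    back = np.zeros_like(emissions, dtype=int)
    for i in range(1, n):
        total = score[:, None] + transitions + emissions[i][None, :]
        back[i] = total.argmax(axis=0)
        score = total.max(axis=0)
    path = [int(score.argmax())]
    for i in range(n - 1, 0, -1):
        path.append(int(back[i, path[-1]]))
    return [TAGS[t] for t in reversed(path)]

chars = list("我去重庆了")
print(flatten_lattice(chars, [("重庆", 2)]))
emissions = boost_emissions(np.zeros((len(chars), len(TAGS))), chars, ENTITY_DICT)
transitions = np.zeros((len(TAGS), len(TAGS)))
transitions[TAGS.index("O"), TAGS.index("I-LOC")] = -10.0  # forbid O -> I-LOC
print(viterbi(emissions, transitions))  # ['O', 'O', 'B-LOC', 'I-LOC', 'O']

The dictionary bonus only raises the emission scores of tags consistent with a match rather than forcing them, so learned evidence from the encoder can still override a spurious dictionary hit; the CRF transition matrix then rules out invalid tag sequences such as O followed by I-LOC.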