sis 22(1): e10

Research Article

Automatic Grammar Error Correction Model Based on Encoder-decoder Structure for English Texts

  • @ARTICLE{10.4108/eetsis.v9i5.2011,
        author={Jiahao Wang and Guimin Huang and Yabing Wang},
        title={Automatic Grammar Error Correction Model Based on Encoder-decoder Structure for English Texts},
        journal={EAI Endorsed Transactions on Scalable Information Systems},
        volume={10},
        number={1},
        publisher={EAI},
        journal_a={SIS},
        year={2022},
        month={9},
        keywords={Encoder-decoder, Grammar Error Correction (GEC), deep neural network, attention mechanism, beam search},
        doi={10.4108/eetsis.v9i5.2011}
    }
    
  • Jiahao Wang
    Guimin Huang
    Yabing Wang
    Year: 2022
    Automatic Grammar Error Correction Model Based on Encoder-decoder Structure for English Texts
    SIS
    EAI
    DOI: 10.4108/eetsis.v9i5.2011
Jiahao Wang1,*, Guimin Huang1, Yabing Wang1
  • 1: Guilin University of Electronic Technology
*Contact email: gymer0729@163.com

Abstract

Information transmission plays an irreplaceable role in social life, and language is a very important information carrier. Among all languages, English occupies an important position. For most learners of English, grammatical errors are a persistent difficulty. In this paper, we propose an automatic grammar error correction model based on an encoder-decoder structure. Departing from traditional encoder designs, we use a dual-encoder structure to capture the information of the source sentence and the context sentence separately. The decoder is designed with a gated structure that effectively integrates the output information of the encoders. The self-attention mechanism is also incorporated to better handle long-distance information extraction. In addition, we propose a dynamic beam search algorithm to improve the accuracy of the word prediction process, and achieve dynamic extraction of the decoder output by combining kernel sampling techniques. We add a penalty factor to reduce the probability of generating repeated words, while suppressing the model's preference for generating shorter sentences. Finally, the proposed method is validated on the official English grammar error correction dataset. Experiments show that the proposed dual-encoder model performs well.
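
The decoding ideas named in the abstract (sampling-based candidate selection, a repetition penalty, and a counterweight to the preference for short outputs) can be illustrated with a small sketch. The abstract gives no formulas, so everything below is an assumption rather than the authors' method: "kernel sampling" is read here as what is commonly called nucleus (top-p) sampling, the repetition penalty is a simple subtraction in log space, and the length handling is a generic length normalization. The names `nucleus_filter`, `penalized_log_prob`, `length_normalized`, `beam_step`, and the parameters `top_p`, `repeat_penalty`, and `alpha` are all hypothetical.

```python
import math


def nucleus_filter(probs, top_p=0.9):
    """Keep the smallest set of most-probable tokens whose total mass reaches top_p.

    The number of tokens kept varies with the shape of the distribution,
    which is one possible reading of "dynamic" candidate extraction.
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        total += p
        if total >= top_p:
            break
    return kept


def penalized_log_prob(token, p, history, repeat_penalty=2.0):
    """Log-probability of a token, lowered if it already occurs in the hypothesis."""
    logp = math.log(p)
    if token in history:
        # Assumed form of the penalty: divide the probability by repeat_penalty.
        logp -= math.log(repeat_penalty)
    return logp


def length_normalized(score, length, alpha=0.7):
    """Divide the (negative) cumulative log-probability by length**alpha.

    Longer hypotheses accumulate more negative terms; normalizing by length
    offsets the resulting bias toward short outputs.
    """
    return score / (length ** alpha)


def beam_step(beams, next_probs, beam_width=2, top_p=0.9, repeat_penalty=2.0):
    """Expand every hypothesis by one token and keep the best beam_width.

    beams: list of (tokens, cumulative_log_prob) pairs.
    next_probs: callable mapping a token list to a {token: probability} dict,
                standing in for the decoder's output distribution.
    """
    candidates = []
    for tokens, score in beams:
        for token, p in nucleus_filter(next_probs(tokens), top_p):
            new_score = score + penalized_log_prob(token, p, tokens, repeat_penalty)
            candidates.append((tokens + [token], new_score))
    candidates.sort(key=lambda c: length_normalized(c[1], len(c[0])), reverse=True)
    return candidates[:beam_width]
```

In this toy setup, a hypothesis that already contains "the" and is offered "the" with probability 0.6 and "cat" with probability 0.4 ends up preferring "cat", because the penalty halves the effective probability of the repeated word.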