
Research Article
A novel knowledge enhancement method for large-scale natural language training model
@ARTICLE{10.4108/airo.8987,
  author={Qi Han and Gilja So},
  title={A novel knowledge enhancement method for large-scale natural language training model},
  journal={EAI Endorsed Transactions on AI and Robotics},
  volume={4},
  number={1},
  publisher={EAI},
  journal_a={AIRO},
  year={2025},
  month={7},
  keywords={large-scale natural language training model, knowledge enhancement, long text representation, pre-trained model},
  doi={10.4108/airo.8987}
}
Qi Han
Gilja So
Year: 2025
A novel knowledge enhancement method for large-scale natural language training model
AIRO
EAI
DOI: 10.4108/airo.8987
Abstract
A knowledge-enhanced large-scale natural language training model is an advanced language model that combines deep learning with knowledge enhancement. By learning from massive unlabeled data and incorporating external knowledge such as knowledge graphs, it overcomes the limitations of traditional models in interpretability and reasoning ability. Introducing knowledge into data-driven artificial intelligence models is an important way to realize human-machine hybrid intelligence. However, because most pre-trained models are trained on large-scale unstructured corpus data, they suffer from deficiencies in certainty and explainability, which can be remedied to some extent by introducing external knowledge. To address these problems, we present a knowledge-enhanced large-scale natural language training model that integrates deep learning with external knowledge sources (e.g., knowledge graphs) to improve interpretability, certainty, and reasoning ability. We propose a new knowledge enhancement method and demonstrate its effectiveness through a long text representation model. The model processes structured, knowledge-rich long texts by extracting knowledge and semantic information at both the sentence and document levels, and then fuses these representations to produce an enhanced long text representation. Experiments on legal case matching tasks show that our model significantly outperforms existing methods, highlighting its innovation and practical value.
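For readers unfamiliar with this kind of architecture, the following is a minimal sketch of how sentence-level and document-level semantic and knowledge representations might be fused into one enhanced long text representation. It assumes PyTorch; all module names, dimensions, and the gating scheme are illustrative assumptions, not the authors' implementation.

# Illustrative sketch only (assumed PyTorch): fuse sentence-level and
# document-level semantic/knowledge features into a single enhanced
# long-text representation. Names and dimensions are hypothetical.
import torch
import torch.nn as nn


class LongTextFusion(nn.Module):
    def __init__(self, sem_dim: int = 768, kg_dim: int = 128, out_dim: int = 768):
        super().__init__()
        # Project concatenated semantic + knowledge features at the sentence level.
        self.sent_proj = nn.Linear(sem_dim + kg_dim, out_dim)
        # Project concatenated semantic + knowledge features at the document level.
        self.doc_proj = nn.Linear(sem_dim + kg_dim, out_dim)
        # Gate deciding how much sentence-level vs. document-level
        # information flows into the final representation.
        self.gate = nn.Linear(2 * out_dim, out_dim)

    def forward(self, sent_sem, sent_kg, doc_sem, doc_kg):
        # sent_sem: (batch, n_sentences, sem_dim)  sentence semantics (e.g. a BERT-style encoder)
        # sent_kg:  (batch, n_sentences, kg_dim)   sentence-level knowledge (e.g. entity embeddings)
        # doc_sem:  (batch, sem_dim)               document-level semantics
        # doc_kg:   (batch, kg_dim)                document-level knowledge
        sent = torch.tanh(self.sent_proj(torch.cat([sent_sem, sent_kg], dim=-1)))
        sent = sent.mean(dim=1)  # pool sentence vectors into one vector per document
        doc = torch.tanh(self.doc_proj(torch.cat([doc_sem, doc_kg], dim=-1)))
        g = torch.sigmoid(self.gate(torch.cat([sent, doc], dim=-1)))
        return g * sent + (1.0 - g) * doc  # enhanced long-text representation


if __name__ == "__main__":
    model = LongTextFusion()
    rep = model(
        torch.randn(2, 16, 768),  # 2 documents, 16 sentences each
        torch.randn(2, 16, 128),
        torch.randn(2, 768),
        torch.randn(2, 128),
    )
    print(rep.shape)  # torch.Size([2, 768])

In a downstream task such as legal case matching, one such representation per case could be compared (for example with cosine similarity), but the gating and pooling choices above are only one possible design.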
Copyright © 2025 Qi Han et al., licensed to EAI. This is an open access article distributed under the terms of the CC BY-NC-SA 4.0 license, which permits copying, redistributing, remixing, transforming, and building upon the material in any medium so long as the original work is properly cited.