
Research Article
Coreference Resolution for Cybersecurity Entity: Towards Explicit, Comprehensive Cybersecurity Knowledge Graph with Low Redundancy
@INPROCEEDINGS{10.1007/978-3-031-25538-0_6,
  author={Zhengyu Liu and Haochen Su and Nannan Wang and Cheng Huang},
  title={Coreference Resolution for Cybersecurity Entity: Towards Explicit, Comprehensive Cybersecurity Knowledge Graph with Low Redundancy},
  proceedings={Security and Privacy in Communication Networks. 18th EAI International Conference, SecureComm 2022, Virtual Event, October 2022, Proceedings},
  proceedings_a={SECURECOMM},
  year={2023},
  month={2},
  keywords={Coreference resolution, Security intelligence, Semantic text matching, Entity type},
  doi={10.1007/978-3-031-25538-0_6}
}
Authors: Zhengyu Liu, Haochen Su, Nannan Wang, Cheng Huang
Year: 2023
SECURECOMM
Springer
DOI: 10.1007/978-3-031-25538-0_6
Abstract
The Cybersecurity Knowledge Graph (CKG) has become an important structure for addressing current cybersecurity crises and challenges, owing to its powerful ability to model, mine, and leverage massive security intelligence data. To construct a comprehensive and explicit CKG with low redundancy, coreference resolution (CR) plays a crucial role as the core step in knowledge fusion. Although research on coreference resolution techniques in the Natural Language Processing (NLP) field has made notable achievements, there is still a large gap in the cybersecurity field. Therefore, the paper first investigates the effectiveness of existing CR models on a cybersecurity corpus and then presents CyberCoref, an end-to-end coreference resolution model for cybersecurity entities. We propose an entity type prediction network that not only helps to improve mention representations and provide type consistency checks, but also enables the model to distinguish coreference among different entity types and thus perform coreference resolution at a finer granularity. To overcome the problem of implicit contextual modeling adopted by existing CR models, we innovatively propose an explicit contextual modeling method for the coreference resolution task based on semantic text matching. Finally, we improve the span representation by introducing lexical and syntactic features. The experimental results demonstrate that CyberCoref improves the F1 values on the cybersecurity corpus by 6.9% compared to existing CR models.