A Method of Calculating the Semantic Similarity Between English and Chinese Concepts

Jingwen Cao; Tiexin Wang; Wenxin Li; Chuanqi Tao

Machine Learning and Intelligent Communications. 4th International Conference, MLICOM 2019, Nanjing, China, August 24–25, 2019, Proceedings

Research Article

@INPROCEEDINGS{10.1007/978-3-030-32388-2_27,
    author={Jingwen Cao and Tiexin Wang and Wenxin Li and Chuanqi Tao},
    title={A Method of Calculating the Semantic Similarity Between English and Chinese Concepts},
    proceedings={Machine Learning and Intelligent Communications. 4th International Conference, MLICOM 2019, Nanjing, China, August 24--25, 2019, Proceedings},
    proceedings_a={MLICOM},
    year={2019},
    month={10},
    keywords={HowNet, MongoDB, Semantic similarity, Knowledge driven},
    doi={10.1007/978-3-030-32388-2_27}
}

Jingwen Cao, Tiexin Wang, Wenxin Li, Chuanqi Tao (2019). A Method of Calculating the Semantic Similarity Between English and Chinese Concepts. MLICOM. Springer. DOI: 10.1007/978-3-030-32388-2_27

Jingwen Cao¹,*, Tiexin Wang*, Wenxin Li¹,*, Chuanqi Tao*

1: Nanjing University of Aeronautics and Astronautics

*Contact email: caojingwen1028@126.com, tiexin.wang@nuaa.edu.cn, freedomtot@nuaa.edu.cn, t-chuanqi@163.com

Abstract

In the big data era, data and information processing is a common concern across diverse fields. To make that processing both efficient and intelligent, it is necessary to search for, define, and build the potential links among heterogeneous data. Focusing on this issue, this paper proposes a knowledge-driven method for calculating the semantic similarity between words across two languages (English and Chinese). The method is built on the knowledge base “HowNet”, which defines and maintains the “atom taxonomy tree” and the “semantic dictionary”, a knowledge network that describes the relationships between word concepts and the attributes of those concepts. Compared to other knowledge bases, HowNet pays more attention to concept-based connections between words. In addition, the proposed method analyzes concepts more completely and is more convenient to compute. The non-relational database MongoDB is employed to improve efficiency and to make full use of the rich knowledge maintained in HowNet. Considering both the structure of HowNet and the characteristics of MongoDB, a set of equations is defined to calculate the semantic similarity.

Keywords: HowNet, MongoDB, Semantic similarity, Knowledge driven
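
The abstract does not reproduce the paper's equations, so the following is only a minimal sketch of the general idea of taxonomy-based similarity, not the authors' method. It assumes a distance-based formula often used with HowNet, sim = alpha / (d + alpha), where d is the path length between two sememes in a taxonomy tree. The toy tree, the dictionary entries, the value of `ALPHA`, and all function names are hypothetical; the dictionary is written as plain Python dicts shaped like documents that could be stored in a MongoDB collection.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy: child sememe -> parent sememe (None marks the root).
# These entries are illustrative only and are not taken from HowNet.
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "thing": "entity",
    "animate": "thing",
    "human": "animate",
    "animal": "animate",
    "artifact": "thing",
}

# Hypothetical bilingual dictionary: each English or Chinese word maps to the
# sememes that describe it (a document-like shape, assumed for illustration).
DICTIONARY: Dict[str, List[str]] = {
    "person": ["human"],
    "人": ["human"],
    "dog": ["animal"],
    "狗": ["animal"],
    "tool": ["artifact"],
}

ALPHA = 1.6  # assumed tuning constant, chosen here only for illustration


def path_to_root(sememe: str) -> List[str]:
    """Return the chain of sememes from `sememe` up to the root, inclusive."""
    path = []
    node: Optional[str] = sememe
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def tree_distance(a: str, b: str) -> int:
    """Number of edges on the path between two sememes in the taxonomy tree."""
    steps_from_a = {node: i for i, node in enumerate(path_to_root(a))}
    for steps_from_b, node in enumerate(path_to_root(b)):
        if node in steps_from_a:  # first shared ancestor on b's way up
            return steps_from_a[node] + steps_from_b
    raise ValueError("sememes do not share a root")


def sememe_similarity(a: str, b: str, alpha: float = ALPHA) -> float:
    """Distance-based similarity: 1.0 for identical sememes, lower when farther apart."""
    return alpha / (tree_distance(a, b) + alpha)


def word_similarity(w1: str, w2: str) -> float:
    """Best sememe-to-sememe similarity between two words, in either language."""
    return max(
        sememe_similarity(s1, s2)
        for s1 in DICTIONARY[w1]
        for s2 in DICTIONARY[w2]
    )
```

Under these assumptions, `word_similarity("person", "人")` is 1.0 because both words map to the same sememe, while `word_similarity("dog", "人")` is lower because the two sememes only meet at a shared ancestor. A fuller implementation would also weigh the other attributes of a concept recorded in the semantic dictionary.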

Published: 2019-10-28
Appears in: SpringerLink

DOI link: http://dx.doi.org/10.1007/978-3-030-32388-2_27
