Research Article
A Method of Calculating the Semantic Similarity Between English and Chinese Concepts
@INPROCEEDINGS{10.1007/978-3-030-32388-2_27,
  author={Jingwen Cao and Tiexin Wang and Wenxin Li and Chuanqi Tao},
  title={A Method of Calculating the Semantic Similarity Between English and Chinese Concepts},
  proceedings={Machine Learning and Intelligent Communications. 4th International Conference, MLICOM 2019, Nanjing, China, August 24--25, 2019, Proceedings},
  proceedings_a={MLICOM},
  year={2019},
  month={10},
  keywords={HowNet, MongoDB, Semantic similarity, Knowledge driven},
  doi={10.1007/978-3-030-32388-2_27}
}
Authors: Jingwen Cao, Tiexin Wang, Wenxin Li, Chuanqi Tao
Year: 2019
MLICOM
Springer
DOI: 10.1007/978-3-030-32388-2_27
Abstract
In the big data era, processing data and information is a common concern across diverse fields. To make this processing both efficient and intelligent, it is necessary to discover, define, and build the potential links among heterogeneous data. Addressing this issue, this paper proposes a knowledge-driven method for calculating the semantic similarity between English and Chinese words. The method is built on the knowledge base “HowNet”, which defines and maintains the “atom taxonomy tree” and the “semantic dictionary”, a knowledge network that describes the relationships between word concepts and the attributes of those concepts. Compared with other knowledge bases, HowNet pays more attention to concept-based connections between words. In addition, this method analyzes concepts more completely and is more convenient to compute. The non-relational database MongoDB is employed to improve efficiency and to make full use of the rich knowledge maintained in HowNet. Taking into account both the structure of HowNet and the characteristics of MongoDB, a set of equations is defined to calculate the semantic similarity.
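The abstract does not reproduce the paper's equations, so the following is only a minimal sketch of the general idea of taxonomy-based similarity, not the authors' method. It assumes a simple distance-based form, sim = α / (d + α), where d is the path length between two atoms (sememes) in a tree. The tree, the bilingual dictionary entries, the value of `ALPHA`, and all function names are hypothetical, and plain Python dictionaries stand in for the MongoDB collections the paper would query.

```python
# Toy atom (sememe) taxonomy: child -> parent, None marks the root.
# Invented for illustration; this is not HowNet data.
PARENT = {
    "entity": None,
    "thing": "entity",
    "animate": "thing",
    "human": "animate",
    "animal": "animate",
    "artifact": "thing",
}

# Toy bilingual semantic dictionary: word -> atoms describing it (hypothetical).
# In a MongoDB-backed version, each entry would be one document.
DICTIONARY = {
    "teacher": ["human"],
    "教师": ["human"],
    "dog": ["animal"],
    "工具": ["artifact"],
}

ALPHA = 1.6  # smoothing parameter; an assumed value, not taken from the paper


def path_to_root(atom):
    """Return the list of nodes from `atom` up to the root, inclusive."""
    path = []
    while atom is not None:
        path.append(atom)
        atom = PARENT[atom]
    return path


def atom_distance(a, b):
    """Number of edges between two atoms via their lowest common ancestor."""
    steps_from_b = {node: i for i, node in enumerate(path_to_root(b))}
    for steps_from_a, node in enumerate(path_to_root(a)):
        if node in steps_from_b:
            return steps_from_a + steps_from_b[node]
    raise ValueError("atoms do not share a root")


def atom_similarity(a, b):
    """Distance-based similarity in (0, 1]; identical atoms score 1.0."""
    return ALPHA / (atom_distance(a, b) + ALPHA)


def word_similarity(w1, w2):
    """Best similarity over all atom pairs of the two words."""
    return max(
        atom_similarity(a, b)
        for a in DICTIONARY[w1]
        for b in DICTIONARY[w2]
    )


if __name__ == "__main__":
    for pair in [("teacher", "教师"), ("teacher", "dog"), ("dog", "工具")]:
        print(pair, round(word_similarity(*pair), 4))
```

Because both the English and the Chinese words map onto the same atoms, a cross-lingual comparison reduces to a tree-distance computation. In this toy data, "teacher" and "教师" share the atom `human` and therefore score 1.0.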