Incremental Knowledge Acquisition for WSD: A Rough Set and IL based Method

Xu Huang; Xiulan Hao; Qing Shen; Bin Shao

Research Article

Cite (BibTeX):

@ARTICLE{10.4108/sis.2.5.e3,
    author={Xu Huang and Xiulan Hao and Qing Shen and Bin Shao},
    title={Incremental Knowledge Acquisition for WSD: A Rough Set and IL based Method},
    journal={EAI Endorsed Transactions on Scalable Information Systems},
    volume={2},
    number={5},
    publisher={ICST},
    journal_a={SIS},
    year={2015},
    month={7},
    keywords={Rough Set (RS), Instance-based learning (IL), Word Sense Disambiguation (WSD), Knowledge Acquisition, Natural Language Processing (NLP)},
    doi={10.4108/sis.2.5.e3}
}

Xu Huang¹,²,*, Xiulan Hao¹, Qing Shen¹, Bin Shao¹

1: School of Information Engineering, Huzhou University, Huzhou, Zhejiang, 313000, China
2: Department of Control Science and Engineering, Zhejiang University, Hangzhou, Zhejiang, 310058, China

*Contact email: hxzj@zju.edu.cn

Abstract

Word sense disambiguation (WSD) is one of the trickier tasks in natural language processing (NLP) because it must take into account the full complexity of language. Because WSD involves discovering semantic structures in unstructured text, automatic acquisition of word sense knowledge is profoundly difficult. To acquire knowledge about Chinese multi-sense verbs, we introduce an incremental machine learning method that combines the rough set method with instance-based learning. First, the context of a multi-sense verb is extracted into a table; its sense is annotated by a skilled human and stored in the same table. In this way, a decision table is formed, and rules can then be extracted within the framework of rough set attribute value reduction. Instances not entailed by any rule are treated as outliers. When new instances are added to the decision table, only the newly added instances and the outliers need to be learned further, so learning is incremental. Experiments show that this method reduces the scale of the decision table dramatically without a decline in performance.
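
The workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy attributes (a subject type and an object type), the sense labels, and the overlap-based fallback for instance-based learning are all assumptions made for illustration. Value reduction is approximated here by searching, for each instance, for the smallest set of attribute values that is consistent with a single sense in the table. The incremental step also simplifies: it does not re-check old rules against new, contradicting instances.

```python
from itertools import combinations


def matches(cond, attrs):
    """True if every (attribute index, value) pair in cond holds for attrs."""
    return all(attrs[i] == v for i, v in cond)


def extract_rules(instances):
    """Split a decision table into rules and outliers.

    instances: list of (attrs_tuple, sense).
    A rule is the smallest set of attribute values that selects only
    instances of one sense. Instances with no such set are outliers.
    """
    rules, outliers = {}, []
    for attrs, sense in instances:
        found = None
        for size in range(1, len(attrs) + 1):  # smallest conditions first
            for idxs in combinations(range(len(attrs)), size):
                cond = tuple((i, attrs[i]) for i in idxs)
                if all(s == sense for a, s in instances if matches(cond, a)):
                    found = cond
                    break
            if found is not None:
                break
        if found is None:
            outliers.append((attrs, sense))
        else:
            rules[found] = sense
    return rules, outliers


def classify(attrs, rules, outliers):
    """Apply rules first; otherwise fall back to the most similar outlier."""
    for cond, sense in rules.items():
        if matches(cond, attrs):
            return sense
    if outliers:
        # Assumed similarity measure: number of equal attribute values.
        best = max(outliers, key=lambda o: sum(a == b for a, b in zip(o[0], attrs)))
        return best[1]
    return None


def update(rules, outliers, new_instances):
    """Incremental step: relearn only from outliers and uncovered new instances."""
    pending = []
    for attrs, sense in new_instances:
        hits = [s for cond, s in rules.items() if matches(cond, attrs)]
        if hits and all(s == sense for s in hits):
            continue  # already entailed by existing rules
        pending.append((attrs, sense))
    new_rules, new_outliers = extract_rules(outliers + pending)
    return {**rules, **new_rules}, new_outliers


# Hypothetical toy decision table: (subject type, object type) -> sense.
table = [
    (("person", "ball"), "play"),
    (("person", "phone"), "call"),
    (("child", "ball"), "play"),
    (("person", "water"), "fetch"),
    (("person", "water"), "splash"),  # conflicts with the row above
]
rules, outliers = extract_rules(table)
rules2, outliers2 = update(
    rules, outliers,
    [(("child", "ball"), "play"), (("person", "drum"), "beat")],
)
```

In this toy table the two conflicting "water" rows cannot be covered by any consistent rule, so they remain as outliers and are consulted only when no rule fires. After rule extraction, only the rules and the outliers need to be kept, which is the sense in which the decision table shrinks.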

Keywords: Rough Set (RS), Instance-based learning (IL), Word Sense Disambiguation (WSD), Knowledge Acquisition, Natural Language Processing (NLP)

Received: 2015-03-27
Accepted: 2015-05-01
Published: 2015-07-02
Publisher: ICST

DOI: http://dx.doi.org/10.4108/sis.2.5.e3

Copyright © 2015 X. Huang et al., licensed to ICST. This is an open access article distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/3.0/), which permits unlimited use, distribution and reproduction in any medium so long as the original work is properly cited.