Testbeds and Research Infrastructure: Development of Networks and Communities. 9th International ICST Conference, TridentCom 2014, Guangzhou, China, May 5-7, 2014, Revised Selected Papers

Research Article

Refined Feature Extraction for Chinese Question Classification in CQA

  • @INPROCEEDINGS{10.1007/978-3-319-13326-3_30,
        author={Lei Su and Bin Yang and Xiangxiang Qi and Yantuan Xian},
        title={Refined Feature Extraction for Chinese Question Classification in CQA},
        proceedings={Testbeds and Research Infrastructure: Development of Networks and Communities. 9th International ICST Conference, TridentCom 2014, Guangzhou, China, May 5-7, 2014, Revised Selected Papers},
        proceedings_a={TRIDENTCOM},
        year={2014},
        month={11},
        keywords={Community-based Question Answering, Wikipedia, Question Classification, Semantic Knowledge},
        doi={10.1007/978-3-319-13326-3_30}
    }
    
  • Lei Su
    Bin Yang
    Xiangxiang Qi
    Yantuan Xian
    Year: 2014
    Refined Feature Extraction for Chinese Question Classification in CQA
    TRIDENTCOM
    Springer
    DOI: 10.1007/978-3-319-13326-3_30
Lei Su1,*, Bin Yang1,*, Xiangxiang Qi1,*, Yantuan Xian1,*
  • 1: Kunming University of Science and Technology
*Contact email: s28341@hotmail.com, yangbin0724@126.com, qixiangfighting@163.com, yantuan.xian@gmail.com

Abstract

Community-based Question Answering (CQA) services, such as Baidu Zhidao, have attracted increasing attention in recent years. In these services, users voluntarily post questions and obtain answers from other users in the community. The question classification module of a CQA system plays an important role in understanding user intent, which helps the system identify similar questions and retrieve candidate answers. However, because questions are short sentences, they carry little semantic information. This paper proposes a refined feature extraction method for question classification. The method uses Wikipedia to expand the semantic knowledge of sentences and extracts features step by step to compensate for the lack of semantic knowledge. Experimental results on 714,582 Chinese questions crawled from Baidu Knows show that the proposed method can effectively improve the performance of question classification in CQA.