Research Article
Indexing Based on Topic Modeling and MATHML for Building Vietnamese Technical Document Retrieval Effectively
@INPROCEEDINGS{10.1007/978-3-319-29236-6_31,
  author={Tuan Xuan and Linh Khanh and Hung Trung and Ha Thu and Tinh Thanh},
  title={Indexing Based on Topic Modeling and MATHML for Building Vietnamese Technical Document Retrieval Effectively},
  proceedings={Context-Aware Systems and Applications. 4th International Conference, ICCASA 2015, Vung Tau, Vietnam, November 26-27, 2015, Revised Selected Papers},
  proceedings_a={ICCASA},
  year={2016},
  month={4},
  keywords={Mathml Topic modeling Vietnamese technical text Search engine Information retrieval},
  doi={10.1007/978-3-319-29236-6_31}
}
Tuan Xuan
Linh Khanh
Hung Trung
Ha Thu
Tinh Thanh
Year: 2016
ICCASA
Springer
DOI: 10.1007/978-3-319-29236-6_31
Abstract
The growth of data on the Internet has given people access to a great deal of information, and it has also raised some important problems in information retrieval… Alongside this growth, search engines have been developed to serve users' needs. Users can retrieve information by content, by keyword, or by whatever else they need. However, the volume of data on the Internet is so large that a single query often returns millions or hundreds of millions of results. As a result, in a narrow field it is difficult to find related information, especially technical information that contains formulas. In this paper, we present a method for building Vietnamese technical text retrieval that uses topic modeling and MathML for indexing. The system was built and tested on more than 500 Vietnamese technical texts, and the results show that it satisfies users' requirements for accuracy and speed.