Power Micro-Blog Text Classification Based on Domain Dictionary and LSTM-RNN

Meng-yao Shen; Jing-sheng Lei; Fei-ye Du; Zhong-qin Bi

Testbeds and Research Infrastructures for the Development of Networks and Communications. 14th EAI International Conference, TridentCom 2019, Changsha, China, December 7-8, 2019, Proceedings

Research Article


@INPROCEEDINGS{10.1007/978-3-030-43215-7_3,
    author={Meng-yao Shen and Jing-sheng Lei and Fei-ye Du and Zhong-qin Bi},
    title={Power Micro-Blog Text Classification Based on Domain Dictionary and LSTM-RNN},
    proceedings={Testbeds and Research Infrastructures for the Development of Networks and Communications. 14th EAI International Conference, TridentCom 2019, Changsha, China, December 7-8, 2019, Proceedings},
    proceedings_a={TRIDENTCOM},
    year={2020},
    month={3},
    keywords={Text classification, Power micro-blog, Domain dictionary, Word vector, Classification accuracy, LSTM-RNN},
    doi={10.1007/978-3-030-43215-7_3}
}

Meng-yao Shen, Jing-sheng Lei, Fei-ye Du, Zhong-qin Bi (2020). Power Micro-Blog Text Classification Based on Domain Dictionary and LSTM-RNN. TRIDENTCOM. Springer. DOI: 10.1007/978-3-030-43215-7_3

Meng-yao Shen¹, Jing-sheng Lei¹, Fei-ye Du¹, Zhong-qin Bi¹,*

1: Shanghai University of Electric Power

*Contact email: zqbi@shiep.edu.cn

Abstract

This work analyzes micro-blog texts from the national grid's provinces and cities, including the micro-blogs themselves and their corresponding comments, to help understand events in the power industry and people’s attitudes towards these events. The data set is composed of 420,000 micro-blog texts. Firstly, the professional vocabulary of electric power is extracted and manually labeled, yielding a new domain dictionary closely related to the power industry. Secondly, the new power domain dictionary is used to classify the 2018 electric power micro-blogs, and classification accuracy increases from 88.7% to 95.2%. Finally, a classification model based on LSTM (Long Short-Term Memory) and RNN (Recurrent Neural Network) is used to process the comments under the micro-blogs. The experimental results show that the LSTM-RNN model reaches a classification accuracy of 83.1%, which is significantly better than the 78.4% and 73.1% of the traditional LSTM and RNN text classification models, respectively.
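
The dictionary-based classification step can be illustrated with a minimal sketch. The abstract does not say how the domain dictionary is applied, so everything here is an assumption made for illustration: the dictionary entries, the category labels, the `classify` function name, and the simple term-counting rule are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical domain dictionary: term -> manually assigned category label.
# These entries are illustrative only and do not come from the paper.
POWER_DICTIONARY = {
    "power outage": "outage",
    "blackout": "outage",
    "transformer": "equipment",
    "transmission line": "equipment",
    "electricity bill": "billing",
    "tariff": "billing",
}


def classify(text, dictionary, default="other"):
    """Return the category whose dictionary terms occur most often in text.

    Assumed rule: every occurrence of a dictionary term is one vote for
    that term's category; texts with no matching term get `default`.
    """
    lowered = text.lower()
    votes = Counter()
    for term, category in dictionary.items():
        hits = lowered.count(term)  # non-overlapping substring matches
        if hits:
            votes[category] += hits
    if not votes:
        return default
    # max() returns the first maximal key, so ties fall to the
    # category that received its first vote earliest.
    return max(votes, key=votes.get)


if __name__ == "__main__":
    print(classify("Blackout reported after a transformer fault", POWER_DICTIONARY))
```

The sketch uses plain substring matching on English toy text. The micro-blogs in the study are presumably Chinese, in which case a word segmentation step would likely come before the dictionary lookup; that step is omitted here.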

Keywords: Text classification, Power micro-blog, Domain dictionary, Word vector, Classification accuracy, LSTM-RNN

Published: 2020-03-05
Appears in: SpringerLink

DOI: http://dx.doi.org/10.1007/978-3-030-43215-7_3
