Proceedings of the 4th International Conference on Science and Technology Applications, ICoSTA 2022, 1-2 November 2022, Medan, North Sumatera Province, Indonesia

Research Article

Computational Linguistics Using Latent Dirichlet Allocation for Topic Modeling on Wattpad Review

  • @INPROCEEDINGS{10.4108/eai.1-11-2022.2326169,
        author={Rachma Awantina and Wahyu Wibowo},
        title={Computational Linguistics Using Latent Dirichlet Allocation for Topic Modeling on Wattpad Review},
        proceedings={Proceedings of the 4th International Conference on Science and Technology Applications, ICoSTA 2022, 1-2 November 2022, Medan, North Sumatera Province, Indonesia},
        publisher={EAI},
        proceedings_a={ICOSTA},
        year={2023},
        month={1},
        keywords={latent dirichlet allocation topic model user review wattpad},
        doi={10.4108/eai.1-11-2022.2326169}
    }
    
Rachma Awantina¹,*, Wahyu Wibowo¹
  • 1: Department of Business Statistics, Institut Teknologi Sepuluh Nopember, Building TC, 2nd Floor, ITS Campus, Sukolilo, Surabaya, 60111
*Contact email: rachma905@gmail.com

Abstract

The digital era has put everyday needs, including reading and writing, at people's fingertips. Digital novel applications make it easier for readers to access novels on their devices and give writers a platform for self-publishing. Wattpad, the most popular digital novel application on the Google Play Store with more than one hundred million downloads, can be downloaded for free and offers a variety of features. The purpose of this research is to model the topics found in user reviews of the Wattpad application. The data were collected from the Google Play Store by web scraping. Topic modeling is performed with Latent Dirichlet Allocation (LDA), an unsupervised learning method that is effective at finding distinct topics in a collection of documents, in which the documents are the observed objects while the word distributions form a hidden structure. The topic model yields sets of words that represent the main topics, so that people can understand the latest information about Wattpad.
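
As a rough illustration of how LDA recovers hidden topics from observed documents, the following Python sketch implements a small collapsed Gibbs sampler. It is not the authors' implementation. The toy reviews, the function names `fit_lda` and `top_words`, and the settings (`alpha`, `beta`, the number of topics, the iteration count) are all assumptions chosen for illustration, and the preprocessing applied to the scraped reviews is not shown.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Returns (vocab, phi, theta): phi[k][v] is the probability of word v
    under topic k, theta[d][k] is the share of topic k in document d.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # counts per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # counts per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []                                            # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for w in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token (unnormalized).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed estimates of the hidden distributions.
    phi = [
        [(topic_word[k][v] + beta) / (topic_total[k] + vocab_size * beta)
         for v in range(vocab_size)]
        for k in range(num_topics)
    ]
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return vocab, phi, theta


def top_words(phi, vocab, topic, n=3):
    """Return the n most probable words of one topic."""
    ranked = sorted(range(len(vocab)), key=lambda v: phi[topic][v], reverse=True)
    return [vocab[v] for v in ranked[:n]]


# Hypothetical, already tokenized reviews (not taken from the paper's data).
reviews = [
    ["story", "love", "reading", "story", "great"],
    ["ads", "annoying", "ads", "premium", "pay"],
    ["love", "great", "story", "author"],
    ["pay", "coins", "premium", "ads"],
    ["app", "crash", "update", "slow"],
]

vocab, phi, theta = fit_lda(reviews, num_topics=2)
for k in range(2):
    print(f"topic {k}:", top_words(phi, vocab, k))
```

Each row of `phi` is a topic's distribution over the vocabulary, and the highest-probability words in a row are the words used to label that topic. Which words end up in which topic depends on the data, the seed, and the chosen number of topics, so results from this toy corpus say nothing about the paper's findings.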