Digital Forensics and Cyber Crime. 14th EAI International Conference, ICDF2C 2023, New York City, NY, USA, November 30, 2023, Proceedings, Part I

Research Article

Removing Noise (Opinion Messages) for Fake News Detection in Discussion Forum Using BERT Model

Cite
BibTeX Plain Text
  • @INPROCEEDINGS{10.1007/978-3-031-56580-9_5,
        author={Cheuk Yu Ip and Fu Kay Frankie Li and Yi Anson Lam and Siu Ming Yiu},
        title={Removing Noise (Opinion Messages) for Fake News Detection in Discussion Forum Using BERT Model},
        proceedings={Digital Forensics and Cyber Crime. 14th EAI International Conference, ICDF2C 2023, New York City, NY, USA, November 30, 2023, Proceedings, Part I},
        proceedings_a={ICDF2C},
        year={2024},
        month={4},
        keywords={Fact, Opinion, Text classification, Check-worthy, Fake news, Misinformation, Discussion forum, Lihkg, BERT},
        doi={10.1007/978-3-031-56580-9_5}
    }
    
  • Cheuk Yu Ip
    Fu Kay Frankie Li
    Yi Anson Lam
    Siu Ming Yiu
    Year: 2024
    Removing Noise (Opinion Messages) for Fake News Detection in Discussion Forum Using BERT Model
    ICDF2C
    Springer
    DOI: 10.1007/978-3-031-56580-9_5
Cheuk Yu Ip*, Fu Kay Frankie Li, Yi Anson Lam, Siu Ming Yiu
    *Contact email: lesterip@connect.hku.hk

    Abstract

    The exponential growth and wide spread of fake news in online media pose unprecedented threats to election results, public hygiene and justice. With ever-growing content in online media, scrutinizing every single message could be extremely resource intensive, if not impracticable. However, most messages are the opinions of their authors rather than statements of fact (whether fake or true), and they contribute a significant portion of noise. This paper suggests a cost-effective approach to identifying opinion content (noise) in discussion forums, which cannot be classified as fake or true news. By excluding opinion content that is not check-worthy in the preprocessing step, the cost of detection could be reduced significantly, especially when voluminous content must be handled in a timely manner. This paper built an opinion and factual statement dataset in a mixture of officially written Traditional Chinese from the most popular discussion forum in Hong Kong, namely LIHKG, relating to local Government officials, then used the Bidirectional Encoder Representations from Transformers (BERT) model to identify opinion content. The model achieved 98.7% accuracy and generalized well to public hygiene related content on which it had not been trained. This paper further found that some of the 15 most active LIHKG users creating discussion threads about local Government officials might be troll accounts with underlying purposes, and that assessing their behavior and sentiments might assist in detecting misinformation.
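
    The preprocessing idea in the abstract (drop opinion messages before fake news detection) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the keyword rule `looks_like_opinion` is a hypothetical stand-in for the fine-tuned BERT classifier, and the cue list and sample posts are invented for the example.

```python
from typing import Callable, Iterable, List

# Hypothetical stand-in for a fine-tuned BERT opinion classifier.
# The cue list is illustrative only and is not taken from the paper.
OPINION_CUES = ("i think", "i feel", "in my opinion", "should")


def looks_like_opinion(message: str) -> bool:
    """Return True when the message contains one of the opinion cues."""
    text = message.lower()
    return any(cue in text for cue in OPINION_CUES)


def keep_check_worthy(
    messages: Iterable[str],
    is_opinion: Callable[[str], bool] = looks_like_opinion,
) -> List[str]:
    """Drop opinion messages so only factual claims reach the detector."""
    return [m for m in messages if not is_opinion(m)]


# Invented sample posts, for illustration only.
posts = [
    "I think the minister is doing a terrible job.",
    "The minister announced a new policy on 3 April.",
]
check_worthy = keep_check_worthy(posts)
print(check_worthy)
```

    Passing the classifier in as a function means the keyword rule could be swapped for a real model without changing the filtering step; only messages that survive the filter would then be sent to the more expensive fake news detector.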

    Keywords
    Fact, Opinion, Text classification, Check-worthy, Fake news, Misinformation, Discussion forum, Lihkg, BERT
    Published
    2024-04-03
    Appears in
    SpringerLink
    http://dx.doi.org/10.1007/978-3-031-56580-9_5
    Copyright © 2023–2025 ICST