
Research Article
Crowdturfing Detection in Online Review System: A Graph-Based Modeling
@INPROCEEDINGS{10.1007/978-3-030-92638-0_21,
  author={Qilong Feng and Yue Zhang and Li Kuang},
  title={Crowdturfing Detection in Online Review System: A Graph-Based Modeling},
  proceedings={Collaborative Computing: Networking, Applications and Worksharing. 17th EAI International Conference, CollaborateCom 2021, Virtual Event, October 16-18, 2021, Proceedings, Part II},
  proceedings_a={COLLABORATECOM PART 2},
  year={2022},
  month={1},
  keywords={Crowdturfing detection, Graph-based modeling, Class imbalance solution, Online review system},
  doi={10.1007/978-3-030-92638-0_21}
}
Qilong Feng
Yue Zhang
Li Kuang
Year: 2022
COLLABORATECOM PART 2
Springer
DOI: 10.1007/978-3-030-92638-0_21
Abstract
With the widespread popularity of online reviews and crowdsourcing, people may publish fake comments on online review systems and get paid for doing so through crowdsourcing tasks. Traditional strategies for identifying these reviewers commonly rely on machine learning methods, which struggle to guarantee detection accuracy. In this work, we adopt a modeling method based on graph structure and propose a novel aggregation method called CrowdDet. To this end, we construct two explicit graphs: Reviewer-to-Product and Co-Reviewer. Specifically, we first extract the node features and structural information from the graphs, obtaining the reviewers’ features and their neighborhood-relation features. Secondly, we use an attention-based mechanism to aggregate the reviewer factors in Review-space and Sociality-space, which combines the representations of the reviewer factors from multiple dimensions. Thirdly, we obtain the classification results and optimize the original loss function with Focal loss to alleviate the impact of class imbalance. In the experiments, we verify the proposed scheme on a real dataset and compare it with other methods. The results show that our scheme is effective on the real dataset, achieving a recall of 0.85 or higher. Our research also provides a foundation for resisting malicious behavior that originates from crowdsourcing.
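The abstract names Focal loss as the remedy for class imbalance but gives no formula or hyperparameters. As a minimal sketch, the standard binary form FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t) can be written as below. The binary setting, the function names, and the defaults `gamma=2.0` and `alpha=0.25` are assumptions for illustration and are not taken from the paper.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy, used here only as a baseline for comparison."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)      # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p       # probability assigned to the true class
        total += -math.log(p_t)
    return total / len(probs)


def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-12):
    """Mean binary focal loss.

    probs  : predicted probability of the positive class (assumed here to be
             "crowdturfing reviewer") for each sample
    labels : 1 for positive, 0 for negative
    gamma  : focusing parameter; larger values down-weight easy samples more
    alpha  : weight on the positive class (1 - alpha goes to the negative class)
    All defaults are illustrative assumptions, not values from the paper.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma shrinks the loss of samples that are already well classified
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


if __name__ == "__main__":
    probs = [0.95, 0.10, 0.30, 0.05]   # hypothetical model outputs
    labels = [1, 0, 1, 0]              # hypothetical ground truth
    print("BCE  :", binary_cross_entropy(probs, labels))
    print("Focal:", binary_focal_loss(probs, labels))
```

With `gamma=0` and `alpha=0.5` the expression reduces to half the ordinary cross-entropy, which makes the role of the two parameters easy to check: `gamma` controls how strongly confident predictions are discounted, and `alpha` rebalances the two classes.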