
Research Article
Incorporating Feature Labeling into Crowdsourcing for More Accurate Aggregation Labels
@INPROCEEDINGS{10.1007/978-3-031-24386-8_17,
  author={Yili Fang and Zhaoqi Pei and Xinyi Ding and Wentao Xu and Tao Han},
  title={Incorporating Feature Labeling into Crowdsourcing for More Accurate Aggregation Labels},
  proceedings={Collaborative Computing: Networking, Applications and Worksharing. 18th EAI International Conference, CollaborateCom 2022, Hangzhou, China, October 15-16, 2022, Proceedings, Part II},
  proceedings_a={COLLABORATECOM PART 2},
  year={2023},
  month={1},
  keywords={Crowdsourcing, Answer collection, Answer aggregation, Probability graph model},
  doi={10.1007/978-3-031-24386-8_17}
}
Yili Fang
Zhaoqi Pei
Xinyi Ding
Wentao Xu
Tao Han
Year: 2023
COLLABORATECOM PART 2
Springer
DOI: 10.1007/978-3-031-24386-8_17
Abstract
Crowdsourcing is a popular way of collecting crowd wisdom and has been deployed in various scenarios. Effective answer collection and answer aggregation are two important crowdsourcing topics, as workers may give incorrect responses. For difficult tasks, workers tend to implicitly use task-related information during answer collection, and this information could play an important role in aggregating high-quality results. For example, identifying the size and hair style of a dog in a picture is a simple and necessary prerequisite step for dog breed labeling. However, most existing methods ignore this task-related information and fail to achieve high-quality data.
In this study, we propose a framework that incorporates both the workers' answers to the corresponding tasks and their labels for object features, which we believe are critical task-related information for answer aggregation. We then propose a novel generative probabilistic graphical model that infers the task answers by exploiting the feature labels as well as worker ability and worker responses. We use the EM algorithm to estimate the model parameters and infer the true answers. Experimental results demonstrate that incorporating task-related information can greatly improve the accuracy of answer aggregation. Compared with state-of-the-art methods that ignore this information, our methods achieve an improvement in accuracy of about 15.9% to 36.8%.
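The abstract does not specify the model, so the following is only a minimal sketch of EM-based answer aggregation in general, not the authors' method. It assumes a simplified "one-coin" worker model (each worker is correct with a single probability, and errors are spread uniformly over the other classes), a uniform prior over true answers, and no feature labels at all. The names `em_aggregate`, `responses`, and `demo_responses` are hypothetical and chosen for illustration.

```python
import math


def em_aggregate(responses, num_classes, iterations=20):
    """Aggregate crowd labels with EM under an assumed one-coin worker model.

    responses: list of (task_id, worker_id, label), labels in range(num_classes).
    Requires num_classes >= 2. Returns (answers, ability).
    """
    tasks = sorted({t for t, _, _ in responses})
    workers = sorted({w for _, w, _ in responses})

    # Initialise the posterior over true labels from per-task vote shares.
    posterior = {t: [0.0] * num_classes for t in tasks}
    for t, _, label in responses:
        posterior[t][label] += 1.0
    for t in tasks:
        total = sum(posterior[t])
        posterior[t] = [v / total for v in posterior[t]]

    ability = {w: 0.5 for w in workers}
    for _ in range(iterations):
        # M-step: ability is the expected fraction of correct answers per worker.
        correct = {w: 0.0 for w in workers}
        count = {w: 0 for w in workers}
        for t, w, label in responses:
            correct[w] += posterior[t][label]
            count[w] += 1
        for w in workers:
            # Clamp away from 0 and 1 so the logarithms below stay finite.
            ability[w] = min(max(correct[w] / count[w], 1e-3), 1.0 - 1e-3)

        # E-step: recompute the posterior over each task's true label.
        log_post = {t: [0.0] * num_classes for t in tasks}
        for t, w, label in responses:
            p_right = ability[w]
            p_wrong = (1.0 - p_right) / (num_classes - 1)
            for k in range(num_classes):
                log_post[t][k] += math.log(p_right if k == label else p_wrong)
        for t in tasks:
            peak = max(log_post[t])
            weights = [math.exp(v - peak) for v in log_post[t]]
            norm = sum(weights)
            posterior[t] = [v / norm for v in weights]

    answers = {t: max(range(num_classes), key=lambda k: posterior[t][k])
               for t in tasks}
    return answers, ability


# Toy data (made up): workers "a" and "b" agree, worker "c" always disagrees.
demo_responses = [
    (0, "a", 0), (0, "b", 0), (0, "c", 1),
    (1, "a", 1), (1, "b", 1), (1, "c", 0),
    (2, "a", 0), (2, "b", 0), (2, "c", 1),
    (3, "a", 1), (3, "b", 1), (3, "c", 0),
]
demo_answers, demo_ability = em_aggregate(demo_responses, num_classes=2)
```

On this toy input the estimated ability of the dissenting worker falls below that of the two agreeing workers, which is the kind of signal a plain majority vote does not expose. A model in the spirit of the abstract would additionally condition the true answer on workers' feature labels, which this sketch omits.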