Staff View: Robust Truth Inference in Crowdsourcing under Adversarial Attacks via Graph Neural Networks :: Library Catalog

Bibliographic Details
Main Authors:	Dağ, Arif; Sahin, Simay; Karaköse, Mehmet
Format:	Digital resource
Language:
Published:	Zenodo 2026
Online Access:	https://doi.org/10.5281/zenodo.19103155

_version_	1866901765943721984
author	Dağ, Arif; Sahin, Simay; Karaköse, Mehmet
author_facet	Dağ, Arif; Sahin, Simay; Karaköse, Mehmet
contents	Crowdsourcing is widely used to collect labels for machine learning, but open participation also allows spammers, colluders, and Sybil-style attackers to create persuasive yet incorrect consensus. This paper studies robust truth inference under such attacks with a label-aware graph neural network that represents workers and tasks as a bipartite graph. The proposed framework combines edge-label-aware message passing, an auxiliary worker-trust head, and adaptive use of task-content features. Rather than relying on worker-maliciousness labels during training, the primary model is trained only with task supervision and selects between content-enabled and no-content variants on validation data. Evaluation uses a held-out train/validation/test protocol on simulated cifar_binary, imdb, and newsgroups labeling tasks under realistic and oracle threat models. We compare against majority voting, weighted majority voting, Dawid–Skene, the binary KOS baseline where applicable, MMSR, content-only baselines, and collusion/Sybil defenses adapted from prior work. We also validate on two public real crowdsourcing benchmarks, relevance-2 and relevance-5. On these real benchmarks, the adaptive GNN reaches 81.85% and 90.80% accuracy, respectively, and significantly outperforms the classical and robust aggregation baselines considered in this study. In simulation, the method is competitive with the strongest fair content-aware baseline, improves substantially over a fixed-content GNN on newsgroups, and remains stronger than classical crowd-only aggregation on the attack-sensitive cifar_binary setting. Ablation analysis shows that task content helps on cifar_binary and imdb but hurts on newsgroups, motivating adaptive content selection instead of a fixed multimodal design. Overall, the results support a qualified claim: graph-based robust aggregation can work without worker-maliciousness labels, but its gains are dataset-dependent and are strongest when relational evidence and task semantics complement each other.
format	Digital resource
id	zenodo_https___doi_org_10_5281_zenodo_19103155
institution	Zenodo
language
publishDate	2026
publisher	Zenodo
record_format	zenodo
title	Robust Truth Inference in Crowdsourcing under Adversarial Attacks via Graph Neural Networks
url	https://doi.org/10.5281/zenodo.19103155
