Achieving Budget-optimality with Adaptive Schemes in Crowdsourcing
release_qorrngjnffe7fjkd2lnnjbvwfe
by
Ashish Khetan, Sewoong Oh
2017
Abstract
Crowdsourcing platforms provide marketplaces where task requesters can pay to
get labels on their data. Such markets have emerged recently as popular venues
for collecting annotations that are crucial in training machine learning models
in various applications. However, as jobs are tedious and payments are low,
errors are common in such crowdsourced labels. A common strategy to overcome
such noise in the answers is to add redundancy by getting multiple answers for
each task and aggregating them using a method such as majority voting. For
such a system, there is a fundamental question of interest: how can we maximize
the accuracy given a fixed budget on how many responses we can collect on the
crowdsourcing system? We characterize this fundamental trade-off between the
budget (how many answers the requester can collect in total) and the accuracy
in the estimated labels. In particular, we ask whether adaptive task assignment
schemes lead to a more efficient trade-off between the accuracy and the budget.
Adaptive schemes, where tasks are assigned adaptively based on the data
collected thus far, are widely used in practical crowdsourcing systems to
efficiently use a given fixed budget. However, existing theoretical analyses of
crowdsourcing systems suggest that the gain of adaptive task assignments is
minimal. To bridge this gap, we investigate this question under a strictly more
general probabilistic model, which has been recently introduced to model
practical crowdsourced annotations. Under this generalized Dawid-Skene model,
we characterize the fundamental trade-off between budget and accuracy. We
introduce a novel adaptive scheme that matches this fundamental limit. We
further quantify the fundamental gap between adaptive and non-adaptive schemes,
by comparing the trade-off with the one for non-adaptive schemes. Our analyses
confirm that the gap is significant.
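
As an illustration of the redundancy-plus-aggregation baseline mentioned in the abstract, the following is a minimal sketch of majority voting over repeated answers. It is not the adaptive scheme proposed in the paper. The names `majority_vote`, `aggregate`, and `responses`, the binary labels, and the tie-breaking rule are assumptions made for this example only.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent label among the collected answers.

    Ties are broken by taking the smallest label (an arbitrary choice
    made here only so the result is deterministic).
    """
    counts = Counter(answers)
    top = max(counts.values())
    return min(label for label, count in counts.items() if count == top)


def aggregate(responses):
    """Map each task to its majority-vote label.

    `responses` is a hypothetical dict: task id -> list of worker answers.
    """
    return {task: majority_vote(labels) for task, labels in responses.items()}


# Hypothetical data: two tasks with binary labels in {+1, -1}.
responses = {
    "task_a": [1, 1, -1],
    "task_b": [-1, -1, -1, 1, -1],
}

# The budget in the abstract's sense is the total number of answers collected.
budget_used = sum(len(labels) for labels in responses.values())
estimates = aggregate(responses)
```

Here the budget is the total number of collected answers, and a non-adaptive scheme fixes how many answers each task receives before any data is seen.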
Archived Files and Locations
application/pdf 681.1 kB
file_rpskrwpha5gz5ecpf6tyivg434
arxiv.org (repository), web.archive.org (webarchive)
arXiv:1602.03481v3