Learning and Feature Selection under Budget Constraints in Crowdsourcing

Besmira Nushi; Adish Singla; Andreas Krause; Donald Kossmann

doi:10.1609/hcomp.v4i1.13278

Authors

Besmira Nushi ETH Zurich
Adish Singla ETH Zurich
Andreas Krause ETH Zurich
Donald Kossmann ETH Zurich and Microsoft Research Redmond

DOI:

https://doi.org/10.1609/hcomp.v4i1.13278

Keywords:

crowdsourcing, budgeted learning, feature selection

Abstract

The cost of data acquisition limits the amount of labeled data available for machine learning algorithms, at both the training and the testing phases. This problem is further exacerbated in real-world crowdsourcing applications, where labels are aggregated from multiple noisy answers. We tackle classification problems where the underlying feature labels are unknown to the algorithm and a (noisy) label of the desired feature can be acquired at a fixed cost. This problem has two types of budget constraints: the total cost of feature labels available for learning at the training phase, and the cost of features to use during the testing phase for classification. We propose a novel budgeted learning and feature selection algorithm, B-LEAFS, for jointly tackling this problem in the presence of noise. Experimental evaluation on synthetic and real-world crowdsourcing data demonstrates the practical applicability of our approach.
