15–19 Sept 2024
Leuven, Belgium
Europe/Berlin timezone

Which query strategy should one choose in active learning of real-valued functions when the model class is possibly misspecified?

17 Sept 2024, 10:05
20m
Conference room 3

Machine Learning Regression

Speaker

Bernhard Spangl (University of Natural Resources and Life Sciences, Vienna)

Description

We discuss the problem of active learning in regression scenarios. In active learning, the goal is to provide criteria that the learning algorithm can employ to improve its performance by actively selecting data that are most informative.

Active learning is usually thought of as being a sequential process where the training set is augmented one data point at a time. Additionally, it is assumed that an experiment to gain a label $y$ for an instance $x$ is costly but computation is cheap.

However, in some application areas, e.g. in biotechnology, selecting queries one at a time may be inefficient. Hence, we focus on batch-mode active learning, which allows the learner to query instances in groups.

We restrict ourselves to a pool-based sampling scenario and investigate several query strategies, namely uncertainty sampling, committee-based approaches, and variance reduction. Each strategy actively selects the instances of the input variables $x$ that should be labelled and incorporated into the training set, and we study their behaviour when the model class is possibly misspecified.
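As an illustration of how a committee-based strategy can be used in a pool-based, batch-mode setting, the following is a minimal sketch and not the implementation used in the talk. It assumes a bootstrap committee of straight-line fits, scores each pool point by the variance of the committee's predictions, and queries the highest-scoring points as one batch. The function names (`fit_linear`, `predict_linear`, `qbc_batch`), the sine-shaped target, and all parameter values are hypothetical choices made for this example.

```python
import numpy as np

def fit_linear(X, y):
    # Least-squares fit of y ≈ a + b*x. The straight line is an assumed,
    # deliberately simple model class that is misspecified for the data below.
    A = np.column_stack([np.ones_like(X), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return coef[0] + coef[1] * X

def qbc_batch(X_lab, y_lab, X_pool, batch_size, n_members=20, rng=None):
    """Return indices into X_pool of the batch_size points on which a
    bootstrap committee disagrees most (largest prediction variance)."""
    rng = np.random.default_rng(rng)
    preds = np.empty((n_members, X_pool.size))
    for m in range(n_members):
        # Each committee member is fitted to a bootstrap resample of the labelled set.
        idx = rng.integers(0, X_lab.size, size=X_lab.size)
        preds[m] = predict_linear(fit_linear(X_lab[idx], y_lab[idx]), X_pool)
    disagreement = preds.var(axis=0)
    # Indices of the largest disagreement scores, in descending order.
    return np.argsort(disagreement)[-batch_size:][::-1]

# Hypothetical toy problem: a sine-shaped target fitted with a straight line.
rng = np.random.default_rng(0)
X_pool = np.linspace(-1.0, 1.0, 200)          # unlabelled pool
X_lab = rng.uniform(-1.0, 1.0, size=10)       # initial labelled inputs
y_lab = np.sin(3 * X_lab) + 0.1 * rng.standard_normal(10)

batch = qbc_batch(X_lab, y_lab, X_pool, batch_size=5, rng=1)
```

Taking the top-scoring points ignores diversity within the batch, so the selected points may cluster in one region of the input space. A batch-mode strategy would typically need some additional mechanism to spread the queries out.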

Using toy and real data sets, we compare all active selection strategies with the passive strategy, which selects the next input points at random from the unlabelled examples, and we present the results of our numerical studies.

Type of presentation Talk
Classification Mainly methodology
Keywords Active Learning; Regression; Misspecification of Models

Primary author

Bernhard Spangl (University of Natural Resources and Life Sciences, Vienna)
