Abstract

Stream-based active learning has been studied intensively in recent years. In this work, we propose a new algorithm for stream-based active learning that decides immediately, as each instance arrives, whether to acquire its label (selective sampling). The algorithm uses Probabilistic Active Learning (PAL) to measure the spatial usefulness of each instance in the stream. To determine whether a newly arrived instance is among the most useful ones (temporal usefulness) under a predefined budget, we propose BIQF, a Balanced Incremental Quantile Filter. BIQF uses a sliding window to represent the distribution of the most recent usefulness values and derives a labeling threshold from its quantiles. A balancing mechanism ensures that the predefined budget is met within a given tolerance window. We evaluate our approach against other stream-based active learning approaches on multiple datasets. The results confirm the effectiveness of our method.
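The filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the BIQF formulation from the paper: the class name `QuantileBudgetFilter`, the parameter names, and the linear balancing rule are assumptions chosen for illustration, and the usefulness scores are taken as given floats rather than computed by PAL.

```python
from collections import deque


class QuantileBudgetFilter:
    """Illustrative sliding-window quantile filter with budget balancing.

    Hypothetical simplification; not the paper's exact BIQF rule.
    """

    def __init__(self, budget, window_size=100, tolerance_window=50):
        self.budget = budget                      # target fraction of labeled instances
        self.tolerance_window = tolerance_window  # allowed deviation, in number of labels
        self.window = deque(maxlen=window_size)   # most recent usefulness values
        self.seen = 0                             # instances observed so far
        self.acquired = 0                         # labels acquired so far

    def should_label(self, usefulness):
        """Decide immediately whether to acquire the label of the current instance."""
        self.window.append(usefulness)
        self.seen += 1
        ranked = sorted(self.window)

        # Threshold: the (1 - budget) quantile of the recent usefulness values.
        idx = min(int(len(ranked) * (1.0 - self.budget)), len(ranked) - 1)
        threshold = ranked[idx]

        # Balancing (assumed linear rule): a positive deficit means fewer labels
        # were acquired than the budget allows, so the threshold is lowered;
        # a negative deficit raises it.
        deficit = self.seen * self.budget - self.acquired
        spread = ranked[-1] - ranked[0]
        threshold -= spread * deficit / self.tolerance_window

        decision = usefulness >= threshold
        if decision:
            self.acquired += 1
        return decision
```

Calling `should_label` once per arriving instance yields an immediate acquire-or-skip decision. In this sketch, once the deficit reaches the tolerance window the threshold falls to the window minimum and every instance is labeled, and once the surplus reaches it the threshold rises to the window maximum, which keeps the number of acquired labels close to `budget * seen`.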

Learning curves

Complete Pseudocode and Description

Probabilistic Active Learning in Datastreams
