Bibliographic Details
Main Authors: Hira, Rupkatha; Kau, Dominik; Sorrell, Jessica
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2412.09686
Summary:
  • Active learning aims to reduce the number of labeled data points required by machine learning algorithms by selectively querying labels from initially unlabeled data. Ensuring replicability, where an algorithm produces consistent outcomes across different runs, is essential for the reliability of machine learning models but often increases sample complexity. This paper investigates the cost of replicability in active learning using two classical disagreement-based methods: the CAL and A^2 algorithms. Leveraging randomized thresholding techniques, we propose two replicable active learning algorithms: one for realizable learning of finite hypothesis classes and another for the agnostic setting. Our theoretical analysis shows that while enforcing replicability increases label complexity, CAL and A^2 still achieve substantial label savings under this constraint. These findings provide insights into balancing efficiency and stability in active learning.
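For readers unfamiliar with disagreement-based active learning, the following is a minimal sketch of the classical CAL idea in the realizable setting over a finite hypothesis class. It is not the replicable variant proposed in the paper and omits the randomized thresholding step entirely. The integer threshold class, the pool, the oracle, and the function name `cal` are illustrative assumptions, not taken from the record.

```python
def cal(pool, oracle, thresholds):
    """Classical CAL sketch for threshold classifiers h_t(x) = 1 if x >= t else 0.

    A label is requested only when the surviving hypotheses disagree on x.
    """
    version_space = list(thresholds)  # hypotheses consistent with all labels so far
    queries = 0
    for x in pool:
        predictions = {int(x >= t) for t in version_space}
        if len(predictions) > 1:  # x lies in the disagreement region
            y = oracle(x)
            queries += 1
            # Realizable setting: discard every hypothesis that mislabels x.
            version_space = [t for t in version_space if int(x >= t) == y]
    return version_space, queries


# Hypothetical setup: 11 candidate thresholds, true threshold 50,
# and a deterministic pool that is a shuffled copy of 0..99.
thresholds = list(range(0, 101, 10))
pool = [(i * 37) % 100 for i in range(100)]
version_space, queries = cal(pool, lambda x: int(x >= 50), thresholds)
print(version_space, queries)
```

In this toy run the version space shrinks to the single true threshold after 5 label queries out of 100 pool points, which illustrates the kind of label savings the abstract refers to. The paper's question is how much of that saving survives once replicability is enforced.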