Bibliographic details
Main authors: Hu, Kangping; Mussmann, Stephen
Format: Preprint
Published: 2025
Online access: https://arxiv.org/abs/2510.09877
Table of contents:
  • Over the past couple of decades, many active learning acquisition functions have been proposed, leaving practitioners without a clear choice of which to use. Bayesian active learning offers principled objectives with explainable intuition, including Expected Error Reduction (EER), Expected Predictive Information Gain (EPIG), and Bayesian Active Learning by Disagreement (BALD). A key challenge for such methods is that they scale poorly to large batch sizes, leading either to computational challenges (BatchBALD) or to dramatic performance drops (top-$B$ selection). Here, using a particular formulation of Bayesian Decision Theory, we derive Partial Batch Label Sampling (ParBaLS) for the EPIG algorithm. We show experimentally on several datasets that, for a fixed budget, ParBaLS EPIG gives superior performance with Bayesian Logistic Regression on embeddings from large pre-trained models. Our code is available at https://github.com/ADDAPT-ML/ParBaLS.
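
The top-$B$ baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not ParBaLS and is not taken from the authors' repository. It assumes a binary Bayesian logistic regression whose posterior is represented by weight samples, scores each pool point with the standard BALD mutual information (entropy of the mean prediction minus the mean entropy of the per-sample predictions), and keeps the B highest scores. The names `bald_scores`, `top_b`, `X_pool`, and `weight_samples` are hypothetical, and the random data stands in for real embeddings and posterior draws.

```python
import numpy as np


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli distribution with success probability p."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)  # avoid log(0)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def bald_scores(X_pool, weight_samples):
    """BALD score per pool point for binary logistic regression.

    X_pool:         (N, d) array of unlabeled points (e.g. embeddings).
    weight_samples: (S, d) array of posterior weight samples (assumed given).
    Returns an (N,) array: H[mean_s p_s] - mean_s H[p_s].
    """
    logits = X_pool @ weight_samples.T          # (N, S)
    probs = 1.0 / (1.0 + np.exp(-logits))       # per-sample predictive probabilities
    mean_probs = probs.mean(axis=1)             # posterior predictive
    return binary_entropy(mean_probs) - binary_entropy(probs).mean(axis=1)


def top_b(scores, B):
    """Indices of the B highest-scoring points (naive top-B batch selection)."""
    return np.argsort(-scores)[:B]


# Synthetic stand-ins for embeddings and posterior samples (illustration only).
rng = np.random.default_rng(0)
X_pool = rng.normal(size=(50, 3))
weight_samples = rng.normal(size=(200, 3))

scores = bald_scores(X_pool, weight_samples)
batch = top_b(scores, B=5)
```

Because each point is scored independently, top-$B$ selection can return several near-duplicate points whose labels carry overlapping information, which is one commonly cited reason for the performance drop at large batch sizes.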