Staff View: Feature Weighting Improves Pool-Based Sequential Active Learning for Regression :: Library Catalog

Bibliographic Details
Main Author:	Wu, Dongrui
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2604.02019

_version_	1866917456159703040
author	Wu, Dongrui
author_facet	Wu, Dongrui
contents	Pool-based sequential active learning for regression (ALR) optimally selects a small number of samples sequentially from a large pool of unlabeled samples to label, so that a more accurate regression model can be constructed under a given labeling budget. Representativeness and diversity, which involve computing the distances among different samples, are important considerations in ALR. However, previous ALR approaches do not incorporate the importance of different features in inter-sample distance computation, resulting in inaccurate distances and hence sub-optimal sample selection. This paper proposes four feature weighted single-task ALR approaches and three feature weighted multi-task ALR approaches, where the ridge regression coefficients trained from a small amount of previously labeled samples are used to weight the corresponding features in inter-sample distance computation. Extensive experiments showed that this intuitive and easy-to-implement enhancement almost always improves the performance of five existing ALR approaches, in both single-task and multi-task regression problems. The feature weighting strategy may also be easily extended to stream-based ALR and to classification algorithms.
format	Preprint
id	arxiv_https___arxiv_org_abs_2604_02019
institution	arXiv
publishDate	2026
record_format	arxiv
title	Feature Weighting Improves Pool-Based Sequential Active Learning for Regression
topic	Machine Learning
url	https://arxiv.org/abs/2604.02019
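
The idea summarized in the contents field, using ridge regression coefficients fitted on the labeled samples to weight features in inter-sample distances, could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the record does not say how coefficients become weights (absolute values are assumed here), it does not name the five base ALR approaches (a simple farthest-from-labeled selection rule stands in for them), and the function names `ridge_coefficients`, `weighted_distances`, and `select_next` are hypothetical.

```python
import numpy as np


def ridge_coefficients(X, y, alpha=1.0):
    """Closed-form ridge regression on centered data; one coefficient per feature."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    d = X.shape[1]
    return np.linalg.solve(Xc.T @ Xc + alpha * np.eye(d), Xc.T @ yc)


def weighted_distances(A, B, w):
    """Pairwise Euclidean distances after scaling each feature difference by w."""
    diff = (A[:, None, :] - B[None, :, :]) * w
    return np.sqrt((diff ** 2).sum(axis=2))


def select_next(X_pool, labeled_idx, y_labeled, alpha=1.0):
    """Pick the unlabeled sample farthest from its nearest labeled sample.

    Assumption: feature weights are the absolute ridge coefficients fitted
    on the currently labeled samples. The selection rule is illustrative only.
    """
    w = np.abs(ridge_coefficients(X_pool[labeled_idx], y_labeled, alpha))
    unlabeled = np.setdiff1d(np.arange(len(X_pool)), labeled_idx)
    D = weighted_distances(X_pool[unlabeled], X_pool[labeled_idx], w)
    return unlabeled[np.argmax(D.min(axis=1))]


# Toy pool: only the first feature drives the labels of the labeled samples.
X_pool = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [0.0, 100.0], [10.0, 0.0]])
labeled_idx = np.array([0, 1, 2])
y_labeled = np.array([0.0, 1.0, 2.0])
next_idx = select_next(X_pool, labeled_idx, y_labeled)
```

In this toy pool the second feature gets zero weight, so the sample at index 3, which differs from the labeled samples only in that feature, is no longer treated as distant, and the sample at index 4 is selected instead. With unweighted distances the same rule would pick index 3.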

Similar Items