Bibliographic Details
Main Authors: Shi, Haoming, Feng, Yang, Liu, Xiaoqian
Format: Preprint
Published: 2026
Subjects: Methodology
Online Access: https://arxiv.org/abs/2604.06659
author Shi, Haoming
Feng, Yang
Liu, Xiaoqian
contents High-dimensional data in modern applications, such as COVID-19 mortality, often span multiple domains. Leveraging auxiliary information from source domains to improve performance in a target domain motivates the use of transfer learning. However, a practical issue that has been overlooked is data contamination, which induces heterogeneity and can significantly degrade transfer learning performance. To address this challenge, we propose a novel approach that tackles transfer learning under data contamination within a structured regression setting. By employing the robust L2E criterion, we develop the TransL2E method that accounts for contamination in both target and source data while effectively transferring relevant information. Beyond robust estimation, TransL2E introduces a data-driven bi-level source detection mechanism, operating at both individual and cohort levels, which possesses multiple advantages over existing source detection approaches. Comprehensive simulation studies and a real data application demonstrate the superior performance of TransL2E in both robust estimation and structure recovery in the presence of data limitation and contamination.
format Preprint
id arxiv_https___arxiv_org_abs_2604_06659
institution arXiv
publishDate 2026
record_format arxiv
title Transfer Learning for Robust Structured Regression with Bi-level Source Detection
topic Methodology
url https://arxiv.org/abs/2604.06659