Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Zheng, Weidong; Chen, Kongyang; Guo, Yuanwei; Xiao, Yatie
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Cryptography and Security
Online Access:	https://arxiv.org/abs/2605.08730

_version_	1866910204733423616
author	Zheng, Weidong; Chen, Kongyang; Guo, Yuanwei; Xiao, Yatie
author_facet	Zheng, Weidong; Chen, Kongyang; Guo, Yuanwei; Xiao, Yatie
contents	Class-level machine unlearning aims to remove the influence of specified classes while preserving model utility on retained classes. Existing methods are commonly evaluated by retain-set accuracy, forget-set accuracy, and unlearning time, but these metrics provide limited insight into how forgetting is achieved internally. In this paper, we reveal a bias-dominated shortcut in class-level unlearning: the prediction of forgotten classes can be suppressed by decreasing the corresponding bias terms in the final classification head. We first analyze the gradient dynamics of classification-head biases under softmax cross-entropy training, explaining why retain-set-only optimization tends to reduce the biases of absent classes. Based on this observation, we introduce BiasShift as a diagnostic baseline, showing that simple bias manipulation can satisfy conventional unlearning metrics while leaving abnormal bias patterns that reveal forgotten labels. To mitigate excessive forgotten-class bias suppression, we propose two bias-aware mechanisms, namely the Two-Stage Bias Gradient Reversal Mechanism (TS-BGRM) and Lower-Bound Hinge Regularization (LB-HR). We further introduce three bias-oriented metrics, namely the Bias Stability Coefficient (BSC), Median Bias Gap (MBG), and Minimal Bias Score (MBS), to quantify bias dependence and potential leakage. Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet demonstrate that the proposed methods maintain competitive unlearning performance while producing more stable bias distributions. We have released our code at https://github.com/zwd2024/Beyond-the-Shadow-of-Bias-From-Classification-Head-Bias-to-Parameter-Redistribution.
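
Note on the abstract's gradient argument: under softmax cross-entropy, the gradient of the loss with respect to the bias of class k is the mean of (p_k - y_k) over the batch. For a class that never appears in the retain set, y_k is always 0, so the gradient is strictly positive and gradient descent keeps lowering that bias. The following is a minimal NumPy sketch of that effect on a toy bias-only model. It is not the authors' code; the function names, the random fixed logits, and all hyperparameters are assumptions chosen only for illustration.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def bias_gradient(fixed_logits, bias, labels):
    # Gradient of mean softmax cross-entropy w.r.t. the bias: mean(p - y).
    probs = softmax(fixed_logits + bias)
    one_hot = np.eye(bias.size)[labels]
    return (probs - one_hot).mean(axis=0)

def simulate_retain_only(num_classes=5, forgotten=4, steps=200, lr=0.5, seed=0):
    # Hypothetical toy setup: only the bias is trained, and the
    # "retain set" contains no samples of the forgotten class.
    rng = np.random.default_rng(seed)
    retained = np.array([c for c in range(num_classes) if c != forgotten])
    labels = rng.choice(retained, size=256)
    fixed_logits = rng.normal(size=(256, num_classes))
    bias = np.zeros(num_classes)
    for _ in range(steps):
        bias -= lr * bias_gradient(fixed_logits, bias, labels)
    return bias

if __name__ == "__main__":
    print(simulate_retain_only())
```

In this toy run the bias of the absent class is pushed below zero while the retained-class biases absorb the difference, which is the kind of pattern the abstract describes as revealing forgotten labels.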
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_08730
institution	arXiv
publishDate	2026
record_format	arxiv
title	Classification-Head Bias in Class-Level Machine Unlearning: Diagnosis, Mitigation, and Evaluation
topic	Machine Learning; Cryptography and Security
url	https://arxiv.org/abs/2605.08730
