Staff View: Beyond "I cannot fulfill this request": Alleviating Rigid Rejection in LLMs via Label Enhancement :: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Ying; Qiao, Congyu; Geng, Xin; Xu, Ning
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2605.07883

_version_	1866910201771196416
author	Zhang, Ying; Qiao, Congyu; Geng, Xin; Xu, Ning
author_facet	Zhang, Ying; Qiao, Congyu; Geng, Xin; Xu, Ning
contents	Large Language Models (LLMs) rely on safety alignment to comply with safe requests while refusing harmful ones. However, traditional refusal mechanisms often lead to "rigid rejection," where a generic template (e.g., "I cannot fulfill this request") is applied indiscriminately and severely undermines the naturalness of interactions between humans and LLMs. To address this issue, this paper proposes LANCE, which uses label enhancement to produce responses that are safe yet flexible and natural. Specifically, LANCE employs variational inference to perform label enhancement, predicting a continuous distribution across multiple rejection categories. These fine-grained rejection distributions provide multi-way textual gradients that a refinement model uses to neutralize the hazardous elements in the prompt, so that the LLM can generate safe responses that avoid rigid rejections while preserving the naturalness of the interaction. Experiments demonstrate that LANCE significantly alleviates the rigid rejection problem while maintaining high safety standards, and significantly outperforms existing baseline models in the helpfulness and naturalness of its responses.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_07883
institution	arXiv
publishDate	2026
record_format	arxiv
title	Beyond "I cannot fulfill this request": Alleviating Rigid Rejection in LLMs via Label Enhancement
topic	Computation and Language
url	https://arxiv.org/abs/2605.07883

Similar Items