Bibliographic Details
Main Authors: Chen, Weishu; Hou, Zhouhui; Zhan, Mingjie; Zhao, Zhicheng; Su, Fei
Format: Preprint
Published: 2026
Subjects: Information Retrieval; Artificial Intelligence; Computation and Language
Online Access: https://arxiv.org/abs/2604.06176
author Chen, Weishu
Hou, Zhouhui
Zhan, Mingjie
Zhao, Zhicheng
Su, Fei
contents We present an empirical study of embedding-based retrieval under realistic conversational settings, where queries are short, dialogue-like, and weakly specified, and retrieval corpora contain structured conversational artifacts. Focusing on Qwen3-embedding models, we identify a deployment-relevant robustness vulnerability: under conversational retrieval without query prompting, structured dialogue-style noise can become disproportionately retrievable and intrude into top-ranked results, despite being semantically uninformative. This failure mode emerges consistently across model scales, remains largely invisible under standard clean-query benchmarks, and is significantly more pronounced in Qwen3 than in earlier Qwen variants and other widely used dense retrieval baselines. We further show that lightweight query prompting qualitatively alters retrieval behavior, effectively suppressing noise intrusion and restoring ranking stability. Our findings highlight an underexplored robustness risk in conversational retrieval and underscore the importance of evaluation protocols that reflect the complexities of deployed systems.
format Preprint
id arxiv_https___arxiv_org_abs_2604_06176
institution arXiv
publishDate 2026
record_format arxiv
title Robustness Risk of Conversational Retrieval: Identifying and Mitigating Noise Sensitivity in Qwen3-Embedding Model
topic Information Retrieval
Artificial Intelligence
Computation and Language
url https://arxiv.org/abs/2604.06176