Staff View: Beyond Position Bias: Shifting Context Compression from Position-Driven to Semantic-Driven :: Library Catalog

Bibliographic Details
Main Authors:	Tang, Jiwei; Huang, Zhijing; Zhang, Xinyu; Zhang, Chen Jason; Yu, Jianxing; Zheng, Libin; Meng, Rui; Yin, Jian
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2605.09463

_version_	1866913108133412864
author	Tang, Jiwei; Huang, Zhijing; Zhang, Xinyu; Zhang, Chen Jason; Yu, Jianxing; Zheng, Libin; Meng, Rui; Yin, Jian
author_facet	Tang, Jiwei; Huang, Zhijing; Zhang, Xinyu; Zhang, Chen Jason; Yu, Jianxing; Zheng, Libin; Meng, Rui; Yin, Jian
contents	Large Language Models (LLMs) have demonstrated exceptional performance across diverse tasks. However, their deployment in long-context scenarios suffers from high computational overhead and information redundancy. While soft prompt compression has emerged as a promising way to mitigate these costs by compressing sequences into compact embeddings, existing paradigms remain fundamentally constrained by position bias: they primarily rely on inserting learnable tokens at fixed positions or on grouping tokens according to their physical layout, which induces performance instability and semantic fragmentation. To overcome this bottleneck, we propose Semantic Consistency Context Compression (SeCo), a method that shifts context compression from position-driven to semantic-driven. Rather than being constrained by the physical token layout, SeCo dynamically anchors compression directly in the semantic space by selecting query-relevant tokens as semantic centers and aggregating the remaining tokens via consistency-weighted merging. This design inherently preserves semantic consistency while eliminating position bias. Extensive experiments on 14 benchmarks across two backbone models demonstrate that SeCo is consistently superior in downstream task performance, inference latency, and out-of-domain robustness. The code is available at https://anonymous.4open.science/r/seco-EE5E.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_09463
institution	arXiv
publishDate	2026
record_format	arxiv
title	Beyond Position Bias: Shifting Context Compression from Position-Driven to Semantic-Driven
topic	Computation and Language
url	https://arxiv.org/abs/2605.09463

Similar Items