Bibliographic Details
Main Authors: Tang, Jiwei; Huang, Zhijing; Zhang, Xinyu; Zhang, Chen Jason; Yu, Jianxing; Zheng, Libin; Meng, Rui; Yin, Jian
Format: Preprint
Published: 2026
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2605.09463
_version_ 1866913108133412864
author Tang, Jiwei
Huang, Zhijing
Zhang, Xinyu
Zhang, Chen Jason
Yu, Jianxing
Zheng, Libin
Meng, Rui
Yin, Jian
contents Large Language Models (LLMs) have demonstrated exceptional performance across diverse tasks. However, their deployment in long-context scenarios faces high computational overhead and information redundancy. While soft prompt compression has emerged as a promising way to mitigate these costs by compressing sequences into compact embeddings, existing paradigms remain fundamentally constrained by position bias: they primarily insert learnable tokens at fixed positions or group tokens according to their physical layout, which induces performance instability and semantic fragmentation. To overcome this bottleneck, we propose Semantic Consistency Context Compression (SeCo), a method that shifts context compression from position-driven to semantic-driven. Rather than being constrained by physical token layout, SeCo dynamically anchors compression directly in the semantic space by selecting query-relevant tokens as semantic centers and aggregating the remaining tokens via consistency-weighted merging. This design inherently preserves semantic consistency while eliminating position bias. Extensive experiments on 14 benchmarks across two backbone models demonstrate that SeCo is consistently superior in downstream task performance, inference latency, and out-of-domain robustness. The code is available at https://anonymous.4open.science/r/seco-EE5E.
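The two steps named in the abstract (selecting query-relevant tokens as semantic centers, then merging the remaining tokens into them) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine` and `compress`, the use of cosine similarity for both query relevance and "consistency", and the weighted-average merge rule are all assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def compress(tokens, query, k):
    """Hypothetical sketch: compress `tokens` into k merged embeddings.

    tokens: list of embedding vectors, query: one embedding vector.
    The scoring and merging rules below are assumptions, not the paper's.
    """
    if not 1 <= k <= len(tokens):
        raise ValueError("k must be between 1 and len(tokens)")

    # Step 1 (assumed rule): rank tokens by similarity to the query and keep
    # the top-k as semantic centers, wherever they sit in the sequence.
    ranked = sorted(range(len(tokens)),
                    key=lambda i: cosine(tokens[i], query), reverse=True)
    centers = sorted(ranked[:k])  # report centers in original order
    center_set = set(centers)

    # Each center starts its weighted sum with itself at weight 1.
    sums = {c: list(tokens[c]) for c in centers}
    weights = {c: 1.0 for c in centers}

    # Step 2 (assumed rule): assign every other token to its most similar
    # center and merge it in, weighted by that (non-negative) similarity.
    for i, tok in enumerate(tokens):
        if i in center_set:
            continue
        best = max(centers, key=lambda c: cosine(tok, tokens[c]))
        w = max(cosine(tok, tokens[best]), 0.0)
        for d in range(len(tok)):
            sums[best][d] += w * tok[d]
        weights[best] += w

    # Normalise each weighted sum into one compact embedding per center.
    return [[s / weights[c] for s in sums[c]] for c in centers]


if __name__ == "__main__":
    toy_tokens = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    print(compress(toy_tokens, query=[1.0, 0.0], k=1))
```

In this toy run the second token is chosen as the single center because it matches the query best, not because of where it sits in the sequence, which is the contrast with fixed-position grouping that the abstract draws.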
format Preprint
id arxiv_https___arxiv_org_abs_2605_09463
institution arXiv
publishDate 2026
record_format arxiv
title Beyond Position Bias: Shifting Context Compression from Position-Driven to Semantic-Driven
topic Computation and Language
url https://arxiv.org/abs/2605.09463