| Main Authors: | Xie, Wenxuan, Wang, Yujia, Tan, Xin, Lu, Chaochao, Hu, Xia, Wang, Xuhong |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.10021 |
Similar Items
Decoupling Reasoning and Knowledge Injection for In-Context Knowledge Editing
by: Wang, Changyue, et al.
Published: (2025)
DSPC: Dual-Stage Progressive Compression Framework for Efficient Long-Context Reasoning
by: Gao, Yaxin, et al.
Published: (2025)
Probe and Skip: Self-Predictive Token Skipping for Efficient Long-Context LLM Inference
by: Wu, Zimeng, et al.
Published: (2026)
Dynamic Adversarial Reinforcement Learning for Robust Multimodal Large Language Models
by: Bao, Yicheng, et al.
Published: (2026)
Token Sparse Attention: Efficient Long-Context Inference with Interleaved Token Selection
by: Jo, Dongwon, et al.
Published: (2026)
VeriFact: Enhancing Long-Form Factuality Evaluation with Refined Fact Extraction and Reference Facts
by: Liu, Xin, et al.
Published: (2025)
Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language Models
by: Weng, Fenghua, et al.
Published: (2025)
Dual-Density Inference for Efficient Language Model Reasoning
by: Zhao, Zhengyi, et al.
Published: (2025)
From Implicit to Explicit: Token-Efficient Logical Supervision for Mathematical Reasoning in LLMs
by: Wang, Shaojie, et al.
Published: (2026)
TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
by: Wu, Wei, et al.
Published: (2024)
IP-Dialog: Evaluating Implicit Personalization in Dialogue Systems with Synthetic Data
by: Peng, Bo, et al.
Published: (2025)
D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models
by: Wan, Zhongwei, et al.
Published: (2024)
LoPT: Lossless Parallel Tokenization Acceleration for Long Context Inference of Large Language Model
by: Shao, Wei, et al.
Published: (2025)
MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference
by: Wan, Zhongwei, et al.
Published: (2025)
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
by: Fu, Qichen, et al.
Published: (2024)
DiffAdapt: Difficulty-Adaptive Reasoning for Token-Efficient LLM Inference
by: Liu, Xiang, et al.
Published: (2025)
DELTA: Dynamic Layer-Aware Token Attention for Efficient Long-Context Reasoning
by: Zarch, Hossein Entezari, et al.
Published: (2025)
Implicit Hierarchical GRPO: Decoupling Tool Invocation from Execution for Tool-Integrated Mathematical Reasoning
by: Wang, Li, et al.
Published: (2026)
Exploiting Sparsity for Long Context Inference: Million Token Contexts on Commodity GPUs
by: Synk, Ryan, et al.
Published: (2025)
SemToken: Semantic-Aware Tokenization for Efficient Long-Context Language Modeling
by: Liu, Dong, et al.
Published: (2025)
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
by: Jeong, Soyeong, et al.
Published: (2025)
Training-free Context-adaptive Attention for Efficient Long Context Modeling
by: You, Zeng, et al.
Published: (2025)
Efficient Long-Context LLM Inference via KV Cache Clustering
by: Hu, Jie, et al.
Published: (2025)
Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference
by: Fei, Weizhi, et al.
Published: (2025)
Latent-Condensed Transformer for Efficient Long Context Modeling
by: You, Zeng, et al.
Published: (2026)
JEPA-Reasoner: Decoupling Latent Reasoning from Token Generation
by: Liu, Bingyang Kelvin, et al.
Published: (2025)
Tokenization Falling Short: On Subword Robustness in Large Language Models
by: Chai, Yekun, et al.
Published: (2024)
Membership Inference Attack against Long-Context Large Language Models
by: Wang, Zixiong, et al.
Published: (2024)
Decoupling Understanding from Reasoning via Problem Space Mapping for Small-Scale Model Reasoning
by: Wang, Li, et al.
Published: (2025)
Enhancing Retrieval Systems with Inference-Time Logical Reasoning
by: Faltings, Felix, et al.
Published: (2025)
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models
by: Hu, Zhiyuan, et al.
Published: (2024)
Squid: Long Context as a New Modality for Energy-Efficient On-Device Language Models
by: Chen, Wei, et al.
Published: (2024)
R$^2$PO: Decoupling Training Trajectories from Inference Responses for LLM Reasoning
by: Wang, Jingchu, et al.
Published: (2026)
ATACompressor: Adaptive Task-Aware Compression for Efficient Long-Context Processing in LLMs
by: Li, Xuancheng, et al.
Published: (2026)
SlimInfer: Accelerating Long-Context LLM Inference via Dynamic Token Pruning
by: Long, Lingkun, et al.
Published: (2025)
VTC-R1: Vision-Text Compression for Efficient Long-Context Reasoning
by: Wang, Yibo, et al.
Published: (2026)
Less Languages, Less Tokens: An Efficient Unified Logic Cross-lingual Chain-of-Thought Reasoning Framework
by: Zhang, Chenyuan, et al.
Published: (2026)
HyLRA: Hybrid Layer Reuse Attention for Efficient Long-Context Inference
by: Ai, Xuan, et al.
Published: (2026)
Think Through Uncertainty: Improving Long-Form Generation Factuality via Reasoning Calibration
by: Liu, Xin, et al.
Published: (2026)
Think Dense, Not Long: Dynamic Decoupled Conditional Advantage for Efficient Reasoning
by: Peng, Keqin, et al.
Published: (2026)