| Main Authors: | Gao, Yifei; Wang, Lei; Tu, Rong-Cheng; Zhang, Qixin; Cheng, Jun; Tao, Dacheng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.08329 |
Similar Items
KV-Fold: One-Step KV-Cache Recurrence for Long-Context Inference
by: Nadali, Alireza, et al.
Published: (2026)
Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners
by: Gao, Yifei, et al.
Published: (2024)
Homogeneous Keys, Heterogeneous Values: Exploiting Local KV Cache Asymmetry for Long-Context LLMs
by: Cui, Wanyun, et al.
Published: (2025)
SPHERICAL KV: Angle-Domain Attention and Rate-Distortion Retention for Efficient Long-Context Inference
by: Chauhan, Anay, et al.
Published: (2026)
OrbitFlow: SLO-Aware Long-Context LLM Serving with Fine-Grained KV Cache Reconfiguration
by: Ma, Xinyue, et al.
Published: (2026)
Large Language Models as Oracles for Ontology Alignment
by: Lushnei, Sviatoslav, et al.
Published: (2025)
LCFO: Long Context and Long Form Output Dataset and Benchmarking
by: Costa-jussà, Marta R., et al.
Published: (2024)
SQLord: A Robust Enterprise Text-to-SQL Solution via Reverse Data Generation and Workflow Decomposition
by: Cheng, Song, et al.
Published: (2025)
CRISP: Persistent Concept Unlearning via Sparse Autoencoders
by: Ashuach, Tomer, et al.
Published: (2025)
Reflective Translation: Improving Low-Resource Machine Translation via Structured Self-Reflection
by: Cheng, Nicholas
Published: (2026)
Algorithmic Consequences of Particle Filters for Sentence Processing: Amplified Garden-Paths and Digging-In Effects
by: Maina-Kilaas, Amani, et al.
Published: (2026)
Is (Selective) Round-To-Nearest Quantization All You Need?
by: Kogan, Alex
Published: (2025)
Counterfactual Causal Inference in Natural Language with Large Language Models
by: Gendron, Gaël, et al.
Published: (2024)
Improving ML Training Data with Gold-Standard Quality Metrics
by: Barrett, Leslie, et al.
Published: (2025)
Overcoming Long-Context Limitations of State-Space Models via Context-Dependent Sparse Attention
by: Zhan, Zhihao, et al.
Published: (2025)
VIGOR+: Iterative Confounder Generation and Validation via LLM-CEVAE Feedback Loop
by: Zhu, JiaWei, et al.
Published: (2025)
Resonant Context Anchoring: Decoupling Attention Routing and Signal Gain at Inference Time
by: Zhao, Mingkuan, et al.
Published: (2026)
Automatic Task Detection and Heterogeneous LLM Speculative Decoding
by: Ge, Danying, et al.
Published: (2025)
DWFS-Obfuscation: Dynamic Weighted Feature Selection for Robust Malware Familial Classification under Obfuscation
by: Wei, Xingyuan, et al.
Published: (2025)
TRiMS: Real-Time Tracking of Minimal Sufficient Length for Efficient Reasoning via RL
by: Bian, Tingcheng, et al.
Published: (2026)
Improving Retrospective Language Agents via Joint Policy Gradient Optimization
by: Feng, Xueyang, et al.
Published: (2025)
Closing the Curvature Gap: Full Transformer Hessians and Their Implications for Scaling Laws
by: Petrov, Egor, et al.
Published: (2025)
Semantic Decomposition and Selective Context Filtering -- Text Processing Techniques for Context-Aware NLP-Based Systems
by: Villardar, Karl John
Published: (2025)
QuickSilver -- Speeding up LLM Inference through Dynamic Token Halting, KV Skipping, Contextual Token Fusion, and Adaptive Matryoshka Quantization
by: Khanna, Danush, et al.
Published: (2025)
Efficient Reasoning via Thought-Training and Thought-Free Inference
by: Wu, Canhui, et al.
Published: (2025)
The Reasoning-Creativity Trade-off: Toward Creativity-Driven Problem Solving
by: Luyten, Max Ruiz, et al.
Published: (2026)
Protection Is (Nearly) All You Need: Structural Protection Dominates Scoring in Globally Capped KV Eviction
by: Garcia, Gabriel
Published: (2026)
Pareto-Optimized Open-Source LLMs for Healthcare via Context Retrieval
by: Bayarri-Planas, Jordi, et al.
Published: (2024)
Graph-GRPO: Dependency-Aware Credit Assignment for Generative E-commerce Search Relevance
by: Che, Jiarui, et al.
Published: (2026)
EmoLoom-2B: Fast Base-Model Screening for Emotion Classification and VAD with Lexicon-Weak Supervision and KV-Off Evaluation
by: Li, Zilin, et al.
Published: (2026)
Are Generative Models Underconfident? Better Quality Estimation with Boosted Model Probability
by: Dinh, Tu Anh, et al.
Published: (2025)
Sigmoid Head for Quality Estimation under Language Ambiguity
by: Dinh, Tu Anh, et al.
Published: (2026)
On Initializing Transformers with Pre-trained Embeddings
by: Kim, Ha Young, et al.
Published: (2024)
Measurement Risk in Supervised Financial NLP: Rubric and Metric Sensitivity on JF-ICR
by: Chang, Sidi, et al.
Published: (2026)
Sparse Regression for Machine Translation
by: Biçici, Ergun
Published: (2024)
On the Limits of Learned Importance Scoring for KV Cache Compression
by: Steele, Brady
Published: (2026)
Distribution-Free Uncertainty Quantification for Continuous AI Agent Evaluation
by: Gao, Yuxuan, et al.
Published: (2026)
An Evaluation of the Pedagogical Soundness and Usability of AI-Generated Lesson Plans Across Different Models and Prompt Frameworks in High-School Physics
by: Liu, Xincheng
Published: (2025)
Hierarchical Dual-Head Model for Suicide Risk Assessment via MentalRoBERTa
by: Yang, Chang, et al.
Published: (2025)
The Limits of Obliviate: Evaluating Unlearning in LLMs via Stimulus-Knowledge Entanglement-Behavior Framework
by: Shah, Aakriti, et al.
Published: (2025)