| Main Authors: | Li, Wenhao, Yu, Daohai, Luo, Gen, Zhang, Yuxin, Chao, Fei, Ji, Rongrong, Wu, Yifan, Liu, Jiaxin, Gong, Ziyang, Liao, Zimu |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.02108 |
Similar Items
Training Long-Context LLMs Efficiently via Chunk-wise Optimization
by: Li, Wenhao, et al.
Published: (2025)
Spotlight Attention: Towards Efficient LLM Generation via Non-linear Hashing-based KV Cache Retrieval
by: Li, Wenhao, et al.
Published: (2025)
CCF: A Context Compression Framework for Efficient Long-Sequence Language Modeling
by: Li, Wenhao, et al.
Published: (2025)
ID-Selection: Importance-Diversity Based Visual Token Selection for Efficient LVLM Inference
by: Huang, Zhaohong, et al.
Published: (2026)
Beyond a Million Tokens: Benchmarking and Enhancing Long-Term Memory in LLMs
by: Tavakoli, Mohammad, et al.
Published: (2025)
UIO-LLMs: Unbiased Incremental Optimization for Long-Context LLMs
by: Li, Wenhao, et al.
Published: (2024)
Towards Efficient Automatic Self-Pruning of Large Language Models
by: Huang, Weizhong, et al.
Published: (2025)
Boosting the Cross-Architecture Generalization of Dataset Distillation through an Empirical Study
by: Zhao, Lirui, et al.
Published: (2023)
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
by: Xiao, Chaojun, et al.
Published: (2024)
SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence
by: Gong, Ziyang, et al.
Published: (2025)
Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models
by: Luo, Gen, et al.
Published: (2024)
Memory as a Markov Matrix: Sample Efficient Knowledge Expansion via Token-to-Dictionary Mapping
by: Pethkar, Kaustubh, et al.
Published: (2026)
Memory-Efficient Training with In-Place FFT Implementation
by: Ding, Xinyu, et al.
Published: (2025)
In Memory of Millions: The Holocaust Museum Library.
by: Chepesiuk, Ron
Published: (1996)
Context Parallelism for Scalable Million-Token Inference
by: Yang, Amy, et al.
Published: (2024)
Determining Layer-wise Sparsity for Large Language Models Through a Theoretical Perspective
by: Huang, Weizhong, et al.
Published: (2025)
Prototype-Based Test-Time Adaptation of Vision-Language Models
by: Huang, Zhaohong, et al.
Published: (2026)
Learning Image Demoireing from Unpaired Real Data
by: Zhong, Yunshan, et al.
Published: (2024)
GS-Bias: Global-Spatial Bias Learner for Single-Image Test-Time Adaptation of Vision-Language Models
by: Huang, Zhaohong, et al.
Published: (2025)
StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs
by: Luo, Qijun, et al.
Published: (2025)
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training
by: Zhu, Dawei, et al.
Published: (2023)
VISA: Group-wise Visual Token Selection and Aggregation via Graph Summarization for Efficient MLLMs Inference
by: Jiang, Pengfei, et al.
Published: (2025)
Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences
by: Bekman, Stas, et al.
Published: (2025)
Hypertokens: Holographic Associative Memory in Tokenized LLMs
by: Augeri, Christopher James
Published: (2025)
Mettle: Meta-Token Learning for Memory-Efficient Audio-Visual Adaptation
by: Zhou, Jinxing, et al.
Published: (2025)
SwapNet: Efficient Swapping for DNN Inference on Edge AI Devices Beyond the Memory Budget
by: Wang, Kun, et al.
Published: (2024)
Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
by: Shen, Yunhang, et al.
Published: (2025)
LightCache: Memory-Efficient, Training-Free Acceleration for Video Generation
by: Xiao, Yang, et al.
Published: (2025)
Scaling Instruction-Tuned LLMs to Million-Token Contexts via Hierarchical Synthetic Data Generation
by: He, Linda, et al.
Published: (2025)
Context Distillation as Latent Memory Management
by: Zheng, Ziyang, et al.
Published: (2026)
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models
by: Wu, Mingrui, et al.
Published: (2024)
Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
by: Zhang, Yuxin, et al.
Published: (2023)
KTV: Keyframes and Key Tokens Selection for Efficient Training-Free Video LLMs
by: Song, Baiyang, et al.
Published: (2026)
Exploiting Sparsity for Long Context Inference: Million Token Contexts on Commodity GPUs
by: Synk, Ryan, et al.
Published: (2025)
Improving LLMs for Recommendation with Out-Of-Vocabulary Tokens
by: Huang, Ting-Ji, et al.
Published: (2024)
Memory Efficient Matting with Adaptive Token Routing
by: Lin, Yiheng, et al.
Published: (2024)
Mitigating Context-Memory Conflicts in LLMs through Dynamic Cognitive Reconciliation Decoding
by: Zhou, Yigeng, et al.
Published: (2026)
Capacitance and Conductance Compensation Methods for Efficient Computing‐In‐Memory Designs
by: Yubiao Luo, et al.
Published: (2024)
COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs
by: Liu, Liming, et al.
Published: (2025)
AstraNav-Memory: Contexts Compression for Long Memory
by: Ren, Botao, et al.
Published: (2025)