:: Library Catalog

Bibliographic Details
Main Authors:	Zheng, Liang; Shi, Bowen; Hu, Yitao; Zhang, Jiawei; Li, Ruofan; Chen, Sheng; Li, Wenxin; Li, Keqiu
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2601.06562

Similar Items

Harpagon: Minimizing DNN Serving Cost via Efficient Dispatching, Scheduling and Splitting
by: Zhao, Zhixin, et al.
Published: (2024)

PAT: Accelerating LLM Decoding via Prefix-Aware Attention with Resource Efficient Multi-Tile Kernel
by: Yi, Jinjun, et al.
Published: (2025)

RAGPulse: An Open-Source RAG Workload Trace to Optimize RAG Serving Systems
by: Wang, Zhengchao, et al.
Published: (2025)

LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
by: Liu, Xiaoran, et al.
Published: (2025)

Taming Wild Knots with Mosaics
by: Deng, Mary Y., et al.
Published: (2026)

Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG
by: Jin, Bowen, et al.
Published: (2024)

UT-ACA: Uncertainty-Triggered Adaptive Context Allocation for Long-Context Inference
by: Zhou, Lang, et al.
Published: (2026)

A multi‐dimensional incentive mechanism based on age of update in hierarchical federated learning
by: Zheng, Zhaohua, et al.
Published: (2024)

Optimizing LLM Inference Throughput via Memory-aware and SLA-constrained Dynamic Batching
by: Pang, Bowen, et al.
Published: (2025)

Taming Stable Diffusion for Computed Tomography Blind Super-Resolution
by: Li, Chunlei, et al.
Published: (2025)

Are Large Language Models In-Context Graph Learners?
by: Li, Jintang, et al.
Published: (2025)

RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation
by: Wang, Zihao, et al.
Published: (2024)

Training-Inference Consistent Segmented Execution for Long-Context LLMs
by: Shang, Xianpeng, et al.
Published: (2026)

Long-Context Speech Synthesis with Context-Aware Memory
by: Li, Zhipeng, et al.
Published: (2025)

Memory Mosaics
by: Zhang, Jianyu, et al.
Published: (2024)

ServerlessLoRA: Minimizing Latency and Cost in Serverless Inference for LoRA-Based LLMs
by: Sui, Yifan, et al.
Published: (2025)

LongRM: Revealing and Unlocking the Context Boundary of Reward Modeling
by: Tang, Zecheng, et al.
Published: (2025)

XKV: Personalized KV Cache Memory Reduction for Long-Context LLM Inference
by: Li, Weizhuo, et al.
Published: (2024)

Lookahead Path Likelihood Optimization for Diffusion LLMs
by: Liu, Xuejie, et al.
Published: (2026)

HyperMem: Hypergraph Memory for Long-Term Conversations
by: Yue, Juwei, et al.
Published: (2026)

Taming the Memory Footprint Crisis: System Design for Production Diffusion LLM Serving
by: Fan, Jiakun, et al.
Published: (2025)

LLMs Know What to Drop: Self-Attention Guided KV Cache Eviction for Efficient Long-Context Inference
by: Wang, Guangtao, et al.
Published: (2025)

Dynamic Long Context Reasoning over Compressed Memory via End-to-End Reinforcement Learning
by: Chen, Zhuoen, et al.
Published: (2026)

Diagonal Batching Unlocks Parallelism in Recurrent Memory Transformers for Long Contexts
by: Sivtsov, Danil, et al.
Published: (2025)

Efficient Long-Context LLM Inference via KV Cache Clustering
by: Hu, Jie, et al.
Published: (2025)

DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution
by: Duan, Zheng-Peng, et al.
Published: (2025)

Memory Mosaics at scale
by: Zhang, Jianyu, et al.
Published: (2025)

QwenLong-CPRS: Towards $\infty$-LLMs with Dynamic Context Optimization
by: Shen, Weizhou, et al.
Published: (2025)

Linear recurrence sequences and palindromic concatenations of two repdigits in base $β$
by: Li, Ruofan
Published: (2026)

Dynamic Vocabulary Pruning: Stable LLM-RL by Taming the Tail
by: Li, Yingru, et al.
Published: (2025)

CoDiCast: Conditional Diffusion Model for Global Weather Prediction with Uncertainty Quantification
by: Shi, Jimeng, et al.
Published: (2024)

Query-focused and Memory-aware Reranker for Long Context Processing
by: Li, Yuqing, et al.
Published: (2026)

Preserving Cross-Modal Stability for Visual Unlearning in Multimodal Scenarios
by: Xu, Jinghan, et al.
Published: (2025)

Mitigating Context-Memory Conflicts in LLMs through Dynamic Cognitive Reconciliation Decoding
by: Zhou, Yigeng, et al.
Published: (2026)

Chameleon: Taming Dynamic Operator Sequences for Memory-Intensive LLM Training
by: Wang, Zibo, et al.
Published: (2025)

MiA-Signature: Approximating Global Activation for Long-Context Understanding
by: Li, Yuqing, et al.
Published: (2026)

A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts
by: Ge, Suyu, et al.
Published: (2024)

Activation-aware Probe-Query: Effective Key-Value Retrieval for Long-Context LLMs Inference
by: Xiao, Qingfa, et al.
Published: (2025)

DynamicKV: Task-Aware Adaptive KV Cache Compression for Long Context LLMs
by: Zhou, Xiabin, et al.
Published: (2024)

TalkMosaic: Interactive PhotoMosaic with Multi-modal LLM Q&A Interactions
by: Li, Kevin, et al.
Published: (2024)