| Main Authors: | Chen, Xihao, Guo, Yangyang, Zimmermann, Roger |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.00789 |
Similar Items
Revisiting Multimodal KV Cache Compression: A Frequency-Domain-Guided Outlier-KV-Aware Approach
by: Yang, Yaoxin, et al.
Published: (2025)
LagKV: Lag-Relative Information of the KV Cache Tells Which Tokens Are Important
by: Liang, Manlai, et al.
Published: (2025)
DreamCache: Finetuning-Free Lightweight Personalized Image Generation via Feature Caching
by: Aiello, Emanuele, et al.
Published: (2024)
Efficient Long-Horizon GUI Agents via Training-Free KV Cache Compression
by: Zhou, Bowen, et al.
Published: (2026)
Beyond Sight: Towards Cognitive Alignment in LVLM via Enriched Visual Knowledge
by: Zhao, Yaqi, et al.
Published: (2024)
Vlogger: Make Your Dream A Vlog
by: Zhuang, Shaobin, et al.
Published: (2024)
Dual-Signal Adaptive KV-Cache Optimization for Long-Form Video Understanding in Vision-Language Models
by: Sai, Vishnu, et al.
Published: (2026)
Fwd2Bot: LVLM Visual Token Compression with Double Forward Bottleneck
by: Bulat, Adrian, et al.
Published: (2025)
Free$^2$Guide: Training-Free Text-to-Video Alignment using Image LVLM
by: Kim, Jaemin, et al.
Published: (2024)
Rethinking Token-wise Feature Caching: Accelerating Diffusion Transformers with Dual Feature Caching
by: Zou, Chang, et al.
Published: (2024)
Pushing the Frontier of Black-Box LVLM Attacks via Fine-Grained Detail Targeting
by: Zhao, Xiaohan, et al.
Published: (2026)
Lightweight Unsupervised Federated Learning with Pretrained Vision Language Model
by: Yan, Hao, et al.
Published: (2024)
Quantized Keys Steal Attention: Bias Correction for KV-Cache Compression in Video Diffusion
by: Tuncer, Tuna, et al.
Published: (2026)
Test-Time Training with KV Binding Is Secretly Linear Attention
by: Liu, Junchen, et al.
Published: (2026)
Your Image Generator Is Your New Private Dataset
by: Resmini, Nicolo, et al.
Published: (2025)
OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models
by: Chu, Huanpeng, et al.
Published: (2025)
Do More Details Always Introduce More Hallucinations in LVLM-based Image Captioning?
by: Feng, Mingqian, et al.
Published: (2024)
WorldCache: Content-Aware Caching for Accelerated Video World Models
by: Nawaz, Umair, et al.
Published: (2026)
Accelerating Diffusion Transformers with Token-wise Feature Caching
by: Zou, Chang, et al.
Published: (2024)
Your VAR Model is Secretly an Efficient and Explainable Generative Classifier
by: Chen, Yi-Chung, et al.
Published: (2025)
MeanCache: From Instantaneous to Average Velocity for Accelerating Flow Matching Inference
by: Gao, Huanlin, et al.
Published: (2026)
Graph Your Own Prompt
by: Ding, Xi, et al.
Published: (2025)
SpeCa: Accelerating Diffusion Transformers with Speculative Feature Caching
by: Liu, Jiacheng, et al.
Published: (2025)
Building Efficient Lightweight CNN Models
by: Isong, Nathan
Published: (2025)
FreqCa: Accelerating Diffusion Models via Frequency-Aware Caching
by: Liu, Jiacheng, et al.
Published: (2025)
Open Your Eyes: Vision Enhances Message Passing Neural Networks in Link Prediction
by: Wei, Yanbin, et al.
Published: (2025)
LightPneumoNet: Lightweight Pneumonia Classifier
by: Chauhan, Neilansh, et al.
Published: (2025)
Fake it till You Make it: Reward Modeling as Discriminative Prediction
by: Liu, Runtao, et al.
Published: (2025)
Frequency-Aligned Knowledge Distillation for Lightweight Spatiotemporal Forecasting
by: Li, Yuqi, et al.
Published: (2025)
VLLFL: A Vision-Language Model Based Lightweight Federated Learning Framework for Smart Agriculture
by: Li, Long, et al.
Published: (2025)
A Survey on Cache Methods in Diffusion Models: Toward Efficient Multi-Modal Generation
by: Liu, Jiacheng, et al.
Published: (2025)
FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation
by: Liu, Dong, et al.
Published: (2025)
To Trust Or Not To Trust Your Vision-Language Model's Prediction
by: Dong, Hao, et al.
Published: (2025)
Reparameterized Tensor Ring Functional Decomposition for Multi-Dimensional Data Recovery
by: Xu, Yangyang, et al.
Published: (2026)
Lightweight Cloud Masking Models for On-Board Inference in Hyperspectral Imaging
by: Ali, Mazen, et al.
Published: (2025)
Less is More: AMBER-AFNO -- a New Benchmark for Lightweight 3D Medical Image Segmentation
by: Dosi, Andrea, et al.
Published: (2025)
A More Word-like Image Tokenization for MLLMs
by: Lee, Hyun, et al.
Published: (2026)
Reason--Imagine--Act: Closed-Loop LLM Decision Making with World Models for Autonomous Driving
by: Sun, Zhengqi, et al.
Published: (2026)
Naïve PAINE: Lightweight Text-to-Image Generation Improvement with Prompt Evaluation
by: Kim, Joong Ho, et al.
Published: (2026)
A Lightweight Neural Architecture Search Model for Medical Image Classification
by: Xie, Lunchen, et al.
Published: (2024)