| Main Authors: | Liang, Yuxuan; Li, Xu; Chen, Xiaolei; Zheng, Yi; Chen, Haotian; Li, Bin; Xue, Xiangyang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2509.15704 |
Similar Items
HERO: Rethinking Visual Token Early Dropping in High-Resolution Large Vision-Language Models
by: Li, Xu, et al.
Published: (2025)
ResPrune: Text-Conditioned Subspace Reconstruction for Visual Token Pruning in Large Vision-Language Models
by: Li, Xu, et al.
Published: (2026)
Global Semantic-Guided Sub-image Feature Weight Allocation in High-Resolution Large Vision-Language Models
by: Liang, Yuxuan, et al.
Published: (2025)
Instruction-Guided Fusion of Multi-Layer Visual Features in Large Vision-Language Models
by: Li, Xu, et al.
Published: (2024)
Efficient Vision-Language Reasoning via Adaptive Token Pruning
by: Li, Xue, et al.
Published: (2025)
When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning
by: Luo, Junwei, et al.
Published: (2025)
HiPrune: Hierarchical Attention for Efficient Token Pruning in Vision-Language Models
by: Liu, Jizhihui, et al.
Published: (2025)
PLPHP: Per-Layer Per-Head Vision Token Pruning for Efficient Large Vision-Language Models
by: Meng, Yu, et al.
Published: (2025)
Balanced Token Pruning: Accelerating Vision Language Models Beyond Local Optimization
by: Li, Kaiyuan, et al.
Published: (2025)
Decoupled Similarity for Task-Aware Token Pruning in Large Vision-Language Models
by: Ma, Kexin, et al.
Published: (2026)
RedVTP: Training-Free Acceleration of Diffusion Vision-Language Models Inference via Masked Token-Guided Visual Token Pruning
by: Xu, Jingqi, et al.
Published: (2025)
MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer
by: Cao, Jianjian, et al.
Published: (2024)
TransPrune: Token Transition Pruning for Efficient Large Vision-Language Model
by: Li, Ao, et al.
Published: (2025)
VLTP: Vision-Language Guided Token Pruning for Task-Oriented Segmentation
by: Chen, Hanning, et al.
Published: (2024)
Multi-Cue Adaptive Visual Token Pruning for Large Vision-Language Models
by: Luan, Bozhi, et al.
Published: (2025)
Object-Centric Vision Token Pruning for Vision Language Models
by: Li, Guangyuan, et al.
Published: (2025)
IWP: Token Pruning as Implicit Weight Pruning in Large Vision Language Models
by: Lee, Dong-Jae, et al.
Published: (2026)
When Large Vision-Language Models Meet Person Re-Identification
by: Wang, Qizao, et al.
Published: (2024)
QAPruner: Quantization-Aware Vision Token Pruning for Multimodal Large Language Models
by: Wang, Xinhao, et al.
Published: (2026)
SmartTrim: Adaptive Tokens and Attention Pruning for Efficient Vision-Language Models
by: Wang, Zekun, et al.
Published: (2023)
LVPruning: An Effective yet Simple Language-Guided Vision Token Pruning Approach for Multi-modal Large Language Models
by: Sun, Yizheng, et al.
Published: (2025)
IVC-Prune: Revealing the Implicit Visual Coordinates in LVLMs for Vision Token Pruning
by: Sun, Zhichao, et al.
Published: (2026)
EvoPrune: Early-Stage Visual Token Pruning for Efficient MLLMs
by: Chen, Yuhao, et al.
Published: (2026)
Prune Redundancy, Preserve Essence: Vision Token Compression in VLMs via Synergistic Importance-Diversity
by: Fang, Zhengyao, et al.
Published: (2026)
Adaptive Pruning for Large Language Models with Structural Importance Awareness
by: Zheng, Haotian, et al.
Published: (2024)
TAMP: Token-Adaptive Layerwise Pruning in Multimodal Large Language Models
by: Lee, Jaewoo, et al.
Published: (2025)
Spatio-Temporal Token Pruning for Efficient High-Resolution GUI Agents
by: Xu, Zhou, et al.
Published: (2026)
Towards Joint Quantization and Token Pruning of Vision-Language Models
by: Li, Xinqing, et al.
Published: (2026)
History-Conditioned Spatio-Temporal Visual Token Pruning for Efficient Vision-Language Navigation
by: Wang, Qitong, et al.
Published: (2026)
HiDrop: Hierarchical Vision Token Reduction in MLLMs via Late Injection, Concave Pyramid Pruning, and Early Exit
by: Wu, Hao, et al.
Published: (2026)
HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models
by: Arif, Kazi Hasan Ibn, et al.
Published: (2024)
Energy-Driven Adaptive Visual Token Pruning for Efficient Vision-Language Models
by: He, Jialuo, et al.
Published: (2026)
EgoPrune: Efficient Token Pruning for Egomotion Video Reasoning in Embodied Agent
by: Li, Jiaao, et al.
Published: (2025)
A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models
by: Zeng, Quan-Sheng, et al.
Published: (2025)
Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing
by: Liu, Yudong, et al.
Published: (2025)
DTP: A Simple yet Effective Distracting Token Pruning Framework for Vision-Language Action Models
by: Li, Chenyang, et al.
Published: (2026)
PPT: Token Pruning and Pooling for Efficient Vision Transformers
by: Wu, Xinjian, et al.
Published: (2023)
EntropyPrune: Matrix Entropy Guided Visual Token Pruning for Multimodal Large Language Models
by: Wang, Yahong, et al.
Published: (2026)
SpecVLM: Enhancing Speculative Decoding of Video LLMs via Verifier-Guided Token Pruning
by: Ji, Yicheng, et al.
Published: (2025)
HAWK: Head Importance-Aware Visual Token Pruning in Multimodal Models
by: Zhu, Qihui, et al.
Published: (2026)