| Main Authors: | Pathak, Surendra; Han, Bo |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.14549 |
Similar Items
Towards Efficient Large Vision-Language Models: A Comprehensive Survey on Inference Strategies
by: Pathak, Surendra, et al.
Published: (2026)
Ensembling Pruned Attention Heads For Uncertainty-Aware Efficient Transformers
by: Gabetni, Firas, et al.
Published: (2025)
Rényi Attention Entropy for Patch Pruning
by: Aizawa, Hiroaki, et al.
Published: (2026)
CAPA: Contribution-Aware Pruning and FFN Approximation for Efficient Large Vision-Language Models
by: Jha, Samyak, et al.
Published: (2026)
Structured Model Pruning for Efficient Inference in Computational Pathology
by: Adnan, Mohammed, et al.
Published: (2024)
Hybrid-Regularized Magnitude Pruning for Robust Federated Learning under Covariate Shift
by: Goksu, Ozgu, et al.
Published: (2024)
PruneFuse: Efficient Data Selection via Weight Pruning and Network Fusion
by: Kousar, Humaira, et al.
Published: (2026)
ICE-Pruning: An Iterative Cost-Efficient Pruning Pipeline for Deep Neural Networks
by: Hu, Wenhao, et al.
Published: (2025)
Equivariant-Aware Structured Pruning for Efficient Edge Deployment: A Comprehensive Framework with Adaptive Fine-Tuning
by: Alnemari, Mohammed
Published: (2025)
Adaptive Sharpness-Aware Pruning for Robust Sparse Networks
by: Bair, Anna, et al.
Published: (2023)
Make Your LVLM KV Cache More Lightweight
by: Chen, Xihao, et al.
Published: (2026)
VLM-Pruner: Buffering for Spatial Sparsity in an Efficient VLM Centrifugal Token Pruning Paradigm
by: Wu, Zhenkai, et al.
Published: (2025)
Large-scale Dataset Pruning with Dynamic Uncertainty
by: He, Muyang, et al.
Published: (2023)
NEURAL: Attention-Guided Pruning for Unified Multimodal Resource-Constrained Clinical Evaluation
by: Joshi, Devvrat, et al.
Published: (2025)
QOC: Quantum On-Chip Training with Parameter Shift and Gradient Pruning
by: Wang, Hanrui, et al.
Published: (2022)
Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models
by: Park, NaHyeon, et al.
Published: (2025)
Interpretability-Aware Pruning for Efficient Medical Image Analysis
by: Malik, Nikita, et al.
Published: (2025)
STAR: Stage-Wise Attention-Guided Token Reduction for Efficient Large Vision-Language Models Inference
by: Guo, Yichen, et al.
Published: (2025)
Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models
by: Shirkavand, Reza, et al.
Published: (2024)
Domain-Inspired Sharpness-Aware Minimization Under Domain Shifts
by: Zhang, Ruipeng, et al.
Published: (2024)
LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights
by: Castells, Thibault, et al.
Published: (2024)
Beyond Sight: Towards Cognitive Alignment in LVLM via Enriched Visual Knowledge
by: Zhao, Yaqi, et al.
Published: (2024)
Fwd2Bot: LVLM Visual Token Compression with Double Forward Bottleneck
by: Bulat, Adrian, et al.
Published: (2025)
C-SWAP: Explainability-Aware Structured Pruning for Efficient Neural Networks Compression
by: Bauvin, Baptiste, et al.
Published: (2025)
AgilePruner: An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in Large Vision-Language Models
by: Baek, Changwoo, et al.
Published: (2026)
Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments
by: Shihab, Ibne Farabi, et al.
Published: (2025)
Multimodal-Guided Dynamic Dataset Pruning for Robust and Efficient Data-Centric Learning
by: Yang, Suorong, et al.
Published: (2025)
OCCAM: Towards Cost-Efficient and Accuracy-Aware Classification Inference
by: Ding, Dujian, et al.
Published: (2024)
Free$^2$Guide: Training-Free Text-to-Video Alignment using Image LVLM
by: Kim, Jaemin, et al.
Published: (2024)
CHAI: CacHe Attention Inference for text2video
by: Cherian, Joel Mathew, et al.
Published: (2026)
Decay Pruning Method: Smooth Pruning With a Self-Rectifying Procedure
by: Yang, Minghao, et al.
Published: (2024)
Efficient Vision-Language Reasoning via Adaptive Token Pruning
by: Li, Xue, et al.
Published: (2025)
LoRAPrune: Structured Pruning Meets Low-Rank Parameter-Efficient Fine-Tuning
by: Zhang, Mingyang, et al.
Published: (2023)
Uncertainty-Aware Subset Selection for Robust Visual Explainability under Distribution Shifts
by: Gupta, Madhav, et al.
Published: (2025)
CoreMatching: A Co-adaptive Sparse Inference Framework with Token and Neuron Pruning for Comprehensive Acceleration of Vision-Language Models
by: Wang, Qinsi, et al.
Published: (2025)
Fast-Slow Efficient Training for Multimodal Large Language Models via Visual Token Pruning
by: Zhang, Dingkun, et al.
Published: (2026)
Mixture-of-Top-k Attention: Efficient Attention via Scalable Fast Weights
by: Wen, Qishuai, et al.
Published: (2026)
Is Complexity Required for Neural Network Pruning? A Case Study on Global Magnitude Pruning
by: Gupta, Manas, et al.
Published: (2022)
Efficient Pruning of Text-to-Image Models: Insights from Pruning Stable Diffusion
by: Ramesh, Samarth N, et al.
Published: (2024)
Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models
by: Zhang, Xinxi, et al.
Published: (2024)