:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Xuyang; Wang, Ziming; Chen, Junjie; Han, Yuhang; Wang, Yingyao; Yuan, Jiale; Song, Jun; Huang, Siteng; Chen, Honggang
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2501.05179

Similar Items

Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models
by: Liu, Xuyang, et al.
Published: (2025)

Filter, Correlate, Compress: Training-Free Token Reduction for MLLM Acceleration
by: Han, Yuhang, et al.
Published: (2024)

Variation-aware Vision Token Dropping for Faster Large Vision-Language Models
by: Chen, Junjie, et al.
Published: (2025)

VGDiffZero: Text-to-image Diffusion Models Can Be Zero-shot Visual Grounders
by: Liu, Xuyang, et al.
Published: (2023)

CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games
by: Chen, Peng, et al.
Published: (2025)

Variator: Accelerating Pre-trained Models with Plug-and-Play Compression Modules
by: Xiao, Chaojun, et al.
Published: (2023)

Prune2Drive: A Plug-and-Play Framework for Accelerating Vision-Language Models in Autonomous Driving
by: Xiong, Minhao, et al.
Published: (2025)

An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
by: Chen, Liang, et al.
Published: (2024)

STaR-KV: Spatio-Temporal Adaptive Re-weighting for KV Cache Compression in GUI Vision-Language Models
by: Han, Yuhang, et al.
Published: (2026)

DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding
by: Liu, Ting, et al.
Published: (2024)

Score-Based Turbo Message Passing for Plug-and-Play Compressive Imaging
by: Cai, Chang, et al.
Published: (2025)

Score-Based Turbo Message Passing for Plug-and-Play Compressive Image Recovery
by: Cai, Chang, et al.
Published: (2025)

Accelerating Cross‐Scenario Metasurface Adaptability with Plug‐and‐Play Kernel
by: Nanxuan Wu, et al.
Published: (2025)

PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models
by: Li, Jinyi, et al.
Published: (2024)

Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Large Models
by: Ju, Chen, et al.
Published: (2024)

"See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models
by: Gu, Jihao, et al.
Published: (2025)

Accelerating Diffusion Transformers with Token-wise Feature Caching
by: Zou, Chang, et al.
Published: (2024)

M2IST: Multi-Modal Interactive Side-Tuning for Efficient Referring Expression Comprehension
by: Liu, Xuyang, et al.
Published: (2024)

Perception- and Fidelity-aware Reduced-Reference Super-Resolution Image Quality Assessment
by: Lin, Xinying, et al.
Published: (2024)

Sparse-Tuning: Adapting Vision Transformers with Efficient Fine-tuning and Inference
by: Liu, Ting, et al.
Published: (2024)

FasterVAR: Plug-and-Play Acceleration for Visual Autoregressive Models
by: Li, Senmao, et al.
Published: (2025)

Plug-and-Play DISep: Separating Dense Instances for Scene-to-Pixel Weakly-Supervised Change Detection in High-Resolution Remote Sensing Images
by: Zhao, Zhenghui, et al.
Published: (2025)

STAC: Plug-and-Play Spatio-Temporal Aware Cache Compression for Streaming 3D Reconstruction
by: Wang, Runze, et al.
Published: (2026)

Plug-and-Play Versatile Compressed Video Enhancement
by: Zeng, Huimin, et al.
Published: (2025)

ARM: A Learnable, Plug-and-Play Module for CLIP-based Open-vocabulary Semantic Segmentation
by: Liu, Ziquan, et al.
Published: (2025)

Glycolysis Plays a Critical and Dual Role in Periodontitis
by: Hongyu Ming, et al.
Published: (2025)

PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator
by: Yan, Hanshu, et al.
Published: (2024)

ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models
by: Ge, Chunjiang, et al.
Published: (2024)

Plug-and-Play Homeostatic Spark: Zero-Cost Acceleration for SNN Training Across Paradigms
by: Chen, Rui, et al.
Published: (2025)

SoulX-Duplug: Plug-and-Play Streaming State Prediction Module for Realtime Full-Duplex Speech Conversation
by: Yan, Ruiqi, et al.
Published: (2026)

Accelerating Streaming Video Large Language Models via Hierarchical Token Compression
by: Wang, Yiyu, et al.
Published: (2025)

SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
by: Zhang, Jintao, et al.
Published: (2024)

Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference
by: Zhao, Han, et al.
Published: (2024)

Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models
by: Cao, Jiaqi, et al.
Published: (2025)

SpeedUpNet: A Plug-and-Play Adapter Network for Accelerating Text-to-Image Diffusion Models
by: Chai, Weilong, et al.
Published: (2023)

LogLite: Lightweight Plug-and-Play Streaming Log Compression
by: Tang, Benzhao, et al.
Published: (2025)

Fancy123: One Image to High-Quality 3D Mesh Generation via Plug-and-Play Deformation
by: Yu, Qiao, et al.
Published: (2024)

Seeing Sarcasm Through Different Eyes: Analyzing Multimodal Sarcasm Perception in Large Vision-Language Models
by: Chen, Junjie, et al.
Published: (2025)

Harnessing the Plug-and-Play Controller by Prompting
by: Wang, Hao, et al.
Published: (2024)

VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration
by: Yu, Hanxun, et al.
Published: (2026)