| Main Authors: | Yang, Longrong, Shen, Dong, Cai, Chaoxiang, Yang, Fan, Gao, Tingting, Zhang, Di, Li, Xi |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.19905 |
Similar Items
Long-Tailed Distribution-Aware Router For Mixture-of-Experts in Large Vision-Language Model
by: Cai, Chaoxiang, et al.
Published: (2025)
Quant Experts: Token-aware Adaptive Error Reconstruction with Mixture of Experts for Large Vision-Language Models Quantization
by: Jia, Chenwei, et al.
Published: (2026)
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
by: Lin, Bin, et al.
Published: (2024)
Language-Guided Token Compression with Reinforcement Learning in Large Vision-Language Models
by: Cao, Sihan, et al.
Published: (2026)
MoIIE: Mixture of Intra- and Inter-Modality Experts for Large Vision Language Models
by: Wang, Dianyi, et al.
Published: (2025)
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters
by: Yu, Jiazuo, et al.
Published: (2024)
Generalizable Multispectral Land Cover Classification via Frequency-Aware Mixture of Low-Rank Token Experts
by: Chen, Xi, et al.
Published: (2025)
SkyMoE: A Vision-Language Foundation Model for Enhancing Geospatial Interpretation with Mixture of Experts
by: Liu, Jiaqi, et al.
Published: (2025)
Mixpert: Mitigating Multimodal Learning Conflicts with Efficient Mixture-of-Vision-Experts
by: He, Xin, et al.
Published: (2025)
VITA-VLA: Efficiently Teaching Vision-Language Models to Act via Action Expert Distillation
by: Dong, Shaoqi, et al.
Published: (2025)
MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models
by: Shen, Leyang, et al.
Published: (2024)
Resolving Task Objective Conflicts in Unified Model via Task-Aware Mixture-of-Experts
by: Zhang, Jiaxing, et al.
Published: (2025)
DIMoE-Adapters: Dynamic Expert Evolution for Continual Learning in Vision-Language Models
by: Qin, Mengxin, et al.
Published: (2026)
Token Pruning in Multimodal Large Language Models: Are We Solving the Right Problem?
by: Wen, Zichen, et al.
Published: (2025)
HybridToken-VLM: Hybrid Token Compression for Vision-Language Models
by: Zhang, Jusheng, et al.
Published: (2025)
EM-KD: Distilling Efficient Multimodal Large Language Model with Unbalanced Vision Tokens
by: Feng, Ze, et al.
Published: (2025)
Local Precise Refinement: A Dual-Gated Mixture-of-Experts for Enhancing Foundation Model Generalization against Spectral Shifts
by: Chen, Xi, et al.
Published: (2026)
SARES-DEIM: Sparse Mixture-of-Experts Meets DETR for Robust SAR Ship Detection
by: Song, Fenghao, et al.
Published: (2026)
Fair-MoE: Fairness-Oriented Mixture of Experts in Vision-Language Models
by: Wang, Peiran, et al.
Published: (2025)
Bridging Cross-task Protocol Inconsistency for Distillation in Dense Object Detection
by: Yang, Longrong, et al.
Published: (2023)
MINT: Mitigating Hallucinations in Large Vision-Language Models via Token Reduction
by: Wang, Chao, et al.
Published: (2025)
LadderMoE: Ladder-Side Mixture of Experts Adapters for Bronze Inscription Recognition
by: Zhou, Rixin, et al.
Published: (2025)
EVLM: An Efficient Vision-Language Model for Visual Understanding
by: Chen, Kaibing, et al.
Published: (2024)
Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts
by: Chen, Qizhou, et al.
Published: (2024)
Unveiling the Response of Large Vision-Language Models to Visually Absent Tokens
by: Kim, Sohee, et al.
Published: (2025)
MoVA: Adapting Mixture of Vision Experts to Multimodal Context
by: Zong, Zhuofan, et al.
Published: (2024)
TAME: Test-Time Adversarial Prompt Tuning via Mixture-of-Experts for Vision-Language Models
by: Wang, Xin, et al.
Published: (2026)
Uncertainty-Driven Expert Control: Enhancing the Reliability of Medical Vision-Language Models
by: Liang, Xiao, et al.
Published: (2025)
Prompt-Aware Adapter: Towards Learning Adaptive Visual Tokens for Multimodal Large Language Models
by: Zhang, Yue, et al.
Published: (2024)
FlashVLM: Text-Guided Visual Token Selection for Large Multimodal Models
by: Cai, Kaitong, et al.
Published: (2025)
EchoVLM: Dynamic Mixture-of-Experts Vision-Language Model for Universal Ultrasound Intelligence
by: She, Chaoyin, et al.
Published: (2025)
PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models
by: Yang, Chenyu, et al.
Published: (2024)
VScan: Rethinking Visual Token Reduction for Efficient Large Vision-Language Models
by: Zhang, Ce, et al.
Published: (2025)
SEMC: Structure-Enhanced Mixture-of-Experts Contrastive Learning for Ultrasound Standard Plane Recognition
by: Cai, Qing, et al.
Published: (2025)
Beyond Surrogate Gradients: Fully Differentiable Token Pruning for Vision-Language Models
by: He, Landi, et al.
Published: (2026)
LEO-MINI: An Efficient Multimodal Large Language Model using Conditional Token Reduction and Mixture of Multi-Modal Experts
by: Wang, Yimu, et al.
Published: (2025)
Towards Vision Mixture of Experts for Wildlife Monitoring on the Edge
by: Mensah, Emmanuel Azuh, et al.
Published: (2024)
Rethinking Efficient Mixture-of-Experts for Remote Sensing Modality-Missing Classification
by: Gao, Qinghao, et al.
Published: (2025)
A Survey of Token Compression for Efficient Multimodal Large Language Models
by: Shao, Kele, et al.
Published: (2025)
From Pixels to Tokens: Revisiting Object Hallucinations in Large Vision-Language Models
by: Shang, Yuying, et al.
Published: (2024)