| Main Authors: | Yi, Chao; He, Yu-Hang; Zhan, De-Chuan; Ye, Han-Jia |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2403.13797 |
Similar Items
Learning without Forgetting for Vision-Language Models
by: Zhou, Da-Wei, et al.
Published: (2023)
BOFA: Bridge-Layer Orthogonal Low-Rank Fusion for CLIP-Based Class-Incremental Learning
by: Li, Lan, et al.
Published: (2025)
PILOT: A Pre-Trained Model-Based Continual Learning Toolbox
by: Sun, Hai-Long, et al.
Published: (2023)
Expandable Subspace Ensemble for Pre-Trained Model-Based Class-Incremental Learning
by: Zhou, Da-Wei, et al.
Published: (2024)
Addressing Imbalanced Domain-Incremental Learning through Dual-Balance Collaborative Experts
by: Li, Lan, et al.
Published: (2025)
Bridging the Modality Gap in Roadside LiDAR: A Training-Free Vision-Language Model Framework for Vehicle Classification
by: Li, Yiqiao, et al.
Published: (2026)
Continual Learning with Pre-Trained Models: A Survey
by: Zhou, Da-Wei, et al.
Published: (2024)
Dual Consolidation for Pre-Trained Model-Based Domain-Incremental Learning
by: Zhou, Da-Wei, et al.
Published: (2024)
Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need
by: Zhou, Da-Wei, et al.
Published: (2023)
TV100: A TV Series Dataset that Pre-Trained CLIP Has Not Seen
by: Zhou, Da-Wei, et al.
Published: (2024)
Hawk: Leveraging Spatial Context for Faster Autoregressive Text-to-Image Generation
by: Chen, Zhi-Kai, et al.
Published: (2025)
MOS: Model Surgery for Pre-Trained Model-Based Class-Incremental Learning
by: Sun, Hai-Long, et al.
Published: (2024)
Leveraging Cross-Modal Neighbor Representation for Improved CLIP Classification
by: Yi, Chao, et al.
Published: (2024)
Adaptive Adapter Routing for Long-Tailed Class-Incremental Learning
by: Qi, Zhi-Hong, et al.
Published: (2024)
OmniEvalKit: A Modular, Lightweight Toolbox for Evaluating Large Language Model and its Omni-Extensions
by: Zhang, Yi-Kai, et al.
Published: (2024)
External Knowledge Injection for CLIP-Based Class-Incremental Learning
by: Zhou, Da-Wei, et al.
Published: (2025)
Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models
by: Schrodi, Simon, et al.
Published: (2024)
Class-Incremental Learning: A Survey
by: Zhou, Da-Wei, et al.
Published: (2023)
Déjà Vu Memorization in Vision-Language Models
by: Jayaraman, Bargav, et al.
Published: (2024)
Data Selection for Fine-tuning Vision Language Models via Cross Modal Alignment Trajectories
by: Naharas, Nilay, et al.
Published: (2025)
X-VILA: Cross-Modality Alignment for Large Language Model
by: Ye, Hanrong, et al.
Published: (2024)
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models
by: Huang, Wenxuan, et al.
Published: (2025)
Quantifying Cross-Modality Memorization in Vision-Language Models
by: Wen, Yuxin, et al.
Published: (2025)
Improved Alignment of Modalities in Large Vision Language Models
by: Jangra, Kartik, et al.
Published: (2025)
Rethinking Fine-Tuning: Unlocking Hidden Capabilities in Vision-Language Models
by: Zhang, Mingyuan, et al.
Published: (2025)
Investigating and Enhancing Vision-Audio Capability in Omnimodal Large Language Models
by: Hu, Rui, et al.
Published: (2025)
MMRL: Multi-Modal Representation Learning for Vision-Language Models
by: Guo, Yuncheng, et al.
Published: (2025)
Information Router for Mitigating Modality Dominance in Vision-Language Models
by: Kim, Seulgi, et al.
Published: (2026)
Cross-Modal Adapter: Parameter-Efficient Transfer Learning Approach for Vision-Language Models
by: Yang, Juncheng, et al.
Published: (2024)
Bridging the Gap Between Multimodal Foundation Models and World Models
by: He, Xuehai
Published: (2025)
Toward Modality Gap: Vision Prototype Learning for Weakly-supervised Semantic Segmentation with CLIP
by: Xu, Zhongxing, et al.
Published: (2024)
Ramen: Robust Test-Time Adaptation of Vision-Language Models with Active Sample Selection
by: Bao, Wenxuan, et al.
Published: (2026)
Source-Free Domain Adaptation Guided by Vision and Vision-Language Pre-Training
by: Zhang, Wenyu, et al.
Published: (2024)
Bridging the Gap: Learning Pace Synchronization for Open-World Semi-Supervised Learning
by: Ye, Bo, et al.
Published: (2023)
Tactile Modality Fusion for Vision-Language-Action Models
by: Morissette, Charlotte, et al.
Published: (2026)
Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies
by: Su, Hung-Ting, et al.
Published: (2024)
Bridging Vision and Language Spaces with Assignment Prediction
by: Park, Jungin, et al.
Published: (2024)
AD3: Implicit Action is the Key for World Models to Distinguish the Diverse Visual Distractors
by: Wang, Yucen, et al.
Published: (2024)
Multi-Modal Adapter for Vision-Language Models
by: Seputis, Dominykas, et al.
Published: (2024)
Vision-Language Models Create Cross-Modal Task Representations
by: Luo, Grace, et al.
Published: (2024)