| Main Authors: | Cui, Fangming, Zhang, Yonggang, Wang, Xuan, Wang, Xule, Xiao, Liang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.01263 |
Similar Items
Enhancing Target-unspecific Tasks through a Features Matrix
by: Cui, Fangming, et al.
Published: (2025)
Advancing Prompt Learning through an External Layer
by: Cui, Fangming, et al.
Published: (2024)
A Similarity Paradigm Through Textual Regularization Without Forgetting
by: Cui, Fangming, et al.
Published: (2025)
VideoCLIP-XL: Advancing Long Description Understanding for Video CLIP Models
by: Wang, Jiapeng, et al.
Published: (2024)
Can CLIP Count Stars? An Empirical Study on Quantity Bias in CLIP
by: Zhang, Zeliang, et al.
Published: (2024)
Foodfusion: A Novel Approach for Food Image Composition via Diffusion Models
by: Shi, Chaohua, et al.
Published: (2024)
Evi-Steer: Learning to Steer Biomedical Vision-Language Models through Efficient and Generalizable Evidential Tuning
by: Koleilat, Taha, et al.
Published: (2026)
CLIP-SVD: Efficient and Interpretable Vision-Language Adaptation via Singular Values
by: Koleilat, Taha, et al.
Published: (2025)
LLM2CLIP: Powerful Language Model Unlocks Richer Cross-Modality Representation
by: Huang, Weiquan, et al.
Published: (2024)
Is CLIP Cross-Eyed? Revealing and Mitigating Center Bias in the CLIP Family
by: Chew, Oscar, et al.
Published: (2026)
Meta CLIP 2: A Worldwide Scaling Recipe
by: Chuang, Yung-Sung, et al.
Published: (2025)
Demystifying CLIP Data
by: Xu, Hu, et al.
Published: (2023)
Unlearning the Noisy Correspondence Makes CLIP More Robust
by: Han, Haochen, et al.
Published: (2025)
AttriPrompt: Dynamic Prompt Composition Learning for CLIP
by: Zhan, Qiqi, et al.
Published: (2025)
MedCLIP-SAMv2: Towards Universal Text-Driven Medical Image Segmentation
by: Koleilat, Taha, et al.
Published: (2024)
Dual-branch Prompting for Multimodal Machine Translation
by: Wang, Jie, et al.
Published: (2025)
Learning Generalizable Prompt for CLIP with Class Similarity Knowledge
by: Jung, Sehun, et al.
Published: (2025)
CultureCLIP: Empowering CLIP with Cultural Awareness through Synthetic Images and Contextualized Captions
by: Huang, Yuchen, et al.
Published: (2025)
TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives
by: Patel, Maitreya, et al.
Published: (2024)
BiomedCoOp: Learning to Prompt for Biomedical Vision-Language Models
by: Koleilat, Taha, et al.
Published: (2024)
Universal Prompt Optimizer for Safe Text-to-Image Generation
by: Wu, Zongyu, et al.
Published: (2024)
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
by: Gao, Peng, et al.
Published: (2021)
LowCLIP: Adapting the CLIP Model Architecture for Low-Resource Languages in Multimodal Image Retrieval Task
by: Asgarov, Ali, et al.
Published: (2024)
FLEX-CLIP: Feature-Level GEneration Network Enhanced CLIP for X-shot Cross-modal Retrieval
by: Xie, Jingyou, et al.
Published: (2024)
AMU-Tuning: Effective Logit Bias for CLIP-based Few-shot Learning
by: Tang, Yuwei, et al.
Published: (2024)
GSCo: Towards Generalizable AI in Medicine via Generalist-Specialist Collaboration
by: He, Sunan, et al.
Published: (2024)
Distinctive Image Captioning: Leveraging Ground Truth Captions in CLIP Guided Reinforcement Learning
by: Chaffin, Antoine, et al.
Published: (2024)
Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models
by: Cheng, Hao, et al.
Published: (2025)
HiMo-CLIP: Modeling Semantic Hierarchy and Monotonicity in Vision-Language Alignment
by: Wu, Ruijia, et al.
Published: (2025)
TiC-CLIP: Continual Training of CLIP Models
by: Garg, Saurabh, et al.
Published: (2023)
InterCLIP-MEP: Interactive CLIP and Memory-Enhanced Predictor for Multi-modal Sarcasm Detection
by: Chen, Junjie, et al.
Published: (2024)
Generative Sign-description Prompts with Multi-positive Contrastive Learning for Sign Language Recognition
by: Liang, Siyu, et al.
Published: (2025)
Debiasing CLIP: Interpreting and Correcting Bias in Attention Heads
by: Yeo, Wei Jie, et al.
Published: (2025)
MedCLIPSeg: Probabilistic Vision-Language Adaptation for Data-Efficient and Generalizable Medical Image Segmentation
by: Koleilat, Taha, et al.
Published: (2026)
Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM
by: Ji, Yatai, et al.
Published: (2024)
Expressive and Generalizable Low-rank Adaptation for Large Models via Slow Cascaded Learning
by: Li, Siwei, et al.
Published: (2024)
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
by: Gao, Bingjie, et al.
Published: (2025)
Know "No" Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP
by: Park, Junsung, et al.
Published: (2025)
VTD-CLIP: Video-to-Text Discretization via Prompting CLIP
by: Zhu, Wencheng, et al.
Published: (2025)
ComCLIP: Training-Free Compositional Image and Text Matching
by: Jiang, Kenan, et al.
Published: (2022)