| Main Authors: | Li, Xinyao, Min, Yinjie, Chen, Hongbo, Du, Zhekai, Li, Fengling, Li, Jingjing |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.02421 |
Similar Items
Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation
by: Li, Xinyao, et al.
Published: (2024)
Unified modality separation: A vision-language framework for unsupervised domain adaptation
by: Li, Xinyao, et al.
Published: (2025)
Agile Multi-Source-Free Domain Adaptation
by: Li, Xinyao, et al.
Published: (2024)
Generalizing vision-language models to novel domains: A comprehensive survey
by: Li, Xinyao, et al.
Published: (2025)
ActDistill: General Action-Guided Self-Derived Distillation for Efficient Vision-Language-Action Models
by: Ye, Wencheng, et al.
Published: (2025)
PromptKD: Unsupervised Prompt Distillation for Vision-Language Models
by: Li, Zheng, et al.
Published: (2024)
Revisiting Prompt Pretraining of Vision-Language Models
by: Chen, Zhenyuan, et al.
Published: (2024)
Modeling Variants of Prompts for Vision-Language Models
by: Li, Ao, et al.
Published: (2025)
Cascade Prompt Learning for Vision-Language Model Adaptation
by: Wu, Ge, et al.
Published: (2024)
VPG: Visual Prefix Guidance for Autoregressive Image and Video Generation
by: Liao, Xinyao, et al.
Published: (2026)
TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models
by: Wang, Xin, et al.
Published: (2024)
Multi-modal Mutual-Guidance Conditional Prompt Learning for Vision-Language Models
by: Yang, Shijun, et al.
Published: (2025)
Mixture of Prompt Learning for Vision Language Models
by: Du, Yu, et al.
Published: (2024)
Granular Computing-driven SAM: From Coarse-to-Fine Guidance for Prompt-Free Segmentation
by: Yu, Qiyang, et al.
Published: (2025)
Quantized Prompt for Efficient Generalization of Vision-Language Models
by: Hao, Tianxiang, et al.
Published: (2024)
Evolving Prompt Adaptation for Vision-Language Models
by: Zhang, Enming, et al.
Published: (2026)
Know3D: Prompting 3D Generation with Knowledge from Vision-Language Models
by: Chen, Wenyue, et al.
Published: (2026)
Prompt Disentanglement via Language Guidance and Representation Alignment for Domain Generalization
by: Cheng, De, et al.
Published: (2025)
Transitive Vision-Language Prompt Learning for Domain Generalization
by: Wang, Liyuan, et al.
Published: (2024)
TAME: Test-Time Adversarial Prompt Tuning via Mixture-of-Experts for Vision-Language Models
by: Wang, Xin, et al.
Published: (2026)
GPLQ: A General, Practical, and Lightning QAT Method for Vision Transformers
by: Liang, Guang, et al.
Published: (2025)
HPT++: Hierarchically Prompting Vision-Language Models with Multi-Granularity Knowledge Generation and Improved Structure Modeling
by: Wang, Yubin, et al.
Published: (2024)
Cluster-Aware Neural Collapse Prompt Tuning for Long-Tailed Generalization of Vision-Language Models
by: Guo, Boyang, et al.
Published: (2026)
Cluster-Aware Prompt Ensemble Learning for Few-Shot Vision-Language Model Adaptation
by: Chen, Zhi, et al.
Published: (2025)
Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis
by: Wang, Zixuan, et al.
Published: (2025)
FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models
by: Jing, Liqiang, et al.
Published: (2023)
Vision Transformers with Self-Distilled Registers
by: Chen, Yinjie, et al.
Published: (2025)
3D Aware Region Prompted Vision Language Model
by: Cheng, An-Chieh, et al.
Published: (2025)
TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning
by: Xie, Jingjing, et al.
Published: (2024)
TRIO: Token Reduction via Inference-Objective Guidance for Efficient Vision-Language Models
by: Zhang, Haokui, et al.
Published: (2026)
Patch-Prompt Aligned Bayesian Prompt Tuning for Vision-Language Models
by: Liu, Xinyang, et al.
Published: (2023)
AdaptInfer: Adaptive Token Pruning for Vision-Language Model Inference with Dynamical Text Guidance
by: Zhang, Weichen, et al.
Published: (2025)
CoDefend: Cross-Modal Collaborative Defense via Diffusion Purification and Prompt Optimization
by: Zhu, Fengling, et al.
Published: (2025)
Autoregressive Image Generation with Vision Full-view Prompt
by: Cai, Miaomiao, et al.
Published: (2025)
AntifakePrompt: Prompt-Tuned Vision-Language Models are Fake Image Detectors
by: Chang, You-Ming, et al.
Published: (2023)
VLScene: Vision-Language Guidance Distillation for Camera-Based 3D Semantic Scene Completion
by: Wang, Meng, et al.
Published: (2025)
Concept-Guided Prompt Learning for Generalization in Vision-Language Models
by: Zhang, Yi, et al.
Published: (2024)
MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data
by: Chen, Zhekai, et al.
Published: (2026)
Intrinsic Gradient Suppression for Label-Noise Prompt Tuning in Vision-Language Models
by: Li, Jiayu, et al.
Published: (2026)
MAO: Efficient Model-Agnostic Optimization of Prompt Tuning for Vision-Language Models
by: Li, Haoyang, et al.
Published: (2025)