| Main Authors: | Zhou, Shuchang, Wei, Jiwei, He, Shiyuan, Zhou, Yuyang, Zhang, Chaoning, Zou, Jie, Xie, Ning, Yang, Yang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2502.19777 |
Similar Items
MASRA: MLLM-Assisted Semantic-Relational Consistent Alignment for Video Temporal Grounding
by: Ran, Ran, et al.
Published: (2026)
HiMix: Hierarchical Artifact-aware Mixup for Generalized Synthetic Image Detection
by: Zhou, Shuchang, et al.
Published: (2026)
MM-R1: Unleashing the Power of Unified Multimodal Large Language Models for Personalized Image Generation
by: Liang, Qian, et al.
Published: (2025)
Enhancing Self-Supervised Talking Head Forgery Detection via a Training-Free Dual-System Framework
by: Liu, Ke, et al.
Published: (2026)
Frequency-Aware Semantic Fusion with Gated Injection for AI-generated Image Detection
by: Zhou, Shuchang, et al.
Published: (2026)
Vision-EKIPL: External Knowledge-Infused Policy Learning for Visual Reasoning
by: Wang, Chaoyang, et al.
Published: (2025)
LoLDU: Low-Rank Adaptation via Lower-Diag-Upper Decomposition for Parameter-Efficient Fine-Tuning
by: Shi, Yiming, et al.
Published: (2024)
Multi-modal Attribute Prompting for Vision-Language Models
by: Liu, Xin, et al.
Published: (2024)
Language-Guided Token Compression with Reinforcement Learning in Large Vision-Language Models
by: Cao, Sihan, et al.
Published: (2026)
Relaxing Anchor-Frame Dominance for Mitigating Hallucinations in Video Large Language Models
by: Liu, Zijian, et al.
Published: (2026)
Active Prompt Learning with Vision-Language Model Priors
by: Kim, Hoyoung, et al.
Published: (2024)
$Δ$VLA: Prior-Guided Vision-Language-Action Models via World Knowledge Variation
by: Zhu, Yijie, et al.
Published: (2026)
ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models
by: Tian, Xinyu, et al.
Published: (2023)
Expanding the Boundaries of Vision Prior Knowledge in Multi-modal Large Language Models
by: Liang, Qiao, et al.
Published: (2025)
Hierarchical Cross-modal Prompt Learning for Vision-Language Models
by: Zheng, Hao, et al.
Published: (2025)
AntifakePrompt: Prompt-Tuned Vision-Language Models are Fake Image Detectors
by: Chang, You-Ming, et al.
Published: (2023)
RCP: Representation Consistency Pruner for Mitigating Distribution Shift in Large Vision-Language Models
by: Zhang, Jianwei, et al.
Published: (2026)
Infusing fine-grained visual knowledge to Vision-Language Models
by: Ypsilantis, Nikolaos-Antonios, et al.
Published: (2025)
Self-Rewarding Large Vision-Language Models for Optimizing Prompts in Text-to-Image Generation
by: Yang, Hongji, et al.
Published: (2025)
Joint-Optimized Unsupervised Adversarial Domain Adaptation in Remote Sensing Segmentation with Prompted Foundation Model
by: Lyu, Shuchang, et al.
Published: (2024)
MINT: Memory-Infused Prompt Tuning at Test-time for CLIP
by: Yi, Jiaming, et al.
Published: (2025)
Effectively Enhancing Vision Language Large Models by Prompt Augmentation and Caption Utilization
by: Zhao, Minyi, et al.
Published: (2024)
Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts
by: Fang, Shuangkang, et al.
Published: (2024)
Unleashing Semantic and Geometric Priors for 3D Scene Completion
by: Chen, Shiyuan, et al.
Published: (2025)
Rethinking Overlooked Aspects in Vision-Language Models
by: Liu, Yuan, et al.
Published: (2024)
Libra-MIL: Multimodal Prototypes Stereoscopic Infused with Task-specific Language Priors for Few-shot Whole Slide Image Classification
by: Zhuang, Zhenfeng, et al.
Published: (2025)
RESTORE: Towards Feature Shift for Vision-Language Prompt Learning
by: Yang, Yuncheng, et al.
Published: (2024)
PromptKD: Unsupervised Prompt Distillation for Vision-Language Models
by: Li, Zheng, et al.
Published: (2024)
LEGO: LoRA-Enabled Generator-Oriented Framework for Synthetic Image Detection
by: Xiao, Yutong, et al.
Published: (2026)
PKI: Prior Knowledge-Infused Neural Network for Few-Shot Class-Incremental Learning
by: Baoa, Kexin, et al.
Published: (2026)
Topology-Aware Layer Pruning for Large Vision-Language Models
by: Zheng, Pengcheng, et al.
Published: (2026)
Revisiting Prompt Pretraining of Vision-Language Models
by: Chen, Zhenyuan, et al.
Published: (2024)
Enhancing Medical Visual Grounding via Knowledge-guided Spatial Prompts
by: Gao, Yifan, et al.
Published: (2026)
Switch-KD: Visual-Switch Knowledge Distillation for Vision-Language Models
by: Sun, Haoyi, et al.
Published: (2026)
Understanding the Multi-modal Prompts of the Pre-trained Vision-Language Model
by: Ma, Shuailei, et al.
Published: (2023)
Seeing the Unseen: Towards Zero-Shot Inspection for Wind Turbine Blades using Knowledge-Augmented Vision Language Models
by: Zhang, Yang, et al.
Published: (2025)
MoAPT: Mixture of Adversarial Prompt Tuning for Vision-Language Models
by: Zhao, Shiji, et al.
Published: (2025)
Diversity Covariance-Aware Prompt Learning for Vision-Language Models
by: Dong, Songlin, et al.
Published: (2025)
GAPrompt: Geometry-Aware Point Cloud Prompt for 3D Vision Model
by: Ai, Zixiang, et al.
Published: (2025)
Supporting Vision-Language Model Inference with Confounder-pruning Knowledge Prompt
by: Li, Jiangmeng, et al.
Published: (2022)