| Main Authors: | Hu, Chuanbo, Jia, Shan, Li, Xin |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.14589 |
Similar Items
Poetry in Pixels: Prompt Tuning for Poem Image Generation via Diffusion Models
by: Jamil, Sofia, et al.
Published: (2025)
FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models
by: Xu, Jinglin, et al.
Published: (2024)
Multimodal Large Language Models for Medical Report Generation via Customized Prompt Tuning
by: Li, Chunlei, et al.
Published: (2025)
Simulating Post-Neoadjuvant Chemotherapy Breast Cancer MRI via Diffusion Model with Prompt Tuning
by: Kim, Jonghun, et al.
Published: (2025)
Modeling Subjective Urban Perception with Human Gaze
by: Che, Lin, et al.
Published: (2026)
DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception
by: Wang, Yibo, et al.
Published: (2024)
AutoPrompt: Automated Red-Teaming of Text-to-Image Models via LLM-Driven Adversarial Prompts
by: Liu, Yufan, et al.
Published: (2025)
The devil is in the details: Enhancing Video Virtual Try-On via Keyframe-Driven Details Injection
by: He, Qingdong, et al.
Published: (2025)
SEP: Self-Enhanced Prompt Tuning for Visual-Language Model
by: Yao, Hantao, et al.
Published: (2024)
FLDM-VTON: Faithful Latent Diffusion Model for Virtual Try-on
by: Wang, Chenhui, et al.
Published: (2024)
AoP-SAM: Automation of Prompts for Efficient Segmentation
by: Chen, Yi, et al.
Published: (2025)
Diff-Prompt: Diffusion-Driven Prompt Generator with Mask Supervision
by: Yan, Weicai, et al.
Published: (2025)
MuGa-VTON: Multi-Garment Virtual Try-On via Diffusion Transformers with Prompt Customization
by: Deria, Ankan, et al.
Published: (2025)
Debiased Prompt Tuning in Vision-Language Model without Annotations
by: Jiang, Chaoquan, et al.
Published: (2025)
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
by: Lian, Long, et al.
Published: (2023)
Tuning Vision-Language Models with Candidate Labels by Prompt Alignment
by: Zhang, Zhifang, et al.
Published: (2024)
SPIRiT-Diffusion: Self-Consistency Driven Diffusion Model for Accelerated MRI
by: Cui, Zhuo-Xu, et al.
Published: (2023)
Raw Data Matters: Enhancing Prompt Tuning by Internal Augmentation on Vision-Language Models
by: Li, Haoyang, et al.
Published: (2025)
VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning
by: Li, Xinhao, et al.
Published: (2025)
TAME: Test-Time Adversarial Prompt Tuning via Mixture-of-Experts for Vision-Language Models
by: Wang, Xin, et al.
Published: (2026)
SDVPT: Semantic-Driven Visual Prompt Tuning for Open-World Object Counting
by: Zhao, Yiming, et al.
Published: (2025)
PromptEnhancer: A Simple Approach to Enhance Text-to-Image Models via Chain-of-Thought Prompt Rewriting
by: Wang, Linqing, et al.
Published: (2025)
RefHCM: A Unified Model for Referring Perceptions in Human-Centric Scenarios
by: Huang, Jie, et al.
Published: (2024)
POET: Prompt Offset Tuning for Continual Human Action Adaptation
by: Garg, Prachi, et al.
Published: (2025)
LGTM: Local-to-Global Text-Driven Human Motion Diffusion Model
by: Sun, Haowen, et al.
Published: (2024)
Urban Safety Perception Assessments via Integrating Multimodal Large Language Models with Street View Images
by: Zhang, Jiaxin, et al.
Published: (2024)
HAODiff: Human-Aware One-Step Diffusion via Dual-Prompt Guidance
by: Gong, Jue, et al.
Published: (2025)
Face-MLLM: A Large Face Perception Model
by: Sun, Haomiao, et al.
Published: (2024)
FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling
by: Qiu, Haonan, et al.
Published: (2023)
DynRsl-VLM: Enhancing Autonomous Driving Perception with Dynamic Resolution Vision-Language Models
by: Zhou, Xirui, et al.
Published: (2025)
Vision-Driven 2D Supervised Fine-Tuning Framework for Bird's Eye View Perception
by: He, Lei, et al.
Published: (2024)
Adversarial Prompt Tuning for Vision-Language Models
by: Zhang, Jiaming, et al.
Published: (2023)
Anatomy-Grounded Weakly Supervised Prompt Tuning for Chest X-ray Latent Diffusion Models
by: Vilouras, Konstantinos, et al.
Published: (2025)
Tokenize Anything via Prompting
by: Pan, Ting, et al.
Published: (2023)
Adaptive Physical-Facial Representation Fusion via Subject-Invariant Cross-Modal Prompt Tuning for Video-Based Emotion Recognition
by: Luo, Xiwen, et al.
Published: (2026)
Generalizable Engagement Estimation in Conversation via Domain Prompting and Parallel Attention
by: Yu, Yangche, et al.
Published: (2025)
SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model
by: Wu, Tao, et al.
Published: (2024)
TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models
by: Wang, Xin, et al.
Published: (2024)
Blind Bitstream-corrupted Video Recovery via Metadata-guided Diffusion Model
by: Wang, Shuyun, et al.
Published: (2026)
Prompt-aligned Gradient for Prompt Tuning
by: Zhu, Beier, et al.
Published: (2022)