:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Wenxi; Liu, Hongbin; Li, Mingqian; Yuan, Junyan; Zhang, Junqi
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.14936

Similar Items

Bridging the Gap: Aligning Text-to-Image Diffusion Models with Specific Feedback
by: Niu, Xuexiang, et al.
Published: (2024)

A Dense Reward View on Aligning Text-to-Image Diffusion with Preference
by: Yang, Shentao, et al.
Published: (2024)

TextAlign: Preference Alignment for Text Rendering with Hierarchical Rewards
by: Cui, Mingxuan, et al.
Published: (2026)

PC-Diffusion: Aligning Diffusion Models with Human Preferences via Preference Classifier
by: Wang, Shaomeng, et al.
Published: (2025)

PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference
by: Liu, Kendong, et al.
Published: (2024)

Asynchronous Denoising Diffusion Models for Aligning Text-to-Image Generation
by: Hu, Zijing, et al.
Published: (2025)

Follow-Your-Preference: Towards Preference-Aligned Image Inpainting
by: Shen, Yutao, et al.
Published: (2025)

Calibrated Multi-Preference Optimization for Aligning Diffusion Models
by: Lee, Kyungmin, et al.
Published: (2025)

Bridging Cognitive Gap: Hierarchical Description Learning for Artistic Image Aesthetics Assessment
by: Liu, Henglin, et al.
Published: (2025)

Zero-Shot Chinese Character Recognition with Hierarchical Multi-Granularity Image-Text Aligning
by: Zhu, Yinglian, et al.
Published: (2025)

MVReward: Better Aligning and Evaluating Multi-View Diffusion Models with Human Preferences
by: Wang, Weitao, et al.
Published: (2024)

Category-level Text-to-Image Retrieval Improved: Bridging the Domain Gap with Diffusion Models and Vision Encoders
by: Khan, Faizan Farooq, et al.
Published: (2025)

DiT-IC: Aligned Diffusion Transformer for Efficient Image Compression
by: Shi, Junqi, et al.
Published: (2026)

MotionRL: Align Text-to-Motion Generation to Human Preferences with Multi-Reward Reinforcement Learning
by: Liu, Xiaoyang, et al.
Published: (2024)

Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text
by: Liang, Guotao, et al.
Published: (2025)

AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation
by: An, Jingkun, et al.
Published: (2024)

Facial Expression Generation Aligned with Human Preference for Natural Dyadic Interaction
by: Chen, Xu, et al.
Published: (2026)

HAD: Hierarchical Asymmetric Distillation to Bridge Spatio-Temporal Gaps in Event-Based Object Tracking
by: Deng, Yao, et al.
Published: (2025)

Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models
by: Wang, Fu-Yun, et al.
Published: (2025)

Can Diffusion Models Bridge the Domain Gap in Cardiac MR Imaging?
by: Wong, Xin Ci, et al.
Published: (2025)

LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment
by: Wang, Yibin, et al.
Published: (2024)

Fair Text to Medical Image Diffusion Model with Subgroup Distribution Aligned Tuning
by: Han, Xu, et al.
Published: (2024)

Hierarchical Vision-Language Alignment for Text-to-Image Generation via Diffusion Models
by: Johnson, Emily, et al.
Published: (2025)

IntentVCNet: Bridging Spatio-Temporal Gaps for Intention-Oriented Controllable Video Captioning
by: Qiu, Tianheng, et al.
Published: (2025)

Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences
by: Lu, Yunhong, et al.
Published: (2025)

SuperFace: Preference-Aligned Facial Expression Estimation Beyond Pseudo Supervision
by: Kang, Zejian, et al.
Published: (2026)

Bridging Annotation Gaps: Transferring Labels to Align Object Detection Datasets
by: Kennerley, Mikhail, et al.
Published: (2025)

Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization
by: Chen, Yiyang, et al.
Published: (2022)

Unveiling and Bridging the Functional Perception Gap in MLLMs: Atomic Visual Alignment and Hierarchical Evaluation via PET-Bench
by: Ye, Zanting, et al.
Published: (2026)

Invisible Relevance Bias: Text-Image Retrieval Models Prefer AI-Generated Images
by: Xu, Shicheng, et al.
Published: (2023)

AdaViP: Aligning Multi-modal LLMs via Adaptive Vision-enhanced Preference Optimization
by: Lu, Jinda, et al.
Published: (2025)

Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers
by: Zhang, Chunyang, et al.
Published: (2025)

Instant Preference Alignment for Text-to-Image Diffusion Models
by: Li, Yang, et al.
Published: (2025)

When Preferences Diverge: Aligning Diffusion Models with Minority-Aware Adaptive DPO
by: Zhang, Lingfan, et al.
Published: (2025)

FreeInit: Bridging Initialization Gap in Video Diffusion Models
by: Wu, Tianxing, et al.
Published: (2023)

DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation
by: Lin, Hongbin, et al.
Published: (2025)

Bridging the Gap Between End-to-End and Two-Step Text Spotting
by: Huang, Mingxin, et al.
Published: (2024)

ADHMR: Aligning Diffusion-based Human Mesh Recovery via Direct Preference Optimization
by: Shen, Wenhao, et al.
Published: (2025)

Revisiting Relevance Feedback for CLIP-based Interactive Image Retrieval
by: Nara, Ryoya, et al.
Published: (2024)

CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
by: Jiang, Dongzhi, et al.
Published: (2024)