Library Catalog

Bibliographic Details
Main Authors:	Liu, Gongye; Yang, Bo; Zhi, Yida; Zhong, Zhizhou; Ke, Lei; Deng, Didan; Gao, Han; Huang, Yongxiang; Zhang, Kaihao; Fu, Hongbo; Luo, Wenhan
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.11146

Similar Items

CannyEdit: Selective Canny Control and Dual-Prompt Guidance for Training-Free Image Editing
by: Xie, Weiyan, et al.
Published: (2025)

The Dark Side of Rich Rewards: Understanding and Mitigating Noise in VLM Rewards
by: Huang, Sukai, et al.
Published: (2024)

MB-TaylorFormer V2: Improved Multi-branch Linear Transformer Expanded by Taylor Formula for Image Restoration
by: Jin, Zhi, et al.
Published: (2025)

From Demonstrations to Rewards: Test-Time Prompt Optimization for VLM Reward Models
by: Gumbsch, Christian, et al.
Published: (2026)

Towards Real-World Blind Face Restoration with Generative Diffusion Prior
by: Chen, Xiaoxu, et al.
Published: (2023)

Diffusion Reward: Learning Rewards via Conditional Video Diffusion
by: Huang, Tao, et al.
Published: (2023)

Dual Risk Minimization: Towards Next-Level Robustness in Fine-tuning Zero-Shot Models
by: Li, Kaican, et al.
Published: (2024)

Guiding VLM Agents with Process Rewards at Inference Time for GUI Navigation
by: Hu, Zhiyuan, et al.
Published: (2025)

Beyond Pairwise Preferences: Listwise Reward-Aware Alignment for Diffusion Models
by: Wang, Austin, et al.
Published: (2026)

Reward Shaping to Mitigate Reward Hacking in RLHF
by: Fu, Jiayi, et al.
Published: (2025)

FlowSteer: Guiding Few-Step Image Synthesis with Authentic Trajectories
by: Ke, Lei, et al.
Published: (2025)

Prompt-Level Reward Specifications for Open-Ended Post-Training
by: Weng, Zijun, et al.
Published: (2026)

Systematic Reward Gap Optimization for Mitigating VLM Hallucinations
by: He, Lehan, et al.
Published: (2024)

PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models
by: Deng, Fei, et al.
Published: (2024)

Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward
by: Jia, Zhiwei, et al.
Published: (2024)

Self-Corrected Image Generation with Explainable Latent Rewards
by: Luo, Yinyi, et al.
Published: (2026)

Prrr: Personal Random Rewards for Blockchain Reporting
by: Chen, Hongyin, et al.
Published: (2025)

LatSearch: Latent Reward-Guided Search for Faster Inference-Time Scaling in Video Diffusion
by: Zhao, Zengqun, et al.
Published: (2026)

Scouting By Reward: VLM-TO-IRL-Driven Player Selection For Esports
by: Yan, Qing, et al.
Published: (2026)

Fast-dVLM: Efficient Block-Diffusion VLM via Direct Conversion from Autoregressive VLM
by: Wu, Chengyue, et al.
Published: (2026)

Video Generation Models Are Good Latent Reward Models
by: Mi, Xiaoyue, et al.
Published: (2025)

Self-Rewarding Sequential Monte Carlo for Masked Diffusion Language Models
by: Luo, Ziwei, et al.
Published: (2026)

Text2Reward: Reward Shaping with Language Models for Reinforcement Learning
by: Xie, Tianbao, et al.
Published: (2023)

Reward Guided Latent Consistency Distillation
by: Li, Jiachen, et al.
Published: (2024)

Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment
by: Wang, Chaoqi, et al.
Published: (2025)

Exploration by Random Reward Perturbation
by: Ma, Haozhe, et al.
Published: (2025)

Reward Balancing Revisited: Enhancing Offline Reinforcement Learning for Recommender Systems
by: Shu, Wenzheng, et al.
Published: (2025)

Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization
by: Zhang, Tao, et al.
Published: (2025)

ProcVLM: Learning Procedure-Grounded Progress Rewards for Robotic Manipulation
by: Feng, Youhe, et al.
Published: (2026)

VIVA: VLM-Guided Instruction-Based Video Editing with Reward Optimization
by: Cong, Xiaoyan, et al.
Published: (2025)

RewardDance: Reward Scaling in Visual Generation
by: Wu, Jie, et al.
Published: (2025)

MRD: Multi-resolution Retrieval-Detection Fusion for High-Resolution Image Understanding
by: Yang, Fan, et al.
Published: (2025)

Driving Beyond Privilege: Distilling Dense-Reward Knowledge into Sparse-Reward Policies
by: Khanzada, Feeza Khan, et al.
Published: (2025)

VLM-TDP: VLM-guided Trajectory-conditioned Diffusion Policy for Robust Long-Horizon Manipulation
by: Huang, Kefeng, et al.
Published: (2025)

Reward Prediction with Factorized World States
by: Shen, Yijun, et al.
Published: (2026)

CommentScope: A Comment-Embedded Assisted Reading System for a Long Text
by: Chen, Shuai, et al.
Published: (2025)

RFSR: Improving ISR Diffusion Models via Reward Feedback Learning
by: Sun, Xiaopeng, et al.
Published: (2024)

Aligning Few-Step Diffusion Models with Dense Reward Difference Learning
by: Zhang, Ziyi, et al.
Published: (2024)

Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
by: Deng, Haikang, et al.
Published: (2023)

BaseReward: A Strong Baseline for Multimodal Reward Model
by: Zhang, Yi-Fan, et al.
Published: (2025)