| Main Authors: | Peysakhovich, Alexander; Berman, William |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.15577 |
Similar Items
MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data
by: Berman, William, et al.
Published: (2024)
Studying Classifier(-Free) Guidance From a Classifier-Centric Perspective
by: Zhao, Xiaoming, et al.
Published: (2025)
Classifier-Free Guidance: From High-Dimensional Analysis to Generalized Guidance Forms
by: Pavasovic, Krunoslav Lehman, et al.
Published: (2025)
Discriminator Guidance for Autoregressive Diffusion Models
by: Kelvinius, Filip Ekström, et al.
Published: (2023)
CFG-OEC: Classifier Free Guidance with Orthogonal Error Correction
by: Yang, Nakgyu, et al.
Published: (2025)
WARP: On the Benefits of Weight Averaged Rewarded Policies
by: Ramé, Alexandre, et al.
Published: (2024)
CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models
by: Chung, Hyungjin, et al.
Published: (2024)
Classifier-Free Guidance is a Predictor-Corrector
by: Bradley, Arwen, et al.
Published: (2024)
EP-CFG: Energy-Preserving Classifier-Free Guidance
by: Zhang, Kai, et al.
Published: (2024)
Value-Free Policy Optimization via Reward Partitioning
by: Faye, Bilal, et al.
Published: (2025)
Improving Classifier-Free Guidance in Masked Diffusion: Low-Dim Theoretical Insights with High-Dim Impact
by: Rojas, Kevin, et al.
Published: (2025)
Eliminating Inductive Bias in Reward Models with Information-Theoretic Guidance
by: Li, Zhuo, et al.
Published: (2025)
Composite Classifier-Free Guidance for Multi-Modal Conditioning in Wind Dynamics Super-Resolution
by: Schnell, Jacob, et al.
Published: (2025)
Efficient Controllable Diffusion via Optimal Classifier Guidance
by: Oertell, Owen, et al.
Published: (2025)
Lookahead Sample Reward Guidance for Test-Time Scaling of Diffusion Models
by: Kim, Yeongmin, et al.
Published: (2026)
Classifier-Free Guidance inside the Attraction Basin May Cause Memorization
by: Jain, Anubhav, et al.
Published: (2024)
Margin-calibrated Classifier Guidance for Property-driven Synthesis Planning
by: Laabid, Najwa, et al.
Published: (2026)
Diffusion Models without Classifier-free Guidance
by: Tang, Zhicong, et al.
Published: (2025)
PARM: Multi-Objective Test-Time Alignment via Preference-Aware Autoregressive Reward Model
by: Lin, Baijiong, et al.
Published: (2025)
TFG: Unified Training-Free Guidance for Diffusion Models
by: Ye, Haotian, et al.
Published: (2024)
Intrinsic Reward Policy Optimization for Sparse-Reward Environments
by: Cho, Minjae, et al.
Published: (2026)
Reward Guidance for Reinforcement Learning Tasks Based on Large Language Models: The LMGT Framework
by: Deng, Yongxin, et al.
Published: (2024)
Entropy Aware Reward Guidance for Diffusion Language Model Alignment
by: Tejaswi, Atula, et al.
Published: (2026)
Planning-Augmented Sampling with Early Guidance for High-Reward Discovery
by: Zhu, Rui, et al.
Published: (2025)
Autoregressive Policy Optimization for Constrained Allocation Tasks
by: Winkel, David, et al.
Published: (2024)
Mutual-Taught for Co-adapting Policy and Reward Models
by: Shi, Tianyuan, et al.
Published: (2025)
Policy Filtration for RLHF to Mitigate Noise in Reward Models
by: Zhang, Chuheng, et al.
Published: (2024)
WARM: On the Benefits of Weight Averaged Reward Models
by: Ramé, Alexandre, et al.
Published: (2024)
MONET -- Virtual Cell Painting of Brightfield Images and Time Lapses Using Reference Consistent Diffusion
by: Peysakhovich, Alexander, et al.
Published: (2025)
Deep SPI: Safe Policy Improvement via World Models
by: Delgrange, Florent, et al.
Published: (2025)
WAVE: Weighted Autoregressive Varying Gate for Time Series Forecasting
by: Lu, Jiecheng, et al.
Published: (2024)
Boosting Reinforcement Learning with Verifiable Rewards via Randomly Selected Few-Shot Guidance
by: Yan, Kai, et al.
Published: (2026)
CROP: Conservative Reward for Model-based Offline Policy Optimization
by: Li, Hao, et al.
Published: (2023)
Policy Improvement using Language Feedback Models
by: Zhong, Victor, et al.
Published: (2024)
Auto-Rubric: Learning From Implicit Weights to Explicit Rubrics for Reward Modeling
by: Xie, Lipeng, et al.
Published: (2025)
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance
by: Hussain, Shehzeen, et al.
Published: (2025)
Curiosity-Critic: Cumulative Prediction Error Improvement as a Tractable Intrinsic Reward for World Model Training
by: Bhaskara, Vin, et al.
Published: (2026)
Adaptive Correlation-Weighted Intrinsic Rewards for Reinforcement Learning
by: Nguyen, Viet Bac, et al.
Published: (2026)
GRPO is Secretly a Process Reward Model
by: Sullivan, Michael, et al.
Published: (2025)
Adversarial Training for Process Reward Models
by: Juneja, Gurusha, et al.
Published: (2025)