| Main Authors: | Jing, Xuepeng, Lu, Wenhuan, Meng, Hao, Yu, Zhizhi, Wei, Jianguo |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2603.24936 |
Similar Items
DenseGRPO: From Sparse to Dense Reward for Flow Matching Model Alignment
by: Deng, Haoyou, et al.
Published: (2026)
You Only Speak Once to See
by: Yang, Wenhao, et al.
Published: (2024)
GRPO-Guard: Mitigating Implicit Over-Optimization in Flow Matching via Regulated Clipping
by: Wang, Jing, et al.
Published: (2025)
OP-GRPO: Efficient Off-Policy GRPO for Flow-Matching Models
by: Zhang, Liyu, et al.
Published: (2026)
Flow-GRPO: Training Flow Matching Models via Online RL
by: Liu, Jie, et al.
Published: (2025)
Smart-GRPO: Smartly Sampling Noise for Efficient RL of Flow-Matching Models
by: Yu, Benjamin, et al.
Published: (2025)
Stepwise Credit Assignment for GRPO on Flow-Matching Models
by: Savani, Yash, et al.
Published: (2026)
FlowCast: Trajectory Forecasting for Scalable Zero-Cost Speculative Flow Matching
by: Bajpai, Divya Jyoti, et al.
Published: (2026)
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
by: Wang, Yibin, et al.
Published: (2025)
Multi-GRPO: Multi-Group Advantage Estimation for Text-to-Image Generation with Tree-Based Trajectories and Multiple Rewards
by: Lyu, Qiang, et al.
Published: (2025)
DiverseGRPO: Mitigating Mode Collapse in Image Generation via Diversity-Aware GRPO
by: Liu, Henglin, et al.
Published: (2025)
EMIT: Enhancing MLLMs for Industrial Anomaly Detection via Difficulty-Aware GRPO
by: Guan, Wei, et al.
Published: (2025)
TempFlow-GRPO: When Timing Matters for GRPO in Flow Models
by: He, Xiaoxuan, et al.
Published: (2025)
Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO
by: Tong, Yunze, et al.
Published: (2026)
FlowErase-RL: Rethinking Concept Erasure as Reward Optimization in Flow Matching Models
by: Sun, Yi, et al.
Published: (2026)
SafeGRPO: Self-Rewarded Multimodal Safety Alignment via Rule-Governed Policy Optimization
by: Rong, Xuankun, et al.
Published: (2025)
GuideFlow: Constraint-Guided Flow Matching for Planning in End-to-End Autonomous Driving
by: Liu, Lin, et al.
Published: (2025)
Accelerating Rectified Flow Models via Trajectory-Aware Caching
by: Liu, Xiao, et al.
Published: (2026)
Beyond Imitation: Constraint-Aware Trajectory Generation with Flow Matching For End-to-End Autonomous Driving
by: Liu, Lin, et al.
Published: (2025)
DanceGRPO: Unleashing GRPO on Visual Generation
by: Xue, Zeyue, et al.
Published: (2025)
CurveFlow: Curvature-Guided Flow Matching for Image Generation
by: Luo, Yan, et al.
Published: (2025)
MedLoc-R1: Performance-Aware Curriculum Reward Scheduling for GRPO-Based Medical Visual Grounding
by: Yang, Guangjing, et al.
Published: (2026)
Identity-GRPO: Optimizing Multi-Human Identity-preserving Video Generation via Reinforcement Learning
by: Meng, Xiangyu, et al.
Published: (2025)
Rethinking Reward Signals in Video GRPO: When Scores Become Targets
by: Li, Rui, et al.
Published: (2025)
CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching
by: Chen, Chen, et al.
Published: (2025)
Robust Dataset Distillation by Matching Adversarial Trajectories
by: Lai, Wei, et al.
Published: (2025)
From Sparse to Dense: Multi-View GRPO for Flow Models via Augmented Condition Space
by: Bu, Jiazi, et al.
Published: (2026)
MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE
by: Li, Junzhe, et al.
Published: (2025)
Flow Priors for Linear Inverse Problems via Iterative Corrupted Trajectory Matching
by: Zhang, Yasi, et al.
Published: (2024)
Reward-Aware Trajectory Shaping for Few-step Visual Generation
by: Li, Rui, et al.
Published: (2026)
MAR-GRPO: Stabilized GRPO for AR-diffusion Hybrid Image Generation
by: Ma, Xiaoxiao, et al.
Published: (2026)
Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
by: Lu, Yunhong, et al.
Published: (2025)
MotionGRPO: Overcoming Low Intra-Group Diversity in GRPO-Based Egocentric Motion Recovery
by: Yao, Nanjie, et al.
Published: (2026)
Image Aesthetic Reasoning via HCM-GRPO: Empowering Compact Model for Superior Performance
by: Hu, Zhiyuan, et al.
Published: (2025)
MoFlow: One-Step Flow Matching for Human Trajectory Forecasting via Implicit Maximum Likelihood Estimation based Distillation
by: Fu, Yuxiang, et al.
Published: (2025)
GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving
by: Xing, Zebin, et al.
Published: (2025)
Training-Free Reward-Guided Image Editing via Trajectory Optimal Control
by: Chang, Jinho, et al.
Published: (2025)
Geometry-Aware Image Flow Matching
by: Lee, Junho, et al.
Published: (2026)
Neighbor GRPO: Contrastive ODE Policy Optimization Aligns Flow Models
by: He, Dailan, et al.
Published: (2025)
Regulating Anatomy-Aware Rewards via Trajectory-Integral Feedback for Volumetric Computed Tomography Analysis
by: Lin, Tianwei, et al.
Published: (2026)