| Main Authors: | Guo, Haoxin, Pan, Jiawen, Zhai, Weixin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.17105 |
Similar Items
Reinforcement Unlearning via Group Relative Policy Optimization
by: Zaradoukas, Efstratios, et al.
Published: (2026)
Sequential Policy Gradient for Adaptive Hyperparameter Optimization
by: Li, Zheng, et al.
Published: (2025)
Amortized Molecular Optimization via Group Relative Policy Optimization
by: Javaid, Muhammad bin, et al.
Published: (2026)
Constrained Group Relative Policy Optimization
by: Girgis, Roger, et al.
Published: (2026)
Overtuning in Hyperparameter Optimization
by: Schneider, Lennart, et al.
Published: (2025)
Sharpness-Guided Group Relative Policy Optimization via Probability Shaping
by: Le, Tue, et al.
Published: (2025)
Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing
by: Li, Gengsheng, et al.
Published: (2026)
Consensus Group Relative Policy Optimization for Text Generation
by: Ichihara, Yuki, et al.
Published: (2026)
Revisiting Group Relative Policy Optimization: Insights into On-Policy and Off-Policy Training
by: Mroueh, Youssef, et al.
Published: (2025)
Dynamic Priors in Bayesian Optimization for Hyperparameter Optimization
by: Fehring, Lukas, et al.
Published: (2025)
Hyperparameter Optimization in Machine Learning
by: Franceschi, Luca, et al.
Published: (2024)
Hyperparameter Optimization via Interacting with Probabilistic Circuits
by: Seng, Jonas, et al.
Published: (2025)
Advantage Collapse in Group Relative Policy Optimization: Diagnosis and Mitigation
by: He, Xixiang, et al.
Published: (2026)
NGRPO: Negative-enhanced Group Relative Policy Optimization
by: Nan, Gongrui, et al.
Published: (2025)
Hybrid Group Relative Policy Optimization: A Multi-Sample Approach to Enhancing Policy Optimization
by: Sane, Soham
Published: (2025)
Hyperparameter Optimization Can Even be Harmful in Off-Policy Learning and How to Deal with It
by: Saito, Yuta, et al.
Published: (2024)
In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization
by: Rakotoarison, Herilalaina, et al.
Published: (2024)
Demystifying Group Relative Policy Optimization: Its Policy Gradient is a U-Statistic
by: Zhou, Hongyi, et al.
Published: (2026)
Deriving Hyperparameter Scaling Laws via Modern Optimization Theory
by: Shulgin, Egor, et al.
Published: (2026)
Leveraging Group Relative Policy Optimization to Advance Large Language Models in Traditional Chinese Medicine
by: Xie, Jiacheng, et al.
Published: (2025)
Latent-GRPO: Group Relative Policy Optimization for Latent Reasoning
by: Deng, Jingcheng, et al.
Published: (2026)
GTPO: Stabilizing Group Relative Policy Optimization via Gradient and Entropy Control
by: Simoni, Marco, et al.
Published: (2025)
Enhancing Performance and Calibration in Quantile Hyperparameter Optimization
by: Doyle, Riccardo
Published: (2025)
Adaptive Hyperparameter Optimization for Continual Learning Scenarios
by: Semola, Rudy, et al.
Published: (2024)
Hyperparameter Tuning Through Pessimistic Bilevel Optimization
by: Ustun, Meltem Apaydin, et al.
Published: (2024)
On Optimizing Hyperparameters for Quantum Neural Networks
by: Herbst, Sabrina, et al.
Published: (2024)
ORTHOBO: Orthogonal Bayesian Hyperparameter Optimization
by: Schröder, Maresa, et al.
Published: (2026)
Group Orthogonalized Policy Optimization: Group Policy Optimization as Orthogonal Projection in Hilbert Space
by: Zixian, Wang
Published: (2026)
Personalized Group Relative Policy Optimization for Heterogenous Preference Alignment
by: Wang, Jialu, et al.
Published: (2026)
MedGround-R1: Advancing Medical Image Grounding via Spatial-Semantic Rewarded Group Relative Policy Optimization
by: Xu, Huihui, et al.
Published: (2025)
Enhancing LLM-based Search Agents via Contribution Weighted Group Relative Policy Optimization
by: Wang, Junzhe, et al.
Published: (2026)
UDM-GRPO: Stable and Efficient Group Relative Policy Optimization for Uniform Discrete Diffusion Models
by: Wang, Jiaqi, et al.
Published: (2026)
Optimizing Anytime Reasoning via Budget Relative Policy Optimization
by: Qi, Penghui, et al.
Published: (2025)
TreeRPO: Tree Relative Policy Optimization
by: Yang, Zhicheng, et al.
Published: (2025)
A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning
by: Kim, Minyoung, et al.
Published: (2024)
From Black-Box Tuning to Guided Optimization via Hyperparameters Interaction Analysis
by: Garouani, Moncef, et al.
Published: (2025)
WS-GRPO: Weakly-Supervised Group-Relative Policy Optimization for Rollout-Efficient Reasoning
by: Mundada, Gagan, et al.
Published: (2026)
F-GRPO: Factorized Group-Relative Policy Optimization for Unified Candidate Generation and Ranking
by: Surana, Rohan, et al.
Published: (2026)
Multi-Objective Hyperparameter Optimization in Machine Learning -- An Overview
by: Karl, Florian, et al.
Published: (2022)
Practitioner Motives to Use Different Hyperparameter Optimization Methods
by: Kannengießer, Niclas, et al.
Published: (2022)