:: Library Catalog

Bibliographic Details
Main Authors:	Lara, Luis; Milios, Aristides; Luo, Zhi Hao; Sharma, Aditya; Luo, Ge Ya; Beckham, Christopher; Golemo, Florian; Pal, Christopher
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.14117

Similar Items

DStruct2Design: Data and Benchmarks for Data Structure Driven Generative Floor Plan Design
by: Luo, Zhi Hao, et al.
Published: (2024)

Ctrl-Crash: Controllable Diffusion for Realistic Car Crashes
by: Gosselin, Anthony, et al.
Published: (2025)

LLMs can learn self-restraint through iterative self-reflection
by: Piché, Alexandre, et al.
Published: (2024)

Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion
by: Luo, Ge Ya, et al.
Published: (2024)

Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality
by: Luo, Ge Ya, et al.
Published: (2024)

Exploring validation metrics for offline model-based optimisation with diffusion models
by: Beckham, Christopher, et al.
Published: (2022)

CtRL-Sim: Reactive and Controllable Driving Agents with Offline Reinforcement Learning
by: Rowe, Luke, et al.
Published: (2024)

Robust Guided Diffusion for Offline Black-Box Optimization
by: Chen, Can Sam, et al.
Published: (2024)

Reducing Hallucinations in Language Model-based SPARQL Query Generation Using Post-Generation Memory Retrieval
by: Sharma, Aditya, et al.
Published: (2025)

Indignados en España e Indecisos en Polonia. La inspiración española en el contexto polaco y el fracaso de la protesta en el país de “Solidarność”
by: Karolina Golemo
Published: (2014)

ROSA: Random Subspace Adaptation for Efficient Fine-Tuning
by: Hameed, Marawan Gamal Abdel, et al.
Published: (2024)

Reinforced Embodied Planning with Verifiable Reward for Real-World Robotic Manipulation
by: Bo, Zitong, et al.
Published: (2025)

Meeting Legal Challenges. The School Leader's Library: Leading for Learning Series.
by: Beckham, Joseph
Published: (1996)

Incentivizing Parametric Knowledge via Reinforcement Learning with Verifiable Rewards for Cross-Cultural Entity Translation
by: Zhou, Jiang, et al.
Published: (2026)

Contextual Rollout Bandits for Reinforcement Learning with Verifiable Rewards
by: Lu, Xiaodong, et al.
Published: (2026)

Instructing LLMs to Negotiate using Reinforcement Learning with Verifiable Rewards
by: Liu, Shuze Daniel, et al.
Published: (2026)

Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency
by: Li, Hongyu, et al.
Published: (2025)

PRISM: A Unified Framework for Post-Training LLMs Without Verifiable Rewards
by: Ghimire, Mukesh, et al.
Published: (2026)

GeoCoder: Solving Geometry Problems by Generating Modular Code through Vision-Language Models
by: Sharma, Aditya, et al.
Published: (2024)

Minerva: Reinforcement Learning with Verifiable Rewards for Cyber Threat Intelligence LLMs
by: Alam, Md Tanvirul, et al.
Published: (2026)

Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
by: Bensal, Shelly, et al.
Published: (2025)

Chart-RL: Generalized Chart Comprehension via Reinforcement Learning with Verifiable Rewards
by: Zhang, Xin, et al.
Published: (2026)

On the Generalization Gap in LLM Planning: Tests and Verifier-Reward RL
by: Belcamino, Valerio, et al.
Published: (2026)

RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents
by: Wang, Peisong, et al.
Published: (2025)

Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs
by: Wen, Xumeng, et al.
Published: (2025)

From Verifiable Dot to Reward Chain: Harnessing Verifiable Reference-based Rewards for Reinforcement Learning of Open-ended Generation
by: Jiang, Yuxin, et al.
Published: (2026)

HypergraphFormer: Learning Hypergraphs from LLMs for Editable Floor Plan Generation
by: Klimenko, Nikita, et al.
Published: (2026)

Outcome Rewards Do Not Guarantee Verifiable or Causally Important Reasoning
by: Yu, Qinan, et al.
Published: (2026)

Reinforcement Learning with Verifiable yet Noisy Rewards under Imperfect Verifiers
by: Cai, Xin-Qiang, et al.
Published: (2025)

Verifier-Free RL for LLMs via Intrinsic Gradient-Norm Reward
by: Wen, Xuexiang, et al.
Published: (2026)

Text-to-Layout: A Generative Workflow for Drafting Architectural Floor Plans Using LLMs
by: Duggempudi, Jayakrishna, et al.
Published: (2025)

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward
by: Liu, Shudong, et al.
Published: (2025)

Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds
by: Liang, Hao, et al.
Published: (2022)

Breaking the Safety-Capability Tradeoff: Reinforcement Learning with Verifiable Rewards Maintains Safety Guardrails in LLMs
by: Cho, Dongkyu Derek, et al.
Published: (2025)

Conocimiento termodinámico sociales, y sus límites ecológicos
by: Camarena, Beckham, et al.
Published: (2025)

Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards
by: Liu, Xiaoyuan, et al.
Published: (2025)

Lessons from Training Grounded LLMs with Verifiable Rewards
by: Sim, Shang Hong, et al.
Published: (2025)

Auditing Data Membership in Reinforcement Learning With Verifiable Rewards
by: Liu, Yule, et al.
Published: (2025)

Rethinking Sample Polarity in Reinforcement Learning with Verifiable Rewards
by: Tang, Xinyu, et al.
Published: (2025)

Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains
by: Gunjal, Anisha, et al.
Published: (2025)