| Main Authors: | Jha, Manvi; Wan, Jiaxin; Chen, Deming |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.06239 |
Similar Items
CoRe-Code: Collaborative Reinforcement Learning for Code Generation
by: Dou, Zhihao, et al.
Published: (2026)
CVeDRL: An Efficient Code Verifier via Difficulty-aware Reinforcement Learning
by: Shi, Ji, et al.
Published: (2026)
SchemaCoder: Automatic Log Schema Extraction Coder with Residual Q-Tree Boosting
by: Wan, Lily Jiaxin, et al.
Published: (2025)
FVRuleLearner: Operator-Level Reasoning Tree (OP-Tree)-Based Rules Learning for Formal Verification
by: Wan, Lily Jiaxin, et al.
Published: (2026)
Agentic Reinforcement Learning for Real-World Code Repair
by: Zhu, Siyu, et al.
Published: (2025)
Beyond Binary: Turning Partial Success into Dense Verifiable Rewards for Reinforcement Learning in Code Generation
by: Wang, Longwen, et al.
Published: (2026)
Structure-informed Positional Encoding for Music Generation
by: Agarwal, Manvi, et al.
Published: (2024)
Of All StrIPEs: Investigating Structure-informed Positional Encoding for Efficient Music Generation
by: Agarwal, Manvi, et al.
Published: (2025)
Scaling Generative Verifiers For Natural Language Mathematical Proof Verification And Selection
by: Mahdavi, Sadegh, et al.
Published: (2025)
Generative Floor Plan Design with LLMs via Reinforcement Learning with Verifiable Rewards
by: Lara, Luis, et al.
Published: (2026)
Automated Proof Generation for Rust Code via Self-Evolution
by: Chen, Tianyu, et al.
Published: (2024)
Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation
by: Manvi, Rohin, et al.
Published: (2024)
LLM-Powered Code Vulnerability Repair with Reinforcement Learning and Semantic Reward
by: Islam, Nafis Tanveer, et al.
Published: (2024)
AutoICE: Automatically Synthesizing Verifiable C Code via LLM-driven Evolution
by: Luo, Weilin, et al.
Published: (2025)
F-StrIPE: Fast Structure-Informed Positional Encoding for Symbolic Music Generation
by: Agarwal, Manvi, et al.
Published: (2025)
Agentic Agile-V: From Vibe Coding to Verified Engineering in Software and Hardware Development
by: Koch, Christopher
Published: (2026)
Learning to Inject: Automated Prompt Injection via Reinforcement Learning
by: Chen, Xin, et al.
Published: (2026)
GeoContra: From Fluent GIS Code to Verifiable Spatial Analysis with Geography-Grounded Repair
by: Xiao, Yinhao, et al.
Published: (2026)
Improving LLM Code Generation via Requirement-Aware Curriculum Reinforcement Learning
by: Yin, Shouyu, et al.
Published: (2026)
WybeCoder: Verified Imperative Code Generation
by: Gloeckle, Fabian, et al.
Published: (2026)
Reinforcement Learning with Verifiable yet Noisy Rewards under Imperfect Verifiers
by: Cai, Xin-Qiang, et al.
Published: (2025)
Leanabell-Prover-V2: Verifier-integrated Reasoning for Formal Theorem Proving via Reinforcement Learning
by: Ji, Xingguang, et al.
Published: (2025)
On the Surprising Efficacy of Distillation as an Alternative to Pre-Training Small Models
by: Farhat, Sean, et al.
Published: (2024)
Multimodal Reinforcement Learning with Adaptive Verifier for AI Agents
by: Tan, Reuben, et al.
Published: (2025)
Look as You Think: Unifying Reasoning and Visual Evidence Attribution for Verifiable Document RAG via Reinforcement Learning
by: Liu, Shuochen, et al.
Published: (2025)
Contextual Rollout Bandits for Reinforcement Learning with Verifiable Rewards
by: Lu, Xiaodong, et al.
Published: (2026)
Zero-Knowledge Proof Based Verifiable Inference of Models
by: Wang, Yunxiao
Published: (2025)
Code Security Vulnerability Repair Using Reinforcement Learning with Large Language Models
by: Islam, Nafis Tanveer, et al.
Published: (2024)
INTERVENOR: Prompting the Coding Ability of Large Language Models with the Interactive Chain of Repair
by: Wang, Hanbin, et al.
Published: (2023)
Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards
by: Liu, Xiaoyuan, et al.
Published: (2025)
NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning
by: Liu, Wei, et al.
Published: (2025)
Optimistic Verifiable Training by Controlling Hardware Nondeterminism
by: Srivastava, Megha, et al.
Published: (2024)
A Survey of Zero-Knowledge Proof Based Verifiable Machine Learning
by: Peng, Zhizhi, et al.
Published: (2025)
Controllable and Verifiable Tool-Use Data Synthesis for Agentic Reinforcement Learning
by: Xu, Siyuan, et al.
Published: (2026)
CPTuning: Contrastive Prompt Tuning for Generative Relation Extraction
by: Duan, Jiaxin, et al.
Published: (2025)
Execution-Verified Reinforcement Learning for Optimization Modeling
by: Guan, Runda, et al.
Published: (2026)
Clover: Closed-Loop Verifiable Code Generation
by: Sun, Chuyue, et al.
Published: (2023)
HardSecBench: Benchmarking the Security Awareness of LLMs for Hardware Code Generation
by: Chen, Qirui, et al.
Published: (2026)
Prompt2Fingerprint: Plug-and-Play LLM Fingerprinting via Text-to-Weight Generation
by: Chen, Sixu, et al.
Published: (2026)
ReCode: Improving LLM-based Code Repair with Fine-Grained Retrieval-Augmented Generation
by: Zhao, Yicong, et al.
Published: (2025)