| Main Authors: | Xie, Tian, Gao, Zitian, Ren, Qingnan, Luo, Haoming, Hong, Yuqian, Dai, Bryan, Zhou, Joey, Qiu, Kai, Wu, Zhirong, Luo, Chong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.14768 |
Similar Items
One-shot Entropy Minimization
by: Gao, Zitian, et al.
Published: (2025)
Universal Reasoning Model
by: Gao, Zitian, et al.
Published: (2025)
What Makes Diffusion Language Models Super Data Learners?
by: Gao, Zitian, et al.
Published: (2025)
ViaRL: Adaptive Temporal Grounding via Visual Iterated Amplification Reinforcement Learning
by: Xu, Ziqiang, et al.
Published: (2025)
Controlled LLM Training on Spectral Sphere
by: Xie, Tian, et al.
Published: (2026)
Beyond N-gram: Data-Aware X-GRAM Extraction for Efficient Embedding Parameter Scaling
by: Chen, Yilong, et al.
Published: (2026)
Shorten After You're Right: Lazy Length Penalties for Reasoning RL
by: Yuan, Danlong, et al.
Published: (2025)
TemplateRL: Structured Template-Guided Reinforcement Learning for LLM Reasoning
by: Wu, Jinyang, et al.
Published: (2025)
ChatRule: Mining Logical Rules with Large Language Models for Knowledge Graph Reasoning
by: Luo, Linhao, et al.
Published: (2023)
Beyond Distillation: Pushing the Limits of Medical LLM Reasoning with Minimalist Rule-Based RL
by: Liu, Che, et al.
Published: (2025)
ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL
by: Zhang, Yu, et al.
Published: (2025)
Unleashing Perception-Time Scaling to Multimodal Reasoning Models
by: Li, Yifan, et al.
Published: (2025)
VisRL: Intention-Driven Visual Perception via Reinforced Reasoning
by: Chen, Zhangquan, et al.
Published: (2025)
ADORA: Training Reasoning Models with Dynamic Advantage Estimation on Reinforcement Learning
by: Ren, Qingnan, et al.
Published: (2026)
DialogueReason: Rule-Based RL Sparks Dialogue Reasoning in LLMs
by: Shu, Yubo, et al.
Published: (2025)
Abductive Logical Rule Induction by Bridging Inductive Logic Programming and Multimodal Large Language Models
by: Peng, Yifei, et al.
Published: (2025)
Infant Agent: A Tool-Integrated, Logic-Driven Agent with Cost-Effective API Usage
by: Lei, Bin, et al.
Published: (2024)
Robust Answers, Fragile Logic: Probing the Decoupling Hypothesis in LLM Reasoning
by: Jiang, Enyi, et al.
Published: (2025)
On the Steiner $k$-diameter and Steiner ($k,k^{\prime}$)-radius of trees
by: Zhang, Qingnan, et al.
Published: (2025)
LayoutCoT: Unleashing the Deep Reasoning Potential of Large Language Models for Layout Generation
by: Shi, Hengyu, et al.
Published: (2025)
RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning
by: Zha, Kaiwen, et al.
Published: (2025)
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models
by: Li, Zejun, et al.
Published: (2024)
Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models
by: Zhou, Guanghao, et al.
Published: (2025)
Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL
by: Hong, Joey, et al.
Published: (2025)
Comparison of two peripheral regional analgesic techniques for primary elective total hip arthroplasty
by: Longsheng Zhang, et al.
Published: (2025)
Power Margin Ratio -- A Large-Signal System Strength Metric for Inverter-Based Resources-Dominated Power Systems
by: Qiu, Zitian, et al.
Published: (2026)
Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs
by: Wang, Siyuan, et al.
Published: (2024)
ProCeedRL: Process Critic with Exploratory Demonstration Reinforcement Learning for LLM Agentic Reasoning
by: Gao, Jingyue, et al.
Published: (2026)
LogicPuzzleRL: Cultivating Robust Mathematical Reasoning in LLMs via Reinforcement Learning
by: Wong, Zhen Hao, et al.
Published: (2025)
Reasoning Meets Personalization: Unleashing the Potential of Large Reasoning Model for Personalized Generation
by: Luo, Sichun, et al.
Published: (2025)
Scheduling Your LLM Reinforcement Learning with Reasoning Trees
by: Wang, Hong, et al.
Published: (2025)
CLiViS: Unleashing Cognitive Map through Linguistic-Visual Synergy for Embodied Visual Reasoning
by: Li, Kailing, et al.
Published: (2025)
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
by: Wang, Shenzhi, et al.
Published: (2025)
Logics-STEM: Empowering LLM Reasoning via Failure-Driven Post-Training and Document Knowledge Enhancement
by: Xu, Mingyu, et al.
Published: (2026)
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward
by: Fan, Kaixuan, et al.
Published: (2025)
BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning
by: Li, Yuan, et al.
Published: (2026)
LLM-Net: Democratizing LLMs-as-a-Service through Blockchain-based Expert Networks
by: Chong, Zan-Kai, et al.
Published: (2025)
The Silent Scholar Problem: A Probabilistic Framework for Breaking Epistemic Asymmetry in LLM Agents
by: Chong, Zan-Kai, et al.
Published: (2025)
RL of Thoughts: Navigating LLM Reasoning with Inference-time Reinforcement Learning
by: Hao, Qianyue, et al.
Published: (2025)
SuperRL: Reinforcement Learning with Supervision to Boost Language Model Reasoning
by: Liu, Yihao, et al.
Published: (2025)