:: Library Catalog

Bibliographic Details
Main Authors:	Ding, Bowen; Chen, Yuhan; Lyv, Jiayang; Yuan, Jiyao; Zhu, Qi; Tian, Shuangshuang; Zhu, Dantong; Wang, Futing; Deng, Heyuan; Mi, Fei; Shang, Lifeng; Lin, Tao
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Computation and Language
Online Access:	https://arxiv.org/abs/2512.11470

Similar Items

KDRL: Post-Training Reasoning LLMs via Unified Knowledge Distillation and Reinforcement Learning
by: Xu, Hongling, et al.
Published: (2025)

Do Thinking Tokens Help or Trap? Towards More Efficient Large Reasoning Model
by: Ding, Bowen, et al.
Published: (2025)

Self-Error-Instruct: Generalizing from Errors for LLMs Mathematical Reasoning
by: Yu, Erxin, et al.
Published: (2025)

ReliableMath: Benchmark of Reliable Mathematical Reasoning on Large Language Models
by: Xue, Boyang, et al.
Published: (2025)

Group Pattern Selection Optimization: Let LRMs Pick the Right Pattern for Reasoning
by: Wang, Hanbin, et al.
Published: (2026)

Teaching Large Reasoning Models Effective Reflection
by: Wang, Hanbin, et al.
Published: (2026)

EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
by: Xu, Minrui, et al.
Published: (2026)

MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
by: Zuo, Yuxin, et al.
Published: (2025)

Entropy Centroids as Intrinsic Rewards for Test-Time Scaling
by: Zhao, Wenshuo, et al.
Published: (2026)

The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs
by: Chen, Jierun, et al.
Published: (2025)

Benchmarking and Rethinking Knowledge Editing for Large Language Models
by: He, Guoxiu, et al.
Published: (2025)

EssayBench: Evaluating Large Language Models in Multi-Genre Chinese Essay Writing
by: Gao, Fan, et al.
Published: (2025)

Stackelberg Meta-Learning for Strategic Guidance in Multi-Robot Trajectory Planning
by: Zhao, Yuhan, et al.
Published: (2022)

On Data Synthesis and Post-training for Visual Abstract Reasoning
by: Zhu, Ke, et al.
Published: (2025)

Stackelberg Game-Theoretic Trajectory Guidance for Multi-Robot Systems with Koopman Operator
by: Zhao, Yuhan, et al.
Published: (2023)

Beyond Rejection Sampling: Trajectory Fusion for Scaling Mathematical Reasoning
by: Deng, Jie, et al.
Published: (2026)

How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study
by: Zhang, Zhexin, et al.
Published: (2025)

Data Management For Training Large Language Models: A Survey
by: Wang, Zige, et al.
Published: (2023)

ELICIT: LLM Augmentation via External In-Context Capability
by: Wang, Futing, et al.
Published: (2024)

Asymptotically Optimal Depth Fermionic Permutation on 2D Grid Quantum Architecture without Ancillas
by: Li, Dantong, et al.
Published: (2026)

From Calculation to Adjudication: Examining LLM judges on Mathematical Reasoning Tasks
by: Stephan, Andreas, et al.
Published: (2024)

Guarded Repair for Harm-Aware Post-hoc Replacement of LLM Mathematical Reasoning
by: Xia, Haizhou
Published: (2026)

Heatmap Guided Query Transformers for Robust Astrocyte Detection across Immunostains and Resolutions
by: Zhang, Xizhe, et al.
Published: (2025)

Rethinking Cross-Domain Evaluation for Face Forgery Detection with Semantic Fine-grained Alignment and Mixture-of-Experts
by: Luo, Yuhan, et al.
Published: (2026)

SchoenbAt: Rethinking Attention with Polynomial basis
by: Guo, Yuhan, et al.
Published: (2025)

When to Reason: Semantic Router for vLLM
by: Wang, Chen, et al.
Published: (2025)

Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning
by: Deng, Yihe, et al.
Published: (2025)

ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video
by: Li, Xinhao, et al.
Published: (2023)

Rethinking Polarization in Wurtzite Semiconductors
by: Wang, Ding, et al.
Published: (2024)

Scaling Behaviors of LLM Reinforcement Learning Post-Training: An Empirical Study in Mathematical Reasoning
by: Tan, Zelin, et al.
Published: (2025)

Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving
by: Xu, Xin, et al.
Published: (2025)

ViTE: Virtual Graph Trajectory Expert Router for Pedestrian Trajectory Prediction
by: Li, Ruochen, et al.
Published: (2025)

Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning
by: Deng, Yihe, et al.
Published: (2024)

Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning
by: Wang, Yiming, et al.
Published: (2024)

Rethinking Wireless Communications through Formal Mathematical AI Reasoning
by: Zhao, Changyuan, et al.
Published: (2026)

TraXion: Rethinking Pre-training Frameworks for Mobility and Beyond
by: Hsu, Shang-Ling, et al.
Published: (2026)

Bidirectional Curriculum Generation: A Multi-Agent Framework for Data-Efficient Mathematical Reasoning
by: Hu, Boren, et al.
Published: (2026)

Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification
by: Liu, Chengwu, et al.
Published: (2025)

The Mechanism of the Return Decision‐Making of Rural Migrants in China From the Translocal Perspective: The Case of County Towns in Yangzhou
by: Jiachen Zhang, et al.
Published: (2025)

Same Verdict, Different Reasons: LLM-as-a-Judge and Clinician Disagreement on Medical Chatbot Completeness
by: DeLucia, Alexandra, et al.
Published: (2026)