| Main Authors: | Chen, Zhipeng, Min, Yingqian, Zhang, Beichen, Chen, Jie, Jiang, Jinhao, Cheng, Daixuan, Zhao, Wayne Xin, Liu, Zheng, Miao, Xu, Lu, Yang, Fang, Lei, Wang, Zhongyuan, Wen, Ji-Rong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.04548 |
Similar Items
Towards Effective Code-Integrated Reasoning
by: Bai, Fei, et al.
Published: (2025)
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
by: Song, Huatong, et al.
Published: (2025)
Imitate, Explore, and Self-Improve: A Reproduction Report on Slow-thinking Reasoning Systems
by: Min, Yingqian, et al.
Published: (2024)
Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models
by: Sun, Haoxiang, et al.
Published: (2025)
From Trial-and-Error to Improvement: A Systematic Analysis of LLM Exploration Mechanisms in RLVR
by: Deng, Jia, et al.
Published: (2025)
R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning
by: Song, Huatong, et al.
Published: (2025)
ICPC-Eval: Probing the Frontiers of LLM Reasoning with Competitive Programming Contests
by: Xu, Shiyi, et al.
Published: (2025)
Enhancing LLM Reasoning with Reward-guided Tree Search
by: Jiang, Jinhao, et al.
Published: (2024)
Sticker-TTS: Learn to Utilize Historical Experience with a Sticker-driven Test-Time Scaling Framework
by: Chen, Jie, et al.
Published: (2025)
Computer Environments Elicit General Agentic Intelligence in LLMs
by: Cheng, Daixuan, et al.
Published: (2026)
JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models
by: Zhou, Kun, et al.
Published: (2024)
ReasoningLM: Enabling Structural Subgraph Reasoning in Pre-trained Language Models for Question Answering over Knowledge Graph
by: Jiang, Jinhao, et al.
Published: (2023)
Decomposing the Entropy-Performance Exchange: The Missing Keys to Unlocking Effective Reinforcement Learning
by: Deng, Jia, et al.
Published: (2025)
Adaptive Ability Decomposing for Unlocking Large Reasoning Model Effective Reinforcement Learning
by: Chen, Zhipeng, et al.
Published: (2026)
SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis
by: Sun, Shuang, et al.
Published: (2025)
Low-rank Optimization Trajectories Modeling for LLM RLVR Acceleration
by: Chen, Zhipeng, et al.
Published: (2026)
KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph
by: Jiang, Jinhao, et al.
Published: (2024)
Improving Vision-language Models with Perception-centric Process Reward Models
by: Min, Yingqian, et al.
Published: (2026)
Virgo: A Preliminary Exploration on Reproducing o1-like MLLM
by: Du, Yifan, et al.
Published: (2025)
The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models
by: Li, Junyi, et al.
Published: (2024)
Slow Thinking for Sequential Recommendation
by: Zhang, Junjie, et al.
Published: (2025)
Reasoning with Exploration: An Entropy Perspective
by: Cheng, Daixuan, et al.
Published: (2025)
Not Everything is All You Need: Toward Low-Redundant Optimization for Large Language Model Alignment
by: Chen, Zhipeng, et al.
Published: (2024)
SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training
by: Song, Huatong, et al.
Published: (2026)
Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint
by: Chen, Zhipeng, et al.
Published: (2024)
ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting
by: Cheng, Xiaoxue, et al.
Published: (2024)
A Survey of Large Language Models
by: Zhao, Wayne Xin, et al.
Published: (2023)
Towards Effective and Efficient Continual Pre-training of Large Language Models
by: Chen, Jie, et al.
Published: (2024)
LARES: Latent Reasoning for Sequential Recommendation
by: Liu, Enze, et al.
Published: (2025)
Revisiting the Necessity of Lengthy Chain-of-Thought in Vision-centric Reasoning Generalization
by: Du, Yifan, et al.
Published: (2025)
Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment
by: Jiang, Jinhao, et al.
Published: (2024)
Extracting and Combining Abilities For Building Multi-lingual Ability-enhanced Large Language Models
by: Chen, Zhipeng, et al.
Published: (2024)
Analyzing and Mitigating Object Hallucination: A Training Bias Perspective
by: Li, Yifan, et al.
Published: (2025)
Tapping the Potential of Large Language Models as Recommender Systems: A Comprehensive Framework and Empirical Analysis
by: Xu, Lanling, et al.
Published: (2024)
CAFE: Retrieval Head-based Coarse-to-Fine Information Seeking to Enhance Multi-Document QA Capability
by: Peng, Han, et al.
Published: (2025)
Universal Item Tokenization for Transferable Generative Recommendation
by: Zheng, Bowen, et al.
Published: (2025)
Admissible Reconstruction of Reaction-Channel Levels on Fixed Subgroup Support for Cross-Section-Space Probability Table Constructions
by: Zheng, Beichen, et al.
Published: (2026)
Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models
by: Chen, Jie, et al.
Published: (2024)
MMATH: A Multilingual Benchmark for Mathematical Reasoning
by: Luo, Wenyang, et al.
Published: (2025)
Toward Autonomous Long-Horizon Engineering for ML Research
by: Chen, Guoxin, et al.
Published: (2026)