| Main Authors: | Jiang, Yuxuan, Ferraro, Francis |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2606.00305 |
Similar Items
DRP: Distilled Reasoning Pruning with Skill-aware Step Decomposition for Efficient Large Reasoning Models
by: Jiang, Yuxuan, et al.
Published: (2025)
Cornerstones or Stumbling Blocks? Deciphering the Rock Tokens in On-Policy Distillation
by: Jiang, Yuxuan, et al.
Published: (2026)
Reliable Reasoning Path: Distilling Effective Guidance for LLM Reasoning with Knowledge Graphs
by: Xiao, Yilin, et al.
Published: (2025)
Beyond Math: Stories as a Testbed for Memorization-Constrained Reasoning in LLMs
by: Jiang, Yuxuan, et al.
Published: (2024)
Experiments or Outcomes? Probing Scientific Feasibility in Large Language Models
by: Mohammadi, Seyedali, et al.
Published: (2026)
Learning to Reason under Off-Policy Guidance
by: Yan, Jianhao, et al.
Published: (2025)
FRIDA to the Rescue! Analyzing Synthetic Data Effectiveness in Object-Based Common Sense Reasoning for Disaster Response
by: Shichman, Mollie, et al.
Published: (2025)
Synthesis by Design: Controlled Data Generation via Structural Guidance
by: Xu, Lei, et al.
Published: (2025)
Distilling the Essence: Efficient Reasoning Distillation via Sequence Truncation
by: Chen, Wei-Rui, et al.
Published: (2025)
AdaSwitch: Balancing Exploration and Guidance in Knowledge Distillation via Adaptive Switching
by: Peng, Jingyu, et al.
Published: (2025)
Surgical Post-Training: Proximal On-Policy Distillation for Reasoning with Knowledge Retention
by: Lin, Wenye, et al.
Published: (2026)
STEP: Success-Rate-Aware Trajectory-Efficient Policy Optimization
by: Chen, Yuhan, et al.
Published: (2025)
DecomposeRL: Learning to Ask Useful, Informative, and Diverse Questions for Semi-Supervised, Traceable Claim Verification
by: Dipta, Shubhashis Roy, et al.
Published: (2026)
SCRIBE: Structured Mid-Level Supervision for Tool-Using Language Models
by: Jiang, Yuxuan, et al.
Published: (2026)
Hybrid Policy Distillation for LLMs
by: Zhu, Wenhong, et al.
Published: (2026)
TPD: Enhancing Student Language Model Reasoning via Principle Discovery and Guidance
by: Wang, Haorui, et al.
Published: (2024)
Recycling Failures: Salvaging Exploration in RLVR via Fine-Grained Off-Policy Guidance
by: Ren, Yanwei, et al.
Published: (2026)
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space
by: Li, Hengli, et al.
Published: (2025)
Bridging Legal Interpretation and Formal Logic: Faithfulness, Assumption, and the Future of AI Legal Reasoning
by: Wang, Olivia Peiyu, et al.
Published: (2026)
SD-Search: On-Policy Hindsight Self-Distillation for Search-Augmented Reasoning
by: Ma, Yufei, et al.
Published: (2026)
Evaluating and Enhancing Large Language Models for Conversational Reasoning on Knowledge Graphs
by: Huang, Yuxuan
Published: (2023)
Learning How to Use Tools, Not Just When: Pattern-Aware Tool-Integrated Reasoning
by: Xu, Ningning, et al.
Published: (2025)
WellDunn: On the Robustness and Explainability of Language Models and Large Language Models in Identifying Wellness Dimensions
by: Mohammadi, Seyedali, et al.
Published: (2024)
Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixes
by: Fu, Yuqian, et al.
Published: (2026)
Meaningful Learning: Enhancing Abstract Reasoning in Large Language Models via Generic Fact Guidance
by: Xiong, Kai, et al.
Published: (2024)
KV-Distill: Nearly Lossless Learnable Context Compression for LLMs
by: Chari, Vivek, et al.
Published: (2025)
VERGE: Formal Refinement and Guidance Engine for Verifiable LLM Reasoning
by: Singh, Vikash, et al.
Published: (2026)
Hierarchical Budget Policy Optimization for Adaptive Reasoning
by: Lyu, Shangke, et al.
Published: (2025)
Beyond Mimicry to Contextual Guidance: Knowledge Distillation for Interactive AI
by: Wang, Tong, et al.
Published: (2024)
Structural Rationale Distillation via Reasoning Space Compression
by: Yang, Jialin, et al.
Published: (2026)
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
by: Yue, Murong, et al.
Published: (2024)
Improving Mathematical Reasoning Capabilities of Small Language Models via Feedback-Driven Distillation
by: Zhu, Xunyu, et al.
Published: (2024)
Detecting Distillation Data from Reasoning Models
by: Zhang, Hengxiang, et al.
Published: (2025)
MiniLLM: On-Policy Distillation of Large Language Models
by: Gu, Yuxian, et al.
Published: (2023)
Black-Box On-Policy Distillation of Large Language Models
by: Ye, Tianzhu, et al.
Published: (2025)
Thinking with DistilQwen: A Tale of Four Distilled Reasoning and Reward Model Series
by: Cai, Wenrui, et al.
Published: (2025)
TAD: Temporal-Aware Trajectory Self-Distillation for Fast and Accurate Diffusion LLM
by: Zhou, Haoyang, et al.
Published: (2026)
Trajectory as the Teacher: Few-Step Discrete Flow Matching via Energy-Navigated Distillation
by: Monsefi, Amin Karimi, et al.
Published: (2026)
Step Guided Reasoning: Improving Mathematical Reasoning using Guidance Generation and Step Reasoning
by: Cao, Lang, et al.
Published: (2024)
LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization
by: Wu, Xingyu, et al.
Published: (2025)