Bibliographic Details
Main Authors:	Jiang, Yuxuan; Ferraro, Francis
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2606.00305

Similar Items

DRP: Distilled Reasoning Pruning with Skill-aware Step Decomposition for Efficient Large Reasoning Models
by: Jiang, Yuxuan, et al.
Published: (2025)

Cornerstones or Stumbling Blocks? Deciphering the Rock Tokens in On-Policy Distillation
by: Jiang, Yuxuan, et al.
Published: (2026)

Reliable Reasoning Path: Distilling Effective Guidance for LLM Reasoning with Knowledge Graphs
by: Xiao, Yilin, et al.
Published: (2025)

Beyond Math: Stories as a Testbed for Memorization-Constrained Reasoning in LLMs
by: Jiang, Yuxuan, et al.
Published: (2024)

Experiments or Outcomes? Probing Scientific Feasibility in Large Language Models
by: Mohammadi, Seyedali, et al.
Published: (2026)

Learning to Reason under Off-Policy Guidance
by: Yan, Jianhao, et al.
Published: (2025)

FRIDA to the Rescue! Analyzing Synthetic Data Effectiveness in Object-Based Common Sense Reasoning for Disaster Response
by: Shichman, Mollie, et al.
Published: (2025)

Synthesis by Design: Controlled Data Generation via Structural Guidance
by: Xu, Lei, et al.
Published: (2025)

Distilling the Essence: Efficient Reasoning Distillation via Sequence Truncation
by: Chen, Wei-Rui, et al.
Published: (2025)

AdaSwitch: Balancing Exploration and Guidance in Knowledge Distillation via Adaptive Switching
by: Peng, Jingyu, et al.
Published: (2025)

Surgical Post-Training: Proximal On-Policy Distillation for Reasoning with Knowledge Retention
by: Lin, Wenye, et al.
Published: (2026)

STEP: Success-Rate-Aware Trajectory-Efficient Policy Optimization
by: Chen, Yuhan, et al.
Published: (2025)

DecomposeRL: Learning to Ask Useful, Informative, and Diverse Questions for Semi-Supervised, Traceable Claim Verification
by: Dipta, Shubhashis Roy, et al.
Published: (2026)

SCRIBE: Structured Mid-Level Supervision for Tool-Using Language Models
by: Jiang, Yuxuan, et al.
Published: (2026)

Hybrid Policy Distillation for LLMs
by: Zhu, Wenhong, et al.
Published: (2026)

TPD: Enhancing Student Language Model Reasoning via Principle Discovery and Guidance
by: Wang, Haorui, et al.
Published: (2024)

Recycling Failures: Salvaging Exploration in RLVR via Fine-Grained Off-Policy Guidance
by: Ren, Yanwei, et al.
Published: (2026)

Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space
by: Li, Hengli, et al.
Published: (2025)

Bridging Legal Interpretation and Formal Logic: Faithfulness, Assumption, and the Future of AI Legal Reasoning
by: Wang, Olivia Peiyu, et al.
Published: (2026)

SD-Search: On-Policy Hindsight Self-Distillation for Search-Augmented Reasoning
by: Ma, Yufei, et al.
Published: (2026)

Evaluating and Enhancing Large Language Models for Conversational Reasoning on Knowledge Graphs
by: Huang, Yuxuan
Published: (2023)

Learning How to Use Tools, Not Just When: Pattern-Aware Tool-Integrated Reasoning
by: Xu, Ningning, et al.
Published: (2025)

WellDunn: On the Robustness and Explainability of Language Models and Large Language Models in Identifying Wellness Dimensions
by: Mohammadi, Seyedali, et al.
Published: (2024)

Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixes
by: Fu, Yuqian, et al.
Published: (2026)

Meaningful Learning: Enhancing Abstract Reasoning in Large Language Models via Generic Fact Guidance
by: Xiong, Kai, et al.
Published: (2024)

KV-Distill: Nearly Lossless Learnable Context Compression for LLMs
by: Chari, Vivek, et al.
Published: (2025)

VERGE: Formal Refinement and Guidance Engine for Verifiable LLM Reasoning
by: Singh, Vikash, et al.
Published: (2026)

Hierarchical Budget Policy Optimization for Adaptive Reasoning
by: Lyu, Shangke, et al.
Published: (2025)

Beyond Mimicry to Contextual Guidance: Knowledge Distillation for Interactive AI
by: Wang, Tong, et al.
Published: (2024)

Structural Rationale Distillation via Reasoning Space Compression
by: Yang, Jialin, et al.
Published: (2026)

DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
by: Yue, Murong, et al.
Published: (2024)

Improving Mathematical Reasoning Capabilities of Small Language Models via Feedback-Driven Distillation
by: Zhu, Xunyu, et al.
Published: (2024)

Detecting Distillation Data from Reasoning Models
by: Zhang, Hengxiang, et al.
Published: (2025)

MiniLLM: On-Policy Distillation of Large Language Models
by: Gu, Yuxian, et al.
Published: (2023)

Black-Box On-Policy Distillation of Large Language Models
by: Ye, Tianzhu, et al.
Published: (2025)

Thinking with DistilQwen: A Tale of Four Distilled Reasoning and Reward Model Series
by: Cai, Wenrui, et al.
Published: (2025)

TAD: Temporal-Aware Trajectory Self-Distillation for Fast and Accurate Diffusion LLM
by: Zhou, Haoyang, et al.
Published: (2026)

Trajectory as the Teacher: Few-Step Discrete Flow Matching via Energy-Navigated Distillation
by: Monsefi, Amin Karimi, et al.
Published: (2026)

Step Guided Reasoning: Improving Mathematical Reasoning using Guidance Generation and Step Reasoning
by: Cao, Lang, et al.
Published: (2024)

LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization
by: Wu, Xingyu, et al.
Published: (2025)