| Main Authors: | Lazaridis, Aristotelis, Bates, Dylan, Sharma, Aman, King, Brian, Lu, Vincent, FitzGerald, Jack |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.23493 |
Similar Items
Measuring and Eliminating Refusals in Military Large Language Models
by: FitzGerald, Jack, et al.
Published: (2026)
EdgeRunner 20B: Military Task Parity with GPT-5 while Running on the Edge
by: FitzGerald, Jack, et al.
Published: (2025)
PHLoRA: data-free Post-hoc Low-Rank Adapter extraction from full-rank checkpoint
by: Vasani, Bhoomit, et al.
Published: (2025)
The Twin Purposes of Guided Inquiry: Guiding Student Inquiry and Evidence-Based Practice
by: FitzGerald, Lee
Published: (2010)
MATTER: Memory-Augmented Transformer Using Heterogeneous Knowledge Sources
by: Lee, Dongkyu, et al.
Published: (2024)
Impossible Things in the Age of Reformations
by: Brian FitzGerald
Published: (2025)
OPD+: Rethinking the Advantage Design for On-Policy Distillation
by: Zhao, Hanyang, et al.
Published: (2026)
Flow-OPD: On-Policy Distillation for Flow Matching Models
by: Fang, Zhen, et al.
Published: (2026)
DP-OPD: Differentially Private On-Policy Distillation for Language Models
by: Khadem, Fatemeh, et al.
Published: (2026)
Prune-OPD: Efficient and Reliable On-Policy Distillation for Long-Horizon Reasoning
by: Yang, Zhicheng, et al.
Published: (2026)
HDPO: Hybrid Distillation Policy Optimization via Privileged Self-Distillation
by: Ding, Ken
Published: (2026)
$\boldsymbol{f}$-OPD: Stabilizing Long-Horizon On-Policy Distillation with Freshness-Aware Control
by: Chen, Xianwei, et al.
Published: (2026)
Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation
by: Wu, Yecheng, et al.
Published: (2026)
MAD-OPD: Breaking the Ceiling in On-Policy Distillation via Multi-Agent Debate
by: Wang, Jianze, et al.
Published: (2026)
X-OPD: Cross-Modal On-Policy Distillation for Capability Alignment in Speech LLMs
by: Cao, Di, et al.
Published: (2026)
Vision-OPD: Learning to See Fine Details for Multimodal LLMs via On-Policy Self-Distillation
by: Yuan, Qianhao, et al.
Published: (2026)
Erika Pani, Para pertenecer a la gran familia mexicana: procesos de naturalización en el siglo xix, México, El Colegio de México, 2015, 204 pp. ISBN 978-607-462-713-8
by: David Scott FitzGerald
Published: (2016)
Catherine Vézina, Diplomacia migratoria: una historia transnacional del Programa Bracero, 1947-1952, México, Centro de Investigación y Docencia Económicas, 2017, 404 pp. ISBN 978-607-446-102-2
by: David Scott FitzGerald
Published: (2020)
QSTToolkit: A Python Library for Deep Learning Powered Quantum State Tomography
by: FitzGerald, George, et al.
Published: (2025)
Privileged Information Distillation for Language Models
by: Penaloza, Emiliano, et al.
Published: (2026)
EDGE-GRPO: Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity
by: Zhang, Xingjian, et al.
Published: (2025)
"Just Let Me Go at It": Exploring Students' Use and Perceptions of Guided Inquiry
by: Garrison, Kasey L., et al.
Published: (2018)
Distilling Privileged Information for Dubins Traveling Salesman Problems with Neighborhoods
by: Shin, Min Kyu, et al.
Published: (2024)
Random walks in Weyl chambers
by: Denisov, Denis, et al.
Published: (2025)
Critique de la méthode de distinction entre poissons anadromes et dulcicoles de la même espèce par la teneur en strontium de leurs écailles
by: Castonguay, M., FitzGerald, G. J
Published: (1982)
Ordered random walks and the Airy line ensemble
by: Denisov, Denis, et al.
Published: (2024)
Library Signage: Applications for the Apple Macintosh and MacPaint.
by: Diskin, Jill A., et al.
Published: (1984)
Same Evidence, Different Answers: Canonical-Context On-Policy Distillation for Multi-Turn Language Models
by: Lin, Zizhuo, et al.
Published: (2026)
Decoupling KL and Trajectories: A Unified Perspective for SFT, DAgger, Offline RL, and OPD in LLM Distillation
by: Zhao, Anhao, et al.
Published: (2026)
OISD: On-Policy Internal Self-Distillation of Language Models
by: Liu, Xinyu, et al.
Published: (2026)
CAPF: Guiding Search-Agent Rollouts with Credit-Attenuated Privileged Feedback
by: Chen, Bin, et al.
Published: (2026)
Distilling Privileged Multimodal Information for Expression Recognition using Optimal Transport
by: Aslam, Muhammad Haseeb, et al.
Published: (2024)
Depth-Guided Self-Supervised Human Keypoint Detection via Cross-Modal Distillation
by: Anand, Aman, et al.
Published: (2024)
Characterization and Evaluation of Interferential Current Stimulation for Functional Electrical Stimulation
by: Rodrigo Osorio, et al.
Published: (2025)
Wanda++: Pruning Large Language Models via Regional Gradients
by: Yang, Yifan, et al.
Published: (2025)
The effect of predation on the dynamics of Chronic Wasting Disease in deer
by: FitzGerald, Cody E., et al.
Published: (2025)
OPSDL: On-Policy Self-Distillation for Long-Context Language Models
by: Zhang, Xinsen, et al.
Published: (2026)
Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs
by: Singhal, Ronit, et al.
Published: (2024)
ProteinOPD: Towards Effective and Efficient Preference Alignment for Protein Design
by: Zhang, Yulin, et al.
Published: (2026)
POPE: Learning to Reason on Hard Problems via Privileged On-Policy Exploration
by: Qu, Yuxiao, et al.
Published: (2026)