| Main Authors: | Sun, Jie, Zheng, Mao, Song, Mingyang, Zhong, Qiyong, Cheng, Yilin, Feng, Bichuan, Liu, Pengfei, Fang, Junfeng, Wang, Xiang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.07711 |
Similar Items
SOD: Step-wise On-policy Distillation for Small Language Model Agents
by: Zhong, Qiyong, et al.
Published: (2026)
SimCT: A Simple Consistency Test Protocol in LLMs Development Lifecycle
by: Zhao, Fufangchen, et al.
Published: (2024)
A Survey of On-Policy Distillation for Large Language Models
by: Song, Mingyang, et al.
Published: (2026)
Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing
by: Li, Gengsheng, et al.
Published: (2026)
Rubric-based On-policy Distillation
by: Fang, Junfeng, et al.
Published: (2026)
TokenCompose: Text-to-Image Diffusion with Token-level Supervision
by: Wang, Zirui, et al.
Published: (2023)
Model Merging in the Era of Large Language Models: Methods, Applications, and Future Directions
by: Song, Mingyang, et al.
Published: (2026)
Walk Before You Run! Concise LLM Reasoning via Reinforcement Learning
by: Song, Mingyang, et al.
Published: (2025)
A Survey of Query Optimization in Large Language Models
by: Song, Mingyang, et al.
Published: (2024)
Pctx: Tokenizing Personalized Context for Generative Recommendation
by: Zhong, Qiyong, et al.
Published: (2025)
MAIGO: Mitigating Lost-in-Conversation with History-Cleaned On-Policy Self-Distillation
by: Zheng, Haoyu, et al.
Published: (2026)
Cross-Source Supervision for Bone Infection Segmentation in Dual-Modality PET-CT
by: Yang, Zonglin, et al.
Published: (2026)
Recovering the Wedge Modes Lost to 21-cm Foregrounds
by: Gagnon-Hartman, Samuel, et al.
Published: (2021)
Lost in Projection? Gaussian Filtering Recovers Hidden Conformational States
by: Sartore, Sofia, et al.
Published: (2026)
PRISM: Probability Reallocation with In-Span Masking for Knowledge-Sensitive Alignment
by: Xu, Chenning, et al.
Published: (2026)
Counting-Stars: A Multi-evidence, Position-aware, and Scalable Benchmark for Evaluating Long-Context Large Language Models
by: Song, Mingyang, et al.
Published: (2024)
GRP: Goal-Reversed Prompting for Zero-Shot Evaluation with LLMs
by: Song, Mingyang, et al.
Published: (2025)
Beyond the Illusion of Consensus: From Surface Heuristics to Knowledge-Grounded Evaluation in LLM-as-a-Judge
by: Song, Mingyang, et al.
Published: (2026)
AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation
by: Gu, Yuchao, et al.
Published: (2026)
SPIN: Self-Supervised Prompt INjection
by: Zhou, Leon, et al.
Published: (2024)
TIP: Token Importance in On-Policy Distillation
by: Xu, Yuanda, et al.
Published: (2026)
Distilling Transitional Pattern to Large Language Models for Multimodal Session-based Recommendation
by: Su, Jiajie, et al.
Published: (2025)
Enhancing Cross-Tokenizer Knowledge Distillation with Contextual Dynamical Mapping
by: Chen, Yijie, et al.
Published: (2025)
Lost in Multilinguality: Dissecting Cross-lingual Factual Inconsistency in Transformer Language Models
by: Wang, Mingyang, et al.
Published: (2025)
CrossKD: Cross-Head Knowledge Distillation for Object Detection
by: Wang, Jiabao, et al.
Published: (2023)
ForSim: Stepwise Forward Simulation for Traffic Policy Fine-Tuning
by: Chen, Keyu, et al.
Published: (2026)
HardMTBench: Stress-Testing Chinese-English Translation on Knowledge-Intensive Domains
by: Li, Zheng, et al.
Published: (2026)
IFMTBench: A Comprehensive Benchmark for Multilingual Translation Instruction Following
by: Sun, Mingrui, et al.
Published: (2026)
MiMoTable: A Multi-scale Spreadsheet Benchmark with Meta Operations for Table Reasoning
by: Li, Zheng, et al.
Published: (2024)
TAT-R1: Terminology-Aware Translation with Reinforcement Learning and Word Alignment
by: Li, Zheng, et al.
Published: (2025)
PodBench: A Comprehensive Benchmark for Instruction-Aware Audio-Oriented Podcast Script Generation
by: Xu, Chenning, et al.
Published: (2026)
CTPD: Cross Tokenizer Preference Distillation
by: Nguyen, Truong, et al.
Published: (2026)
SS-CTML: Self-Supervised Cross-Task Mutual Learning for CT Image Reconstruction
by: Chen, Gaofeng, et al.
Published: (2024)
CodeDelegator: Mitigating Context Pollution via Role Separation in Code-as-Action Agents
by: Fei, Tianxiang, et al.
Published: (2026)
Hybrid Policy Distillation for LLMs
by: Zhu, Wenhong, et al.
Published: (2026)
LinguDistill: Recovering Linguistic Ability in Vision-Language Models via Selective Cross-Modal Distillation
by: Irawan, Patrick Amadeus, et al.
Published: (2026)
SCKD: Semi-Supervised Cross-Modality Knowledge Distillation for 4D Radar Object Detection
by: Xu, Ruoyu, et al.
Published: (2024)
X-Token: Projection-Guided Cross-Tokenizer Knowledge Distillation
by: Sreenivas, Sharath Turuvekere, et al.
Published: (2026)
Lost in Tokenization: Fundamental Trade-offs in Graph Tokenization for Transformers
by: Bechler-Speicher, Maya, et al.
Published: (2026)
LoGoFair: Post-Processing for Local and Global Fairness in Federated Learning
by: Zhang, Li, et al.
Published: (2025)