| Main Authors: | Wang, Hao, Gu, Hao, Piao, Hongming, Gong, Kaixiong, Ye, Yuxiao, Yue, Xiangyu, Han, Sirui, Guo, Yike, Wu, Dapeng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.02244 |
Similar Items
TinyThinker: Distilling Reasoning through Coarse-to-Fine Knowledge Internalization with Self-Reflection
by: Piao, Shengmin, et al.
Published: (2024)
Basic Reading Distillation
by: Zhou, Zhi, et al.
Published: (2025)
Supervised Fine-Tuning as Inverse Reinforcement Learning
by: Sun, Hao
Published: (2024)
Rotation-Preserving Supervised Fine-Tuning
by: Jin, Hangzhan, et al.
Published: (2026)
Staying Healthy While You Are Pregnant
Published: (2025)
On-Policy Supervised Fine-Tuning for Efficient Reasoning
by: Zhao, Anhao, et al.
Published: (2026)
Reassessing the Role of Supervised Fine-Tuning: An Empirical Study in VLM Reasoning
by: Yu, Yongcan, et al.
Published: (2025)
Adaptive Federated Fine-Tuning of Self-Supervised Speech Representations
by: Guo, Xin, et al.
Published: (2026)
Semantic Loss Guided Data Efficient Supervised Fine Tuning for Safe Responses in LLMs
by: Lu, Yuxiao, et al.
Published: (2024)
Understanding Overadaptation in Supervised Fine-Tuning: The Role of Ensemble Methods
by: Hao, Yifan, et al.
Published: (2025)
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
by: Zhang, Yiyuan, et al.
Published: (2024)
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting
by: Diao, Muxi, et al.
Published: (2026)
Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models
by: Zhang, Zhejun, et al.
Published: (2024)
Learning to Stay Safe: Adaptive Regularization Against Safety Degradation during Fine-Tuning
by: Goel, Jyotin, et al.
Published: (2026)
Asymmetric Advantage Modulation Calibrates Entropy Dynamics in RLVR
by: Gu, Hengrui, et al.
Published: (2026)
OGLS-SD: On-Policy Self-Distillation with Outcome-Guided Logit Steering for LLM Reasoning
by: Yang, Yuxiao, et al.
Published: (2026)
On the Role of Reasoning Patterns in the Generalization Discrepancy of Long Chain-of-Thought Supervised Fine-Tuning
by: Li, Zhaoyi, et al.
Published: (2026)
Video-R1: Reinforcing Video Reasoning in MLLMs
by: Feng, Kaituo, et al.
Published: (2025)
Preserving Diversity in Supervised Fine-Tuning of Large Language Models
by: Li, Ziniu, et al.
Published: (2024)
Sub-MoE: Efficient Mixture-of-Expert LLMs Compression via Subspace Expert Merging
by: Li, Lujun, et al.
Published: (2025)
Supervised Fine-Tuning Needs to Unlock the Potential of Token Priority
by: Shen, Zhanming, et al.
Published: (2026)
BIFRÖST: 3D-Aware Image compositing with Language Instructions
by: Li, Lingxiao, et al.
Published: (2024)
How to Stay Curious while Avoiding Noisy TVs using Aleatoric Uncertainty Estimation
by: Mavor-Parker, Augustine N., et al.
Published: (2021)
Fine-Tuning Robot Policies While Maintaining User Privacy
by: Christie, Benjamin A., et al.
Published: (2025)
Preserving Multilingual Quality While Tuning Query Encoder on English Only
by: Vasilyev, Oleg, et al.
Published: (2024)
Token Cleaning: Fine-Grained Data Selection for LLM Supervised Fine-Tuning
by: Pang, Jinlong, et al.
Published: (2025)
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
by: Jiang, Liming, et al.
Published: (2025)
Policy Gradient with Adaptive Entropy Annealing for Continual Fine-Tuning
by: Zhang, Yaqian, et al.
Published: (2026)
A Layer-wise Analysis of Supervised Fine-Tuning
by: Zhao, Qinghua, et al.
Published: (2026)
Quaff: Quantized Parameter-Efficient Fine-Tuning under Outlier Spatial Stability Hypothesis
by: Huang, Hong, et al.
Published: (2025)
EGAD: Entropy-Guided Adaptive Distillation for Token-Level Knowledge Transfer
by: Zhang, Hao, et al.
Published: (2026)
Natural Language Fine-Tuning
by: Liu, Jia, et al.
Published: (2024)
Self-Supervised On-Policy Distillation for Reasoning Language Models
by: Tan, Zhiquan, et al.
Published: (2026)
Pioneering Reliable Assessment in Text-to-Image Knowledge Editing: Leveraging a Fine-Grained Dataset and an Innovative Criterion
by: Gu, Hengrui, et al.
Published: (2024)
Reasoning While Recommending: Entropy-Guided Latent Reasoning in Generative Re-ranking Models
by: Zhang, Changshuo
Published: (2026)
M-GRPO: Stabilizing Self-Supervised Reinforcement Learning for Large Language Models with Momentum-Anchored Policy Optimization
by: Bai, Bizhe, et al.
Published: (2025)
Remote Training in Task-Oriented Communication: Supervised or Self-Supervised with Fine-Tuning?
by: Li, Hongru, et al.
Published: (2025)
Reasoning Model Unlearning: Forgetting Traces, Not Just Answers, While Preserving Reasoning Skills
by: Wang, Changsheng, et al.
Published: (2025)
QFFT, Question-Free Fine-Tuning for Adaptive Reasoning
by: Liu, Wanlong, et al.
Published: (2025)
RoaD: Rollouts as Demonstrations for Closed-Loop Supervised Fine-Tuning of Autonomous Driving Policies
by: Garcia-Cobo, Guillermo, et al.
Published: (2025)