| Main Authors: | Zhao, Ziqi, Ma, Xinyu, Yang, Liu, Feng, Yujie, Shi, Daiting, He, Jingzhou, Xin, Xin, Ren, Zhaochun, Wu, Xiao-Ming |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.28014 |
Similar Items
Reinforced Efficient Reasoning via Semantically Diverse Exploration
by: Zhao, Ziqi, et al.
Published: (2026)
Curriculum Approximate Unlearning for Session-based Recommendation
by: Yang, Liu, et al.
Published: (2025)
Cold-Starts in Generative Recommendation: A Reproducibility Study
by: Zhang, Zhen, et al.
Published: (2026)
Offline Trajectory Optimization for Offline Reinforcement Learning
by: Zhao, Ziqi, et al.
Published: (2024)
Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models
by: Zhao, Siyan, et al.
Published: (2026)
OISD: On-Policy Internal Self-Distillation of Language Models
by: Liu, Xinyu, et al.
Published: (2026)
Training with Harnesses: On-Policy Harness Self-Distillation for Complex Reasoning
by: Zhao, Zhengyang, et al.
Published: (2026)
Continual Dialogue State Tracking via Reason-of-Select Distillation
by: Feng, Yujie, et al.
Published: (2024)
Model Editing for New Document Integration in Generative Information Retrieval
by: Zhang, Zhen, et al.
Published: (2026)
Improving Sequential Recommenders through Counterfactual Augmentation of System Exposure
by: Zhao, Ziqi, et al.
Published: (2025)
Crosslingual On-Policy Self-Distillation for Multilingual Reasoning
by: Liu, Yihong, et al.
Published: (2026)
Self-Supervised On-Policy Distillation for Reasoning Language Models
by: Tan, Zhiquan, et al.
Published: (2026)
When Are Teacher Tokens Reliable? Position-Weighted On-Policy Self-Distillation for Reasoning
by: Liu, Xiaogeng, et al.
Published: (2026)
Direct Retrieval-augmented Optimization: Synergizing Knowledge Selection and Language Models
by: Shi, Zhengliang, et al.
Published: (2025)
Towards Reason-Informed Video Editing in Unified Models with Self-Reflective Learning
by: Liu, Xinyu, et al.
Published: (2025)
Respecting Self-Uncertainty in On-Policy Self-Distillation for Efficient LLM Reasoning
by: Ke, Junlong, et al.
Published: (2026)
Self-Adaptive Cognitive Debiasing for Large Language Models in Decision-Making
by: Lyu, Yougang, et al.
Published: (2025)
ReflectRM: Boosting Generative Reward Models via Self-Reflection within a Unified Judgment Framework
by: Qin, Kai, et al.
Published: (2026)
Self-Supervised Position Debiasing for Large Language Models
by: Liu, Zhongkun, et al.
Published: (2024)
Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers
by: Shi, Zhengliang, et al.
Published: (2025)
Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents
by: Sun, Weiwei, et al.
Published: (2023)
CRISP: Compressed Reasoning via Iterative Self-Policy Distillation
by: Sang, Hejian, et al.
Published: (2026)
R^3AG: First Workshop on Refined and Reliable Retrieval Augmented Generation
by: Wang, Zihan, et al.
Published: (2024)
DiffuGR: Generative Document Retrieval with Diffusion Language Models
by: Zhao, Xinpeng, et al.
Published: (2025)
CORD: Bridging the Audio-Text Reasoning Gap via Weighted On-policy Cross-modal Distillation
by: Hu, Jing, et al.
Published: (2026)
DreamPolish: Domain Score Distillation With Progressive Geometry Generation
by: Cheng, Yean, et al.
Published: (2024)
Reasoning Compression with Mixed-Policy Distillation
by: Yang, Han, et al.
Published: (2026)
A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone
by: Hao, Jitai, et al.
Published: (2025)
Agentic-R: Learning to Retrieve for Agentic Search
by: Liu, Wenhan, et al.
Published: (2026)
Tailoring High‐Temperature Low‐Loss Properties in Polyimide Films for Energy Storage Through Multiamino Crosslinking
by: Di Wu, et al.
Published: (2026)
A Cooperative Multi-Agent Framework for Zero-Shot Named Entity Recognition
by: Wang, Zihan, et al.
Published: (2025)
Constrained Auto-Regressive Decoding Constrains Generative Retrieval
by: Wu, Shiguang, et al.
Published: (2025)
Mirror: A Multiple-perspective Self-Reflection Method for Knowledge-rich Reasoning
by: Yan, Hanqi, et al.
Published: (2024)
Uncovering Selective State Space Model's Capabilities in Lifelong Sequential Recommendation
by: Yang, Jiyuan, et al.
Published: (2024)
ReleaseEval: A Benchmark for Evaluating Language Models in Automated Release Note Generation
by: Meng, Qianru, et al.
Published: (2025)
Unlocking General Long Chain-of-Thought Reasoning Capabilities of Large Language Models via Representation Engineering
by: Tang, Xinyu, et al.
Published: (2025)
TourRank: Utilizing Large Language Models for Documents Ranking with a Tournament-Inspired Strategy
by: Chen, Yiqun, et al.
Published: (2024)
MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter
by: Hao, Jitai, et al.
Published: (2024)
What are the limits of cross-lingual dense passage retrieval for low-resource languages?
by: Wu, Jie, et al.
Published: (2024)
Skill-Conditioned Gated Self-Distillation for LLM Reasoning
by: Huang, Jiazhen, et al.
Published: (2026)