Bibliographic Details
Main Authors: Guo, Wei; Lu, Siyuan; Ran, Xiangdong; Tong, Yiqi; Ban, Yikun; Xu, Zelong; Fan, Jing; Huang, Zixuan; Zhang, Xiao; Hu, Zhaojun; Zhuang, Fuzhen
Format: Preprint
Published: 2026
Subjects: Artificial Intelligence
Online Access: https://arxiv.org/abs/2602.18749
contents Data allocation plays a critical role in federated reasoning collaboration between a large language model (LLM) and small language models (SLMs). Nevertheless, existing data allocation methods fail to address an under-explored challenge in this collaboration: the bidirectional model learnability gap, where client-side SLMs cannot identify high-reward samples that match their learnability constraints for effective knowledge transfer from LLMs, while LLMs struggle to select samples that contribute novel knowledge beyond their existing data. Furthermore, these collaboration frameworks face another key challenge: domain-agnostic reasoning transfer, where existing reasoning transfer methods fail to adapt flexibly to local domain data, preventing SLMs from effectively acquiring step-by-step reasoning abilities from a general LLM. To address these challenges, we propose LaDa, a federated reasoning distillation framework with model learnability-aware data allocation. It introduces a model learnability-aware data filter that adaptively allocates high-reward samples based on the learnability gap between each SLM and LLM pair, facilitating bidirectional knowledge transfer. We further design a domain-adaptive reasoning distillation method that aligns the joint probabilities of reasoning paths on the filtered high-reward samples through contrastive distillation learning between the SLM and the LLM, enabling the SLM to capture the underlying reasoning patterns under its local data distribution. LaDa operates as a plug-in module for existing collaboration frameworks, adapting knowledge transfer based on model learnability gaps.
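The abstract does not give formulas, so the following is only an illustrative sketch of two ideas it names: the joint probability of a reasoning path and a filter driven by a learnability gap between an SLM and an LLM. It assumes both models score the same tokenized reasoning path. The gap definition (mean per-token log-probability difference), the thresholds `low` and `high`, and all function and field names are hypothetical and are not taken from LaDa.

```python
import math
from typing import Dict, List, Sequence


def path_log_prob(token_probs: Sequence[float]) -> float:
    """Joint log-probability of a reasoning path.

    Under an autoregressive model the joint probability of a path is the
    product of its per-token probabilities, so the log is a plain sum.
    """
    return sum(math.log(p) for p in token_probs)


def learnability_gap(slm_probs: Sequence[float], llm_probs: Sequence[float]) -> float:
    """Hypothetical gap: mean per-token log-prob difference (LLM minus SLM).

    Both sequences must score the same tokens of the same reasoning path.
    """
    if len(slm_probs) != len(llm_probs) or not llm_probs:
        raise ValueError("both models must score the same non-empty path")
    return (path_log_prob(llm_probs) - path_log_prob(slm_probs)) / len(llm_probs)


def filter_samples(samples: List[Dict], low: float, high: float) -> List[Dict]:
    """Keep samples whose gap lies in [low, high] (assumed thresholds).

    Intuition for this sketch only: a gap near zero means the SLM has little
    left to learn from the sample, and a very large gap means the sample is
    likely beyond what the SLM can currently absorb.
    """
    return [s for s in samples if low <= learnability_gap(s["slm"], s["llm"]) <= high]


# Toy per-token probabilities for one two-token reasoning path per sample.
samples = [
    {"id": "easy", "slm": [0.9, 0.9], "llm": [0.9, 0.9]},
    {"id": "medium", "slm": [0.5, 0.5], "llm": [0.9, 0.9]},
    {"id": "hard", "slm": [0.01, 0.01], "llm": [0.9, 0.9]},
]
kept_ids = [s["id"] for s in filter_samples(samples, low=0.1, high=2.0)]
print(kept_ids)
```

With these toy numbers only the middle sample falls inside the assumed band. A real system would obtain the per-token probabilities from the models themselves and would need a principled way to set the band per SLM and LLM pair.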
format Preprint
id arxiv_https___arxiv_org_abs_2602_18749
institution arXiv
publishDate 2026
record_format arxiv
title Federated Reasoning Distillation Framework with Model Learnability-Aware Data Allocation
topic Artificial Intelligence
url https://arxiv.org/abs/2602.18749