Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Jin, Yuqi, Shuai, Zhenhao, Hu, Zihan, Zhang, Weiteng, Xie, Weihao, Shuai, Jianwei, Shen, Xian, Feng, Zhen
Format:	Preprint
Published:	2025
Subjects:	Machine Learning Artificial Intelligence
Online Access:	https://arxiv.org/abs/2507.21179
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866914053352325120
author	Jin, Yuqi Shuai, Zhenhao Hu, Zihan Zhang, Weiteng Xie, Weihao Shuai, Jianwei Shen, Xian Feng, Zhen
author_facet	Jin, Yuqi Shuai, Zhenhao Hu, Zihan Zhang, Weiteng Xie, Weihao Shuai, Jianwei Shen, Xian Feng, Zhen
contents	Background and Aims: Large language models (LLMs) have shown remarkable generalization and transfer capabilities by learning from vast corpora of text and web data. Their semantic representations allow cross-task knowledge transfer and reasoning, offering promising opportunities for data-scarce and heterogeneous domains such as clinical medicine. Yet, in diagnostic tasks like sarcopenia, major challenges remain: interpretability, transparency, and deployment efficiency. Traditional machine learning (TML) models provide stable performance and feature-level attribution, ensuring traceable and auditable decision logic, but lack semantic breadth. Conversely, LLMs enable flexible inference but often function as opaque predictors. Existing integration strategies remain shallow, rarely embedding the structured reasoning of TML into LLM inference. Methods: Using sarcopenia diagnosis as a case study, SHapley Additive exPlanations (SHAP) were extracted from a baseline XGBoost model and transformed into structured, LLM-compatible representations. An actor-critic reinforcement learning (RL) strategy guided the LLM to reason over these SHAP-based inputs, producing calibrated rationales and refined decision rules. The distilled reasoning was consolidated into a structured knowledge repository and deployed via retrieval-augmented generation (RAG) for case-based inference. Results: (Omitted here.) Conclusion: By coupling SHAP-derived statistical evidence with reinforcement-trained LLM reasoning, CANDLE mitigates the interpretability-performance trade-off, enhances predictive accuracy, and preserves high decision consistency. The framework offers a scalable approach to knowledge assetization of TML models, enabling interpretable, reproducible, and clinically aligned decision support in sarcopenia and potentially broader medical domains.
format	Preprint
id	arxiv_https___arxiv_org_abs_2507_21179
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	CANDLE: A Cross-Modal Agentic Knowledge Distillation Framework for Interpretable Sarcopenia Diagnosis Jin, Yuqi Shuai, Zhenhao Hu, Zihan Zhang, Weiteng Xie, Weihao Shuai, Jianwei Shen, Xian Feng, Zhen Machine Learning Artificial Intelligence Background and Aims: Large language models (LLMs) have shown remarkable generalization and transfer capabilities by learning from vast corpora of text and web data. Their semantic representations allow cross-task knowledge transfer and reasoning, offering promising opportunities for data-scarce and heterogeneous domains such as clinical medicine. Yet, in diagnostic tasks like sarcopenia, major challenges remain: interpretability, transparency, and deployment efficiency. Traditional machine learning (TML) models provide stable performance and feature-level attribution, ensuring traceable and auditable decision logic, but lack semantic breadth. Conversely, LLMs enable flexible inference but often function as opaque predictors. Existing integration strategies remain shallow, rarely embedding the structured reasoning of TML into LLM inference. Methods: Using sarcopenia diagnosis as a case study, SHapley Additive exPlanations (SHAP) were extracted from a baseline XGBoost model and transformed into structured, LLM-compatible representations. An actor-critic reinforcement learning (RL) strategy guided the LLM to reason over these SHAP-based inputs, producing calibrated rationales and refined decision rules. The distilled reasoning was consolidated into a structured knowledge repository and deployed via retrieval-augmented generation (RAG) for case-based inference. Results: (Omitted here.) Conclusion: By coupling SHAP-derived statistical evidence with reinforcement-trained LLM reasoning, CANDLE mitigates the interpretability-performance trade-off, enhances predictive accuracy, and preserves high decision consistency. The framework offers a scalable approach to knowledge assetization of TML models, enabling interpretable, reproducible, and clinically aligned decision support in sarcopenia and potentially broader medical domains.
title	CANDLE: A Cross-Modal Agentic Knowledge Distillation Framework for Interpretable Sarcopenia Diagnosis
topic	Machine Learning Artificial Intelligence
url	https://arxiv.org/abs/2507.21179

Similar Items