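| Main Authors: | Liu, Haochen; Li, Weien; Song, Rui; Li, Zeyu; Xue, Chun Jason; Liu, Xiao-Yang; Nallaperuma, Sam; Liu, Xue; Yuan, Ye |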
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Computation and Language |
| Online Access: | https://arxiv.org/abs/2604.01113 |
| _version_ | 1866917378036596736 |
|---|---|
| author | Liu, Haochen; Li, Weien; Song, Rui; Li, Zeyu; Xue, Chun Jason; Liu, Xiao-Yang; Nallaperuma, Sam; Liu, Xue; Yuan, Ye |
| author_facet | Liu, Haochen; Li, Weien; Song, Rui; Li, Zeyu; Xue, Chun Jason; Liu, Xiao-Yang; Nallaperuma, Sam; Liu, Xue; Yuan, Ye |
| contents | Large language model (LLM) systems are increasingly used to support high-stakes decision-making, but they typically perform worse when the available evidence is internally inconsistent. Such a scenario exists in real-world healthcare settings, with patient-reported symptoms contradicting medical signs. To study this problem, we introduce MIMIC-DOS, a dataset for short-horizon organ dysfunction worsening prediction in the intensive care unit (ICU) setting. We derive this dataset from the widely recognized MIMIC-IV, a publicly available electronic health record dataset, and construct it exclusively from cases in which discordance between signs and symptoms exists. This setting poses a substantial challenge for existing LLM-based approaches, with single-pass LLMs and agentic pipelines often struggling to reconcile such conflicting signals. To address this problem, we propose CARE: a multi-stage privacy-compliant agentic reasoning framework in which a remote LLM provides guidance by generating structured categories and transitions without accessing sensitive patient data, while a local LLM uses these categories and transitions to support evidence acquisition and final decision-making. Empirically, CARE achieves stronger performance across all key metrics compared to multiple baseline settings, showing that CARE can more robustly handle conflicting clinical evidence while preserving privacy. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2604_01113 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | CARE: Privacy-Compliant Agentic Reasoning with Evidence Discordance |
| topic | Computation and Language |
| url | https://arxiv.org/abs/2604.01113 |
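
The abstract describes a split in which a remote LLM produces structured categories and transitions without seeing patient data, and a local LLM uses them to gather evidence and decide. The sketch below illustrates only that separation of concerns and is not the authors' implementation. Every name in it (`Guidance`, `remote_plan`, `local_decide`, the category labels, the `judge` callback) is a hypothetical stand-in, and both LLM calls are replaced by plain Python functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Guidance:
    """Patient-agnostic plan (hypothetical shape, not taken from the paper)."""
    categories: List[str]
    transitions: Dict[str, List[str]]  # category -> categories to examine next


def remote_plan(task_description: str) -> Guidance:
    """Stand-in for the remote LLM. It receives only a task description,
    never a patient record, and returns fixed illustrative guidance."""
    return Guidance(
        categories=["symptoms", "vital_signs", "labs"],
        transitions={
            "symptoms": ["vital_signs"],
            "vital_signs": ["labs"],
            "labs": [],
        },
    )


def local_decide(
    guidance: Guidance,
    record: Dict[str, List[str]],
    start: str,
    judge: Callable[[Dict[str, List[str]]], str],
) -> str:
    """Stand-in for the local side. Follows the transitions from `start`,
    collects evidence only for categories named in the guidance, then
    passes that evidence to `judge` (a placeholder for the local LLM)."""
    evidence: Dict[str, List[str]] = {}
    queue = [start]
    while queue:
        category = queue.pop(0)
        if category in evidence or category not in guidance.categories:
            continue
        evidence[category] = record.get(category, [])
        queue.extend(guidance.transitions.get(category, []))
    return judge(evidence)
```

In this sketch the patient record is passed only to `local_decide`, so nothing sensitive reaches `remote_plan`. How the actual framework enforces privacy, and what its categories and transitions look like, is not stated in this record.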