:: Library Catalog

Bibliographic Details
Main Authors:	Li, Yahan, Jie, Xinyi, Ruan, Wanjia, Zhang, Xubei, Zhu, Huaijie, Gao, Yicheng, Du, Chaohao, Liu, Ruishan
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2603.29373

Similar Items

MedExAgent: Training LLM Agents to Ask, Examine, and Diagnose in Noisy Clinical Environments
by: Gao, Yicheng, et al.
Published: (2026)

CounselReflect: A Toolkit for Auditing Mental-Health Dialogues
by: Li, Yahan, et al.
Published: (2026)

CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmarking of Large Language Models in Mental Health Question Answering
by: Li, Yahan, et al.
Published: (2025)

From Patient Consultations to Graphs: Leveraging LLMs for Patient Journey Knowledge Graph Construction
by: Khatib, Hassan S. Al, et al.
Published: (2025)

Medical Dialogue: A Survey of Categories, Methods, Evaluation and Challenges
by: Shi, Xiaoming, et al.
Published: (2024)

Cancer-Myth: Evaluating Large Language Models on Patient Questions with False Presuppositions
by: Zhu, Wang Bill, et al.
Published: (2025)

Fairness or Fluency? An Investigation into Language Bias of Pairwise LLM-as-a-Judge
by: Zhou, Xiaolin, et al.
Published: (2026)

Healthcare Copilot: Eliciting the Power of General LLMs for Medical Consultation
by: Ren, Zhiyao, et al.
Published: (2024)

Ask Patients with Patience: Enabling LLMs for Human-Centric Medical Dialogue with Grounded Reasoning
by: Zhu, Jiayuan, et al.
Published: (2025)

LLM-based NLG Evaluation: Current Status and Challenges
by: Gao, Mingqi, et al.
Published: (2024)

Tool Calling: Enhancing Medication Consultation via Retrieval-Augmented Large Language Models
by: Huang, Zhongzhen, et al.
Published: (2024)

Overview of the MEDIQA-OE 2025 Shared Task on Medical Order Extraction from Doctor-Patient Consultations
by: Corbeil, Jean-Philippe, et al.
Published: (2025)

Format Inertia: A Failure Mechanism of LLMs in Medical Pre-Consultation
by: Lim, Seungseop, et al.
Published: (2025)

Towards Reliable Medical LLMs: Benchmarking and Enhancing Confidence Estimation of Large Language Models in Medical Consultation
by: Ren, Zhiyao, et al.
Published: (2026)

Beyond English and Evasion: A Human-Annotated Multi-Domain Benchmark for High-Stakes LLM Safety Evaluation in Chinese
by: Zaghouani, Wajdi, et al.
Published: (2026)

Evaluating the Pre-Consultation Ability of LLMs using Diagnostic Guidelines
by: Seo, Jean, et al.
Published: (2026)

On the Calibration of Multilingual Question Answering LLMs
by: Yang, Yahan, et al.
Published: (2023)

Learning Word Embedding with Better Distance Weighting and Window Size Scheduling
by: Yang, Chaohao, et al.
Published: (2024)

From Fuzzy Speech to Medical Insight: Benchmarking LLMs on Noisy Patient Narratives
by: Mama, Eden, et al.
Published: (2025)

Question Answering on Patient Medical Records with Private Fine-Tuned LLMs
by: Kothari, Sara, et al.
Published: (2025)

ODE: Open-Set Evaluation of Hallucinations in Multimodal Large Language Models
by: Tu, Yahan, et al.
Published: (2024)

Large Language Model Evaluation via Matrix Nuclear-Norm
by: Li, Yahan, et al.
Published: (2024)

Beyond Isolated Behaviors: Hierarchical User Modeling for LLM Personalization
by: Wang, Liang, et al.
Published: (2026)

Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling
by: Ruan, Jie, et al.
Published: (2024)

Evaluating ChatGPT on Medical Information Extraction Tasks: Performance, Explainability and Beyond
by: Li, Liz, et al.
Published: (2026)

MinosEval: Distinguishing Factoid and Non-Factoid for Tailored Open-Ended QA Evaluation with LLMs
by: Fan, Yongqi, et al.
Published: (2025)

Beyond Paper-to-Paper: Structured Profiling and Rubric Scoring for Paper-Reviewer Matching
by: Pan, Yicheng, et al.
Published: (2026)

FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability
by: Xia, Congying, et al.
Published: (2024)

Solid Medication Intake in Hospitalised Patients With Dysphagia: A Challenge for Speech and Language Pathologists?
by: Trapl‐Grundschober, Michaela, et al.
Published: (2025)

Beyond Survival: Evaluating LLMs in Social Deduction Games with Human-Aligned Strategies
by: Song, Zirui, et al.
Published: (2025)

MediEval: A Unified Medical Benchmark for Patient-Contextual and Knowledge-Grounded Reasoning in LLMs
by: Qu, Zhan, et al.
Published: (2025)

Evaluating Alignment of Behavioral Dispositions in LLMs
by: Taubenfeld, Amir, et al.
Published: (2026)

MED-COPILOT: A Medical Assistant Powered by GraphRAG and Similar Patient Case Retrieval
by: Chen, Shuheng, et al.
Published: (2026)

MM-LLMs: Recent Advances in MultiModal Large Language Models
by: Zhang, Duzhen, et al.
Published: (2024)

Listening to Patients: A Framework of Detecting and Mitigating Patient Misreport for Medical Dialogue Generation
by: Qin, Lang, et al.
Published: (2024)

The role of infrastructure investment location in China's Western development / Xubei Luo
by: Luo, Xubei
Published: (2004)

Growth spillover effects and regional development patterns : the case of Chinese provinces / Xubei Luo
by: Luo, Xubei
Published: (2005)

Regional disparities in labor market performance in Croatia : the role of individual and regional structural characteristics / Xubei Luo
by: Luo, Xubei
Published: (2007)

EMRModel: A Large Language Model for Extracting Medical Consultation Dialogues into Structured Medical Records
by: Zhao, Shuguang, et al.
Published: (2025)

Time-Critical Multimodal Medical Transportation: Organs, Patients, and Medical Supplies
by: Varnousfaderani, Elaheh Sabziyan, et al.
Published: (2026)