| Main Authors: | Li, Yahan, Jie, Xinyi, Ruan, Wanjia, Zhang, Xubei, Zhu, Huaijie, Gao, Yicheng, Du, Chaohao, Liu, Ruishan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2603.29373 |
Similar Items
MedExAgent: Training LLM Agents to Ask, Examine, and Diagnose in Noisy Clinical Environments
by: Gao, Yicheng, et al.
Published: (2026)
CounselReflect: A Toolkit for Auditing Mental-Health Dialogues
by: Li, Yahan, et al.
Published: (2026)
CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmarking of Large Language Models in Mental Health Question Answering
by: Li, Yahan, et al.
Published: (2025)
From Patient Consultations to Graphs: Leveraging LLMs for Patient Journey Knowledge Graph Construction
by: Khatib, Hassan S. Al, et al.
Published: (2025)
Medical Dialogue: A Survey of Categories, Methods, Evaluation and Challenges
by: Shi, Xiaoming, et al.
Published: (2024)
Cancer-Myth: Evaluating Large Language Models on Patient Questions with False Presuppositions
by: Zhu, Wang Bill, et al.
Published: (2025)
Fairness or Fluency? An Investigation into Language Bias of Pairwise LLM-as-a-Judge
by: Zhou, Xiaolin, et al.
Published: (2026)
Healthcare Copilot: Eliciting the Power of General LLMs for Medical Consultation
by: Ren, Zhiyao, et al.
Published: (2024)
Ask Patients with Patience: Enabling LLMs for Human-Centric Medical Dialogue with Grounded Reasoning
by: Zhu, Jiayuan, et al.
Published: (2025)
LLM-based NLG Evaluation: Current Status and Challenges
by: Gao, Mingqi, et al.
Published: (2024)
Tool Calling: Enhancing Medication Consultation via Retrieval-Augmented Large Language Models
by: Huang, Zhongzhen, et al.
Published: (2024)
Overview of the MEDIQA-OE 2025 Shared Task on Medical Order Extraction from Doctor-Patient Consultations
by: Corbeil, Jean-Philippe, et al.
Published: (2025)
Format Inertia: A Failure Mechanism of LLMs in Medical Pre-Consultation
by: Lim, Seungseop, et al.
Published: (2025)
Towards Reliable Medical LLMs: Benchmarking and Enhancing Confidence Estimation of Large Language Models in Medical Consultation
by: Ren, Zhiyao, et al.
Published: (2026)
Beyond English and Evasion: A Human-Annotated Multi-Domain Benchmark for High-Stakes LLM Safety Evaluation in Chinese
by: Zaghouani, Wajdi, et al.
Published: (2026)
Evaluating the Pre-Consultation Ability of LLMs using Diagnostic Guidelines
by: Seo, Jean, et al.
Published: (2026)
On the Calibration of Multilingual Question Answering LLMs
by: Yang, Yahan, et al.
Published: (2023)
Learning Word Embedding with Better Distance Weighting and Window Size Scheduling
by: Yang, Chaohao, et al.
Published: (2024)
From Fuzzy Speech to Medical Insight: Benchmarking LLMs on Noisy Patient Narratives
by: Mama, Eden, et al.
Published: (2025)
Question Answering on Patient Medical Records with Private Fine-Tuned LLMs
by: Kothari, Sara, et al.
Published: (2025)
ODE: Open-Set Evaluation of Hallucinations in Multimodal Large Language Models
by: Tu, Yahan, et al.
Published: (2024)
Large Language Model Evaluation via Matrix Nuclear-Norm
by: Li, Yahan, et al.
Published: (2024)
Beyond Isolated Behaviors: Hierarchical User Modeling for LLM Personalization
by: Wang, Liang, et al.
Published: (2026)
Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling
by: Ruan, Jie, et al.
Published: (2024)
Evaluating ChatGPT on Medical Information Extraction Tasks: Performance, Explainability and Beyond
by: Li, Liz, et al.
Published: (2026)
MinosEval: Distinguishing Factoid and Non-Factoid for Tailored Open-Ended QA Evaluation with LLMs
by: Fan, Yongqi, et al.
Published: (2025)
Beyond Paper-to-Paper: Structured Profiling and Rubric Scoring for Paper-Reviewer Matching
by: Pan, Yicheng, et al.
Published: (2026)
FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability
by: Xia, Congying, et al.
Published: (2024)
Solid Medication Intake in Hospitalised Patients With Dysphagia: A Challenge for Speech and Language Pathologists?
by: Michaela Trapl‐Grundschober, et al.
Published: (2025)
Beyond Survival: Evaluating LLMs in Social Deduction Games with Human-Aligned Strategies
by: Song, Zirui, et al.
Published: (2025)
MediEval: A Unified Medical Benchmark for Patient-Contextual and Knowledge-Grounded Reasoning in LLMs
by: Qu, Zhan, et al.
Published: (2025)
Evaluating Alignment of Behavioral Dispositions in LLMs
by: Taubenfeld, Amir, et al.
Published: (2026)
MED-COPILOT: A Medical Assistant Powered by GraphRAG and Similar Patient Case Retrieval
by: Chen, Shuheng, et al.
Published: (2026)
MM-LLMs: Recent Advances in MultiModal Large Language Models
by: Zhang, Duzhen, et al.
Published: (2024)
Listening to Patients: A Framework of Detecting and Mitigating Patient Misreport for Medical Dialogue Generation
by: Qin, Lang, et al.
Published: (2024)
The role of infrastructure investment location in China's Western development / Xubei Luo
by: Luo, Xubei
Published: (2004)
Growth spillover effects and regional development patterns : the case of Chinese provinces / Xubei Luo
by: Luo, Xubei
Published: (2005)
Regional disparities in labor market performance in Croatia : the role of individual and regional structural characteristics / Xubei Luo
by: Luo, Xubei
Published: (2007)
EMRModel: A Large Language Model for Extracting Medical Consultation Dialogues into Structured Medical Records
by: Zhao, Shuguang, et al.
Published: (2025)
Time-Critical Multimodal Medical Transportation: Organs, Patients, and Medical Supplies
by: Varnousfaderani, Elaheh Sabziyan, et al.
Published: (2026)