| Main Authors: | Romano, Antonio, Riccio, Giuseppe, Barone, Mariano, Postiglione, Marco, Moscato, Vincenzo |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.18468 |
Similar Items
DART: A Structured Dataset of Regulatory Drug Documents in Italian for Clinical NLP
by: Barone, Mariano, et al.
Published: (2025)
Combating Biomedical Misinformation through Multi-modal Claim Detection and Evidence-based Verification
by: Barone, Mariano, et al.
Published: (2025)
Combining Evidence and Reasoning for Biomedical Fact-Checking
by: Barone, Mariano, et al.
Published: (2025)
SOLVE-Med: Specialized Orchestration for Leading Vertical Experts across Medical Specialties
by: Di Marino, Roberta, et al.
Published: (2025)
Can "AI" Be a Doctor? A Study of Empathy, Readability, and Alignment in Clinical LLMs
by: Barone, Mariano, et al.
Published: (2026)
Brain3D: Brain Report Automation via Inflated Vision Transformers in 3D
by: Barone, Mariano, et al.
Published: (2026)
Agent-Based Modelling Meets Generative AI in Social Network Simulations
by: Ferraro, Antonino, et al.
Published: (2024)
Hallucination Benchmark in Medical Visual Question Answering
by: Wu, Jinge, et al.
Published: (2024)
A Benchmark for Long-Form Medical Question Answering
by: Hosseini, Pedram, et al.
Published: (2024)
Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions
by: Chen, Hanjie, et al.
Published: (2024)
Assessing the Potential of Generative Agents in Crowdsourced Fact-Checking
by: Costabile, Luigia, et al.
Published: (2025)
MedExQA: Medical Question Answering Benchmark with Multiple Explanations
by: Kim, Yunsoo, et al.
Published: (2024)
How Good LLMs Are at Answering Bangla Medical Visual Questions? Dataset and Benchmarking
by: Ahmed, Rafid, et al.
Published: (2026)
MedExpQA: Multilingual Benchmarking of Large Language Models for Medical Question Answering
by: Alonso, Iñigo, et al.
Published: (2024)
MedAraBench: Large-Scale Arabic Medical Question Answering Dataset and Benchmark
by: Abu-Daoud, Mouath, et al.
Published: (2026)
Efficient Medical Question Answering with Knowledge-Augmented Question Generation
by: Khlaut, Julien, et al.
Published: (2024)
Trustworthy Medical Question Answering: An Evaluation-Centric Survey
by: Wang, Yinuo, et al.
Published: (2025)
AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset
by: Olatunji, Tobi, et al.
Published: (2024)
Social Bias in Popular Question-Answering Benchmarks
by: Kraft, Angelie, et al.
Published: (2025)
Argument-Based Comparative Question Answering Evaluation Benchmark
by: Nikishina, Irina, et al.
Published: (2025)
MULTITAT: Benchmarking Multilingual Table-and-Text Question Answering
by: Zhang, Xuanliang, et al.
Published: (2025)
MedEthicsQA: A Comprehensive Question Answering Benchmark for Medical Ethics Evaluation of LLMs
by: Wei, Jianhui, et al.
Published: (2025)
GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering
by: Muller, Sacha, et al.
Published: (2024)
Rationale-Guided Retrieval Augmented Generation for Medical Question Answering
by: Sohn, Jiwoong, et al.
Published: (2024)
Effects of Cross-lingual Evidence in Multilingual Medical Question Answering
by: Yeginbergen, Anar, et al.
Published: (2026)
Coal Mining Question Answering with LLMs
by: Rivera, Antonio Carlos, et al.
Published: (2024)
MKRAG: Medical Knowledge Retrieval Augmented Generation for Medical Question Answering
by: Shi, Yucheng, et al.
Published: (2023)
To Reason or Not to: Selective Chain-of-Thought in Medical Question Answering
by: Zhan, Zaifu, et al.
Published: (2026)
RJUA-MedDQA: A Multimodal Benchmark for Medical Document Question Answering and Clinical Reasoning
by: Jin, Congyun, et al.
Published: (2024)
PerMedCQA: Benchmarking Large Language Models on Medical Consumer Question Answering in Persian Language
by: Jamali, Naghmeh, et al.
Published: (2025)
PAT-Questions: A Self-Updating Benchmark for Present-Anchored Temporal Question-Answering
by: Meem, Jannat Ara, et al.
Published: (2024)
A Large-Scale Benchmark for Evaluating Large Language Models on Medical Question Answering in Romanian
by: Rogoz, Ana-Cristina, et al.
Published: (2025)
Bias Evaluation and Mitigation in Retrieval-Augmented Medical Question-Answering Systems
by: Ji, Yuelyu, et al.
Published: (2025)
Investigating LLM Capabilities on Long Context Comprehension for Medical Question Answering
by: AlMannaa, Feras, et al.
Published: (2025)
TOP-Training: Target-Oriented Pretraining for Medical Extractive Question Answering
by: Sengupta, Saptarshi, et al.
Published: (2023)
OWLViz: An Open-World Benchmark for Visual Question Answering
by: Nguyen, Thuy, et al.
Published: (2025)
KoBBQ: Korean Bias Benchmark for Question Answering
by: Jin, Jiho, et al.
Published: (2023)
Multilingual Medical Reasoning for Question Answering with Large Language Models
by: Ferrazzi, Pietro, et al.
Published: (2025)
Fine-Tuning LLMs for Reliable Medical Question-Answering Services
by: Anaissi, Ali, et al.
Published: (2024)
Uncertainty Estimation of Large Language Models in Medical Question Answering
by: Wu, Jiaxin, et al.
Published: (2024)