| Main Authors: | Yang, Yifan, Liu, Xiaoyu, Jin, Qiao, Huang, Furong, Lu, Zhiyong |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2401.13867 |
Similar Items
GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information
by: Jin, Qiao, et al.
Published: (2023)
Adversarial Attacks on Large Language Models in Medicine
by: Yang, Yifan, et al.
Published: (2024)
Large Language Models Lack Temporal Awareness of Medical Knowledge
by: Guan, Zihan, et al.
Published: (2026)
Ensuring Safety and Trust: Analyzing the Risks of Large Language Models in Medicine
by: Yang, Yifan, et al.
Published: (2024)
Accelerating Clinical Evidence Synthesis with Large Language Models
by: Wang, Zifeng, et al.
Published: (2024)
Matching Patients to Clinical Trials with Large Language Models
by: Jin, Qiao, et al.
Published: (2023)
Quantifying Risk Propensities of Large Language Models: Ethical Focus and Bias Detection through Role-Play
by: Zeng, Yifan, et al.
Published: (2024)
Rethinking Visual Attribution for Chest X-ray Reasoning in Large Vision Language Models
by: Xiong, Guangzhi, et al.
Published: (2026)
Anchoring Bias in Large Language Models: An Experimental Study
by: Lou, Jiaxu, et al.
Published: (2024)
Large Language Models and Causal Inference in Collaboration: A Survey
by: Liu, Xiaoyu, et al.
Published: (2024)
Beyond Bias Scores: Unmasking Vacuous Neutrality in Small Language Models
by: Manduru, Sumanth, et al.
Published: (2025)
Quantifying Generalization Complexity for Large Language Models
by: Qi, Zhenting, et al.
Published: (2024)
Explore Spurious Correlations at the Concept Level in Language Models for Text Classification
by: Zhou, Yuhang, et al.
Published: (2023)
Humans and Large Language Models in Clinical Decision Support: A Study with Medical Calculators
by: Wan, Nicholas, et al.
Published: (2024)
Evaluating the Impact of Lab Test Results on Large Language Models Generated Differential Diagnoses from Clinical Case Vignettes
by: Bhasuran, Balu, et al.
Published: (2024)
PRISON: Unmasking the Criminal Potential of Large Language Models
by: Wu, Xinyi, et al.
Published: (2025)
Measuring Spiritual Values and Bias of Large Language Models
by: Liu, Songyuan, et al.
Published: (2024)
Benchmarking Retrieval-Augmented Generation for Medicine
by: Xiong, Guangzhi, et al.
Published: (2024)
AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning
by: Jin, Qiao, et al.
Published: (2024)
Problematic Tokens: Tokenizer Bias in Large Language Models
by: Yang, Jin, et al.
Published: (2024)
Bias in, Bias out: Annotation Bias in Multilingual Large Language Models
by: Cui, Xia, et al.
Published: (2025)
LADER: Log-Augmented DEnse Retrieval for Biomedical Literature Search
by: Jin, Qiao, et al.
Published: (2023)
MedCite: Can Language Models Generate Verifiable Text for Medicine?
by: Wang, Xiao, et al.
Published: (2025)
Quantifying Label-Induced Bias in Large Language Model Self- and Cross-Evaluations
by: Saraf, Muskan, et al.
Published: (2025)
MedCalc-Bench: Evaluating Large Language Models for Medical Calculations
by: Khandekar, Nikhil, et al.
Published: (2024)
Mitigating the Bias of Large Language Model Evaluation
by: Zhou, Hongli, et al.
Published: (2024)
Examining Gender and Racial Bias in Large Vision-Language Models Using a Novel Dataset of Parallel Images
by: Fraser, Kathleen C., et al.
Published: (2024)
Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models
by: Agarwal, Aradhye, et al.
Published: (2024)
Quality of Answers of Generative Large Language Models vs Peer Patients for Interpreting Lab Test Results for Lay Patients: Evaluation Study
by: He, Zhe, et al.
Published: (2024)
Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences
by: Wang, Xiyao, et al.
Published: (2024)
HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models
by: Guan, Tianrui, et al.
Published: (2023)
Where-to-Unmask: Ground-Truth-Guided Unmasking Order Learning for Masked Diffusion Language Models
by: Asano, Hikaru, et al.
Published: (2026)
Opportunities and Challenges for ChatGPT and Large Language Models in Biomedicine and Health
by: Tian, Shubo, et al.
Published: (2023)
Unmasking the Shadows of AI: Investigating Deceptive Capabilities in Large Language Models
by: Guo, Linge
Published: (2024)
Unmasking Conversational Bias in AI Multiagent Systems
by: Coppolillo, Erica, et al.
Published: (2025)
Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models
by: An, Bang, et al.
Published: (2024)
Bias and Volatility: A Statistical Framework for Evaluating Large Language Model's Stereotypes and the Associated Generation Inconsistency
by: Liu, Yiran, et al.
Published: (2024)
Zero-Shot Chain-of-Thought Reasoning Guided by Evolutionary Algorithms in Large Language Models
by: Jin, Feihu, et al.
Published: (2024)
Reward Models Can Improve Themselves: Reward-Guided Adversarial Failure Mode Discovery for Robust Reward Modeling
by: Pathmanathan, Pankayaraj, et al.
Published: (2025)
Quantifying Hallucinations in Language Language Models on Medical Textbooks
by: Colelough, Brandon C., et al.
Published: (2026)