Bibliographic Details
Main Authors: Chen, Chacha; Liu, Han; Yang, Jiamin; Mervak, Benjamin M.; Kalaycioglu, Bora; Lee, Grace; Cakmakli, Emre; Bonatti, Matteo; Pudu, Sridhar; Kahraman, Osman; Pamuk, Gul Gizem; Oto, Aytekin; Chatterjee, Aritrick; Tan, Chenhao
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2502.03482
_version_ 1866910816486293504
author Chen, Chacha
Liu, Han
Yang, Jiamin
Mervak, Benjamin M.
Kalaycioglu, Bora
Lee, Grace
Cakmakli, Emre
Bonatti, Matteo
Pudu, Sridhar
Kahraman, Osman
Pamuk, Gul Gizem
Oto, Aytekin
Chatterjee, Aritrick
Tan, Chenhao
contents Despite the growing interest in human-AI decision making, experimental studies with domain experts remain rare, largely due to the complexity of working with domain experts and the challenges in setting up realistic experiments. In this work, we conduct an in-depth collaboration with radiologists in prostate cancer diagnosis based on MRI images. Building on existing tools for teaching prostate cancer diagnosis, we develop an interface and conduct two experiments to study how AI assistance and performance feedback shape the decision making of domain experts. In Study 1, clinicians were asked to provide an initial diagnosis (human), then view the AI's prediction, and subsequently finalize their decision (human-AI team). In Study 2 (after a memory wash-out period), the same participants first received aggregated performance statistics from Study 1, specifically their own performance, the AI's performance, and their human-AI team performance, and then directly viewed the AI's prediction before making their diagnosis (i.e., no independent initial diagnosis). These two workflows represent realistic ways that clinical AI tools might be used in practice, where the second study simulates a scenario where doctors can adjust their reliance and trust on AI based on prior performance feedback. Our findings show that, while human-AI teams consistently outperform humans alone, they still underperform the AI due to under-reliance, similar to prior studies with crowdworkers. Providing clinicians with performance feedback did not significantly improve the performance of human-AI teams, although showing AI decisions in advance nudges people to follow AI more. Meanwhile, we observe that the ensemble of human-AI teams can outperform AI alone, suggesting promising directions for human-AI collaboration.
format Preprint
id arxiv_https___arxiv_org_abs_2502_03482
institution arXiv
publishDate 2025
record_format arxiv
title Can Domain Experts Rely on AI Appropriately? A Case Study on AI-Assisted Prostate Cancer MRI Diagnosis
topic Image and Video Processing
Artificial Intelligence
Computer Vision and Pattern Recognition
Computers and Society
Human-Computer Interaction
Machine Learning
url https://arxiv.org/abs/2502.03482