Bibliographic Details
Main Authors: Yang, Shu, Gamalo, Margaret, Fu, Haoda
Format: Preprint
Published: 2025
Subjects: Methodology, Machine Learning
Online Access: https://arxiv.org/abs/2511.19735
author Yang, Shu
Gamalo, Margaret
Fu, Haoda
author_facet Yang, Shu
Gamalo, Margaret
Fu, Haoda
contents Randomized controlled trials (RCTs) have been the cornerstone of clinical evidence; however, their cost, duration, and restrictive eligibility criteria limit power and external validity. Studies using real-world data (RWD), historically considered less reliable for establishing causality, are now recognized to be important for generating real-world evidence (RWE). In parallel, artificial intelligence and machine learning (AI/ML) are being increasingly used throughout the drug development process, providing scalability and flexibility but also presenting challenges in interpretability and rigor that traditional statistics do not face. This Perspective argues that the future of evidence generation will not depend on RCTs versus RWD, or statistics versus AI/ML, but on their principled integration. To this end, a causal roadmap is needed to clarify inferential goals, make assumptions explicit, and ensure transparency about tradeoffs. We highlight key objectives of integrative evidence synthesis, including transporting RCT results to broader populations, embedding AI-assisted analyses within RCTs, designing hybrid controlled trials, and extending short-term RCTs with long-term RWD. We also outline future directions in privacy-preserving analytics, uncertainty quantification, and small-sample methods. By uniting statistical rigor with AI/ML innovation, integrative approaches can produce robust, transparent, and policy-relevant evidence, making them a key component of modern regulatory science.
format Preprint
id arxiv_https___arxiv_org_abs_2511_19735
institution arXiv
publishDate 2025
record_format arxiv
title Integrating RCTs, RWD, AI/ML and Statistics: Next-Generation Evidence Synthesis
topic Methodology
Machine Learning
url https://arxiv.org/abs/2511.19735