Staff View: InformGen: An AI Copilot for Accurate and Compliant Clinical Research Consent Document Generation :: Library Catalog


Bibliographic Details
Main Authors:	Wang, Zifeng; Gao, Junyi; Danek, Benjamin; Theodorou, Brandon; Shaik, Ruba; Thati, Shivashankar; Won, Seunghyun; Sun, Jimeng
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2504.00934

_version_	1866910900978450432
author	Wang, Zifeng; Gao, Junyi; Danek, Benjamin; Theodorou, Brandon; Shaik, Ruba; Thati, Shivashankar; Won, Seunghyun; Sun, Jimeng
author_facet	Wang, Zifeng; Gao, Junyi; Danek, Benjamin; Theodorou, Brandon; Shaik, Ruba; Thati, Shivashankar; Won, Seunghyun; Sun, Jimeng
contents	Leveraging large language models (LLMs) to generate high-stakes documents, such as informed consent forms (ICFs), remains a significant challenge due to the extreme need for regulatory compliance and factual accuracy. Here, we present InformGen, an LLM-driven copilot for accurate and compliant ICF drafting through optimized knowledge document parsing and content generation, with humans in the loop. We further construct a benchmark dataset comprising protocols and ICFs from 900 clinical trials. Experimental results demonstrate that InformGen achieves nearly 100% compliance with 18 core regulatory rules derived from FDA guidelines, outperforming a vanilla GPT-4o model by up to 30%. Additionally, a user study with five annotators shows that InformGen, when integrated with manual intervention, attains over 90% factual accuracy, significantly surpassing the vanilla GPT-4o model's 57%-82%. Crucially, InformGen ensures traceability by providing inline citations to source protocols, enabling easy verification and maintaining the highest standards of factual integrity.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_00934
institution	arXiv
publishDate	2025
record_format	arxiv
title	InformGen: An AI Copilot for Accurate and Compliant Clinical Research Consent Document Generation
topic	Computation and Language
url	https://arxiv.org/abs/2504.00934
