Staff View: PERCS: Persona-Guided Controllable Biomedical Summarization Dataset :: Library Catalog

Bibliographic Details
Main Authors:	Salvi, Rohan Charudatt; Chawla, Chirag; Jain, Dhruv; Panigrahi, Swapnil; Akhtar, Md Shad; Yadav, Shweta
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2512.03340

_version_	1866908690446024704
author	Salvi, Rohan Charudatt; Chawla, Chirag; Jain, Dhruv; Panigrahi, Swapnil; Akhtar, Md Shad; Yadav, Shweta
author_facet	Salvi, Rohan Charudatt; Chawla, Chirag; Jain, Dhruv; Panigrahi, Swapnil; Akhtar, Md Shad; Yadav, Shweta
contents	Automatic medical text simplification plays a key role in improving health literacy by making complex biomedical research accessible to diverse readers. However, most existing resources assume a single generic audience, overlooking the wide variation in medical literacy and information needs across user groups. To address this limitation, we introduce PERCS (Persona-guided Controllable Summarization), a dataset of biomedical abstracts paired with summaries tailored to four personas: Laypersons, Premedical Students, Non-medical Researchers, and Medical Experts. These personas represent different levels of medical literacy and information needs, emphasizing the need for targeted, audience-specific summarization. Each summary in PERCS was reviewed by physicians for factual accuracy and persona alignment using a detailed error taxonomy. Technical validation shows clear differences in readability, vocabulary, and content depth across personas. Along with describing the dataset, we benchmark four large language models on PERCS using automatic evaluation metrics that assess comprehensiveness, readability, and faithfulness, establishing baseline results for future research. The dataset, annotation guidelines, and evaluation materials are publicly available to support research on persona-specific communication and controllable biomedical summarization.
format	Preprint
id	arxiv_https___arxiv_org_abs_2512_03340
institution	arXiv
publishDate	2025
record_format	arxiv
title	PERCS: Persona-Guided Controllable Biomedical Summarization Dataset
topic	Computation and Language
url	https://arxiv.org/abs/2512.03340
