Bibliographic Details
Main Authors:	Lin, Sylvey; Menke, Joe; Ming, Shufan; Nam, Dongin; Smalheiser, Neil; Kilicoglu, Halil
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2605.20628

_version_	1866913147347009536
author	Lin, Sylvey; Menke, Joe; Ming, Shufan; Nam, Dongin; Smalheiser, Neil; Kilicoglu, Halil
author_facet	Lin, Sylvey; Menke, Joe; Ming, Shufan; Nam, Dongin; Smalheiser, Neil; Kilicoglu, Halil
contents	Biomedical abstracts play a critical role in downstream NLP applications, such as information retrieval, biocuration, and biomedical knowledge discovery. However, a non-trivial number of biomedical articles do not have abstracts, diminishing the utility of these articles for downstream tasks. We propose DPR-BAG (Divide, Prompt, and Refine for Biomedical Abstract Generation), a training-free, zero-shot framework that generates coherent and factually grounded abstracts for biomedical articles with full text but no abstract. DPR-BAG decomposes full-text documents into structured rhetorical facets following the Background-Objective-Methods-Results-Conclusions (BOMRC) schema, performs parallel LLM-based summarization for each facet, and applies a final refinement stage to restore global discourse coherence. On PMC-MAD, a distribution-aligned dataset of 46,309 biomedical articles, DPR-BAG improves abstractive novelty over strong extractive and fine-tuned baselines, while maintaining factual consistency. Our ablation study reveals a counterintuitive finding: increasing prompt complexity or explicitly injecting entity-level guidance can degrade factual alignment, highlighting the importance of controlled prompting strategies. These findings underscore the potential of training-free, structure-aware frameworks for scalable biomedical abstract generation in low-resource settings. Our data and code are available at https://huggingface.co/datasets/pmc-mad/PMC-MAD and https://github.com/ScienceNLP-Lab/MultiTagger-v2/tree/main/DPR-BAG.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_20628
institution	arXiv
publishDate	2026
record_format	arxiv
title	Divide-Prompt-Refine: a Training-Free, Structure-Aware Framework for Biomedical Abstract Generation
topic	Computation and Language
url	https://arxiv.org/abs/2605.20628
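
The contents field describes a three-stage pipeline: divide the full text into BOMRC facets, summarize each facet in parallel with an LLM, then refine the result into one coherent abstract. The following is a minimal Python sketch of that control flow only, not the authors' implementation. Everything named here is an assumption for illustration: the heading-keyword rule in `HEADING_HINTS`, the prompt wording, and the `llm` callable (any function that maps a prompt string to a completion string). The record does not say how facets are actually assigned or which prompts are used.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# BOMRC facet names, in the order given in the abstract.
FACETS = ["Background", "Objective", "Methods", "Results", "Conclusions"]

# Hypothetical heading keywords used to assign sections to facets.
HEADING_HINTS = {
    "Background": ("background", "introduction"),
    "Objective": ("objective", "aim", "purpose"),
    "Methods": ("method", "materials"),
    "Results": ("result", "finding"),
    "Conclusions": ("conclusion", "discussion"),
}


def divide(sections: List[Tuple[str, str]]) -> Dict[str, str]:
    """Group (heading, text) pairs into facets; unmatched sections are dropped."""
    facets: Dict[str, List[str]] = {name: [] for name in FACETS}
    for heading, text in sections:
        lowered = heading.lower()
        for name in FACETS:
            if any(hint in lowered for hint in HEADING_HINTS[name]):
                facets[name].append(text)
                break
    return {name: "\n".join(parts) for name, parts in facets.items() if parts}


def generate_abstract(
    sections: List[Tuple[str, str]], llm: Callable[[str], str]
) -> str:
    """Divide, prompt each facet in parallel, then refine the joined draft."""
    facets = divide(sections)
    names = [name for name in FACETS if name in facets]
    prompts = [
        f"Summarize the {name} of this article:\n{facets[name]}" for name in names
    ]
    # pool.map preserves input order, so summaries line up with names.
    with ThreadPoolExecutor() as pool:
        summaries = list(pool.map(llm, prompts))
    draft = "\n".join(f"{name}: {summary}" for name, summary in zip(names, summaries))
    return llm(f"Refine into one coherent abstract:\n{draft}")
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client; a stub function is enough to exercise the flow.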