Bibliographic Details
Main Authors: Cook, John; Wyatt, Michael; Wei, Peng; Chin, Iris; Gupta, Santosh; Van Vuuren, Van Zyl; Siburian, Richie; Spicer, Amanda; Viviano, Kristen; Cami, Alda; Malhotra, Raunaq; Yao, Zhewei; Rasley, Jeff; Kaushik, Gaurav
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2603.23515
Abstract:
  • Improving the accuracy and reliability of medical coding reduces clinician burnout and supports revenue cycle processes, freeing providers to focus more on patient care. However, automating the assignment of ICD-10-CM and CPT codes from clinical documentation remains a challenge due to heterogeneous records, nuanced coding guidelines, and long-tail distributions. Large language models have been proposed to assist with or automate specific medical coding tasks, but foundation models are not explicitly trained for medical coding, and zero-shot coding has yielded poor results. We investigate whether a modern open-weight foundation model can be adapted for an expert-level medical coding task using privacy-preserving synthetic training data derived from electronic health records. We fine-tune Llama 3-70B on pairs of clinical notes and gold codes generated from EHR-grounded templates and coding policies, then evaluate exact-code prediction for ICD-10-CM and CPT. A zero-shot baseline with the unadapted model achieved an F1 score of 0.18 for exact code match. After fine-tuning on the synthetic corpus, exact-match F1 exceeded 0.70, a large absolute gain across both code systems. Notably, performance remained high on complex categories that often require multi-step clinical reasoning and code composition, including Advanced Illness and Frailty classes, and the model retained its performance on medical comprehension tasks. These results indicate that synthetic, policy-aware data can efficiently teach a general-purpose large language model to support precise medical coding without exposing protected health information. The approach offers a practical path for training coding agents safely and iteratively on specific tasks that represent real-world populations.