Bibliographic Details
Main Authors: Basu, Sanjay; Patel, Sadiq Y.; Sheth, Parth; Muralidharan, Bhairavi; Elamaran, Namrata; Kinra, Aakriti; Batniji, Rajaie
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2509.16291
Table of Contents:
  • Care coordination and population health management programs serve large Medicaid and safety-net populations and must be auditable, efficient, and adaptable. Although the clinical risk of outreach modalities is typically low, time and opportunity costs differ substantially across text, phone, video, and in-person visits. We propose a lightweight offline reinforcement learning (RL) approach that augments trained policies with (i) test-time learning (TTL) via local neighborhood calibration, and (ii) inference-time deliberation (ITD) via a small Q-ensemble that incorporates predictive uncertainty and time/effort cost. The method exposes transparent dials for neighborhood size and for the uncertainty and cost penalties, and it preserves an auditable training pipeline. Evaluated on a de-identified operational dataset, TTL+ITD achieves stable value estimates with predictable efficiency trade-offs and supports subgroup auditing.
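The two inference-time components described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`deliberate`, `knn_calibrate`), the parameter names (`beta`, `lam`, `k`), the scoring rule (ensemble mean minus a standard-deviation penalty minus a cost penalty), and all numbers are assumptions chosen for illustration only.

```python
import numpy as np

def deliberate(q_ensemble, costs, beta=1.0, lam=0.1):
    """Pick an action from a small Q-ensemble (hypothetical scoring rule).

    q_ensemble: (K, A) Q-value estimates from K ensemble members for A actions.
    costs:      (A,) time/effort cost per action (units are an assumption).
    beta, lam:  the "dials" for the uncertainty and cost penalties.
    """
    q = np.asarray(q_ensemble, dtype=float)
    mean = q.mean(axis=0)   # ensemble mean value per action
    std = q.std(axis=0)     # ensemble disagreement as an uncertainty proxy
    scores = mean - beta * std - lam * np.asarray(costs, dtype=float)
    return int(np.argmax(scores)), scores

def knn_calibrate(state, ref_states, ref_residuals, k=3):
    """Local neighborhood calibration (assumed form): average the residuals
    of the k nearest reference states, by Euclidean distance."""
    dists = np.linalg.norm(np.asarray(ref_states, dtype=float) - state, axis=1)
    nearest = np.argsort(dists)[:k]
    return np.asarray(ref_residuals, dtype=float)[nearest].mean(axis=0)

# Illustrative (made-up) values; actions: text, phone, video, in-person.
q_ens = [[1.0, 1.2, 1.5, 1.6],
         [1.0, 1.1, 1.3, 2.0],
         [1.0, 1.3, 1.1, 1.2]]
costs = [1, 5, 10, 30]

greedy_action, _ = deliberate(q_ens, costs, beta=0.0, lam=0.0)
penalized_action, _ = deliberate(q_ens, costs, beta=1.0, lam=0.02)

ref_states = np.array([[0.0], [1.0], [10.0]])
ref_residuals = np.array([[0.1, 0.0], [0.3, 0.0], [5.0, 5.0]])
correction = knn_calibrate(np.array([0.5]), ref_states, ref_residuals, k=2)
```

With both penalties set to zero, the sketch selects the action with the highest mean estimate. Raising `beta` or `lam` shifts the choice toward actions on which the ensemble agrees and that cost less time, which is the kind of efficiency trade-off the abstract refers to. The neighborhood size `k` plays the role of the third dial.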