Staff View: Policy learning under constraint: Maximizing a primary outcome while controlling an adverse event

Bibliographic Details
Main Authors:	Fuentes-Vicente, Laura; Even, Mathieu; Dormion, Gaelle; Josse, Julie; Chambaz, Antoine
Format:	Preprint
Published:	2026
Subjects:	Methodology
Online Access:	https://arxiv.org/abs/2601.22717

_version_	1866915763465486336
author	Fuentes-Vicente, Laura; Even, Mathieu; Dormion, Gaelle; Josse, Julie; Chambaz, Antoine
author_facet	Fuentes-Vicente, Laura; Even, Mathieu; Dormion, Gaelle; Josse, Julie; Chambaz, Antoine
contents	A medical policy aims to support decision-making by mapping patient characteristics to individualized treatment recommendations. Standard approaches typically optimize a single outcome criterion. For example, recommending treatment according to the sign of the Conditional Average Treatment Effect (CATE) maximizes the policy "value" by exploiting treatment effect heterogeneity. This point of view shifts policy learning towards the challenge of learning a reliable CATE estimator. However, in multi-outcome settings, such strategies ignore the risk of adverse events, despite their relevance. PLUC (Policy Learning Under Constraint) addresses this challenge by learning an estimator of the CATE that yields smoothed policies controlling the probability of an adverse event in observational settings. Inspired by insights from EP-learning, PLUC involves the optimization of strongly convex Lagrangian criteria over a convex hull of functions. Its alternating procedure iteratively applies the Frank-Wolfe algorithm to minimize the current criterion, then performs a targeting step that updates the criterion so that its evaluations at previously visited landmarks become targeted estimators of the corresponding theoretical quantities. An R package PLUC-R provides a practical implementation. We illustrate PLUC's performance through a series of numerical experiments.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_22717
institution	arXiv
publishDate	2026
record_format	arxiv
title	Policy learning under constraint: Maximizing a primary outcome while controlling an adverse event
topic	Methodology
url	https://arxiv.org/abs/2601.22717