Staff View: Training-Free Adaptation of Diffusion Models via Doob's $h$-Transform :: Library Catalog

Bibliographic Details
Main Authors:	Zhu, Qijie; Ye, Zeqi; Liu, Han; Wang, Zhaoran; Chen, Minshuo
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2602.16198

_version_	1866914336393396224
author	Zhu, Qijie; Ye, Zeqi; Liu, Han; Wang, Zhaoran; Chen, Minshuo
author_facet	Zhu, Qijie; Ye, Zeqi; Liu, Han; Wang, Zhaoran; Chen, Minshuo
contents	Adaptation methods have been a workhorse for unlocking the transformative power of pre-trained diffusion models in diverse applications. Existing approaches often abstract adaptation objectives as a reward function and steer diffusion models to generate high-reward samples. However, these approaches can incur high computational overhead due to additional training, or rely on stringent assumptions on the reward such as differentiability. Moreover, despite their empirical success, theoretical justification and guarantees are seldom established. In this paper, we propose DOIT (Doob-Oriented Inference-time Transformation), a training-free and computationally efficient adaptation method that applies to generic, non-differentiable rewards. The key framework underlying our method is a measure transport formulation that seeks to transport the pre-trained generative distribution to a high-reward target distribution. We leverage Doob's $h$-transform to realize this transport, which induces a dynamic correction to the diffusion sampling process and enables efficient simulation-based computation without modifying the pre-trained model. Theoretically, we establish a high probability convergence guarantee to the target high-reward distribution via characterizing the approximation error in the dynamic Doob's correction. Empirically, on D4RL offline RL benchmarks, our method consistently outperforms state-of-the-art baselines while preserving sampling efficiency.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_16198
institution	arXiv
publishDate	2026
record_format	arxiv
title	Training-Free Adaptation of Diffusion Models via Doob's $h$-Transform
topic	Machine Learning
url	https://arxiv.org/abs/2602.16198

Similar Items