Bibliographic Details
Main Authors: Liu, Jiaming; Petersen, Felix; Gao, Yunhe; Zhang, Yabin; Kim, Hyojin; Chaudhari, Akshay S.; Sun, Yu; Ermon, Stefano; Gatidis, Sergios
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2602.16664
Table of Contents:
  • Adversarial diffusion and diffusion-inversion methods have advanced unpaired image-to-image translation, but each faces key limitations. Adversarial approaches require a target-domain adversarial loss during training, which can limit generalization to unseen data, while diffusion-inversion methods often produce low-fidelity translations due to imperfect inversion into noise-latent representations. In this work, we propose the Self-Supervised Semantic Bridge (SSB), a versatile framework that integrates external semantic priors into diffusion bridge models to enable spatially faithful translation without cross-domain supervision. Our key idea is to leverage self-supervised visual encoders to learn representations that are invariant to appearance changes but capture geometric structure, forming a shared latent space that conditions the diffusion bridges. Extensive experiments show that SSB outperforms strong prior methods for challenging medical image synthesis in both in-domain and out-of-domain settings, and extends easily to high-quality text-guided editing.