Staff View: DynASyn: Multi-Subject Personalization Enabling Dynamic Action Synthesis :: Library Catalog

Bibliographic Details
Main Authors:	Choi, Yongjin; Park, Chanhun; Baek, Seung Jun
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2503.17728

_version_	1866917966436630528
author	Choi, Yongjin; Park, Chanhun; Baek, Seung Jun
author_facet	Choi, Yongjin; Park, Chanhun; Baek, Seung Jun
contents	Recent advances in text-to-image diffusion models spurred research on personalization, i.e., a customized image synthesis, of subjects within reference images. Although existing personalization methods are able to alter the subjects' positions or to personalize multiple subjects simultaneously, they often struggle to modify the behaviors of subjects or their dynamic interactions. The difficulty is attributable to overfitting to reference images, which worsens if only a single reference image is available. We propose DynASyn, an effective multi-subject personalization from a single reference image addressing these challenges. DynASyn preserves the subject identity in the personalization process by aligning concept-based priors with subject appearances and actions. This is achieved by regularizing the attention maps between the subject token and images through concept-based priors. In addition, we propose concept-based prompt-and-image augmentation for an enhanced trade-off between identity preservation and action diversity. We adopt an SDE-based editing guided by augmented prompts to generate diverse appearances and actions while maintaining identity consistency in the augmented images. Experiments show that DynASyn is capable of synthesizing highly realistic images of subjects with novel contexts and dynamic interactions with the surroundings, and outperforms baseline methods in both quantitative and qualitative aspects.
format	Preprint
id	arxiv_https___arxiv_org_abs_2503_17728
institution	arXiv
publishDate	2025
record_format	arxiv
title	DynASyn: Multi-Subject Personalization Enabling Dynamic Action Synthesis
topic	Computer Vision and Pattern Recognition; Artificial Intelligence
url	https://arxiv.org/abs/2503.17728
