Bibliographic Details
Main Authors: Ling, Run, Wang, Wenji, Liu, Yuting, Guo, Guibing, Liu, Haowei, Lu, Jian, Zhang, Quanwei, Xu, Yexing, Lu, Shuo, Wang, Yun, Shao, Yihua, Zhang, Zhanjie, Ma, Ao, Jiang, Linying, Wang, Xingwei
Format: Preprint
Published: 2025
Subjects: Information Retrieval, Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2505.01657
author Ling, Run
Wang, Wenji
Liu, Yuting
Guo, Guibing
Liu, Haowei
Lu, Jian
Zhang, Quanwei
Xu, Yexing
Lu, Shuo
Wang, Yun
Shao, Yihua
Zhang, Zhanjie
Ma, Ao
Jiang, Linying
Wang, Xingwei
contents Personalized image generation is crucial for improving the user experience, as it renders reference images into preferred ones according to users' visual preferences. Although effective, existing methods face two main issues. First, they treat all items in the user's historical sequence equally when extracting preferences, overlooking the varying semantic similarity between each historical item and the reference item. Disproportionately high weights on low-similarity items distort the user's visual preferences for the reference item. Second, they rely heavily on consistency between generated and reference images to optimize generation, which leads to underfitting of user preferences and hinders personalization. To address these issues, we propose Retrieval Augmented Personalized Image GenerAtion guided by Recommendation (RAGAR). Our approach uses a retrieval mechanism to assign different weights to historical items according to their similarity to the reference item, thereby extracting more refined visual preferences for the reference item. We then introduce a novel ranking task, based on a multi-modal ranking model, to optimize the personalization of the generated images rather than depending heavily on consistency. Extensive experiments and human evaluations on three real-world datasets demonstrate that RAGAR achieves significant improvements in both personalization and semantic metrics compared to five baselines.
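The similarity-based weighting described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure or the weighting function, so cosine similarity, a softmax with a `temperature` parameter, and the function names `cosine`, `similarity_weights`, and `weighted_preference` are all assumptions made for illustration. Embeddings are plain lists of floats.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_weights(history, reference, temperature=0.1):
    """Assumed weighting: softmax over cosine similarities to the reference item.

    Items more similar to the reference get larger weights; the weights sum to 1.
    """
    if not history:
        raise ValueError("history must contain at least one item embedding")
    sims = [cosine(h, reference) for h in history]
    peak = max(sims)  # subtract the max for numerical stability
    exps = [math.exp((s - peak) / temperature) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_preference(history, reference, temperature=0.1):
    """Aggregate historical item embeddings into one preference vector."""
    weights = similarity_weights(history, reference, temperature)
    dim = len(reference)
    return [sum(w * h[i] for w, h in zip(weights, history)) for i in range(dim)]
```

With this choice, a historical item that points in the same direction as the reference embedding dominates the aggregated preference, while dissimilar items contribute little. A uniform average, by contrast, corresponds to the equal-weight treatment that the abstract criticizes.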
format Preprint
id arxiv_https___arxiv_org_abs_2505_01657
institution arXiv
publishDate 2025
record_format arxiv
title RAGAR: Retrieval Augmented Personalized Image Generation Guided by Recommendation
topic Information Retrieval
Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2505.01657