Bibliographic Details
Main Authors: Lu, Shuo, Wang, Haohan, Feng, Wei, Wang, Weizhen, Zhang, Shen, Li, Yaoyu, Ma, Ao, Zhang, Zheng, Lv, Jingjing, Shen, Junjie, Law, Ching, Zhan, Bing, Xu, Yuan, Yao, Huizai, Yu, Yongcan, Si, Chenyang, Liang, Jian
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2602.02033
author Lu, Shuo
Wang, Haohan
Feng, Wei
Wang, Weizhen
Zhang, Shen
Li, Yaoyu
Ma, Ao
Zhang, Zheng
Lv, Jingjing
Shen, Junjie
Law, Ching
Zhan, Bing
Xu, Yuan
Yao, Huizai
Yu, Yongcan
Si, Chenyang
Liang, Jian
contents Advertising image generation has increasingly focused on online metrics like Click-Through Rate (CTR), yet existing approaches adopt a "one-size-fits-all" strategy that optimizes for overall CTR while neglecting preference diversity among user groups. This leads to suboptimal performance for specific groups, limiting targeted marketing effectiveness. To bridge this gap, we present One Size, Many Fits (OSMF), a unified framework that aligns diverse group-wise click preferences in large-scale advertising image generation. OSMF begins with product-aware adaptive grouping, which dynamically organizes users based on their attributes and product characteristics, representing each group with rich collective preference features. Building on these groups, preference-conditioned image generation employs a Group-aware Multimodal Large Language Model (G-MLLM) to generate tailored images for each group. The G-MLLM is pre-trained to simultaneously comprehend group features and generate advertising images. Subsequently, we fine-tune the G-MLLM using our proposed Group-DPO for group-wise preference alignment, which effectively enhances each group's CTR on the generated images. To further advance this field, we introduce the Grouped Advertising Image Preference Dataset (GAIP), the first large-scale public dataset of group-wise image preferences, including around 600K groups built from 40M users. Extensive experiments demonstrate that our framework achieves state-of-the-art performance in both offline and online settings. Our code and datasets will be released at https://github.com/JD-GenX/OSMF.
format Preprint
id arxiv_https___arxiv_org_abs_2602_02033
institution arXiv
publishDate 2026
record_format arxiv
title One Size, Many Fits: Aligning Diverse Group-Wise Click Preferences in Large-Scale Advertising Image Generation
topic Computer Vision and Pattern Recognition
Artificial Intelligence
Multimedia
url https://arxiv.org/abs/2602.02033