Bibliographic Details
Main Authors: Wang, Zhongsheng; Lin, Ming; Lin, Zhedong; Shakib, Yaser; Liu, Qian; Liu, Jiamou
Format: Preprint
Published: 2025
Subjects: Artificial Intelligence
Online Access: https://arxiv.org/abs/2510.10135
author Wang, Zhongsheng
Lin, Ming
Lin, Zhedong
Shakib, Yaser
Liu, Qian
Liu, Jiamou
contents Ensuring character identity consistency across varying prompts remains a fundamental limitation in diffusion-based text-to-image generation. We propose CharCom, a modular and parameter-efficient framework that achieves character-consistent story illustration through composable LoRA adapters, enabling efficient per-character customization without retraining the base model. Built on a frozen diffusion backbone, CharCom dynamically composes adapters at inference using prompt-aware control. Experiments on multi-scene narratives demonstrate that CharCom significantly enhances character fidelity, semantic alignment, and temporal coherence. It remains robust in crowded scenes and enables scalable multi-character generation with minimal overhead, making it well-suited for real-world applications such as story illustration and animation.
format Preprint
id arxiv_https___arxiv_org_abs_2510_10135
institution arXiv
publishDate 2025
record_format arxiv
title CharCom: Composable Identity Control for Multi-Character Story Illustration
topic Artificial Intelligence
url https://arxiv.org/abs/2510.10135