| Main Authors: | Wang, Zhongsheng; Lin, Ming; Lin, Zhedong; Shakib, Yaser; Liu, Qian; Liu, Jiamou |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Artificial Intelligence |
| Online Access: | https://arxiv.org/abs/2510.10135 |
| _version_ | 1866914165972533248 |
|---|---|
| author | Wang, Zhongsheng; Lin, Ming; Lin, Zhedong; Shakib, Yaser; Liu, Qian; Liu, Jiamou |
| author_facet | Wang, Zhongsheng; Lin, Ming; Lin, Zhedong; Shakib, Yaser; Liu, Qian; Liu, Jiamou |
| contents | Ensuring character identity consistency across varying prompts remains a fundamental limitation in diffusion-based text-to-image generation. We propose CharCom, a modular and parameter-efficient framework that achieves character-consistent story illustration through composable LoRA adapters, enabling efficient per-character customization without retraining the base model. Built on a frozen diffusion backbone, CharCom dynamically composes adapters at inference using prompt-aware control. Experiments on multi-scene narratives demonstrate that CharCom significantly enhances character fidelity, semantic alignment, and temporal coherence. It remains robust in crowded scenes and enables scalable multi-character generation with minimal overhead, making it well-suited for real-world applications such as story illustration and animation. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2510_10135 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | CharCom: Composable Identity Control for Multi-Character Story Illustration |
| topic | Artificial Intelligence |
| url | https://arxiv.org/abs/2510.10135 |