Staff View: S2ED: From Story to Executable Descriptions for Consistency-Aware Story Illustration :: Library Catalog

Bibliographic Details
Main Authors:	Yin, Sijing; Liu, Jiamou; Tang, Xiao; Shakib, Yaser; Liu, Qian
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.22448

_version_	1866916036405624832
author	Yin, Sijing; Liu, Jiamou; Tang, Xiao; Shakib, Yaser; Liu, Qian
author_facet	Yin, Sijing; Liu, Jiamou; Tang, Xiao; Shakib, Yaser; Liu, Qian
contents	Multi-frame story illustration requires long-horizon coherence beyond single-image text-to-image generation, including narrative decomposition and persistent character identity, layout, and affect across frames. We propose Story-to-Executable Descriptions (S2ED), a training-free, model-agnostic, prompt-layer framework that converts a full story into a sequence of explicit, editable executable descriptions for more consistent rendering. S2ED coordinates three agents to segment the narrative, ground canonical character attributes, and enrich spatial and affective cues, enabling interpretable prompt-carried state propagation and local edits to repair drift without retraining the generator. Experiments on Flintstones and Shakoo Maku show that S2ED improves sequence-level consistency and character fidelity over strong prompting, large-model planning, and a reference training-based method, under both automatic metrics and human judgments. We also deploy S2ED in an end-to-end story-to-storybook system for children's illustrated stories, with a supplementary video.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_22448
institution	arXiv
publishDate	2026
record_format	arxiv
spellingShingle	S2ED: From Story to Executable Descriptions for Consistency-Aware Story Illustration Yin, Sijing Liu, Jiamou Tang, Xiao Shakib, Yaser Liu, Qian Artificial Intelligence Multi-frame story illustration requires long-horizon coherence beyond single-image text-to-image generation, including narrative decomposition and persistent character identity, layout, and affect across frames. We propose Story-to-Executable Descriptions (S2ED), a training-free, model-agnostic, prompt-layer framework that converts a full story into a sequence of explicit, editable executable descriptions for more consistent rendering. S2ED coordinates three agents to segment the narrative, ground canonical character attributes, and enrich spatial and affective cues, enabling interpretable prompt-carried state propagation and local edits to repair drift without retraining the generator. Experiments on Flintstones and Shakoo Maku show that S2ED improves sequence-level consistency and character fidelity over strong prompting, large-model planning, and a reference training-based method, under both automatic metrics and human judgments. We also deploy S2ED in an end-to-end story-to-storybook system for children's illustrated stories, with a supplementary video.
title	S2ED: From Story to Executable Descriptions for Consistency-Aware Story Illustration
topic	Artificial Intelligence
url	https://arxiv.org/abs/2605.22448

Similar Items