Bibliographic Details
Main Authors: Shen, Hanwen; Ying, Ting
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2505.12572
Abstract:
  • The two-stage novel generation framework (outline -> section outline -> manuscript) is widely used for long-novel generation (e.g., DOME, Plan&Write, Long Writer), but it has received little study in the reconstruction of ultra-long novels (>1M words). Building on recent text compression methods (LLMZip, LLM2Vec), we conduct an information-theoretic analysis to quantify semantic distortion under different compression-expansion ratios, and we examine how outline length affects information preservation. Experiments on ultra-long novels show that the optimal compression-expansion ratio significantly reduces semantic distortion compared with non-optimal ratios.