Saved in:
| Main Authors: | , |
|---|---|
| Format: | Preprint |
| Published: |
2024
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2407.17229 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866914969858080768 |
|---|---|
| author | Yang, Wanggong Zhao, Yifei |
| author_facet | Yang, Wanggong Zhao, Yifei |
| contents | Generating high-fidelity landscape paintings remains a challenging task that requires precise control over both structure and style. In this paper, we present LPGen, a novel diffusion-based model specifically designed for landscape painting generation. LPGen introduces a decoupled cross-attention mechanism that independently processes structural and stylistic features, effectively mimicking the layered approach of traditional painting techniques. Additionally, LPGen proposes a structural controller, a multi-scale encoder designed to control the layout of landscape paintings, striking a balance between aesthetics and composition. Besides, the model is pre-trained on a curated dataset of high-resolution landscape images, categorized by distinct artistic styles, and then fine-tuned to ensure detailed and consistent output. Through extensive evaluations, LPGen demonstrates superior performance in producing paintings that are not only structurally accurate but also stylistically coherent, surpassing current state-of-the-art models. This work advances AI-generated art and offers new avenues for exploring the intersection of technology and traditional artistic practices. Our code, dataset, and model weights will be publicly available. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2407_17229 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| spellingShingle | Artistic Intelligence: A Diffusion-Based Framework for High-Fidelity Landscape Painting Synthesis Yang, Wanggong Zhao, Yifei Computer Vision and Pattern Recognition Generating high-fidelity landscape paintings remains a challenging task that requires precise control over both structure and style. In this paper, we present LPGen, a novel diffusion-based model specifically designed for landscape painting generation. LPGen introduces a decoupled cross-attention mechanism that independently processes structural and stylistic features, effectively mimicking the layered approach of traditional painting techniques. Additionally, LPGen proposes a structural controller, a multi-scale encoder designed to control the layout of landscape paintings, striking a balance between aesthetics and composition. Besides, the model is pre-trained on a curated dataset of high-resolution landscape images, categorized by distinct artistic styles, and then fine-tuned to ensure detailed and consistent output. Through extensive evaluations, LPGen demonstrates superior performance in producing paintings that are not only structurally accurate but also stylistically coherent, surpassing current state-of-the-art models. This work advances AI-generated art and offers new avenues for exploring the intersection of technology and traditional artistic practices. Our code, dataset, and model weights will be publicly available. |
| title | Artistic Intelligence: A Diffusion-Based Framework for High-Fidelity Landscape Painting Synthesis |
| topic | Computer Vision and Pattern Recognition |
| url | https://arxiv.org/abs/2407.17229 |