Bibliographic Details
Main Authors: Yang, Wanggong; Zhao, Yifei
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2407.17229
author Yang, Wanggong
Zhao, Yifei
author_facet Yang, Wanggong
Zhao, Yifei
contents Generating high-fidelity landscape paintings remains challenging because it requires precise control over both structure and style. In this paper, we present LPGen, a novel diffusion-based model designed specifically for landscape painting generation. LPGen introduces a decoupled cross-attention mechanism that processes structural and stylistic features independently, mimicking the layered approach of traditional painting techniques. LPGen also includes a structural controller, a multi-scale encoder that controls the layout of landscape paintings and balances aesthetics with composition. The model is pre-trained on a curated dataset of high-resolution landscape images categorized by artistic style, and then fine-tuned to produce detailed and consistent output. In extensive evaluations, LPGen produces paintings that are both structurally accurate and stylistically coherent, and it outperforms current state-of-the-art models. This work advances AI-generated art and offers new avenues for exploring the intersection of technology and traditional artistic practices. Our code, dataset, and model weights will be publicly available.
format Preprint
id arxiv_https___arxiv_org_abs_2407_17229
institution arXiv
publishDate 2024
record_format arxiv
title Artistic Intelligence: A Diffusion-Based Framework for High-Fidelity Landscape Painting Synthesis
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2407.17229