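Staff View: Architecture inside the mirage: evaluating generative image models on architectural style, elements, and typologies :: Library Catalog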

Bibliographic Details
Main Authors:	Magrill, Jamie; Gornstein, Leah; Seekins, Sandra; Magrill, Barry
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Computers and Society
Online Access:	https://arxiv.org/abs/2601.09169

_version_	1866912823067541504
author	Magrill, Jamie; Gornstein, Leah; Seekins, Sandra; Magrill, Barry
author_facet	Magrill, Jamie; Gornstein, Leah; Seekins, Sandra; Magrill, Barry
contents	Generative artificial intelligence (GenAI) text-to-image systems are increasingly used to generate architectural imagery, yet their capacity to reproduce accurate images in a historically rule-bound field remains poorly characterized. We evaluated five widely used GenAI image platforms (Adobe Firefly, DALL-E 3, Google Imagen 3, Microsoft Image Generator, and Midjourney) using 30 architectural prompts spanning styles, typologies, and codified elements. Each prompt-generator pair produced four images (n = 600 images total). Two architectural historians independently scored each image for accuracy against predefined criteria, resolving disagreements by consensus. Set-level performance was summarized as zero to four accurate images per four-image set. Image output from Common prompts was 2.7-fold more accurate than from Rare prompts (p < 0.05). Across platforms, overall accuracy was limited (highest accuracy score 52 percent; lowest 32 percent; mean 42 percent). All-correct (4 out of 4) outcomes were similar across platforms. By contrast, all-incorrect (0 out of 4) outcomes varied substantially, with Imagen 3 exhibiting the fewest failures and Microsoft Image Generator exhibiting the highest number of failures. Qualitative review of the image dataset identified recurring patterns including over-embellishment, confusion between medieval styles and their later revivals, and misrepresentation of descriptive prompts (for example, egg-and-dart, banded column, pendentive). These findings support the need for visible labeling of GenAI synthetic content, provenance standards for future training datasets, and cautious educational use of GenAI architectural imagery.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_09169
institution	arXiv
publishDate	2026
record_format	arxiv
title	Architecture inside the mirage: evaluating generative image models on architectural style, elements, and typologies
topic	Computer Vision and Pattern Recognition; Computers and Society
url	https://arxiv.org/abs/2601.09169
