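Staff View: Lost in Translation? Translation Errors and Challenges for Fair Assessment of Text-to-Image Models on Multilingual Concepts :: Library Catalog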

Bibliographic Details
Main Authors:	Saxon, Michael; Luo, Yiran; Levy, Sharon; Baral, Chitta; Yang, Yezhou; Wang, William Yang
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence; Computer Vision and Pattern Recognition; Computers and Society; Image and Video Processing
Online Access:	https://arxiv.org/abs/2403.11092

_version_	1866917616474390528
author	Saxon, Michael; Luo, Yiran; Levy, Sharon; Baral, Chitta; Yang, Yezhou; Wang, William Yang
author_facet	Saxon, Michael; Luo, Yiran; Levy, Sharon; Baral, Chitta; Yang, Yezhou; Wang, William Yang
contents	Benchmarks of the multilingual capabilities of text-to-image (T2I) models compare generated images prompted in a test language to an expected image distribution over a concept set. One such benchmark, "Conceptual Coverage Across Languages" (CoCo-CroLa), assesses the tangible noun inventory of T2I models by prompting them to generate pictures from a concept list translated to seven languages and comparing the output image populations. Unfortunately, we find that this benchmark contains translation errors of varying severity in Spanish, Japanese, and Chinese. We provide corrections for these errors and analyze how impactful they are on the utility and validity of CoCo-CroLa as a benchmark. We reassess multiple baseline T2I models with the revisions, compare the outputs elicited under the new translations to those conditioned on the old, and show that a correction's impactfulness on the image-domain benchmark results can be predicted in the text domain with similarity scores. Our findings will guide the future development of T2I multilinguality metrics by providing analytical tools for practical translation decisions.
format	Preprint
id	arxiv_https___arxiv_org_abs_2403_11092
institution	arXiv
publishDate	2024
record_format	arxiv
title	Lost in Translation? Translation Errors and Challenges for Fair Assessment of Text-to-Image Models on Multilingual Concepts
topic	Computation and Language; Artificial Intelligence; Computer Vision and Pattern Recognition; Computers and Society; Image and Video Processing
url	https://arxiv.org/abs/2403.11092

Similar Items