Salvato in:
Dettagli Bibliografici
Autori principali: Haque, Sabrina, Csallner, Christoph
Natura: Preprint
Pubblicazione: 2024
Soggetti:
Accesso online:https://arxiv.org/abs/2409.18060
Tags: Aggiungi Tag
Nessun Tag, puoi essere il primo ad aggiungerne!!
_version_ 1866916425792225280
author Haque, Sabrina
Csallner, Christoph
author_facet Haque, Sabrina
Csallner, Christoph
contents Ensuring accessibility in mobile applications remains a significant challenge, particularly for visually impaired users who rely on screen readers. User interface icons are essential for navigation and interaction and often lack meaningful alt-text, creating barriers to effective use. Traditional deep learning approaches for generating alt-text require extensive datasets and struggle with the diversity and imbalance of icon types. More recent Vision Language Models (VLMs) require complete UI screens, which can be impractical during the iterative phases of app development. To address these issues, we introduce a novel method using Large Language Models (LLMs) to autonomously generate informative alt-text for mobile UI icons with partial UI data. By incorporating icon context, that include class, resource ID, bounds, OCR-detected text, and contextual information from parent and sibling nodes, we fine-tune an off-the-shelf LLM on a small dataset of approximately 1.4k icons, yielding IconDesc. In an empirical evaluation and a user study IconDesc demonstrates significant improvements in generating relevant alt-text. This ability makes IconDesc an invaluable tool for developers, aiding in the rapid iteration and enhancement of UI accessibility.
format Preprint
id arxiv_https___arxiv_org_abs_2409_18060
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle Inferring Alt-text For UI Icons With Large Language Models During App Development
Haque, Sabrina
Csallner, Christoph
Human-Computer Interaction
Software Engineering
Ensuring accessibility in mobile applications remains a significant challenge, particularly for visually impaired users who rely on screen readers. User interface icons are essential for navigation and interaction and often lack meaningful alt-text, creating barriers to effective use. Traditional deep learning approaches for generating alt-text require extensive datasets and struggle with the diversity and imbalance of icon types. More recent Vision Language Models (VLMs) require complete UI screens, which can be impractical during the iterative phases of app development. To address these issues, we introduce a novel method using Large Language Models (LLMs) to autonomously generate informative alt-text for mobile UI icons with partial UI data. By incorporating icon context, that include class, resource ID, bounds, OCR-detected text, and contextual information from parent and sibling nodes, we fine-tune an off-the-shelf LLM on a small dataset of approximately 1.4k icons, yielding IconDesc. In an empirical evaluation and a user study IconDesc demonstrates significant improvements in generating relevant alt-text. This ability makes IconDesc an invaluable tool for developers, aiding in the rapid iteration and enhancement of UI accessibility.
title Inferring Alt-text For UI Icons With Large Language Models During App Development
topic Human-Computer Interaction
Software Engineering
url https://arxiv.org/abs/2409.18060