Bibliographic Details
Main Authors: Mohanbabu, Ananya Gubbi, Natalie, Rosiana, Kim, Brandon, Guo, Anhong, Pavel, Amy
Format: Preprint
Published: 2026
Subjects: Human-Computer Interaction
Online Access: https://arxiv.org/abs/2602.09310
author Mohanbabu, Ananya Gubbi
Natalie, Rosiana
Kim, Brandon
Guo, Anhong
Pavel, Amy
contents Computer Use Agents (CUAs) operate interfaces by pointing, clicking, and typing, mirroring the interactions of sighted users (SUs), who can therefore monitor CUAs and share control with them. CUAs do not reflect the interactions of blind and low-vision users (BLVUs), who rely on assistive technology (AT), so BLVUs cannot easily collaborate with CUAs. To characterize this accessibility gap, we present A11y-CUA, a dataset of BLVUs and SUs performing 60 everyday tasks, comprising 40.4 hours and 158,325 events. Our analysis of the collected interaction traces quantitatively confirms distinct interaction styles between the SU and BLVU groups (mouse- vs. keyboard-dominant) and demonstrates interaction diversity within each group (sequential vs. shortcut navigation for BLVUs). We then compare the collected traces to state-of-the-art CUAs under default and AT conditions (keyboard-only, magnifier). The default CUA completed 78.3% of tasks successfully. Under the AT conditions, its success rate dropped to 41.67% with keyboard-only and 28.3% with the magnifier, and its behavior did not reflect the nuances of real AT use. With our open A11y-CUA dataset, we aim to promote collaborative and accessible CUAs for everyone.
format Preprint
id arxiv_https___arxiv_org_abs_2602_09310
institution arXiv
publishDate 2026
record_format arxiv
title A11y-CUA Dataset: Characterizing the Accessibility Gap in Computer Use Agents
topic Human-Computer Interaction
url https://arxiv.org/abs/2602.09310