| Main Authors: | Lesner, Jasmine, Beyeler, Michael |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2601.17326 |
Similar Items
Evaluating Deep Human-in-the-Loop Optimization for Retinal Implants Using Sighted Participants
by: Schoinas, Eirini, et al.
Published: (2025)
A Powered Prosthetic Hand with Vision System for Enhancing the Anthropopathic Grasp
by: Xu, Yansong, et al.
Published: (2024)
The Relative Importance of Depth Cues and Semantic Edges for Indoor Mobility Using Simulated Prosthetic Vision in Immersive Virtual Reality
by: Rasla, Alex, et al.
Published: (2022)
SpiritSight Agent: Advanced GUI Agent with One Look
by: Huang, Zhiyuan, et al.
Published: (2025)
A Convolution-Based Gait Asymmetry Metric for Inter-Limb Synergistic Coordination
by: Fukino, Go, et al.
Published: (2025)
InterAnimate: Taming Region-aware Diffusion Model for Realistic Human Interaction Animation
by: Lin, Yukang, et al.
Published: (2025)
Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset
by: Laurençon, Hugo, et al.
Published: (2024)
Weak-Annotation of HAR Datasets using Vision Foundation Models
by: Bock, Marius, et al.
Published: (2024)
Steering Generative Models for Accessibility: EasyRead Image Generation
by: Dickenmann, Nicolas, et al.
Published: (2026)
InterVLS: Interactive Model Understanding and Improvement with Vision-Language Surrogates
by: Huang, Jinbin, et al.
Published: (2023)
Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions
by: Kang, Wan Ju, et al.
Published: (2025)
Real-Time Cellist Postural Evaluation With On-Device Computer Vision
by: Wang, Paolo, et al.
Published: (2026)
Vision Language Models as Values Detectors
by: Abbo, Giulio Antonio, et al.
Published: (2025)
ViT-Explainer: An Interactive Walkthrough of the Vision Transformer Pipeline
by: Hernandez, Juan Manuel, et al.
Published: (2026)
VisionCAD: An Integration-Free Radiology Copilot Framework
by: Li, Jiaming, et al.
Published: (2025)
VFA: Vision Frequency Analysis of Foundation Models and Human
by: Darvishi-Bayazi, Mohammad-Javad, et al.
Published: (2024)
Computer Vision for Objects used in Group Work: Challenges and Opportunities
by: Jung, Changsoo, et al.
Published: (2025)
Machine Vision-Based Surgical Lighting System: Design and Implementation
by: Gharghabi, Amir, et al.
Published: (2025)
iTrace: Click-Based Gaze Visualization on the Apple Vision Pro
by: Mehmedova, Esra, et al.
Published: (2025)
Visual Affect Analysis: Predicting Emotions of Image Viewers with Vision-Language Models
by: Nowicki, Filip, et al.
Published: (2026)
Do MLLMs Understand Pointing? Benchmarking and Enhancing Referential Reasoning in Egocentric Vision
by: Li, Chentao, et al.
Published: (2026)
A Dataset for Crucial Object Recognition in Blind and Low-Vision Individuals' Navigation
by: Islam, Md Touhidul, et al.
Published: (2024)
An Egocentric Vision-Language Model based Portable Real-time Smart Assistant
by: Huang, Yifei, et al.
Published: (2025)
Detecting Clues for Skill Levels and Machine Operation Difficulty from Egocentric Vision
by: Long-fei, Chen, et al.
Published: (2019)
EgoPressure: A Dataset for Hand Pressure and Pose Estimation in Egocentric Vision
by: Zhao, Yiming, et al.
Published: (2024)
HarassGuard: Detecting Harassment Behaviors in Social Virtual Reality with Vision-Language Models
by: Lee, Junhee, et al.
Published: (2026)
Toward Scalable Co-located Practical Learning: Assisting with Computer Vision and Multimodal Analytics
by: Li, Xinyu, et al.
Published: (2026)
Towards Context-aware Support for Color Vision Deficiency: An Approach Integrating LLM and AR
by: Morita, Shogo, et al.
Published: (2024)
Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
by: Fei, Hao, et al.
Published: (2024)
VisionGPT: LLM-Assisted Real-Time Anomaly Detection for Safe Visual Navigation
by: Wang, Hao, et al.
Published: (2024)
egoEMOTION: Egocentric Vision and Physiological Signals for Emotion and Personality Recognition in Real-World Tasks
by: Jammot, Matthias, et al.
Published: (2025)
Computational Trichromacy Reconstruction: Empowering the Color-Vision Deficient to Recognize Colors Using Augmented Reality
by: Zhu, Yuhao, et al.
Published: (2024)
Interactivity x Explainability: Toward Understanding How Interactivity Can Improve Computer Vision Explanations
by: Panigrahi, Indu, et al.
Published: (2025)
EmoCLIP: A Vision-Language Method for Zero-Shot Video Facial Expression Recognition
by: Foteinopoulou, Niki Maria, et al.
Published: (2023)
MedFoundationHub: A Lightweight and Secure Toolkit for Deploying Medical Vision Language Foundation Models
by: Li, Xiao, et al.
Published: (2025)
Towards Consumer-Grade Cybersickness Prediction: Multi-Model Alignment for Real-Time Vision-Only Inference
by: Zhu, Yitong, et al.
Published: (2025)
"It's trained by non-disabled people": Evaluating How Image Quality Affects Product Captioning with Vision-Language Models
by: Garg, Kapil, et al.
Published: (2025)
GPT-5 Model Corrected GPT-4V's Chart Reading Errors, Not Prompting
by: Yang, Kaichun, et al.
Published: (2025)
3DArticCyclists: Generating Synthetic Articulated 8D Pose-Controllable Cyclist Data for Computer Vision Applications
by: Corral-Soto, Eduardo R., et al.
Published: (2024)
Scene-Aware Urban Design: A Human-AI Recommendation Framework Using Co-Occurrence Embeddings and Vision-Language Models
by: Gallardo, Rodrigo, et al.
Published: (2025)