:: Library Catalog

Bibliographic Details
Main Authors:	Driessen, Meike; Khan, Selina; Marcelino, Gonçalo
Format:	Preprint
Published:	2025
Subjects:	Human-Computer Interaction; Artificial Intelligence; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2601.11532

Similar Items

Mapping the Mind of an Instruction-based Image Editing using SMILE
by: Dehghani, Zeinab, et al.
Published: (2024)

Adaptive 3D UI Placement in Mixed Reality Using Deep Reinforcement Learning
by: Lu, Feiyu, et al.
Published: (2025)

ViEEG: Hierarchical Visual Neural Representation for EEG Brain Decoding
by: Liu, Minxu, et al.
Published: (2025)

Modelling the Interplay of Eye-Tracking Temporal Dynamics and Personality for Emotion Detection in Face-to-Face Settings
by: Seikavandi, Meisam J., et al.
Published: (2025)

Few-Shot VLM-Based G-Code and HMI Verification in CNC Machining
by: Pour, Yasaman Hashem, et al.
Published: (2025)

Real-Time Feedback and Benchmark Dataset for Isometric Pose Evaluation
by: Jaiswal, Abhishek, et al.
Published: (2025)

Regressor-Guided Generative Image Editing Balances User Emotions to Reduce Time Spent Online
by: Gebhardt, Christoph, et al.
Published: (2025)

Achieving Effective Virtual Reality Interactions via Acoustic Gesture Recognition based on Large Language Models
by: Zhang, Xijie, et al.
Published: (2025)

SasMamba: A Lightweight Structure-Aware Stride State Space Model for 3D Human Pose Estimation
by: Cui, Hu, et al.
Published: (2025)

Pencils to Pixels: A Systematic Study of Creative Drawings across Children, Adults and AI
by: Nath, Surabhi S, et al.
Published: (2025)

Augmenting Image Annotation: A Human-LMM Collaborative Framework for Efficient Object Selection and Label Generation
by: Zhang, He, et al.
Published: (2025)

Milmer: a Framework for Multiple Instance Learning based Multimodal Emotion Recognition
by: Wang, Zaitian, et al.
Published: (2025)

Real-Time Intuitive AI Drawing System for Collaboration: Enhancing Human Creativity through Formal and Contextual Intent Integration
by: Song, Jookyung, et al.
Published: (2025)

Towards a Multimodal Document-grounded Conversational AI System for Education
by: Taneja, Karan, et al.
Published: (2025)

OLMD: Orientation-aware Long-term Motion Decoupling for Continuous Sign Language Recognition
by: Yu, Yiheng, et al.
Published: (2025)

OmniResponse: Online Multimodal Conversational Response Generation in Dyadic Interactions
by: Luo, Cheng, et al.
Published: (2025)

Towards Safer and Understandable Driver Intention Prediction
by: Karuppasamy, Mukilan, et al.
Published: (2025)

Reading Smiles: Proxy Bias in Foundation Models for Facial Emotion Recognition
by: Tsangko, Iosif, et al.
Published: (2025)

GazeLLM: Multimodal LLMs incorporating Human Visual Attention
by: Rekimoto, Jun
Published: (2025)

Automated Visual Attention Detection using Mobile Eye Tracking in Behavioral Classroom Studies
by: Bozkir, Efe, et al.
Published: (2025)

UI-UG: A Unified MLLM for UI Understanding and Generation
by: Yang, Hao, et al.
Published: (2025)

Pose-Robust Calibration Strategy for Point-of-Gaze Estimation on Mobile Phones
by: Zhao, Yujie, et al.
Published: (2025)

Advancing the Understanding and Evaluation of AR-Generated Scenes: When Vision-Language Models Shine and Stumble
by: Duan, Lin, et al.
Published: (2025)

Words into World: A Task-Adaptive Agent for Language-Guided Spatial Retrieval in AR
by: Guo, Lixing, et al.
Published: (2025)

Vision-Integrated LLMs for Autonomous Driving Assistance : Human Performance Comparison and Trust Evaluation
by: Kim, Namhee, et al.
Published: (2025)

Not There Yet: Evaluating Vision Language Models in Simulating the Visual Perception of People with Low Vision
by: Natalie, Rosiana, et al.
Published: (2025)

Learning To Defer To A Population With Limited Demonstrations
by: Ramgolam, Nilesh, et al.
Published: (2025)

SigmaCollab: An Application-Driven Dataset for Physically Situated Collaboration
by: Bohus, Dan, et al.
Published: (2025)

GAITEX: Human motion dataset of impaired gait and rehabilitation exercises using inertial and optical sensors
by: Spilz, Andreas, et al.
Published: (2025)

LIVS: A Pluralistic Alignment Dataset for Inclusive Public Spaces
by: Mushkani, Rashid, et al.
Published: (2025)

SASG-DA: Sparse-Aware Semantic-Guided Diffusion Augmentation For Myoelectric Gesture Recognition
by: Liu, Chen, et al.
Published: (2025)

Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions
by: Kang, Wan Ju, et al.
Published: (2025)

PixelWeb: The First Web GUI Dataset with Pixel-Wise Labels
by: Yang, Qi, et al.
Published: (2025)

Yume: An Interactive World Generation Model
by: Mao, Xiaofeng, et al.
Published: (2025)

ChartGen: Scaling Chart Understanding Via Code-Guided Synthetic Chart Generation
by: Kondic, Jovana, et al.
Published: (2025)

SSSUMO: Real-Time Semi-Supervised Submovement Decomposition
by: Rudakov, Evgenii, et al.
Published: (2025)

Predicting 3D Motion from 2D Video for Behavior-Based VR Biometrics
by: Li, Mingjun, et al.
Published: (2025)

CG-MER: A Card Game-based Multimodal dataset for Emotion Recognition
by: Farhat, Nessrine, et al.
Published: (2025)

Towards user-centered interactive medical image segmentation in VR with an assistive AI agent
by: Spiegler, Pascal, et al.
Published: (2025)

From Coarse to Nuanced: Cross-Modal Alignment of Fine-Grained Linguistic Cues and Visual Salient Regions for Dynamic Emotion Recognition
by: Liu, Yu, et al.
Published: (2025)