:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Yu; Zhao, Xinyi; Bi, Chongke; Chen, Siming
Format:	Preprint
Published:	2026
Subjects:	Human-Computer Interaction; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.10871

Similar Items

UST-Hand: An Uncertainty-aware Spatiotemporal Point Cloud Interaction Network for 3D Self-supervised Hand Pose Estimation
by: Han, Tianhao, et al.
Published: (2026)

Precise Workcell Sketching from Point Clouds Using an AR Toolbox
by: Zieliński, Krzysztof, et al.
Published: (2024)

Extracting Human Attention through Crowdsourced Patch Labeling
by: Chang, Minsuk, et al.
Published: (2024)

Augmenting Image Annotation: A Human-LMM Collaborative Framework for Efficient Object Selection and Label Generation
by: Zhang, He, et al.
Published: (2025)

PixelWeb: The First Web GUI Dataset with Pixel-Wise Labels
by: Yang, Qi, et al.
Published: (2025)

ColorGPT: Leveraging Large Language Models for Multimodal Color Recommendation
by: Xia, Ding, et al.
Published: (2025)

UIPro: Unleashing Superior Interaction Capability For GUI Agents
by: Li, Hongxin, et al.
Published: (2025)

Panda or not Panda? Understanding Adversarial Attacks with Interactive Visualization
by: You, Yuzhe, et al.
Published: (2023)

InterAnimate: Taming Region-aware Diffusion Model for Realistic Human Interaction Animation
by: Lin, Yukang, et al.
Published: (2025)

Exploring the "Great Unseen" in Medieval Manuscripts: Instance-Level Labeling of Legacy Image Collections with Zero-Shot Models
by: Meinecke, Christofer, et al.
Published: (2025)

SpriteHand: Real-Time Versatile Hand-Object Interaction with Autoregressive Video Generation
by: Li, Zisu, et al.
Published: (2025)

UI-E2I-Synth: Advancing GUI Grounding with Large-Scale Instruction Synthesis
by: Liu, Xinyi, et al.
Published: (2025)

Self-Calibrating BCIs: Ranking and Recovery of Mental Targets Without Labels
by: Grizou, Jonathan, et al.
Published: (2025)

VizDefender: Unmasking Visualization Tampering through Proactive Localization and Intent Inference
by: Song, Sicheng, et al.
Published: (2025)

VRMN-bD: A Multi-modal Natural Behavior Dataset of Immersive Human Fear Responses in VR Stand-up Interactive Games
by: Zhang, He, et al.
Published: (2024)

Scene-Aware Urban Design: A Human-AI Recommendation Framework Using Co-Occurrence Embeddings and Vision-Language Models
by: Gallardo, Rodrigo, et al.
Published: (2025)

DeepSORT-Driven Visual Tracking Approach for Gesture Recognition in Interactive Systems
by: Zhang, Tong, et al.
Published: (2025)

Yume: An Interactive World Generation Model
by: Mao, Xiaofeng, et al.
Published: (2025)

QueryCraft: Transformer-Guided Query Initialization for Enhanced Human-Object Interaction Detection
by: Wang, Yuxiao, et al.
Published: (2025)

Interact with me: Joint Egocentric Forecasting of Intent to Interact, Attitude and Social Actions
by: Bian, Tongfei, et al.
Published: (2024)

RISEE: A Highly Interactive Naturalistic Driving Trajectories Dataset with Human Subjective Risk Perception and Eye-tracking Information
by: Wu, Xinzheng, et al.
Published: (2025)

ICo3D: An Interactive Conversational 3D Virtual Human
by: Shaw, Richard, et al.
Published: (2026)

Interactivity x Explainability: Toward Understanding How Interactivity Can Improve Computer Vision Explanations
by: Panigrahi, Indu, et al.
Published: (2025)

Enhanced Automated Quality Assessment Network for Interactive Building Segmentation in High-Resolution Remote Sensing Imagery
by: Zhang, Zhili, et al.
Published: (2024)

NARVis: Neural Accelerated Rendering for Real-Time Scientific Point Cloud Visualization
by: Hegde, Srinidhi, et al.
Published: (2024)

VR-GS: A Physical Dynamics-Aware Interactive Gaussian Splatting System in Virtual Reality
by: Jiang, Ying, et al.
Published: (2024)

3DPFIX: Improving Remote Novices' 3D Printing Troubleshooting through Human-AI Collaboration
by: Kwon, Nahyun, et al.
Published: (2024)

ObjectFinder: An Open-Vocabulary Assistive System for Interactive Object Search by Blind People
by: Liu, Ruiping, et al.
Published: (2024)

Do MLLMs Understand Pointing? Benchmarking and Enhancing Referential Reasoning in Egocentric Vision
by: Li, Chentao, et al.
Published: (2026)

Automated Label Placement on Maps via Large Language Models
by: Shomer, Harry, et al.
Published: (2025)

ViT-Explainer: An Interactive Walkthrough of the Vision Transformer Pipeline
by: Hernandez, Juan Manuel, et al.
Published: (2026)

Personalized Interpretability -- Interactive Alignment of Prototypical Parts Networks
by: Michalski, Tomasz, et al.
Published: (2025)

FreeDrag: Feature Dragging for Reliable Point-based Image Editing
by: Ling, Pengyang, et al.
Published: (2023)

Zero-shot Emotion Annotation in Facial Images Using Large Multimodal Models: Benchmarking and Prospects for Multi-Class, Multi-Frame Approaches
by: Zhang, He, et al.
Published: (2025)

MedFoundationHub: A Lightweight and Secure Toolkit for Deploying Medical Vision Language Foundation Models
by: Li, Xiao, et al.
Published: (2025)

StreamAvatar: Streaming Diffusion Models for Real-Time Interactive Human Avatars
by: Sun, Zhiyao, et al.
Published: (2025)

Collection Space Navigator: An Interactive Visualization Interface for Multidimensional Datasets
by: Ohm, Tillmann, et al.
Published: (2023)

An Egocentric Vision-Language Model based Portable Real-time Smart Assistant
by: Huang, Yifei, et al.
Published: (2025)

Visually Grounded Narratives: Reducing Cognitive Burden in Researcher-Participant Interaction
by: Wu, Runtong, et al.
Published: (2025)

HpEIS: Learning Hand Pose Embeddings for Multimedia Interactive Systems
by: Xu, Songpei, et al.
Published: (2024)