:: Library Catalog

Bibliographic Details
Main Authors:	Xu, Geng-Xin; Zuo, Xiang; Li, Ye
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Human-Computer Interaction
Online Access:	https://arxiv.org/abs/2507.20737

Similar Items

Decomposing and Fusing Intra- and Inter-Sensor Spatio-Temporal Signal for Multi-Sensor Wearable Human Activity Recognition
by: Xie, Haoyu, et al.
Published: (2025)

From Coarse to Nuanced: Cross-Modal Alignment of Fine-Grained Linguistic Cues and Visual Salient Regions for Dynamic Emotion Recognition
by: Liu, Yu, et al.
Published: (2025)

iLearnRobot: An Interactive Learning-Based Multi-Modal Robot with Continuous Improvement
by: Wang, Kohou, et al.
Published: (2025)

VerSe: Integrating Multiple Queries as Prompts for Versatile Cardiac MRI Segmentation
by: Guo, Bangwei, et al.
Published: (2024)

EEG-based Multimodal Representation Learning for Emotion Recognition
by: Yin, Kang, et al.
Published: (2024)

ModalChorus: Visual Probing and Alignment of Multi-modal Embeddings via Modal Fusion Map
by: Ye, Yilin, et al.
Published: (2024)

Triple Spectral Fusion for Sensor-based Human Activity Recognition
by: Zhang, Ye, et al.
Published: (2026)

Reading Smiles: Proxy Bias in Foundation Models for Facial Emotion Recognition
by: Tsangko, Iosif, et al.
Published: (2025)

AuraMask: An Extensible Pipeline for Developing Aesthetic Anti-Facial Recognition Image Filters
by: Lagogiannis, Jacob, et al.
Published: (2026)

Milmer: a Framework for Multiple Instance Learning based Multimodal Emotion Recognition
by: Wang, Zaitian, et al.
Published: (2025)

CG-MER: A Card Game-based Multimodal dataset for Emotion Recognition
by: Farhat, Nessrine, et al.
Published: (2025)

egoEMOTION: Egocentric Vision and Physiological Signals for Emotion and Personality Recognition in Real-World Tasks
by: Jammot, Matthias, et al.
Published: (2025)

In-Depth Analysis of Emotion Recognition through Knowledge-Based Large Language Models
by: Han, Bin, et al.
Published: (2024)

From Image Generation to Infrastructure Design: a Multi-agent Pipeline for Street Design Generation
by: Wang, Chenguang, et al.
Published: (2025)

DiMo-GUI: Advancing Test-time Scaling in GUI Grounding via Modality-Aware Visual Reasoning
by: Wu, Hang, et al.
Published: (2025)

Zero-shot Emotion Annotation in Facial Images Using Large Multimodal Models: Benchmarking and Prospects for Multi-Class, Multi-Frame Approaches
by: Zhang, He, et al.
Published: (2025)

AI-Based Facial Emotion Recognition Solutions for Education: A Study of Teacher-User and Other Categories
by: Ravenor, R. Yamamoto
Published: (2023)

SASG-DA: Sparse-Aware Semantic-Guided Diffusion Augmentation For Myoelectric Gesture Recognition
by: Liu, Chen, et al.
Published: (2025)

Mask-up: Investigating Biases in Face Re-identification for Masked Faces
by: Jaiswal, Siddharth D, et al.
Published: (2024)

OLMD: Orientation-aware Long-term Motion Decoupling for Continuous Sign Language Recognition
by: Yu, Yiheng, et al.
Published: (2025)

Deep Learning Based Approach to Enhanced Recognition of Emotions and Behavioral Patterns of Autistic Children
by: R, Nelaka K. A., et al.
Published: (2025)

MP-GUI: Modality Perception with MLLMs for GUI Understanding
by: Wang, Ziwei, et al.
Published: (2025)

AV-EmoDialog: Chat with Audio-Visual Users Leveraging Emotional Cues
by: Park, Se Jin, et al.
Published: (2024)

Modelling the Interplay of Eye-Tracking Temporal Dynamics and Personality for Emotion Detection in Face-to-Face Settings
by: Seikavandi, Meisam J., et al.
Published: (2025)

Regressor-Guided Generative Image Editing Balances User Emotions to Reduce Time Spent Online
by: Gebhardt, Christoph, et al.
Published: (2025)

Advancing Talking Head Generation: A Comprehensive Survey of Multi-Modal Methodologies, Datasets, Evaluation Metrics, and Loss Functions
by: Rakesh, Vineet Kumar, et al.
Published: (2025)

Summary of the Unusual Activity Recognition Challenge for Developmental Disability Support
by: Garcia, Christina, et al.
Published: (2026)

MOTION: ML-Assisted On-Device Low-Latency Motion Recognition
by: Pugazhenthi, Veeramani, et al.
Published: (2025)

AKRMap: Adaptive Kernel Regression for Trustworthy Visualization of Cross-Modal Embeddings
by: Ye, Yilin, et al.
Published: (2025)

Deep Generative Domain Adaptation with Temporal Attention for Cross-User Activity Recognition
by: Ye, Xiaozhou, et al.
Published: (2024)

Exploring Object Status Recognition for Recipe Progress Tracking in Non-Visual Cooking
by: Li, Franklin Mingzhe, et al.
Published: (2025)

Deep Generative Domain Adaptation with Temporal Relation Knowledge for Cross-User Activity Recognition
by: Ye, Xiaozhou, et al.
Published: (2024)

Evaluating Visual Prompts with Eye-Tracking Data for MLLM-Based Human Activity Recognition
by: Choi, Jae Young, et al.
Published: (2026)

Explorer: Robust Collection of Interactable GUI Elements
by: Chaimalas, Iason, et al.
Published: (2025)

T2I-Copilot: A Training-Free Multi-Agent Text-to-Image System for Enhanced Prompt Interpretation and Interactive Generation
by: Chen, Chieh-Yun, et al.
Published: (2025)

Achieving Effective Virtual Reality Interactions via Acoustic Gesture Recognition based on Large Language Models
by: Zhang, Xijie, et al.
Published: (2025)

A Cloud-Based Cross-Modal Transformer for Emotion Recognition and Adaptive Human-Computer Interaction
by: Zhong, Ziwen, et al.
Published: (2025)

ImageTalk: Designing a Multimodal AAC Text Generation System Driven by Image Recognition and Natural Language Generation
by: Yang, Boyin, et al.
Published: (2025)

Multi-face emotion detection for effective Human-Robot Interaction
by: Yahyaoui, Mohamed Ala, et al.
Published: (2025)

Posture-Informed Muscular Force Learning for Robust Hand Pressure Estimation
by: Seo, Kyungjin, et al.
Published: (2024)