:: Library Catalog

Bibliographic Details
Main Authors:	Zhou, Emily; Ma, Marcus; Avramidis, Kleanthis; Toth, Gabor Mihaly; Narayanan, Shrikanth
Format:	Preprint
Published:	2026
Subjects:	Multimedia
Online Access:	https://arxiv.org/abs/2604.22016

Similar Items

Encoding Emotion Through Self-Supervised Eye Movement Reconstruction
by: Ma, Marcus, et al.
Published: (2026)

Smiling Regulates Emotion During Traumatic Recollection
by: Ma, Marcus, et al.
Published: (2026)

Emotion-Aligned Contrastive Learning Between Images and Music
by: Stewart, Shanti, et al.
Published: (2023)

Early Detection of Coffee Leaf Rust Through Convolutional Neural Networks Trained on Low-Resolution Images
by: Cabrera, Angelly, et al.
Published: (2024)

Knowledge-guided EEG Representation Learning
by: Kommineni, Aditya, et al.
Published: (2024)

An Emotion Recognition Framework via Cross-modal Alignment of EEG and Eye Movement Data
by: Wang, Jianlu, et al.
Published: (2025)

VoxEmo: Benchmarking Speech Emotion Recognition with Speech LLMs
by: Zhang, Hezhao, et al.
Published: (2026)

Look, Listen and Segment: Towards Weakly Supervised Audio-visual Semantic Segmentation
by: Li, Chengzhi, et al.
Published: (2026)

Evaluating Atypical Gaze Patterns through Vision Models: The Case of Cortical Visual Impairment
by: Avramidis, Kleanthis, et al.
Published: (2024)

Deep Learning Characterizes Depression and Suicidal Ideation from Eye Movements
by: Avramidis, Kleanthis, et al.
Published: (2025)

Informed Bootstrap Augmentation Improves EEG Decoding
by: Jeong, Woojae, et al.
Published: (2025)

Movement- and Traffic-based User Identification in Commercial Virtual Reality Applications: Threats and Opportunities
by: Baldoni, Sara, et al.
Published: (2025)

Listening to the Unspoken: Exploring "365" Aspects of Multimodal Interview Performance Assessment
by: Li, Jia, et al.
Published: (2025)

Remember Past, Anticipate Future: Learning Continual Multimodal Misinformation Detectors
by: Wang, Bing, et al.
Published: (2025)

Archiving Body Movements: Collective Generation of Chinese Calligraphy
by: Zhou, Aven Le, et al.
Published: (2023)

VoxCare: Studying Natural Communication Behaviors of Hospital Caregivers through Wearable Sensing of Egocentric Audio
by: Feng, Tiantian, et al.
Published: (2026)

Editing on the Generative Manifold: A Theoretical and Empirical Study of General Diffusion-Based Image Editing Trade-offs
by: Hu, Yi, et al.
Published: (2026)

DanceCamera3D: 3D Camera Movement Synthesis with Music and Dance
by: Wang, Zixuan, et al.
Published: (2024)

Conformer-based Ultrasound-to-Speech Conversion
by: Ibrahimov, Ibrahim, et al.
Published: (2025)

Listen, Look, Drive: Coupling Audio Instructions for User-aware VLA-based Autonomous Driving
by: Guo, Ziang, et al.
Published: (2026)

Characterizing Multimedia Information Environment through Multi-modal Clustering of YouTube Videos
by: Yousefi, Niloofar, et al.
Published: (2024)

Modular Conversational Agents for Surveys and Interviews
by: Yu, Jiangbo, et al.
Published: (2024)

Through Their Eyes: Fixation-aligned Tuning for Personalized User Emulation
by: Huang, Lingfeng, et al.
Published: (2026)

Toward Fully-End-to-End Listened Speech Decoding from EEG Signals
by: Lee, Jihwan, et al.
Published: (2024)

Looking Backward: Streaming Video-to-Video Translation with Feature Banks
by: Liang, Feng, et al.
Published: (2024)

Unravelling the Power of Single-Pass Look-Ahead in Modern Codecs for Optimized Transcoding Deployment
by: Vibhoothi, Vibhoothi, et al.
Published: (2024)

SimInterview: Transforming Business Education through Large Language Model-Based Simulated Multilingual Interview Training System
by: Nguyen, Truong Thanh Hung, et al.
Published: (2025)

Applying LLM-Powered Virtual Humans to Child Interviews in Child-Centered Design
by: Li, Linshi, et al.
Published: (2025)

Look, Compare and Draw: Differential Query Transformer for Automatic Oil Painting
by: Liu, Lingyu, et al.
Published: (2026)

Unraveling Instance Associations: A Closer Look for Audio-Visual Segmentation
by: Chen, Yuanhong, et al.
Published: (2023)

State-Anchored Complete-View Distillation for Robust Conversational Multimodal Emotion Recognition
by: Pan, Zhaoyan, et al.
Published: (2026)

EyeNexus: Adaptive Gaze-Driven Quality and Bitrate Streaming for Seamless VR Cloud Gaming Experiences
by: Wu, Ze, et al.
Published: (2025)

Can Video Diffusion Models Predict Past Frames? Bidirectional Cycle Consistency for Reversible Interpolation
by: Liu, Lingyu, et al.
Published: (2026)

Panonut360: A Head and Eye Tracking Dataset for Panoramic Video
by: Xu, Yutong, et al.
Published: (2024)

FineBadminton: A Multi-Level Dataset for Fine-Grained Badminton Video Understanding
by: He, Xusheng, et al.
Published: (2025)

Augment Before Copy-Paste: Data and Memory Efficiency-Oriented Instance Segmentation Framework for Sport-scenes
by: Hsu, Chih-Chung, et al.
Published: (2024)

Real-time 3D Light-field Viewing with Eye-tracking on Conventional Displays
by: Pham, Trung Hieu, et al.
Published: (2025)

Subjective Evaluation of Frame Rate in Bitrate-Constrained Live Streaming
by: He, Jiaqi, et al.
Published: (2026)

Relationship Analysis of Image-Text Pair in SNS Posts
by: Nabeoka, Takuto, et al.
Published: (2025)

Promisedland: An XR Narrative Attraction Integrating Diorama-to-Virtual Workflow and Elemental Storytelling
by: Wang, Xianghan, et al.
Published: (2025)