:: Library Catalog

Bibliographic Details
Main Authors:	Ma, Marcus; Prescott, Jordan; Zhou, Emily; Feng, Tiantian; Avramidis, Kleanthis; Toth, Gabor Mihaly; Narayanan, Shrikanth
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2601.12534

Similar Items

Looking Into the Past: Eye Movements Characterize Elements of Autobiographical Recall in Interviews with Holocaust Survivors
by: Zhou, Emily, et al.
Published: (2026)

Early Detection of Coffee Leaf Rust Through Convolutional Neural Networks Trained on Low-Resolution Images
by: Cabrera, Angelly, et al.
Published: (2024)

Emotion-Aligned Contrastive Learning Between Images and Music
by: Stewart, Shanti, et al.
Published: (2023)

Smiling Regulates Emotion During Traumatic Recollection
by: Ma, Marcus, et al.
Published: (2026)

Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing?
by: Feng, Tiantian, et al.
Published: (2024)

Knowledge-guided EEG Representation Learning
by: Kommineni, Aditya, et al.
Published: (2024)

VoxCare: Studying Natural Communication Behaviors of Hospital Caregivers through Wearable Sensing of Egocentric Audio
by: Feng, Tiantian, et al.
Published: (2026)

Assessing Visual Privacy Risks in Multimodal AI: A Novel Taxonomy-Grounded Evaluation of Vision-Language Models
by: Tsaprazlis, Efthymios, et al.
Published: (2025)

Toward Fully-End-to-End Listened Speech Decoding from EEG Signals
by: Lee, Jihwan, et al.
Published: (2024)

Rethinking Visual Privacy: A Compositional Privacy Risk Framework for Severity Assessment with VLMs
by: Tsaprazlis, Efthymios, et al.
Published: (2026)

Neural Codecs as Biosignal Tokenizers
by: Avramidis, Kleanthis, et al.
Published: (2025)

PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Speech Models
by: Feng, Tiantian, et al.
Published: (2023)

Trade-offs Between Capacity and Robustness in Neural Audio Codecs for Adversarially Robust Speech Recognition
by: Prescott, Jordan, et al.
Published: (2026)

Speech2rtMRI: Speech-Guided Diffusion Model for Real-time MRI Video of the Vocal Tract during Speech
by: Nguyen, Hong, et al.
Published: (2024)

Masked Image Modeling as a Framework for Self-Supervised Learning across Eye Movements
by: Weiler, Robin, et al.
Published: (2024)

Evaluating Atypical Gaze Patterns through Vision Models: The Case of Cortical Visual Impairment
by: Avramidis, Kleanthis, et al.
Published: (2024)

Aperiodic and Low-Frequency Spectral Bias in Reconstruction based EEG Foundation Models
by: Kommineni, Aditya, et al.
Published: (2026)

Fake & Square: Training Self-Supervised Vision Transformers with Synthetic Data and Synthetic Hard Negatives
by: Giakoumoglou, Nikolaos, et al.
Published: (2025)

Whitening Consistently Improves Self-Supervised Learning
by: Kalapos, András, et al.
Published: (2024)

Informed Bootstrap Augmentation Improves EEG Decoding
by: Jeong, Woojae, et al.
Published: (2025)

The Importance of Facial Features in Vision-based Sign Language Recognition: Eyes, Mouth or Full Face?
by: Pham, Dinh Nam, et al.
Published: (2025)

How to Retrieve Examples in In-context Learning to Improve Conversational Emotion Recognition using Large Language Models?
by: Wang, Mengqi, et al.
Published: (2025)

Understanding Stress, Burnout, and Behavioral Patterns in Medical Residents Using Large-scale Longitudinal Wearable Recordings
by: Feng, Tiantian, et al.
Published: (2024)

Developing a High-performance Framework for Speech Emotion Recognition in Naturalistic Conditions Challenge for Emotional Attribute Prediction
by: Lertpetchpun, Thanathai, et al.
Published: (2025)

Estimating Markers of Driving Stress through Multimodal Physiological Monitoring
by: Avramidis, Kleanthis, et al.
Published: (2025)

Creating a Lens of Chinese Culture: A Multimodal Dataset for Chinese Pun Rebus Art Understanding
by: Zhang, Tuo, et al.
Published: (2024)

Towards Child-Inclusive Clinical Video Understanding for Autism Spectrum Disorder
by: Kommineni, Aditya, et al.
Published: (2024)

MOOSE: Pay Attention to Temporal Dynamics for Video Understanding via Optical Flows
by: Nguyen, Hong, et al.
Published: (2025)

CNN-JEPA: Self-Supervised Pretraining Convolutional Neural Networks Using Joint Embedding Predictive Architecture
by: Kalapos, András, et al.
Published: (2024)

Deep Learning Characterizes Depression and Suicidal Ideation from Eye Movements
by: Avramidis, Kleanthis, et al.
Published: (2025)

SIDME: Self-supervised Image Demoiréing via Masked Encoder-Decoder Reconstruction
by: Wang, Xia, et al.
Published: (2025)

Action Motifs: Self-Supervised Hierarchical Representation of Human Body Movements
by: Kinoshita, Genki, et al.
Published: (2026)

Diffusion-Driven Self-Supervised Learning for Shape Reconstruction and Pose Estimation
by: Sun, Jingtao, et al.
Published: (2024)

Self-Supervised Sparse Sensor Fusion for Long Range Perception
by: Palladin, Edoardo, et al.
Published: (2025)

Transfer Learning from Visual Speech Recognition to Mouthing Recognition in German Sign Language
by: Pham, Dinh Nam, et al.
Published: (2025)

Self-Supervised Bird's Eye View Motion Prediction with Cross-Modality Signals
by: Fang, Shaoheng, et al.
Published: (2024)

ConPro: Learning Severity Representation for Medical Images using Contrastive Learning and Preference Optimization
by: Nguyen, Hong, et al.
Published: (2024)

Developing a Top-tier Framework in Naturalistic Conditions Challenge for Categorized Emotion Prediction: From Speech Foundation Models and Learning Objective to Data Augmentation and Engineering Choices
by: Feng, Tiantian, et al.
Published: (2025)

AnimatePainter: A Self-Supervised Rendering Framework for Reconstructing Painting Process
by: Hu, Junjie, et al.
Published: (2025)

Intelligence Requires Grounding But Not Embodiment
by: Ma, Marcus, et al.
Published: (2026)