| Main Authors: | Ma, Marcus, Prescott, Jordan, Zhou, Emily, Feng, Tiantian, Avramidis, Kleanthis, Toth, Gabor Mihaly, Narayanan, Shrikanth |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2601.12534 |
Similar Items
Looking Into the Past: Eye Movements Characterize Elements of Autobiographical Recall in Interviews with Holocaust Survivors
by: Zhou, Emily, et al.
Published: (2026)
Early Detection of Coffee Leaf Rust Through Convolutional Neural Networks Trained on Low-Resolution Images
by: Cabrera, Angelly, et al.
Published: (2024)
Emotion-Aligned Contrastive Learning Between Images and Music
by: Stewart, Shanti, et al.
Published: (2023)
Smiling Regulates Emotion During Traumatic Recollection
by: Ma, Marcus, et al.
Published: (2026)
Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing?
by: Feng, Tiantian, et al.
Published: (2024)
Knowledge-guided EEG Representation Learning
by: Kommineni, Aditya, et al.
Published: (2024)
VoxCare: Studying Natural Communication Behaviors of Hospital Caregivers through Wearable Sensing of Egocentric Audio
by: Feng, Tiantian, et al.
Published: (2026)
Assessing Visual Privacy Risks in Multimodal AI: A Novel Taxonomy-Grounded Evaluation of Vision-Language Models
by: Tsaprazlis, Efthymios, et al.
Published: (2025)
Toward Fully-End-to-End Listened Speech Decoding from EEG Signals
by: Lee, Jihwan, et al.
Published: (2024)
Rethinking Visual Privacy: A Compositional Privacy Risk Framework for Severity Assessment with VLMs
by: Tsaprazlis, Efthymios, et al.
Published: (2026)
Neural Codecs as Biosignal Tokenizers
by: Avramidis, Kleanthis, et al.
Published: (2025)
PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Speech Models
by: Feng, Tiantian, et al.
Published: (2023)
Trade-offs Between Capacity and Robustness in Neural Audio Codecs for Adversarially Robust Speech Recognition
by: Prescott, Jordan, et al.
Published: (2026)
Speech2rtMRI: Speech-Guided Diffusion Model for Real-time MRI Video of the Vocal Tract during Speech
by: Nguyen, Hong, et al.
Published: (2024)
Masked Image Modeling as a Framework for Self-Supervised Learning across Eye Movements
by: Weiler, Robin, et al.
Published: (2024)
Evaluating Atypical Gaze Patterns through Vision Models: The Case of Cortical Visual Impairment
by: Avramidis, Kleanthis, et al.
Published: (2024)
Aperiodic and Low-Frequency Spectral Bias in Reconstruction based EEG Foundation Models
by: Kommineni, Aditya, et al.
Published: (2026)
Fake & Square: Training Self-Supervised Vision Transformers with Synthetic Data and Synthetic Hard Negatives
by: Giakoumoglou, Nikolaos, et al.
Published: (2025)
Whitening Consistently Improves Self-Supervised Learning
by: Kalapos, András, et al.
Published: (2024)
Informed Bootstrap Augmentation Improves EEG Decoding
by: Jeong, Woojae, et al.
Published: (2025)
The Importance of Facial Features in Vision-based Sign Language Recognition: Eyes, Mouth or Full Face?
by: Pham, Dinh Nam, et al.
Published: (2025)
How to Retrieve Examples in In-context Learning to Improve Conversational Emotion Recognition using Large Language Models?
by: Wang, Mengqi, et al.
Published: (2025)
Understanding Stress, Burnout, and Behavioral Patterns in Medical Residents Using Large-scale Longitudinal Wearable Recordings
by: Feng, Tiantian, et al.
Published: (2024)
Developing a High-performance Framework for Speech Emotion Recognition in Naturalistic Conditions Challenge for Emotional Attribute Prediction
by: Lertpetchpun, Thanathai, et al.
Published: (2025)
Estimating Markers of Driving Stress through Multimodal Physiological Monitoring
by: Avramidis, Kleanthis, et al.
Published: (2025)
Creating a Lens of Chinese Culture: A Multimodal Dataset for Chinese Pun Rebus Art Understanding
by: Zhang, Tuo, et al.
Published: (2024)
Towards Child-Inclusive Clinical Video Understanding for Autism Spectrum Disorder
by: Kommineni, Aditya, et al.
Published: (2024)
MOOSE: Pay Attention to Temporal Dynamics for Video Understanding via Optical Flows
by: Nguyen, Hong, et al.
Published: (2025)
CNN-JEPA: Self-Supervised Pretraining Convolutional Neural Networks Using Joint Embedding Predictive Architecture
by: Kalapos, András, et al.
Published: (2024)
Deep Learning Characterizes Depression and Suicidal Ideation from Eye Movements
by: Avramidis, Kleanthis, et al.
Published: (2025)
SIDME: Self-supervised Image Demoiréing via Masked Encoder-Decoder Reconstruction
by: Wang, Xia, et al.
Published: (2025)
Action Motifs: Self-Supervised Hierarchical Representation of Human Body Movements
by: Kinoshita, Genki, et al.
Published: (2026)
Diffusion-Driven Self-Supervised Learning for Shape Reconstruction and Pose Estimation
by: Sun, Jingtao, et al.
Published: (2024)
Self-Supervised Sparse Sensor Fusion for Long Range Perception
by: Palladin, Edoardo, et al.
Published: (2025)
Transfer Learning from Visual Speech Recognition to Mouthing Recognition in German Sign Language
by: Pham, Dinh Nam, et al.
Published: (2025)
Self-Supervised Bird's Eye View Motion Prediction with Cross-Modality Signals
by: Fang, Shaoheng, et al.
Published: (2024)
ConPro: Learning Severity Representation for Medical Images using Contrastive Learning and Preference Optimization
by: Nguyen, Hong, et al.
Published: (2024)
Developing a Top-tier Framework in Naturalistic Conditions Challenge for Categorized Emotion Prediction: From Speech Foundation Models and Learning Objective to Data Augmentation and Engineering Choices
by: Feng, Tiantian, et al.
Published: (2025)
AnimatePainter: A Self-Supervised Rendering Framework for Reconstructing Painting Process
by: Hu, Junjie, et al.
Published: (2025)
Intelligence Requires Grounding But Not Embodiment
by: Ma, Marcus, et al.
Published: (2026)