:: Library Catalog

Bibliographic Details
Main Authors:	Vedernikov, Alexander; Kumar, Puneet; Chen, Haoyu; Seppänen, Tapio; Li, Xiaobai
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2511.14749

Similar Items

TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals
by: Vedernikov, Alexander, et al.
Published: (2024)

Analyzing Participants' Engagement during Online Meetings Using Unsupervised Remote Photoplethysmography with Behavioral Features
by: Vedernikov, Alexander, et al.
Published: (2024)

Not Every Subject Should Stay: Machine Unlearning for Noisy Engagement Recognition
by: Vedernikov, Alexander
Published: (2026)

PriorNet: Prior-Guided Engagement Estimation from Face Video
by: Vedernikov, Alexander
Published: (2026)

VisioPhysioENet: Visual Physiological Engagement Detection Network
by: Singh, Alakhsimar, et al.
Published: (2024)

GAViD: A Large-Scale Multimodal Dataset for Context-Aware Group Affect Recognition from Videos
by: Kumar, Deepak, et al.
Published: (2026)

VISTANet: VIsual Spoken Textual Additive Net for Interpretable Multimodal Emotion Recognition
by: Kumar, Puneet, et al.
Published: (2022)

Computational Analysis of Stress, Depression and Engagement in Mental Health: A Survey
by: Kumar, Puneet, et al.
Published: (2024)

Lightweight Regression Model with Prediction Interval Estimation for Computer Vision-based Winter Road Surface Condition Monitoring
by: Ojala, Risto, et al.
Published: (2023)

A Dual-Domain Convolutional Network for Hyperspectral Single-Image Super-Resolution
by: Karayaka, Murat, et al.
Published: (2025)

OmniFD: A Unified Model for Versatile Face Forgery Detection
by: Liu, Haotian, et al.
Published: (2025)

CodePhys: Robust Video-based Remote Physiological Measurement through Latent Codebook Querying
by: Chu, Shuyang, et al.
Published: (2025)

Making Large Vision Language Models to be Good Few-shot Learners
by: Liu, Fan, et al.
Published: (2024)

Mitigating Cache Noise in Test-Time Adaptation for Large Vision-Language Models
by: Zhai, Haotian, et al.
Published: (2025)

Contrast-Phys: Unsupervised Video-based Remote Physiological Measurement via Spatiotemporal Contrast
by: Sun, Zhaodong, et al.
Published: (2022)

Contrast-Phys+: Unsupervised and Weakly-supervised Video-based Remote Physiological Measurement via Spatiotemporal Contrast
by: Sun, Zhaodong, et al.
Published: (2023)

Rethinking Noise-Robust Training for Frozen Vision Foundation Models: A Cross-Dataset Benchmark with a Case Study of Small-Loss Failure
by: Li, Zitong, et al.
Published: (2026)

Are Large Vision Language Models Good Game Players?
by: Wang, Xinyu, et al.
Published: (2025)

MVRD-Bench: Multi-View Learning and Benchmarking for Dynamic Remote Photoplethysmography under Occlusion
by: He, Zuxian, et al.
Published: (2026)

HalluCXR: Benchmarking and Mitigating Hallucinations in Medical Vision-Language Models for Chest Radiograph Interpretation
by: Wang, Haoyu, et al.
Published: (2026)

Are Multimodal Large Language Models Good Annotators for Image Tagging?
by: Xie, Ming-Kun, et al.
Published: (2026)

Structural Graph Probing of Vision-Language Models
by: He, Haoyu, et al.
Published: (2026)

Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving
by: Zhao, Zongchuang, et al.
Published: (2025)

What Makes Good Few-shot Examples for Vision-Language Models?
by: Guo, Zhaojun, et al.
Published: (2024)

What Makes "Good" Distractors for Object Hallucination Evaluation in Large Vision-Language Models?
by: Xie, Ming-Kun, et al.
Published: (2025)

Multi-Echo Denoising in Adverse Weather
by: Seppänen, Alvari, et al.
Published: (2023)

Learner Attentiveness and Engagement Analysis in Online Education Using Computer Vision
by: Gogawale, Sharva, et al.
Published: (2024)

Intrinsic Gradient Suppression for Label-Noise Prompt Tuning in Vision-Language Models
by: Li, Jiayu, et al.
Published: (2026)

Noise is an Efficient Learner for Zero-Shot Vision-Language Models
by: Imam, Raza, et al.
Published: (2025)

Unboxing Engagement in YouTube Influencer Videos: An Attention-Based Approach
by: Rajaram, Prashant, et al.
Published: (2020)

Joint Vision-Language Social Bias Removal for CLIP
by: Zhang, Haoyu, et al.
Published: (2024)

Enhance Vision-Language Alignment with Noise
by: Huang, Sida, et al.
Published: (2024)

MTT-Bench: Predicting Social Dominance in Mice via Multimodal Large Language Models
by: Chen, Yunquan, et al.
Published: (2026)

Interpretable Image Emotion Recognition: A Domain Adaptation Approach Using Facial Expressions
by: Kumar, Puneet, et al.
Published: (2020)

Robust Prompt Tuning for Vision-Language Models with Mild Semantic Noise
by: Gao, Yansheng, et al.
Published: (2025)

Variation-aware Vision Token Dropping for Faster Large Vision-Language Models
by: Chen, Junjie, et al.
Published: (2025)

Beyond Human Performance: A Vision-Language Multi-Agent Approach for Quality Control in Pharmaceutical Manufacturing
by: Mandal, Subhra Jyoti, et al.
Published: (2026)

Detecting and Evaluating Medical Hallucinations in Large Vision Language Models
by: Chen, Jiawei, et al.
Published: (2024)

A Comprehensive Analysis for Visual Object Hallucination in Large Vision-Language Models
by: Jing, Liqiang, et al.
Published: (2025)

Are We on the Right Way for Evaluating Large Vision-Language Models?
by: Chen, Lin, et al.
Published: (2024)