:: Library Catalog

Bibliographic Details
Main Authors:	Baranouskaya, Darya; Cavallaro, Andrea
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2601.09449

Similar Items

The impact of abstract and object tags on image privacy classification
by: Baranouskaya, Darya, et al.
Published: (2025)

Which private attributes do VLMs agree on and predict well?
by: Hrynenko, Olena, et al.
Published: (2026)

Zero-shot image privacy classification with Vision-Language Models
by: Baia, Alina Elena, et al.
Published: (2025)

MultiPriv: Benchmarking Individual-Level Privacy Reasoning in Vision-Language Models
by: Sun, Xiongtao, et al.
Published: (2025)

FlowOVD: Learning Generative Latent Flows for Zero-shot Open-vocabulary Detection
by: Wei, Yao, et al.
Published: (2026)

Black-box Attacks on Image Activity Prediction and its Natural Language Explanations
by: Baia, Alina Elena, et al.
Published: (2023)

WLASL-LEX: a Dataset for Recognising Phonological Properties in American Sign Language
by: Tavella, Federico, et al.
Published: (2022)

Image-guided topic modeling for interpretable privacy classification
by: Baia, Alina Elena, et al.
Published: (2024)

Improving Generalization of Language-Conditioned Robot Manipulation
by: Cui, Chenglin, et al.
Published: (2025)

Learning human-to-robot handovers through 3D scene reconstruction
by: Wu, Yuekun, et al.
Published: (2025)

Learning Privacy from Visual Entities
by: Xompero, Alessio, et al.
Published: (2025)

Affordance segmentation of hand-occluded containers from exocentric images
by: Apicella, Tommaso, et al.
Published: (2023)

Cross-modal Counterfactual Explanations: Uncovering Decision Factors and Dataset Biases in Subjective Classification
by: Baia, Alina Elena, et al.
Published: (2025)

CLIP-VAD: Exploiting Vision-Language Models for Voice Activity Detection
by: Appiani, Andrea, et al.
Published: (2024)

Sparse multi-view hand-object reconstruction for unseen environments
by: Pang, Yik Lung, et al.
Published: (2024)

Visual Affordance Prediction: Survey and Reproducibility
by: Apicella, Tommaso, et al.
Published: (2025)

3D-LEX v1.0: 3D Lexicons for American Sign Language and Sign Language of the Netherlands
by: Ranum, Oline, et al.
Published: (2024)

Segmenting Object Affordances: Reproducibility and Sensitivity to Scale
by: Apicella, Tommaso, et al.
Published: (2024)

ComPrivDet: Efficient Privacy Object Detection in Compressed Domains Through Inference Reuse
by: Yao, Yunhao, et al.
Published: (2026)

Multi-concept Model Immunization through Differentiable Model Merging
by: Zheng, Amber Yijia, et al.
Published: (2024)

BIV-Priv-Seg: Locating Private Content in Images Taken by People With Visual Impairments
by: Tseng, Yu-Yun, et al.
Published: (2024)

Open-vocabulary object 6D pose estimation
by: Corsetti, Jaime, et al.
Published: (2023)

Vision-Language Model for Accurate Crater Detection
by: Bauer, Patrick, et al.
Published: (2026)

HyperPriv-EPN: Hypergraph Learning with Privileged Knowledge for Ependymoma Prognosis
by: Yu, Shuren Gabriel, et al.
Published: (2026)

Stereo Hand-Object Reconstruction for Human-to-Robot Handover
by: Pang, Yik Lung, et al.
Published: (2024)

CardioBench: Do Echocardiography Foundation Models Generalize Beyond the Lab?
by: Taratynova, Darya, et al.
Published: (2025)

SPDiffusion: Semantic Protection Diffusion Models for Multi-concept Text-to-image Generation
by: Zhang, Yang, et al.
Published: (2024)

Explaining models relating objects and privacy
by: Xompero, Alessio, et al.
Published: (2024)

Examining Vision Language Models through Multi-dimensional Experiments with Vision and Text Features
by: Sengupta, Saurav, et al.
Published: (2025)

Identifying and Mitigating Position Bias of Multi-image Vision-Language Models
by: Tian, Xinyu, et al.
Published: (2025)

3D Face Reconstruction Error Decomposed: A Modular Benchmark for Fair and Fast Method Evaluation
by: Sariyanidi, Evangelos, et al.
Published: (2025)

High-resolution open-vocabulary object 6D pose estimation
by: Corsetti, Jaime, et al.
Published: (2024)

Vision-Language Models Assisted Unsupervised Video Anomaly Detection
by: Jiang, Yalong, et al.
Published: (2024)

Revisiting Few-Shot Object Detection with Vision-Language Models
by: Madan, Anish, et al.
Published: (2023)

Detecting Text Manipulation in Images using Vision Language Models
by: Vidit, Vidit, et al.
Published: (2025)

Delving into Out-of-Distribution Detection with Medical Vision-Language Models
by: Ju, Lie, et al.
Published: (2025)

Detecting and Evaluating Medical Hallucinations in Large Vision Language Models
by: Chen, Jiawei, et al.
Published: (2024)

Deepfakes: we need to re-think the concept of "real" images
by: Keuper, Janis, et al.
Published: (2025)

Using Vision Language Models to Detect Students' Academic Emotion through Facial Expressions
by: Wang, Deliang, et al.
Published: (2025)

Harnessing Large Language and Vision-Language Models for Robust Out-of-Distribution Detection
by: Lee, Pei-Kang, et al.
Published: (2025)