| Main Authors: | Laiti, Francesco, Talon, Davide, Staiano, Jacopo, Ricci, Elisa |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.21877 |
Similar Items
Conditioned Prompt-Optimization for Continual Deepfake Detection
by: Laiti, Francesco, et al.
Published: (2024)
One VLM to Keep it Learning: Generation and Balancing for Data-free Continual Visual Question Answering
by: Das, Deepayan, et al.
Published: (2024)
Training-Free Personalization via Retrieval and Reasoning on Fingerprints
by: Das, Deepayan, et al.
Published: (2025)
ProfVLM: A lightweight video-language model for multi-view proficiency estimation
by: Bianchi, Edoardo, et al.
Published: (2025)
Incremental Object-Based Novelty Detection with Feedback Loop
by: Caldarella, Simone, et al.
Published: (2023)
Evaluating Attribute Confusion in Fashion Text-to-Image Generation
by: Liu, Ziyue, et al.
Published: (2025)
ExpertAF: Expert Actionable Feedback from Video
by: Ashutosh, Kumar, et al.
Published: (2024)
Key Design Choices in Source-Free Unsupervised Domain Adaptation: An In-depth Empirical Analysis
by: Maracani, Andrea, et al.
Published: (2024)
Empowering Small VLMs to Think with Dynamic Memorization and Exploration
by: Liu, Jiazhen, et al.
Published: (2025)
How Diffusion Models Memorize
by: Kim, Juyeop, et al.
Published: (2025)
Training-Free Semantic Multi-Object Tracking with Vision-Language Models
by: Bonat, Laurence, et al.
Published: (2026)
Specificity-aware reinforcement learning for fine-grained open-world classification
by: Angheben, Samuele, et al.
Published: (2026)
Seeing the Abstract: Translating the Abstract Language for Vision Language Models
by: Talon, Davide, et al.
Published: (2025)
Towards Unconstrained Human-Object Interaction
by: Tonini, Francesco, et al.
Published: (2026)
Dynamic Scoring with Enhanced Semantics for Training-Free Human-Object Interaction Detection
by: Tonini, Francesco, et al.
Published: (2025)
From Weights to Concepts: Data-Free Interpretability of CLIP via Singular Vector Decomposition
by: Gentile, Francesco, et al.
Published: (2026)
How (Mis)calibrated is Your Federated CLIP and What To Do About It?
by: Singha, Mainak, et al.
Published: (2025)
AL-GTD: Deep Active Learning for Gaze Target Detection
by: Tonini, Francesco, et al.
Published: (2024)
Not Only Text: Exploring Compositionality of Visual Representations in Vision-Language Models
by: Berasi, Davide, et al.
Published: (2025)
LOTS of Fashion! Multi-Conditioning for Image Generation via Sketch-Text Pairing
by: Girella, Federico, et al.
Published: (2025)
Multi-Level Conditioning by Pairing Localized Text and Sketch for Fashion Image Generation
by: Liu, Ziyue, et al.
Published: (2026)
Unconsciously Forget: Mitigating Memorization; Without Knowing What is being Memorized
by: Jin, Er, et al.
Published: (2025)
Modeling Visual Memorability Assessment with Autoencoders Reveals Characteristics of Memorable Images
by: Bagheri, Elham, et al.
Published: (2024)
Interactive Episodic Memory with User Feedback
by: Subedi, Nikesh, et al.
Published: (2026)
3D Object Detection from Images for Autonomous Driving: A Survey
by: Ma, Xinzhu, et al.
Published: (2022)
The Phantom Menace: Unmasking Privacy Leakages in Vision-Language Models
by: Caldarella, Simone, et al.
Published: (2024)
Memorizing SAM: 3D Medical Segment Anything Model with Memorizing Transformer
by: Shao, Xinyuan, et al.
Published: (2024)
Exploring Fine-grained Retail Product Discrimination with Zero-shot Object Classification Using Vision-Language Models
by: Tur, Anil Osman, et al.
Published: (2024)
Retrieval-enriched zero-shot image classification in low-resource domains
by: Dall'Asen, Nicola, et al.
Published: (2024)
Unlearning Personal Data from a Single Image
by: De Min, Thomas, et al.
Published: (2024)
Understanding the Impact of Negative Prompts: When and How Do They Take Effect?
by: Ban, Yuanhao, et al.
Published: (2024)
Large Multimodal Models as General In-Context Classifiers
by: Garosi, Marco, et al.
Published: (2026)
Compositional Caching for Training-free Open-vocabulary Attribute Detection
by: Garosi, Marco, et al.
Published: (2025)
Harnessing Large Language Models for Training-free Video Anomaly Detection
by: Zanella, Luca, et al.
Published: (2024)
Test-Time Zero-Shot Temporal Action Localization
by: Liberatori, Benedetta, et al.
Published: (2024)
Novel class discovery meets foundation models for 3D semantic segmentation
by: Riz, Luigi, et al.
Published: (2023)
Training-free Online Video Step Grounding
by: Zanella, Luca, et al.
Published: (2025)
Task-Focused Memorization for Multimodal Agents
by: Zou, Tao, et al.
Published: (2026)
Investigating Memorization in Video Diffusion Models
by: Chen, Chen, et al.
Published: (2024)
Towards Memorization-Free Diffusion Models
by: Chen, Chen, et al.
Published: (2024)