| Main Authors: | Lombardo, Gabriele, Maiorana, Luigi, Presti, Liliana Lo, La Cascia, Marco |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.09090 |
Similar Items
Sphere-Depth: A Benchmark for Depth Estimation Methods with Varying Spherical Camera Orientations
by: Gazzeh, Soulayma, et al.
Published: (2026)
Learn&Drop: Fast Learning of CNNs based on Layer Dropping
by: Cruciata, Giorgio, et al.
Published: (2026)
Rethinking Visual Counterfactual Explanations Through Region Constraint
by: Sobieski, Bartlomiej, et al.
Published: (2024)
V-CECE: Visual Counterfactual Explanations via Conceptual Edits
by: Spanos, Nikolaos, et al.
Published: (2025)
Retrieving Counterfactuals Improves Visual In-Context Learning
by: Xiong, Guangzhi, et al.
Published: (2026)
LogicGaze: Benchmarking Causal Consistency in Visual Narratives via Counterfactual Verification
by: Driscoll, Rory, et al.
Published: (2026)
Visual Position Prompt for MLLM based Visual Grounding
by: Tang, Wei, et al.
Published: (2025)
Towards Understanding Visual Grounding in Visual Language Models
by: Pantazopoulos, Georgios, et al.
Published: (2025)
DiG-IN: Diffusion Guidance for Investigating Networks -- Uncovering Classifier Differences Neuron Visualisations and Visual Counterfactual Explanations
by: Augustin, Maximilian, et al.
Published: (2023)
CrimEdit: Controllable Editing for Counterfactual Object Removal, Insertion, and Movement
by: Jeon, Boseong, et al.
Published: (2025)
Progressive Language-guided Visual Learning for Multi-Task Visual Grounding
by: Wang, Jingchao, et al.
Published: (2025)
PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training
by: Chen, Cong, et al.
Published: (2025)
Reasoning Matters for 3D Visual Grounding
by: Huang, Hsiang-Wei, et al.
Published: (2026)
Counterfactual Segmentation Reasoning: Diagnosing and Mitigating Pixel-Grounding Hallucination
by: Li, Xinzhuo, et al.
Published: (2025)
Why Are You Wrong? Counterfactual Explanations for Language Grounding with 3D Objects
by: Preintner, Tobias, et al.
Published: (2025)
VGR: Visual Grounded Reasoning
by: Wang, Jiacong, et al.
Published: (2025)
Turin3D: Evaluating Adaptation Strategies under Label Scarcity in Urban LiDAR Segmentation with Semi-Supervised Techniques
by: Barco, Luca, et al.
Published: (2025)
Adversarial Training with OCR Modality Perturbation for Scene-Text Visual Question Answering
by: Shen, Zhixuan, et al.
Published: (2024)
Revisiting Visual Understanding in Multimodal Reasoning through a Lens of Image Perturbation
by: Li, Yuting, et al.
Published: (2025)
OpenGround: Active Cognition-based Reasoning for Open-World 3D Visual Grounding
by: Huang, Wenyuan, et al.
Published: (2025)
Visual Grounding for Object-Level Generalization in Reinforcement Learning
by: Jiang, Haobin, et al.
Published: (2024)
DOGR: Towards Versatile Visual Document Grounding and Referring
by: Zhou, Yinan, et al.
Published: (2024)
Counterfactual Edits for Generative Evaluation
by: Lymperaiou, Maria, et al.
Published: (2023)
Are Large Vision Language Models Truly Grounded in Medical Images? Evidence from Italian Clinical Visual Question Answering
by: Felizzi, Federico, et al.
Published: (2025)
SocialCounterfactuals: Probing and Mitigating Intersectional Social Biases in Vision-Language Models with Counterfactual Examples
by: Howard, Phillip, et al.
Published: (2023)
Grounding-Aware Token Pruning: Recovering from Drastic Performance Drops in Visual Grounding Caused by Pruning
by: Chien, Tzu-Chun, et al.
Published: (2025)
TruthLens: Visual Grounding for Universal DeepFake Reasoning
by: Kundu, Rohit, et al.
Published: (2025)
Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints
by: Dai, Ming, et al.
Published: (2025)
GenIR: Generative Visual Feedback for Mental Image Retrieval
by: Yang, Diji, et al.
Published: (2025)
ExpVG: Investigating the Design Space of Visual Grounding in Multimodal Large Language Model
by: Kang, Weitai, et al.
Published: (2025)
Global Context or Local Detail? Adaptive Visual Grounding for Hallucination Mitigation
by: Jiang, Yubo, et al.
Published: (2026)
IVLMap: Instance-Aware Visual Language Grounding for Consumer Robot Navigation
by: Huang, Jiacui, et al.
Published: (2024)
EAGLE: Enhanced Visual Grounding Minimizes Hallucinations in Instructional Multimodal Models
by: Villa, Andrés, et al.
Published: (2025)
Adversarial Testing for Visual Grounding via Image-Aware Property Reduction
by: Chang, Zhiyuan, et al.
Published: (2024)
Leveraging Vision-Language Models for Visual Grounding and Analysis of Automotive UI
by: Ernhofer, Benjamin Raphael, et al.
Published: (2025)
Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding
by: Huy, Ta Duc, et al.
Published: (2025)
Enhancing Radiology Report Generation and Visual Grounding using Reinforcement Learning
by: Gundersen, Benjamin, et al.
Published: (2025)
Bridging Modality Gap for Visual Grounding with Effecitve Cross-modal Distillation
by: Wang, Jiaxi, et al.
Published: (2023)
Efficient Adaptation For Remote Sensing Visual Grounding
by: Moughnieh, Hasan, et al.
Published: (2025)
The Role of Entropy in Visual Grounding: Analysis and Optimization
by: Li, Shuo, et al.
Published: (2025)