| Main Authors: | Montes, Tony, Lozano, Fernando |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.15928 |
Similar Items
Motion-Guided Semantic Alignment with Negative Prompts for Zero-Shot Video Action Recognition
by: Wang, Yiming, et al.
Published: (2026)
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
by: Zhang, Chuhan, et al.
Published: (2025)
GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models
by: Hacheme, Gilles Quentin, et al.
Published: (2025)
Diffusion Features for Zero-Shot 6DoF Object Pose Estimation
by: Von Gimborn, Bernd, et al.
Published: (2024)
CLIP-Joint-Detect: End-to-End Joint Training of Object Detectors with Contrastive Vision-Language Supervision
by: Raoufi, Behnam, et al.
Published: (2025)
Sign Language Recognition and Translation for Low-Resource Languages: Challenges and Pathways Forward
by: Alishzade, Nigar, et al.
Published: (2026)
Vi-SAFE: A Spatial-Temporal Framework for Efficient Violence Detection in Public Surveillance
by: Chang, Ligang, et al.
Published: (2025)
PhysVideoGenerator: Towards Physically Aware Video Generation via Latent Physics Guidance
by: Satish, Siddarth Nilol Kundur, et al.
Published: (2026)
Decoupled Sensitivity-Consistency Learning for Weakly Supervised Video Anomaly Detection
by: Zheng, Hantao, et al.
Published: (2026)
Transfer-learning for video classification: Video Swin Transformer on multiple domains
by: Oliveira, Daniel A. P., et al.
Published: (2022)
CausalVQA: A Physically Grounded Causal Reasoning Benchmark for Video Models
by: Foss, Aaron, et al.
Published: (2025)
Gaussian Alignment for Relative Camera Pose Estimation via Single-View Reconstruction
by: Li, Yumin, et al.
Published: (2025)
Prompt Sensitivity in Vision-Language Grounding: How Small Changes in Wording Affect Object Detection
by: Deka, Dawar Jyoti, et al.
Published: (2026)
SVGS-DSGAT: An IoT-Enabled Innovation in Underwater Robotic Object Detection Technology
by: Wu, Dongli, et al.
Published: (2025)
A Simple Baseline for Streaming Video Understanding
by: Shen, Yujiao, et al.
Published: (2026)
Action Anticipation from SoccerNet Football Video Broadcasts
by: Dalal, Mohamad, et al.
Published: (2025)
Model-based Metric 3D Shape and Motion Reconstruction of Wild Bottlenose Dolphins in Drone-Shot Videos
by: Baieri, Daniele, et al.
Published: (2025)
Mistake Attribution: Fine-Grained Mistake Understanding in Egocentric Videos
by: Li, Yayuan, et al.
Published: (2025)
NOAH: Benchmarking Narrative Prior driven Hallucination and Omission in Video Large Language Models
by: Lee, Kyuho, et al.
Published: (2025)
A Computer Vision Pipeline for Iterative Bullet Hole Tracking in Rifle Zeroing
by: Belcher, Robert M., et al.
Published: (2026)
Automated Plant Disease and Pest Detection System Using Hybrid Lightweight CNN-MobileViT Models for Diagnosis of Indigenous Crops
by: Gebremedhin, Tekleab G., et al.
Published: (2025)
Single-Shot Metric Depth from Focused Plenoptic Cameras
by: Lasheras-Hernandez, Blanca, et al.
Published: (2024)
Physical Knot Classification Beyond Accuracy: A Benchmark and Diagnostic Study
by: Nie, Shiheng, et al.
Published: (2026)
Human-Centric Perception for Child Sexual Abuse Imagery
by: Laranjeira, Camila, et al.
Published: (2026)
A Multi-purpose Tracking Framework for Salmon Welfare Monitoring in Challenging Environments
by: Høgstedt, Espen Uri, et al.
Published: (2025)
Hierarchical Deep Learning for Diatom Image Classification: A Multi-Level Taxonomic Approach
by: Ke, Yueying
Published: (2025)
POC-SLT: Partial Object Completion with SDF Latent Transformers
by: Zakeri, Faezeh, et al.
Published: (2024)
Reducing Object Hallucination in LVLMs via Emphasizing Image-negative Tokens
by: Shen, Meng, et al.
Published: (2026)
Perception-to-Pursuit: Track-Centric Temporal Reasoning for Open-World Drone Detection and Autonomous Chasing
by: Oruganti, Venkatakrishna Reddy
Published: (2026)
Car Object Counting and Position Estimation via Extension of the CLIP-EBC Framework
by: Jung, Seoik, et al.
Published: (2025)
Privacy-Preserving Structureless Visual Localization via Image Obfuscation
by: Panek, Vojtech, et al.
Published: (2026)
When to Call an Apple Red: Humans Follow Introspective Rules, VLMs Don't
by: Nemitz, Jonathan, et al.
Published: (2026)
Removing Cost Volumes from Optical Flow Estimators
by: Kiefhaber, Simon, et al.
Published: (2025)
TF-Lane: Traffic Flow Module for Robust Lane Perception
by: Xie, Yihan, et al.
Published: (2026)
Generalized Closed-form Formulae for Feature-based Subpixel Alignment in Patch-based Matching
by: Jospin, Laurent Valentin, et al.
Published: (2021)
Tracking Moose using Aerial Object Detection
by: Indris, Christopher, et al.
Published: (2025)
EE3P: Event-based Estimation of Periodic Phenomena Properties
by: Kolář, Jakub, et al.
Published: (2024)
EEPPR: Event-based Estimation of Periodic Phenomena Rate using Correlation in 3D
by: Kolář, Jakub, et al.
Published: (2024)
CARDIE: clustering algorithm on relevant descriptors for image enhancement
by: Bonino, Giulia, et al.
Published: (2025)
Human Modelling and Pose Estimation Overview
by: Knap, Pawel
Published: (2024)