:: Library Catalog

Bibliographic Details
Main Authors:	Montes, Tony; Lozano, Fernando
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Computation and Language; I.4.8
Online Access:	https://arxiv.org/abs/2505.15928

Similar Items

Motion-Guided Semantic Alignment with Negative Prompts for Zero-Shot Video Action Recognition
by: Wang, Yiming, et al.
Published: (2026)

Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
by: Zhang, Chuhan, et al.
Published: (2025)

GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models
by: Hacheme, Gilles Quentin, et al.
Published: (2025)

Diffusion Features for Zero-Shot 6DoF Object Pose Estimation
by: Von Gimborn, Bernd, et al.
Published: (2024)

CLIP-Joint-Detect: End-to-End Joint Training of Object Detectors with Contrastive Vision-Language Supervision
by: Raoufi, Behnam, et al.
Published: (2025)

Sign Language Recognition and Translation for Low-Resource Languages: Challenges and Pathways Forward
by: Alishzade, Nigar, et al.
Published: (2026)

Vi-SAFE: A Spatial-Temporal Framework for Efficient Violence Detection in Public Surveillance
by: Chang, Ligang, et al.
Published: (2025)

PhysVideoGenerator: Towards Physically Aware Video Generation via Latent Physics Guidance
by: Satish, Siddarth Nilol Kundur, et al.
Published: (2026)

Decoupled Sensitivity-Consistency Learning for Weakly Supervised Video Anomaly Detection
by: Zheng, Hantao, et al.
Published: (2026)

Transfer-learning for video classification: Video Swin Transformer on multiple domains
by: Oliveira, Daniel A. P., et al.
Published: (2022)

CausalVQA: A Physically Grounded Causal Reasoning Benchmark for Video Models
by: Foss, Aaron, et al.
Published: (2025)

Gaussian Alignment for Relative Camera Pose Estimation via Single-View Reconstruction
by: Li, Yumin, et al.
Published: (2025)

Prompt Sensitivity in Vision-Language Grounding: How Small Changes in Wording Affect Object Detection
by: Deka, Dawar Jyoti, et al.
Published: (2026)

SVGS-DSGAT: An IoT-Enabled Innovation in Underwater Robotic Object Detection Technology
by: Wu, Dongli, et al.
Published: (2025)

A Simple Baseline for Streaming Video Understanding
by: Shen, Yujiao, et al.
Published: (2026)

Action Anticipation from SoccerNet Football Video Broadcasts
by: Dalal, Mohamad, et al.
Published: (2025)

Model-based Metric 3D Shape and Motion Reconstruction of Wild Bottlenose Dolphins in Drone-Shot Videos
by: Baieri, Daniele, et al.
Published: (2025)

Mistake Attribution: Fine-Grained Mistake Understanding in Egocentric Videos
by: Li, Yayuan, et al.
Published: (2025)

NOAH: Benchmarking Narrative Prior driven Hallucination and Omission in Video Large Language Models
by: Lee, Kyuho, et al.
Published: (2025)

A Computer Vision Pipeline for Iterative Bullet Hole Tracking in Rifle Zeroing
by: Belcher, Robert M., et al.
Published: (2026)

Automated Plant Disease and Pest Detection System Using Hybrid Lightweight CNN-MobileViT Models for Diagnosis of Indigenous Crops
by: Gebremedhin, Tekleab G., et al.
Published: (2025)

Single-Shot Metric Depth from Focused Plenoptic Cameras
by: Lasheras-Hernandez, Blanca, et al.
Published: (2024)

Physical Knot Classification Beyond Accuracy: A Benchmark and Diagnostic Study
by: Nie, Shiheng, et al.
Published: (2026)

Human-Centric Perception for Child Sexual Abuse Imagery
by: Laranjeira, Camila, et al.
Published: (2026)

A Multi-purpose Tracking Framework for Salmon Welfare Monitoring in Challenging Environments
by: Høgstedt, Espen Uri, et al.
Published: (2025)

Hierarchical Deep Learning for Diatom Image Classification: A Multi-Level Taxonomic Approach
by: Ke, Yueying
Published: (2025)

POC-SLT: Partial Object Completion with SDF Latent Transformers
by: Zakeri, Faezeh, et al.
Published: (2024)

Reducing Object Hallucination in LVLMs via Emphasizing Image-negative Tokens
by: Shen, Meng, et al.
Published: (2026)

Perception-to-Pursuit: Track-Centric Temporal Reasoning for Open-World Drone Detection and Autonomous Chasing
by: Oruganti, Venkatakrishna Reddy
Published: (2026)

Car Object Counting and Position Estimation via Extension of the CLIP-EBC Framework
by: Jung, Seoik, et al.
Published: (2025)

Privacy-Preserving Structureless Visual Localization via Image Obfuscation
by: Panek, Vojtech, et al.
Published: (2026)

When to Call an Apple Red: Humans Follow Introspective Rules, VLMs Don't
by: Nemitz, Jonathan, et al.
Published: (2026)

Removing Cost Volumes from Optical Flow Estimators
by: Kiefhaber, Simon, et al.
Published: (2025)

TF-Lane: Traffic Flow Module for Robust Lane Perception
by: Xie, Yihan, et al.
Published: (2026)

Generalized Closed-form Formulae for Feature-based Subpixel Alignment in Patch-based Matching
by: Jospin, Laurent Valentin, et al.
Published: (2021)

Tracking Moose using Aerial Object Detection
by: Indris, Christopher, et al.
Published: (2025)

EE3P: Event-based Estimation of Periodic Phenomena Properties
by: Kolář, Jakub, et al.
Published: (2024)

EEPPR: Event-based Estimation of Periodic Phenomena Rate using Correlation in 3D
by: Kolář, Jakub, et al.
Published: (2024)

CARDIE: clustering algorithm on relevant descriptors for image enhancement
by: Bonino, Giulia, et al.
Published: (2025)

Human Modelling and Pose Estimation Overview
by: Knap, Pawel
Published: (2024)