:: Library Catalog

Bibliographic Details
Main Authors:	Dordevic, Danilo; Kumar, Suryansh
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Information Retrieval; Machine Learning
Online Access:	https://arxiv.org/abs/2409.01082

Similar Items

Revisiting Uncertainty: On Evidential Learning for Partially Relevant Video Retrieval
by: Li, Jun, et al.
Published: (2026)

FOR: Finetuning for Object Level Open Vocabulary Image Retrieval
by: Levi, Hila, et al.
Published: (2024)

Object-Aware Query Perturbation for Cross-Modal Image-Text Retrieval
by: Sogi, Naoya, et al.
Published: (2024)

Hierarchy-of-Visual-Words: a Learning-based Approach for Trademark Image Retrieval
by: Lourenço, Vítor N., et al.
Published: (2019)

Candidate Set Re-ranking for Composed Image Retrieval with Dual Multi-modal Encoder
by: Liu, Zheyuan, et al.
Published: (2023)

Multi-event Video-Text Retrieval
by: Zhang, Gengyuan, et al.
Published: (2023)

Embedding-based Retrieval in Multimodal Content Moderation
by: Liang, Hanzhong, et al.
Published: (2025)

Generalized Contrastive Learning for Multi-Modal Retrieval and Ranking
by: Zhu, Tianyu, et al.
Published: (2024)

NanoVDR: Distilling a 2B Vision-Language Retriever into a 70M Text-Only Encoder for Visual Document Retrieval
by: Liu, Zhuchenyang, et al.
Published: (2026)

DREAM: Improving Video-Text Retrieval Through Relevance-Based Augmentation Using Large Foundation Models
by: Wang, Yimu, et al.
Published: (2024)

Online Learning via Memory: Retrieval-Augmented Detector Adaptation
by: Jian, Yanan, et al.
Published: (2024)

Metric Compatible Training for Online Backfilling in Large-Scale Retrieval
by: Seo, Seonguk, et al.
Published: (2023)

Open Multimodal Retrieval-Augmented Factual Image Generation
by: Tian, Yang, et al.
Published: (2025)

CaReBench: A Fine-Grained Benchmark for Video Captioning and Retrieval
by: Xu, Yifan, et al.
Published: (2024)

Sync from the Sea: Retrieving Alignable Videos from Large-Scale Datasets
by: Dave, Ishan Rajendrakumar, et al.
Published: (2024)

Maximal Matching Matters: Preventing Representation Collapse for Robust Cross-Modal Retrieval
by: Alomari, Hani, et al.
Published: (2025)

LD-DETR: Loop Decoder DEtection TRansformer for Video Moment Retrieval and Highlight Detection
by: Zhao, Pengcheng, et al.
Published: (2025)

Dual Pose-invariant Embeddings: Learning Category and Object-specific Discriminative Representations for Recognition and Retrieval
by: Sarkar, Rohan, et al.
Published: (2024)

Re-ranking the Context for Multimodal Retrieval Augmented Generation
by: Mortaheb, Matin, et al.
Published: (2025)

RAG-Check: Evaluating Multimodal Retrieval Augmented Generation Performance
by: Mortaheb, Matin, et al.
Published: (2025)

Transfer Learning with Self-Supervised Vision Transformers for Snake Identification
by: Miyaguchi, Anthony, et al.
Published: (2024)

Multi-Label Plant Species Classification with Self-Supervised Vision Transformers
by: Gustineli, Murilo, et al.
Published: (2024)

Improving Visual Recommendation on E-commerce Platforms Using Vision-Language Models
by: Yada, Yuki, et al.
Published: (2025)

PICS: Pipeline for Image Captioning and Search
by: Rosario, Grant, et al.
Published: (2024)

Multi-Label Plant Species Prediction with Metadata-Enhanced Multi-Head Vision Transformers
by: Herasimchyk, Hanna, et al.
Published: (2025)

Image Fusion for Cross-Domain Sequential Recommendation
by: Wu, Wangyu, et al.
Published: (2024)

Image Outlier Detection Without Training using RANSAC
by: Tsai, Chen-Han, et al.
Published: (2023)

Sustainable transparency in Recommender Systems: Bayesian Ranking of Images for Explainability
by: Paz-Ruza, Jorge, et al.
Published: (2023)

Image Hashing via Cross-View Code Alignment in the Age of Foundation Models
by: Moummad, Ilyass, et al.
Published: (2025)

Towards Resource-Efficient Streaming of Large-Scale Medical Image Datasets for Deep Learning
by: Kulkarni, Pranav, et al.
Published: (2023)

Understanding the Performance Plateau in Text-to-Video Retrieval: A Comprehensive Empirical and Linguistic Analysis
by: Pegia, Maria-Eirini, et al.
Published: (2026)

Revisit Anything: Visual Place Recognition via Image Segment Retrieval
by: Garg, Kartik, et al.
Published: (2024)

Image Complexity-Aware Adaptive Retrieval for Efficient Vision-Language Models
by: Williams-Lekuona, Mikel, et al.
Published: (2025)

Siamese Content-based Search Engine for a More Transparent Skin and Breast Cancer Diagnosis through Histological Imaging
by: Tabatabaei, Zahra, et al.
Published: (2024)

CoopHash: Cooperative Learning of Multipurpose Descriptor and Contrastive Pair Generator via Variational MCMC Teaching for Supervised Image Hashing
by: Doan, Khoa D., et al.
Published: (2022)

Benchmarking Robustness of Contrastive Learning Models for Medical Image-Report Retrieval
by: Deanda, Demetrio, et al.
Published: (2025)

Reasoning-Augmented Representations for Multimodal Retrieval
by: Zhang, Jianrui, et al.
Published: (2026)

Multimedia-Aware Question Answering: A Review of Retrieval and Cross-Modal Reasoning Architectures
by: Raja, Rahul, et al.
Published: (2025)

FilterRAG: Zero-Shot Informed Retrieval-Augmented Generation to Mitigate Hallucinations in VQA
by: Sarwar, Nobin
Published: (2025)

TabRAG: Improving Tabular Document Question Answering for Retrieval Augmented Generation via Structured Representations
by: Si, Jacob, et al.
Published: (2025)