:: Library Catalog

Bibliographic Details
Main Authors:	Jian, Yanan; Yu, Fuxun; Zhang, Qi; Levine, William; Dubbs, Brandon; Karianakis, Nikolaos
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Information Retrieval; Machine Learning
Online Access:	https://arxiv.org/abs/2409.10716

Similar Items

Reasoning-Augmented Representations for Multimodal Retrieval
by: Zhang, Jianrui, et al.
Published: (2026)

Re-ranking the Context for Multimodal Retrieval Augmented Generation
by: Mortaheb, Matin, et al.
Published: (2025)

Metric Compatible Training for Online Backfilling in Large-Scale Retrieval
by: Seo, Seonguk, et al.
Published: (2023)

RAG-Check: Evaluating Multimodal Retrieval Augmented Generation Performance
by: Mortaheb, Matin, et al.
Published: (2025)

DREAM: Improving Video-Text Retrieval Through Relevance-Based Augmentation Using Large Foundation Models
by: Wang, Yimu, et al.
Published: (2024)

RAGAR: Retrieval Augmented Personalized Image Generation Guided by Recommendation
by: Ling, Run, et al.
Published: (2025)

Generalized Contrastive Learning for Multi-Modal Retrieval and Ranking
by: Zhu, Tianyu, et al.
Published: (2024)

Open Multimodal Retrieval-Augmented Factual Image Generation
by: Tian, Yang, et al.
Published: (2025)

NanoVDR: Distilling a 2B Vision-Language Retriever into a 70M Text-Only Encoder for Visual Document Retrieval
by: Liu, Zhuchenyang, et al.
Published: (2026)

Retrieval-Augmented Visual Question Answering via Built-in Autoregressive Search Engines
by: Long, Xinwei, et al.
Published: (2025)

Hierarchy-of-Visual-Words: a Learning-based Approach for Trademark Image Retrieval
by: Lourenço, Vítor N., et al.
Published: (2019)

RANa: Retrieval-Augmented Navigation
by: Monaci, Gianluca, et al.
Published: (2025)

Dual Pose-invariant Embeddings: Learning Category and Object-specific Discriminative Representations for Recognition and Retrieval
by: Sarkar, Rohan, et al.
Published: (2024)

UCDR-Adapter: Exploring Adaptation of Pre-Trained Vision-Language Models for Universal Cross-Domain Retrieval
by: Jiang, Haoyu, et al.
Published: (2024)

Modality and Task Adaptation for Enhanced Zero-shot Composed Image Retrieval
by: Li, Haiwen, et al.
Published: (2024)

FilterRAG: Zero-Shot Informed Retrieval-Augmented Generation to Mitigate Hallucinations in VQA
by: Sarwar, Nobin
Published: (2025)

Evidential Transformers for Improved Image Retrieval
by: Dordevic, Danilo, et al.
Published: (2024)

Multi-event Video-Text Retrieval
by: Zhang, Gengyuan, et al.
Published: (2023)

Revisiting Uncertainty: On Evidential Learning for Partially Relevant Video Retrieval
by: Li, Jun, et al.
Published: (2026)

RAVEN: Multitask Retrieval Augmented Vision-Language Learning
by: Rao, Varun Nagaraj, et al.
Published: (2024)

Embedding-based Retrieval in Multimodal Content Moderation
by: Liang, Hanzhong, et al.
Published: (2025)

EndoFinder: Online Image Retrieval for Explainable Colorectal Polyp Diagnosis
by: Yang, Ruijie, et al.
Published: (2024)

MARQUIS: A Three-Stage Pipeline for Video Retrieval-Augmented Generation
by: Chakraborty, Debashish, et al.
Published: (2026)

Utility-Oriented Visual Evidence Selection for Multimodal Retrieval-Augmented Generation
by: Luo, Weiqing, et al.
Published: (2026)

FOR: Finetuning for Object Level Open Vocabulary Image Retrieval
by: Levi, Hila, et al.
Published: (2024)

Identity-Decoupled Anonymization for Visual Evidence in Multi-modal Retrieval-Augmented Generation
by: Cheng, Zehua, et al.
Published: (2026)

BRIDGE: Multimodal-to-Text Retrieval via Reinforcement-Learned Query Alignment
by: Mounis, Mohamed Darwish, et al.
Published: (2026)

Object-Aware Query Perturbation for Cross-Modal Image-Text Retrieval
by: Sogi, Naoya, et al.
Published: (2024)

AutothinkRAG: Complexity-Aware Control of Retrieval-Augmented Reasoning for Image-Text Interaction
by: Yang, Jiashu, et al.
Published: (2026)

Active Learning via Classifier Impact and Greedy Selection for Interactive Image Retrieval
by: Bar, Leah, et al.
Published: (2024)

CaReBench: A Fine-Grained Benchmark for Video Captioning and Retrieval
by: Xu, Yifan, et al.
Published: (2024)

Maximal Matching Matters: Preventing Representation Collapse for Robust Cross-Modal Retrieval
by: Alomari, Hani, et al.
Published: (2025)

Sync from the Sea: Retrieving Alignable Videos from Large-Scale Datasets
by: Dave, Ishan Rajendrakumar, et al.
Published: (2024)

CaLa: Complementary Association Learning for Augmenting Composed Image Retrieval
by: Jiang, Xintong, et al.
Published: (2024)

Seeing Through the MiRAGE: Evaluating Multimodal Retrieval Augmented Generation
by: Martin, Alexander, et al.
Published: (2025)

LD-DETR: Loop Decoder DEtection TRansformer for Video Moment Retrieval and Highlight Detection
by: Zhao, Pengcheng, et al.
Published: (2025)

Candidate Set Re-ranking for Composed Image Retrieval with Dual Multi-modal Encoder
by: Liu, Zheyuan, et al.
Published: (2023)

TabRAG: Improving Tabular Document Question Answering for Retrieval Augmented Generation via Structured Representations
by: Si, Jacob, et al.
Published: (2025)

PC$^2$: Pseudo-Classification Based Pseudo-Captioning for Noisy Correspondence Learning in Cross-Modal Retrieval
by: Duan, Yue, et al.
Published: (2024)

VideoRAG: Retrieval-Augmented Generation over Video Corpus
by: Jeong, Soyeong, et al.
Published: (2025)