Library Catalog

Bibliographic Details
Main Authors:	Rizk, Basem; Walsh, Joel; Core, Mark; Nye, Benjamin
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Computation and Language; Information Retrieval
Online Access:	https://arxiv.org/abs/2510.01513

Similar Items

Indexing Multimodal Language Models for Large-scale Image Retrieval
by: Tharwat, Bahey, et al.
Published: (2026)

VLM-KG: Multimodal Radiology Knowledge Graph Generation
by: Abdullah, Abdullah, et al.
Published: (2025)

CLaMR: Contextualized Late-Interaction for Multimodal Content Retrieval
by: Wan, David, et al.
Published: (2025)

DSRAG: A Domain-Specific Retrieval Framework Based on Document-derived Multimodal Knowledge Graph
by: Yang, Mengzheng, et al.
Published: (2025)

MegaRAG: Multimodal Knowledge Graph-Based Retrieval Augmented Generation
by: Hsiao, Chi-Hsiang, et al.
Published: (2025)

Multi-Vector Index Compression in Any Modality
by: Qin, Hanxiang, et al.
Published: (2026)

SemRAG: Semantic Knowledge-Augmented RAG for Improved Question-Answering
by: Zhong, Kezhen, et al.
Published: (2025)

GeoOutageKG: A Multimodal Geospatiotemporal Knowledge Graph for Multiresolution Power Outage Analysis
by: Frakes, Ethan, et al.
Published: (2025)

MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory
by: Guo, Minghao, et al.
Published: (2026)

Few-Shot Prompting for Extractive Quranic QA with Instruction-Tuned LLMs
by: Basem, Mohamed, et al.
Published: (2025)

Cross-Language Approach for Quranic QA
by: Oshallah, Islam, et al.
Published: (2025)

Seeing Through the MiRAGE: Evaluating Multimodal Retrieval Augmented Generation
by: Martin, Alexander, et al.
Published: (2025)

VideoAgent: Long-form Video Understanding with Large Language Model as Agent
by: Wang, Xiaohan, et al.
Published: (2024)

An Index-based Approach for Efficient and Effective Web Content Extraction
by: Chen, Yihan, et al.
Published: (2025)

The Structure-Content Trade-off in Knowledge Graph Retrieval
by: Six, Valentin, et al.
Published: (2025)

Improving Content Recommendation: Knowledge Graph-Based Semantic Contrastive Learning for Diversity and Cold-Start Users
by: Kim, Yejin, et al.
Published: (2024)

Optimized Quran Passage Retrieval Using an Expanded QA Dataset and Fine-Tuned Language Models
by: Basem, Mohamed, et al.
Published: (2024)

Two-Stage Quranic QA via Ensemble Retrieval and Instruction-Tuned Answer Extraction
by: Basem, Mohamed, et al.
Published: (2025)

Self Knowledge Re-expression: A Fully Local Method for Adapting LLMs to Tasks Using Intrinsic Knowledge
by: Wang, Mengyu, et al.
Published: (2026)

Index Light, Reason Deep: Deferred Visual Ingestion for Visual-Dense Document Question Answering
by: Xu, Tao
Published: (2026)

Using Knowledge Graphs to harvest datasets for efficient CLIP model training
by: Ging, Simon, et al.
Published: (2025)

From Verbatim to Gist: Distilling Pyramidal Multimodal Memory via Semantic Information Bottleneck for Long-Horizon Video Agents
by: Lian, Niu, et al.
Published: (2026)

KGMEL: Knowledge Graph-Enhanced Multimodal Entity Linking
by: Kim, Juyeon, et al.
Published: (2025)

Windsock is Dancing: Adaptive Multimodal Retrieval-Augmented Generation
by: Zhao, Shu, et al.
Published: (2025)

X-Reflect: Cross-Reflection Prompting for Multimodal Recommendation
by: Lyu, Hanjia, et al.
Published: (2024)

A Survey of Knowledge Graph Reasoning on Graph Types: Static, Dynamic, and Multimodal
by: Liang, Ke, et al.
Published: (2022)

Ontology-Based Knowledge Graph Framework for Industrial Standard Documents via Hierarchical and Propositional Structuring
by: Park, Jiin, et al.
Published: (2025)

InterLV-Search: Benchmarking Interleaved Multimodal Agentic Search
by: Hou, Bohan, et al.
Published: (2026)

Benchmarking Retrieval-Augmented Multimodal Generation for Document Question Answering
by: Dong, Kuicai, et al.
Published: (2025)

E5-V: Universal Embeddings with Multimodal Large Language Models
by: Jiang, Ting, et al.
Published: (2024)

NativE: Multi-modal Knowledge Graph Completion in the Wild
by: Zhang, Yichi, et al.
Published: (2024)

ConceptFormer: Towards Efficient Use of Knowledge-Graph Embeddings in Large Language Models
by: Barmettler, Joel, et al.
Published: (2025)

Supervised Fine-Tuning or Contrastive Learning? Towards Better Multimodal LLM Reranking
by: Dai, Ziqi, et al.
Published: (2025)

MetaEmbed: Scaling Multimodal Retrieval at Test-Time with Flexible Late Interaction
by: Xiao, Zilin, et al.
Published: (2025)

Understanding Parametric Knowledge Injection in Retrieval-Augmented Generation
by: Tang, Minghao, et al.
Published: (2025)

Do Recommender Systems Really Leverage Multimodal Content? A Comprehensive Analysis on Multimodal Representations for Recommendation
by: Pomo, Claudio, et al.
Published: (2025)

CollEX -- A Multimodal Agentic RAG System Enabling Interactive Exploration of Scientific Collections
by: Schneider, Florian, et al.
Published: (2025)

WikiSeeker: Rethinking the Role of Vision-Language Models in Knowledge-Based Visual Question Answering
by: Zhu, Yingjian, et al.
Published: (2026)

Think When Needed: Adaptive Reasoning-Driven Multimodal Embeddings with a Dual-LoRA Architecture
by: Zhang, Longxiang, et al.
Published: (2026)

Structurally Refined Graph Transformer for Multimodal Recommendation
by: Shi, Ke, et al.
Published: (2025)