:: Library Catalog

Bibliographic Details
Main Author:	Moran, Sean
Format:	Preprint
Published:	2025
Subjects:	Information Retrieval; Artificial Intelligence; Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2510.04127

Similar Items

MOON Embedding: Multimodal Representation Learning for E-commerce Search Advertising
by: Fu, Chenghan, et al.
Published: (2025)

Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines
by: Zhang, Zhixin, et al.
Published: (2024)

GENIUS: A Generative Framework for Universal Multimodal Search
by: Kim, Sungyeon, et al.
Published: (2025)

Retrieval-augmented Prompt Learning for Pre-trained Foundation Models
by: Chen, Xiang, et al.
Published: (2025)

LookSync: Large-Scale Visual Product Search System for AI-Generated Fashion Looks
by: M, Pradeep, et al.
Published: (2025)

Image Hashing via Cross-View Code Alignment in the Age of Foundation Models
by: Moummad, Ilyass, et al.
Published: (2025)

PromptHash: Affinity-Prompted Collaborative Cross-Modal Learning for Adaptive Hashing Retrieval
by: Zou, Qiang, et al.
Published: (2025)

TelcoAI: Advancing 3GPP Technical Specification Search through Agentic Multi-Modal Retrieval-Augmented Generation
by: Ghosh, Rahul, et al.
Published: (2025)

Closing the Modality Gap for Mixed Modality Search
by: Li, Binxu, et al.
Published: (2025)

CoopHash: Cooperative Learning of Multipurpose Descriptor and Contrastive Pair Generator via Variational MCMC Teaching for Supervised Image Hashing
by: Doan, Khoa D., et al.
Published: (2022)

Distribution-Consistency-Guided Multi-modal Hashing
by: Liu, Jin-Yu, et al.
Published: (2024)

Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey
by: Chen, Zhuo, et al.
Published: (2024)

MOON: Generative MLLM-based Multimodal Representation Learning for E-commerce Product Understanding
by: Zhang, Daoze, et al.
Published: (2025)

MOON3.0: Reasoning-aware Multimodal Representation Learning for E-commerce Product Understanding
by: Wu, Junxian, et al.
Published: (2026)

Data-Centric Approach to Constrained Machine Learning: A Case Study on Conway's Game of Life
by: Bibin, Anton, et al.
Published: (2024)

MOON2.0: Dynamic Modality-balanced Multimodal Representation Learning for E-commerce Product Understanding
by: Nie, Zhanheng, et al.
Published: (2025)

PC$^2$: Pseudo-Classification Based Pseudo-Captioning for Noisy Correspondence Learning in Cross-Modal Retrieval
by: Duan, Yue, et al.
Published: (2024)

Breaking Through the Haze: An Advanced Non-Homogeneous Dehazing Method based on Fast Fourier Convolution and ConvNeXt
by: Zhou, Han, et al.
Published: (2023)

Attributes Grouping and Mining Hashing for Fine-Grained Image Retrieval
by: Lu, Xin, et al.
Published: (2023)

ChordFormer: A Conformer-Based Architecture for Large-Vocabulary Audio Chord Recognition
by: Akram, Muhammad Waseem, et al.
Published: (2025)

A Signer-Invariant Conformer and Multi-Scale Fusion Transformer for Continuous Sign Language Recognition
by: Haque, Md Rezwanul, et al.
Published: (2025)

Multimodal RAG Enhanced Visual Description
by: Jaiswal, Amit Kumar, et al.
Published: (2025)

Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
by: Wen, Tiansheng, et al.
Published: (2025)

Open Multimodal Retrieval-Augmented Factual Image Generation
by: Tian, Yang, et al.
Published: (2025)

Efficient Table QA via TableGrid Navigation and Progressive Inference Prompting
by: Maurya, Amritansh, et al.
Published: (2026)

Attribute-Guided Multi-Level Attention Network for Fine-Grained Fashion Retrieval
by: Xiao, Ling, et al.
Published: (2022)

FeatUp: A Model-Agnostic Framework for Features at Any Resolution
by: Fu, Stephanie, et al.
Published: (2024)

Reasoning-Augmented Representations for Multimodal Retrieval
by: Zhang, Jianrui, et al.
Published: (2026)

Sustainable techniques to improve Data Quality for training image-based explanatory models for Recommender Systems
by: Paz-Ruza, Jorge, et al.
Published: (2024)

Character-based Outfit Generation with Vision-augmented Style Extraction via LLMs
by: Forouzandehmehr, Najmeh, et al.
Published: (2024)

CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features
by: Li, Po-han, et al.
Published: (2024)

Incremental Concept Formation over Visual Images Without Catastrophic Forgetting
by: Barari, Nicki, et al.
Published: (2024)

Semantic-Cohesive Knowledge Distillation for Deep Cross-modal Hashing
by: Sun, Changchang, et al.
Published: (2025)

Foundation Models and Information Retrieval in Digital Pathology
by: Tizhoosh, H. R.
Published: (2024)

Evaluation of Embedding-Based and Generative Methods for LLM-Driven Document Classification: Opportunities and Challenges
by: Lu, Rong, et al.
Published: (2026)

Benchmarking Robustness of Contrastive Learning Models for Medical Image-Report Retrieval
by: Deanda, Demetrio, et al.
Published: (2025)

Potential Field Based Deep Metric Learning
by: Bhatnagar, Shubhang, et al.
Published: (2024)

Image Complexity-Aware Adaptive Retrieval for Efficient Vision-Language Models
by: Williams-Lekuona, Mikel, et al.
Published: (2025)

Infinite Video Understanding
by: Zhang, Dell, et al.
Published: (2025)

Counteracting temporal attacks in Video Copy Detection
by: Fojcik, Katarzyna, et al.
Published: (2025)