| Main Authors: | Ma, Zi-Ao, Lan, Tian, Tu, Rong-Cheng, Hu, Yong, Zhu, Yu-Shi, Zhang, Tong, Huang, Heyan, Wu, Zhijing, Mao, Xian-Ling |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.16365 |
Similar Items
Automatic Evaluation for Text-to-image Generation: Task-decomposed Framework, Distilled Training, and Meta-evaluation Benchmark
by: Tu, Rong-Cheng, et al.
Published: (2024)
T2I-Eval-R1: Reinforcement Learning-Driven Reasoning for Interpretable Text-to-Image Evaluation
by: Ma, Zi-Ao, et al.
Published: (2025)
Distribution-Consistency-Guided Multi-modal Hashing
by: Liu, Jin-Yu, et al.
Published: (2024)
A Survey of Automatic Evaluation Methods on Text, Visual and Speech Generations
by: Lan, Tian, et al.
Published: (2025)
SEOE: A Scalable and Reliable Semantic Evaluation Framework for Open Domain Event Detection
by: Lu, Yi-Fan, et al.
Published: (2025)
DeepSurvey-Bench: Evaluating Academic Value of Automatically Generated Scientific Survey
by: Zhang, Guo-Biao, et al.
Published: (2026)
mRAG: Elucidating the Design Space of Multi-modal Retrieval-Augmented Generation
by: Hu, Chan-Wei, et al.
Published: (2025)
Multi-modal Data Spectrum: Multi-modal Datasets are Multi-dimensional
by: Madaan, Divyam, et al.
Published: (2025)
FlashBack: Efficient Retrieval-Augmented Language Modeling for Long Context Inference
by: Liu, Runheng, et al.
Published: (2024)
Identity-Decoupled Anonymization for Visual Evidence in Multi-modal Retrieval-Augmented Generation
by: Cheng, Zehua, et al.
Published: (2026)
CLIP Multi-modal Hashing for Multimedia Retrieval
by: Zhu, Jian, et al.
Published: (2024)
Training Language Models to Critique With Multi-agent Feedback
by: Lan, Tian, et al.
Published: (2024)
Fine-grained Action Analysis: A Multi-modality and Multi-task Dataset of Figure Skating
by: Liu, Sheng-Lan, et al.
Published: (2023)
MASS-RAG: Multi-Agent Synthesis Retrieval-Augmented Generation
by: Xiao, Xingchen, et al.
Published: (2026)
RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis
by: Li, Haolin, et al.
Published: (2025)
Natural Language-Assisted Multi-modal Medication Recommendation
by: Tan, Jie, et al.
Published: (2025)
MM-StanceDet: Retrieval-Augmented Multi-modal Multi-agent Stance Detection
by: Lu, Weihai, et al.
Published: (2026)
A Distributed Collaborative Retrieval Framework Excelling in All Queries and Corpora based on Zero-shot Rank-Oriented Automatic Evaluation
by: Che, Tian-Yi, et al.
Published: (2024)
Beyond Text: Unveiling Privacy Vulnerabilities in Multi-modal Retrieval-Augmented Generation
by: Zhang, Jiankun, et al.
Published: (2025)
AsthmaBot: Multi-modal, Multi-Lingual Retrieval Augmented Generation For Asthma Patient Support
by: Bahaj, Adil, et al.
Published: (2024)
Beyond Exact Match: Semantically Reassessing Event Extraction by Large Language Models
by: Lu, Yi-Fan, et al.
Published: (2024)
Composed Multi-modal Retrieval: A Survey of Approaches and Applications
by: Zhang, Kun, et al.
Published: (2025)
Exploring the Potential of Multi-modal Sensing Framework for Forest Ecology
by: Romanello, Luca, et al.
Published: (2024)
Oracle Bone Inscriptions Multi-modal Dataset
by: Li, Bang, et al.
Published: (2024)
Learning Adaptive Fusion Bank for Multi-modal Salient Object Detection
by: Wang, Kunpeng, et al.
Published: (2024)
CriticEval: Evaluating Large Language Model as Critic
by: Lan, Tian, et al.
Published: (2024)
An Interactive Multi-modal Query Answering System with Retrieval-Augmented Large Language Models
by: Wang, Mengzhao, et al.
Published: (2024)
Social Debiasing for Fair Multi-modal LLMs
by: Cheng, Harry, et al.
Published: (2024)
Personalizing Causal Audio-Driven Facial Motion via Dynamic Multi-modal Retrieval
by: Chu, Xuangeng, et al.
Published: (2026)
Multi-modal Reference Learning for Fine-grained Text-to-Image Retrieval
by: Ma, Zehong, et al.
Published: (2025)
Visual Grounding with Multi-modal Conditional Adaptation
by: Yao, Ruilin, et al.
Published: (2024)
Yambda-5B -- A Large-Scale Multi-modal Dataset for Ranking And Retrieval
by: Ploshkin, A., et al.
Published: (2025)
SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval
by: Wu, Siwei, et al.
Published: (2024)
UNIAA: A Unified Multi-modal Image Aesthetic Assessment Baseline and Benchmark
by: Zhou, Zhaokun, et al.
Published: (2024)
Mix-Initiative Response Generation with Dynamic Prefix Tuning
by: Nie, Yuxiang, et al.
Published: (2024)
Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation
by: Zhou, Zhongliang, et al.
Published: (2024)
Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing
by: Liu, Shengchao, et al.
Published: (2022)
Enhancing Incomplete Multi-modal Brain Tumor Segmentation with Intra-modal Asymmetry and Inter-modal Dependency
by: Liu, Weide, et al.
Published: (2024)
Exploration of Augmentation Strategies in Multi-modal Retrieval-Augmented Generation for the Biomedical Domain: A Case Study Evaluating Question Answering in Glycobiology
by: Kocbek, Primož, et al.
Published: (2025)
MuMA-ToM: Multi-modal Multi-Agent Theory of Mind
by: Shi, Haojun, et al.
Published: (2024)