| Main Authors: | Sankaradas, Murugan, Rajendran, Ravi K., Chakradhar, Srimat T. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.14101 |
Similar Items
Open-SAT: LLM-Guided Query Embedding Refinement for Open-Vocabulary Object Retrieval in Satellite Imagery
by: Arefeen, Md Adnan, et al.
Published: (2026)
Visual Alignment of Medical Vision-Language Models for Grounded Radiology Report Generation
by: Bose, Sarosij, et al.
Published: (2025)
RAG-Check: Evaluating Multimodal Retrieval Augmented Generation Performance
by: Mortaheb, Matin, et al.
Published: (2025)
iRAG: Advancing RAG for Videos with an Incremental Approach
by: Arefeen, Md Adnan, et al.
Published: (2024)
TrafficLens: Multi-Camera Traffic Video Analysis Using LLMs
by: Arefeen, Md Adnan, et al.
Published: (2025)
Re-ranking the Context for Multimodal Retrieval Augmented Generation
by: Mortaheb, Matin, et al.
Published: (2025)
Differentiable JPEG: The Devil is in the Details
by: Reich, Christoph, et al.
Published: (2023)
Deep Video Codec Control for Vision Models
by: Reich, Christoph, et al.
Published: (2023)
MoRAG -- Multi-Fusion Retrieval Augmented Generation for Human Motion
by: Kalakonda, Sai Shashank, et al.
Published: (2024)
AdaVideoRAG: Omni-Contextual Adaptive Retrieval-Augmented Efficient Long Video Understanding
by: Xue, Zhucun, et al.
Published: (2025)
RealRAG: Retrieval-augmented Realistic Image Generation via Self-reflective Contrastive Learning
by: Lyu, Yuanhuiyi, et al.
Published: (2025)
PROPEX-RAG: Enhanced GraphRAG using Prompt-Driven Prompt Execution
by: Sarnaik, Tejas, et al.
Published: (2025)
AR-RAG: Autoregressive Retrieval Augmentation for Image Generation
by: Qi, Jingyuan, et al.
Published: (2025)
MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation
by: Zhu, Chenhui, et al.
Published: (2025)
QualiRAG: Retrieval-Augmented Generation for Visual Quality Understanding
by: Cao, Linhan, et al.
Published: (2026)
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length
by: Huang, Yubo, et al.
Published: (2025)
RAIN: Real-time Animation of Infinite Video Stream
by: Shu, Zhilei, et al.
Published: (2024)
MobileRAG: Enhancing Mobile Agent with Retrieval-Augmented Generation
by: Loo, Gowen, et al.
Published: (2025)
RegionRAG: Region-level Retrieval-Augmented Generation for Visual Document Understanding
by: Li, Yinglu, et al.
Published: (2025)
LayoutRAG: Retrieval-Augmented Model for Content-agnostic Conditional Layout Generation
by: Wu, Yuxuan, et al.
Published: (2025)
PartRAG: Retrieval-Augmented Part-Level 3D Generation and Editing
by: Li, Peize, et al.
Published: (2026)
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
by: Zhang, Junyuan, et al.
Published: (2024)
Mesh RAG: Retrieval Augmentation for Autoregressive Mesh Generation
by: Sun, Xiatao, et al.
Published: (2025)
REAL-MM-RAG: A Real-World Multi-Modal Retrieval Benchmark
by: Wasserman, Navve, et al.
Published: (2025)
SoulX-FlashHead: Oracle-guided Generation of Infinite Real-time Streaming Talking Heads
by: Yu, Tan, et al.
Published: (2026)
SV-RAG: LoRA-Contextualizing Adaptation of MLLMs for Long Document Understanding
by: Chen, Jian, et al.
Published: (2024)
Transtreaming: Adaptive Delay-aware Transformer for Real-time Streaming Perception
by: Zhang, Xiang, et al.
Published: (2024)
LaB-RAG: Label Boosted Retrieval Augmented Generation for Radiology Report Generation
by: Song, Steven, et al.
Published: (2024)
ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation
by: Shalev-Arkushin, Rotem, et al.
Published: (2025)
LVLM-Aware Multimodal Retrieval for RAG-Based Medical Diagnosis with General-Purpose Models
by: Mazor, Nir, et al.
Published: (2025)
Event-Causal RAG: A Retrieval-Augmented Generation Framework for Long Video Reasoning in Complex Scenarios
by: Yan, Peizheng, et al.
Published: (2026)
SANA-Streaming: Real-time Streaming Video Editing with Hybrid Diffusion Transformer
by: Zhao, Yuyang, et al.
Published: (2026)
TV-RAG: A Temporal-aware and Semantic Entropy-Weighted Framework for Long Video Retrieval and Understanding
by: Cao, Zongsheng, et al.
Published: (2025)
Memory-efficient Streaming VideoLLMs for Real-time Procedural Video Understanding
by: Chatterjee, Dibyadip, et al.
Published: (2025)
Real-time Stereo-based 3D Object Detection for Streaming Perception
by: Li, Changcai, et al.
Published: (2024)
StreamingEffect: Real-Time Human-Centric Video Effect Generation
by: Song, Yiren, et al.
Published: (2026)
REST: Diffusion-based Real-time End-to-end Streaming Talking Head Generation via ID-Context Caching and Asynchronous Streaming Distillation
by: Wang, Haotian, et al.
Published: (2025)
NICO-RAG: Multimodal Hypergraph Retrieval-Augmented Generation for Understanding the Nicotine Public Health Crisis
by: Serna-Aguilera, Manuel, et al.
Published: (2026)
AstroRAG -- A Pagerank-Based Retrieval-Augmented Generation Pipeline for Question Answering in Astronomy
by: Wang, Zhifeng, et al.
Published: (2026)
ROMA: Real-time Omni-Multimodal Assistant with Interactive Streaming Understanding
by: Tian, Xueyun, et al.
Published: (2026)