:: Library Catalog

Bibliographic Details
Main Authors:	Sankaradas, Murugan; Rajendran, Ravi K.; Chakradhar, Srimat T.
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2501.14101

Similar Items

Open-SAT: LLM-Guided Query Embedding Refinement for Open-Vocabulary Object Retrieval in Satellite Imagery
by: Arefeen, Md Adnan, et al.
Published: (2026)

Visual Alignment of Medical Vision-Language Models for Grounded Radiology Report Generation
by: Bose, Sarosij, et al.
Published: (2025)

RAG-Check: Evaluating Multimodal Retrieval Augmented Generation Performance
by: Mortaheb, Matin, et al.
Published: (2025)

iRAG: Advancing RAG for Videos with an Incremental Approach
by: Arefeen, Md Adnan, et al.
Published: (2024)

TrafficLens: Multi-Camera Traffic Video Analysis Using LLMs
by: Arefeen, Md Adnan, et al.
Published: (2025)

Re-ranking the Context for Multimodal Retrieval Augmented Generation
by: Mortaheb, Matin, et al.
Published: (2025)

Differentiable JPEG: The Devil is in the Details
by: Reich, Christoph, et al.
Published: (2023)

Deep Video Codec Control for Vision Models
by: Reich, Christoph, et al.
Published: (2023)

MoRAG -- Multi-Fusion Retrieval Augmented Generation for Human Motion
by: Kalakonda, Sai Shashank, et al.
Published: (2024)

AdaVideoRAG: Omni-Contextual Adaptive Retrieval-Augmented Efficient Long Video Understanding
by: Xue, Zhucun, et al.
Published: (2025)

RealRAG: Retrieval-augmented Realistic Image Generation via Self-reflective Contrastive Learning
by: Lyu, Yuanhuiyi, et al.
Published: (2025)

PROPEX-RAG: Enhanced GraphRAG using Prompt-Driven Prompt Execution
by: Sarnaik, Tejas, et al.
Published: (2025)

AR-RAG: Autoregressive Retrieval Augmentation for Image Generation
by: Qi, Jingyuan, et al.
Published: (2025)

MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation
by: Zhu, Chenhui, et al.
Published: (2025)

QualiRAG: Retrieval-Augmented Generation for Visual Quality Understanding
by: Cao, Linhan, et al.
Published: (2026)

Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length
by: Huang, Yubo, et al.
Published: (2025)

RAIN: Real-time Animation of Infinite Video Stream
by: Shu, Zhilei, et al.
Published: (2024)

MobileRAG: Enhancing Mobile Agent with Retrieval-Augmented Generation
by: Loo, Gowen, et al.
Published: (2025)

RegionRAG: Region-level Retrieval-Augmented Generation for Visual Document Understanding
by: Li, Yinglu, et al.
Published: (2025)

LayoutRAG: Retrieval-Augmented Model for Content-agnostic Conditional Layout Generation
by: Wu, Yuxuan, et al.
Published: (2025)

PartRAG: Retrieval-Augmented Part-Level 3D Generation and Editing
by: Li, Peize, et al.
Published: (2026)

OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
by: Zhang, Junyuan, et al.
Published: (2024)

Mesh RAG: Retrieval Augmentation for Autoregressive Mesh Generation
by: Sun, Xiatao, et al.
Published: (2025)

REAL-MM-RAG: A Real-World Multi-Modal Retrieval Benchmark
by: Wasserman, Navve, et al.
Published: (2025)

SoulX-FlashHead: Oracle-guided Generation of Infinite Real-time Streaming Talking Heads
by: Yu, Tan, et al.
Published: (2026)

SV-RAG: LoRA-Contextualizing Adaptation of MLLMs for Long Document Understanding
by: Chen, Jian, et al.
Published: (2024)

Transtreaming: Adaptive Delay-aware Transformer for Real-time Streaming Perception
by: Zhang, Xiang, et al.
Published: (2024)

LaB-RAG: Label Boosted Retrieval Augmented Generation for Radiology Report Generation
by: Song, Steven, et al.
Published: (2024)

ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation
by: Shalev-Arkushin, Rotem, et al.
Published: (2025)

LVLM-Aware Multimodal Retrieval for RAG-Based Medical Diagnosis with General-Purpose Models
by: Mazor, Nir, et al.
Published: (2025)

Event-Causal RAG: A Retrieval-Augmented Generation Framework for Long Video Reasoning in Complex Scenarios
by: Yan, Peizheng, et al.
Published: (2026)

SANA-Streaming: Real-time Streaming Video Editing with Hybrid Diffusion Transformer
by: Zhao, Yuyang, et al.
Published: (2026)

TV-RAG: A Temporal-aware and Semantic Entropy-Weighted Framework for Long Video Retrieval and Understanding
by: Cao, Zongsheng, et al.
Published: (2025)

Memory-efficient Streaming VideoLLMs for Real-time Procedural Video Understanding
by: Chatterjee, Dibyadip, et al.
Published: (2025)

Real-time Stereo-based 3D Object Detection for Streaming Perception
by: Li, Changcai, et al.
Published: (2024)

StreamingEffect: Real-Time Human-Centric Video Effect Generation
by: Song, Yiren, et al.
Published: (2026)

REST: Diffusion-based Real-time End-to-end Streaming Talking Head Generation via ID-Context Caching and Asynchronous Streaming Distillation
by: Wang, Haotian, et al.
Published: (2025)

NICO-RAG: Multimodal Hypergraph Retrieval-Augmented Generation for Understanding the Nicotine Public Health Crisis
by: Serna-Aguilera, Manuel, et al.
Published: (2026)

AstroRAG -- A Pagerank-Based Retrieval-Augmented Generation Pipeline for Question Answering in Astronomy
by: Wang, Zhifeng, et al.
Published: (2026)

ROMA: Real-time Omni-Multimodal Assistant with Interactive Streaming Understanding
by: Tian, Xueyun, et al.
Published: (2026)