:: Library Catalog

Bibliographic Details
Main Authors:	Bhattacharyya, Aniket; Tripathi, Anurag; Das, Ujjal; Karmakar, Archan; Pathak, Amit; Gupta, Maneesh
Format:	Preprint
Published:	2025
Subjects:	Information Retrieval; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2505.13535

Similar Items

Information Extraction from Heterogeneous Documents without Ground Truth Labels using Synthetic Label Generation and Knowledge Distillation
by: Bhattacharyya, Aniket, et al.
Published: (2024)

Evidence Units: Ontology-Grounded Document Organization for Parser-Independent Retrieval
by: Han, Yeonjee
Published: (2026)

Digitization of Document and Information Extraction using OCR
by: Sinha, Rasha, et al.
Published: (2025)

Cross-Modal Entity Matching for Visually Rich Documents
by: Sarkhel, Ritesh, et al.
Published: (2023)

HKRAG: Holistic Knowledge Retrieval-Augmented Generation Over Visually-Rich Documents
by: Tong, Anyang, et al.
Published: (2025)

Roles of MLLMs in Visually Rich Document Retrieval for RAG: A Survey
by: Zhang, Xiantao
Published: (2025)

A Multi-Granularity Retrieval Framework for Visually-Rich Documents
by: Xu, Mingjun, et al.
Published: (2025)

Information Extraction From Fiscal Documents Using LLMs
by: Aggarwal, Vikram, et al.
Published: (2025)

Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding
by: Tripathi, Vishesh, et al.
Published: (2025)

MedNuggetizer: Confidence-Based Information Nugget Extraction from Medical Documents
by: Donabauer, Gregor, et al.
Published: (2025)

Rethinking Detection Based Table Structure Recognition for Visually Rich Document Images
by: Xiao, Bin, et al.
Published: (2023)

C2T-ID: Converting Semantic Codebooks to Textual Document Identifiers for Generative Search
by: Zhang, Yingchen, et al.
Published: (2025)

DocGraphLM: Documental Graph Language Model for Information Extraction
by: Wang, Dongsheng, et al.
Published: (2024)

Multilingual Information Retrieval with a Monolingual Knowledge Base
by: Zhuang, Yingying, et al.
Published: (2025)

FS-DAG: Few Shot Domain Adapting Graph Networks for Visually Rich Document Understanding
by: Agarwal, Amit, et al.
Published: (2025)

Robustness of Structured Data Extraction from In-plane Rotated Documents using Multi-Modal Large Language Models (LLM)
by: Biswas, Anjanava, et al.
Published: (2024)

Indian Libraries: Documentation and Automation in Library Services.
by: Das Gupta, Krishna
Published: (1979)

MHier-RAG: Multi-Modal RAG for Visual-Rich Document Question-Answering via Hierarchical and Multi-Granularity Reasoning
by: Gong, Ziyu, et al.
Published: (2025)

Knowledge-Driven Cross-Document Relation Extraction
by: Jain, Monika, et al.
Published: (2024)

VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents
by: Tanaka, Ryota, et al.
Published: (2025)

REALM: Recursive Relevance Modeling for LLM-based Document Re-Ranking
by: Wang, Pinhuan, et al.
Published: (2025)

Evaluating LLMs on Document-Based QA: Exact Answer Selection and Numerical Extraction using Cogtale dataset
by: Rasool, Zafaryab, et al.
Published: (2023)

Document Organization Using Kohonen's Algorithm.
by: Guerrero Bote, Vicente P., et al.
Published: (2002)

Revisiting Document-Level Relation Extraction with Context-Guided Link Prediction
by: Jain, Monika, et al.
Published: (2024)

Hybrid-Vector Retrieval for Visually Rich Documents: Combining Single-Vector Efficiency and Multi-Vector Accuracy
by: Kim, Juyeon, et al.
Published: (2025)

Towards Human-Like Machine Comprehension: Few-Shot Relational Learning in Visually-Rich Documents
by: Wang, Hao, et al.
Published: (2024)

Hypothetical Documents or Knowledge Leakage? Rethinking LLM-based Query Expansion
by: Yoon, Yejun, et al.
Published: (2025)

ModernVBERT: Towards Smaller Visual Document Retrievers
by: Teiletche, Paul, et al.
Published: (2025)

Application Of Large Language Models For The Extraction Of Information From Particle Accelerator Technical Documentation
by: Dai, Qing, et al.
Published: (2025)

Passage Segmentation of Documents for Extractive Question Answering
by: Liu, Zuhong, et al.
Published: (2025)

Automatic Extraction of Document Keyphrases for Use in Digital Libraries: Evaluation and Applications.
by: Jones, Steve, et al.
Published: (2002)

Multi-Relation Extraction in Entity Pairs using Global Context
by: Nilesh, et al.
Published: (2025)

Multi-Stage Field Extraction of Financial Documents with OCR and Compact Vision-Language Models
by: Jin, Yichao, et al.
Published: (2025)

Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration
by: Dai, Sunhao, et al.
Published: (2024)

MST-R: Multi-Stage Tuning for Retrieval Systems and Metric Evaluation
by: Malviya, Yash, et al.
Published: (2024)

Clinical Document Metadata Extraction: A Scoping Review
by: Miller, Kurt, et al.
Published: (2025)

Chemical Reaction Extraction from Long Patent Documents
by: Jadhav, Aishwarya, et al.
Published: (2024)

Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use
by: Cesista, Franz Louis, et al.
Published: (2024)

Reproducibility, Replicability, and Insights into Visual Document Retrieval with Late Interaction
by: Qiao, Jingfen, et al.
Published: (2025)

Contextual Relevance and Adaptive Sampling for LLM-Based Document Reranking
by: Huang, Jerry, et al.
Published: (2025)