| Main Authors: | Bhattacharyya, Aniket; Tripathi, Anurag; Das, Ujjal; Karmakar, Archan; Pathak, Amit; Gupta, Maneesh |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.13535 |
Similar Items
Information Extraction from Heterogeneous Documents without Ground Truth Labels using Synthetic Label Generation and Knowledge Distillation
by: Bhattacharyya, Aniket, et al.
Published: (2024)
Evidence Units: Ontology-Grounded Document Organization for Parser-Independent Retrieval
by: Han, Yeonjee
Published: (2026)
Digitization of Document and Information Extraction using OCR
by: Sinha, Rasha, et al.
Published: (2025)
Cross-Modal Entity Matching for Visually Rich Documents
by: Sarkhel, Ritesh, et al.
Published: (2023)
HKRAG: Holistic Knowledge Retrieval-Augmented Generation Over Visually-Rich Documents
by: Tong, Anyang, et al.
Published: (2025)
Roles of MLLMs in Visually Rich Document Retrieval for RAG: A Survey
by: Zhang, Xiantao
Published: (2025)
A Multi-Granularity Retrieval Framework for Visually-Rich Documents
by: Xu, Mingjun, et al.
Published: (2025)
Information Extraction From Fiscal Documents Using LLMs
by: Aggarwal, Vikram, et al.
Published: (2025)
Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding
by: Tripathi, Vishesh, et al.
Published: (2025)
MedNuggetizer: Confidence-Based Information Nugget Extraction from Medical Documents
by: Donabauer, Gregor, et al.
Published: (2025)
Rethinking Detection Based Table Structure Recognition for Visually Rich Document Images
by: Xiao, Bin, et al.
Published: (2023)
C2T-ID: Converting Semantic Codebooks to Textual Document Identifiers for Generative Search
by: Zhang, Yingchen, et al.
Published: (2025)
DocGraphLM: Documental Graph Language Model for Information Extraction
by: Wang, Dongsheng, et al.
Published: (2024)
Multilingual Information Retrieval with a Monolingual Knowledge Base
by: Zhuang, Yingying, et al.
Published: (2025)
FS-DAG: Few Shot Domain Adapting Graph Networks for Visually Rich Document Understanding
by: Agarwal, Amit, et al.
Published: (2025)
Robustness of Structured Data Extraction from In-plane Rotated Documents using Multi-Modal Large Language Models (LLM)
by: Biswas, Anjanava, et al.
Published: (2024)
Indian Libraries: Documentation and Automation in Library Services.
by: Das Gupta, Krishna
Published: (1979)
MHier-RAG: Multi-Modal RAG for Visual-Rich Document Question-Answering via Hierarchical and Multi-Granularity Reasoning
by: Gong, Ziyu, et al.
Published: (2025)
Knowledge-Driven Cross-Document Relation Extraction
by: Jain, Monika, et al.
Published: (2024)
VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents
by: Tanaka, Ryota, et al.
Published: (2025)
REALM: Recursive Relevance Modeling for LLM-based Document Re-Ranking
by: Wang, Pinhuan, et al.
Published: (2025)
Evaluating LLMs on Document-Based QA: Exact Answer Selection and Numerical Extraction using Cogtale dataset
by: Rasool, Zafaryab, et al.
Published: (2023)
Document Organization Using Kohonen's Algorithm.
by: Guerrero Bote, Vicente P., et al.
Published: (2002)
Revisiting Document-Level Relation Extraction with Context-Guided Link Prediction
by: Jain, Monika, et al.
Published: (2024)
Hybrid-Vector Retrieval for Visually Rich Documents: Combining Single-Vector Efficiency and Multi-Vector Accuracy
by: Kim, Juyeon, et al.
Published: (2025)
Towards Human-Like Machine Comprehension: Few-Shot Relational Learning in Visually-Rich Documents
by: Wang, Hao, et al.
Published: (2024)
Hypothetical Documents or Knowledge Leakage? Rethinking LLM-based Query Expansion
by: Yoon, Yejun, et al.
Published: (2025)
ModernVBERT: Towards Smaller Visual Document Retrievers
by: Teiletche, Paul, et al.
Published: (2025)
Application Of Large Language Models For The Extraction Of Information From Particle Accelerator Technical Documentation
by: Dai, Qing, et al.
Published: (2025)
Passage Segmentation of Documents for Extractive Question Answering
by: Liu, Zuhong, et al.
Published: (2025)
Automatic Extraction of Document Keyphrases for Use in Digital Libraries: Evaluation and Applications.
by: Jones, Steve, et al.
Published: (2002)
Multi-Relation Extraction in Entity Pairs using Global Context
by: Nilesh, et al.
Published: (2025)
Multi-Stage Field Extraction of Financial Documents with OCR and Compact Vision-Language Models
by: Jin, Yichao, et al.
Published: (2025)
Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration
by: Dai, Sunhao, et al.
Published: (2024)
MST-R: Multi-Stage Tuning for Retrieval Systems and Metric Evaluation
by: Malviya, Yash, et al.
Published: (2024)
Clinical Document Metadata Extraction: A Scoping Review
by: Miller, Kurt, et al.
Published: (2025)
Chemical Reaction Extraction from Long Patent Documents
by: Jadhav, Aishwarya, et al.
Published: (2024)
Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use
by: Cesista, Franz Louis, et al.
Published: (2024)
Reproducibility, Replicability, and Insights into Visual Document Retrieval with Late Interaction
by: Qiao, Jingfen, et al.
Published: (2025)
Contextual Relevance and Adaptive Sampling for LLM-Based Document Reranking
by: Huang, Jerry, et al.
Published: (2025)