| Main Author: | Aguilar, Sergio Torres |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.20326 |
Similar Items
LED: A Benchmark for Evaluating Layout Error Detection in Document Analysis
by: Heo, Inbum, et al.
Published: (2026)
BACON: Improving Clarity of Image Captions via Bag-of-Concept Graphs
by: Yang, Zhantao, et al.
Published: (2024)
LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding
by: Luo, Chuwei, et al.
Published: (2024)
TRIDIS: A Comprehensive Medieval and Early Modern Corpus for HTR and NER
by: Aguilar, Sergio Torres
Published: (2025)
YOLO Object Detectors for Robotics -- a Comparative Study
by: Niżeniec, Patryk, et al.
Published: (2026)
SODIUM: From Open Web Data to Queryable Databases
by: Hu, Chuxuan, et al.
Published: (2026)
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
by: Zhao, Zhiyuan, et al.
Published: (2024)
Benchmarking Graph Neural Networks for Document Layout Analysis in Public Affairs
by: Lopez-Duran, Miguel, et al.
Published: (2025)
LAPDoc: Layout-Aware Prompting for Documents
by: Lamott, Marcel, et al.
Published: (2024)
Visually Guided Generative Text-Layout Pre-training for Document Intelligence
by: Mao, Zhiming, et al.
Published: (2024)
Improving Image Captioning Descriptiveness by Ranking and LLM-based Fusion
by: Celona, Luigi, et al.
Published: (2023)
SegHist: A General Segmentation-based Framework for Chinese Historical Document Text Line Detection
by: Hu, Xingjian, et al.
Published: (2024)
ROAP: A Reading-Order and Attention-Prior Pipeline for Optimizing Layout Transformers in Key Information Extraction
by: Xie, Tingwei, et al.
Published: (2026)
Multimodal Neural Databases
by: Trappolini, Giovanni, et al.
Published: (2023)
Assessing the Capability of YOLO- and Transformer-based Object Detectors for Real-time Weed Detection
by: Allmendinger, Alicia, et al.
Published: (2025)
KH-FUNSD: A Hierarchical and Fine-Grained Layout Analysis Dataset for Low-Resource Khmer Business Document
by: Thuon, Nimol, et al.
Published: (2025)
Co-Layout: LLM-driven Co-optimization for Interior Layout
by: Xiang, Chucheng, et al.
Published: (2025)
TWIX: Automatically Reconstructing Structured Data from Templatized Documents
by: Lin, Yiming, et al.
Published: (2025)
LAND: A Longitudinal Analysis of Neuromorphic Datasets
by: Cohen, Gregory, et al.
Published: (2026)
DLAFormer: An End-to-End Transformer For Document Layout Analysis
by: Wang, Jiawei, et al.
Published: (2024)
Automatic Layout Planning for Visually-Rich Documents with Instruction-Following Models
by: Zhu, Wanrong, et al.
Published: (2024)
Accurate Fine-grained Layout Analysis for the Historical Tibetan Document Based on the Instance Segmentation
by: Zhao, Penghai, et al.
Published: (2021)
VideoScoop: A Non-Traditional Domain-Independent Framework For Video Analysis
by: Billah, Hafsa
Published: (2025)
Transformers and Language Models in Form Understanding: A Comprehensive Review of Scanned Document Analysis
by: Abdallah, Abdelrahman, et al.
Published: (2024)
Reducing Hallucination in Vision-Language Models via Stage-wise Preference Optimization under Distribution Shift
by: Xu, Qinwu
Published: (2026)
Unveiling the Pitfalls of Knowledge Editing for Large Language Models
by: Li, Zhoubo, et al.
Published: (2023)
Reference-Based Post-OCR Processing with LLM for Precise Diacritic Text in Historical Document Recognition
by: Do, Thao, et al.
Published: (2024)
CLDA-YOLO: Visual Contrastive Learning Based Domain Adaptive YOLO Detector
by: Qiu, Tianheng, et al.
Published: (2024)
Layout-Aware Text Editing for Efficient Transformation of Academic PDFs to Markdown
by: Duan, Changxu
Published: (2025)
HATFormer: Historic Handwritten Arabic Text Recognition with Transformers
by: Chan, Adrian, et al.
Published: (2024)
R4-CGQA: Retrieval-based Vision Language Models for Computer Graphics Image Quality Assessment
by: Li, Zhuangzi, et al.
Published: (2026)
Reviving Cultural Heritage: A Novel Approach for Comprehensive Historical Document Restoration
by: Zhang, Yuyi, et al.
Published: (2025)
Extract-Transform-Load for Video Streams
by: Kossmann, Ferdinand, et al.
Published: (2023)
Infinity Parser: Layout Aware Reinforcement Learning for Scanned Document Parsing
by: Wang, Baode, et al.
Published: (2025)
Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding
by: Fan, Yue, et al.
Published: (2024)
LED Benchmark: Diagnosing Structural Layout Errors for Document Layout Analysis
by: Heo, Inbum, et al.
Published: (2025)
A Comparative Study of Continuous Sign Language Recognition Techniques
by: Alyami, Sarah, et al.
Published: (2024)
A Hybrid Approach for Document Layout Analysis in Document images
by: Shehzadi, Tahira, et al.
Published: (2024)
Improving OCR for Historical Texts of Multiple Languages
by: Westerdijk, Hylke, et al.
Published: (2025)
DPCD: A Quality Assessment Database for Dynamic Point Clouds
by: Liu, Yating, et al.
Published: (2025)