Library Catalog

Bibliographic Details
Main Author:	Bourne, Jonathan
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2409.19735

Similar Items

CLOCR-C: Context Leveraging OCR Correction with Pre-trained Language Models
by: Bourne, Jonathan
Published: (2024)

Reading the unreadable: Creating a dataset of 19th century English newspapers using image-to-text language models
by: Bourne, Jonathan
Published: (2025)

CECOR: Correction-oriented synthetic data construction for factual error correction
by: Zhu, Lei, et al.
Published: (2026)

The Character Error Vector: Decomposable errors for page-level OCR evaluation
by: Bourne, Jonathan, et al.
Published: (2026)

Rephrasing natural text data with different languages and quality levels for Large Language Model pre-training
by: Pieler, Michael, et al.
Published: (2024)

Prompting open-source and commercial language models for grammatical error correction of English learner text
by: Davis, Christopher, et al.
Published: (2024)

Where Vision Becomes Text: Locating the OCR Routing Bottleneck in Vision-Language Models
by: Steinberg, Jonathan, et al.
Published: (2026)

Typoglycemia under the Hood: Investigating Language Models' Understanding of Scrambled Words
by: Sperduti, Gianluca, et al.
Published: (2025)

Private prediction for large-scale synthetic text generation
by: Amin, Kareem, et al.
Published: (2024)

Labeling Free-text Data using Language Model Ensembles
by: Qiu, Jiaxing, et al.
Published: (2025)

DharmaOCR: Specialized Small Language Models for Structured OCR that outperform Open-Source and Commercial Baselines
by: Cardoso, Gabriel Pimenta de Freitas, et al.
Published: (2026)

olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models
by: Poznanski, Jake, et al.
Published: (2025)

Typhoon OCR: Open Vision-Language Model For Thai Document Extraction
by: Nonesung, Surapon, et al.
Published: (2026)

Spiking the training data to correct for test set contamination
by: Wei, Johnny Tian-Zheng, et al.
Published: (2026)

Natural language guidance of high-fidelity text-to-speech with synthetic annotations
by: Lyth, Dan, et al.
Published: (2024)

CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding
by: Shi, Yuling, et al.
Published: (2026)

Self-training from Self-memory in Data-to-text Generation
by: Ta, Hoang-Thang
Published: (2024)

TexOCR: Advancing Document OCR Models for Compilable Page-to-LaTeX Reconstruction
by: Wang, Chengye, et al.
Published: (2026)

Baseer: A Vision-Language Model for Arabic Document-to-Markdown OCR
by: Hennara, Khalil, et al.
Published: (2025)

Improving OCR for Historical Texts of Multiple Languages
by: Westerdijk, Hylke, et al.
Published: (2025)

High-precision medical speech recognition through synthetic data and semantic correction: UNITED-MEDASR
by: Banerjee, Sourav, et al.
Published: (2024)

Benchmarking Vision-Language Models on Chinese Ancient Documents: From OCR to Knowledge Reasoning
by: Yu, Haiyang, et al.
Published: (2025)

Tag and correct: high precision post-editing approach to correction of speech recognition errors
by: Ziętkiewicz, Tomasz
Published: (2024)

Enhancing Vision-Language Model Pre-training with Image-text Pair Pruning Based on Word Frequency
by: Liang, Mingliang, et al.
Published: (2024)

GlotOCR Bench: OCR Models Still Struggle Beyond a Handful of Unicode Scripts
by: Kargaran, Amir Hossein, et al.
Published: (2026)

Chain-of-Though (CoT) prompting strategies for medical error detection and correction
by: Wu, Zhaolong, et al.
Published: (2024)

olmOCR 2: Unit Test Rewards for Document OCR
by: Poznanski, Jake, et al.
Published: (2025)

Audio-visual training for improved grounding in video-text LLMs
by: Sagare, Shivprasad, et al.
Published: (2024)

KazakhOCR: A Synthetic Benchmark for Evaluating Multimodal Models in Low-Resource Kazakh Script OCR
by: Gagnier, Henry, et al.
Published: (2026)

TGEA: An error-annotated dataset and benchmark tasks for text generation from pretrained language models
by: He, Jie, et al.
Published: (2025)

Improving the quality of Persian clinical text with a novel spelling correction system
by: Dashti, Seyed Mohammad Sadegh, et al.
Published: (2024)

RoundTripOCR: A Data Generation Technique for Enhancing Post-OCR Error Correction in Low-Resource Devanagari Languages
by: Kashid, Harshvivek, et al.
Published: (2024)

From scratch to silver: Creating trustworthy training data for patent-SDG classification using Large Language Models
by: Ascione, Grazia Sveva, et al.
Published: (2025)

GLM-OCR Technical Report
by: Duan, Shuaiqi, et al.
Published: (2026)

LLMs cannot find reasoning errors, but can correct them given the error location
by: Tyen, Gladys, et al.
Published: (2023)

Causality extraction from medical text using Large Language Models (LLMs)
by: Gopalakrishnan, Seethalakshmi, et al.
Published: (2024)

Self-correction is Not An Innate Capability in Language Models
by: Liu, Guangliang, et al.
Published: (2024)

Spanish TrOCR: Leveraging Transfer Learning for Language Adaptation
by: Lauar, Filipe, et al.
Published: (2024)

OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models
by: Liu, Yuliang, et al.
Published: (2023)

Investigating the translation capabilities of Large Language Models trained on parallel data only
by: Gilabert, Javier García, et al.
Published: (2024)