| Main Author: | Bourne, Jonathan |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2409.19735 |
Similar Items
CLOCR-C: Context Leveraging OCR Correction with Pre-trained Language Models
by: Bourne, Jonathan
Published: (2024)
Reading the unreadable: Creating a dataset of 19th century English newspapers using image-to-text language models
by: Bourne, Jonathan
Published: (2025)
CECOR: Correction-oriented synthetic data construction for factual error correction
by: Zhu, Lei, et al.
Published: (2026)
The Character Error Vector: Decomposable errors for page-level OCR evaluation
by: Bourne, Jonathan, et al.
Published: (2026)
Rephrasing natural text data with different languages and quality levels for Large Language Model pre-training
by: Pieler, Michael, et al.
Published: (2024)
Prompting open-source and commercial language models for grammatical error correction of English learner text
by: Davis, Christopher, et al.
Published: (2024)
Where Vision Becomes Text: Locating the OCR Routing Bottleneck in Vision-Language Models
by: Steinberg, Jonathan, et al.
Published: (2026)
Typoglycemia under the Hood: Investigating Language Models' Understanding of Scrambled Words
by: Sperduti, Gianluca, et al.
Published: (2025)
Private prediction for large-scale synthetic text generation
by: Amin, Kareem, et al.
Published: (2024)
Labeling Free-text Data using Language Model Ensembles
by: Qiu, Jiaxing, et al.
Published: (2025)
DharmaOCR: Specialized Small Language Models for Structured OCR that outperform Open-Source and Commercial Baselines
by: Cardoso, Gabriel Pimenta de Freitas, et al.
Published: (2026)
olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models
by: Poznanski, Jake, et al.
Published: (2025)
Typhoon OCR: Open Vision-Language Model For Thai Document Extraction
by: Nonesung, Surapon, et al.
Published: (2026)
Spiking the training data to correct for test set contamination
by: Wei, Johnny Tian-Zheng, et al.
Published: (2026)
Natural language guidance of high-fidelity text-to-speech with synthetic annotations
by: Lyth, Dan, et al.
Published: (2024)
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding
by: Shi, Yuling, et al.
Published: (2026)
Self-training from Self-memory in Data-to-text Generation
by: Ta, Hoang-Thang
Published: (2024)
TexOCR: Advancing Document OCR Models for Compilable Page-to-LaTeX Reconstruction
by: Wang, Chengye, et al.
Published: (2026)
Baseer: A Vision-Language Model for Arabic Document-to-Markdown OCR
by: Hennara, Khalil, et al.
Published: (2025)
Improving OCR for Historical Texts of Multiple Languages
by: Westerdijk, Hylke, et al.
Published: (2025)
High-precision medical speech recognition through synthetic data and semantic correction: UNITED-MEDASR
by: Banerjee, Sourav, et al.
Published: (2024)
Benchmarking Vision-Language Models on Chinese Ancient Documents: From OCR to Knowledge Reasoning
by: Yu, Haiyang, et al.
Published: (2025)
Tag and correct: high precision post-editing approach to correction of speech recognition errors
by: Ziętkiewicz, Tomasz
Published: (2024)
Enhancing Vision-Language Model Pre-training with Image-text Pair Pruning Based on Word Frequency
by: Liang, Mingliang, et al.
Published: (2024)
GlotOCR Bench: OCR Models Still Struggle Beyond a Handful of Unicode Scripts
by: Kargaran, Amir Hossein, et al.
Published: (2026)
Chain-of-Though (CoT) prompting strategies for medical error detection and correction
by: Wu, Zhaolong, et al.
Published: (2024)
olmOCR 2: Unit Test Rewards for Document OCR
by: Poznanski, Jake, et al.
Published: (2025)
Audio-visual training for improved grounding in video-text LLMs
by: Sagare, Shivprasad, et al.
Published: (2024)
KazakhOCR: A Synthetic Benchmark for Evaluating Multimodal Models in Low-Resource Kazakh Script OCR
by: Gagnier, Henry, et al.
Published: (2026)
Tgea: An error-annotated dataset and benchmark tasks for text generation from pretrained language models
by: He, Jie, et al.
Published: (2025)
Improving the quality of Persian clinical text with a novel spelling correction system
by: Dashti, Seyed Mohammad Sadegh, et al.
Published: (2024)
RoundTripOCR: A Data Generation Technique for Enhancing Post-OCR Error Correction in Low-Resource Devanagari Languages
by: Kashid, Harshvivek, et al.
Published: (2024)
From scratch to silver: Creating trustworthy training data for patent-SDG classification using Large Language Models
by: Ascione, Grazia Sveva, et al.
Published: (2025)
GLM-OCR Technical Report
by: Duan, Shuaiqi, et al.
Published: (2026)
LLMs cannot find reasoning errors, but can correct them given the error location
by: Tyen, Gladys, et al.
Published: (2023)
Causality extraction from medical text using Large Language Models (LLMs)
by: Gopalakrishnan, Seethalakshmi, et al.
Published: (2024)
Self-correction is Not An Innate Capability in Language Models
by: Liu, Guangliang, et al.
Published: (2024)
Spanish TrOCR: Leveraging Transfer Learning for Language Adaptation
by: Lauar, Filipe, et al.
Published: (2024)
OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models
by: Liu, Yuliang, et al.
Published: (2023)
Investigating the translation capabilities of Large Language Models trained on parallel data only
by: Gilabert, Javier García, et al.
Published: (2024)