| Main Authors: | Lauar, Filipe; Laurent, Valentin |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2407.06950 |
Similar Items
Error Patterns in Historical OCR: A Comparative Analysis of TrOCR and a Vision-Language Model
by: Vesalainen, Ari, et al.
Published: (2026)
TrInk: Ink Generation with Transformer Network
by: Jin, Zezhong, et al.
Published: (2025)
Multilingual Transfer and Domain Adaptation for Low-Resource Languages of Spain
by: Luo, Yuanchang, et al.
Published: (2024)
Adapting TrOCR for Printed Tigrinya Text Recognition: Word-Aware Loss Weighting for Cross-Script Transfer Learning
by: Medhanie, Yonatan Haile, et al.
Published: (2026)
Intermediate-Task Transfer Learning: Leveraging Sarcasm Detection for Stance Detection
by: Nkhata, Gibson, et al.
Published: (2025)
Introducing TrGLUE and SentiTurca: A Comprehensive Benchmark for Turkish General Language Understanding and Sentiment Analysis
by: Altinok, Duygu
Published: (2025)
Deep Natural Language Feature Learning for Interpretable Prediction
by: Urrutia, Felipe, et al.
Published: (2023)
DharmaOCR: Specialized Small Language Models for Structured OCR that outperform Open-Source and Commercial Baselines
by: Cardoso, Gabriel Pimenta de Freitas, et al.
Published: (2026)
Leveraging In-Context Learning for Language Model Agents
by: Gupta, Shivanshu, et al.
Published: (2025)
Seventeenth-Century Spanish American Notary Records for Fine-Tuning Spanish Large Language Models
by: Sarker, Shraboni, et al.
Published: (2024)
OCRTurk: A Comprehensive OCR Benchmark for Turkish
by: Yılmaz, Deniz, et al.
Published: (2026)
Multimodal LLMs for OCR, OCR Post-Correction, and Named Entity Recognition in Historical Documents
by: Greif, Gavin, et al.
Published: (2025)
Cross-Lingual Transfer and Parameter-Efficient Adaptation in the Turkic Language Family: A Theoretical Framework for Low-Resource Language Models
by: Ibrahimzade, O., et al.
Published: (2026)
Leveraging Weakly Annotated Data for Hate Speech Detection in Code-Mixed Hinglish: A Feasibility-Driven Transfer Learning Approach with Large Language Models
by: Yadav, Sargam, et al.
Published: (2024)
DIETA: A Decoder-only transformer-based model for Italian-English machine TrAnslation
by: Kasela, Pranav, et al.
Published: (2026)
Reading or Reasoning? Format Decoupled Reinforcement Learning for Document OCR
by: Zhong, Yufeng, et al.
Published: (2025)
Multi-BERT: Leveraging Adapters and Prompt Tuning for Low-Resource Multi-Domain Adaptation
by: Azad, Parham Abed, et al.
Published: (2024)
DELIA: Diversity-Enhanced Learning for Instruction Adaptation in Large Language Models
by: Zeng, Yuanhao, et al.
Published: (2024)
DeepPavlov at SemEval-2024 Task 8: Leveraging Transfer Learning for Detecting Boundaries of Machine-Generated Texts
by: Voznyuk, Anastasia, et al.
Published: (2024)
TildeOpen LLM: Leveraging Curriculum Learning to Achieve Equitable Language Representation
by: Bergmanis, Toms, et al.
Published: (2026)
Investigating OCR-Sensitive Neurons to Improve Entity Recognition in Historical Documents
by: Boros, Emanuela, et al.
Published: (2024)
Enriching Historical Records: An OCR and AI-Driven Approach for Database Integration
by: Abedi, Zahra, et al.
Published: (2025)
Achieving Tokenizer Flexibility in Language Models through Heuristic Adaptation and Supertoken Learning
by: Sharthak, Shaurya, et al.
Published: (2025)
Token-Efficient Leverage Learning in Large Language Models
by: Zeng, Yuanhao, et al.
Published: (2024)
Leveraging Large Language Models for Entity Matching
by: Huang, Qianyu, et al.
Published: (2024)
Leveraging Grammar Induction for Language Understanding and Generation
by: Kai, Jushi, et al.
Published: (2024)
Machine Translation Advancements of Low-Resource Indian Languages by Transfer Learning
by: Wei, Bin, et al.
Published: (2024)
CamemBERT-bio: Leveraging Continual Pre-training for Cost-Effective Models on French Biomedical Data
by: Touchent, Rian, et al.
Published: (2023)
LengClaro2023: A Dataset of Administrative Texts in Spanish with Plain Language adaptations
by: Agüera-Marco, Belén, et al.
Published: (2025)
TrICy: Trigger-guided Data-to-text Generation with Intent aware Attention-Copy
by: Agarwal, Vibhav, et al.
Published: (2024)
Reasoning Core: A Scalable RL Environment for LLM Symbolic Reasoning
by: Lacombe, Valentin, et al.
Published: (2025)
Bridging Language Gaps: Enhancing Few-Shot Language Adaptation
by: Borchert, Philipp, et al.
Published: (2025)
Sequence-to-Sequence Spanish Pre-trained Language Models
by: Araujo, Vladimir, et al.
Published: (2023)
Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey
by: Pei, Qizhi, et al.
Published: (2024)
Confidence-Aware Document OCR Error Detection
by: Hemmer, Arthur, et al.
Published: (2024)
Code-Switching In-Context Learning for Cross-Lingual Transfer of Large Language Models
by: Yoo, Haneul, et al.
Published: (2025)
Leveraging Parameter Space Symmetries for Reasoning Skill Transfer in LLMs
by: Horoi, Stefan, et al.
Published: (2025)
GutenOCR: A Grounded Vision-Language Front-End for Documents
by: Heidenreich, Hunter, et al.
Published: (2026)
What Layers When: Learning to Skip Compute in LLMs with Residual Gates
by: Laitenberger, Filipe, et al.
Published: (2025)
OCR or Not? Rethinking Document Information Extraction in the MLLMs Era with Real-World Large-Scale Datasets
by: Shen, Jiyuan, et al.
Published: (2026)