| Main Authors: | Boillet, Mélodie, Tarride, Solène, Kermorvant, Christopher |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.26712 |
Similar Items
Improving Automatic Text Recognition with Language Models in the PyLaia Open-Source Library
by: Tarride, Solène, et al.
Published: (2024)
Revisiting N-Gram Models: Their Impact in Modern Neural Networks for Handwritten Text Recognition
by: Tarride, Solène, et al.
Published: (2024)
The Socface Project: Large-Scale Collection, Processing, and Analysis of a Century of French Censuses
by: Boillet, Mélodie, et al.
Published: (2024)
Reading Order Independent Metrics for Information Extraction in Handwritten Documents
by: Villanova-Aparisi, David, et al.
Published: (2024)
Normalized vs Diplomatic Annotation: A Case Study of Automatic Information Extraction from Handwritten Uruguayan Birth Certificates
by: Bottaioli, Natalia, et al.
Published: (2025)
Callico: a Versatile Open-Source Document Image Annotation Platform
by: Kermorvant, Christopher, et al.
Published: (2024)
From Press to Pixels: Evolving Urdu Text Recognition
by: Arif, Samee, et al.
Published: (2025)
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering
by: Tang, Jingqun, et al.
Published: (2024)
Benchmarking Large Language Models for Handwritten Text Recognition
by: Crosilla, Giorgia, et al.
Published: (2025)
AnyText: Multilingual Visual Text Generation And Editing
by: Tuo, Yuxiang, et al.
Published: (2023)
KhmerST: A Low-Resource Khmer Scene Text Detection and Recognition Benchmark
by: Nom, Vannkinh, et al.
Published: (2024)
MMTIT-Bench: A Multilingual and Multi-Scenario Benchmark with Cognition-Perception-Reasoning Guided Text-Image Machine Translation
by: Li, Gengluo, et al.
Published: (2026)
WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation
by: Awal, Rabiul, et al.
Published: (2025)
EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering
by: Lu, Runnan, et al.
Published: (2025)
GaitAdapt: Continual Learning for Evolving Gait Recognition
by: Wang, Jingjie, et al.
Published: (2025)
StyleTextGen: Style-Conditioned Multilingual Scene Text Generation
by: Chen, Zeyu, et al.
Published: (2026)
FastTextSpotter: A High-Efficiency Transformer for Multilingual Scene Text Spotting
by: Das, Alloy, et al.
Published: (2024)
Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering
by: Maryam, Hiba, et al.
Published: (2024)
Leveraging Automatic Personalised Nutrition: Food Image Recognition Benchmark and Dataset based on Nutrition Taxonomy
by: Romero-Tapiador, Sergio, et al.
Published: (2022)
Read or Ignore? A Unified Benchmark for Typographic-Attack Robustness and Text Recognition in Vision-Language Models
by: Waseda, Futa, et al.
Published: (2025)
Saliency-Aware Automatic Buddhas Statue Recognition
by: Qi, Yong, et al.
Published: (2024)
DanceText: A Training-Free Layered Framework for Controllable Multilingual Text Transformation in Images
by: Yu, Zhenyu, et al.
Published: (2025)
Research on Multilingual Natural Scene Text Detection Algorithm
by: Wang, Tao
Published: (2023)
JoyType: A Robust Design for Multilingual Visual Text Creation
by: Li, Chao, et al.
Published: (2024)
Benchmarking and Evolving Reason-Reflect-Rectify for Reflective Visual Generation
by: Wang, Junjie, et al.
Published: (2026)
MINERVA-Cultural: A Benchmark for Cultural and Multilingual Long Video Reasoning
by: Singh, Darshan, et al.
Published: (2026)
Boosting Gesture Recognition with an Automatic Gesture Annotation Framework
by: Shen, Junxiao, et al.
Published: (2024)
MUNIChus: Multilingual News Image Captioning Benchmark
by: Chen, Yuji, et al.
Published: (2026)
A Culturally-diverse Multilingual Multimodal Video Benchmark & Model
by: Shafique, Bhuiyan Sanjid, et al.
Published: (2025)
SceneVTG++: Controllable Multilingual Visual Text Generation in the Wild
by: Liu, Jiawei, et al.
Published: (2025)
TextSR: Diffusion Super-Resolution with Multilingual OCR Guidance
by: Ye, Keren, et al.
Published: (2025)
Constructing Multilingual Visual-Text Datasets Revealing Visual Multilingual Ability of Vision Language Models
by: Atuhurra, Jesse, et al.
Published: (2024)
TextFlux: An OCR-Free DiT Model for High-Fidelity Multilingual Scene Text Synthesis
by: Xie, Yu, et al.
Published: (2025)
Benchmarking Deep Learning Classifiers for SAR Automatic Target Recognition
by: Fein-Ashley, Jacob, et al.
Published: (2023)
Evolving Storytelling: Benchmarks and Methods for New Character Customization with Diffusion Models
by: Wang, Xiyu, et al.
Published: (2024)
Pre-training for Action Recognition with Automatically Generated Fractal Datasets
by: Svyezhentsev, Davyd, et al.
Published: (2024)
Uni-DAD: Unified Distillation and Adaptation of Diffusion Models for Few-step Few-shot Image Generation
by: Bahram, Yara, et al.
Published: (2025)
Automatic Text Box Placement for Supporting Typographic Design
by: Muraoka, Jun, et al.
Published: (2025)
Decoder Pre-Training with only Text for Scene Text Recognition
by: Zhao, Shuai, et al.
Published: (2024)
TEACH: Text Encoding as Curriculum Hints for Scene Text Recognition
by: Yang, Xiahan, et al.
Published: (2025)