Staff View: Transcribing Spanish Texts from the Past: Experiments with Transkribus, Tesseract and Granite

Bibliographic Details
Main Authors:	Torterolo-Orta, Yanco Amor; Macicior-Mitxelena, Jaione; Miguez-Lamanuzzi, Marina; García-Serrano, Ana
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Computation and Language
Online Access:	https://arxiv.org/abs/2507.04878

_version_	1866909677403504640
author	Torterolo-Orta, Yanco Amor; Macicior-Mitxelena, Jaione; Miguez-Lamanuzzi, Marina; García-Serrano, Ana
author_facet	Torterolo-Orta, Yanco Amor; Macicior-Mitxelena, Jaione; Miguez-Lamanuzzi, Marina; García-Serrano, Ana
contents	This article presents the experiments and results obtained by the GRESEL team in the IberLEF 2025 shared task PastReader: Transcribing Texts from the Past. Three types of experiments were conducted with the dual aim of participating in the task and enabling comparisons across different approaches. These included the use of a web-based OCR service, a traditional OCR engine, and a compact multimodal model. All experiments were run on consumer-grade hardware, which, despite lacking high-performance computing capacity, provided sufficient storage and stability. The results, while satisfactory, leave room for further improvement. Future work will focus on exploring new techniques and ideas using the Spanish-language dataset provided by the shared task, in collaboration with Biblioteca Nacional de España (BNE).
format	Preprint
id	arxiv_https___arxiv_org_abs_2507_04878
institution	arXiv
publishDate	2025
record_format	arxiv
title	Transcribing Spanish Texts from the Past: Experiments with Transkribus, Tesseract and Granite
topic	Computer Vision and Pattern Recognition; Computation and Language
url	https://arxiv.org/abs/2507.04878
