Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Couture, Beatrice; Verret, Farah; Gohier, Maxime; Deslandres, Dominique
Format:	Preprint
Published:	2022
Subjects:	Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2212.11146

_version_	1866914755750395904
author	Couture, Beatrice; Verret, Farah; Gohier, Maxime; Deslandres, Dominique
author_facet	Couture, Beatrice; Verret, Farah; Gohier, Maxime; Deslandres, Dominique
contents	The arrival of handwriting recognition technologies offers new possibilities for research in heritage studies. However, it is now necessary to reflect on the experiences and the practices developed by research teams. Our use of the Transkribus platform since 2018 has led us to search for the most significant ways to improve the performance of our handwritten text recognition (HTR) models which are made to transcribe French handwriting dating from the 17th century. This article therefore reports on the impacts of creating transcribing protocols, using the language model at full scale and determining the best way to use base models in order to help increase the performance of HTR models. Combining all of these elements can indeed increase the performance of a single model by more than 20% (reaching a Character Error Rate below 5%). This article also discusses some challenges regarding the collaborative nature of HTR platforms such as Transkribus and the way researchers can share their data generated in the process of creating or training handwritten text recognition models.
format	Preprint
id	arxiv_https___arxiv_org_abs_2212_11146
institution	arXiv
publishDate	2022
record_format	arxiv
title	The Challenges of HTR Model Training: Feedback from the Project Donner le gout de l'archive a l'ere numerique
topic	Computer Vision and Pattern Recognition; Machine Learning
url	https://arxiv.org/abs/2212.11146
