Staff View: LMRPA: Large Language Model-Driven Efficient Robotic Process Automation for OCR :: Library Catalog

Bibliographic Details
Main Authors:	Abdellaif, Osama Hosam; Nader, Abdelrahman; Hamdi, Ali
Format:	Preprint
Published:	2024
Subjects:	Robotics; Digital Libraries; Human-Computer Interaction; Software Engineering
Online Access:	https://arxiv.org/abs/2412.18063

_version_	1866916786314674176
author	Abdellaif, Osama Hosam; Nader, Abdelrahman; Hamdi, Ali
author_facet	Abdellaif, Osama Hosam; Nader, Abdelrahman; Hamdi, Ali
contents	This paper introduces LMRPA, a novel Large Model-Driven Robotic Process Automation (RPA) model designed to improve the efficiency and speed of Optical Character Recognition (OCR) tasks. Traditional RPA platforms often suffer from performance bottlenecks when handling high-volume repetitive processes such as OCR, which makes them slower and less efficient. LMRPA integrates Large Language Models (LLMs) to improve the accuracy and readability of extracted text, addressing the challenges posed by ambiguous characters and complex text structures. Extensive benchmarks compared LMRPA with leading RPA platforms, including UiPath and Automation Anywhere, using OCR engines such as Tesseract and DocTR. The results show that LMRPA achieves superior performance, cutting processing times by up to 52%. For instance, in Batch 2 of the Tesseract OCR task, LMRPA completed the process in 9.8 seconds, whereas UiPath took 18.1 seconds and Automation Anywhere took 18.7 seconds. Similar improvements were observed with DocTR, where LMRPA completed the same tasks in 12.7 seconds while competitors took over 20 seconds. These findings highlight the potential of LMRPA to revolutionize OCR-driven automation, offering a more efficient and effective alternative to existing state-of-the-art RPA models.
format	Preprint
id	arxiv_https___arxiv_org_abs_2412_18063
institution	arXiv
publishDate	2024
record_format	arxiv
title	LMRPA: Large Language Model-Driven Efficient Robotic Process Automation for OCR
topic	Robotics; Digital Libraries; Human-Computer Interaction; Software Engineering
url	https://arxiv.org/abs/2412.18063