Bibliographic Details
Main Authors:	Zhu, Boyuan; Liu, Fagui; Chen, Xi; Tang, Quan
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2401.11704

_version_	1866911762231590912
author	Zhu, Boyuan; Liu, Fagui; Chen, Xi; Tang, Quan
author_facet	Zhu, Boyuan; Liu, Fagui; Chen, Xi; Tang, Quan
contents	Recently, scene text detection has received significant attention due to its wide range of applications. However, accurate detection in complex scenes with text of multiple scales, orientations, and curvatures remains a challenge. Numerous detection methods adopt the Vatti clipping (VC) algorithm for multiple-instance training to address the issue of arbitrary-shaped text. Yet we identify a bias that results from these approaches, which we call the "shrinked kernel". Specifically, it refers to a decrease in accuracy resulting from an output that overly favors the text kernel. In this paper, we propose a new approach named Expand Kernel Network (EK-Net), which uses an expand kernel distance to compensate for this deficiency and includes a three-stage regression to complete instance detection. Moreover, EK-Net not only realizes precise positioning of arbitrary-shaped text, but also achieves a trade-off between performance and speed. Evaluation results demonstrate that EK-Net achieves state-of-the-art or competitive performance compared to other advanced methods, e.g., an F-measure of 85.72% at 35.42 FPS on ICDAR 2015 and an F-measure of 85.75% at 40.13 FPS on CTW1500.
format	Preprint
id	arxiv_https___arxiv_org_abs_2401_11704
institution	arXiv
publishDate	2024
record_format	arxiv
title	EK-Net: Real-time Scene Text Detection with Expand Kernel Distance
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2401.11704

Similar Items