Bibliographic Details
Main Authors: Wang, Jiale, Yu, Junhui, Liu, Huanyong, Kong, Chenanran
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2409.11677
_version_ 1866909451994267648
author Wang, Jiale
Yu, Junhui
Liu, Huanyong
Kong, Chenanran
author_facet Wang, Jiale
Yu, Junhui
Liu, Huanyong
Kong, Chenanran
contents Hierarchical and complex Mathematical Expression Recognition (MER) is challenging because a formula can have multiple valid interpretations, which complicates both parsing and evaluation. In this paper, we introduce the Hierarchical Detail-Focused Recognition dataset (HDR), the first dataset specifically designed to address these issues. It consists of a large-scale training set, HDR-100M, which offers unprecedented scale and diversity with one hundred million training instances, and a test set, HDR-Test, which includes multiple interpretations of complex hierarchical formulas for comprehensive evaluation of model performance. Additionally, the parsing of complex formulas often suffers from errors in fine-grained details. To address this, we propose the Hierarchical Detail-Focused Recognition Network (HDNet), an innovative framework that incorporates a hierarchical sub-formula module focused on the precise handling of formula details, thereby significantly enhancing MER performance. Experimental results demonstrate that HDNet outperforms existing MER models across various datasets.
format Preprint
id arxiv_https___arxiv_org_abs_2409_11677
institution arXiv
publishDate 2024
record_format arxiv
title Enhancing Complex Formula Recognition with Hierarchical Detail-Focused Network
topic Computation and Language
url https://arxiv.org/abs/2409.11677