Enregistré dans:
Détails bibliographiques
Auteurs principaux: Yuan, Hongyi, Yu, Sheng
Format: Preprint
Publié: 2024
Sujets:
Accès en ligne:https://arxiv.org/abs/2409.19796
Tags: Ajouter un tag
Pas de tags, Soyez le premier à ajouter un tag!
_version_ 1866909329323458560
author Yuan, Hongyi
Yu, Sheng
author_facet Yuan, Hongyi
Yu, Sheng
contents Electronic medical records (EMRs) contain the majority of patients' healthcare details. It is an abundant resource for developing an automatic healthcare system. Most of the natural language processing (NLP) studies on EMR processing, such as concept extraction, are adversely affected by the inaccurate segmentation of EMR sections. At the same time, not enough attention has been given to the accurate sectioning of EMRs. The information that may occur in section structures is unvalued. This work focuses on the segmentation of EMRs and proposes a black-box segmentation method using a simple sentence embedding model and neural network, along with a proper training method. To achieve universal adaptivity, we train our model on the dataset with different section headings formats. We compare several advanced deep learning-based NLP methods, and our method achieves the best segmentation accuracies (above 98%) on various test data with a proper training corpus.
format Preprint
id arxiv_https___arxiv_org_abs_2409_19796
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle Black-Box Segmentation of Electronic Medical Records
Yuan, Hongyi
Yu, Sheng
Computation and Language
Electronic medical records (EMRs) contain the majority of patients' healthcare details. It is an abundant resource for developing an automatic healthcare system. Most of the natural language processing (NLP) studies on EMR processing, such as concept extraction, are adversely affected by the inaccurate segmentation of EMR sections. At the same time, not enough attention has been given to the accurate sectioning of EMRs. The information that may occur in section structures is unvalued. This work focuses on the segmentation of EMRs and proposes a black-box segmentation method using a simple sentence embedding model and neural network, along with a proper training method. To achieve universal adaptivity, we train our model on the dataset with different section headings formats. We compare several advanced deep learning-based NLP methods, and our method achieves the best segmentation accuracies (above 98%) on various test data with a proper training corpus.
title Black-Box Segmentation of Electronic Medical Records
topic Computation and Language
url https://arxiv.org/abs/2409.19796