Bibliographic Details
Main Authors: Ollinger, Sandrine; Maurel, Denis
Format: Preprint
Published: 2024
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2407.19808
_version_ 1866916338745737216
author Ollinger, Sandrine
Maurel, Denis
author_facet Ollinger, Sandrine
Maurel, Denis
contents This paper presents a graph cascade for sentence segmentation of XML documents. Our proposal allows sentences nested inside sentences in cases introduced by quotation marks and hyphens, and pays particular attention to parenthetical insertions introduced by parentheses and to lists introduced by colons. We describe how the tool works, compare its results with those available in 2019 on the same dataset, and evaluate the system's performance on a test corpus.
format Preprint
id arxiv_https___arxiv_org_abs_2407_19808
institution arXiv
publishDate 2024
record_format arxiv
title Segmentation en phrases : ouvrez les guillemets sans perdre le fil [Sentence segmentation: open the quotation marks without losing the thread]
topic Computation and Language
url https://arxiv.org/abs/2407.19808