Vista Equipo: :: Library Catalog

Guardado en:

Detalles Bibliográficos
Autores principales:	Barayan, Abdullah, Camacho-Collados, Jose, Alva-Manchego, Fernando
Formato:	Preprint
Publicado:	2024
Materias:	Computation and Language
Acceso en línea:	https://arxiv.org/abs/2409.20246
Etiquetas:	Agregar Etiqueta Sin Etiquetas, Sea el primero en etiquetar este registro!

_version_	1866912157779623936
author	Barayan, Abdullah Camacho-Collados, Jose Alva-Manchego, Fernando
author_facet	Barayan, Abdullah Camacho-Collados, Jose Alva-Manchego, Fernando
contents	Readability-controlled text simplification (RCTS) rewrites texts to lower readability levels while preserving their meaning. RCTS models often depend on parallel corpora with readability annotations on both source and target sides. Such datasets are scarce and difficult to curate, especially at the sentence level. To reduce reliance on parallel data, we explore using instruction-tuned large language models for zero-shot RCTS. Through automatic and manual evaluations, we examine: (1) how different types of contextual information affect a model's ability to generate sentences with the desired readability, and (2) the trade-off between achieving target readability and preserving meaning. Results show that all tested models struggle to simplify sentences (especially to the lowest levels) due to models' limitations and characteristics of the source sentences that impede adequate rewriting. Our experiments also highlight the need for better automatic evaluation metrics tailored to RCTS, as standard ones often misinterpret common simplification operations, and inaccurately assess readability and meaning preservation.
format	Preprint
id	arxiv_https___arxiv_org_abs_2409_20246
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Analysing Zero-Shot Readability-Controlled Sentence Simplification Barayan, Abdullah Camacho-Collados, Jose Alva-Manchego, Fernando Computation and Language Readability-controlled text simplification (RCTS) rewrites texts to lower readability levels while preserving their meaning. RCTS models often depend on parallel corpora with readability annotations on both source and target sides. Such datasets are scarce and difficult to curate, especially at the sentence level. To reduce reliance on parallel data, we explore using instruction-tuned large language models for zero-shot RCTS. Through automatic and manual evaluations, we examine: (1) how different types of contextual information affect a model's ability to generate sentences with the desired readability, and (2) the trade-off between achieving target readability and preserving meaning. Results show that all tested models struggle to simplify sentences (especially to the lowest levels) due to models' limitations and characteristics of the source sentences that impede adequate rewriting. Our experiments also highlight the need for better automatic evaluation metrics tailored to RCTS, as standard ones often misinterpret common simplification operations, and inaccurately assess readability and meaning preservation.
title	Analysing Zero-Shot Readability-Controlled Sentence Simplification
topic	Computation and Language
url	https://arxiv.org/abs/2409.20246

Ejemplares similares