| Main author: | |
|---|---|
| Type: | Scientific article |
| Language: | en |
| Publication: | Sociedad Española para el Procesamiento del Lenguaje Natural, 2006 |
| Subjects: | |
| Online access: | https://www.redalyc.org/articulo.oa?id=515751737006 |
Summary:
- Title: A Comparative Study of Clustering Algorithms on Narrow-Domain Abstracts
- Authors: David Pinto; Paolo Rosso; Alfons Juan; Héctor Jiménez-Salazar
- Subjects: Computing; Narrow domain; Clustering of abstracts; Transition Point technique
- Abstract: Clustering abstracts of scientific texts from a very narrow domain is a major challenge. The first problem to address is the high overlap among the documents' vocabularies, compounded by the low frequency of these terms. The transition point technique has been used successfully in this area of Natural Language Processing (NLP). Its strength lies in the extraction of mid-frequency terms. Although the importance of these terms in NLP has long been known, how to extract them exactly remains unknown. In this paper we present an application of this technique as a feature selection technique on two corpora of very narrow domain. The experimental results show that the transition point technique obtains the best F-measure results with five different clustering methods.
- Year: 2006
- Type: scientific article
- ISSN: 1135-5948
- URL: https://www.redalyc.org/articulo.oa?id=515751737006
- Language: en
- Journal URL: http://www.redalyc.org/revista.oa?id=5157
- Journal: Procesamiento del Lenguaje Natural
- Format: application/pdf
- Publisher: Sociedad Española para el Procesamiento del Lenguaje Natural
- Issue: Procesamiento del Lenguaje Natural (Spain), No. 37