Staff View: Modified K-means with Cluster Assignment -- Application to COVID-19 Data :: Library Catalog

Bibliographic Details
Main Authors:	Rawat, Shreyash; Vijayarajan, V.; Prasath, V. B. Surya
Format:	Preprint
Published:	2024
Subjects:	Information Retrieval
Online Access:	https://arxiv.org/abs/2402.03380

_version_	1866929234661867520
author	Rawat, Shreyash Vijayarajan, V. Prasath, V. B. Surya
author_facet	Rawat, Shreyash Vijayarajan, V. Prasath, V. B. Surya
contents	Text extraction is a highly subjective problem that depends on the dataset one is working with and the kind of summarization details that need to be extracted. All the steps, from preprocessing the data to choosing an optimal model for predictions, depend on the problem and the corpus at hand. In this paper, we describe a text extraction model whose aim is to extract word-specific information relating to the semantics, so that we can get all related and meaningful information about that word in a succinct format. This model can obtain meaningful results and can augment a ubiquitous search model or a standard clustering or topic modelling algorithm. By utilizing a new technique, called the two cluster assignment technique, with the K-means model, we improved the ontology of the retrieved text. We further apply the vector average damping technique for flexible movement of clusters. Our experimental results on a recent Covid-19 corpus show that we obtain good results based on main keywords.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_03380
institution	arXiv
publishDate	2024
record_format	arxiv
title	Modified K-means with Cluster Assignment -- Application to COVID-19 Data
topic	Information Retrieval
url	https://arxiv.org/abs/2402.03380

Similar Items