Bibliographic Details
Main Authors: Ishlach, Koren; Ben-David, Itzhak; Fire, Michael; Rokach, Lior
Format: Preprint
Published: 2024
Subjects: Computation and Language; Artificial Intelligence; Social and Information Networks
Online Access: https://arxiv.org/abs/2405.13071
author Ishlach, Koren
Ben-David, Itzhak
Fire, Michael
Rokach, Lior
contents Embedding news articles is a crucial tool for multiple fields, such as media bias detection, identifying fake news, and making news recommendations. However, existing news embedding methods are not optimized to capture the latent context of news events. Most embedding methods rely on full-text information and neglect time-relevant embedding generation. In this paper, we propose a novel lightweight method that optimizes news embedding generation by focusing on entities and themes mentioned in articles and their historical connections to specific events. We suggest a method composed of three stages. First, we process and extract events, entities, and themes from the given news articles. Second, we generate periodic time embeddings for themes and entities by training time-separated GloVe models on current and historical data. Lastly, we concatenate the news embeddings generated by two distinct approaches: Smooth Inverse Frequency (SIF) for article-level vectors and Siamese Neural Networks for embeddings with nuanced event-related information. We leveraged over 850,000 news articles and 1,000,000 events from the GDELT project to test and evaluate our method. We conducted a comparative analysis of different news embedding generation methods for validation. Our experiments demonstrate that our approach can both improve and outperform state-of-the-art methods on shared event detection tasks.
format Preprint
id arxiv_https___arxiv_org_abs_2405_13071
institution arXiv
publishDate 2024
record_format arxiv
title A Novel Method for News Article Event-Based Embedding
topic Computation and Language
Artificial Intelligence
Social and Information Networks
url https://arxiv.org/abs/2405.13071
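
The abstract in this record mentions Smooth Inverse Frequency (SIF) for article-level vectors and a final concatenation with event-related embeddings. The following is a minimal sketch of those two steps only, not the authors' implementation. It assumes the standard SIF recipe (weights `a / (a + p(w))`, then removal of the first principal component). The function names, the toy vocabulary, the random token vectors standing in for trained GloVe vectors, and the random `event_vecs` standing in for Siamese network output are all hypothetical.

```python
import numpy as np


def sif_embeddings(docs, vectors, word_prob, a=1e-3):
    """Standard SIF: weighted average of token vectors, then remove the
    projection onto the first principal component of the document matrix.

    docs:      list of token lists (e.g. entities and themes per article)
    vectors:   dict token -> 1-D numpy array (stand-in for GloVe vectors)
    word_prob: dict token -> estimated unigram probability p(w)
    a:         smoothing constant (assumed value, not taken from the paper)
    """
    dim = len(next(iter(vectors.values())))
    rows = []
    for tokens in docs:
        known = [t for t in tokens if t in vectors]
        if not known:
            # No usable tokens: fall back to a zero vector.
            rows.append(np.zeros(dim))
            continue
        weights = np.array([a / (a + word_prob.get(t, 0.0)) for t in known])
        token_matrix = np.stack([vectors[t] for t in known])
        rows.append(weights @ token_matrix / len(known))
    emb = np.vstack(rows)
    # First right singular vector = dominant shared direction across documents.
    _, _, vt = np.linalg.svd(emb, full_matrices=False)
    first_pc = vt[0]
    return emb - np.outer(emb @ first_pc, first_pc)


def concat_embeddings(article_part, event_part):
    """Concatenate two per-article embedding matrices along the feature axis."""
    return np.concatenate([article_part, event_part], axis=1)


# Toy data only; none of these values come from GDELT or the paper.
rng = np.random.default_rng(0)
vocab = ["election", "protest", "minister", "flood", "paris", "budget"]
vectors = {token: rng.normal(size=4) for token in vocab}
word_prob = {token: 1.0 / len(vocab) for token in vocab}
docs = [
    ["election", "minister", "paris"],
    ["flood", "paris"],
    ["budget", "minister", "protest"],
]

article_vecs = sif_embeddings(docs, vectors, word_prob)
event_vecs = rng.normal(size=(len(docs), 3))  # placeholder for Siamese output
combined = concat_embeddings(article_vecs, event_vecs)
print(combined.shape)
```

In this sketch each article ends up with a 7-dimensional vector: 4 SIF dimensions followed by 3 placeholder event dimensions. Real dimensions, the value of `a`, and how `p(w)` is estimated would depend on the trained models and corpus.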