Bibliographic Details
Main Authors: Lin, Jonathan; Joshi, Aditya; Paik, Hye-young; Doung, Tri Dung; Gurdasani, Deepti
Format: Preprint
Published: 2025
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2501.11440
author Lin, Jonathan
Joshi, Aditya
Paik, Hye-young
Doung, Tri Dung
Gurdasani, Deepti
contents Geocoding involves automatic extraction of location coordinates of incidents reported in news articles, and can be used for epidemic intelligence or disaster management. This paper introduces Retrieval-Augmented Coordinate Capture Of Online News articles (RACCOON), an open-source geocoding approach that extracts geolocations from news articles. RACCOON uses a retrieval-augmented generation (RAG) approach in which candidate locations and associated information are retrieved as context from a location database, and a prompt containing the retrieved context, the location mentions and the news article is fed to an LLM to generate the location coordinates. Our evaluation on three datasets, with two underlying LLMs, three baselines and several ablation tests based on the components of RACCOON, demonstrates the utility of RACCOON. To the best of our knowledge, RACCOON is the first RAG-based approach for geocoding using pre-trained LLMs.
format Preprint
id arxiv_https___arxiv_org_abs_2501_11440
institution arXiv
publishDate 2025
record_format arxiv
title RACCOON: A Retrieval-Augmented Generation Approach for Location Coordinate Capture from News Articles
topic Computation and Language
url https://arxiv.org/abs/2501.11440
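The retrieve-then-prompt flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Place` record, the toy `GAZETTEER`, the prompt wording, the exact-name matching rule and the `llm` callable are all hypothetical stand-ins chosen for illustration, and the coordinates are approximate.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Place:
    """One entry of a location database (hypothetical schema)."""
    name: str
    country: str
    lat: float
    lon: float


# Toy stand-in for a location database; entries are illustrative only.
GAZETTEER = [
    Place("Paris", "France", 48.8566, 2.3522),
    Place("Paris", "United States", 33.6609, -95.5555),
    Place("Sydney", "Australia", -33.8688, 151.2093),
]


def retrieve_candidates(mention: str, gazetteer=GAZETTEER, limit: int = 5) -> List[Place]:
    """Retrieval step: return entries whose name matches the mention (case-insensitive)."""
    key = mention.strip().casefold()
    return [p for p in gazetteer if p.name.casefold() == key][:limit]


def build_prompt(article: str, mentions: List[str], gazetteer=GAZETTEER) -> str:
    """Augmentation step: combine retrieved context, mentions and the article."""
    lines = ["Candidate locations:"]
    for mention in mentions:
        for p in retrieve_candidates(mention, gazetteer):
            lines.append(f"- {p.name}, {p.country}: ({p.lat}, {p.lon})")
    lines.append("Location mentions: " + ", ".join(mentions))
    lines.append("Article: " + article)
    lines.append("Answer with the coordinates as: lat, lon")
    return "\n".join(lines)


_COORD = re.compile(r"(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)")


def parse_coordinates(text: str) -> Optional[Tuple[float, float]]:
    """Extract the first 'lat, lon' pair from the model output, if it is in range."""
    match = _COORD.search(text)
    if match is None:
        return None
    lat, lon = float(match.group(1)), float(match.group(2))
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    return lat, lon


def geocode(article: str, mentions: List[str], llm: Callable[[str], str]):
    """Generation step: `llm` is any function mapping a prompt string to a reply string."""
    return parse_coordinates(llm(build_prompt(article, mentions)))
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client; for example, `geocode(article, ["Paris"], my_model)` would work with any hypothetical `my_model` function that takes a prompt and returns text.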