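| Main Authors: | de Brito, Mariana Madruga; Madureira, Brielen; Carvalho, Taís Maria Nunes; Delforge, Damien; Jézéquel, Aglaé; Kurfalı, Murathan; Li, Ni; Messori, Gabriele; Nivre, Joakim; Pernici, Barbara; Speybroeck, Niko; Terzi, Stefano; Thiery, Wim; Valkenborg, Bram; Wang, Jingxian; Zahra, Shorouq; Zscheischler, Jakob; Sodoge, Jan |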
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Computation and Language |
| Online Access: | https://arxiv.org/abs/2605.20793 |
| _version_ | 1866910239839748096 |
|---|---|
| author | de Brito, Mariana Madruga; Madureira, Brielen; Carvalho, Taís Maria Nunes; Delforge, Damien; Jézéquel, Aglaé; Kurfalı, Murathan; Li, Ni; Messori, Gabriele; Nivre, Joakim; Pernici, Barbara; Speybroeck, Niko; Terzi, Stefano; Thiery, Wim; Valkenborg, Bram; Wang, Jingxian; Zahra, Shorouq; Zscheischler, Jakob; Sodoge, Jan |
| contents | Recent advances in natural language processing (NLP) and large language models (LLMs) have enabled the systematic use of large-scale textual data from news, social media, and reports to create datasets with socio-economic impacts of climate hazards such as floods, droughts, storms, and multi-hazard events. As the field of text-as-data for impact assessment expands, so does its methodological complexity. Yet research remains fragmented, with no clear guidelines for defining what constitutes an impact, handling temporal and spatial biases, and selecting appropriate modeling and post-processing strategies. This lack of coherence limits transparency and comparability across studies. Here, we address this gap by synthesising common practices, describing key challenges specific to the use of text-as-data methods for analyzing socio-economic impact data, and proposing recommendations to address them. By providing guidance on best practices, we aim to support the construction of robust text-derived socio-economic impact datasets that can more accurately inform disaster risk management and attribution studies. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2605_20793 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Assessing socio-economic climate impacts from text data |
| topic | Computation and Language |
| url | https://arxiv.org/abs/2605.20793 |