Bibliographic Details
Main authors: Barbier, Victor; Jeangirard, Eric
Type: Preprint
Published: 2025
Subjects: Digital Libraries
Online access: https://arxiv.org/abs/2501.10035
_version_ 1866915107307520000
author Barbier, Victor
Jeangirard, Eric
author_facet Barbier, Victor
Jeangirard, Eric
contents This study introduces a novel methodology for mapping scientific communities at scale, addressing challenges associated with network analysis in large bibliometric datasets. By leveraging enriched publication metadata from the French research portal scanR and applying advanced filtering techniques to prioritize the strongest interactions between entities, we construct detailed, scalable network maps. These maps are enhanced through systematic disambiguation of authors, affiliations, and topics using persistent identifiers and specialized algorithms. The proposed framework integrates Elasticsearch for efficient data aggregation, Graphology for network spatialization (ForceAtlas2) and community detection (Louvain algorithm), and VOSviewer for network visualization. A Large Language Model (Mistral Nemo) is used to label the detected communities, and OpenAlex data enriches the results with estimated citation counts to detect hot topics. This scalable approach enables insightful exploration of research collaborations and thematic structures, with potential applications for strategic decision-making in science policy and funding. These web tools are effective at the global (national) scale but are also available (and can be integrated via iframes) at the scope of any French research institution (from large research organizations to individual laboratories). The scanR community analysis tool is available online at [https://scanr.enseignementsup-recherche.gouv.fr/networks/get-started](https://scanr.enseignementsup-recherche.gouv.fr/networks/get-started). All tools and methodologies are open source in the repository [https://github.com/dataesr/scanr-ui](https://github.com/dataesr/scanr-ui).
format Preprint
id arxiv_https___arxiv_org_abs_2501_10035
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Mapping scientific communities at scale
Barbier, Victor
Jeangirard, Eric
Digital Libraries
title Mapping scientific communities at scale
topic Digital Libraries
url https://arxiv.org/abs/2501.10035