| Main Authors: | Zve, Evangelia, Icard, Benjamin, Breton, Alice, Sainero, Lila, Bourgne, Gauvain, Ganascia, Jean-Gabriel |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.22030 |
Similar Items
From Noise to Signal: When Outliers Seed New Topics
by: Zve, Evangelia, et al.
Published: (2026)
Embedding Style Beyond Topics: Analyzing Dispersion Effects Across Different Language Models
by: Icard, Benjamin, et al.
Published: (2025)
Measuring Embedding Sensitivity to Authorial Style in French: Comparing Literary Texts with Language Model Rewritings
by: Icard, Benjamin, et al.
Published: (2026)
Reliable News or Propagandist News? A Neurosymbolic Model Using Genre, Topic, and Persuasion Techniques to Improve Robustness in Classification
by: Faye, Géraud, et al.
Published: (2026)
Semiparametric Latent Topic Modeling on Consumer-Generated Corpora
by: Dayta, Dominic B., et al.
Published: (2021)
New Textual Corpora for Serbian Language Modeling
by: Škorić, Mihailo, et al.
Published: (2024)
An Argumentative Explanation Framework for Generalized Reason Model with Inconsistent Precedents
by: Fungwacharakorn, Wachara, et al.
Published: (2025)
Bidirectional Topic Matching: Quantifying Thematic Overlap Between Corpora Through Topic Modelling
by: Adam, Raven, et al.
Published: (2024)
Distributed Asymmetric Allocation: A Topic Model for Large Imbalanced Corpora in Social Sciences
by: Watanabe, Kohei
Published: (2025)
Identifying Narrative Patterns and Outliers in Holocaust Testimonies Using Topic Modeling
by: Ifergan, Maxim, et al.
Published: (2024)
Topics, Authors, and Institutions in Large Language Model Research: Trends from 17K arXiv Papers
by: Movva, Rajiv, et al.
Published: (2023)
HYBRINFOX at CheckThat! 2024 -- Task 1: Enhancing Language Models with Structured Information for Check-Worthiness Estimation
by: Faye, Géraud, et al.
Published: (2024)
Comparable Corpora: Opportunities for New Research Directions
by: Church, Kenneth
Published: (2025)
Bias in News Summarization: Measures, Pitfalls and Corpora
by: Steen, Julius, et al.
Published: (2023)
Anticipating Innovation Using Large Language Models
by: Fenoaltea, Enrico Maria, et al.
Published: (2026)
Internal Causal Mechanisms Robustly Predict Language Model Out-of-Distribution Behaviors
by: Huang, Jing, et al.
Published: (2025)
Anticipating Future with Large Language Model for Simultaneous Machine Translation
by: Ouyang, Siqi, et al.
Published: (2024)
A Recipe of Parallel Corpora Exploitation for Multilingual Large Language Models
by: Lin, Peiqin, et al.
Published: (2024)
TopicProphet: Prophesies on Temporal Topic Trends and Stocks
by: Kim, Olivia
Published: (2025)
Leveraging Large Language Models to Measure Gender Representation Bias in Gendered Language Corpora
by: Derner, Erik, et al.
Published: (2024)
Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling
by: Mu, Yida, et al.
Published: (2024)
EgyBERT: A Large Language Model Pretrained on Egyptian Dialect Corpora
by: Qarah, Faisal
Published: (2024)
Belief in the Machine: Investigating Epistemological Blind Spots of Language Models
by: Suzgun, Mirac, et al.
Published: (2024)
BERTrend: Neural Topic Modeling for Emerging Trends Detection
by: Boutaleb, Allaa, et al.
Published: (2024)
Topic Aware Probing: From Sentence Length Prediction to Idiom Identification how reliant are Neural Language Models on Topic?
by: Nedumpozhimana, Vasudevan, et al.
Published: (2024)
An action language-based formalisation of an abstract argumentation framework
by: Munro, Yann, et al.
Published: (2024)
How Causal Abstraction Underpins Computational Explanation
by: Geiger, Atticus, et al.
Published: (2025)
Systematic Outliers in Large Language Models
by: An, Yongqi, et al.
Published: (2025)
Grounding Synthetic Data Evaluations of Language Models in Unsupervised Document Corpora
by: Majurski, Michael, et al.
Published: (2025)
A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias
by: Xu, Yuemei, et al.
Published: (2024)
HYBRINFOX at CheckThat! 2024 -- Task 2: Enriching BERT Models with the Expert System VAGO for Subjectivity Detection
by: Casanova, Morgane, et al.
Published: (2024)
A Dynamic Logic for Information Evaluation in Intelligence
by: Icard, Benjamin
Published: (2024)
Data Caricatures: On the Representation of African American Language in Pretraining Corpora
by: Deas, Nicholas, et al.
Published: (2025)
A Multi-Label Dataset of French Fake News: Human and Machine Insights
by: Icard, Benjamin, et al.
Published: (2024)
A Socratic RAG Approach to Connect Natural Language Queries on Research Topics with Knowledge Organization Systems
by: Lefton, Lew, et al.
Published: (2025)
From Models to Microtheories: Distilling a Model's Topical Knowledge for Grounded Question Answering
by: Weir, Nathaniel, et al.
Published: (2024)
Evaluating Commercial AI Chatbots as News Intermediaries
by: Suzgun, Mirac, et al.
Published: (2026)
Interleaving Logic and Counting
by: van Benthem, Johan, et al.
Published: (2025)
SaudiBERT: A Large Language Model Pretrained on Saudi Dialect Corpora
by: Qarah, Faisal
Published: (2024)
Reap the Wild Wind: Detecting Media Storms in Large-Scale News Corpora
by: Markus, Dror K., et al.
Published: (2024)