:: Library Catalog

Bibliographic Details
Main Authors:	Zve, Evangelia; Icard, Benjamin; Breton, Alice; Sainero, Lila; Bourgne, Gauvain; Ganascia, Jean-Gabriel
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2509.22030

Similar Items

From Noise to Signal: When Outliers Seed New Topics
by: Zve, Evangelia, et al.
Published: (2026)

Embedding Style Beyond Topics: Analyzing Dispersion Effects Across Different Language Models
by: Icard, Benjamin, et al.
Published: (2025)

Measuring Embedding Sensitivity to Authorial Style in French: Comparing Literary Texts with Language Model Rewritings
by: Icard, Benjamin, et al.
Published: (2026)

Reliable News or Propagandist News? A Neurosymbolic Model Using Genre, Topic, and Persuasion Techniques to Improve Robustness in Classification
by: Faye, Géraud, et al.
Published: (2026)

Semiparametric Latent Topic Modeling on Consumer-Generated Corpora
by: Dayta, Dominic B., et al.
Published: (2021)

New Textual Corpora for Serbian Language Modeling
by: Škorić, Mihailo, et al.
Published: (2024)

An Argumentative Explanation Framework for Generalized Reason Model with Inconsistent Precedents
by: Fungwacharakorn, Wachara, et al.
Published: (2025)

Bidirectional Topic Matching: Quantifying Thematic Overlap Between Corpora Through Topic Modelling
by: Adam, Raven, et al.
Published: (2024)

Distributed Asymmetric Allocation: A Topic Model for Large Imbalanced Corpora in Social Sciences
by: Watanabe, Kohei
Published: (2025)

Identifying Narrative Patterns and Outliers in Holocaust Testimonies Using Topic Modeling
by: Ifergan, Maxim, et al.
Published: (2024)

Topics, Authors, and Institutions in Large Language Model Research: Trends from 17K arXiv Papers
by: Movva, Rajiv, et al.
Published: (2023)

HYBRINFOX at CheckThat! 2024 -- Task 1: Enhancing Language Models with Structured Information for Check-Worthiness Estimation
by: Faye, Géraud, et al.
Published: (2024)

Comparable Corpora: Opportunities for New Research Directions
by: Church, Kenneth
Published: (2025)

Bias in News Summarization: Measures, Pitfalls and Corpora
by: Steen, Julius, et al.
Published: (2023)

Anticipating Innovation Using Large Language Models
by: Fenoaltea, Enrico Maria, et al.
Published: (2026)

Internal Causal Mechanisms Robustly Predict Language Model Out-of-Distribution Behaviors
by: Huang, Jing, et al.
Published: (2025)

Anticipating Future with Large Language Model for Simultaneous Machine Translation
by: Ouyang, Siqi, et al.
Published: (2024)

A Recipe of Parallel Corpora Exploitation for Multilingual Large Language Models
by: Lin, Peiqin, et al.
Published: (2024)

TopicProphet: Prophesies on Temporal Topic Trends and Stocks
by: Kim, Olivia
Published: (2025)

Leveraging Large Language Models to Measure Gender Representation Bias in Gendered Language Corpora
by: Derner, Erik, et al.
Published: (2024)

Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling
by: Mu, Yida, et al.
Published: (2024)

EgyBERT: A Large Language Model Pretrained on Egyptian Dialect Corpora
by: Qarah, Faisal
Published: (2024)

Belief in the Machine: Investigating Epistemological Blind Spots of Language Models
by: Suzgun, Mirac, et al.
Published: (2024)

BERTrend: Neural Topic Modeling for Emerging Trends Detection
by: Boutaleb, Allaa, et al.
Published: (2024)

Topic Aware Probing: From Sentence Length Prediction to Idiom Identification how reliant are Neural Language Models on Topic?
by: Nedumpozhimana, Vasudevan, et al.
Published: (2024)

An action language-based formalisation of an abstract argumentation framework
by: Munro, Yann, et al.
Published: (2024)

How Causal Abstraction Underpins Computational Explanation
by: Geiger, Atticus, et al.
Published: (2025)

Systematic Outliers in Large Language Models
by: An, Yongqi, et al.
Published: (2025)

Grounding Synthetic Data Evaluations of Language Models in Unsupervised Document Corpora
by: Majurski, Michael, et al.
Published: (2025)

A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias
by: Xu, Yuemei, et al.
Published: (2024)

HYBRINFOX at CheckThat! 2024 -- Task 2: Enriching BERT Models with the Expert System VAGO for Subjectivity Detection
by: Casanova, Morgane, et al.
Published: (2024)

A Dynamic Logic for Information Evaluation in Intelligence
by: Icard, Benjamin
Published: (2024)

Data Caricatures: On the Representation of African American Language in Pretraining Corpora
by: Deas, Nicholas, et al.
Published: (2025)

A Multi-Label Dataset of French Fake News: Human and Machine Insights
by: Icard, Benjamin, et al.
Published: (2024)

A Socratic RAG Approach to Connect Natural Language Queries on Research Topics with Knowledge Organization Systems
by: Lefton, Lew, et al.
Published: (2025)

From Models to Microtheories: Distilling a Model's Topical Knowledge for Grounded Question Answering
by: Weir, Nathaniel, et al.
Published: (2024)

Evaluating Commercial AI Chatbots as News Intermediaries
by: Suzgun, Mirac, et al.
Published: (2026)

Interleaving Logic and Counting
by: van Benthem, Johan, et al.
Published: (2025)

SaudiBERT: A Large Language Model Pretrained on Saudi Dialect Corpora
by: Qarah, Faisal
Published: (2024)

Reap the Wild Wind: Detecting Media Storms in Large-Scale News Corpora
by: Markus, Dror K., et al.
Published: (2024)