| Main Authors: | Estève, Louis, Servan, Christophe, Lavergne, Thomas, Savary, Agata |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.22014 |
Similar Items
Formalising lexical and syntactic diversity for data sampling in French
by: Estève, Louis, et al.
Published: (2025)
Patent Language Model Pretraining with ModernBERT
by: Yousefiramandi, Amirhossein, et al.
Published: (2025)
Chinese ModernBERT with Whole-Word Masking
by: Zhao, Zeyu, et al.
Published: (2025)
TabiBERT: A Large-Scale ModernBERT Foundation Model and A Unified Benchmark for Turkish
by: Türker, Melikşah, et al.
Published: (2025)
A survey of diversity quantification in natural language processing: The why, what, where and how
by: Estève, Louis, et al.
Published: (2025)
ModernBERT + ColBERT: Enhancing biomedical RAG through an advanced re-ranking retriever
by: Rivera, Eduardo Martínez, et al.
Published: (2025)
NorBERTo: A ModernBERT Model Trained for Portuguese with 331 Billion Tokens Corpus
by: Silva, Enzo S. N., et al.
Published: (2026)
Clinical ModernBERT: An efficient and long context encoder for biomedical text
by: Lee, Simon A., et al.
Published: (2025)
ModernBERT is More Efficient than Conventional BERT for Chest CT Findings Classification in Japanese Radiology Reports
by: Yamagishi, Yosuke, et al.
Published: (2025)
BioClinical ModernBERT: A State-of-the-Art Long-Context Encoder for Biomedical and Clinical NLP
by: Sounack, Thomas, et al.
Published: (2025)
Pretraining Finnish ModernBERTs
by: Reunamo, Akseli, et al.
Published: (2025)
llm-jp-modernbert: A ModernBERT Model Trained on a Large-Scale Japanese Corpus with Long Context Length
by: Sugiura, Issa, et al.
Published: (2025)
A Benchmark Evaluation of Clinical Named Entity Recognition in French
by: Bannour, Nesrine, et al.
Published: (2024)
ModernBERT or DeBERTaV3? Examining Architecture and Data Influence on Transformer Encoder Models Performance
by: Antoun, Wissam, et al.
Published: (2025)
Spatial ModernBERT: Spatial-Aware Transformer for Table and Key-Value Extraction in Financial Documents at Scale
by: Javis AI Team, et al.
Published: (2025)
New Semantic Task for the French Spoken Language Understanding MEDIA Benchmark
by: Alavoine, Nadège, et al.
Published: (2024)
LLM-based Atomic Propositions help weak extractors: Evaluation of a Propositioner for triplet extraction
by: Pommeret, Luc, et al.
Published: (2026)
Leveraging Information Retrieval to Enhance Spoken Language Understanding Prompts in Few-Shot Learning
by: Lepagnol, Pierre, et al.
Published: (2025)
CamemBERT 2.0: A Smarter French Language Model Aged to Perfection
by: Antoun, Wissam, et al.
Published: (2024)
FRASIMED: a Clinical French Annotated Resource Produced through Crosslingual BERT-Based Annotation Projection
by: Zaghir, Jamil, et al.
Published: (2023)
mALBERT: Is a Compact Multilingual BERT Model Still Worth It?
by: Servan, Christophe, et al.
Published: (2024)
An investigation of structures responsible for gender bias in BERT and DistilBERT
by: Leteno, Thibaud, et al.
Published: (2024)
How Gender Interacts with Political Values: A Case Study on Czech BERT Models
by: Ali, Adnan Al, et al.
Published: (2024)
Breaking MLPerf Training: A Case Study on Optimizing BERT
by: Kim, Yongdeok, et al.
Published: (2024)
m3BERT: A Modern, Multi-lingual, Matryoshka Bidirectional Encoder
by: Wang, Yaoxiang, et al.
Published: (2026)
Understanding the Interplay of Scale, Data, and Bias in Language Models: A Case Study with BERT
by: Ali, Muhammad, et al.
Published: (2024)
A Dataset for Pharmacovigilance in German, French, and Japanese: Annotating Adverse Drug Reactions across Languages
by: Raithel, Lisa, et al.
Published: (2024)
Construction Identification and Disambiguation Using BERT: A Case Study of NPN
by: Scivetti, Wesley, et al.
Published: (2025)
AraModernBERT: Transtokenized Initialization and Long-Context Encoder Modeling for Arabic
by: Elshehy, Omar, et al.
Published: (2026)
CamemBERT-bio: Leveraging Continual Pre-training for Cost-Effective Models on French Biomedical Data
by: Touchent, Rian, et al.
Published: (2023)
mmBERT: A Modern Multilingual Encoder with Annealed Language Learning
by: Marone, Marc, et al.
Published: (2025)
Low-Resource Dialect Adaptation of Large Language Models: A French Dialect Case-Study
by: Khan, Eeham, et al.
Published: (2025)
NeoBERT: A Next-Generation BERT
by: Breton, Lola Le, et al.
Published: (2025)
Emotionally Aware Moderation: The Potential of Emotion Monitoring in Shaping Healthier Social Media Conversations
by: Su, Xiaotian, et al.
Published: (2025)
Modeling the Construction of a Literary Archetype: The Case of the Detective Figure in French Literature
by: Barré, Jean, et al.
Published: (2025)
From BERT to T5: A Study of Named Entity Recognition
by: Jia, Mei
Published: (2026)
ConfliBERT: A Language Model for Political Conflict
by: Brandt, Patrick T., et al.
Published: (2024)
Histoires Morales: A French Dataset for Assessing Moral Alignment
by: Leteno, Thibaud, et al.
Published: (2025)
SpikeBERT: A Language Spikformer Learned from BERT with Knowledge Distillation
by: Lv, Changze, et al.
Published: (2023)
Attention on Multiword Expressions: A Multilingual Study of BERT-based Models with Regard to Idiomaticity and Microsyntax
by: Zaitova, Iuliia, et al.
Published: (2025)