| Main Authors: | Barmina, Gianluca, Norman, Nathalie Carmen Hau, Schneider-Kamp, Peter, Poech, Lukas Galke |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.04799 |
Similar Items
SDUs DAISY: A Benchmark for Danish Culture
by: Nielsen, Jacob, et al.
Published: (2026)
Confidence and Calibration of Activation Oracles for Reliable Interpretation of Language Model Internals
by: Torrielli, Federico, et al.
Published: (2026)
Training Language Models to Use Prolog as a Tool
by: Mellgren, Niklas, et al.
Published: (2025)
Isolating Culture Neurons in Multilingual Large Language Models
by: Namazifard, Danial, et al.
Published: (2025)
SommBench: Assessing Sommelier Expertise of Language Models
by: Brach, William, et al.
Published: (2026)
Chain of Summaries: Summarization Through Iterative Questioning
by: Brach, William, et al.
Published: (2025)
The Moltbook Files: A Harmless Slopocalypse or Humanity's Last Experiment
by: Brach, William, et al.
Published: (2026)
When are 1.58 bits enough? A Bottom-up Exploration of BitNet Quantization
by: Nielsen, Jacob, et al.
Published: (2024)
Emergent Languages in Populations of Language Model Agents: From Token Efficiency to Oversight Evasion
by: Beltoft, Stine Lyngsø, et al.
Published: (2026)
DeToNATION: Decoupled Torch Network-Aware Training on Interlinked Online Nodes
by: From, Mogens Henrik, et al.
Published: (2025)
ChronoMedKG: A Temporally-Grounded Biomedical Knowledge Graph and Benchmark for Clinical Reasoning
by: Ahmed, Md Shamim, et al.
Published: (2026)
Dynaword: From One-shot to Continuously Developed Datasets
by: Enevoldsen, Kenneth, et al.
Published: (2025)
MELA: Multilingual Evaluation of Linguistic Acceptability
by: Zhang, Ziyin, et al.
Published: (2023)
FlexMoRE: A Flexible Mixture of Rank-heterogeneous Experts for Efficient Federatedly-trained Large Language Models
by: Pirchert, Annemette Brok, et al.
Published: (2026)
The Provenance Gap in Clinical AI: Evidence-Traceable Temporal Knowledge Graphs for Rare Disease Reasoning
by: Ahmed, Md Shamim, et al.
Published: (2026)
Learning and communication pressures in neural networks: Lessons from emergent communication
by: Galke, Lukas, et al.
Published: (2024)
DaKultur: Evaluating the Cultural Awareness of Language Models for Danish with Native Speakers
by: Müller-Eberstein, Max, et al.
Published: (2025)
Isotropy Matters: Soft-ZCA Whitening of Embeddings for Semantic Code Search
by: Diera, Andor, et al.
Published: (2024)
Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models?
by: Nielsen, Jacob, et al.
Published: (2025)
Four Shades of Life Sciences: A Dataset for Disinformation Detection in the Life Sciences
by: Seidlmayer, Eva, et al.
Published: (2025)
QFrCoLA: a Quebec-French Corpus of Linguistic Acceptability Judgments
by: Beauchemin, David, et al.
Published: (2025)
What makes a language easy to deep-learn? Deep neural networks and humans similarly benefit from compositional structure
by: Galke, Lukas, et al.
Published: (2023)
Not Everything That Counts Can Be Counted: A Case for Safe Qualitative AI
by: Beltoft, Stine, et al.
Published: (2025)
Efficient Continual Learning for Small Language Models with a Discrete Key-Value Bottleneck
by: Diera, Andor, et al.
Published: (2024)
Tokenization and Morphology in Multilingual Language Models: A Comparative Analysis of mT5 and ByT5
by: Dang, Thao Anh, et al.
Published: (2024)
BitNet b1.58 Reloaded: State-of-the-art Performance Also on Smaller Networks
by: Nielsen, Jacob, et al.
Published: (2024)
Hesitation is defeat? Connecting Linguistic and Predictive Uncertainty
by: Manzo, Gianluca, et al.
Published: (2025)
DANSK and DaCy 2.6.0: Domain Generalization of Danish Named Entity Recognition
by: Enevoldsen, Kenneth, et al.
Published: (2024)
COLA-GEC: A Bidirectional Framework for Enhancing Grammatical Acceptability and Error Correction
by: Yang, Xiangyu, et al.
Published: (2025)
Context Is What You Need: The Maximum Effective Context Window for Real World Limits of LLMs
by: Paulsen, Norman
Published: (2025)
Exploring Linguistic Features for Turkish Text Readability
by: Uluslu, Ahmet Yavuz, et al.
Published: (2023)
If Probable, Then Acceptable? Understanding Conditional Acceptability Judgments in Large Language Models
by: Orth, Jasmin, et al.
Published: (2025)
DaLPSR: Leverage Degradation-Aligned Language Prompt for Real-World Image Super-Resolution
by: Jiang, Aiwen, et al.
Published: (2024)
Encoder vs Decoder: Comparative Analysis of Encoder and Decoder Language Models on Multilingual NLU Tasks
by: Nielsen, Dan Saattrup, et al.
Published: (2024)
Natural Language Processing RELIES on Linguistics
by: Opitz, Juri, et al.
Published: (2024)
Evaluation of a Sign Language Avatar on Comprehensibility, User Experience & Acceptability
by: Wasserroth, Fenya, et al.
Published: (2025)
We Should Evaluate Real-World Impact
by: Reiter, Ehud
Published: (2025)
Are We Really Making Much Progress in Text Classification? A Comparative Review
by: Galke, Lukas, et al.
Published: (2022)
A Study of the Plausibility of Attention between RNN Encoders in Natural Language Inference
by: Nguyen, Duc Hau, et al.
Published: (2025)
GuideWeb: A Benchmark for Automatic In-App Guide Generation on Real-World Web UIs
by: Gan, Chengguang, et al.
Published: (2026)