Library Catalog

Bibliographic Details
Main Authors:	Spiesberger, Ari, Vazquez, Juan J., Pochinkov, Nicky, Gavenčiak, Tomáš, Grietzer, Peli, Leech, Gavin, Schoots, Nandi
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.12413

Similar Items

AI-AI Bias: large language models favor communications generated by large language models
by: Laurito, Walter, et al.
Published: (2024)

Dissecting Language Models: Machine Unlearning via Selective Pruning
by: Pochinkov, Nicholas, et al.
Published: (2024)

Training Neural Networks for Modularity aids Interpretability
by: Golechha, Satvik, et al.
Published: (2024)

Relating Piecewise Linear Kolmogorov Arnold Networks to ReLU Networks
by: Schoots, Nandi, et al.
Published: (2025)

Studying Cross-cluster Modularity in Neural Networks
by: Golechha, Satvik, et al.
Published: (2025)

The Propensity for Density in Feed-forward Models
by: Schoots, Nandi, et al.
Published: (2024)

Extending Activation Steering to Broad Skills and Multiple Behaviours
by: van der Weij, Teun, et al.
Published: (2024)

Cross-Session Threats in AI Agents: Benchmark, Evaluation, and Algorithms
by: Azarafrooz, Ari
Published: (2026)

On The Fragility of Benchmark Contamination Detection in Reasoning Models
by: Wang, Han, et al.
Published: (2025)

LLM Benchmark Datasets Should Be Contamination-Resistant
by: Al-Lawati, Ali, et al.
Published: (2026)

On Stronger Computational Separations Between Multimodal and Unimodal Machine Learning
by: Karchmer, Ari
Published: (2024)

Deep Minds and Shallow Probes
by: Lee, Su Hyeong, et al.
Published: (2026)

LiveBench: A Challenging, Contamination-Limited LLM Benchmark
by: White, Colin, et al.
Published: (2024)

Beyond Tokens in Language Models: Interpreting Activations through Text Genre Chunks
by: Benito-Rodriguez, Éloïse, et al.
Published: (2025)

Towards Initialization-dependent and Non-vacuous Generalization Bounds for Overparameterized Shallow Neural Networks
by: Lei, Yunwen, et al.
Published: (2026)

The Emperor's New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination
by: Sun, Yifan, et al.
Published: (2025)

Soft-ECM: An extension of Evidential C-Means for complex data
by: Soubeiga, Armel, et al.
Published: (2025)

Search-Time Data Contamination
by: Han, Ziwen, et al.
Published: (2025)

Can Generative Artificial Intelligence Survive Data Contamination? Theoretical Guarantees under Contaminated Recursive Training
by: Wang, Kevin, et al.
Published: (2026)

ParaScopes: What do Language Models Activations Encode About Future Text?
by: Pochinkov, Nicky, et al.
Published: (2025)

From Shallow Bayesian Neural Networks to Gaussian Processes: General Convergence, Identifiability and Scalable Inference
by: de Araújo, Gracielle Antunes, et al.
Published: (2026)

Probabilistic Dreaming for World Models
by: Wong, Gavin
Published: (2026)

The Impact of Post-training on Data Contamination
by: Kocyigit, Muhammed Yusuf, et al.
Published: (2026)

A Generic Machine Learning Framework for Fully-Unsupervised Anomaly Detection with Contaminated Data
by: Ulmer, Markus, et al.
Published: (2023)

LLM Probability Concentration: How Alignment Shrinks the Generative Horizon
by: Yang, Chenghao, et al.
Published: (2025)

MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark
by: Zhao, Qihao, et al.
Published: (2024)

State Contamination in Memory-Augmented LLM Agents
by: Wang, Yian, et al.
Published: (2026)

How Contaminated Is Your Benchmark? Quantifying Dataset Leakage in Large Language Models with Kernel Divergence
by: Choi, Hyeong Kyu, et al.
Published: (2025)

Language Generation with Infinite Contamination
by: Mehrotra, Anay, et al.
Published: (2025)

Proposing Hierarchical Goal-Conditioned Policy Planning in Multi-Goal Reinforcement Learning
by: Rens, Gavin B.
Published: (2025)

Forking Paths in Neural Text Generation
by: Bigelow, Eric, et al.
Published: (2024)

XAI-Units: Benchmarking Explainability Methods with Unit Tests
by: Lee, Jun Rui, et al.
Published: (2025)

BoTTA: Benchmarking on-device Test Time Adaptation
by: Danilowski, Michal, et al.
Published: (2025)

Learning Diverse Policies with Soft Self-Generated Guidance
by: Wang, Guojian, et al.
Published: (2024)

Online Detection of Water Contamination Under Concept Drift
by: Li, Jin, et al.
Published: (2025)

Recent Advances in Large Langauge Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation
by: Chen, Simin, et al.
Published: (2025)

Data Contamination Quiz: A Tool to Detect and Estimate Contamination in Large Language Models
by: Golchin, Shahriar, et al.
Published: (2023)

Quotient Geometry, Effective Curvature, and Implicit Bias in Simple Shallow Neural Networks
by: Dong, Hang-Cheng, et al.
Published: (2026)

The Spectral Bias of Shallow Neural Network Learning is Shaped by the Choice of Non-linearity
by: Sahs, Justin, et al.
Published: (2025)

Deep Positive-Unlabeled Anomaly Detection for Contaminated Unlabeled Data
by: Takahashi, Hiroshi, et al.
Published: (2024)