| Main Authors: | Ghosh, Adhiraj, Udandarao, Vishaal, Nguyen, Thao, Farina, Matteo, Cherti, Mehdi, Jitsev, Jenia, Oh, Sewoong, Ricci, Elisa, Schmidt, Ludwig, Bethge, Matthias |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.20643 |
Similar Items
A Good CREPE needs more than just Sugar: Investigating Biases in Compositional Vision-Language Benchmarks
by: Udandarao, Vishaal, et al.
Published: (2025)
ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities
by: Ghosh, Adhiraj, et al.
Published: (2024)
No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
by: Udandarao, Vishaal, et al.
Published: (2024)
A Practitioner's Guide to Continual Multimodal Pretraining
by: Roth, Karsten, et al.
Published: (2024)
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models
by: Nezhurina, Marianna, et al.
Published: (2024)
Reproducible scaling laws for contrastive language-image learning
by: Cherti, Mehdi, et al.
Published: (2022)
Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets
by: Nezhurina, Marianna, et al.
Published: (2025)
Inverse Deep Learning Ray Tracing for Heliostat Surface Prediction
by: Lewen, Jan, et al.
Published: (2024)
Scalable heliostat surface predictions from focal spots: Sim-to-Real transfer of inverse Deep Learning Raytracing
by: Lewen, Jan, et al.
Published: (2025)
A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility
by: Hochlehnert, Andreas, et al.
Published: (2025)
Solving Spatial Supersensing Without Spatial Supersensing
by: Udandarao, Vishaal, et al.
Published: (2025)
Efficient Lifelong Model Evaluation in an Era of Rapid Progress
by: Prabhu, Ameya, et al.
Published: (2024)
CiteME: Can Language Models Accurately Cite Scientific Claims?
by: Press, Ori, et al.
Published: (2024)
Resolving Discrepancies in Compute-Optimal Scaling of Language Models
by: Porian, Tomer, et al.
Published: (2024)
How to Merge Your Multimodal Models Over Time?
by: Dziadzio, Sebastian, et al.
Published: (2024)
Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs
by: Schuhmann, Christoph, et al.
Published: (2025)
Recycling the Web: A Method to Enhance Pre-training Data Quality and Quantity for Language Models
by: Nguyen, Thao, et al.
Published: (2025)
Better Alignment with Instruction Back-and-Forth Translation
by: Nguyen, Thao, et al.
Published: (2024)
Game Reasoning Arena: A Framework and Benchmark for Assessing Reasoning Capabilities of Large Language Models via Game Play
by: Cipolina-Kun, Lucia, et al.
Published: (2025)
AudioToolAgent: An Agentic Framework for Audio-Language Models
by: Wijngaard, Gijs, et al.
Published: (2025)
Multilingual Diversity Improves Vision-Language Representations
by: Nguyen, Thao, et al.
Published: (2024)
Improving Performance, Robustness, and Fairness of Radiographic AI Models with Finely-Controllable Synthetic Data
by: Moroianu, Stefania L., et al.
Published: (2025)
LLM generation novelty through the lens of semantic similarity
by: Davydov, Philipp, et al.
Published: (2025)
Data-Centric Lessons To Improve Speech-Language Pretraining
by: Udandarao, Vishaal, et al.
Published: (2025)
Rethinking Few-Shot Adaptation of Vision-Language Models in Two Stages
by: Farina, Matteo, et al.
Published: (2025)
Linear Model Merging Unlocks Simple and Scalable Multimodal Data Mixture Optimization
by: Berasi, Davide, et al.
Published: (2026)
Paradoxes of Social Capital
by: Cherti, Myriam
Published: (2010)
Portfolio Optimization Proxies under Label Scarcity and Regime Shifts via Bayesian and Deterministic Students under Semi-Supervised Sandwich Training
by: Chattopadhyay, Adhiraj
Published: (2026)
Equivariance by Contrast: Identifiable Equivariant Embeddings from Unlabeled Finite Group Actions
by: Schmidt, Tobias, et al.
Published: (2025)
Large Multimodal Models as General In-Context Classifiers
by: Garosi, Marco, et al.
Published: (2026)
Not Only Text: Exploring Compositionality of Visual Representations in Vision-Language Models
by: Berasi, Davide, et al.
Published: (2025)
Frustratingly Easy Test-Time Adaptation of Vision-Language Models
by: Farina, Matteo, et al.
Published: (2024)
Learning in Compact Spaces with Approximately Normalized Transformer
by: Franke, Jörg K. H., et al.
Published: (2025)
Sampling from Your Language Model One Byte at a Time
by: Hayase, Jonathan, et al.
Published: (2025)
Pretraining Frequency Predicts Compositional Generalization of CLIP on Real-World Tasks
by: Wiedemer, Thaddäus, et al.
Published: (2025)
Reflecting on the State of Rehearsal-free Continual Learning with Pretrained Models
by: Thede, Lukas, et al.
Published: (2024)
Investigating Continual Pretraining in Large Language Models: Insights and Implications
by: Yıldız, Çağatay, et al.
Published: (2024)
PairAlign: A Framework for Sequence Tokenization via Self-Alignment with Applications to Audio Tokenization
by: Banerjee, Adhiraj, et al.
Published: (2026)
Dissipative relativistic fluid flow: A simple Lorentz invariant causal model capturing entropy shocks in its zero viscosity limit
by: Reintjes, Moritz, et al.
Published: (2024)
CodecSep: Prompt-Driven Universal Sound Separation on Neural Audio Codec Latents
by: Banerjee, Adhiraj, et al.
Published: (2025)