| Main Authors: | Dorner, Florian E., Hardt, Moritz |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.02249 |
Similar Items
How Benchmark Prediction from Fewer Data Misses the Mark
by: Zhang, Guanhua, et al.
Published: (2025)
Limits to scalable evaluation at the frontier: LLM as Judge won't beat twice the data
by: Dorner, Florian E., et al.
Published: (2024)
Training on the Test Task Confounds Evaluation and Emergence
by: Dominguez-Olmedo, Ricardo, et al.
Published: (2024)
Don't Fool Me Twice: Adapting to Adversity in the Wild with Experience-Driven Reasoning
by: Ravie, Navin Sriram, et al.
Published: (2026)
Don't Look Twice: Faster Video Transformers with Run-Length Tokenization
by: Choudhury, Rohan, et al.
Published: (2024)
Don't Think It Twice: Exploit Shift Invariance for Efficient Online Streaming Inference of CNNs
by: Kechris, Christodoulos, et al.
Published: (2024)
Balancing Label Quantity and Quality for Scalable Elicitation
by: Mallen, Alex, et al.
Published: (2024)
Inherent Trade-Offs between Diversity and Stability in Multi-Task Benchmarks
by: Zhang, Guanhua, et al.
Published: (2024)
Experts Don't Cheat: Learning What You Don't Know By Predicting Pairs
by: Johnson, Daniel D., et al.
Published: (2024)
Good Allocations from Bad Estimates
by: Casacuberta, Sílvia, et al.
Published: (2026)
Test-Time Training on Nearest Neighbors for Large Language Models
by: Hardt, Moritz, et al.
Published: (2023)
Do causal predictors generalize better to new domains?
by: Nastl, Vivian Y., et al.
Published: (2024)
Performative Prediction: Past and Future
by: Hardt, Moritz, et al.
Published: (2023)
Is your model predicting the past?
by: Hardt, Moritz, et al.
Published: (2022)
ImageNot: A contrast with ImageNet preserves model rankings
by: Salaudeen, Olawale, et al.
Published: (2024)
What Makes ImageNet Look Unlike LAION
by: Shirali, Ali, et al.
Published: (2023)
Don't Forget Imagination!
by: Vityaev, Evgenii E., et al.
Published: (2025)
Unprocessing Seven Years of Algorithmic Fairness
by: Cruz, André F., et al.
Published: (2023)
If You Don't Understand It, Don't Use It: Eliminating Trojans with Filters Between Layers
by: Hernandez, Adriano
Published: (2024)
Don't Rank, Combine! Combining Machine Translation Hypotheses Using Quality Estimation
by: Vernikos, Giorgos, et al.
Published: (2024)
Don't Freeze, Don't Crash: Extending the Safe Operating Range of Neural Navigation in Dense Crowds
by: Zhang, Jiefu, et al.
Published: (2026)
Computational Arbitrage in AI Model Markets
by: Olmedo, Ricardo, et al.
Published: (2026)
Position: Don't be Afraid of Over-Smoothing And Over-Squashing
by: Kormann, Niklas, et al.
Published: (2026)
When Models Don't Collapse: On the Consistency of Iterative MLE
by: Barzilai, Daniel, et al.
Published: (2025)
Grow, Don't Overwrite: Fine-tuning Without Forgetting
by: Adila, Dyah, et al.
Published: (2026)
Allocation Requires Prediction Only if Inequality Is Low
by: Shirali, Ali, et al.
Published: (2024)
Leaderboard Incentives: Model Rankings under Strategic Post-Training
by: Chen, Yatong, et al.
Published: (2026)
Train-before-Test Harmonizes Language Model Rankings
by: Zhang, Guanhua, et al.
Published: (2025)
Don't Stop Me Now: Embedding Based Scheduling for LLMs
by: Shahout, Rana, et al.
Published: (2024)
We Still Don't Understand High-Dimensional Bayesian Optimization
by: Doumont, Colin, et al.
Published: (2025)
ReducedLUT: Table Decomposition with "Don't Care" Conditions
by: Cassidy, Oliver, et al.
Published: (2024)
Position: Measure Dataset Diversity, Don't Just Claim It
by: Zhao, Dora, et al.
Published: (2024)
Don't Explain Noise: Robust Counterfactuals for Randomized Ensembles
by: Forel, Alexandre, et al.
Published: (2022)
Transformers Don't In-Context Learn Least Squares Regression
by: Hill, Joshua, et al.
Published: (2025)
Don't Walk the Line: Boundary Guidance for Filtered Generation
by: Ball, Sarah, et al.
Published: (2025)
Don’t Test Twice, It’s All Right
by: Zoe Raglow, et al.
Published: (2025)
Limits to Predicting Online Speech Using Large Language Models
by: Remeli, Mina, et al.
Published: (2024)
Evaluating language models as risk scores
by: Cruz, André F., et al.
Published: (2024)
Don't Forget the Nonlinearity: Unlocking Activation Functions in Efficient Fine-Tuning
by: Yin, Bo, et al.
Published: (2025)
LoRA and Privacy: When Random Projections Help (and When They Don't)
by: Hu, Yaxi, et al.
Published: (2026)