| Main Authors: | Nuyts, Loren, Davis, Jesse |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.20821 |
Similar Items
How Powerful are Decoder-Only Transformer Neural Models?
by: Roberts, Jesse
Published: (2023)
Biases in Expected Goals Models Confound Finishing Ability
by: Davis, Jesse, et al.
Published: (2024)
Language Models Improve When Pretraining Data Matches Target Tasks
by: Mizrahi, David, et al.
Published: (2025)
Targeted Learning for Variable Importance
by: Wang, Xiaohan, et al.
Published: (2024)
How Do Transformers Learn Variable Binding in Symbolic Programs?
by: Wu, Yiwei, et al.
Published: (2025)
RL + Transformer = A General-Purpose Problem Solver
by: Rentschler, Micah, et al.
Published: (2025)
Knowing What You Cannot Explain: Learning to Reject Low-Quality Explanations
by: Stradiotti, Luca, et al.
Published: (2025)
Faster Repeated Evasion Attacks in Tree Ensembles
by: Cascioli, Lorenzo, et al.
Published: (2024)
When do spectral gradient updates help in deep learning?
by: Davis, Damek, et al.
Published: (2025)
Measurement as Bricolage: Examining How Data Scientists Construct Target Variables for Predictive Modeling Tasks
by: Guerdan, Luke, et al.
Published: (2025)
Stop Guessing: Optimizing Goalkeeper Policies for Soccer Penalty Kicks
by: Bransen, Lotte, et al.
Published: (2025)
AttentionSmithy: A Modular Framework for Rapid Transformer Development and Customization
by: Cranney, Caleb, et al.
Published: (2025)
Synthetic Augmentation in Imbalanced Learning: When It Helps, When It Hurts, and How Much to Add
by: Ma, Zhengchi, et al.
Published: (2026)
Dynamics of Transient Structure in In-Context Linear Regression Transformers
by: Carroll, Liam, et al.
Published: (2025)
Breaking Symmetry When Training Transformers
by: Zuo, Chunsheng, et al.
Published: (2024)
Deep Neural Network Benchmarks for Selective Classification
by: Pugnana, Andrea, et al.
Published: (2024)
Bounded-Abstention Multi-horizon Time-series Forecasting
by: Stradiotti, Luca, et al.
Published: (2026)
Foundation Models in Radiology: What, How, When, Why and Why Not
by: Paschali, Magdalini, et al.
Published: (2024)
When and How to Canonize: A Generalization Perspective
by: Sverdlov, Yonatan, et al.
Published: (2026)
Transformed Latent Variable Multi-Output Gaussian Processes
by: Jiang, Xiaoyu, et al.
Published: (2026)
When Context Is Not Enough: Modeling Unexplained Variability in Car-Following Behavior
by: Zhang, Chengyuan, et al.
Published: (2025)
When accurate prediction models yield harmful self-fulfilling prophecies
by: van Amsterdam, Wouter A. C., et al.
Published: (2023)
When LRP Diverges from Leave-One-Out in Transformers
by: You, Weiqiu, et al.
Published: (2025)
When and How Does In-Distribution Label Help Out-of-Distribution Detection?
by: Du, Xuefeng, et al.
Published: (2024)
Transformer Learns Optimal Variable Selection in Group-Sparse Classification
by: Zhang, Chenyang, et al.
Published: (2025)
Trapped by simplicity: When Transformers fail to learn from noisy features
by: Peters, Evan, et al.
Published: (2026)
When Can Transformers Count to n?
by: Yehudai, Gilad, et al.
Published: (2024)
Anti Mode-Collapse in Mean-Field Transformer via Auxiliary Variables
by: Imaizumi, Masaaki, et al.
Published: (2026)
Dealing with Uncertainty in Contextual Anomaly Detection
by: Bindini, Luca, et al.
Published: (2025)
When and How Long? The Readout-Mediator Angle in Temporal Reasoning
by: Fadnavis, Shreyas, et al.
Published: (2026)
When and How to Fool Explainable Models (and Humans) with Adversarial Examples
by: Vadillo, Jon, et al.
Published: (2021)
Transformers learn factored representations
by: Shai, Adam, et al.
Published: (2026)
When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective
by: Mousavi-Hosseini, Alireza, et al.
Published: (2025)
How do Transformers Learn Implicit Reasoning?
by: Ye, Jiaran, et al.
Published: (2025)
Removing Neural Signal Artifacts with Autoencoder-Targeted Adversarial Transformers (AT-AT)
by: Choi, Benjamin J.
Published: (2025)
FairTargetSim: An Interactive Simulator for Understanding and Explaining the Fairness Effects of Target Variable Definition
by: Gala, Dalia, et al.
Published: (2024)
Probabilistic Transformers for Joint Modeling of Global Weather Dynamics and Decision-Centric Variables
by: Rauba, Paulius, et al.
Published: (2026)
M3PT: A Transformer for Multimodal, Multi-Party Social Signal Prediction with Person-aware Blockwise Attention
by: Tang, Yiming, et al.
Published: (2025)
When to Commit? Towards Variable-Size Self-Contained Blocks for Discrete Diffusion Language Models
by: Wang, Danny, et al.
Published: (2026)
Efficient Latent Variable Causal Discovery: Combining Score Search and Targeted Testing
by: Ramsey, Joseph, et al.
Published: (2025)