| Main Authors: | Washbourne, Robert, Iyer, Rishi, Figliolia, Tomas, Zheng, Henry, Lorig-Roach, Ryan, Yang, Sungyeon, Yuvraj, Pritish, Anthony, Quentin, Tokpanov, Yury, Yang, Xiao, Nanduru, Ganesh, Ebert, Stephen, Medepalli, Praneeth, Szot, Skyler, Rajagopal, Srivatsan, Ong, Alex, Mehta, Bhavana, Millidge, Beren |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.05365 |
Similar Items
Training Foundation Models on a Full-Stack AMD Platform: Compute, Networking, and System Design
by: Anthony, Quentin, et al.
Published: (2025)
Compressed Convolutional Attention: Efficient Attention in a Compressed Latent Space
by: Figliolia, Tomas, et al.
Published: (2025)
Online Vector Quantized Attention
by: Alonso, Nick, et al.
Published: (2026)
ZAYA1-VL-8B Technical Report
by: Shapourian, Hassan, et al.
Published: (2026)
Zyda-2: a 5 Trillion Token High-Quality Dataset
by: Tokpanov, Yury, et al.
Published: (2024)
BlackMamba: Mixture of Experts for State-Space Models
by: Anthony, Quentin, et al.
Published: (2024)
Toward Conversational Agents with Context and Time Sensitive Long-term Memory
by: Alonso, Nick, et al.
Published: (2024)
Hybrid Associative Memories
by: Lufkin, Leon, et al.
Published: (2026)
Generalising E-prop to Deep Networks
by: Millidge, Beren
Published: (2025)
Equivalence of Personalized PageRank and Successor Representations
by: Millidge, Beren
Published: (2025)
Zamba: A Compact 7B SSM Hybrid Model
by: Glorioso, Paolo, et al.
Published: (2024)
Zyda: A 1.3T Dataset for Open Language Modeling
by: Tokpanov, Yury, et al.
Published: (2024)
Mixture-of-PageRanks: Replacing Long-Context with Real-Time, Sparse GraphRAG
by: Alonso, Nicholas, et al.
Published: (2024)
The Zamba2 Suite: Technical Report
by: Glorioso, Paolo, et al.
Published: (2024)
ATLAS: Benchmarking and Adapting LLMs for Global Trade via Harmonized Tariff Code Classification
by: Yuvraj, Pritish, et al.
Published: (2025)
Exploring Action-Centric Representations Through the Lens of Rate-Distortion Theory
by: Varona, Miguel de Llanza, et al.
Published: (2024)
Translation, the ‘Folk Process’, and Socially Committed Songs of the 1960s
by: Kelly Washbourne
Published: (2013)
Associative Memories in the Feature Space
by: Salvatori, Tommaso, et al.
Published: (2024)
A Review of Neuroscience-Inspired Machine Learning
by: Ororbia, Alexander, et al.
Published: (2024)
Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters
by: Shyam, Vasudev, et al.
Published: (2024)
ZUNA: Flexible EEG Superresolution with Position-Aware Diffusion Autoencoders
by: Warner, Christopher, et al.
Published: (2026)
Predictive Coding beyond Correlations
by: Salvatori, Tommaso, et al.
Published: (2023)
A Calculus of Variations Approach to Stochastic Control
by: Lorig, Matthew
Published: (2025)
Optimal Control of the Ethena Yield-Bearing Stablecoin
by: Lorig, Matthew
Published: (2026)
Collective behavior from surprise minimization
by: Heins, Conor, et al.
Published: (2023)
Cognitively Inspired Energy-Based World Models
by: Gladstone, Alexi, et al.
Published: (2024)
Short-Rate-Dependent Volatility Models
by: Leung, Tim, et al.
Published: (2026)
Interest rate derivatives in a CTMC setting: pricing, replication and Ross recovery
by: Leung, Tim, et al.
Published: (2024)
LA TRANSICIÓN DEMOGRÁFICO-EPIDEMIOLÓGICA EN CHILE, 1960-2001
by: Jorge Szot Meza
Published: (2003)
An Overview of Recent Developments on Electrodes Modified with Bacteriophages
by: Katarzyna Szot‐Karpińska
Published: (2025)
Deconstructing Bias: A Multifaceted Framework for Diagnosing Cultural and Compositional Inequities in Text-to-Image Generative Models
by: Said, Muna Numan, et al.
Published: (2025)
Comparison of Radiofrequency Microneedling and Ultrasound Delivery of Plant‐Based Derived Secretory Factor (CFa1) Hair Serum for the Cosmetic Improvement of Androgenetic Alopecia
by: Lauren S. Mohan, et al.
Published: (2026)
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models
by: Bandarkar, Lucas, et al.
Published: (2024)
A characterization of finite étale morphisms in tensor triangular geometry
by: Sanders, Beren
Published: (2021)
The tensor triangular geometry of fully faithful functors
by: Sanders, Beren
Published: (2025)
Thermodynamic Gravity with Non-Extensive Horizon Entropy and Topological Calibration
by: Figliolia, Marco, et al.
Published: (2026)
The MUG-10 Framework for Preventing Usability Issues in Mobile Application Development
by: Weichbroth, Pawel, et al.
Published: (2025)
A Stable, Fast, and Fully Automatic Learning Algorithm for Predictive Coding Networks
by: Salvatori, Tommaso, et al.
Published: (2022)
Optimal Liquidation of Perpetual Contracts
by: Donnelly, Ryan, et al.
Published: (2026)
Optimal positioning in derivative securities in incomplete markets
by: Leung, Tim, et al.
Published: (2024)