:: Library Catalog


Bibliographic Details
Main Authors:	Eyuboglu, Sabri, Ehrlich, Ryan, Arora, Simran, Guha, Neel, Zinsley, Dylan, Liu, Emily, Tennien, Will, Rudra, Atri, Zou, James, Mirhoseini, Azalia, Re, Christopher
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2506.06266

Similar Items

Simple linear attention language models balance the recall-throughput tradeoff
by: Arora, Simran, et al.
Published: (2024)

Constructing Efficient Fact-Storing MLPs for Transformers
by: Dugan, Owen, et al.
Published: (2025)

Just read twice: closing the recall gap for recurrent language models
by: Arora, Simran, et al.
Published: (2024)

KernelBench: Can LLMs Write Efficient GPU Kernels?
by: Ouyang, Anne, et al.
Published: (2025)

Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes
by: Arora, Simran, et al.
Published: (2023)

CodeMonkeys: Scaling Test-Time Compute for Software Engineering
by: Ehrlich, Ryan, et al.
Published: (2025)

Towards Learning High-Precision Least Squares Algorithms with Sequence Models
by: Liu, Jerry, et al.
Published: (2025)

Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT
by: Saad-Falcon, Jon, et al.
Published: (2024)

Hydragen: High-Throughput LLM Inference with Shared Prefixes
by: Juravsky, Jordan, et al.
Published: (2024)

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
by: Brown, Bradley, et al.
Published: (2024)

Archon: An Architecture Search Framework for Inference-Time Techniques
by: Saad-Falcon, Jon, et al.
Published: (2024)

On the Role of Temperature Sampling in Test-Time Scaling
by: Wu, Yuheng, et al.
Published: (2025)

That Chip Has Sailed: A Critique of Unfounded Skepticism Around AI for Chip Design
by: Goldie, Anna, et al.
Published: (2024)

Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models
by: Narayan, Avanika, et al.
Published: (2025)

Think, Prune, Train, Improve: Scaling Reasoning without Scaling Models
by: Costello, Caia, et al.
Published: (2025)

Federation of Experts: Communication Efficient Distributed Inference for Large Language Models
by: Abdurrahman, Muhammad Shahir, et al.
Published: (2026)

TRACE: Capability-Targeted Agentic Training
by: Kang, Hangoo, et al.
Published: (2026)

ForTIFAI: Fending Off Recursive Training Induced Failure for AI Model Collapse
by: Shabgahi, Soheil Zibakhsh, et al.
Published: (2025)

Agent JIT Compilation for Latency-Optimizing Web Agent Planning and Scheduling
by: Winston, Caleb, et al.
Published: (2026)

Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use
by: Goldie, Anna, et al.
Published: (2025)

CHESS: Contextual Harnessing for Efficient SQL Synthesis
by: Talaei, Shayan, et al.
Published: (2024)

CATS: Contextually-Aware Thresholding for Sparsity in Large Language Models
by: Lee, Donghyun, et al.
Published: (2024)

Counting Clinical Trials: New Evidence on Pharmaceutical Sector Productivity
by: Durvasula, Maya M., et al.
Published: (2024)

Adaptive Rank Allocation: Speeding Up Modern Transformers with RaNA Adapters
by: Garcia, Roberto, et al.
Published: (2025)

Late Time Acceleration with Observational Constraints in Modified Theories of Gravity
by: Arora, Simran
Published: (2023)

ParallelKittens: Systematic and Practical Simplification of Multi-GPU AI Kernels
by: Sul, Stuart H., et al.
Published: (2025)

SPRINT: Enabling Interleaved Planning and Parallelized Execution in Reasoning Models
by: Biju, Emil, et al.
Published: (2025)

Smoothie: Label Free Language Model Routing
by: Guha, Neel, et al.
Published: (2024)

The unregulated plant‐based ‘milk’ industry: A threat to nutrition, health and safety?
by: Simran Kaur Arora
Published: (2024)

BWLer: Barycentric Weight Layer Elucidates a Precision-Conditioning Tradeoff for PINNs
by: Liu, Jerry, et al.
Published: (2025)

ThunderKittens: Simple, Fast, and Adorable AI Kernels
by: Spector, Benjamin F., et al.
Published: (2024)

Scaling Verification Can Be More Effective than Scaling Policy Learning for Vision-Language-Action Alignment
by: Kwok, Jacky, et al.
Published: (2026)

Bayesian and Machine-Learning Analyses of Nonminimal $f(Q)$ Gravity and $H_0$ Tension
by: Arora, Simran, et al.
Published: (2025)

Towards Testable Type-III Leptogenesis in Non-Standard Early Universe Scenarios
by: Arora, Simran, et al.
Published: (2026)

Boosting multi-demographic federated learning for chest radiograph analysis using general-purpose self-supervised representations
by: Lotfinia, Mahshad, et al.
Published: (2025)

GRAM: Spatial general-purpose audio representations for real-world environments
by: Yuksel, Goksenin, et al.
Published: (2026)

Astra: A Multi-Agent System for GPU Kernel Performance Optimization
by: Wei, Anjiang, et al.
Published: (2025)

RoboMonkey: Scaling Test-Time Sampling and Verification for Vision-Language-Action Models
by: Kwok, Jacky, et al.
Published: (2025)

Revisiting kink-like parametrization and constraints using OHD/Pantheon+/BAO samples
by: Arora, Simran, et al.
Published: (2023)

Prospector Heads: Generalized Feature Attribution for Large Models & Data
by: Machiraju, Gautam, et al.
Published: (2024)