| Main Authors: | Sokota, Samuel, Farina, Gabriele, Wu, David J., Hu, Hengyuan, Wang, Kevin A., Kolter, J. Zico, Brown, Noam |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2304.13138 |
Similar Items
Superhuman AI for Stratego Using Self-Play Reinforcement Learning and Test-Time Search
by: Sokota, Samuel, et al.
Published: (2025)
Mimetic Initialization of MLPs
by: Trockman, Asher, et al.
Published: (2026)
Test-Time Adaptation Induces Stronger Accuracy and Agreement-on-the-Line
by: Kim, Eungyeup, et al.
Published: (2023)
Reevaluating Policy Gradient Methods for Imperfect-Information Games
by: Rudolph, Max, et al.
Published: (2025)
Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
by: Bick, Aviv, et al.
Published: (2024)
A Simple and Effective Pruning Approach for Large Language Models
by: Sun, Mingjie, et al.
Published: (2023)
Looking beyond the next token
by: Thankaraj, Abitha, et al.
Published: (2025)
Inference Optimal VLMs Need Fewer Visual Tokens and More Parameters
by: Li, Kevin Y., et al.
Published: (2024)
Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning
by: Xu, Yixuan Even, et al.
Published: (2025)
Weight Ensembling Improves Reasoning in Language Models
by: Dang, Xingyu, et al.
Published: (2025)
Accelerating Diffusion Planners in Offline RL via Reward-Aware Consistency Trajectory Distillation
by: Duan, Xintong, et al.
Published: (2025)
ROGUE: Misaligned Agent Behavior Arising from Ordinary Computer Use
by: Tien, Jeremy, et al.
Published: (2026)
Base Models Look Human To AI Detectors
by: Xu, Yixuan Even, et al.
Published: (2026)
Provably Bounding Neural Network Preimages
by: Kotha, Suhas, et al.
Published: (2023)
Understanding Optimization in Deep Learning with Central Flows
by: Cohen, Jeremy M., et al.
Published: (2024)
Neural Network Verification with Branch-and-Bound for General Nonlinearities
by: Shi, Zhouxing, et al.
Published: (2024)
Contextures: Representations from Contexts
by: Zhai, Runtian, et al.
Published: (2025)
One-Step Diffusion Distillation through Score Implicit Matching
by: Luo, Weijian, et al.
Published: (2024)
Compute-Optimal LLMs Provably Generalize Better With Scale
by: Finzi, Marc, et al.
Published: (2025)
Blind Inverse Problem Solving Made Easy by Text-to-Image Latent Diffusion
by: Dontas, Michail, et al.
Published: (2024)
Training a Generally Curious Agent
by: Tajwar, Fahim, et al.
Published: (2025)
Existing Large Language Model Unlearning Evaluations Are Inconclusive
by: Feng, Zhili, et al.
Published: (2025)
Antidistillation Fingerprinting
by: Xu, Yixuan Even, et al.
Published: (2026)
Imitation Bootstrapped Reinforcement Learning
by: Hu, Hengyuan, et al.
Published: (2023)
AcceleratedLiNGAM: Learning Causal DAGs at the speed of GPUs
by: Akinwande, Victor, et al.
Published: (2024)
Diffusion Models are Secretly Exchangeable: Parallelizing DDPMs via Autospeculation
by: Hu, Hengyuan, et al.
Published: (2025)
Inference-Time Code Selection via Symbolic Equivalence Partitioning
by: Cho, David, et al.
Published: (2026)
FUSE-ing Language Models: Zero-Shot Adapter Discovery for Prompt Optimization Across Tokenizers
by: Williams, Joshua Nathaniel, et al.
Published: (2024)
Deriving Equivalent Symbol-Based Decision Models from Feedforward Neural Networks
by: Seidel, Sebastian, et al.
Published: (2025)
Quantifying the Pre-training Dividend: Generative versus Latent Self-Supervised Learning for Time Series Foundation Models
by: Major, Noam, et al.
Published: (2026)
Efficient & Correct Predictive Equivalence for Decision Trees
by: Marques-Silva, Joao, et al.
Published: (2025)
Membership Inference Attacks Against Time-Series Models
by: Koren, Noam, et al.
Published: (2024)
Understanding Understanding: A Pragmatic Framework Motivated by Large Language Models
by: Leyton-Brown, Kevin, et al.
Published: (2024)
LiteEFG: An Efficient Python Library for Solving Extensive-form Games
by: Liu, Mingyang, et al.
Published: (2024)
A Policy-Gradient Approach to Solving Imperfect-Information Games with Best-Iterate Convergence
by: Liu, Mingyang, et al.
Published: (2024)
Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation
by: He, Yutong, et al.
Published: (2024)
Predicting the Performance of Black-box LLMs through Follow-up Queries
by: Sam, Dylan, et al.
Published: (2025)
Equilibrium Reasoners: Learning Attractors Enables Scalable Reasoning
by: Huang, Benhao, et al.
Published: (2026)
Why is SAM Robust to Label Noise?
by: Baek, Christina, et al.
Published: (2024)
A Look at Value-Based Decision-Time vs. Background Planning Methods Across Different Settings
by: Alver, Safa, et al.
Published: (2022)