Library Catalog

Bibliographic Details
Main Authors:	Xie, Zixuan; Liu, Xinyu; Chen, Claire; Liu, Shuze Daniel; Chandra, Rohan; Zhang, Shangtong
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2605.07333

Similar Items

Convergence and Emergence of In-Context Reinforcement Learning with Chain of Thought
by: Xie, Zixuan, et al.
Published: (2026)

Finite Sample Analysis of Linear Temporal Difference Learning with Arbitrary Features
by: Xie, Zixuan, et al.
Published: (2025)

Doubly Optimal Policy Evaluation for Reinforcement Learning
by: Liu, Shuze Daniel, et al.
Published: (2024)

Efficient Multi-Policy Evaluation for Reinforcement Learning
by: Liu, Shuze Daniel, et al.
Published: (2024)

Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
by: Chen, Claire, et al.
Published: (2024)

Convergence of Two-Timescale Markovian Stochastic Approximations with Applications in Reinforcement Learning
by: Mahadevan, Vagul, et al.
Published: (2026)

The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise
by: Liu, Shuze Daniel, et al.
Published: (2024)

Towards Provable Emergence of In-Context Reinforcement Learning
by: Wang, Jiuqi, et al.
Published: (2025)

Extensions of Robbins-Siegmund Theorem with Applications in Reinforcement Learning
by: Liu, Xinyu, et al.
Published: (2025)

Linear $Q$-Learning Does Not Diverge in $L^2$: Convergence Rates to a Bounded Set
by: Liu, Xinyu, et al.
Published: (2025)

Almost Sure Convergence Rates of Stochastic Approximation and Reinforcement Learning via a Poisson-Moreau Drift
by: Liu, Xinyu, et al.
Published: (2026)

Efficient Policy Evaluation with Offline Data Informed Behavior Policy Design
by: Liu, Shuze, et al.
Published: (2023)

MathlibLemma: Folklore Lemma Generation and Benchmark for Formal Mathematics
by: Liu, Xinyu, et al.
Published: (2026)

Almost Sure Convergence Rates and Concentration of Stochastic Approximation and Reinforcement Learning with Markovian Noise
by: Qian, Xiaochi, et al.
Published: (2024)

Predicting Plasticity in Deep Continual Learning: A Theoretical Perspective
by: Wang, Jiuqi, et al.
Published: (2026)

MathlibPR: Pull Request Merge-Readiness Benchmark for Formal Mathematical Libraries
by: Xie, Zixuan, et al.
Published: (2026)

Offline Two-Player Zero-Sum Markov Games with KL Regularization
by: Chen, Claire, et al.
Published: (2026)

A Survey of In-Context Reinforcement Learning
by: Moeini, Amir, et al.
Published: (2025)

Safe In-Context Reinforcement Learning
by: Moeini, Amir, et al.
Published: (2025)

Group Fairness in Multi-Task Reinforcement Learning
by: Song, Kefan, et al.
Published: (2025)

Prompt-Driven Domain Adaptation for End-to-End Autonomous Driving via In-Context RL
by: Khurram, Aleesha, et al.
Published: (2025)

Experience Replay Addresses Loss of Plasticity in Continual Learning
by: Wang, Jiuqi, et al.
Published: (2025)

Transformers Can Learn Temporal Difference Methods for In-Context Reinforcement Learning
by: Wang, Jiuqi, et al.
Published: (2024)

Reward Is Enough: LLMs Are In-Context Reinforcement Learners
by: Song, Kefan, et al.
Published: (2025)

Towards Formalizing Reinforcement Learning Theory
by: Zhang, Shangtong
Published: (2025)

Softmax as Linear Attention in the Large-Prompt Regime: a Measure-based Perspective
by: Boursier, Etienne, et al.
Published: (2025)

Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
by: Zhang, Shangtong, et al.
Published: (2021)

In-Context Learning with Transformers: Softmax Attention Adapts to Function Lipschitzness
by: Collins, Liam, et al.
Published: (2024)

Why Softmax Attention Outperforms Linear Attention
by: Deng, Yichuan, et al.
Published: (2023)

In-Context Linear Regression Demystified: Training Dynamics and Mechanistic Interpretability of Multi-Head Softmax Attention
by: He, Jianliang, et al.
Published: (2025)

Almost Sure Convergence of Linear Temporal Difference Learning with Arbitrary Features
by: Wang, Jiuqi, et al.
Published: (2024)

Universal Approximation with Softmax Attention
by: Hu, Jerry Yao-Chieh, et al.
Published: (2025)

Softmax-free Linear Transformers
by: Lu, Jiachen, et al.
Published: (2022)

The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry
by: Zhang, Michael, et al.
Published: (2024)

Counterfactual Explanations for Continuous Action Reinforcement Learning
by: Dong, Shuyang, et al.
Published: (2025)

Towards Large Language Models that Benefit for All: Benchmarking Group Fairness in Reward Models
by: Song, Kefan, et al.
Published: (2025)

Rethinking Attention: Polynomial Alternatives to Softmax in Transformers
by: Saratchandran, Hemanth, et al.
Published: (2024)

CRASH: Challenging Reinforcement-Learning Based Adversarial Scenarios For Safety Hardening
by: Kulkarni, Amar, et al.
Published: (2024)

Degrees of Freedom for Linear Attention: Distilling Softmax Attention with Optimal Feature Efficiency
by: Nishikawa, Naoki, et al.
Published: (2025)

Minimalist Softmax Attention Provably Learns Constrained Boolean Functions
by: Hu, Jerry Yao-Chieh, et al.
Published: (2025)