:: Library Catalog

Bibliographic Details
Main Authors:	Amani, Mohammad Hossein; Lotfi, Aryo; Baldwin, Nicolas Mario; Bengio, Samy; Farajtabar, Mehrdad; Abbe, Emmanuel; West, Robert
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2506.18110

Similar Items

How Far Can Transformers Reason? The Globality Barrier and Inductive Scratchpad
by: Abbe, Emmanuel, et al.
Published: (2024)

Goldilocks RL: Tuning Task Difficulty to Escape Sparse Rewards for Reasoning
by: Mahrooghi, Ilia, et al.
Published: (2026)

Generalization on the Unseen, Logic Reasoning and Degree Curriculum
by: Abbe, Emmanuel, et al.
Published: (2023)

Chain-of-Sketch: Enabling Global Visual Reasoning
by: Lotfi, Aryo, et al.
Published: (2024)

GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
by: Mirzadeh, Iman, et al.
Published: (2024)

The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
by: Shojaee, Parshin, et al.
Published: (2025)

Symbolic Autoencoding for Self-Supervised Sequence Learning
by: Amani, Mohammad Hossein, et al.
Published: (2024)

Reasoning's Razor: Reasoning Improves Accuracy but Can Hurt Recall at Critical Operating Points in Safety and Hallucination Detection
by: Chegini, Atoosa, et al.
Published: (2025)

When can transformers reason with abstract symbols?
by: Boix-Adsera, Enric, et al.
Published: (2023)

AbstRaL: Augmenting LLMs' Reasoning by Reinforcing Abstract Thinking
by: Gao, Silin, et al.
Published: (2025)

GRAD: Generative Retrieval-Aligned Demonstration Sampler for Efficient Few-Shot Reasoning
by: Gabouj, Oussama, et al.
Published: (2025)

Recursive Language Models Meet Uncertainty: The Surprising Effectiveness of Self-Reflective Program Search for Long Context
by: Alizadeh, Keivan, et al.
Published: (2026)

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
by: Mehta, Sachin, et al.
Published: (2024)

MoE-PHDS: One MoE checkpoint for flexible runtime sparsity
by: Hannah, Lauren A., et al.
Published: (2025)

Conformal Thinking: Risk Control for Reasoning on a Compute Budget
by: Wang, Xi, et al.
Published: (2026)

$k$-server-bench: Automating Potential Discovery for the $k$-Server Conjecture
by: Brilliantov, Kirill, et al.
Published: (2026)

Barriers for Learning in an Evolving World: Mathematical Understanding of Loss of Plasticity
by: Joudaki, Amir, et al.
Published: (2025)

Task Specific Sharpness Aware O-RAN Resource Management using Multi Agent Reinforcement Learning
by: Lotfi, Fatemeh, et al.
Published: (2025)

ORAN-GUIDE: RAG-Driven Prompt Learning for LLM-Augmented Reinforcement Learning in O-RAN Network Slicing
by: Lotfi, Fatemeh, et al.
Published: (2025)

Prompt-Tuned LLM-Augmented DRL for Dynamic O-RAN Network Slicing
by: Lotfi, Fatemeh, et al.
Published: (2025)

You Need Reasoning to Learn Reasoning: The Limitations of Label-Free RL in Weak Base Models
by: Roy, Shuvendu, et al.
Published: (2025)

Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization
by: Samragh, Mohammad, et al.
Published: (2024)

TIDE: Every Layer Knows the Token Beneath the Context
by: Jaiswal, Ajay, et al.
Published: (2026)

GFlowNet Foundations
by: Bengio, Yoshua, et al.
Published: (2021)

GFlowNet Pretraining with Inexpensive Rewards
by: Pandey, Mohit, et al.
Published: (2024)

Solving Bayesian inverse problems with diffusion priors and off-policy RL
by: Scimeca, Luca, et al.
Published: (2025)

LLM in a flash: Efficient Large Language Model Inference with Limited Memory
by: Alizadeh, Keivan, et al.
Published: (2023)

Arbitrage: Efficient Reasoning via Advantage-Aware Speculation
by: Maheswaran, Monishwaran, et al.
Published: (2025)

To Infinity and Beyond: Tool-Use Unlocks Length Generalization in State Space Models
by: Malach, Eran, et al.
Published: (2025)

Bypassing the Rationale: Causal Auditing of Implicit Reasoning in Language Models
by: Sathyanarayanan, Anish, et al.
Published: (2026)

SAFE-RL: Saliency-Aware Counterfactual Explainer for Deep Reinforcement Learning Policies
by: Samadi, Amir, et al.
Published: (2024)

Were RNNs All We Needed?
by: Feng, Leo, et al.
Published: (2024)

Self-Evolving Curriculum for LLM Reasoning
by: Chen, Xiaoyin, et al.
Published: (2025)

Boolformer: Symbolic Regression of Logic Functions with Transformers
by: d'Ascoli, Stéphane, et al.
Published: (2023)

Unmasking On-Policy Distillation: Where It Helps, Where It Hurts, and Why
by: Armandpour, Mohammadreza, et al.
Published: (2026)

Debate as Reward: A Multi-Agent Reward System for Scientific Ideation via RL Post-Training
by: Salimi, Moein, et al.
Published: (2026)

Rethinking RL Evaluation: Can Benchmarks Truly Reveal Failures of RL Methods?
by: Chen, Zihan, et al.
Published: (2025)

Beyond Next-Token Prediction: A Performance Characterization of Diffusion versus Autoregressive Language Models
by: Kim, Minseo, et al.
Published: (2025)

A Comprehensive Study of Supervised Machine Learning Models for Zero-Day Attack Detection: Analyzing Performance on Imbalanced Data
by: Lotfi, Zahra, et al.
Published: (2025)

Structural Rationale Distillation via Reasoning Space Compression
by: Yang, Jialin, et al.
Published: (2026)