| Main Authors: | Amani, Mohammad Hossein, Lotfi, Aryo, Baldwin, Nicolas Mario, Bengio, Samy, Farajtabar, Mehrdad, Abbe, Emmanuel, West, Robert |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.18110 |
Similar Items
How Far Can Transformers Reason? The Globality Barrier and Inductive Scratchpad
by: Abbe, Emmanuel, et al.
Published: (2024)
Goldilocks RL: Tuning Task Difficulty to Escape Sparse Rewards for Reasoning
by: Mahrooghi, Ilia, et al.
Published: (2026)
Generalization on the Unseen, Logic Reasoning and Degree Curriculum
by: Abbe, Emmanuel, et al.
Published: (2023)
Chain-of-Sketch: Enabling Global Visual Reasoning
by: Lotfi, Aryo, et al.
Published: (2024)
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
by: Mirzadeh, Iman, et al.
Published: (2024)
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
by: Shojaee, Parshin, et al.
Published: (2025)
Symbolic Autoencoding for Self-Supervised Sequence Learning
by: Amani, Mohammad Hossein, et al.
Published: (2024)
Reasoning's Razor: Reasoning Improves Accuracy but Can Hurt Recall at Critical Operating Points in Safety and Hallucination Detection
by: Chegini, Atoosa, et al.
Published: (2025)
When can transformers reason with abstract symbols?
by: Boix-Adsera, Enric, et al.
Published: (2023)
AbstRaL: Augmenting LLMs' Reasoning by Reinforcing Abstract Thinking
by: Gao, Silin, et al.
Published: (2025)
GRAD: Generative Retrieval-Aligned Demonstration Sampler for Efficient Few-Shot Reasoning
by: Gabouj, Oussama, et al.
Published: (2025)
Recursive Language Models Meet Uncertainty: The Surprising Effectiveness of Self-Reflective Program Search for Long Context
by: Alizadeh, Keivan, et al.
Published: (2026)
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
by: Mehta, Sachin, et al.
Published: (2024)
MoE-PHDS: One MoE checkpoint for flexible runtime sparsity
by: Hannah, Lauren A., et al.
Published: (2025)
Conformal Thinking: Risk Control for Reasoning on a Compute Budget
by: Wang, Xi, et al.
Published: (2026)
$k$-server-bench: Automating Potential Discovery for the $k$-Server Conjecture
by: Brilliantov, Kirill, et al.
Published: (2026)
Barriers for Learning in an Evolving World: Mathematical Understanding of Loss of Plasticity
by: Joudaki, Amir, et al.
Published: (2025)
Task Specific Sharpness Aware O-RAN Resource Management using Multi Agent Reinforcement Learning
by: Lotfi, Fatemeh, et al.
Published: (2025)
ORAN-GUIDE: RAG-Driven Prompt Learning for LLM-Augmented Reinforcement Learning in O-RAN Network Slicing
by: Lotfi, Fatemeh, et al.
Published: (2025)
Prompt-Tuned LLM-Augmented DRL for Dynamic O-RAN Network Slicing
by: Lotfi, Fatemeh, et al.
Published: (2025)
You Need Reasoning to Learn Reasoning: The Limitations of Label-Free RL in Weak Base Models
by: Roy, Shuvendu, et al.
Published: (2025)
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization
by: Samragh, Mohammad, et al.
Published: (2024)
TIDE: Every Layer Knows the Token Beneath the Context
by: Jaiswal, Ajay, et al.
Published: (2026)
GFlowNet Foundations
by: Bengio, Yoshua, et al.
Published: (2021)
GFlowNet Pretraining with Inexpensive Rewards
by: Pandey, Mohit, et al.
Published: (2024)
Solving Bayesian inverse problems with diffusion priors and off-policy RL
by: Scimeca, Luca, et al.
Published: (2025)
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
by: Alizadeh, Keivan, et al.
Published: (2023)
Arbitrage: Efficient Reasoning via Advantage-Aware Speculation
by: Maheswaran, Monishwaran, et al.
Published: (2025)
To Infinity and Beyond: Tool-Use Unlocks Length Generalization in State Space Models
by: Malach, Eran, et al.
Published: (2025)
Bypassing the Rationale: Causal Auditing of Implicit Reasoning in Language Models
by: Sathyanarayanan, Anish, et al.
Published: (2026)
SAFE-RL: Saliency-Aware Counterfactual Explainer for Deep Reinforcement Learning Policies
by: Samadi, Amir, et al.
Published: (2024)
Were RNNs All We Needed?
by: Feng, Leo, et al.
Published: (2024)
Self-Evolving Curriculum for LLM Reasoning
by: Chen, Xiaoyin, et al.
Published: (2025)
Boolformer: Symbolic Regression of Logic Functions with Transformers
by: d'Ascoli, Stéphane, et al.
Published: (2023)
Unmasking On-Policy Distillation: Where It Helps, Where It Hurts, and Why
by: Armandpour, Mohammadreza, et al.
Published: (2026)
Debate as Reward: A Multi-Agent Reward System for Scientific Ideation via RL Post-Training
by: Salimi, Moein, et al.
Published: (2026)
Rethinking RL Evaluation: Can Benchmarks Truly Reveal Failures of RL Methods?
by: Chen, Zihan, et al.
Published: (2025)
Beyond Next-Token Prediction: A Performance Characterization of Diffusion versus Autoregressive Language Models
by: Kim, Minseo, et al.
Published: (2025)
A Comprehensive Study of Supervised Machine Learning Models for Zero-Day Attack Detection: Analyzing Performance on Imbalanced Data
by: Lotfi, Zahra, et al.
Published: (2025)
Structural Rationale Distillation via Reasoning Space Compression
by: Yang, Jialin, et al.
Published: (2026)