| Main Authors: | Haskins, Reilly, Chughtai, Bilal, Engels, Joshua |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.15257 |
Similar Items
Distilled Circuits: A Mechanistic Study of Internal Restructuring in Knowledge Distillation
by: Haskins, Reilly, et al.
Published: (2025)
KEA Explain: Explanations of Hallucinations using Graph Kernel Analysis
by: Haskins, Reilly, et al.
Published: (2025)
Building Production-Ready Probes For Gemini
by: Kramár, János, et al.
Published: (2026)
Difficulties with Evaluating a Deception Detector for AIs
by: Smith, Lewis, et al.
Published: (2025)
Can Language Models Explain Their Own Classification Behavior?
by: Sherburn, Dane, et al.
Published: (2024)
Summing Up the Facts: Additive Mechanisms Behind Factual Recall in LLMs
by: Chughtai, Bilal, et al.
Published: (2024)
Transformer Circuit Faithfulness Metrics are not Robust
by: Miller, Joseph, et al.
Published: (2024)
From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step
by: Deng, Yuntian, et al.
Published: (2024)
Detecting Strategic Deception Using Linear Probes
by: Goldowsky-Dill, Nicholas, et al.
Published: (2025)
CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring
by: Arnav, Benjamin, et al.
Published: (2025)
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
by: Sprague, Zayne, et al.
Published: (2024)
Robust Filtering -- Novel Statistical Learning and Inference Algorithms with Applications
by: Chughtai, Aamir Hussain
Published: (2025)
Noticing the Watcher: LLM Agents Can Infer CoT Monitoring from Blocking Feedback
by: Jiralerspong, Thomas, et al.
Published: (2026)
To Think or Not to Think: The Hidden Cost of Meta-Training with Excessive CoT Examples
by: Kothapalli, Vignesh, et al.
Published: (2025)
Unveiling and Causalizing CoT: A Causal Pespective
by: Fu, Jiarun, et al.
Published: (2025)
EPiC: Towards Lossless Speedup for Reasoning Training through Edge-Preserving CoT Condensation
by: Jia, Jinghan, et al.
Published: (2025)
Demonstrations, CoT, and Prompting: A Theoretical Analysis of ICL
by: Tong, Xuhan, et al.
Published: (2026)
Self-Verifying Reflection Helps Transformers with CoT Reasoning
by: Yu, Zhongwei, et al.
Published: (2025)
Low-Rank Adapting Models for Sparse Autoencoders
by: Chen, Matthew, et al.
Published: (2025)
Decomposing The Dark Matter of Sparse Autoencoders
by: Engels, Joshua, et al.
Published: (2024)
RL-Obfuscation: Can Language Models Learn to Evade Latent-Space Monitors?
by: Gupta, Rohan, et al.
Published: (2025)
CDW-CoT: Clustered Distance-Weighted Chain-of-Thoughts Reasoning
by: Fang, Yuanheng, et al.
Published: (2025)
Data Shifts Hurt CoT: A Theoretical Study
by: Yin, Lang, et al.
Published: (2025)
Visual CoT Makes VLMs Smarter but More Fragile
by: Xu, Chunxue, et al.
Published: (2025)
Exploring the Limitations of Mamba in COPY and CoT Reasoning
by: Ren, Ruifeng, et al.
Published: (2024)
Divide-and-Conquer CoT: RL for Reducing Latency via Parallel Reasoning
by: Mahankali, Arvind, et al.
Published: (2026)
What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT
by: Feng, Yunzhen, et al.
Published: (2025)
CoT Information: Improved Sample Complexity under Chain-of-Thought Supervision
by: Altabaa, Awni, et al.
Published: (2025)
Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning
by: Yan, Shaotian, et al.
Published: (2026)
Compositional Generalization from Learned Skills via CoT Training: A Theoretical and Structural Analysis for Reasoning
by: Yao, Xinhao, et al.
Published: (2025)
Is continuous CoT better suited for multi-lingual reasoning?
by: Bashir, Ali Hamza, et al.
Published: (2026)
Enhancing Confidence Estimation in Telco LLMs via Twin-Pass CoT-Ensembling
by: Saenko, Anton, et al.
Published: (2026)
What's Behind PPO's Collapse in Long-CoT? Value Optimization Holds the Secret
by: Yuan, Yufeng, et al.
Published: (2025)
Amalgam: A Framework for Obfuscated Neural Network Training on the Cloud
by: Taki, Sifat Ut, et al.
Published: (2024)
EMORF-II: Adaptive EM-based Outlier-Robust Filtering with Correlated Measurement Noise
by: Majal, Arslan, et al.
Published: (2025)
The Quest for Efficient Reasoning: A Data-Centric Benchmark to CoT Distillation
by: Zhang, Ruichen, et al.
Published: (2025)
How Likely Do LLMs with CoT Mimic Human Reasoning?
by: Bao, Guangsheng, et al.
Published: (2024)
CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought
by: Zhang, Boxuan, et al.
Published: (2025)
To CoT or To Loop? A Formal Comparison Between Chain-of-Thought and Looped Transformers
by: Xu, Kevin, et al.
Published: (2025)
The Ends Justify the Thoughts: RL-Induced Motivated Reasoning in LLM CoTs
by: Howe, Nikolaus, et al.
Published: (2025)