| Main Author: | Wilam, Piotr |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.24603 |
Similar Items
The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs
by: Nawrot, Piotr, et al.
Published: (2025)
Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing
by: Piękos, Piotr, et al.
Published: (2025)
AlignSAE: Concept-Aligned Sparse Autoencoders
by: Yang, Minglai, et al.
Published: (2025)
Evaluating Sparse Autoencoders on Targeted Concept Erasure Tasks
by: Karvonen, Adam, et al.
Published: (2024)
Visual Exploration of Feature Relationships in Sparse Autoencoders with Curated Concepts
by: Yan, Xinyuan, et al.
Published: (2025)
Sparse Autoencoders Enable Scalable and Reliable Circuit Identification in Language Models
by: O'Neill, Charles, et al.
Published: (2024)
Sparse Attention Decomposition Applied to Circuit Tracing
by: Franco, Gabriel, et al.
Published: (2024)
Use Sparse Autoencoders to Discover Unknown Concepts, Not to Act on Known Concepts
by: Peng, Kenny, et al.
Published: (2025)
Evaluating Adversarial Robustness of Concept Representations in Sparse Autoencoders
by: Li, Aaron J., et al.
Published: (2025)
Circuit Component Reuse Across Tasks in Transformer Language Models
by: Merullo, Jack, et al.
Published: (2023)
Transformer Circuit Faithfulness Metrics are not Robust
by: Miller, Joseph, et al.
Published: (2024)
Learning and Transferring Sparse Contextual Bigrams with Linear Transformers
by: Ren, Yunwei, et al.
Published: (2024)
Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models
by: Mondorf, Philipp, et al.
Published: (2024)
From Concepts to Components: Concept-Agnostic Attention Module Discovery in Transformers
by: Su, Jingtong, et al.
Published: (2025)
From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
by: Wen, Kaiyue, et al.
Published: (2024)
Dissecting Multimodal In-Context Learning: Modality Asymmetries and Circuit Dynamics in modern Transformers
by: Huang, Yiran, et al.
Published: (2026)
From Indirect Object Identification to Syllogisms: Exploring Binary Mechanisms in Transformer Circuits
by: Saraipour, Karim, et al.
Published: (2025)
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
by: Marks, Samuel, et al.
Published: (2024)
Associative-State Universal Transformers: Sparse Retrieval Meets Structured Recurrence
by: Xiao, Liu
Published: (2026)
Sparse Transformer with Local and Seasonal Adaptation for Multivariate Time Series Forecasting
by: Zhang, Yifan, et al.
Published: (2023)
Do Sentence Transformers Learn Quasi-Geospatial Concepts from General Text?
by: Ilyankou, Ilya, et al.
Published: (2024)
Sparse Shift Autoencoders for Identifying Concepts from Large Language Model Activations
by: Joshi, Shruti, et al.
Published: (2025)
Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency
by: Thangarasa, Vithursan, et al.
Published: (2023)
Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers
by: Lou, Chao, et al.
Published: (2024)
Dimensional Collapse in Transformer Attention Outputs: A Challenge for Sparse Dictionary Learning
by: Wang, Junxuan, et al.
Published: (2025)
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
by: Csordás, Róbert, et al.
Published: (2023)
Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models
by: Muhamed, Aashiq, et al.
Published: (2024)
Latent Concept Disentanglement in Transformer-based Language Models
by: Hong, Guan Zhe, et al.
Published: (2025)
Beyond Components: Singular Vector-Based Interpretability of Transformer Circuits
by: Ahmad, Areeb, et al.
Published: (2025)
Efficient Automated Circuit Discovery in Transformers using Contextual Decomposition
by: Hsu, Aliyah R., et al.
Published: (2024)
Transformers meet Neural Algorithmic Reasoners
by: Bounsi, Wilfried, et al.
Published: (2024)
Judge Circuits
by: Feldhus, Nils, et al.
Published: (2026)
Automatic Generation of Python Programs Using Context-Free Grammars
by: Yamani, Kamel, et al.
Published: (2024)
A Gauge Theory of Superposition: Toward a Sheaf-Theoretic Atlas of Neural Representations
by: Javidnia, Hossein
Published: (2026)
Automated SNOMED CT Concept Annotation in Clinical Text Using Bi-GRU Neural Networks
by: Noori, Ali, et al.
Published: (2025)
Circuit Complexity Bounds for RoPE-based Transformer Architecture
by: Chen, Bo, et al.
Published: (2024)
Emergence of Minimal Circuits for Indirect Object Identification in Attention-Only Transformers
by: Adhikari, Rabin
Published: (2025)
LangFIR: Discovering Sparse Language-Specific Features from Monolingual Data for Language Steering
by: Wong, Sing Hieng, et al.
Published: (2026)
Hierarchical Sparse Circuit Extraction from Billion-Parameter Language Models through Scalable Attribution Graph Decomposition
by: Uddin, Mohammed Mudassir, et al.
Published: (2026)
ComplexFormer: Disruptively Advancing Transformer Inference Ability via Head-Specific Complex Vector Attention
by: Shao, Jintian, et al.
Published: (2025)