:: Library Catalog

Bibliographic Details
Main Author:	Wilam, Piotr
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2605.24603

Similar Items

The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs
by: Nawrot, Piotr, et al.
Published: (2025)

Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing
by: Piękos, Piotr, et al.
Published: (2025)

AlignSAE: Concept-Aligned Sparse Autoencoders
by: Yang, Minglai, et al.
Published: (2025)

Evaluating Sparse Autoencoders on Targeted Concept Erasure Tasks
by: Karvonen, Adam, et al.
Published: (2024)

Visual Exploration of Feature Relationships in Sparse Autoencoders with Curated Concepts
by: Yan, Xinyuan, et al.
Published: (2025)

Sparse Autoencoders Enable Scalable and Reliable Circuit Identification in Language Models
by: O'Neill, Charles, et al.
Published: (2024)

Sparse Attention Decomposition Applied to Circuit Tracing
by: Franco, Gabriel, et al.
Published: (2024)

Use Sparse Autoencoders to Discover Unknown Concepts, Not to Act on Known Concepts
by: Peng, Kenny, et al.
Published: (2025)

Evaluating Adversarial Robustness of Concept Representations in Sparse Autoencoders
by: Li, Aaron J., et al.
Published: (2025)

Circuit Component Reuse Across Tasks in Transformer Language Models
by: Merullo, Jack, et al.
Published: (2023)

Transformer Circuit Faithfulness Metrics are not Robust
by: Miller, Joseph, et al.
Published: (2024)

Learning and Transferring Sparse Contextual Bigrams with Linear Transformers
by: Ren, Yunwei, et al.
Published: (2024)

Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models
by: Mondorf, Philipp, et al.
Published: (2024)

From Concepts to Components: Concept-Agnostic Attention Module Discovery in Transformers
by: Su, Jingtong, et al.
Published: (2025)

From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
by: Wen, Kaiyue, et al.
Published: (2024)

Dissecting Multimodal In-Context Learning: Modality Asymmetries and Circuit Dynamics in modern Transformers
by: Huang, Yiran, et al.
Published: (2026)

From Indirect Object Identification to Syllogisms: Exploring Binary Mechanisms in Transformer Circuits
by: Saraipour, Karim, et al.
Published: (2025)

Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
by: Marks, Samuel, et al.
Published: (2024)

Associative-State Universal Transformers: Sparse Retrieval Meets Structured Recurrence
by: Xiao, Liu
Published: (2026)

Sparse Transformer with Local and Seasonal Adaptation for Multivariate Time Series Forecasting
by: Zhang, Yifan, et al.
Published: (2023)

Do Sentence Transformers Learn Quasi-Geospatial Concepts from General Text?
by: Ilyankou, Ilya, et al.
Published: (2024)

Sparse Shift Autoencoders for Identifying Concepts from Large Language Model Activations
by: Joshi, Shruti, et al.
Published: (2025)

Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency
by: Thangarasa, Vithursan, et al.
Published: (2023)

Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers
by: Lou, Chao, et al.
Published: (2024)

Dimensional Collapse in Transformer Attention Outputs: A Challenge for Sparse Dictionary Learning
by: Wang, Junxuan, et al.
Published: (2025)

SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
by: Csordás, Róbert, et al.
Published: (2023)

Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models
by: Muhamed, Aashiq, et al.
Published: (2024)

Latent Concept Disentanglement in Transformer-based Language Models
by: Hong, Guan Zhe, et al.
Published: (2025)

Beyond Components: Singular Vector-Based Interpretability of Transformer Circuits
by: Ahmad, Areeb, et al.
Published: (2025)

Efficient Automated Circuit Discovery in Transformers using Contextual Decomposition
by: Hsu, Aliyah R., et al.
Published: (2024)

Transformers meet Neural Algorithmic Reasoners
by: Bounsi, Wilfried, et al.
Published: (2024)

Judge Circuits
by: Feldhus, Nils, et al.
Published: (2026)

Automatic Generation of Python Programs Using Context-Free Grammars
by: Yamani, Kamel, et al.
Published: (2024)

A Gauge Theory of Superposition: Toward a Sheaf-Theoretic Atlas of Neural Representations
by: Javidnia, Hossein
Published: (2026)

Automated SNOMED CT Concept Annotation in Clinical Text Using Bi-GRU Neural Networks
by: Noori, Ali, et al.
Published: (2025)

Circuit Complexity Bounds for RoPE-based Transformer Architecture
by: Chen, Bo, et al.
Published: (2024)

Emergence of Minimal Circuits for Indirect Object Identification in Attention-Only Transformers
by: Adhikari, Rabin
Published: (2025)

LangFIR: Discovering Sparse Language-Specific Features from Monolingual Data for Language Steering
by: Wong, Sing Hieng, et al.
Published: (2026)

Hierarchical Sparse Circuit Extraction from Billion-Parameter Language Models through Scalable Attribution Graph Decomposition
by: Uddin, Mohammed Mudassir, et al.
Published: (2026)

ComplexFormer: Disruptively Advancing Transformer Inference Ability via Head-Specific Complex Vector Attention
by: Shao, Jintian, et al.
Published: (2025)