Library Catalog

Bibliographic Details
Main Authors:	Martin-Linares, Cristina P.; Ling, Jonathan P.
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2512.24975

Similar Items

Learning Multi-Level Features with Matryoshka Sparse Autoencoders
by: Bussmann, Bart, et al.
Published: (2025)

Matryoshka Quantization
by: Nair, Pranav, et al.
Published: (2025)

Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
by: Wen, Tiansheng, et al.
Published: (2025)

Matryoshka Concept Bottleneck Models
by: Chen, Ziye, et al.
Published: (2026)

SAE-FD: Sparse Autoencoder Feature Distillation for Continual Learning of Large Language Models
by: Zhang, Mingxu, et al.
Published: (2026)

Matryoshka Representation Learning
by: Kusupati, Aditya, et al.
Published: (2022)

Matryoshka Diffusion Models
by: Gu, Jiatao, et al.
Published: (2023)

Ensembling Sparse Autoencoders
by: Gadgil, Soham, et al.
Published: (2025)

Model Unlearning via Sparse Autoencoder Subspace Guided Projections
by: Wang, Xu, et al.
Published: (2025)

2D Matryoshka Sentence Embeddings
by: Li, Xianming, et al.
Published: (2024)

Sparse-Autoencoder-Guided Internal Representation Unlearning for Large Language Models
by: Yamashita, Tomoya, et al.
Published: (2025)

Analysis of Variational Sparse Autoencoders
by: Baker, Zachary, et al.
Published: (2025)

Toward Identifiable Sparse Autoencoders
by: Nelson, Walter, et al.
Published: (2026)

Sparse Autoencoders, Again?
by: Lu, Yin, et al.
Published: (2025)

Matryoshka Multimodal Models
by: Cai, Mu, et al.
Published: (2024)

Towards Interpretable and Inference-Optimal COT Reasoning with Sparse Autoencoder-Guided Generation
by: Zhao, Daniel, et al.
Published: (2025)

Transcoders Beat Sparse Autoencoders for Interpretability
by: Paulo, Gonçalo, et al.
Published: (2025)

Evaluating Sparse Autoencoders for Monosemantic Representation
by: Fereidouni, Moghis, et al.
Published: (2025)

Decomposing The Dark Matter of Sparse Autoencoders
by: Engels, Joshua, et al.
Published: (2024)

Disentangling Dense Embeddings with Sparse Autoencoders
by: O'Neill, Charles, et al.
Published: (2024)

Preference Instability in Reward Models: Detection and Mitigation via Sparse Autoencoders
by: Liu, Shunchang, et al.
Published: (2026)

Are Sparse Autoencoder Benchmarks Reliable?
by: Chanin, David
Published: (2026)

MatGPTQ: Accurate and Efficient Post-Training Matryoshka Quantization
by: Kleinegger, Maximilian, et al.
Published: (2026)

Distillation-Guided Structural Transfer for Continual Learning Beyond Sparse Distributed Memory
by: Xue, Huiyan, et al.
Published: (2025)

Matryoshka Model Learning for Improved Elastic Student Models
by: Verma, Chetan, et al.
Published: (2025)

Low-Rank Adapting Models for Sparse Autoencoders
by: Chen, Matthew, et al.
Published: (2025)

Interpretable Reward Model via Sparse Autoencoder
by: Zhang, Shuyi, et al.
Published: (2025)

Efficient Dictionary Learning with Switch Sparse Autoencoders
by: Mudide, Anish, et al.
Published: (2024)

Steering Language Model Refusal with Sparse Autoencoders
by: O'Brien, Kyle, et al.
Published: (2024)

Stable and Steerable Sparse Autoencoders with Weight Regularization
by: Jedryszek, Piotr, et al.
Published: (2026)

Interpreting Attention Layer Outputs with Sparse Autoencoders
by: Kissane, Connor, et al.
Published: (2024)

Federated Model Heterogeneous Matryoshka Representation Learning
by: Yi, Liping, et al.
Published: (2024)

Locate-then-Sparsify: Attribution Guided Sparse Strategy for Visual Hallucination Mitigation
by: Dang, Tiantian, et al.
Published: (2026)

Temporal Sparse Autoencoders: Leveraging the Sequential Nature of Language for Interpretability
by: Bhalla, Usha, et al.
Published: (2025)

Objective-Specific Privileged Bases via Full-Prefix Matryoshka Learning
by: Talukder, Arghamitra, et al.
Published: (2026)

BatchTopK Sparse Autoencoders
by: Bussmann, Bart, et al.
Published: (2024)

Improving Sparse Autoencoder with Dynamic Attention
by: Wang, Dongsheng, et al.
Published: (2026)

Are Sparse Autoencoders Useful? A Case Study in Sparse Probing
by: Kantamneni, Subhash, et al.
Published: (2025)

Attribution-Guided Decoding
by: Komorowski, Piotr, et al.
Published: (2025)

Sparse Autoencoders are Topic Models
by: Girrbach, Leander, et al.
Published: (2025)