MARC View :: Library Catalog

Bibliographic Details
Main Authors:	Kawamura, Takao; Niizumi, Daisuke; Ono, Nobutaka
Format:	Preprint
Published:	2026
Subjects:	Audio and Speech Processing; Sound
Online Access:	https://arxiv.org/abs/2602.15307

_version_	1866908837431214080
author	Kawamura, Takao; Niizumi, Daisuke; Ono, Nobutaka
author_facet	Kawamura, Takao; Niizumi, Daisuke; Ono, Nobutaka
contents	In this paper, we analyze the internal representations of a general-purpose audio self-supervised learning (SSL) model from a neuron-level perspective. Despite their strong empirical performance as feature extractors, the internal mechanisms underlying the robust generalization of SSL audio models remain unclear. Drawing on the framework of mechanistic interpretability, we identify and examine class-specific neurons by analyzing conditional activation patterns across diverse tasks. Our analysis reveals that SSL models foster the emergence of class-specific neurons that provide extensive coverage across novel task classes. These neurons exhibit shared responses across different semantic categories and acoustic similarities, such as speech attributes and musical pitch. We also confirm that these neurons have a functional impact on classification performance. To our knowledge, this is the first systematic neuron-level analysis of a general-purpose audio SSL model, providing new insights into its internal representation.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_15307
institution	arXiv
publishDate	2026
record_format	arxiv
title	What Do Neurons Listen To? A Neuron-level Dissection of a General-purpose Audio Model
topic	Audio and Speech Processing; Sound
url	https://arxiv.org/abs/2602.15307
