Staff View: Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks :: Library Catalog

Bibliographic Details
Main Authors:	Plachouras, Christos; Guinot, Julien; Fazekas, George; Quinton, Elio; Benetos, Emmanouil; Pauwels, Johan
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2505.06224

_version_	1866916729153650688
author	Plachouras, Christos; Guinot, Julien; Fazekas, George; Quinton, Elio; Benetos, Emmanouil; Pauwels, Johan
author_facet	Plachouras, Christos; Guinot, Julien; Fazekas, George; Quinton, Elio; Benetos, Emmanouil; Pauwels, Johan
contents	Downstream probing has been the dominant method for evaluating model representations, an important process given the increasing prominence of self-supervised learning and foundation models. However, downstream probing primarily assesses the availability of task-relevant information in the model's latent space, overlooking attributes such as equivariance, invariance, and disentanglement, which contribute to the interpretability, adaptability, and utility of representations in real-world applications. While some attempts have been made to measure these qualities in representations, no unified evaluation framework with modular, generalizable, and interpretable metrics exists. In this paper, we argue for the importance of representation evaluation beyond downstream probing. We introduce a standardized protocol to quantify informativeness, equivariance, invariance, and disentanglement of factors of variation in model representations. We use it to evaluate representations from a variety of models in the image and speech domains using different architectures and pretraining approaches on identified controllable factors of variation. We find that representations from models with similar downstream performance can behave substantially differently with regard to these attributes. This hints that the respective mechanisms underlying their downstream performance are functionally different, prompting new research directions to understand and improve representations.
format	Preprint
id	arxiv_https___arxiv_org_abs_2505_06224
institution	arXiv
publishDate	2025
record_format	arxiv
title	Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks
topic	Machine Learning
url	https://arxiv.org/abs/2505.06224

Similar Items