Staff View: Diffusion Representations for Fine-Grained Image Classification: A Marine Plankton Case Study :: Library Catalog

Bibliographic Details
Main Authors:	Juscafresa, A. Nieto; Herreros, Á. Mazcuñán; Sullivan, J.
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2601.13416

_version_	1866915739729920000
author	Juscafresa, A. Nieto; Herreros, Á. Mazcuñán; Sullivan, J.
author_facet	Juscafresa, A. Nieto; Herreros, Á. Mazcuñán; Sullivan, J.
contents	Diffusion models have emerged as state-of-the-art generative methods for image synthesis, yet their potential as general-purpose feature encoders remains underexplored. Trained for denoising and generation without labels, they can be interpreted as self-supervised learners that capture both low- and high-level structure. We show that a frozen diffusion backbone enables strong fine-grained recognition by probing intermediate denoising features across layers and timesteps and training a linear classifier for each pair. We evaluate this in a real-world plankton-monitoring setting with practical impact, using controlled and comparable training setups against established supervised and self-supervised baselines. Frozen diffusion features are competitive with supervised baselines and outperform other self-supervised methods in both balanced and naturally long-tailed settings. Out-of-distribution evaluations on temporally and geographically shifted plankton datasets further show that frozen diffusion features maintain strong accuracy and Macro F1 under substantial distribution shift.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_13416
institution	arXiv
publishDate	2026
record_format	arxiv
title	Diffusion Representations for Fine-Grained Image Classification: A Marine Plankton Case Study
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2601.13416

Similar Items