Bibliographic Details
Main Authors: Kataoka, Asaki; Nagano, Yoshihiro; Oizumi, Masafumi
Format: Preprint
Published: 2025
Subjects: Neurons and Cognition
Online Access: https://arxiv.org/abs/2504.20364
author Kataoka, Asaki
Nagano, Yoshihiro
Oizumi, Masafumi
contents Recent advances in self-supervised learning have attracted significant attention from both machine learning and neuroscience. This is primarily because self-supervised methods do not require annotated supervisory information, making them applicable to training artificial networks without relying on large amounts of curated data, and potentially offering insights into how the brain adapts to its environment in an unsupervised manner. Although several previous studies have elucidated the correspondence between neural representations in deep convolutional neural networks (DCNNs) and biological systems, the extent to which unsupervised or self-supervised learning can explain the human-like acquisition of categorically structured information remains less explored. In this study, we investigate the correspondence between the internal representations of DCNNs trained using a self-supervised contrastive learning algorithm and human semantics and recognition. To this end, we employ a few-shot learning evaluation procedure, which measures the ability of DCNNs to recognize novel concepts from limited exposure, to examine the inter-categorical structure of the learned representations. Two comparative approaches are used to relate the few-shot learning outcomes to human semantics and recognition, with results suggesting that the representations acquired through contrastive learning are well aligned with human cognition. These findings underscore the potential of self-supervised contrastive learning frameworks to model learning mechanisms similar to those of the human brain, particularly in scenarios where explicit supervision is unavailable, such as in human infants prior to language acquisition.
format Preprint
id arxiv_https___arxiv_org_abs_2504_20364
institution arXiv
publishDate 2025
record_format arxiv
title Exploring internal representation of self-supervised networks: few-shot learning abilities and comparison with human semantics and recognition of objects
topic Neurons and Cognition
url https://arxiv.org/abs/2504.20364