
ORIGINAL RESEARCH article

Front. Comput. Neurosci.

Volume 19 - 2025 | doi: 10.3389/fncom.2025.1613291

This article is part of the Research Topic: Artificial Neuroscience: machine cognition and behaviour

Exploring Internal Representations of Self-supervised Networks: Few-shot Learning Abilities and Comparison with Human Semantics and Recognition of Objects

Provisionally accepted
  • 1The University of Tokyo, Bunkyo, Japan
  • 2Kyoto University, Kyoto, Japan

The final, formatted version of the article will be published soon.

Recent advances in self-supervised learning have attracted significant attention from both the machine learning and neuroscience communities. This is primarily because self-supervised methods do not require annotated supervisory information, making them applicable to training artificial networks without relying on large amounts of curated data, and potentially offering insights into how the brain adapts to its environment in an unsupervised manner. Although several previous studies have elucidated the correspondence between neural representations in deep convolutional neural networks (DCNNs) and biological systems, the extent to which unsupervised or self-supervised learning can explain the human-like acquisition of categorically structured information remains less explored. In this study, we investigate the correspondence between the internal representations of DCNNs trained using a self-supervised contrastive learning algorithm and human semantics and recognition. To this end, we employ a few-shot learning evaluation procedure, which measures the ability of DCNNs to recognize novel concepts from limited exposure, to examine the inter-categorical structure of the learned representations. Two comparative approaches are used to relate the few-shot learning outcomes to human semantics and recognition, with results suggesting that the representations acquired through contrastive learning are well aligned with human cognition. These findings underscore the potential of self-supervised contrastive learning frameworks to model learning mechanisms similar to those of the human brain, particularly in scenarios where explicit supervision is unavailable, such as in human infants prior to language acquisition.
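To make the abstract's "few-shot learning evaluation procedure" concrete, the sketch below shows one common way such an evaluation is run on frozen network embeddings: episodic nearest-prototype (nearest-class-mean) classification with cosine similarity. This is an illustrative assumption, not the authors' actual pipeline; the function name `few_shot_accuracy`, the episode parameters, and the synthetic stand-in features are all hypothetical, and real DCNN features with object-category labels would replace them in practice.

```python
import numpy as np

def few_shot_accuracy(embeddings, labels, n_way=5, k_shot=5, n_queries=15,
                      n_episodes=200, rng=None):
    """Prototype-based few-shot evaluation on frozen embeddings (illustrative).

    Each episode samples `n_way` classes; `k_shot` support examples per class
    define a prototype (mean embedding), and `n_queries` held-out examples are
    classified by nearest prototype under cosine similarity.
    """
    rng = np.random.default_rng(rng)
    # L2-normalize so dot products equal cosine similarity.
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    classes = np.unique(labels)
    accs = []
    for _ in range(n_episodes):
        episode_classes = rng.choice(classes, size=n_way, replace=False)
        protos, queries, q_labels = [], [], []
        for c_idx, c in enumerate(episode_classes):
            idx = rng.permutation(np.flatnonzero(labels == c))
            support, query = idx[:k_shot], idx[k_shot:k_shot + n_queries]
            protos.append(X[support].mean(axis=0))
            queries.append(X[query])
            q_labels.append(np.full(len(query), c_idx))
        protos = np.stack(protos)                    # (n_way, dim)
        queries = np.concatenate(queries)            # (n_way * n_queries, dim)
        q_labels = np.concatenate(q_labels)
        preds = (queries @ protos.T).argmax(axis=1)  # nearest prototype
        accs.append((preds == q_labels).mean())
    return float(np.mean(accs))

if __name__ == "__main__":
    # Toy demonstration with random stand-in features; in practice these would
    # be DCNN embeddings of images from held-out object categories.
    rng = np.random.default_rng(0)
    n_classes, per_class, dim = 20, 30, 64
    centers = rng.normal(size=(n_classes, dim))
    feats = np.concatenate([c + 0.5 * rng.normal(size=(per_class, dim)) for c in centers])
    labs = np.repeat(np.arange(n_classes), per_class)
    print(f"few-shot accuracy: {few_shot_accuracy(feats, labs, rng=1):.3f}")
```

Higher accuracy in this kind of probe indicates that the frozen representation separates novel categories well from only a handful of examples, which is the sense in which the abstract relates few-shot performance to the inter-categorical structure of the learned features.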

Keywords: Contrastive learning, Few-shot learning, Human semantics, Human recognition, Similarity, Self-supervised learning

Received: 17 Apr 2025; Accepted: 02 Oct 2025.

Copyright: © 2025 Kataoka, Nagano and Oizumi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Masafumi Oizumi, c-oizumi@g.ecc.u-tokyo.ac.jp

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.