MARC21: :: Library Catalog

Bibliographic Details
Main Authors:	Muttenthaler, Lukas; Greff, Klaus; Born, Frieda; Spitzer, Bernhard; Kornblith, Simon; Mozer, Michael C.; Müller, Klaus-Robert; Unterthiner, Thomas; Lampinen, Andrew K.
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2409.06509

_version_	1866911135190482944
author	Muttenthaler, Lukas; Greff, Klaus; Born, Frieda; Spitzer, Bernhard; Kornblith, Simon; Mozer, Michael C.; Müller, Klaus-Robert; Unterthiner, Thomas; Lampinen, Andrew K.
author_facet	Muttenthaler, Lukas; Greff, Klaus; Born, Frieda; Spitzer, Bernhard; Kornblith, Simon; Mozer, Michael C.; Müller, Klaus-Robert; Unterthiner, Thomas; Lampinen, Andrew K.
contents	Deep neural networks have achieved success across a wide range of applications, including as models of human behavior and neural representations in vision tasks. However, neural network training and human learning differ in fundamental ways, and neural networks often fail to generalize as robustly as humans do, raising questions regarding the similarity of their underlying representations. What is missing for modern learning systems to exhibit more human-aligned behavior? We highlight a key misalignment between vision models and humans: whereas human conceptual knowledge is hierarchically organized from fine- to coarse-scale distinctions, model representations do not accurately capture all these levels of abstraction. To address this misalignment, we first train a teacher model to imitate human judgments, then transfer human-aligned structure from its representations to refine the representations of pretrained state-of-the-art vision foundation models via finetuning. These human-aligned models more accurately approximate human behavior and uncertainty across a wide range of similarity tasks, including a new dataset of human judgments spanning multiple levels of semantic abstraction. They also perform better on a diverse set of machine learning tasks, increasing generalization and out-of-distribution robustness. Thus, infusing neural networks with additional human knowledge yields a best-of-both-worlds representation that is both more consistent with human cognitive judgments and more practically useful, paving the way toward more robust, interpretable, and human-aligned artificial intelligence systems.
format	Preprint
id	arxiv_https___arxiv_org_abs_2409_06509
institution	arXiv
publishDate	2024
record_format	arxiv
title	Aligning Machine and Human Visual Representations across Abstraction Levels
topic	Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning
url	https://arxiv.org/abs/2409.06509

Similar Items