:: Library Catalog

Bibliographic Details
Main Authors:	Mehta, Yash; Bonner, Michael F.
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2605.05556

Similar Items

Universal dimensions of visual representation
by: Chen, Zirui, et al.
Published: (2024)

Efficient coding along the visual hierarchy
by: Passi, Ananya, et al.
Published: (2026)

Predicting upcoming visual features during eye movements yields scene representations aligned with human visual cortex
by: Thorat, Sushrut, et al.
Published: (2025)

Self-supervised video pretraining yields robust and more human-aligned visual representations
by: Parthasarathy, Nikhil, et al.
Published: (2022)

Evaluating alignment between humans and neural network representations in image-based learning tasks
by: Demircan, Can, et al.
Published: (2023)

Visual representations in the human brain are aligned with large language models
by: Doerig, Adrien, et al.
Published: (2022)

Do text-free diffusion models learn discriminative visual representations?
by: Mukhopadhyay, Soumik, et al.
Published: (2023)

Can multimodal representation learning by alignment preserve modality-specific information?
by: Thoreau, Romain, et al.
Published: (2025)

Vision Transformer attention alignment with human visual perception in aesthetic object evaluation
by: Carrasco, Miguel, et al.
Published: (2025)

Dimensions underlying the representational alignment of deep neural networks with humans
by: Mahner, Florian P., et al.
Published: (2024)

Characterizing the visual representation of objects from the child's view
by: Yang, Jane, et al.
Published: (2026)

Dilated Convolution with Learnable Spacings makes visual models more aligned with humans: a Grad-CAM study
by: Chamas, Rabih, et al.
Published: (2024)

Do computer vision foundation models learn the low-level characteristics of the human visual system?
by: Cai, Yancheng, et al.
Published: (2025)

Closing the gap in multimodal medical representation alignment
by: Grassucci, Eleonora, et al.
Published: (2026)

Towards aligned body representations in vision models
by: Gizdov, Andrey, et al.
Published: (2025)

Self-supervised vision-langage alignment of deep learning representations for bone X-rays analysis
by: Englebert, Alexandre, et al.
Published: (2024)

Assessing the alignment between infants' visual and linguistic experience using multimodal language models
by: Tan, Alvin Wei Ming, et al.
Published: (2025)

A transition towards virtual representations of visual scenes
by: Pereira, Américo, et al.
Published: (2024)

MoDiPO: text-to-motion alignment via AI-feedback-driven Direct Preference Optimization
by: Pappa, Massimiliano, et al.
Published: (2024)

AGA: An adaptive group alignment framework for structured medical cross-modal representation learning
by: Li, Wei, et al.
Published: (2025)

Sparse components distinguish visual pathways & their alignment to neural networks
by: Marvi, Ammar I, et al.
Published: (2025)

Extending global-local view alignment for self-supervised learning with remote sensing imagery
by: Wanyan, Xinye, et al.
Published: (2023)

CoCoG-2: Controllable generation of visual stimuli for understanding human concept representation
by: Wei, Chen, et al.
Published: (2024)

Learning complete and explainable visual representations from itemized text supervision
by: Lyu, Yiwei, et al.
Published: (2025)

A Self supervised learning framework for imbalanced medical imaging datasets
by: Sharma, Yash Kumar, et al.
Published: (2026)

Generating visual explanations from deep networks using implicit neural representations
by: Byra, Michal, et al.
Published: (2025)

On the dynamic evolution of CLIP texture-shape bias and its relationship to human alignment and model robustness
by: Hernández-Cámara, Pablo, et al.
Published: (2025)

Vision CNNs trained to estimate spatial latents learned similar ventral-stream-aligned representations
by: Xie, Yudi, et al.
Published: (2024)

Human alignment of neural network representations
by: Muttenthaler, Lukas, et al.
Published: (2022)

Self-supervised structured object representation learning
by: Hadjerci, Oussama, et al.
Published: (2025)

Deep video representation learning: a survey
by: Ravanbakhsh, Elham, et al.
Published: (2024)

Pathological Truth Bias in Vision-Language Models
by: Thube, Yash
Published: (2025)

A deep multiple instance learning approach based on coarse labels for high-resolution land-cover mapping
by: Perantoni, Gianmarco, et al.
Published: (2025)

Quantifying the human visual exposome with vision language models
by: Rominger, Christian, et al.
Published: (2026)

UniAR: A Unified model for predicting human Attention and Responses on visual content
by: Li, Peizhao, et al.
Published: (2023)

Affine transformation estimation improves visual self-supervised learning
by: Torpey, David, et al.
Published: (2024)

Improving generalization by mimicking the human visual diet
by: Madan, Spandan, et al.
Published: (2022)

CheXalign: Preference fine-tuning in chest X-ray interpretation models without human feedback
by: Hein, Dennis, et al.
Published: (2024)

Generalizability analysis of deep learning predictions of human brain responses to augmented and semantically novel visual stimuli
by: Piskovskyi, Valentyn, et al.
Published: (2024)

Noise-aware few-shot learning through bi-directional multi-view prompt alignment
by: Niu, Lu, et al.
Published: (2026)