Invariant Object Recognition in Deep Neural Networks and Humans

Haider Al-Tahan, Western University

Abstract

Invariant object recognition, a cornerstone of human vision, is the ability to recognize objects despite variations in rotation, position, and scale. Computational models that aim to emulate human-like generalization across object transformations must perform well at this task. Deep neural networks (DNNs) are popular models of processing in the human ventral visual stream, though their alignment with human performance remains inconsistent. We examine object recognition across transformations in human adults and pretrained feedforward DNNs. DNNs are grouped into model families by architecture, visual diet, and learning goal. We focus on object rotation in depth and observe that recognition performance is better preserved in humans than in DNNs, although humans and DNNs show a similar pattern in how performance drops as a function of rotation angle. DNNs also exhibit decreased recognition performance after other transformations, especially changes in scale. Model architecture has minimal influence on performance, whereas DNNs trained on richer visual diets and with unsupervised learning goals perform best. Our study suggests that visual diet and learning goals may play an important role in the development of invariant object recognition in humans.

This item has been relocated to Western University’s Open Repository
