Potential of Vision Transformers for Advanced Driver-Assistance Systems: An Evaluative Approach

Andrew Katoch, The University of Western Ontario

Abstract

In this thesis, we examine the performance of Vision Transformers concerning the current state of Advanced Driving Assistance Systems (ADAS). We explore the Vision Transformer model and its variants on the problems of vehicle computer vision. Vision transformers show performance competitive to convolutional neural networks but require much more training data. Vision transformers are also more robust to image permutations than CNNs. Additionally, Vision Transformers have a lower pre-training compute cost but can overfit on smaller datasets more easily than CNNs. Thus we apply this knowledge to tune Vision transformers on ADAS image datasets, including general traffic objects, vehicles, traffic lights, and traffic signs. We compare the performance of Vision Transformers on this problem to existing convolutional neural network approaches to determine the viability of Vision Transformer usage.

This item has been relocated to Western University’s Open Repository

Potential of Vision Transformers for Advanced Driver-Assistance Systems: An Evaluative Approach

Abstract

Links

Browse

Author Corner

Links