Electronic Thesis and Dissertation Repository

Deep Reinforcement Learning for Autonomous Unmanned Aerial Vehicle Navigation

Fadi AlMahamid, Western University

Abstract

Unmanned Aerial Vehicles (UAVs) are instrumental in various tasks, including package delivery, disaster response, and surveillance. These varied applications highlight the need for advanced navigation techniques, and Deep Reinforcement Learning (DRL) is a key approach to enhancing UAV autonomy. The challenges in UAV navigation using DRL span three key areas: understanding how DRL is applied to UAV navigation, designing navigation frameworks that accommodate the requirements of autonomous UAV navigation, and developing adaptive DRL algorithms that handle the high-dimensional inputs and temporal dependencies inherent in UAV navigation.

To address these challenges, this thesis investigates DRL for autonomous UAV navigation in complex 3D environments. The investigation emphasizes understanding algorithmic properties and navigation tasks in order to apply DRL methodologies to UAV navigation. First, the DRL algorithms used for autonomous UAV navigation are investigated and classified. The review covers over fifty Reinforcement Learning (RL) algorithms, including their characteristics, relationships, and classifications based on the application environment and UAV navigation. Moreover, a process for selecting an appropriate DRL algorithm based on the navigation environment and algorithmic requirements is presented.

Next, the thesis presents VizNav, a modular RL-based navigation framework that addresses current challenges in RL-based autonomous UAV navigation. VizNav leverages off-policy RL algorithms and employs Prioritized Experience Replay (PER) to improve navigation results and algorithm convergence. Additionally, VizNav uses Depth Map Images (DMI) to provide the agent with a more accurate and comprehensive depth perspective. Experimental results show improved navigation using the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm supported by PER and DMI, while the framework remains adaptable to different algorithms and environments.
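
As a rough illustration of the prioritized replay idea mentioned above, the following minimal sketch shows proportional PER in its commonly described form: transitions are sampled with probability proportional to their priority raised to a power, and importance-sampling weights correct the resulting bias. It is not VizNav's implementation. The class name, the list-based storage, and the parameters `alpha`, `beta`, and `eps` are assumptions chosen for illustration only.

```python
import random


class PrioritizedReplayBuffer:
    """Illustrative proportional prioritized replay (hypothetical, not VizNav code)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-5):
        self.capacity = capacity      # maximum number of stored transitions
        self.alpha = alpha            # 0 = uniform sampling, 1 = fully prioritized
        self.eps = eps                # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0                  # next slot to overwrite once full

    def add(self, transition):
        # New transitions take the current maximum priority so they are
        # likely to be replayed soon after being stored.
        priority = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(priority)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = priority
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.data)
        indices = random.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights, normalized by the largest weight in the batch.
        weights = [(n * probs[i]) ** (-beta) for i in indices]
        max_weight = max(weights)
        weights = [w / max_weight for w in weights]
        return indices, [self.data[i] for i in indices], weights

    def update_priorities(self, indices, td_errors):
        # Priority is the magnitude of the temporal-difference error plus eps.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In an off-policy training loop, the agent would typically call `sample` to draw a batch, scale each sample's loss by its returned weight, and then pass the new temporal-difference errors to `update_priorities`.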

Finally, this thesis proposes Agile Deep Q-Network (AG-DQN), a novel DRL algorithm that manages high-dimensional inputs and temporal dependencies. AG-DQN employs a dynamic multi-glimpse strategy and advanced temporal processing to selectively and dynamically extract salient features for improved decision-making. It outperforms other state-of-the-art methods such as DRQN and DARQN in complex UAV navigation tasks while using only 32% of the total image pixels (environment state). Overall, the thesis contributes to the development of fully autonomous UAVs capable of navigating various scenarios, paving the way for broader applications.