Electronic Thesis and Dissertation Repository

Degree

Master of Engineering Science

Program

Electrical and Computer Engineering

Supervisor

Jagath Samarabandu

Abstract

Recurrent Neural Networks (RNNs) show remarkable results in sequence learning, particularly in architectures with gated units such as Long Short-Term Memory (LSTM). In recent years, several variants of the LSTM architecture have been proposed, mainly to overcome its computational complexity. This dissertation presents an empirical study that investigates and evaluates LSTM architecture variants, namely the Gated Recurrent Unit (GRU), Bi-Directional LSTM (BLSTM), and Dynamic-RNN implementations of LSTM and GRU, specifically for detecting network intrusions. The investigation is designed to measure the training time required by each architecture and its intrusion prediction accuracy. Each architecture was evaluated on the DARPA/KDD Cup'99 intrusion detection dataset. Feature selection mechanisms, namely Principal Component Analysis (PCA) and the Random Forest (RF) algorithm, were also applied to identify and remove nonessential variables that do not affect the accuracy of the prediction models. The results showed that RF captured more significant features than PCA: with RF, accuracy reached 97.86% for LSTM and 96.59% for GRU, whereas with PCA it was 64.34% for LSTM and 67.97% for GRU. Among the RNN architectures, each variant's prediction accuracy improved at specific parameter settings, yet with a large dataset and sufficient training time the standard (vanilla) LSTM led all other architectures, scoring 99.48%. The Dynamic-RNN variants also achieved strong accuracy (Dynamic-RNN GRU scored 99.34%), but they took longer to train at high training-cycle counts; Dynamic-RNN LSTM required 25,284.03 seconds at 1,000 training cycles. The GRU architecture, a variant introduced to reduce LSTM complexity, uses fewer parameters and therefore trains faster: GRU needed 1,903.09 seconds where LSTM required 2,354.93 seconds for the same number of training cycles, while showing equivalent performance with respect to parameters such as hidden layers and time-steps. BLSTM offered an impressive training time of 190 seconds at 100 training cycles, though its accuracy, which did not exceed 90%, was below that of the other RNN architectures.
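
The following Python sketch is illustrative only and is not the code used in the thesis; it shows, under stated assumptions, how Random Forest feature ranking can feed LSTM and GRU classifiers of the kind compared above. The placeholder data, top-20 feature cutoff, layer size, and epoch count are assumptions made for illustration.

# Illustrative sketch only (not the thesis's implementation): Random Forest
# feature ranking followed by LSTM and GRU classifiers in Keras.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, GRU, Dense

# Placeholder data standing in for preprocessed KDD Cup'99 records:
# 41 features per record, binary label (normal vs. intrusion).
X = np.random.rand(1000, 41).astype("float32")
y = np.random.randint(0, 2, size=1000)

# Feature selection: keep the features the Random Forest ranks as most important
# (the top-20 cutoff here is an assumption, not the thesis's setting).
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
top = np.argsort(rf.feature_importances_)[::-1][:20]
X_sel = X[:, top].reshape(-1, 1, len(top))  # (samples, time-steps, features)

# Train one recurrent classifier per architecture and report accuracy.
for cell in (LSTM, GRU):
    model = Sequential([cell(64, input_shape=(1, len(top))),
                        Dense(1, activation="sigmoid")])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.fit(X_sel, y, epochs=5, batch_size=128, verbose=0)
    print(cell.__name__, "accuracy:", model.evaluate(X_sel, y, verbose=0)[1])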
