
Optimized Machine Learning Models Towards Intelligent Systems
Abstract
The rapid growth of the Internet and related technologies has led individuals, organizations, and society in general to collect large amounts of data [1]. This often results in information overload, which occurs when the amount of input (e.g., data) a human is trying to process exceeds their cognitive capacity [2]. Machine learning (ML) has been proposed as one methodology capable of extracting useful information from large datasets [1]. This thesis focuses on two applications of ML.
The first application is education, specifically e-Learning environments. For this field, the thesis proposes several optimized ML ensemble models that predict students’ performance at early stages of course delivery. Experimental results showed that the proposed models accurately identified the weak students who needed help, achieving an accuracy of up to 96% in the binary case and 93.1% in the multi-class case.
The second application is network intrusion detection. For this field, the thesis proposes several optimized ML classification frameworks that use a variety of optimization modeling algorithms and heuristics to improve the performance of intrusion detection systems (IDSs) through anomaly detection while maintaining or reducing their time complexity. Experimental results showed that the developed models reduced the training sample size by up to 74%, reduced the feature set size by almost 60%, and improved the detection accuracy by up to 2%.
Accordingly, this thesis is divided into two main parts. The first part analyzes several educational datasets and proposes optimized ML classification ensemble models that accurately identify weak students who may need help. The second part proposes optimized ML classification frameworks that accurately detect network attacks while maintaining a low false alarm rate and low time complexity.
The developed models and frameworks can be generalized as follows:
- Optimized ML ensemble models proposed in the first part of this thesis can be generalized to many applications such as finance, network security, social media, and healthcare systems.
- Optimized ML classification models proposed in the second part of this thesis can be generalized to other applications that typically generate datasets that are large in both the number of instances and the size of the feature set.