Doctor of Philosophy
Internet traffic forecasting is a crucial component of the proactive management of self-organizing networks (SON), helping to ensure better Quality of Service (QoS) and Quality of Experience (QoE). Because traffic data is volatile and random in nature, forecasting also influences strategic development and investment decisions in the Internet Service Provider (ISP) industry. Modern machine learning algorithms have shown potential in handling complex Internet traffic prediction tasks, yet challenges persist. This thesis systematically explores these challenges across five empirical studies conducted over the past three years, focusing on four key research questions: How do outlier data samples impact prediction accuracy in both short-term and long-term forecasting? How can a denoising mechanism enhance prediction accuracy? How can robust machine learning models be built with limited data? How can out-of-distribution traffic data be used to improve the generalizability of prediction models? Based on extensive experiments, we propose a novel traffic forecasting framework and associated models that integrate outlier management and noise reduction strategies, outperforming traditional machine learning models. Additionally, we propose a transfer learning-based framework combined with a data augmentation technique to provide robust solutions with smaller datasets. Lastly, we propose a hybrid model with signal decomposition techniques to enhance model generalization to out-of-distribution data samples. We also address cyber threats as part of our forecasting research, acknowledging their substantial influence on traffic unpredictability and forecasting challenges. Our thesis presents a detailed exploration of cyber-attack detection, employing methods that have been validated using multiple benchmark datasets.
First, we combine ensemble feature selection with ensemble classification to improve Distributed Denial-of-Service (DDoS) attack detection accuracy while minimizing false alarms. Our research further introduces a stacking ensemble framework for classifying diverse forms of cyber-attacks. We then propose a weighted voting mechanism for Android malware detection to secure Mobile Cyber-Physical Systems, which integrate the mobility of various smart devices to exchange information between physical and cyber systems. Lastly, we employ Generative Adversarial Networks to generate flow-based DDoS attacks in Internet of Things environments. By considering the impact of cyber-attacks on traffic volume and the challenges they pose for traffic prediction, our research attempts to bridge the gap between traffic forecasting and cyber security, enhancing the proactive management of networks and contributing to a resilient and secure internet infrastructure.
Summary for Lay Audience
Predicting internet traffic is like forecasting the weather: it is crucial for planning, but it is also quite challenging because data flow is unpredictable. This process is important for companies that provide internet services because it helps them plan their resources and investments more effectively. In our research, we used modern computer algorithms, commonly known as machine learning, to predict this traffic. However, we faced certain challenges. For instance, sometimes we have limited data or outliers (data points that are significantly different from the others), which can reduce the accuracy of predictions. Over the past three years, we conducted five studies to address these and other challenges. The result was a new model that predicts traffic better than traditional ones by managing unusual data and reducing the noise in the information. We also developed methods to work with small datasets and to improve predictions on types of data not seen during the training of the model. In addition to traffic prediction, our work also focused on detecting cyber-attacks, specifically those that cause a lot of internet traffic, such as DDoS attacks. These attacks often disguise themselves and cause unexpected traffic bursts, making it hard to predict traffic volumes accurately. To address this, we combined different techniques to improve detection accuracy with fewer false alarms. We also created models for detecting various types of attacks and malware on Android devices, and developed a way to enhance data in Internet of Things (IoT) environments. We validated all these methods with multiple sets of data, demonstrating the effectiveness of our solutions. In simple terms, our work helps internet providers better predict traffic and secure networks from cyber-attacks. It is like providing them with a more accurate traffic forecast and a better security system, ensuring smoother internet experiences for all users.
Saha, Sajal, "Toward Building an Intelligent and Secure Network: An Internet Traffic Forecasting Perspective" (2023). Electronic Thesis and Dissertation Repository. 9556.
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License