Electronic Thesis and Dissertation Repository

Thesis Format

Monograph

Degree

Master of Engineering Science

Program

Electrical and Computer Engineering

Supervisor

Ouda, Abdelkader

Abstract

The field of cybersecurity is exploring new ways to defend against cyber-attacks, including a technique called continuous user authentication. This method uses artificial intelligence (AI) to continuously match a user's keystroke (typing) pattern against previously recorded patterns in order to identify the user. While this approach has the potential to improve security, it also faces challenges, including the time it takes to register a user, the performance of machine learning algorithms on real-world data, and latency within the system. This study proposes solutions to these issues: using transfer learning to reduce user registration time, testing machine learning algorithms on real-world data, and developing a universal benchmarking framework to evaluate databases in practical situations. The results of the experiments supported the researchers' observations and suggestions for improving continuous user authentication.

Summary for Lay Audience

Modern systems require robust cybersecurity solutions. Traditional authentication methods such as passwords, fingerprints, and authorization cards authenticate the user at the beginning of a session, but there is no validation during the session, which leaves the system vulnerable. Continuous authentication addresses this challenge. In continuous authentication, keystroke data is used to extract the behavior patterns of the user. The data is then used to train machine learning (ML) classification algorithms to identify the unique behavioral patterns of each user and classify them accordingly. However, continuous authentication comes with several challenges. First, it requires a long registration time: ML algorithms need a lot of data to find a user's behavioral pattern, and gathering that data takes time, which delays the start of continuous authentication for a new user. To overcome this issue, transfer learning was applied to a feed-forward neural network model. Second, the performance of the ML classification algorithm is key in continuous user authentication, and the algorithm requires diverse and comprehensive data to be effective in a production environment. In many cases, an ML algorithm is trained on datasets collected in a controlled lab environment, and the model then fails or does not perform as expected in production. For example, China’s facial recognition system recognized the face on a bus ad as a jaywalker because the model was not trained on real-world data. To overcome this problem, this study uses the real-world data of 48 financial organizations’ employees to compare the performance of advanced ML algorithms and ensembles of algorithms. Third, data latency is critical in continuous authentication: the database must manage millions of records, and its performance has a great influence on the continuous authentication process. It is therefore necessary to identify the leading database for a continuous authentication system. To evaluate different databases, a universal database benchmarking tool was developed, and the performance of MySQL and PostgreSQL was evaluated in production-like scenarios to determine the best-suited database for a continuous authentication system.
