Electronic Thesis and Dissertation Repository

Thesis Format

Integrated Article

Degree

Master of Science

Program

Computer Science

Supervisor

Narayan, Apurva

Abstract

This thesis addresses the critical challenge of automating security test case generation from attack trees, a process that has traditionally been labor-intensive and poorly automated in software testing. We introduce the Security Test Automation Framework (STAF), a novel approach leveraging Large Language Models (LLMs) and a two-step self-corrective Retrieval-Augmented Generation (RAG) framework. STAF provides an end-to-end solution for automatically generating executable security test cases from attack trees, enabling comprehensive coverage of potential vulnerabilities and attack vectors within software systems. Our methodology employs a custom RAG framework designed specifically for security test case generation, addressing limitations in existing approaches. Experimental results demonstrate that STAF-augmented Llama 3.1 and Qwen2.5 models outperform closed-source models such as GPT-4o and Claude 3.5 Sonnet, despite having 2-3 times fewer parameters. Additionally, we present the first publicly available benchmark dataset for security test case generation from attack trees, supporting standardized evaluation. The study reveals significant improvements in efficiency, accuracy, and scalability, with STAF integrating seamlessly into existing workflows. These findings mark a substantial advancement in security testing methodologies, potentially transforming how organizations approach vulnerability assessment and mitigation in software systems.

Summary for Lay Audience

In today's digital age, ensuring the security of software systems is more crucial than ever. However, testing for potential vulnerabilities is often a time-consuming and complex process. This research introduces an innovative solution to this challenge: the Security Test Automation Framework (STAF).

STAF leverages the power of advanced artificial intelligence, specifically Large Language Models (LLMs), to automate the generation of security tests. It does this by analyzing "attack trees" – diagrams that map out potential ways a system could be compromised. Think of an attack tree as a blueprint of vulnerabilities that cybercriminals might exploit.
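To make the idea concrete, the following is a minimal Python sketch of how an attack tree might be represented in code. The class name, fields, and example goals here are illustrative assumptions for this summary, not the specific encoding used by STAF or the thesis.

from dataclasses import dataclass, field
from typing import List

@dataclass
class AttackNode:
    """One goal or step in an attack tree (illustrative structure only)."""
    goal: str                      # e.g. "Obtain admin credentials"
    gate: str = "OR"               # "OR": any child suffices; "AND": all children required
    children: List["AttackNode"] = field(default_factory=list)

    def leaves(self):
        """Yield the leaf attack steps, i.e. the concrete actions a test must exercise."""
        if not self.children:
            yield self
        else:
            for child in self.children:
                yield from child.leaves()

# A tiny example tree: the attacker's root goal and two alternative paths to it.
root = AttackNode("Compromise user account", "OR", [
    AttackNode("Guess weak password"),
    AttackNode("Steal session token", "AND", [
        AttackNode("Intercept unencrypted traffic"),
        AttackNode("Replay captured token"),
    ]),
])

for leaf in root.leaves():
    print(leaf.goal)

Walking the leaves of such a tree yields the concrete attack steps for which security test cases need to be generated.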

What sets STAF apart is its novel approach. It uses a two-step process that not only generates tests but also refines them, ensuring they are accurate and comprehensive. This is achieved through a technique called Retrieval-Augmented Generation, which allows the system to pull relevant information from a vast knowledge base and apply it to create targeted security tests. The research demonstrates that STAF outperforms even larger, closed-source AI models in this task. It is more efficient, accurate, and scalable than current methods, potentially revolutionizing how we approach security testing in software development.
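The following is a minimal Python sketch of such a two-step, self-corrective retrieval-augmented loop, assuming a generic retriever and model interface; retrieve, llm, and generate_test_case are hypothetical placeholders for illustration, not STAF's actual API or prompts.

def retrieve(query: str, k: int = 5) -> list[str]:
    # Placeholder: in a real pipeline this would query a knowledge base of
    # security material (e.g. weakness descriptions, known test patterns).
    return [f"(retrieved document relevant to: {query[:40]})"]

def llm(prompt: str) -> str:
    # Placeholder: in a real pipeline this would call the backing model,
    # e.g. Llama 3.1 or Qwen2.5.
    return f"(model output for a prompt of {len(prompt)} characters)"

def generate_test_case(attack_step: str) -> str:
    """Illustrative two-step self-corrective RAG loop."""
    # Step 1: retrieval-augmented generation of a draft test case.
    context = "\n".join(retrieve(attack_step))
    draft = llm(
        f"Using the following security knowledge:\n{context}\n\n"
        f"Write an executable security test case for this attack step:\n{attack_step}"
    )

    # Step 2: self-correction. Retrieve again against the draft and ask the
    # model to critique and repair its own output before returning it.
    review_context = "\n".join(retrieve(draft))
    return llm(
        f"Reference material:\n{review_context}\n\n"
        f"Review this test case for correctness and completeness, "
        f"then return a corrected version:\n{draft}"
    )

print(generate_test_case("Intercept unencrypted traffic"))

The second pass is what makes the loop "self-corrective": the draft itself becomes the query for a fresh round of retrieval and review before a final test case is produced.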

Moreover, this study introduces a new benchmark dataset for evaluating such systems, contributing to the broader field of cybersecurity research. By automating and improving the security testing process, STAF could help developers identify and address vulnerabilities more quickly and thoroughly, ultimately leading to safer software systems for everyone.

This advancement has far-reaching implications, from improving the security of everyday applications to bolstering defenses against sophisticated cyber attacks. As our reliance on digital systems grows, innovations like STAF play a crucial role in building a more secure digital future.

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 License.
