Electronic Thesis and Dissertation Repository

Thesis Format



Master of Science


Computer Science


Kontogiannis, Konstantinos

Abstract


Continuous software engineering principles advocate a release-small, release-often process model, where new functionality is added to a system frequently and in small increments. In such a process model, every time a change is introduced, it is important to identify as early as possible whether the system has entered a state where faults are more likely to occur. In this thesis, we present a method based on process, quality, and source code metrics to evaluate the likelihood of an imminent bug-inducing commit. More specifically, the method analyzes the correlations and the rate of change of selected metrics. Findings from the Technical Debt Dataset, using data extracted from SonarQube, indicate that before bug-inducing commits, metrics that are otherwise uncorrelated suddenly exhibit a high correlation or a high rate of change. This metric behavior can then be used to anticipate an impending bug-inducing commit. The technique is programming-language agnostic, relies on metrics that can be extracted without specialized parsers, and can be applied to forewarn developers that a file, or a collection of files, has entered a state where faults are highly probable.

Summary for Lay Audience

The primary focus of the software development process is to produce high-quality software at every stage. To achieve this, it is important to continuously assess and improve the quality of the software. One way to accomplish this is through software quality prediction, which involves evaluating the quality of the software regularly and identifying any potential quality issues early on. This process is also known as bug prediction, as it aims to identify and address bugs or defects in the software.

Bug prediction is vital for improving the overall quality of the software and can also help to reduce the cost and time required for testing. This thesis proposes techniques for evaluating the likelihood of a fault-inducing commit based on process, quality, and source code metrics. Our approach calculates the correlation between different metrics and the angle of the least-squares regression line fitted to each metric over recent commits. By examining these values, we aim to assess whether faults are highly probable in the upcoming commits.
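
The two signals described above, the correlation between metrics and the angle of the least-squares regression line, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the window size, the thresholds, and the sample metric values are all hypothetical and chosen only for illustration. It also assumes that metric values are on comparable scales, since the angle of a fitted line depends on the units of the metric.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # constant series: correlation is undefined, treat as 0
    return cov / math.sqrt(vx * vy)


def slope_angle_deg(ys):
    """Angle, in degrees, of the least-squares line fitted to ys
    against the commit index 0, 1, ..., n-1 (requires n >= 2)."""
    n = len(ys)
    mx, my = (n - 1) / 2, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (y - my) for x, y in zip(range(n), ys))
    return math.degrees(math.atan(sxy / sxx))


def at_risk(metric_a, metric_b, window=5,
            corr_threshold=0.9, angle_threshold=60.0):
    """Flag the most recent `window` commits if the two metrics are
    strongly correlated or either one changes steeply.
    The thresholds are illustrative assumptions, not values from the thesis."""
    a, b = metric_a[-window:], metric_b[-window:]
    high_corr = abs(pearson(a, b)) >= corr_threshold
    steep = max(abs(slope_angle_deg(a)), abs(slope_angle_deg(b))) >= angle_threshold
    return high_corr or steep


# Hypothetical per-commit values of two metrics for one file.
complexity = [10, 10, 11, 10, 11, 12, 15, 19, 24, 30]
code_smells = [3, 4, 3, 4, 3, 4, 5, 7, 9, 12]

print(at_risk(complexity[:5], code_smells[:5]))  # early, stable commits
print(at_risk(complexity, code_smells))          # most recent commits
```

In this made-up series, the two metrics move independently during the first five commits and then rise together, so only the most recent window is flagged. In practice the window size and thresholds would need to be calibrated against labeled bug-inducing commits.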