Faculty
Science
Supervisor Name
Konstantinos Kontogiannis
Keywords
machine learning, git commits, github, software testing
Description
Large software systems are implemented using many different programming languages and scripts, and consequently the dependencies between their components are very complex. It is therefore difficult to extract and understand these dependencies by solely analyzing the source code, so that failure risks can be detected accurately. On the other hand, it is a common practice for software engineers to keep track of process related metrics such as the number of times a component was maintained, with which other components it has been co-committed, whether the maintenance activity was a bug-fixing activity, and how many lines of source code have been altered. These data provide valuable information to be used for training a machine learning model and for devising metrics which can predict the risk associated with a future failure of a component due to maintenance activities in this or in another component related to it.
Acknowledgements
Konstantinos Kontogiannis, Marios Grigoriou, Alberto Giammaria (IBM US, Austin TX. Lab), Chris Brealey (IBM Canada, Toronto Canada Lab)
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Document Type
Poster
Included in
Evaluating Machine Learning Model Stability for Software Bug Prediction
Large software systems are implemented using many different programming languages and scripts, and consequently the dependencies between their components are very complex. It is therefore difficult to extract and understand these dependencies by solely analyzing the source code, so that failure risks can be detected accurately. On the other hand, it is a common practice for software engineers to keep track of process related metrics such as the number of times a component was maintained, with which other components it has been co-committed, whether the maintenance activity was a bug-fixing activity, and how many lines of source code have been altered. These data provide valuable information to be used for training a machine learning model and for devising metrics which can predict the risk associated with a future failure of a component due to maintenance activities in this or in another component related to it.