
Automatically Classifying Non-functional Requirements with Feature Extraction and Supervised Machine Learning Techniques
Abstract
Context and Motivation: Non-functional requirements (NFRs) of a system need to be classified into types such as usability and performance. This enables stakeholders to check the completeness of their work by extracting the specific NFRs related to their expertise. Question/Problem: Because of the size and complexity of requirement specification documents, manual classification of NFRs is time-consuming, labour-intensive, and error-prone. An automated solution that categorizes NFRs accurately and efficiently is therefore needed. Principal ideas/results: Using natural language processing and supervised machine learning (SML) techniques, we investigate feature extraction techniques, including Part-of-Speech (POS) tagging, Bag of Words (BoW), and Term Frequency-Inverse Document Frequency (TF-IDF), combined with SML algorithms including Support Vector Machine (SVM), Stochastic Gradient Descent (SGD) SVM, Linear Regression (LR), Decision Tree (DT), Bagging DT, Extra Tree, Random Forest (RF), Gaussian Naïve Bayes (GNB), Multinomial Naïve Bayes (MNB), and Bernoulli Naïve Bayes (BNB). Contribution: The proposed strategy consists of three combinations of the techniques above. SVM with TF-IDF, LR with POS and BoW, and MNB with BoW all achieved recall values higher than 0.90, precision values above 0.87, and execution times of less than 0.1 s. In addition, we validated these classifiers on a case-study dataset, where they showed promising results, with recall values over 0.90 and precision values over 0.92.
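
To make one of the named combinations concrete, the following is a minimal sketch of MNB with BoW features, written in plain Python so that the counting and smoothing steps are visible. It is an illustration under stated assumptions, not the implementation evaluated in this work: the class name `BowMultinomialNB`, the whitespace tokenizer, the Laplace smoothing value, and the four toy requirements are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (tok.strip(".,;:()").lower() for tok in text.split())
    return [tok for tok in tokens if tok]


class BowMultinomialNB:
    """Hypothetical BoW + Multinomial Naive Bayes classifier (illustration only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # BoW term counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Terms never seen during training carry no evidence and are skipped.
        tokens = [tok for tok in tokenize(text) if tok in self.vocab]
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / self.total_docs)
            total_terms = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log(
                    (count + self.alpha) / (total_terms + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy requirements, invented for this sketch (not taken from any dataset).
train_texts = [
    "The system shall respond to queries within two seconds",
    "The page shall load within three seconds under peak load",
    "The interface shall be easy to learn for new users",
    "Users shall be able to navigate the menu intuitively",
]
train_labels = ["performance", "performance", "usability", "usability"]

model = BowMultinomialNB().fit(train_texts, train_labels)
prediction = model.predict("Search results shall load within one second")
print(prediction)
```

In practice, a library implementation would normally replace this hand-written class; the sketch only shows how BoW term counts feed the MNB decision rule.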