A Hybrid Continual Machine Learning Model for Efficient Hierarchical Classification of Domain-Specific Text in The Presence of Class Overlap (Case Study: IT Support Tickets)
Doctor of Philosophy
Madhavji, Nazim H.
In today’s world, support ticketing systems are employed by a wide range of businesses. The ticketing system facilitates the interaction between customers and the support teams when the customer faces an issue with a product or a service. For large-scale IT companies with a large number of clients and a great volume of communications, the task of automating the classification of incoming tickets is key to guaranteeing long-term clients and ensuring business growth.
Although the problem of text classification has been widely studied in the literature, the majority of the proposed approaches revolve around state-of-the-art deep learning models. This thesis addresses the following research questions: What are the reasons behind employing black box models (i.e., deep learning models) for text classification tasks? What is the level of polysemy (i.e., the coexistence of many possible meanings for a word or phrase) in technical (i.e., specialized) text? How do static word embeddings like Word2vec fare against traditional TFIDF vectorization? How do dynamic word embeddings (e.g., PLMs) compare against a linear classifier such as Support Vector Machine (SVM) for classifying domain-specific text?
This integrated-article thesis investigates the aforementioned questions through five empirical studies conducted over the past four years. The observations from these studies yield an emerging theory that demonstrates why traditional ML models offer a more efficient solution to domain-specific text classification than state-of-the-art DL language models (i.e., PLMs).
Based on extensive experiments on a real-world dataset, we propose a novel Hybrid Online Offline Model (HOOM) that can efficiently classify IT Support Tickets in a real-time (i.e., dynamic) environment. Our classification model is anticipated to build trust and confidence when deployed into production as the model is interpretable, efficient, and can detect concept drifts in the data.
Summary for Lay Audience
According to a recent study, 96% of unhappy customers don’t complain, and 91% of those will simply leave and never come back. In the IT business, when customers have issues with the systems they are using, they submit a ‘support ticket’. A ‘Support Ticketing System’ is the term used to describe the way customers interact with the support agents to get their issues resolved. For large IT firms, support agents deal with a tremendous volume of support tickets daily. Handling these tickets manually is almost impossible, so the need to automate the process of organizing these tickets into different categories becomes crucial. This is called Text Classification (TC), which is one of several Natural Language Processing (NLP) tasks.
TC is challenging because human language is complex and unstructured. Recently, a suite of deep learning models called Pre-trained Language Models (PLMs) has been used extensively for all NLP tasks, including TC. These PLMs have achieved striking success in the NLP field because they are trained on an enormous amount of text (e.g., books, Wikipedia, etc.), which enables them to better understand the language. However, despite their impressive performance, we argue against the need to employ PLMs for TC tasks, especially when the text is domain-specific (i.e., related to a specialized domain such as IT).
Based on this, we pose the key research question: Are PLMs the most cost-efficient solution for domain-specific TC tasks? The findings of our study suggest that the problem of classifying domain-specific text can be addressed efficiently using traditional classifiers such as SVM and a vectorization technique such as TFIDF, neither of which involves the complexity found in neural network models such as PLMs.
This thesis proposes a novel hybrid approach to classify IT Support Tickets using a non-deep learning approach that combines a static ML model trained in an offline setting with an online ML model trained in a dynamic (real-time) environment. Our classification model is anticipated to build trust and confidence when deployed into production as the model is efficient and can detect data changes that occur over time.
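To make the TFIDF vectorization mentioned in the summary above more concrete, the following is a minimal sketch of how such weights can be computed. It is not the thesis's implementation: the function name `tfidf`, the whitespace tokenization, the smoothed inverse-document-frequency variant, and the three example tickets are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting (one common smoothed variant, not necessarily
    the one used in the thesis):
        weight = (count / doc_length) * (ln((1 + N) / (1 + df)) + 1)
    """
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


# Hypothetical tickets, invented for illustration only.
tickets = [
    "cannot login to the portal",
    "password reset link for the portal expired",
    "printer driver install failed on the laptop",
]
weights = tfidf(tickets)
```

In this sketch, a term that appears in every ticket (such as "the") receives a lower weight than a term confined to a single ticket (such as "login"), which is the property that lets a linear classifier such as SVM focus on the vocabulary that distinguishes one ticket category from another.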
Wahba, Yasmen M., "A Hybrid Continual Machine Learning Model for Efficient Hierarchical Classification of Domain-Specific Text in The Presence of Class Overlap (Case Study: IT Support Tickets)" (2023). Electronic Thesis and Dissertation Repository. 9192.