Electrical and Computer Engineering Publications

A Novel Multidimensional Reference Model For Heterogeneous Textual Datasets Using Context, Semantic And Syntactic Clues

Ganesh Kumar, Universiti Teknologi Petronas
Shuib Basri, Universiti Teknologi Petronas
Abdullahi Abubakar Imam, Universiti Brunei Darussalam
Abdullateef Abdullateef Oluwagbemiga Balogun, Universiti Teknologi Petronas
Hussaini Mamman, Universiti Teknologi Petronas
Luiz Fernando Capretz, University of Western Ontario

Document Type

Article

Publication Date

10-31-2023

Volume

14

Issue

10

Journal

International Journal of Advanced Computer Science and Applications

First Page

754

Last Page

763

Abstract

With the advent of technology and the use of the latest devices, voluminous data are being produced. Of these data, 80% are unstructured and the remaining 20% are structured or semi-structured. The data are produced in heterogeneous formats and follow no standards. Among heterogeneous (structured, semi-structured and unstructured) data, textual data are nowadays used by industries for prediction and visualization of future challenges. Extracting useful information from textual data is challenging for stakeholders because of problems with lexical and semantic matching. A few studies have addressed this issue using ontologies and semantic tools, but the main limitation of the proposed work was its limited coverage of multidimensional terms. To solve this problem, this study aims to produce a novel multidimensional reference model (MRM) for heterogeneous textual datasets using linguistic categories. It focuses on categories such as context, semantic and syntactic clues, along with their scores. The main contribution of MRM is that it checks each token against each term based on an index of linguistic categories such as synonym, antonym, formal, lexical word order and co-occurrence. The experiments show that MRM achieves a better percentage than the state-of-the-art single-dimension reference model in terms of coverage, linguistic categories and heterogeneous datasets.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Citation of this paper:

Kumar G., Imam A.A., Mamman H., Basri S., Balogun A.O., Capretz L.F., A Novel Multidimensional Reference Model for Heterogeneous Textual Datasets Using Context, Semantic and Syntactic Clues, International Journal of Advanced Computer Science and Applications, Volume 14, Issue 10, pp. 754-763, 2023.

Included in

Software Engineering Commons
