Survey on highly imbalanced multi-class data

Abdul Hamid, Mohd Hakim and Yusoff, Marina and Mohamed, Azlinah (2022) Survey on highly imbalanced multi-class data. International Journal of Advanced Computer Science and Applications (IJACSA), 13 (6). pp. 211-229. ISSN 2156-5570

Abstract

Machine learning has had a massive impact on society because it offers solutions to many complicated problems such as classification, clustering analysis, and prediction, especially during the COVID-19 pandemic. Data distribution is an essential aspect of machine learning for producing unbiased solutions. From the earliest literature on highly imbalanced data until recently, machine learning research has focused mostly on binary classification problems. Research on highly imbalanced multi-class data remains largely unexplored, even as the need for better analysis and prediction in handling Big Data grows. This study reviews the models and techniques for handling highly imbalanced multi-class data, along with their strengths, weaknesses, and related domains. Furthermore, the paper uses a statistical method to explore a case study with a severely imbalanced dataset. This article aims to (1) understand the trend of highly imbalanced multi-class data through analysis of the related literature; (2) analyze previous and current methods of handling highly imbalanced multi-class data; and (3) construct a framework for highly imbalanced multi-class data. The chosen highly imbalanced multi-class dataset is also analyzed and adapted to current machine learning methods and techniques, followed by a discussion of open challenges and future directions for highly imbalanced multi-class data. Finally, this paper presents a novel framework for highly imbalanced multi-class data. We hope this research provides insights into the potential development of better methods and techniques to handle and manipulate highly imbalanced multi-class data.

Item Type: Article
Uncontrolled Keywords: Imbalanced data, Highly imbalanced data, Highly imbalanced multi-class, Data strategies
Divisions: Faculty of Information and Communication Technology
Depositing User: mr eiisaa ahyead
Date Deposited: 10 Feb 2023 15:15
Last Modified: 10 Feb 2023 15:15
URI: http://eprints.utem.edu.my/id/eprint/26188