
RESEARCH PRODUCT

A Machine Learning in Binary and Multiclassification Results on Imbalanced Heart Disease Data Stream

Danish Hamid, Syed Sajid Ullah, Jawaid Iqbal, Saddam Hussain, Ch Anwar Ul Hassan, Fazlullah Umar

subject

Article Subject; Control and Systems Engineering; Electrical and Electronic Engineering; VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550; Instrumentation

description

In the medical field, predicting the occurrence of heart disease is an important task. Machine learning can greatly simplify many healthcare-related problems that have so far remained unsolved. This study concerns a decision support system for cardiac disease diagnosis. It uses a data stream from the OpenML repository with 1 million heart disease instances and 14 features. After preprocessing and feature engineering, machine learning approaches including random forest, decision trees, gradient boosted trees (GBT), linear support vector classifier, logistic regression, one-vs-rest, and multilayer perceptron are used to perform binary and multiclass classification on the data stream. Combined with the Max Abs Scaler technique, the multilayer perceptron performed satisfactorily in both binary classification (accuracy 94.8%) and multiclass classification (accuracy 88.2%). Among the binary classification algorithms, GBT delivered the best result (accuracy 95.8%), while the multilayer perceptron did well in multiclass classification. Oversampling and undersampling had a negative impact on disease prediction. Machine learning methods such as multilayer perceptrons and ensembles can be helpful for diagnosing cardiac conditions, but sampling techniques such as oversampling and undersampling are not practical for this kind of imbalanced data stream.
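The preprocessing steps named in the description can be illustrated with a minimal pure-Python sketch. It is not the authors' pipeline: the function names, the toy rows, and the labels below are hypothetical and chosen only for illustration. Max-abs scaling divides each feature by its largest absolute value so that every value falls in [-1, 1]; random oversampling duplicates minority-class rows, and random undersampling discards majority-class rows, until the classes are balanced.

```python
import random
from collections import Counter


def max_abs_scale(rows):
    """Divide each column by its largest absolute value (result lies in [-1, 1])."""
    n_features = len(rows[0])
    scales = []
    for j in range(n_features):
        largest = max(abs(row[j]) for row in rows)
        # An all-zero column is left unchanged to avoid dividing by zero.
        scales.append(largest if largest != 0 else 1.0)
    return [[row[j] / scales[j] for j in range(n_features)] for row in rows]


def random_oversample(rows, labels, seed=0):
    """Duplicate random rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for r in rng.sample(pool, target):
            out_rows.append(r)
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: three made-up numeric features, imbalanced binary labels.
toy_rows = [
    [63.0, 233.0, -1.0],
    [41.0, 204.0, 0.5],
    [57.0, 354.0, 2.0],
    [70.0, 177.0, -0.5],
]
toy_labels = [1, 0, 0, 0]

scaled = max_abs_scale(toy_rows)
over_rows, over_labels = random_oversample(scaled, toy_labels)
under_rows, under_labels = random_undersample(scaled, toy_labels)
```

In this sketch oversampling only repeats existing minority rows and undersampling throws rows away, which shows the general trade-off of such resampling; the description reports that both techniques hurt prediction on this data stream.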

https://doi.org/10.1155/2022/8400622