Performance Evaluation of Classification Algorithms in Predicting Hepatitis Virus

  • Musa Wakil Bara, Department of Computer Science, Mai Idris Alooma Polytechnic, Geidam, Yobe State
  • Yusuf Abubakar, Department of Computer Science, Nuhu Bamalli Polytechnic, Zaria, Kaduna
  • Mohammed Abubakar, Department of Computer Science, Mai Idris Alooma Polytechnic, Geidam, Yobe State
Keywords: Hepatitis, WEKA, Data Mining, Classification

Abstract

The number of hepatitis patients is increasing daily, and their chances of survival are becoming difficult to predict. Hepatitis is caused by many factors, such as alcohol consumption, contaminated food and drugs. Because of the growing number of patients and the varied causes of the disease, doctors find it difficult to diagnose and predict it accurately. Researchers have applied data mining methods to extract valuable information from hepatitis databases to help predict the presence of the disease. In this paper, five classification algorithms were selected to predict the presence or absence of the disease, and their performance was compared against RandomForest, which recorded the highest performance in the literature. The experiment was conducted on the hepatitis dataset obtained from the UCI Machine Learning Repository using the WEKA data mining tool. The IBk classification algorithm outperformed the rest, with an accuracy of 88.80%, a recall of 98.80% and a precision of 83.20%. However, RandomForest remains the favourite in terms of accuracy. RandomForest is therefore recommended for hepatitis prediction where accuracy is the priority, IBk where recall is the priority, and MODLEM where precision is the priority.
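
The three measures reported above can be illustrated with a minimal Python sketch that assumes the standard binary confusion-matrix definitions. The function name and the counts used here are hypothetical and chosen only for illustration; they are not taken from the paper's experiments, which were run in WEKA.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision and recall from binary confusion-matrix counts.

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    total = tp + fp + fn + tn
    # Accuracy: share of all predictions that are correct.
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that the classifier finds.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts for illustration only (not results from the paper).
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

A classifier can score high on recall while scoring lower on precision, which is why the recommendation depends on which measure an application values most.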

Published
2021-11-17