A classification model for water quality analysis using decision tree

Abstract

A classification algorithm assigns predefined classes to test instances (for evaluation) or to future instances (in an application). This study presents a classification model, built with decision trees, for analyzing water quality data from different counties in Kenya. Water quality is very important in ensuring that citizens have clean drinking water. Applying decision trees as a data mining method to predict clean water from water quality parameters can ease the work of the laboratory technologist by predicting which water samples should proceed to the next step of analysis. Secondary data from the Kenya Water Institute was used to create the model, which was implemented in WEKA software. Decision tree classification was applied to classify (predict) water as clean or not clean. Analysis of water alkalinity, pH level, and conductivity can play a major role in assessing water quality. Five decision tree classifiers (J48, LMT, Random Forest, Hoeffding Tree, and Decision Stump) were used to build the model, and their accuracies were compared. J48 had the highest accuracy, 94%, and Decision Stump the lowest, 83%.
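
As a rough illustration of the simplest classifier named above, the following sketch fits a decision stump (a one-level decision tree) in plain Python. It is not the study's WEKA model: the sample values, the `SAMPLES` and `FEATURES` names, and the helper functions are all hypothetical, and the split is chosen by counting training errors, which is a simplification and may differ from how WEKA's Decision Stump selects its split.

```python
from collections import Counter

# Hypothetical samples, invented for illustration only (not the study's data):
# (pH, alkalinity, conductivity, label)
SAMPLES = [
    (7.1, 120.0, 250.0, "clean"),
    (7.4, 110.0, 300.0, "clean"),
    (6.9, 130.0, 280.0, "clean"),
    (7.2, 100.0, 320.0, "clean"),
    (5.2, 300.0, 900.0, "not clean"),
    (9.3, 280.0, 1100.0, "not clean"),
    (5.8, 350.0, 1500.0, "not clean"),
    (8.9, 310.0, 1250.0, "not clean"),
]
FEATURES = ["pH", "alkalinity", "conductivity"]


def majority(labels):
    """Return the most common label and how many labels disagree with it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, len(labels) - count


def fit_stump(samples):
    """Find the single (feature, threshold) split with the fewest training errors."""
    best = None
    for f in range(len(FEATURES)):
        values = sorted({s[f] for s in samples})
        # Candidate thresholds are midpoints between adjacent distinct values,
        # so both sides of every split are non-empty.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [s[-1] for s in samples if s[f] <= t]
            right = [s[-1] for s in samples if s[f] > t]
            left_label, left_err = majority(left)
            right_label, right_err = majority(right)
            errors = left_err + right_err
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]


def predict(stump, sample):
    """Classify one sample by comparing a single feature to the threshold."""
    f, t, left_label, right_label = stump
    return left_label if sample[f] <= t else right_label


stump = fit_stump(SAMPLES)
accuracy = sum(predict(stump, s) == s[-1] for s in SAMPLES) / len(SAMPLES)
print(FEATURES[stump[0]], stump[1], accuracy)
```

On these invented samples no single pH threshold separates the classes, because unclean water appears at both low and high pH, so the stump splits on alkalinity instead. The accuracy printed here is training accuracy on toy data and says nothing about the figures reported in the study.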

Keywords: Data Mining, Decision Tree, Water Quality, WEKA Tool, Classification Model

Article Review Status: Published

Pages: 1-8

This work by European American Journals is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License