Tag Archives: Data mining

Behavioral Data Analysis in Emotional Intelligence of Social Network Consumers (Published)

Emotional intelligence is both a personality trait and an intellectual capacity, which a person inherits genetically from their parents and develops throughout life. It refers to the capacity to process information arising from emotions and to use it to guide action in situations that require activation of the cognitive system. The purpose of this research is to apply machine learning and data mining methods to evaluate emotional IQ in a sample of students and social network consumers (aged 18-26 years). Understanding how users behave when they connect to social networking sites creates opportunities for better interface design, richer studies of social interactions, and improved design of content distribution systems. The data were collected through the Trait Emotional Intelligence Questionnaire (TEIQue), a self-report instrument, and used as input to data mining methods. The collected data were then selected for analysis and transformed into a form suitable for the machine learning algorithms available in the R software package. The parameters of each algorithm were set according to the application case in order to produce inference rules. Among the algorithms applied to the specific research questions were the classification algorithms ID3 and J48, which produced decision trees for the four general factors (well-being, self-control, emotionality and sociability) and for overall emotional intelligence. The results, obtained after weighting and on the basis of the criteria, present consumers’ rates, which in turn indicate the degree of emotional intelligence.
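As a rough illustration of how a decision-tree learner such as ID3 chooses which attribute to split on, the sketch below computes entropy and information gain on a tiny table. It is written in Python rather than R, and the attribute names, the "high"/"low" coding and the four rows are invented for illustration; they are not the study's TEIQue data or its procedure.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting `rows` on `attribute`."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_attribute(rows, attributes, target):
    """ID3 splits on the attribute with the largest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy data: two factor scores and an overall rating per respondent.
rows = [
    {"well_being": "high", "sociability": "high", "overall_ei": "high"},
    {"well_being": "high", "sociability": "low", "overall_ei": "high"},
    {"well_being": "low", "sociability": "high", "overall_ei": "low"},
    {"well_being": "low", "sociability": "low", "overall_ei": "low"},
]

root = best_attribute(rows, ["well_being", "sociability"], "overall_ei")
print("Split first on:", root)
```

In this toy table the overall rating follows the well-being column exactly, so that attribute yields the larger gain and would become the root of the tree; a full ID3 implementation would repeat the same choice recursively on each branch.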

Keywords: Consumers, Data mining, Emotional Intelligence, Marketing, Social Networks, behavioral data

Opinion Mining In Big Data: Trend of Thinking for Big Data Era (Published)

In this era of rapidly growing internet and network use, a huge amount of data is being produced. Big data is expanding rapidly in all domains, including opinion and sentiment analysis, because many social media platforms and other websites let visitors and customers post their opinions, which usually contain valuable information that can be helpful for several issues. Different methods and techniques have been proposed to handle this huge volume of data, including big social data, and to make it more beneficial for several fields. This paper introduces big data, its most common uses and its challenges. It also investigates sentiment analysis and its common techniques, and considers the future of big data and opinion mining. Finally, the paper discusses the challenges facing big data and opinion mining.

Keywords: Big Data, Data mining, Social media, opinion mining

Utilization of Data Mining and Anonymous Communication Techniques for Fraud Detection in Large Scale Business Organisations in Delta State (Published)

This study on utilisation of data mining and anonymous communication techniques for fraud detection in large scale business organisations in Delta State was necessitated by the growing incidence of frauds that are crippling businesses and the socio-economic development of the state. Two research questions guided the study and two null hypotheses were tested at 0.05 level of significance. Literature related to the study was reviewed. Descriptive survey research design was adopted for the study. The population of the study was 260 accounting staff. A sample size of 160 was selected for the study using simple random sampling technique. A four-point rating scale questionnaire developed by the researchers was used for data collection. The Cronbach Alpha method was used to determine the reliability of the questionnaire, and this yielded reliability coefficient values of 0.85 and 0.80 respectively for the sections, with an overall reliability of 0.83. Data were analyzed using mean and standard deviation to ascertain the homogeneity of the respondents, while t-test and analysis of variance were used to test the hypotheses at 0.05 level of significance. The results showed that the accounting staff made little use of data mining and anonymous communication techniques for fraud detection. Furthermore, it was found that types and status of organization in NSE significantly influenced respondents’ ratings on the utilization of data mining but did not influence their ratings on utilization of anonymous communications for fraud detection. From the findings of the study, it was concluded that the accounting staff did not utilize forensic auditing investigation techniques for fraud detection in large-scale business organisations as required.
Based on the findings, the researchers recommended, among other things, that shareholders and directors of large-scale business organisations should provide regular training on data mining techniques to equip their accounting staff with the relevant and up-to-date skills, abilities, attitude and competences for fraud detection.
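For readers unfamiliar with the reliability coefficient mentioned above, the following is a minimal sketch of Cronbach's alpha using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The five respondents and three items scored on a four-point scale are made up for illustration and have no connection to the study's questionnaire or its reported coefficients.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances (n - 1 denominator) for the items and the totals.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    item_variances = [variance([r[i] for r in responses]) for i in range(k)]
    total_variance = variance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers: 5 respondents x 3 items on a 1-4 rating scale.
sample = [
    [4, 4, 3],
    [3, 3, 3],
    [2, 2, 1],
    [1, 2, 1],
    [4, 3, 4],
]

alpha = cronbach_alpha(sample)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Values closer to 1 indicate that the items vary together, which is why coefficients such as those reported above are commonly read as acceptable internal consistency.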

Keywords: Data mining, anonymous communication and large-scale business organisation

A classification model for water quality analysis using decision tree (Published)

A classification algorithm assigns predefined classes to test instances (for evaluation) or to future instances (in an application). This study presents a classification model that uses decision trees to analyze water quality data from different counties in Kenya. Water quality is very important in ensuring that citizens get clean drinking water. Applying decision trees as a data mining method to predict clean water from water quality parameters can ease the work of the laboratory technologist by predicting which water samples should proceed to the next step of analysis. Secondary data from the Kenya Water Institute was used to create the model, which was implemented in WEKA. Decision tree classification was applied to classify (predict) water as clean or not clean. The analysis of water alkalinity, pH level and conductivity can play a major role in assessing water quality. Five decision tree classifiers (J48, LMT, Random Forest, Hoeffding Tree and Decision Stump) were used to build the model, and their accuracy was compared. J48 had the highest accuracy, 94%, and Decision Stump the lowest, 83%.
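The simplest of the classifiers named above, a decision stump, is a one-level decision tree that tests a single attribute. The sketch below shows the idea in Python rather than WEKA: it searches for the threshold on one numeric feature that gives the best training accuracy. Choosing the split by accuracy is an assumption made here for brevity and may differ from WEKA's criterion, and the conductivity readings and labels are invented, not taken from the Kenya Water Institute data.

```python
def fit_stump(values, labels):
    """Fit a one-level tree on a single numeric feature.

    Returns (threshold, label_at_or_below, label_above, training_accuracy),
    or None when the feature has fewer than two distinct values.
    """
    classes = sorted(set(labels))
    ordered = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    best = None
    for threshold in candidates:
        for below in classes:
            for above in classes:
                correct = sum(
                    (below if v <= threshold else above) == y
                    for v, y in zip(values, labels)
                )
                accuracy = correct / len(labels)
                if best is None or accuracy > best[3]:
                    best = (threshold, below, above, accuracy)
    return best


def predict(stump, value):
    """Classify one reading with a fitted stump."""
    threshold, below, above, _ = stump
    return below if value <= threshold else above


# Hypothetical conductivity readings and their labels.
conductivity = [120, 150, 200, 480, 520, 610]
quality = ["clean", "clean", "clean", "not clean", "not clean", "not clean"]

stump = fit_stump(conductivity, quality)
print(stump, predict(stump, 300))
```

Because a stump can use only one test, it is usually less accurate than a full tree such as J48, which is consistent with the ranking reported in the abstract.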

Keywords: Data mining, Decision Tree, Water Quality, Weka Tool, classification model

Data Mining Technology and Its Role in Discovering Financial Fraud (Published)

The basis of any business is the customer database, which holds information about each client's relationship with the company. The increasing complexity of organizational processes and a rapidly changing business environment have led to strong growth in the volume of internal corporate data. In this regard, data mining tools (Forensic Data Analytics, FDA) are attracting increasing interest for fraud risk assessment, because they allow the sample of suspicious transactions to be narrowed while minimizing the volume of checks. For example, in the communications sector, the database stores information about the conclusion of service agreements, the time of contract termination, the region, the rate, etc. The analysis revealed 7 out of 31 dentists who deliberately overstated to the insurance company the value of the work they performed.

Using the k-means algorithm, four clusters were formed:

  • Cluster 1: specialized work using expensive additional procedures; average client age 25; average cost of services $715;
  • Cluster 2: minor work without additional procedures; average client age 21; average cost of services $286;
  • Cluster 3: significant work using expensive additional procedures; average client age 38; average cost of services $819;
  • Cluster 4: significant work with cheap additional procedures; average client age 27; average cost of services $551.
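The clustering step can be sketched as follows. This is a minimal version of the standard k-means (Lloyd's) iteration in plain Python, not the paper's implementation; the (age, cost) pairs are invented for illustration, and the function names, the seed and the iteration limit are assumptions.

```python
import random
from math import dist


def assign(points, centroids):
    """Index of the nearest centroid for every point (Euclidean distance)."""
    return [
        min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        for p in points
    ]


def kmeans(points, k, iterations=100, seed=0):
    """Alternate assignment and centroid update until the centroids stop moving."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                updated.append(centroids[i])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Hypothetical (client age, cost of services in $) records.
records = [
    (24, 700), (26, 730), (20, 280), (22, 295),
    (37, 810), (39, 830), (26, 540), (28, 560),
]

centroids, labels = kmeans(records, k=4)
for center in centroids:
    print(tuple(round(c, 1) for c in center))
```

The result depends on the random starting points, so different seeds can give different clusters. Cost is on a much larger scale than age and therefore dominates the distance; standardizing the features first is a common remedy.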

Keywords: Cluster Algorithms, Data Miner, Data mining, Financial Fraud, K-Means Algorithm

Using Data Mining Techniques to Identify the Causes of Deaths in Al-Gedaref Hospital (Published)

Data mining technology is used extensively in managing relationships through a variety of approaches. There are many tools and methods for analyzing mortality data, and data mining is one of them. This research aims to illustrate the concept of data mining and to identify the causes of deaths in Gedaref hospital. The data mining methodology applied to the death records integrates two techniques, clustering and classification, to help Gedaref state hospital with prediction and decision making. The study also aims to indicate the level of interest in the exploration areas and the components of the structure for applying exploration concepts and tools. It can be concluded that a large proportion of deaths is caused by malaria, especially among male students and employees at early ages (32 years) who live in Kassab village in Gedaref. We also recommend that the hospital administration provide training programs for workers.

Keywords: Cause of Death, Computer Application, Data mining, Database


Security must be addressed in the planning and design phase of an e-government system. A management process is needed to assess security controls, allowing departments and agencies to maintain and measure the extent of data security by means of a mechanism that reveals security weak points. The weak points are revealed using a series of standards built on the application of machine learning methods, specifically the neural networks model, and intelligent data analysis. All these techniques are useful in monitoring and measuring how well the data and the provided services are secured. The results of applying the approach to the data site of the “Cairo cleanliness and beautification authority for cleaning” in Egypt showed that the measurement qualifications were adequate, proper, preaching, and can be generalized. The proposed monitoring approach is very comprehensive, as it limits the information security risks that affect organizations’ risk management decisions.

Keywords: Data mining, Government Cyber space, the Neural Networks Model