Data Mining: Concepts, Models and Techniques

03/14/2011

The knowledge discovery process is as old as Homo sapiens. Until recently, this process relied solely on the ‘natural personal’ computer provided by Mother Nature. Fortunately, in recent decades the problem has begun to be addressed through the development of data mining technology, aided by the huge computational power of ‘artificial’ computers. Digging intelligently through large databases, data mining aims to extract implicit, previously unknown and potentially useful information from data, since “knowledge is power”. The goal of this book is to present, in a friendly way, both the theoretical concepts and, especially, the practical techniques of this exciting field, ready to be applied in real-world situations. Accordingly, it is meant for all those who wish to learn how to explore and analyze large quantities of data in order to discover hidden nuggets of information.

The book consists of six chapters, organized as follows:
– Chapter 1 introduces and explains the fundamental aspects of data mining used throughout the book. These address three questions: what is data mining, why use it, and how is data mined? The chapter also discusses the problems data mining can solve, issues concerning the modeling process and models, the main data mining applications, and the methodology and terminology used in data mining.
– Chapter 2 is dedicated to a short review of some important issues concerning data: definition of data, types of data, data quality, and types of data attributes.
– Chapter 3 deals with the problem of data analysis. Bearing in mind that data mining is an analytic process designed to explore large amounts of data in search of consistent and valuable hidden knowledge, the first step consists of initial data exploration and data preparation. Then, depending on the nature of the problem to be solved, the analysis can involve anything from simple descriptive statistics to regression models, time series, multivariate exploratory techniques, etc. The aim of this chapter is therefore to provide an overview of the main topics concerning exploratory data analysis.
– Chapter 4 presents a short overview of the main steps in building classification and decision trees and applying them to real-life problems.
– Chapter 5 summarizes some well-known data mining techniques and models, such as: Bayesian and rule-based classifiers, artificial neural networks, k-nearest neighbors, rough sets, clustering algorithms, and genetic algorithms.
– Chapter 6 discusses the problem of evaluating the performance of different classification (and decision) models.