Data Mining: Concepts, Models, Methods, and Algorithms, Second Edition

Data Mining: Concepts, Models, Methods, and Algorithms, Second Edition, 9780470890455 (0470890452), John Wiley & Sons, 2011

Now updated—the systematic introductory guide to modern analysis of large data sets

As data sets continue to grow in size and complexity, there has been an inevitable move towards indirect, automatic, and intelligent data analysis in which the analyst works via more complex and sophisticated software tools. This book reviews state-of-the-art methodologies and techniques for analyzing enormous quantities of raw data in high-dimensional data spaces to extract new information for decision-making.

This Second Edition of Data Mining: Concepts, Models, Methods, and Algorithms discusses data mining principles and then describes representative state-of-the-art methods and algorithms originating from different disciplines such as statistics, machine learning, neural networks, fuzzy logic, and evolutionary computation. Detailed algorithms are accompanied by the necessary explanations and illustrative examples, with questions and exercises for practice at the end of each chapter. This new edition adds the following techniques and methodologies:

Support Vector Machines (SVM): developed from statistical learning theory, with considerable potential for applications in predictive data mining
Kohonen Maps (Self-Organizing Maps, SOM): one of the most widely applicable neural-network-based methodologies for descriptive data mining and multi-dimensional data visualization
DBSCAN, BIRCH, and distributed DBSCAN clustering algorithms: representatives of an important class of density-based clustering methodologies
Bayesian Networks (BN) methodology often used for causality modeling
Algorithms for measuring betweenness and centrality parameters in graphs, important for applications in mining large social networks
CART algorithm and Gini index in building decision trees
Bagging & Boosting approaches to ensemble-learning methodologies, with details of AdaBoost algorithm
Relief algorithm, one of the core feature selection algorithms inspired by instance-based learning
PageRank algorithm for mining and authority ranking of web pages
Latent Semantic Analysis (LSA) for text mining and measuring semantic similarities between text-based documents
New sections on temporal, spatial, web, text, parallel, and distributed data mining
More emphasis on business, privacy, security, and legal aspects of data mining technology

This text offers guidance on how and when to use a particular software tool (with the companion data sets), chosen from among the hundreds available, when faced with a data set to mine. This allows analysts to create and perform their own data mining experiments using their knowledge of the methodologies and techniques provided. The book emphasizes the selection of appropriate methodologies and data analysis software, as well as parameter tuning. These critically important, qualitative decisions can be made only with a deeper understanding of what each parameter means and the role it plays in the technique, which this book provides.

This volume is primarily intended as a data-mining textbook for computer science, computer engineering, and computer information systems majors at the graduate level. Senior undergraduates with the appropriate background can also follow all topics presented here.

Amazing Books

Professional ASP.NET MVC 4 (Wrox Professional Guides)

Wrox Press, 2012

It’s a great time to be an ASP.NET developer!

Whether you’ve been developing with ASP.NET for years or are just getting started, now is a great time to dig into ASP.NET MVC 4. ASP.NET MVC has been a lot of fun to work with from the start, but the last two releases have added many features that make the entire...

Computer and Machine Vision, Fourth Edition: Theory, Algorithms, Practicalities

Academic Press, 2012

Computer and Machine Vision: Theory, Algorithms, Practicalities (previously entitled Machine Vision) clearly and systematically presents the basic methodology of computer and machine vision, covering the essential elements of the theory while emphasizing algorithmic and practical design constraints. This fully revised fourth edition has...

Security of e-Systems and Computer Networks

Cambridge University Press, 2007

E-based systems are ubiquitous in the modern world with applications spanning e-commerce, WLANs, health care and government organisations. The secure transfer of information has therefore become a critical area of research, development, and investment. This book presents the fundamental concepts and tools of e-based security and its range of...

C: A Reference Manual (5th Edition)

Prentice Hall, 2002

This text is a reference manual for the C programming language. Our aim is to provide a complete and precise discussion of the language, the run-time libraries, and a style of C programming that emphasizes correctness, portability, and maintainability.

We expect our readers to already understand basic programming concepts, and many...

Workflow Handbook 2003

Future Strategies, 2002

Published in association with the Workflow Management Coalition (WfMC), the Workflow Handbook 2002 comprises four sections in over 400 information-packed pages:

SECTION 1: The World of Workflow covers a wide spectrum of viewpoints and discussions by experts in their respective fields. Papers range from an Introduction to Workflow through to a...

Computational and Manufacturing Strategies: Experimental Expressions of Wood Capabilities (SpringerBriefs in Architectural Design and Technology)

Springer, 2018

This book highlights computationally enabled and digitally fabricated strategies used in the design of a series of full-size wooden structures. It introduces theoretical foundations and then focuses on the possibilities that have emerged as a result of the material-aware processes. The case studies expound wood as one of the most...