The convergence of computing and communication has produced a society that feeds
on information. Yet most of the information is in its raw form: data. If data is characterized
as recorded facts, then information is the set of patterns, or expectations,
that underlie the data. There is a huge amount of information locked up in databases—
information that is potentially important but has not yet been discovered or
articulated. Our mission is to bring it forth.

Data mining is the extraction of implicit, previously unknown, and potentially
useful information from data. The idea is to build computer programs that sift
through databases automatically, seeking regularities or patterns. Strong patterns, if
found, will likely generalize to make accurate predictions on future data. Of course,
there will be problems. Many patterns will be banal and uninteresting. Others will
be spurious, contingent on accidental coincidences in the particular dataset used.
And real data is imperfect: Some parts will be garbled, some missing. Anything that
is discovered will be inexact: There will be exceptions to every rule and cases not
covered by any rule. Algorithms need to be robust enough to cope with imperfect
data and to extract regularities that are inexact but useful.

Machine learning provides the technical basis of data mining. It is used to extract
information from the raw data in databases—information that is expressed in a
comprehensible form and can be used for a variety of purposes. The process is one
of abstraction: taking the data, warts and all, and inferring whatever structure underlies
it. This book is about the tools and techniques of machine learning that are used
in practical data mining for finding, and describing, structural patterns in data.

As with any burgeoning new technology that enjoys intense commercial attention,
the use of data mining is surrounded by a great deal of hype in the technical—
and sometimes the popular—press. Exaggerated reports appear of the secrets that
can be uncovered by setting learning algorithms loose on oceans of data. But there
is no magic in machine learning, no hidden power, no alchemy. Instead, there is an
identifiable body of simple and practical techniques that can often extract useful
information from raw data. This book describes these techniques and shows how
they work.

We interpret machine learning as the acquisition of structural descriptions from
examples. The kinds of descriptions that are found can be used for prediction, explanation,
and understanding. Some data mining applications focus on prediction:
They forecast what will happen in new situations from data that describe what happened
in the past, often by guessing the classification of new examples. But we are
equally—perhaps more—interested in applications where the result of “learning” is
an actual description of a structure that can be used to classify examples. This structural
description supports explanation and understanding as well as prediction. In
our experience, insights gained by the user are of most interest in the majority of
practical data mining applications; indeed, this is one of machine learning’s major
advantages over classical statistical modeling.
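
To make the idea of a structural description concrete, here is a minimal sketch in Python. It rests on assumptions of our own: the toy examples are invented for illustration, and the learner (pick the single attribute whose value-to-majority-class rules make the fewest errors) is one simple choice made here, not a method the text above prescribes. The names `EXAMPLES`, `learn_rules`, and `classify` are hypothetical. The point is that the result of learning is a small, readable set of rules that a person can inspect, not just a stream of predictions.

```python
from collections import Counter, defaultdict

# Invented toy examples (not from the text): attribute values plus a class label.
EXAMPLES = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "no"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "overcast", "windy": "no"}, "play"),
    ({"outlook": "overcast", "windy": "yes"}, "stay"),
]


def learn_rules(examples):
    """Return (attribute, rules, errors) for the best single-attribute rule set.

    For each attribute, every observed value is mapped to the majority class
    among the examples with that value; the attribute whose rules misclassify
    the fewest training examples is kept.
    """
    best = None
    for attr in examples[0][0]:
        # Count class labels separately for each value of this attribute.
        counts = defaultdict(Counter)
        for features, label in examples:
            counts[features[attr]][label] += 1
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rules[f[attr]] != label for f, label in examples)
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best


def classify(attr, rules, features, default=None):
    """Apply the learned rules to a new example; unseen values get `default`."""
    return rules.get(features[attr], default)


attr, rules, errors = learn_rules(EXAMPLES)

# The learned structure is itself the output: it can be read and questioned.
for value, label in rules.items():
    print(f"if {attr} = {value} then {label}")
print(f"training errors: {errors} of {len(EXAMPLES)}")

# It can also be used for prediction on a new example.
print(classify(attr, rules, {"outlook": "rainy", "windy": "no"}))
```

Even on this tiny dataset the rules are inexact: one training example is an exception to them. That is the normal situation, and the printed rules remain useful both for classifying new examples and for explaining why a given classification was made.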