This book presents this new discipline in a very accessible form: both as a text to train the next generation of practitioners and researchers, and to inform lifelong learners like myself. Witten and Frank have a passion for simple and elegant solutions. They approach each topic with this mindset, grounding all concepts in concrete examples, and urging the reader to consider the simple techniques first, and then progress to the more sophisticated ones if the simple ones prove inadequate.
If you have data that you want to analyze and understand, this book and the associated Weka toolkit are an excellent way to start.
--From the foreword by Jim Gray, Microsoft Research
As with any burgeoning technology that enjoys commercial attention, the use of data mining is surrounded by a great deal of hype. Exaggerated reports tell of secrets that can be uncovered by setting algorithms loose on oceans of data. But there is no magic in machine learning, no hidden power, no alchemy. Instead there is an identifiable body of practical techniques that can extract useful information from raw data. This book describes these techniques and shows how they work.
The book is a major revision of the first edition that appeared in 1999. While the basic core remains the same, it has been updated to reflect the changes that have taken place over five years, and now has nearly double the references. The highlights for the new edition include thirty new technique sections; an enhanced Weka machine learning workbench, which now features an interactive interface; comprehensive information on neural networks; a new section on Bayesian networks; plus much more.
Offering a thorough grounding in machine learning concepts as well as practical advice on applying the tools and techniques, inside youll find:
+ Algorithmic methods at the heart of successful data miningincluding tried and true techniques as well as leading edge methods;
+ Performance improvement techniques that work by transforming the input or output;
+ Downloadable Weka, a collection of machine learning algorithms for data mining tasks, including tools for data pre-processing, classification, regression, clustering, association rules, and visualizationin a new, interactive interface.
About the Author
Ian H. Witten is a professor of computer science at the University of Waikato in New Zealand. He directs the New Zealand Digital Library research project. His research interests include information retrieval, machine learning, text compression, and programming by demonstration. He received an MA in Mathematics from Cambridge University, England; an MSc in Computer Science from the University of Calgary, Canada; and a PhD in Electrical Engineering from Essex University, England. He is a fellow of the ACM and of the Royal Society of New Zealand. He has published widely on digital libraries, machine learning, text compression, hypertext, speech synthesis and signal processing, and computer typography. He has written several books, the latest being
Managing Gigabytes (1999) and Data Mining (2000), both from Morgan Kaufmann. Eibe Frank is a researcher in the Machine Learning group at the University of Waikato. He holds a degree in computer science from the University of Karlsruhe in Germany and is the author of several papers, both presented at machine learning conferences and published in machine learning journals.