This book is a worthy contribution to the field of text mining. By focusing on classification (rather than exhaustively covering extraction, summarization, and other tasks), it achieves the right balance of coherence and comprehensiveness. It collects papers by the leading authors in the field, who employ and explain a variety of techniques—kernel methods, link analysis, latent Dirichlet allocation, non-negative matrix factorization, and others. Together the papers bring unity and clarity to a disjointed and sometimes perplexing field and serve as the perfect introduction for an advanced student.
—Peter Norvig, Director of Research, Google, Inc., Mountain View, California, USA
This is a state-of-the-art, outstanding collection of overviews on text mining by a group of leading researchers in the field. The book meets an imminent need for an up-to-date overview of this exciting, dynamic research frontier and may serve as an excellent textbook on text mining for graduate students and researchers in the field as well.
—Jiawei Han, University of Illinois at Urbana-Champaign, USA
Giving a broad perspective of the field from numerous vantage points, Text Mining focuses on statistical methods for text mining and analysis. It examines methods to automatically cluster and classify text documents and applies these methods in a variety of areas, including adaptive information filtering, information distillation, and text search. The book begins with the classification of documents into predefined categories and then describes novel methods for clustering documents into groups that are not predefined. It concludes with various text mining applications that have significant implications for future research and industrial use.
About the Authors
Ashok N. Srivastava is the Principal Investigator of the Integrated Vehicle Health Management research project in the NASA Aeronautics Research Mission Directorate. Dr. Srivastava also leads the Intelligent Data Understanding group at NASA Ames Research Center.
Mehran Sahami is an Associate Professor and Associate Chair for Education in the computer science department at Stanford University.