Mining the Web: Discovering Knowledge from Hypertext Data is the first book devoted entirely to techniques for extracting and producing knowledge from the vast body of unstructured Web data. Building on an initial survey of infrastructural issues, including Web crawling and indexing, Chakrabarti examines machine learning techniques as they relate specifically to the challenges of Web mining and provides applications of machine learning to systematically acquire, store, and analyze data. Here the focus is on results: the strengths and weaknesses of these applications, along with their potential as foundations for further progress toward a Web that is more aware of content semantics. This thorough and forward-looking book gives the theoretical and practical foundations you need to build innovative applications for mining the Web.
- A comprehensive, critical exploration of statistics-based attempts to make sense of Web data.
- Details the special challenges associated with analyzing unstructured and semi-structured data.
- Looks at how classical Information Retrieval techniques have been modified for use with Web data.
- Focuses on today's dominant learning methods: clustering and classification, hyperlink analysis, and supervised and semi-supervised learning.
- Analyzes current applications for resource discovery and social network analysis.
- An excellent way to introduce students to especially vital applications of data mining and machine learning technology.
About the Author
Soumen Chakrabarti is an Assistant Professor in Computer Science and Engineering at the Indian Institute of Technology, Bombay. Prior to joining IIT, he worked on hypertext databases and data mining at IBM Almaden Research Center. He has developed three systems and holds five patents in this area. Chakrabarti has served as a vice-chair and program committee member for many conferences, including WWW, SIGIR, ICDE, and KDD, and as a guest editor of the IEEE TKDE special issue on mining and searching the Web. His work on focused crawling received the Best Paper award at the 8th International World Wide Web Conference (1999). He holds a Ph.D. from the University of California, Berkeley.
Pediatric Critical Care Study Guide: Text and Review
This is the first comprehensive study guide covering all aspects of pediatric critical care medicine. It fills a void that exists in learning resources currently available to pediatric critical care practitioners. The major textbooks are excellent references, but do not allow concise reading on specific topics and are not intended to act as...
The Definitive Guide to Plone, Second Edition
The Definitive Guide to Plone, Second Edition has been completely updated to cover the latest version of Plone and its newest features. This book provides a complete and detailed overview of Plone. It is divided into three parts, which cover using, configuring, and developing and customizing Plone. After the coverage of Plone’s...
For C developers who want a comprehensive introduction to ZeroMQ, this is the perfect tutorial. With a user-friendly approach and practical examples, it covers everything from fundamental message patterns to working with multiple sockets.
Learn fundamental message/queue design patterns...
The Nothing That Is: A Natural History of Zero
A symbol for what is not there, an emptiness that increases any number it's added to, an inexhaustible and indispensable paradox. As we enter the year 2000, zero is once again making its presence felt. Nothing itself, it makes possible a myriad of calculations. Indeed, without zero mathematics as we know it would not exist. And without mathematics...
Reality ColdFusion MX: Flash MX Integration
Come be part of a real-world development team, from the start of a project to the finish! In Reality ColdFusion MX: Flash MX Integration, you get to experience how the power of ColdFusion can help drive the Flash experience by designing Web applications, discussing development issues, finding solutions, and implementing the final products....