Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage

Learn How To Convert Web Data Into Web Knowledge

This text demonstrates how to extract knowledge by finding meaningful connections among data spread throughout the Web. Readers learn methods and algorithms from the fields of information retrieval, machine learning, and data mining which, when combined, provide a solid framework for mining the Web. The authors walk readers through the algorithms with the aid of examples and exercises.

This text is divided into three parts:

  • Part One, Web Structure, presents basic concepts and techniques for extracting information from the Web. Readers learn how to collect and index Web documents as well as search and rank Web pages according to their textual content and hyperlink structure.

  • Part Two, Web Content Management, offers two approaches, clustering and classification, for organizing Web content. For both approaches, the authors set forth specific algorithms that enable readers to convert Web data into knowledge.

  • Part Three, Web Usage Mining, demonstrates the application of data mining methods to uncover meaningful patterns of Internet usage.
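As a rough illustration of what ranking pages "according to their hyperlink structure" can look like, here is a minimal Python sketch of a PageRank-style power iteration. It is not taken from the book: the function name, the damping value of 0.85, the iteration count, and the three-page link graph are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by link structure using a simple power iteration.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not values from the book.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: a links to b and c, b links to c, c links to a.
example_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(example_links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph, page c collects links from both a and b, so it ends up ranked slightly above a, with b last. Real collections need crawling, indexing, and text-based scoring on top of this, which is the material Part One is described as covering.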

Methods and algorithms are illustrated by simple examples. More than 100 exercises help readers assess their grasp of the material. Further, thirty-four hands-on analysis problems ask readers to use their new data mining expertise to solve real problems, working with large data sets. All the data sets needed for the examples, exercises, and analysis problems are available on the companion Web site.

The extensive use of examples, along with the opportunity to test and apply data mining skills, makes this text ideal for graduate students and upper-level undergraduates in computer science and engineering. Web designers and researchers will find that this text gives them a new set of tools to mine the Web for knowledge and move well beyond the capabilities of standard search engines.

About the Author

Zdravko Markov, PhD, is Associate Professor of Computer Science at Central Connecticut State University. The author of three textbooks, Dr. Markov teaches undergraduate and graduate courses in computer science and artificial intelligence. He is currently a Principal Investigator (PI) in a National Science Foundation–funded project designed to introduce machine learning to undergraduates.

Daniel T. Larose, PhD, is Professor of Statistics in the Department of Mathematical Sciences at Central Connecticut State University. He is the author of three data mining books and a forthcoming textbook in undergraduate statistics. He developed and directs CCSU's DataMining@CCSU programs.


Integrative Physiology in the Proteomics and Post-Genomics Age
There is a perception in the scientific community that the discipline of Physiology is in crisis, or at least, in a phase of profound transition and change. At the root of the problem is confusion between objectives (the biological questions to be solved) and the methods and technologies to be applied. Traditionally, ever since...
Intuitive Probability and Random Processes using MATLAB

Intuitive Probability and Random Processes using MATLAB® is an introduction to probability and random processes that merges theory with practice. Based on the author’s belief that only "hands-on" experience with the material can promote intuitive understanding, the approach is to motivate the need...

HP-UX 11i Internals (Hewlett-Packard Professional Books)
HP-UX under the hood: practical insight for optimization and troubleshooting

To maximize the performance, efficiency, and reliability of your HP-UX system, you need to know what's going on under the hood. HP-UX 11i Internals goes beyond generic UNIX internals, showing exactly how HP-UX works in PA-RISC environments.

HP experts...


Integration of Medical and Dental Care and Patient Data (Health Informatics)

This largely revised second edition comprehensively reviews the need and rationale for the integration of medical and dental patient data. The reader will find extensive guidance on issues involved with care and data integration, and how to achieve an integrated model of healthcare. The book discusses how the use of state-of-the-art,...

Intellectual Property Rights in Agricultural Biotechnology (Biotechnology in Agriculture Series)
The successful application of biotechnology tools has had and is having dramatic effects in some areas of agriculture. These effects are being felt throughout the world in academic, government and industrial communities. The result is the rapid development of a multi-million dollar industry. This work has been going on for more than two decades...
The Mobile Frontier

Mobile user experience is a new frontier. Untethered from a keyboard and mouse, this rich design space is lush with opportunity to invent new and more human ways for people to interact with information. Invention requires casting off many anchors and conventions inherited from the last 50 years of computer science and traditional design and...
