The rapid growth of the Web in the past two decades has made it the largest
publicly accessible data source in the world. Web mining aims to discover
useful information or knowledge from Web hyperlinks, page contents,
and usage logs. Based on the primary kinds of data used in the
mining process, Web mining tasks can be categorized into three main
types: Web structure mining, Web content mining and Web usage mining.
Web structure mining discovers knowledge from hyperlinks, which represent
the structure of the Web. Web content mining extracts useful information or
knowledge from Web page contents. Web usage mining mines user
activity patterns from usage logs and other forms of logs of user interactions
with Web systems.

Since the publication of the first edition at the end
of 2006, there have been some important advances in several areas. To reflect
these advances, new material has been added to most chapters. The
major changes are in Chapter 11 and Chapter 12, which have been rewritten
and significantly expanded. When the first edition was written,
opinion mining (Chapter 11) was still in its infancy. Since then, the research
community has gained a much better understanding of the problem
and has proposed many novel techniques to solve various aspects of the
problem. To include the latest developments for the Web usage mining
chapter (Chapter 12), the topics of recommender systems and collaborative
filtering, query log mining, and computational advertising have been
added. This new edition is thus considerably longer, growing from 532
pages in the first edition to 622 pages in this second edition.

The goal of the book is to present the above Web data mining tasks and
their core mining algorithms. The book is intended to be a text with
comprehensive coverage, and therefore, for each topic, sufficient details
are given so that readers can gain a reasonably complete knowledge of its
algorithms or techniques without referring to any external materials. Five
of the chapters - partially supervised learning, structured data extraction,
information integration, opinion mining and sentiment analysis, and Web
usage mining - make this book unique. These topics are not covered by existing
books, yet they are essential to Web data mining. Traditional Web
mining topics such as search, crawling and resource discovery, and social
network analysis are also covered in detail in this book.