High Performance Data Mining - Scaling Algorithms, Applications and Systems

This special issue of Data Mining and Knowledge Discovery addresses the problem of scaling data mining algorithms, applications, and systems to massive data sets by applying high performance computing technology. With the commoditization of high performance computing through clusters of workstations and related technologies, it is increasingly common to have the necessary infrastructure for high performance data mining. On the other hand, many of the commonly used data mining algorithms do not scale to large data sets. Two fundamental challenges are to develop scalable versions of the commonly used data mining algorithms and to develop new algorithms for mining very large data sets. In other words, today it is easy to spin a terabyte of disk, but difficult to analyze and mine a terabyte of data.

Developing algorithms that scale takes time. As an example, consider the successful scale-up and parallelization of linear algebra algorithms during the past two decades. This success was due to several factors, including: a) developing versions of some standard algorithms that exploit the specialized structure of some linear systems, such as block-structured systems, symmetric systems, or Toeplitz systems; b) developing new algorithms, such as the Wiedemann and Lanczos algorithms, for solving sparse systems; and c) developing software tools that provide high performance implementations of linear algebra primitives, such as LINPACK, LAPACK, and PVM.

In some sense, the state of the art for scalable and high performance algorithms for data mining is in the same position that linear algebra was in two decades ago. We suspect that strategies a)–c) will work in data mining also.

High performance data mining is still a very new subject with many open challenges. Roughly speaking, some data mining algorithms can be characterized as heuristic search processes involving many scans of the data. Thus, irregular computation, large numbers of data accesses, and non-deterministic search strategies make efficient parallelization of data mining algorithms a difficult task. Research in this area will not only contribute to large-scale data mining applications but also enrich high performance computing technology itself. This was part of the motivation for this special issue.

Lead Generation For Dummies (For Dummies (Business & Personal Finance))

Learn how to get your message heard above the online noise

The buying process is greatly changed. With the Internet, the buyer is in charge. If your product is going to compete, you need to master 21st century lead generation, and this book shows you how. It's packed with effective strategies for inbound and outbound...

Algebraic Specification of Communication Protocols (Cambridge Tracts in Theoretical Computer Science)

The specifications in this book are the result of a number of case studies performed by researchers from the Programming Research Group at the University of Amsterdam. The primary goal was to study the use of the techniques developed by the Programming Research Group for the specification of real-life protocols. From the pool of...

DK Essential Managers: Innovation
In today’s dynamic and turbulent world, organizations face a stark challenge—change or perish. Unless they keep renewing their products and services, and update the ways they create and deliver them, they risk being overtaken by competitors. Innovation—the process of change—is critical to the success...

MIMO Wireless Networks, Second Edition: Channels, Techniques and Standards for Multi-Antenna, Multi-User and Multi-Cell Systems

This book is unique in presenting channels, techniques and standards for the next generation of MIMO wireless networks. Through a unified framework, it emphasizes how propagation mechanisms impact the system performance under realistic power constraints. Combining a solid mathematical analysis with a physical and intuitive approach to...

Internationalizing the Curriculum in Organizational Psychology

This book assembles state-of-the-art thinking on the internationalization of the curriculum of training centers in I/O and Work Psychology. The experts contributing chapters share their thoughts on the knowledge and skills that students must master in the 21st century, as well as their research on how we can develop students to be globally...

Show Me Microsoft Project 2003

Welcome to Show Me Microsoft Office Project 2003, a visual quick reference book that shows you how to work efficiently with Project 2003. This book provides complete coverage of basic and intermediate Project 2003 skills.

This book covers the...

©2019 LearnIT (support@pdfchm.net) - Privacy Policy