With the unprecedented rate at which data is being collected today in almost all
elds of human endeavor, there is an emerging economic and scientic need to
extract useful information from it. For example, many companies already have
data-warehouses in the terabyte range (e.g., FedEx, Walmart). The World Wide
Web has an estimated 800 million web-pages. Similarly, scientic data is reach-
ing gigantic proportions (e.g., NASA space missions, Human Genome Project).
High-performance, scalable, parallel, and distributed computing is crucial for
ensuring system scalability and interactivity as datasets continue to grow in size
and complexity.
To address this need we organized the workshop on Large-Scale Parallel KDD
Systems, which was held in conjunction with the 5th ACM SIGKDD Interna-
tional Conference on Knowledge Discovery and Data Mining, on August 15th,
1999, San Diego, California. The goal of this workshop was to bring researchers
and practitioners together in a setting where they could discuss the design, im-
plementation, and deployment of large-scale parallel knowledge discovery (PKD)
systems, which can manipulate data taken from very large enterprise or scien-
tic databases, regardless of whether the data is located centrally or is globally
distributed.