Like all engineering activities, computer programming is both craft and science. Building a bridge or a computer program requires familiarity with the known techniques for the overall design of similar artifacts. And making intelligent choices among the available techniques and designs requires understanding of the mathematical principles governing their performance and economy. This book is about methods for organizing, reorganizing, moving, exploring, and retrieving data in digital computers, and the mathematical analysis of those techniques. This subject is a theoretical foundation of the useful art of computer programming in the same way that the statics and dynamics of physical systems lie at the heart of mechanical engineering.
A few simple principles have governed our choice of topics. First, we have chosen only practically useful techniques. We omit treatment of some theoretically excellent algorithms that are not practical for data sets of reasonable size. Second, we have included both classical and recently discovered methods, relying on inherent simplicity, wide applicability, and potential usefulness as the criteria for inclusion rather than any preconceived exhaustive catalogue. For example, Chapter 6, List and Tree Implementations of Sets, includes both the classical algorithm for construction of optimal binary search trees on static data, and the newer skip list structures for dynamic data. In other chapters there are sections on splay trees, extendible hashing, grid files, and other elegant newly developed methods. Third, we have included an analysis of almost every method we describe. One of our major objectives has been to present analyses that are relatively brief and nontechnical but illuminate the important performance characteristics of the algorithms. As in mechanical engineering, one of the crucial lessons to be taught is about scalability: a method that is satisfactory for a structure of one size may be unsuitable for a structure ten times as large.