The field of digital computer architecture has grown explosively in the past two decades.
Through a steady stream of experimental research, tool-building efforts, and theoretical
studies, the design of an instruction-set architecture, once considered an art, has been
transformed into one of the most quantitative branches of computer technology. At the same
time, better understanding of various forms of concurrency, from standard pipelining to
massive parallelism, and invention of architectural structures to support a reasonably efficient
and user-friendly programming model for such systems, have allowed hardware performance
to continue its exponential growth. This trend is expected to continue in the near future.

This explosive growth, linked with the expectation that performance will continue its
exponential rise with each new generation of hardware and that (in stark contrast to software)
computer hardware will function correctly as soon as it comes off the assembly line, has its
downside. It has led to unprecedented hardware complexity and almost intolerable development
costs. The challenge facing current and future computer designers is to institute
simplicity where we now have complexity; to use fundamental theories being developed in
this area to gain performance and ease-of-use benefits from simpler circuits; and to understand
the interplay between technological capabilities and limitations, on the one hand, and design
decisions based on user and application requirements on the other.

In computer designers’ quest for user-friendliness, compactness, simplicity, high performance,
low cost, and low power, parallel processing plays a key role. High-performance
uniprocessors are becoming increasingly complex, expensive, and power-hungry. A basic
trade-off thus exists between the use of one or a small number of such complex processors,
at one extreme, and a moderate to very large number of simpler processors, at the other.
When combined with a high-bandwidth, but logically simple, interprocessor communication
facility, the latter approach leads to significant simplification of the design process. However,
two major roadblocks have thus far prevented the widespread adoption of such moderately
to massively parallel architectures: the interprocessor communication bottleneck and the
difficulty, and thus high cost, of algorithm/software development.