Computer vision as a field is an intellectual frontier. Like any frontier, it is
exciting and disorganized, and there is often no reliable authority to appeal to.
Many useful ideas have no theoretical grounding, and some theories are useless
in practice; developed areas are widely scattered, and often one looks completely
inaccessible from another. Nevertheless, we have attempted in this book to present
a fairly orderly picture of the field.

We see computer vision—or just “vision”; apologies to those who study human
or animal vision—as an enterprise that uses statistical methods to disentangle data
using models constructed with the aid of geometry, physics, and learning theory.
Thus, in our view, vision relies on a solid understanding of cameras and of the
physical process of image formation (Part I of this book) to obtain simple inferences
from individual pixel values (Part II), combine the information available in multiple
images into a coherent whole (Part III), impose some order on groups of pixels to
separate them from each other or infer shape information (Part IV), and recognize
objects using geometric information or probabilistic techniques (Part V). Computer
vision has a wide variety of applications, both old (e.g., mobile robot navigation,
industrial inspection, and military intelligence) and new (e.g., human-computer
interaction, image retrieval in digital libraries, medical image analysis, and the
realistic rendering of synthetic scenes in computer graphics). We discuss some of
these applications in Part VII.