Computer users spend a lot of time doing simple, mechanical data manipulation
- changing the format of data, checking its validity, finding items with
some property, adding up numbers, printing reports, and the like. All of these
jobs ought to be mechanized, but it's a real nuisance to have to write a specialpurpose
program in a standard language like C or Pascal each time such a task
comes up.
Awk is a programming language that makes it possible to handle such tasks
with very short programs, often only one or two lines long. An awk program is
a sequence of patterns and actions that tell what to look for in the input data
and what to do when it's found. Awk searches a set of files for lines matched
by any of the patterns; when a matching line is found, the corresponding action
is performed. A pattern can select lines by combinations of regular expressions
and comparison operations on strings, numbers, fields, variables, and array elements.
Actions may perform arbitrary processing on selected lines; the action
language looks like C but there are no declarations, and strings and numbers
are built-in data types.
Awk scans the input files and splits each input line into fields automatically.
Because so many things are automatic - input, field splitting, storage management,
initialization - awk programs are usually much smaller than they would
be in a more conventional language. Thus one common use of awk is for the
kind of data manipulation suggested above. Programs, a line or two long, are
composed at the keyboard, run once, then discarded. In effect, awk is a
general-purpose programmable tool that can reprace a host of specialized tools
or programs.
The same brevity of expression and convenience of operations make awk
valuable for prototyping larger programs. One starts with a few lines, then
refines the program until it does the desired job, experimenting with designs by
trying alternatives quickly. Since programs are short, it's easy to get started,
and easy to start over when experience suggests a different direction. And it's
straightforward to translate an awk program into another language once the
design is right.