GNU Info File | 1994-02-21 | 51KB | 945 lines
This is Info file gawk.info, produced by Makeinfo-1.55 from the input
file gawk.texi.
This file documents `awk', a program that you can use to select
particular records in a file and perform operations upon them.
This is Edition 0.15 of `The GAWK Manual',
for the 2.15 version of the GNU implementation
of AWK.
Copyright (C) 1989, 1991, 1992, 1993 Free Software Foundation, Inc.
Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.
Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided that
the entire resulting derived work is distributed under the terms of a
permission notice identical to this one.
Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that this permission notice may be stated in a
translation approved by the Foundation.
File: gawk.info, Node: Statements/Lines, Next: When, Prev: Comments, Up: Getting Started
`awk' Statements versus Lines
=============================
Most often, each line in an `awk' program is a separate statement or
separate rule, like this:
     awk '/12/ { print $0 }
          /21/ { print $0 }' BBS-list inventory-shipped
But sometimes statements can be more than one line, and lines can
contain several statements. You can split a statement into multiple
lines by inserting a newline after any of the following:
     ,    {    ?    :    ||    &&    do    else
A newline at any other point is considered the end of the statement.
(Splitting lines after `?' and `:' is a minor `gawk' extension. The
`?' and `:' referred to here are the operators of the three-operand
conditional expression described in *Note Conditional Expressions:
Conditional Exp.)
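As a minimal sketch of these splitting rules, the following program
breaks one rule after `&&', `{' and `,'. The function name
`split_demo' and the three sample records are hypothetical, chosen
only for illustration; `?' and `:' are deliberately avoided so that
the sketch does not depend on the `gawk' extension.

```shell
# Hypothetical helper: feeds three "name score" records to awk.
split_demo() {
    printf 'ann 12\nbob 7\ncy 30\n' |
    awk '$2 > 10 &&
         $2 < 20 {
             print $1,
                   $2
         }'
}

split_demo
```

Only the first record satisfies both halves of the pattern, so only
that record is printed.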
If you would like to split a single statement into two lines at a
point where a newline would terminate it, you can "continue" it by
ending the first line with a backslash character, `\'. This is allowed
absolutely anywhere in the statement, even in the middle of a string or
regular expression. For example:
     awk '/This program is too long, so continue it\
     on the next line/ { print $1 }'
We have generally not used backslash continuation in the sample
programs in this manual. Since in `gawk' there is no limit on the
length of a line, it is never strictly necessary; it just makes
programs more readable. We have preferred to keep them readable by
keeping the statements short. Backslash continuation is most useful
when your `awk' program is in a separate source file, instead of typed
in on the command line. You should also note that many `awk'
implementations are stricter about where you may use backslash
continuation. For maximal portability of your `awk' programs, it is
best not to split your lines in the middle of a regular expression or a
string.
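Here is a small sketch of the portable use of backslash continuation,
where the break falls between two tokens rather than inside a string
or regular expression. The function name `continue_demo' and the input
line are hypothetical, and the sketch assumes a POSIX-compliant shell,
where a backslash inside single quotes reaches `awk' unchanged.

```shell
# Hypothetical helper: adds the two fields of a single input record.
continue_demo() {
    printf '3 4\n' |
    awk '{ total = $1 \
                 + $2
           print total }'
}

continue_demo
```

Without the backslash, the newline after `$1' would end the assignment
at that point.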
*Warning: backslash continuation does not work as described above
with the C shell.* Continuation with backslash works for `awk'
programs in files, and also for one-shot programs *provided* you are
using a POSIX-compliant shell, such as the Bourne shell or the
Bourne-again shell. But the C shell used on Berkeley Unix behaves
differently! There, you must use two backslashes in a row, followed by
a newline.
When `awk' statements within one rule are short, you might want to
put more than one of them on a line. You do this by separating the
statements with a semicolon, `;'. This also applies to the rules
themselves. Thus, the previous program could have been written:
     /12/ { print $0 } ; /21/ { print $0 }
*Note:* the requirement that rules on the same line must be separated
with a semicolon is a recent change in the `awk' language; it was done
for consistency with the treatment of statements within an action.
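To see the one-line form at work, you can run it on a few records. The
sketch below uses a hypothetical function name, `semicolon_demo', and
made-up input in place of the `BBS-list' and `inventory-shipped'
files.

```shell
# Hypothetical helper: two rules on one line, separated by a semicolon.
semicolon_demo() {
    printf 'a 12\nb 21\nc 1221\nd 5\n' |
    awk '/12/ { print $0 } ; /21/ { print $0 }'
}

semicolon_demo
```

The record `c 1221' matches both patterns, so it is printed once by
each rule; the record `d 5' matches neither and is not printed.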
File: gawk.info, Node: When, Prev: Statements/Lines, Up: Getting Started
When to Use `awk'
=================
You might wonder how `awk' can be useful to you. By combining it with
other utility programs, more advanced patterns, field separators,
arithmetic statements, and other selection criteria, you can produce
much more complex output. The `awk' language is very useful for producing
reports from large amounts of raw data, such as summarizing information
from the output of other utility programs like `ls'. (*Note A More
Complex Example: More Complex.)
Programs written with `awk' are usually much smaller than they would
be in other languages. This makes `awk' programs easy to compose and
use. Often `awk' programs can be quickly composed at your terminal,
used once, and thrown away. Since `awk' programs are interpreted, you
can avoid the usually lengthy edit-compile-test-debug cycle of software
development.
Complex programs have been written in `awk', including a complete
retargetable assembler for 8-bit microprocessors (*note Glossary::., for
more information) and a microcode assembler for a special-purpose Prolog
computer. However, `awk''s capabilities are strained by tasks of such
complexity.
If you find yourself writing `awk' scripts of more than, say, a few
hundred lines, you might consider using a different programming
language. Emacs Lisp is a good choice if you need sophisticated string
or pattern matching capabilities. The shell is also good at string and
pattern matching; in addition, it allows powerful use of the system
utilities. More conventional languages, such as C, C++, and Lisp, offer
better facilities for system programming and for managing the complexity
of large programs. Programs in these languages may require more lines
of source code than the equivalent `awk' programs, but they are easier
to maintain and usually run more efficiently.
File: gawk.info, Node: Reading Files, Next: Printing, Prev: Getting Started, Up: Top
Reading Input Files
*******************
In the typical `awk' program, all input is read either from the
standard input (by default the keyboard, but often a pipe from another
command) or from files whose names you specify on the `awk' command
line. If you specify input files, `awk' reads them in order, reading
all the data from one before going on to the next. The name of the
current input file can be found in the built-in variable `FILENAME'
(*note Built-in Variables::.).
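The following sketch shows input files being read in order, with
`FILENAME' naming the file each record came from. The function name
`filename_demo' and the file names `first' and `second' are
hypothetical; the files are created in a temporary directory made with
`mktemp', which is assumed to be available on your system.

```shell
# Hypothetical helper: creates two small files and labels each record
# with the name of the file it was read from.
filename_demo() {
    dir=$(mktemp -d) || return 1
    printf 'a1\na2\n' > "$dir/first"
    printf 'b1\n' > "$dir/second"
    ( cd "$dir" && awk '{ print FILENAME ": " $0 }' first second )
    rm -rf "$dir"
}

filename_demo
```

All the records of `first' appear before any record of `second'.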
The input is read in units called records, and processed by the
rules one record at a time. By default, each record is one line. Each
record is split automatically into fields, to make it more convenient
for a rule to work on its parts.
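As a brief sketch of records and fields, the program below reports on
each record it reads. It relies on two standard built-in variables not
introduced in this passage, `NR' (the number of records read so far)
and `NF' (the number of fields in the current record). The function
name `records_demo' and the input are hypothetical.

```shell
# Hypothetical helper: each input line is one record; fields are the
# whitespace-separated words on that line.
records_demo() {
    printf 'red green blue\none two\n' |
    awk '{ print "record " NR " has " NF " fields; first is " $1 }'
}

records_demo
```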
On rare occasions you will need to use the `getline' command, which
can do explicit input from any number of files (*note Explicit Input
with `getline': Getline.).
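As a rough sketch of explicit input, the program below reads a file
with `getline' from a `BEGIN' rule, so no ordinary input is read at
all. The function name `getline_demo' and the variable names `extra',
`line' and `n' are hypothetical, and the temporary file is created
with `mktemp', which is assumed to be available on your system.

```shell
# Hypothetical helper: counts the lines of a file using getline only.
getline_demo() {
    tmp=$(mktemp) || return 1
    printf 'x\ny\nz\n' > "$tmp"
    awk -v extra="$tmp" 'BEGIN {
        # getline returns 1 for each record read, 0 at end of file,
        # and -1 on error, so test explicitly for "> 0".
        while ((getline line < extra) > 0)
            n++
        close(extra)
        print n " lines read with getline"
    }'
    rm -f "$tmp"
}

getline_demo
```

Testing the result with `> 0' keeps the loop from running forever if
the file cannot be opened.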
* Menu:

* Records::                     Controlling how data is split into records.
* Fields::                      An introduction to fields.
* Non-Constant Fields::         Non-constant Field Numbers.
* Changing Fields::             Changing the Contents of a Field.
* Field Separators::            The field separator and how to change it.
* Constant Size::               Reading constant width data.
* Multiple Line::               Reading multi-line records.
* Getline::                     Reading files under explicit program control
                                using the `getline' function.
* Close Input::                 Closing an input file (so you can read from
                                the beginning once more).
File: gawk.info, Node: Records, Next: Fields, Prev: Reading Files, Up: Reading Files
How Input is Split into Records
===============================
The `awk' language divides its input into records and fields.
Records are separated by a character called the "record separator". By
default, the record separator is the newline character, defining a
record to be a single line of text.
Sometimes you may want to use a different character to separate your
records. You can use a different character by changing the built-in
variable `RS'. The value of `RS' is a string that says how to separate
records; the default value is `"\n"', the