Fresh Fish 4

GNU Info File | 1994-02-21 | 51KB | 945 lines

This is Info file gawk.info, produced by Makeinfo-1.55 from the input file gawk.texi.

This file documents `awk', a program that you can use to select particular records in a file and perform operations upon them.

This is Edition 0.15 of `The GAWK Manual', for the 2.15 version of the GNU implementation of AWK.

Copyright (C) 1989, 1991, 1992, 1993 Free Software Foundation, Inc.

Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the Foundation.

File: gawk.info, Node: Statements/Lines, Next: When, Prev: Comments, Up: Getting Started

`awk' Statements versus Lines
=============================

Most often, each line in an `awk' program is a separate statement or separate rule, like this:

     awk '/12/  { print $0 }
          /21/  { print $0 }' BBS-list inventory-shipped

But sometimes statements can be more than one line, and lines can contain several statements. You can split a statement into multiple lines by inserting a newline after any of the following:

     ,    {    ?    :    ||    &&    do    else

A newline at any other point is considered the end of the statement. (Splitting lines after `?' and `:' is a minor `gawk' extension. The `?' and `:' referred to here is the three operand conditional expression described in *Note Conditional Expressions: Conditional Exp.)
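
As a minimal sketch of the splitting rule above, the following one-shot program breaks a pattern after `&&', an action after `{', and a `print' statement after `,'. The three input lines and the variable name `out' are hypothetical, invented only for this illustration; a POSIX-compliant shell is assumed.

```shell
# Hypothetical input: a name followed by two numbers on each line.
out=$(printf 'ann 12 21\nbob 5 21\ncy 12 3\n' | awk '
$2 == 12 &&
$3 == 21 {
    print $1,
          $2 + $3
}')
echo "$out"    # prints "ann 33"
```

Only the first input line satisfies both halves of the pattern, so the rule fires once. A newline placed anywhere else, for instance before the `&&', would end the statement early.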

If you would like to split a single statement into two lines at a point where a newline would terminate it, you can "continue" it by ending the first line with a backslash character, `\'. This is allowed absolutely anywhere in the statement, even in the middle of a string or regular expression. For example:

     awk '/This program is too long, so continue it\
      on the next line/ { print $1 }'

We have generally not used backslash continuation in the sample programs in this manual. Since in `gawk' there is no limit on the length of a line, it is never strictly necessary; it just makes programs prettier. We have preferred to make them even more pretty by keeping the statements short.

Backslash continuation is most useful when your `awk' program is in a separate source file, instead of typed in on the command line. You should also note that many `awk' implementations are more picky about where you may use backslash continuation. For maximal portability of your `awk' programs, it is best not to split your lines in the middle of a regular expression or a string.

*Warning: backslash continuation does not work as described above with the C shell.* Continuation with backslash works for `awk' programs in files, and also for one-shot programs *provided* you are using a POSIX-compliant shell, such as the Bourne shell or the Bourne-again shell. But the C shell used on Berkeley Unix behaves differently! There, you must use two backslashes in a row, followed by a newline.

When `awk' statements within one rule are short, you might want to put more than one of them on a line. You do this by separating the statements with a semicolon, `;'. This also applies to the rules themselves. Thus, the previous program could have been written:

     /12/ { print $0 } ; /21/ { print $0 }

*Note:* the requirement that rules on the same line must be separated with a semicolon is a recent change in the `awk' language; it was done for consistency with the treatment of statements within an action.
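
The sketch below illustrates both techniques under a few assumptions: a POSIX-compliant shell, the `mktemp' utility, and made-up input data. The variable names `prog', `out' and `out2' are hypothetical. The program is written to a separate source file, which is where backslash continuation is most useful, and the continuation is placed between operands rather than inside a string or regular expression, for portability.

```shell
# Store the awk program in a temporary file so the shell never sees the backslash.
prog=$(mktemp)
cat > "$prog" <<'EOF'
{
    sum += $2 \
         + $3          # one statement continued across two lines
    n++; last = $1     # two statements on one line, separated by `;'
}
END { print n, last, sum }
EOF
out=$(printf 'a 1 2\nb 3 4\n' | awk -f "$prog")
rm -f "$prog"
echo "$out"     # prints "2 b 10"

# Two rules on one line, separated by a semicolon.
out2=$(printf '12\n21\n33\n' | awk '/12/ { print $0 } ; /21/ { print $0 }')
echo "$out2"    # prints "12" and then "21"
```

The backslash must be the last character on its line; a comment or trailing blank after it would defeat the continuation.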

File: gawk.info, Node: When, Prev: Statements/Lines, Up: Getting Started

When to Use `awk'
=================

You might wonder how `awk' might be useful for you. Using additional utility programs, more advanced patterns, field separators, arithmetic statements, and other selection criteria, you can produce much more complex output. The `awk' language is very useful for producing reports from large amounts of raw data, such as summarizing information from the output of other utility programs like `ls'. (*Note A More Complex Example: More Complex.)

Programs written with `awk' are usually much smaller than they would be in other languages. This makes `awk' programs easy to compose and use. Often `awk' programs can be quickly composed at your terminal, used once, and thrown away. Since `awk' programs are interpreted, you can avoid the usually lengthy edit-compile-test-debug cycle of software development.

Complex programs have been written in `awk', including a complete retargetable assembler for 8-bit microprocessors (*note Glossary::., for more information) and a microcode assembler for a special purpose Prolog computer. However, `awk''s capabilities are strained by tasks of such complexity.

If you find yourself writing `awk' scripts of more than, say, a few hundred lines, you might consider using a different programming language. Emacs Lisp is a good choice if you need sophisticated string or pattern matching capabilities. The shell is also good at string and pattern matching; in addition, it allows powerful use of the system utilities. More conventional languages, such as C, C++, and Lisp, offer better facilities for system programming and for managing the complexity of large programs. Programs in these languages may require more lines of source code than the equivalent `awk' programs, but they are easier to maintain and usually run more efficiently.

File: gawk.info, Node: Reading Files, Next: Printing, Prev: Getting Started, Up: Top

Reading Input Files
*******************

In the typical `awk' program, all input is read either from the standard input (by default the keyboard, but often a pipe from another command) or from files whose names you specify on the `awk' command line. If you specify input files, `awk' reads them in order, reading all the data from one before going on to the next. The name of the current input file can be found in the built-in variable `FILENAME' (*note Built-in Variables::.).

The input is read in units called records, and processed by the rules one record at a time. By default, each record is one line. Each record is split automatically into fields, to make it more convenient for a rule to work on its parts.

On rare occasions you will need to use the `getline' command, which can do explicit input from any number of files (*note Explicit Input with `getline': Getline.).

* Menu:

* Records::                     Controlling how data is split into records.
* Fields::                      An introduction to fields.
* Non-Constant Fields::         Non-constant Field Numbers.
* Changing Fields::             Changing the Contents of a Field.
* Field Separators::            The field separator and how to change it.
* Constant Size::               Reading constant width data.
* Multiple Line::               Reading multi-line records.
* Getline::                     Reading files under explicit program control
                                using the `getline' function.
* Close Input::                 Closing an input file (so you can read from
                                the beginning once more).

File: gawk.info, Node: Records, Next: Fields, Prev: Reading Files, Up: Reading Files

How Input is Split into Records
===============================

The `awk' language divides its input into records and fields. Records are separated by a character called the "record separator". By default, the record separator is the newline character, defining a record to be a single line of text.

Sometimes you may want to use a different character to separate your records.
You can use a different character by changing the built-in variable `RS'. The value of `RS' is a string that says how to separate records; the default value is `"\n"', the