This is Info file textutils.info, produced by Makeinfo-1.64 from the
input file /ade-src/fsf/textutils/doc/textutils.texi.
START-INFO-DIR-ENTRY
* Text utilities: (textutils).          GNU text utilities.
* cat: (textutils)cat invocation.               Concatenate and write files.
* cksum: (textutils)cksum invocation.           Print POSIX CRC checksum.
* comm: (textutils)comm invocation.             Compare sorted files by line.
* csplit: (textutils)csplit invocation.         Split by context.
* cut: (textutils)cut invocation.               Print selected parts of lines.
* expand: (textutils)expand invocation.         Convert tabs to spaces.
* fmt: (textutils)fmt invocation.               Reformat paragraph text.
* fold: (textutils)fold invocation.             Wrap long input lines.
* head: (textutils)head invocation.             Output the first part of files.
* join: (textutils)join invocation.             Join lines on a common field.
* md5sum: (textutils)md5sum invocation.         Print or check message-digests.
* nl: (textutils)nl invocation.                 Number lines and write files.
* od: (textutils)od invocation.                 Dump files in octal, etc.
* paste: (textutils)paste invocation.           Merge lines of files.
* pr: (textutils)pr invocation.                 Paginate or columnate files.
* sort: (textutils)sort invocation.             Sort text files.
* split: (textutils)split invocation.           Split into fixed-size pieces.
* sum: (textutils)sum invocation.               Print traditional checksum.
* tac: (textutils)tac invocation.               Reverse files.
* tail: (textutils)tail invocation.             Output the last part of files.
* tr: (textutils)tr invocation.                 Translate characters.
* unexpand: (textutils)unexpand invocation.     Convert spaces to tabs.
* uniq: (textutils)uniq invocation.             Uniqify files.
* wc: (textutils)wc invocation.                 Byte, word, and line counts.
END-INFO-DIR-ENTRY
   This file documents the GNU text utilities.
   Copyright (C) 1994, 95, 96 Free Software Foundation, Inc.
   Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.
   Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided that
the entire resulting derived work is distributed under the terms of a
permission notice identical to this one.
   Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that this permission notice may be stated in a
translation approved by the Foundation.
File: textutils.info,  Node: Top,  Next: Introduction,  Up: (dir)
GNU text utilities
******************
   This manual minimally documents version 1.19 of the GNU text
utilities.
* Menu:
* Introduction::                       Caveats, overview, and authors.
* Common options::                     Common options.
* Output of entire files::             cat tac nl od
* Formatting file contents::           fmt pr fold
* Output of parts of files::           head tail split csplit
* Summarizing files::                  wc sum cksum md5sum
* Operating on sorted files::          sort uniq comm
* Operating on fields within a line::  cut paste join
* Operating on characters::            tr expand unexpand
* Opening the software toolbox::       The software tools philosophy.
* Index::                              General index.
File: textutils.info,  Node: Introduction,  Next: Common options,  Prev: Top,  Up: Top
Introduction
************
   This manual is incomplete: No attempt is made to explain basic
concepts in a way suitable for novices.  Thus, if you are interested,
please get involved in improving this manual.  The entire GNU community
will benefit.
   The GNU text utilities are mostly compatible with the POSIX.2
standard.
   Please report bugs to `bug-gnu-utils@prep.ai.mit.edu'.  Remember to
include the version number, machine architecture, input files, and any
other information needed to reproduce the bug: your input, what you
expected, what you got, and why it is wrong.  Diffs are welcome, but
please include a description of the problem as well, since this is
sometimes difficult to infer. *Note Bugs: (gcc)Bugs.
   This manual is based on the Unix man pages in the distribution, which
were originally written by David MacKenzie and updated by Jim Meyering.
The original `fmt' man page was written by Ross Paterson.  François
Pinard did the initial conversion to Texinfo format.  Karl Berry did
the indexing, some reorganization, and editing of the results.  Richard
Stallman contributed his usual invaluable insights to the overall
process.
File: textutils.info,  Node: Common options,  Next: Output of entire files,  Prev: Introduction,  Up: Top
Common options
**************
   Certain options are available in all these programs.  Rather than
repeating identical descriptions for each program, this manual
describes them here.  (In fact, every GNU program accepts, or should
accept, these options.)
   A few of these programs take arbitrary strings as arguments.  In
those cases, `--help' and `--version' are recognized as options only
when they are the sole command line argument.
`--help'
     Print a usage message listing all available options, then exit
     successfully.
`--version'
     Print the version number, then exit successfully.
File: textutils.info,  Node: Output of entire files,  Next: Formatting file contents,  Prev: Common options,  Up: Top
Output of entire files
**********************
   These commands read and write entire files, possibly transforming
them in some way.
* Menu:
* cat invocation::              Concatenate and write files.
* tac invocation::              Concatenate and write files in reverse.
* nl invocation::               Number lines and write files.
* od invocation::               Write files in octal or other formats.
File: textutils.info,  Node: cat invocation,  Next: tac invocation,  Up: Output of entire files
`cat': Concatenate and write files
==================================
   `cat' copies each FILE (`-' means standard input), or standard input
if none are given, to standard output.  Synopsis:
     cat [OPTION] [FILE]...
   The program accepts the following options.  Also see *Note Common
options::.
`-A'
`--show-all'
     Equivalent to `-vET'.
`-b'
`--number-nonblank'
     Number all nonblank output lines, starting with 1.
`-e'
     Equivalent to `-vE'.
`-E'
`--show-ends'
     Display a `$' after the end of each line.
`-n'
`--number'
     Number all output lines, starting with 1.
`-s'
`--squeeze-blank'
     Replace multiple adjacent blank lines with a single blank line.
`-t'
     Equivalent to `-vT'.
`-T'
`--show-tabs'
     Display TAB characters as `^I'.
`-u'
     Ignored; for Unix compatibility.
`-v'
`--show-nonprinting'
     Display control characters except for LFD and TAB using `^'
     notation and precede characters that have the high bit set with
     `M-'.
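   As a minimal sketch of the options above, assuming a POSIX shell
and a current GNU `cat', the following feeds a three-line sample (one
line containing a TAB, one empty line) through `-A', `-n', and `-b'.
The variable names are illustrative only.

```shell
# Sample input: a line with a TAB, an empty line, and a final line.
input=$(printf 'one\ttwo\n\nthree')

# -A (--show-all) marks each line end with a dollar sign and shows TAB as ^I.
shown=$(printf '%s\n' "$input" | cat -A)
printf '%s\n' "$shown"

# -n numbers every line; -b numbers only the nonblank ones.
# cut and tr strip the padding so that only the last number remains.
all_count=$(printf '%s\n' "$input" | cat -n | tail -n 1 | cut -f 1 | tr -d ' ')
nonblank_count=$(printf '%s\n' "$input" | cat -b | tail -n 1 | cut -f 1 | tr -d ' ')
printf 'numbered with -n: %s, with -b: %s\n' "$all_count" "$nonblank_count"
```

   With `-n' the last line is numbered 3; with `-b' it is numbered 2,
because the empty line is skipped.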
File: textutils.info,  Node: tac invocation,  Next: nl invocation,  Prev: cat invocation,  Up: Output of entire files
`tac': Concatenate and write files in reverse
=============================================
   `tac' copies each FILE (`-' means standard input), or standard input
if none are given, to standard output, reversing the records (lines by
default) in each separately.  Synopsis:
     tac [OPTION]... [FILE]...
   "Records" are separated by instances of a string (newline by
default).  By default, this separator string is attached to the end of
the record that it follows in the file.
   The program accepts the following options.  Also see *Note Common
options::.
`-b'
`--before'
     The separator is attached to the beginning of the record that it
     precedes in the file.
`-r'
`--regex'
     Treat the separator string as a regular expression.
`-s SEPARATOR'
`--separator=SEPARATOR'
     Use SEPARATOR as the record separator, instead of newline.
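   A short sketch of record reversal, assuming a POSIX shell and a
current GNU `tac'.  The first command reverses lines; the second uses
`-s' with a comma as the record separator, which stays attached to the
end of each record.

```shell
# Default: records are lines, so the line order is reversed.
reversed=$(printf 'a\nb\nc\n' | tac)
printf '%s\n' "$reversed"

# With -s, each record is the text up to and including the next comma.
fields=$(printf '1,2,3,' | tac -s ,)
printf '%s\n' "$fields"
```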
File: textutils.info,  Node: nl invocation,  Next: od invocation,  Prev: tac invocation,  Up: Output of entire files
`nl': Number lines and write files
==================================
   `nl' writes each FILE (`-' means standard input), or standard input
if none are given, to standard output, with line numbers added to some
or all of the lines.  Synopsis:
     nl [OPTION]... [FILE]...
   `nl' decomposes its input into (logical) pages; by default, the line
number is reset to 1 at the top of each logical page.  `nl' treats all
of the input files as a single document; it does not reset line numbers
or logical pages between files.
   A logical page consists of three sections: header, body, and footer.
Any of the sections can be empty.  Each can be numbered in a different
style from the others.
   The beginnings of the sections of logical pages are indicated in the
input file by a line containing exactly one of these delimiter strings:
`\:\:\:'
     start of header;
`\:\:'
     start of body;
`\:'
     start of footer.
   The two characters from which these strings are made can be changed
from `\' and `:' via options (see below), but the pattern and length of
each string cannot be changed.
   A section delimiter is replaced by an empty line on output.  Any text
that comes before the first section delimiter string in the input file
is considered to be part of a body section, so `nl' treats a file that
contains no section delimiters as a single body section.
   The program accepts the following options.  Also see *Note Common
options::.
`-b STYLE'
`--body-numbering=STYLE'
     Select the numbering style for lines in the body section of each
     logical page.  When a line is not numbered, the current line number
     is not incremented, but the line number separator character is
     still prepended to the line.  The styles are:
    `a'
          number all lines,
    `t'
          number only nonempty lines (default for body),
    `n'
          do not number lines (default for header and footer),
    `pREGEXP'
          number only lines that contain a match for REGEXP.
`-d CD'
`--section-delimiter=CD'
     Set the section delimiter characters to CD; default is `\:'. If
     only C is given, the second remains `:'.  (Remember to protect `\'
     or other metacharacters from shell expansion with quotes or extra
     backslashes.)
`-f STYLE'
`--footer-numbering=STYLE'
     Analogous to `--body-numbering'.
`-h STYLE'
`--header-numbering=STYLE'
     Analogous to `--body-numbering'.
`-i NUMBER'
`--page-increment=NUMBER'
     Increment line numbers by NUMBER (default 1).
`-l NUMBER'
`--join-blank-lines=NUMBER'
     Consider NUMBER (default 1) consecutive empty lines to be one
     logical line for numbering, and only number the last one.  Where
     fewer than NUMBER consecutive empty lines occur, do not number
     them.  An empty line is one that contains no characters, not even
     spaces or tabs.
`-n FORMAT'
`--number-format=FORMAT'
     Select the line numbering format (default is `rn'):
    `ln'
          left justified, no leading zeros;
    `rn'
          right justified, no leading zeros;
    `rz'
          right justified, leading zeros.
`-p'
`--no-renumber'
     Do not reset the line number at the start of a logical page.
`-s STRING'
`--number-separator=STRING'
     Separate the line number from the text line in the output with
     STRING (default is TAB).
`-v NUMBER'
`--starting-line-number=NUMBER'
     Set the initial line number on each logical page to NUMBER
     (default 1).
`-w NUMBER'
`--number-width=NUMBER'
     Use NUMBER characters for line numbers (default 6).
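   The numbering options can be combined.  The sketch below assumes a
POSIX shell and a current GNU `nl'; it compares the default body style
(`t') with `-b a', then sets the format, width, separator, starting
number, and increment together.  `cut' and `tr' are used only to
isolate the numbers.

```shell
# Default style numbers only nonempty lines, so the empty line gets no number.
default_nums=$(printf 'a\n\nb\n' | nl | cut -f 1 | tr -d ' \n')

# -b a numbers every line, including the empty one.
all_nums=$(printf 'a\n\nb\n' | nl -b a | cut -f 1 | tr -d ' \n')
printf 'default: %s, -b a: %s\n' "$default_nums" "$all_nums"

# Zero-padded, three columns wide, ": " as separator, start at 10, step 5.
styled=$(printf 'a\nb\n' | nl -b a -n rz -w 3 -s ': ' -v 10 -i 5)
printf '%s\n' "$styled"
```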
File: textutils.info,  Node: od invocation,  Prev: nl invocation,  Up: Output of entire files
`od': Write files in octal or other formats
===========================================
   `od' writes an unambiguous representation of each FILE (`-' means
standard input), or standard input if none are given.  Synopsis:
     od [OPTION]... [FILE]...
     od -C [FILE] [[+]OFFSET [[+]LABEL]]
   Each line of output consists of the offset in the input, followed by
groups of data from the file. By default, `od' prints the offset in
octal, and each group of file data is two bytes of input printed as a
single octal number.
   The program accepts the following options.  Also see *Note Common
options::.
`-A RADIX'
`--address-radix=RADIX'
     Select the base in which file offsets are printed.  RADIX can be
     one of the following:
    `d'
          decimal;
    `o'
          octal;
    `x'
          hexadecimal;
    `n'
          none (do not print offsets).
     The default is octal.
`-j BYTES'
`--skip-bytes=BYTES'
     Skip BYTES input bytes before formatting and writing.  If BYTES
     begins with `0x' or `0X', it is interpreted in hexadecimal;
     otherwise, if it begins with `0', in octal; otherwise, in decimal.
     Appending `b' multiplies BYTES by 512, `k' by 1024, and `m' by
     1048576.
`-N BYTES'
`--read-bytes=BYTES'
     Output at most BYTES bytes of the input.  Prefixes and suffixes on
     BYTES are interpreted as for the `-j' option.
`-s [N]'
`--strings[=N]'
     Instead of the normal output, output only "string constants": at
     least N (3 by default) consecutive ASCII graphic characters,
     followed by a null (zero) byte.
`-t TYPE'
`--format=TYPE'
     Select the format in which to output the file data.  TYPE is a
     string of one or more of the below type indicator characters.  If
     you include more than one type indicator character in a single TYPE
     string, or use this option more than once, `od' writes one copy of
     each output line using each of the data types that you specified,
     in the order that you specified.
    `a'
          named character,
    `c'
          ASCII character or backslash escape,
    `d'
          signed decimal,
    `f'
          floating point,
    `o'
          octal,
    `u'
          unsigned decimal,
    `x'
          hexadecimal.
     The type `a' outputs things like `sp' for space, `nl' for newline,
     and `nul' for a null (zero) byte.  Type `c' outputs ` ', `\n', and
     `\0', respectively.
     Except for types `a' and `c', you can specify the number of bytes
     to use in interpreting each number in the given data type by
     following the type indicator character with a decimal integer.
     Alternately, you can specify the size of one of the C compiler's
     built-in data types by following the type indicator character with
     one of the following characters.  For integers (`d', `o', `u',
     `x'):
    `C'
          char,
    `S'
          short,
    `I'
          int,
    `L'
          long.
     For floating point (`f'):
    `F'
          float,
    `D'
          double,
    `L'
          long double.
`-v'
`--output-duplicates'
     Output consecutive lines that are identical.  By default, when two
     or more consecutive output lines would be identical, `od' outputs
     only the first line, and puts just an asterisk on the following
     line to indicate the elision.
`-w[N]'
`--width[=N]'
     Dump N input bytes per output line.  This must be a multiple of
     the least common multiple of the sizes associated with the
     specified output types.  If N is omitted, the default is 32.  If
     this option is not given at all, the default is 16.
   The next several options map the old, pre-POSIX format specification
options to the corresponding POSIX format specs.  GNU `od' accepts any
combination of old- and new-style options.  Format specification
options accumulate.
`-a'
     Output as named characters.  Equivalent to `-ta'.
`-b'
     Output as octal bytes.  Equivalent to `-toC'.
`-c'
     Output as ASCII characters or backslash escapes.  Equivalent to
     `-tc'.
`-d'
     Output as unsigned decimal shorts.  Equivalent to `-tu2'.
`-f'
     Output as floats.  Equivalent to `-tfF'.
`-h'
     Output as hexadecimal shorts.  Equivalent to `-tx2'.
`-i'
     Output as decimal shorts.  Equivalent to `-td2'.
`-l'
     Output as decimal longs.  Equivalent to `-td4'.
`-o'
     Output as octal shorts.  Equivalent to `-to2'.
`-x'
     Output as hexadecimal shorts.  Equivalent to `-tx2'.
`--traditional'
     Recognize the pre-POSIX non-option arguments that traditional `od'
     accepted.  The following syntax:
          od --traditional [FILE] [[+]OFFSET[.][b] [[+]LABEL[.][b]]]
     can be used to specify at most one file and optional arguments
     specifying an offset and a pseudo-start address, LABEL.  By
     default, OFFSET is interpreted as an octal number specifying how
     many input bytes to skip before formatting and writing.  The
     optional trailing decimal point forces the interpretation of
     OFFSET as a decimal number.  If no decimal is specified and the
     offset begins with `0x' or `0X' it is interpreted as a hexadecimal
     number.  If there is a trailing `b', the number of bytes skipped
     will be OFFSET multiplied by 512.  The LABEL argument is
     interpreted just like OFFSET, but it specifies an initial
     pseudo-address.  The pseudo-addresses are displayed in parentheses
     following any normal address.
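   The following sketch illustrates `-A', `-t', `-j', and `-N',
assuming a POSIX shell and a current GNU `od'.  Because the exact
column spacing of `od' output can vary, the hexadecimal results are
stripped of blanks with `tr' before being printed.

```shell
# One-byte hexadecimal groups with no offset column (-A n).
hex=$(printf 'AB\n' | od -A n -t x1 | tr -d ' \n')

# Skip two input bytes (-j 2), then dump at most two (-N 2).
window=$(printf 'abcdef' | od -A n -t x1 -j 2 -N 2 | tr -d ' \n')

# Type a prints names such as sp and nl for space and newline.
named=$(printf ' \n' | od -A n -t a)

printf 'hex=%s window=%s named=%s\n' "$hex" "$window" "$named"
```

   Here `A', `B', and newline are bytes 41, 42, and 0a in hexadecimal,
and the two-byte window covers `c' and `d' (63 and 64).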
File: textutils.info,  Node: Formatting file contents,  Next: Output of parts of files,  Prev: Output of entire files,  Up: Top
Formatting file contents
************************
   These commands reformat the contents of files.
* Menu:
* fmt invocation::              Reformat paragraph text.
* pr invocation::               Paginate or columnate files for printing.
* fold invocation::             Wrap input lines to fit in specified width.
File: textutils.info,  Node: fmt invocation,  Next: pr invocation,  Up: Formatting file contents
`fmt': Reformat paragraph text
==============================
   `fmt' fills and joins lines to produce output lines of (at most) a
given number of characters (75 by default).  Synopsis:
     fmt [OPTION]... [FILE]...
   `fmt' reads from the specified FILE arguments (or standard input if
none are given), and writes to standard output.
   By default, blank lines, spaces between words, and indentation are
preserved in the output; successive input lines with different
indentation are not joined; tabs are expanded on input and introduced on
output.
   `fmt' prefers breaking lines at the end of a sentence, and tries to
avoid line breaks after the first word of a sentence or before the last
word of a sentence.  A "sentence break" is defined as either the end of
a paragraph or a word ending in any of `.?!', followed by two spaces or
end of line, ignoring any intervening parentheses or quotes.  Like TeX,
`fmt' reads entire "paragraphs" before choosing line breaks; the
algorithm is a variant of that in "Breaking Paragraphs Into Lines"
(Donald E. Knuth and Michael F. Plass, `Software--Practice and
Experience', 11 (1981), 1119-1184).
   The program accepts the following options.  Also see *Note Common
options::.
`-c'
`--crown-margin'
     "Crown margin" mode: preserve the indentation of the first two
     lines within a paragraph, and align the left margin of each
     subsequent line with that of the second line.
`-t'
`--tagged-paragraph'
     "Tagged paragraph" mode: like crown margin mode, except that if
     indentation of the first line of a paragraph is the same as the
     indentation of the second, the first line is treated as a one-line
     paragraph.
`-s'
`--split-only'
     Split lines only.  Do not join short lines to form longer ones.
     This prevents sample lines of code, and other such "formatted"
     text from being unduly combined.
`-u'
`--uniform-spacing'
     Uniform spacing.  Reduce spacing between words to one space, and
     spacing between sentences to two spaces.
`-WIDTH'
`-w WIDTH'
`--width=WIDTH'
     Fill output lines up to WIDTH characters (default 75).  `fmt'
     initially tries to make lines about 7% shorter than this, to give
     it room to balance line lengths.
`-p PREFIX'
`--prefix=PREFIX'
     Only lines beginning with PREFIX (possibly preceded by whitespace)
     are subject to formatting. The prefix and any preceding whitespace
     are stripped for the formatting and then re-attached to each
     formatted output line.  One use is to format certain kinds of
     program comments, while leaving the code unchanged.
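   A minimal sketch of refilling with `-w', assuming a POSIX shell and
a current GNU `fmt'.  The sample sentence is arbitrary.  Where the
breaks fall is decided by the line-breaking algorithm, so the sketch
only rejoins the output with `paste' to show that no words were lost
or changed.

```shell
text='The quick brown fox jumps over the lazy dog and keeps running through the field.'

# Refill the single long line into lines of at most 30 characters.
wrapped=$(printf '%s\n' "$text" | fmt -w 30)
printf '%s\n' "$wrapped"

# Joining the output lines with single spaces gives back the input text.
rejoined=$(printf '%s\n' "$wrapped" | paste -s -d ' ' -)
printf '%s\n' "$rejoined"
```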
File: textutils.info,  Node: pr invocation,  Next: fold invocation,  Prev: fmt invocation,  Up: Formatting file contents
`pr': Paginate or columnate files for printing
==============================================
   `pr' writes each FILE (`-' means standard input), or standard input
if none are given, to standard output, paginating and optionally
outputting in multicolumn format.  Synopsis:
     pr [OPTION]... [FILE]...
   By default, a 5-line header is printed: two blank lines; a line with
the date, the file name, and the page count; and two more blank lines.
A 5-line footer (entirely blank) is also printed.
   Form feeds in the input cause page breaks in the output.
   The program accepts the following options.  Also see *Note Common
options::.
`+PAGE'
     Begin printing with page PAGE.
`-COLUMN'
     Produce COLUMN-column output and print columns down.  The column
     width is automatically decreased as COLUMN increases; unless you
     use the `-w' option to increase the page width as well, this option
     might well cause some input to be truncated.
`-a'
     Print columns across rather than down.
`-b'
     Balance columns on the last page.
`-c'
     Print control characters using hat notation (e.g., `^G'); print
     other unprintable characters in octal backslash notation.  By
     default, unprintable characters are not changed.
`-d'
     Double space the output.
`-e[IN-TABCHAR[IN-TABWIDTH]]'
     Expand tabs to spaces on input.  Optional argument IN-TABCHAR is
     the input tab character (default is TAB).  Second optional
     argument IN-TABWIDTH is the input tab character's width (default
     is 8).
`-f'
     Use a formfeed instead of newlines to separate output pages.
`-h HEADER'
     Replace the file name in the header with the string HEADER.
`-i[OUT-TABCHAR[OUT-TABWIDTH]]'
     Replace spaces with tabs on output.  Optional argument OUT-TABCHAR
     is the output tab character (default is TAB).  Second optional
     argument OUT-TABWIDTH is the output tab character's width (default
     is 8).
`-l N'
     Set the page length to N (default 66) lines.  If N is less than
     10, the headers and footers are omitted, as if the `-t' option had
     been given.
`-m'
     Print all files in parallel, one in each column.
`-n[NUMBER-SEPARATOR[DIGITS]]'
     Precede each column with a line number; with parallel files (`-m'),
     precede each line with a line number.  Optional argument
     NUMBER-SEPARATOR is the character to print after each number
     (default is TAB).  Optional argument DIGITS is the number of
     digits per line number (default is 5).
`-o N'
     Indent each line by N spaces (default is zero), i.e., set the
     left margin.  The total page width is N plus the width set with
     the `-w' option.
`-r'
     Do not print a warning message when an argument FILE cannot be
     opened.  (The exit status will still be nonzero, however.)
`-s[C]'
     Separate columns by the single character C.  If C is omitted, the
     default is space; if this option is omitted altogether, the
     default is TAB.
`-t'
     Do not print the usual 5-line header and the 5-line footer on each
     page, and do not fill out the bottoms of pages (with blank lines or
     formfeeds).
`-v'
     Print unprintable characters in octal backslash notation.
`-w N'
     Set the page width to N (default is 72) columns.
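   As an illustration of parallel output, the sketch below prints two
small files side by side with `-m', using `-t' to suppress the header
and footer.  It assumes a POSIX shell, a current GNU `pr', and the
`mktemp' utility (not part of this package); the file names `left' and
`right' are hypothetical.

```shell
# Work in a scratch directory that is removed at the end.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/left"
printf '1\n2\n' > "$dir/right"

# -m prints the files in parallel, one per column; -t omits header and footer.
merged=$(pr -m -t "$dir/left" "$dir/right")
printf '%s\n' "$merged"

rm -r "$dir"
```

   Each output line holds one line from `left' followed by the
corresponding line from `right'; the amount of padding between the
columns depends on the page width.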
File: textutils.info,  Node: fold invocation,  Prev: pr invocation,  Up: Formatting file contents
`fold': Wrap input lines to fit in specified width
==================================================
   `fold' writes each FILE (`-' means standard input), or standard
input if none are given, to standard output, breaking long lines.
Synopsis:
     fold [OPTION]... [FILE]...
   By default, `fold' breaks lines wider than 80 columns. The output is
split into as many lines as necessary.
   `fold' counts screen columns by default; thus, a tab may count more
than one column, backspace decreases the column count, and carriage
return sets the column to zero.
   The program accepts the following options.  Also see *Note Common
options::.
`-b'
`--bytes'
     Count bytes rather than columns, so that tabs, backspaces, and
     carriage returns are each counted as taking up one column, just
     like other characters.
`-s'
`--spaces'
     Break at word boundaries: the line is broken after the last blank
     before the maximum line length.  If the line contains no such
     blanks, the line is broken at the maximum line length as usual.
`-w WIDTH'
`--width=WIDTH'
     Use a maximum line length of WIDTH columns instead of 80.
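   The difference between a hard break and a break at blanks can be
sketched as follows, assuming a POSIX shell and a current GNU `fold'.

```shell
# Hard break every 4 columns.
hard=$(printf 'abcdefghij\n' | fold -w 4)
printf '%s\n' "$hard"

# With -s the line is broken after the last blank that fits in 8 columns,
# so the first output line keeps its trailing space.
soft=$(printf 'one two three\n' | fold -s -w 8)
printf '%s\n' "$soft"
```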
File: textutils.info,  Node: Output of parts of files,  Next: Summarizing files,  Prev: Formatting file contents,  Up: Top
Output of parts of files
************************
   These commands output pieces of the input.
* Menu:
* head invocation::             Output the first part of files.
* tail invocation::             Output the last part of files.
* split invocation::            Split a file into fixed-size pieces.
* csplit invocation::           Split a file into context-determined pieces.
File: textutils.info,  Node: head invocation,  Next: tail invocation,  Up: Output of parts of files
`head': Output the first part of files
======================================
   `head' prints the first part (10 lines by default) of each FILE; it
reads from standard input if no files are given or when given a FILE of
`-'.  Synopses:
     head [OPTION]... [FILE]...
     head -NUMBER [OPTION]... [FILE]...
   If more than one FILE is specified, `head' prints a one-line header
consisting of
     ==> FILE NAME <==
before the output for each FILE.
   `head' accepts two option formats: the new one, in which numbers are
arguments to the options (`-q -n 1'), and the old one, in which the
number precedes any option letters (`-1q').
   The program accepts the following options.  Also see *Note Common
options::.
`-COUNTOPTIONS'
     This option is only recognized if it is specified first.  COUNT is
     a decimal number optionally followed by a size letter (`b', `k',
     `m') as in `-c', or `l' to mean count by lines, or other option
     letters (`cqv').
`-c BYTES'
`--bytes=BYTES'
     Print the first BYTES bytes, instead of initial lines.  Appending
     `b' multiplies BYTES by 512, `k' by 1024, and `m' by 1048576.
`-n N'
`--lines=N'
     Output the first N lines.
`-q'
`--quiet'
`--silent'
     Never print file name headers.
`-v'
`--verbose'
     Always print file name headers.
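   A brief sketch of the new-style options, assuming a POSIX shell and
a current GNU `head'; the old `-NUMBER' form is left out because not
every implementation accepts it.

```shell
# First two lines of the input.
first_lines=$(printf '1\n2\n3\n4\n' | head -n 2)
printf '%s\n' "$first_lines"

# First three bytes of the input, regardless of line structure.
first_bytes=$(printf 'abcdef' | head -c 3)
printf '%s\n' "$first_bytes"
```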
File: textutils.info,  Node: tail invocation,  Next: split invocation,  Prev: head invocation,  Up: Output of parts of files
`tail': Output the last part of files
=====================================
   `tail' prints the last part (10 lines by default) of each FILE; it
reads from standard input if no files are given or when given a FILE of
`-'.  Synopses:
     tail [OPTION]... [FILE]...
     tail -NUMBER [OPTION]... [FILE]...
     tail +NUMBER [OPTION]... [FILE]...
   If more than one FILE is specified, `tail' prints a one-line header
consisting of
     ==> FILE NAME <==
before the output for each FILE.
   GNU `tail' can output any amount of data (some other versions of
`tail' cannot).  It also has no `-r' option (print in reverse), since
reversing a file is really a different job from printing the end of a
file; BSD `tail' (which is the one with `-r') can only reverse files
that are at most as large as its buffer, which is typically 32k.  A
more reliable and versatile way to reverse files is the GNU `tac'
command.
   `tail' accepts two option formats: the new one, in which numbers are
arguments to the options (`-n 1'), and the old one, in which the number
precedes any option letters (`-1' or `+1').
   If any option-argument is a number N starting with a `+', `tail'
begins printing with the Nth item from the start of each file, instead
of from the end.
   The program accepts the following options.  Also see *Note Common
options::.
`-COUNT'
`+COUNT'
     This option is only recognized if it is specified first.  COUNT is
     a decimal number optionally followed by a size letter (`b', `k',
     `m') as in `-c', or `l' to mean count by lines, or other option
     letters (`cfqv').
`-c BYTES'
`--bytes=BYTES'
     Output the last BYTES bytes, instead of final lines.  Appending
     `b' multiplies BYTES by 512, `k' by 1024, and `m' by 1048576.
`-f'
`--follow'
     Loop forever trying to read more characters at the end of the file,
     presumably because the file is growing.  Ignored if reading from a
     pipe.  If more than one file is given, `tail' prints a header
     whenever it gets output from a different file, to indicate which
     file that output is from.
`-n N'
`--lines=N'
     Output the last N lines.
`-q'
`--quiet'
`--silent'
     Never print file name headers.
`-v'
`--verbose'
     Always print file name headers.
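   The sketch below contrasts counting from the end with counting from
the start (a number prefixed with `+'), assuming a POSIX shell and a
current GNU `tail'.

```shell
# Last two lines.
last_lines=$(printf '1\n2\n3\n4\n' | tail -n 2)
printf '%s\n' "$last_lines"

# A leading + means: begin printing with line 2, counted from the start.
from_second=$(printf '1\n2\n3\n4\n' | tail -n +2)
printf '%s\n' "$from_second"

# Last three bytes.
last_bytes=$(printf 'abcdef' | tail -c 3)
printf '%s\n' "$last_bytes"
```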
File: textutils.info,  Node: split invocation,  Next: csplit invocation,  Prev: tail invocation,  Up: Output of parts of files
`split': Split a file into fixed-size pieces
============================================
   `split' creates output files containing consecutive sections of
INPUT (standard input if none is given or INPUT is `-').  Synopsis:
     split [OPTION] [INPUT [PREFIX]]
   By default, `split' puts 1000 lines of INPUT (or whatever is left
over for the last section), into each output file.
   The output files' names consist of PREFIX (`x' by default) followed
by a group of letters `aa', `ab', and so on, such that concatenating
the output files in sorted order by file name produces the original
input file.  (If more than 676 output files are required, `split' uses
`zaa', `zab', etc.)
   The program accepts the following options.  Also see *Note Common
options::.
`-LINES'
`-l LINES'
`--lines=LINES'
     Put LINES lines of INPUT into each output file.
`-b BYTES'
`--bytes=BYTES'
     Put BYTES bytes of INPUT into each output file.
     Appending `b' multiplies BYTES by 512, `k' by 1024, and `m' by
     1048576.
`-C BYTES'
`--line-bytes=BYTES'
     Put into each output file as many complete lines of INPUT as
     possible without exceeding BYTES bytes.  For lines longer than
     BYTES bytes, put BYTES bytes into each output file until less than
     BYTES bytes of the line are left, then continue normally.  BYTES
     has the same format as for the `--bytes' option.
`--verbose'
     Write a diagnostic to standard error just before each output file
     is opened.
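   As a sketch of the naming scheme, the following splits a five-line
file into two-line pieces and then concatenates the pieces again.  It
assumes a POSIX shell, a current GNU `split', and the `mktemp' utility
(not part of this package); the names `input' and `part.' are
hypothetical.

```shell
# Work in a scratch directory that is removed at the end.
dir=$(mktemp -d)
printf '1\n2\n3\n4\n5\n' > "$dir/input"

# Two lines per piece, with part. as PREFIX instead of the default x.
split -l 2 "$dir/input" "$dir/part."

# The shell expands part.* in sorted order: part.aa part.ab part.ac
names=$(cd "$dir" && echo part.*)
printf '%s\n' "$names"

# Concatenating the pieces in that order reproduces the input.
rejoined=$(cat "$dir"/part.*)
original=$(cat "$dir/input")

rm -r "$dir"
```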
File: textutils.info,  Node: csplit invocation,  Prev: split invocation,  Up: Output of parts of files
`csplit': Split a file into context-determined pieces
=====================================================
   `csplit' creates zero or more output files containing sections of
INPUT (standard input if INPUT is `-').  Synopsis:
     csplit [OPTION]... INPUT PATTERN...
   The contents of the output files are determined by the PATTERN
arguments, as detailed below.  An error occurs if a PATTERN argument
refers to a nonexistent line of the input file (e.g., if no remaining
line matches a given regular expression).  After every PATTERN has been
matched, any remaining input is copied into one last output file.
   By default, `csplit' prints the number of bytes written to each
output file after it has been created.
   The types of pattern arguments are:
`N'
     Create an output file containing the input up to but not including
     line N (a positive integer).  If followed by a repeat count, also
     create an output file containing the next N lines of the input
     file once for each repeat.
`/REGEXP/[OFFSET]'
     Create an output file containing the current line up to (but not
     including) the next line of the input file that contains a match
     for REGEXP.  The optional OFFSET is a `+' or `-' followed by a
     positive integer.  If it is given, the input up to the matching
     line plus or minus OFFSET is put into the output file, and the
     line after that begins the next section of input.
`%REGEXP%[OFFSET]'
     Like the previous type, except that it does not create an output
     file, so that section of the input file is effectively ignored.
`{REPEAT-COUNT}'
     Repeat the previous pattern REPEAT-COUNT additional times.
     REPEAT-COUNT can be either a positive integer or an asterisk,
     meaning repeat as many times as necessary until the input is
     exhausted.
   The output files' names consist of a prefix (`xx' by default)
followed by a suffix.  By default, the suffix is an ascending sequence
of two-digit decimal numbers from `00' up to `99'.  In any case,
concatenating the output files in sorted order by filename produces the
original input file.
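   The following sketch shows one way a regular expression pattern and
a `{*}' repeat count might be combined to start a new output file at
every heading line.  The input text, the `# ' heading convention, and
the prefix `part.' are assumptions made for the example.

```shell
# Sketch only: the input and the "# " heading convention are invented.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf 'intro\n# one\na\n# two\nb\n' > input

# Start a new piece at every line that begins with "# ".
csplit --quiet --prefix=part. input '/^# /' '{*}'

pieces=$(echo part.*)                         # names of the pieces, in sorted order
second=$(cat part.01)                         # the section that starts at "# one"

# Concatenating the pieces in name order should reproduce the input.
cat part.* | cmp -s - input && rejoin=same || rejoin=different

cd / && rm -rf "$tmp"
```

   Here the text before the first heading lands in `part.00', and each
heading line begins its own piece.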
   By default, if `csplit' encounters an error or receives a hangup,
interrupt, quit, or terminate signal, it removes any output files that
it has created so far before it exits.
   The program accepts the following options.  Also see *Note Common
options::.
`-f PREFIX'
`--prefix=PREFIX'
     Use PREFIX as the output file name prefix.
`-b SUFFIX'
`--suffix=SUFFIX'
     Use SUFFIX as the output file name suffix.  When this option is
     specified, the suffix string must include exactly one
     `printf(3)'-style conversion specification, possibly including
     format specification flags, a field width, a precision
     specification, or all of these kinds of modifiers.  The format
     letter must convert a binary integer argument to readable form;
     thus, only `d', `i', `u', `o', `x', and `X' conversions are
     allowed.  The entire SUFFIX is given (with the current output file
     number) to `sprintf(3)' to form the file name suffixes for each of
     the individual output files in turn.  If this option is used, the
     `--digits' option is ignored.
`-n DIGITS'
`--digits=DIGITS'
     Use output file names containing numbers that are DIGITS digits
     long instead of the default 2.
`--keep-files'
     Do not remove output files when errors are encountered.
`--elide-empty-files'
     Suppress the generation of zero-length output files.  (In cases
     where the section delimiters of the input file are supposed to
     mark the first lines of each of the sections, the first output
     file will generally be a zero-length file unless you use this
     option.)  The output file sequence numbers always run
     consecutively starting from 0, even when this option is specified.
`--silent'
`--quiet'
     Do not print counts of output file sizes.
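   As a hedged example of how the naming options and
`--elide-empty-files' interact, the sketch below splits an input whose
first line is itself a section delimiter.  The prefixes `raw' and `sec'
and the suffix format `%03d.txt' are illustrative choices only.

```shell
# Sketch only: prefixes and the suffix format are invented for illustration.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '# one\na\n# two\nb\n' > input         # the first line is itself a delimiter

# Without --elide-empty-files, the first piece (raw00) is empty.
csplit --quiet --prefix=raw input '/^# /' '{*}'
raw_first=$(( $(wc -c < raw00) ))

# With it, the empty piece is dropped and numbering still starts at 0.
csplit --quiet --elide-empty-files --prefix=sec -b '%03d.txt' input '/^# /' '{*}'
named=$(echo sec*)

cd / && rm -rf "$tmp"
```

   The `-b' suffix here contains a single `%03d' conversion, so the
sequence number is zero-padded to three digits and followed by `.txt'.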
File: textutils.info,  Node: Summarizing files,  Next: Operating on sorted files,  Prev: Output of parts of files,  Up: Top
Summarizing files
*****************
   These commands generate just a few numbers representing the entire
contents of files.
* Menu:
* wc invocation::               Print byte, word, and line counts.
* sum invocation::              Print checksum and block counts.
* cksum invocation::            Print CRC checksum and byte counts.
* md5sum invocation::           Print or check message-digests.
File: textutils.info,  Node: wc invocation,  Next: sum invocation,  Up: Summarizing files
`wc': Print byte, word, and line counts
=======================================
   `wc' counts the number of bytes, whitespace-separated words, and
newlines in each given FILE, or standard input if none are given or for
a FILE of `-'.  Synopsis:
     wc [OPTION]... [FILE]...
   `wc' prints one line of counts for each file, and if the file was
given as an argument, it prints the file name following the counts.  If
more than one FILE is given, `wc' prints a final line containing the
cumulative counts, with the file name `total'.  The counts are printed
in this order: newlines, words, bytes.
   By default, `wc' prints all three counts.  Options can specify that
only certain counts be printed.  Options do not undo others previously
given, so
     wc --bytes --words
prints both the byte counts and the word counts.
   The program accepts the following options.  Also see *Note Common
options::.
`--bytes'
`--chars'
     Print only the byte counts.
`--words'
     Print only the word counts.
`--lines'
     Print only the newline counts.
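   A small sketch of the output order described above, reading from
standard input so that no file name is printed.  The sample text is
arbitrary, and the variable names are only for illustration.

```shell
# Words and bytes only; wc keeps its fixed order (words before bytes),
# regardless of the order in which the options are given.
set -- $(printf 'one two\nthree\n' | wc --bytes --words)
words=$1
bytes=$2

# With no options, all three counts appear: newlines, words, bytes.
set -- $(printf 'one two\nthree\n' | wc)
lines=$1
```

   Splitting the output with `set --' sidesteps any leading padding
that `wc' may add to align its columns.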
File: textutils.info,  Node: sum invocation,  Next: cksum invocation,  Prev: wc invocation,  Up: Summarizing files
`sum': Print checksum and block counts
======================================
   `sum' computes a 16-bit checksum for each given FILE, or standard
input if none are given or for a FILE of `-'.  Synopsis:
     sum [OPTION]... [FILE]...
   `sum' prints the checksum for each FILE followed by the number of
blocks in the file (rounded up).  If more than one FILE is given, file
names are also printed (by default).  (With the `--sysv' option, the
corresponding file names are printed when there is at least one file
argument.)
   By default, GNU `sum' computes checksums using an algorithm
compatible with BSD `sum' and prints file sizes in units of 1024-byte
blocks.
   The program accepts the following options.  Also see *Note Common
options::.
`-r'
     Use the default (BSD compatible) algorithm.  This option is
     included for compatibility with the System V `sum'.  Unless `-s'
     (`--sysv') was also given, it has no effect.
`--sysv'
     Compute checksums using an algorithm compatible with System V
     `sum''s default, and print file sizes in units of 512-byte blocks.
   `sum' is provided for compatibility; the `cksum' program (see next
section) is preferable in new applications.
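   To illustrate the two block sizes, the sketch below feeds the same
1500 bytes to `sum' in both modes and keeps only the block count (the
second output field).  Using `/dev/zero' as a data source is simply a
convenient assumption for the example.

```shell
# 1500 bytes round up to 2 blocks of 1024 bytes, or 3 blocks of 512 bytes.
set -- $(head -c 1500 /dev/zero | sum)
bsd_blocks=$2                                 # default (BSD compatible) mode

set -- $(head -c 1500 /dev/zero | sum --sysv)
sysv_blocks=$2                                # System V compatible mode
```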
File: textutils.info,  Node: cksum invocation,  Next: md5sum invocation,  Prev: sum invocation,  Up: Summarizing files
`cksum': Print CRC checksum and byte counts
===========================================
   `cksum' computes a cyclic redundancy check (CRC) checksum for each
given FILE, or standard input if none are given or for a FILE of `-'.
Synopsis:
     cksum [OPTION]... [FILE]...
   `cksum' prints the CRC checksum for each file along with the number
of bytes in the file, and the filename unless no arguments were given.
   `cksum' is typically used to ensure that files transferred by
unreliable means (e.g., netnews) have not been corrupted, by comparing
the `cksum' output for the received files with the `cksum' output for
the original files (typically given in the distribution).
   The CRC algorithm is specified by the POSIX.2 standard.  It is not
compatible with the BSD or System V `sum' algorithms (see the previous
section); it is more robust.
   The only options are `--help' and `--version'.  *Note Common
options::.
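   The comparison described above might be scripted roughly as
follows.  The file names `original' and `received' are hypothetical,
and the "transfer" is simulated with `cp'.

```shell
# Sketch only: file names are invented and the transfer is simulated.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf 'release notes\n' > original
cp original received

# Compare checksum and byte count; reading standard input keeps file
# names out of the output, so the two results can be compared directly.
[ "$(cksum < original)" = "$(cksum < received)" ] && first=intact || first=corrupted

printf 'release motes\n' > received           # simulate a one-character error
[ "$(cksum < original)" = "$(cksum < received)" ] && second=intact || second=corrupted

cd / && rm -rf "$tmp"
```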
File: textutils.info,  Node: md5sum invocation,  Prev: cksum invocation,  Up: Summarizing files
`md5sum': Print or check message-digests
========================================
   `md5sum' computes a 128-bit checksum (or "fingerprint" or
"message-digest") for each specified FILE.  If a FILE is specified as
`-' or if no files are given, `md5sum' computes the checksum for the
standard input.  `md5sum' can also determine whether a file and
checksum are consistent.  Synopsis:
     md5sum [OPTION]... [FILE]...
     md5sum [OPTION]... --check [FILE]
     md5sum [OPTION]... --string=STRING ...
   For each FILE, `md5sum' outputs the MD5 checksum, a flag indicating
a binary or text input file, and the filename.  If FILE is omitted or
specified as `-', standard input is read.
   The program accepts the following options.  Also see *Note Common
options::.
`--binary'
     Treat all input files as binary.  This option has no effect on Unix
     systems, since they don't distinguish between binary and text
     files.  This option is useful on systems that have different
     internal and external character representations.
`--check'
     Read filenames and checksum information from the single FILE (or
     from stdin if no FILE was specified) and report whether each named
     file and the corresponding checksum data are consistent.  The
     input to this mode of `md5sum' is usually the output of a prior,
     checksum-generating run of `md5sum'.  Each valid line of input
     consists of an MD5 checksum, a binary/text flag, and then a
     filename.  Binary files are marked with `*', text with ` '.  For
     each such line, `md5sum' reads the named file and computes its MD5
     checksum.  Then, if the computed message digest does not match the
     one on the line with the filename, the file is noted as having
     failed the test.  Otherwise, the file passes the test.  By
     default, for each valid line, one line is written to standard
     output indicating whether the named file passed the test.  After
     all checks have been performed, if there were any failures, a
     warning is issued to standard error.  Use the `--status' option to
     inhibit that output.  If any listed file cannot be opened or read,
     if any valid line has an MD5 checksum inconsistent with the
     associated file, or if no valid line is found, `md5sum' exits with
     nonzero status.  Otherwise, it exits successfully.
`--status'
     This option is useful only when verifying checksums.  When
     verifying checksums, don't generate the default one-line-per-file
     diagnostic and don't output the warning summarizing any failures.
     Failures to open or read a file still evoke individual diagnostics
     to standard error.  If all listed files are readable and are
     consistent with the associated MD5 checksums, exit successfully.
     Otherwise exit with a status code indicating there was a failure.
`--string=STRING'
     Compute the message digest for STRING, instead of for a file.  The
     result is the same as for a file that contains exactly STRING.
`--text'
     Treat all input files as text files.  This is the reverse of
     `--binary'.
`--warn'
     When verifying checksums, warn about improperly formatted MD5
     checksum lines.  This option is useful only if all but a few lines
     in the checked input are valid.
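   A typical generate-then-verify cycle could look like the sketch
below, which relies only on the exit status reported under `--check'
and `--status'.  The file names, including the list file `MD5SUMS', are
assumptions made for the example.

```shell
# Sketch only: all file names are invented for illustration.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf 'hello\n' > a.txt
printf 'world\n' > b.txt

md5sum a.txt b.txt > MD5SUMS                  # one checksum/flag/filename line per file
md5sum --check --status MD5SUMS && before=pass || before=fail

printf 'tampered\n' > b.txt                   # b.txt no longer matches its recorded checksum
md5sum --check --status MD5SUMS && after=pass || after=fail

cd / && rm -rf "$tmp"
```

   Dropping `--status' from the second check would also show which of
the listed files failed.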
File: textutils.info,  Node: Operating on sorted files,  Next: Operating on fields within a line,  Prev: Summarizing files,  Up: Top
Operating on sorted files
*************************
   These commands work with (or produce) sorted files.
* Menu:
* sort invocation::             Sort text files.
* uniq invocation::             Uniqify files.
* comm invocation::             Compare two sorted files line by line.