- grep.doc : The 'grep' command
- By: Brian E. Yoder
- 11/11/91
- 11/17/93
-
- The grep (get regular expression and print) command is an adaptation of
- the AIX grep command. It searches one or more files for a pattern.
-
- ========================================================================
- Usage
- ========================================================================
-
- grep [-flags] "pattern" [fspec ...]
-
- The grep command searches for the specified pattern. It writes each
- line that contains the pattern to standard output. The pattern is a
- regular expression as described later in this document.
-
- One or more file specifications may be present on the command line. If
- no file specifications are present, then grep reads lines from standard
- input.
-
- A file specification consists of an optional drive, optional path
- information, and a filename. The filename may consist of AIX shell
- pattern-matching characters. See pattern.doc for more details.
-
- The flags that are supported are a subset of those supported by AIX:
-
- c Display only a count of matching lines.
-
- i Ignore case when making comparisons.
-
- l List just the names of files (once) with matching lines. Each
- file name is separated by a new-line character.
-
- n Precede each line with the file name and line number, in the
- following form: fname(line)
-
- s Descend subdirectories, also.
-
- v Display lines that don't contain the pattern.
-
- y Ignore case when making comparisons (compatible with RS/6000).
-
- ========================================================================
- Notes
- ========================================================================
-
- Lines are limited to a length of 512 characters. Longer lines are split
- into pieces of 512 characters or fewer.
-
- Binary files can be searched; nulls, end-of-file (0x1a) characters, and
- non-ASCII characters are treated as newlines. When searching binary
- files, the line number won't mean much but strings less than 512
- characters long will tend to be kept together.
-
- A dot within a filename is treated as just another character and not
- assumed to be the beginning of a file extension. Therefore, you should
- specify * instead of *.* when searching all files in a specific
- directory. Again, see pattern.doc or the AIX shell documentation for
- more information.
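-
- The difference between * and *.* can be illustrated with a small
- sketch. It uses Python's fnmatch module as a stand-in for the rules in
- pattern.doc (an assumption: the two are not guaranteed to agree in
- every detail), and the file names are made up for the illustration.
-

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing used only for this illustration.
names = ["README", "grep.doc", "lbuf.c"]

# "*" matches every name, because the dot is just another character.
star = [n for n in names if fnmatchcase(n, "*")]

# "*.*" requires a literal dot somewhere in the name, so README is skipped.
star_dot_star = [n for n in names if fnmatchcase(n, "*.*")]

print(star)
print(star_dot_star)
```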
-
- Errors are listed on standard error.
-
- Lines from the input file are written to standard output as follows:
-
- filename : text from line
-
- If the -n flag is specified, then lines from the input file are written:
-
- filename(line_number) : text from line
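-
- The two output forms, together with the effect of the -i, -v, and -n
- flags, can be sketched roughly as follows. This is an illustration in
- Python, not the program's own code: grep_lines and its parameters are
- hypothetical names, Python's re module stands in for the matcher, and
- the exact spacing around the colon is assumed from the forms shown
- above.
-

```python
import re

def grep_lines(pattern, fname, lines, ignore_case=False, invert=False, number=False):
    """Return output lines as 'fname : text' or 'fname(line) : text'."""
    flags = re.IGNORECASE if ignore_case else 0   # -i (or -y)
    regex = re.compile(pattern, flags)
    out = []
    for lineno, text in enumerate(lines, start=1):
        matched = regex.search(text) is not None
        if matched != invert:                     # -v selects the non-matching lines
            prefix = f"{fname}({lineno})" if number else fname   # -n adds the line number
            out.append(f"{prefix} : {text}")
    return out

sample = ["The cat sat.", "a dog", "the cat ran."]
plain = grep_lines("the cat", "pets.txt", sample)
numbered = grep_lines("the cat", "pets.txt", sample, ignore_case=True, number=True)
inverted = grep_lines("the cat", "pets.txt", sample, invert=True)
count = len(plain)                                # what -c would report for this file
```

- In this sketch, a -c count is simply the number of selected lines.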
-
- For the DOS and OS/2 versions of grep, the flags are case-insensitive.
-
- The grep program can search binary files as well as text files. To help
- keep strings together, grep treats nulls and some ASCII control
- characters (that aren't normally present in flat text files) as newline
- characters. For binary files, the -n flag won't yield a meaningful line
- count. For text files that contain nulls and these control characters,
- the -n flag won't yield the correct line count. But then again, fgets()
- would not have been able to read past a null character anyway. It's a
- compromise. See the lgets() function in lbuf.c for more details.
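-
- The line-splitting rules described in these notes can be sketched as
- below. This is not the lgets() function from lbuf.c, which is not
- reproduced here; split_records is a hypothetical helper that assumes
- the separators are exactly newline, null, 0x1a, and any non-ASCII
- byte, and that ignores carriage returns and other details.
-

```python
MAX_LINE = 512  # line length limit stated in the notes

def split_records(data: bytes, max_len: int = MAX_LINE):
    """Split raw bytes into 'lines' along the lines of the notes (sketch only)."""
    records, current = [], bytearray()
    for byte in data:
        # Newline, null, 0x1a, and non-ASCII bytes all end the current record.
        if byte in (0x0A, 0x00, 0x1A) or byte > 0x7F:
            records.append(bytes(current))
            current.clear()
            continue
        current.append(byte)
        if len(current) == max_len:   # over-long lines are cut into pieces
            records.append(bytes(current))
            current.clear()
    if current:                       # flush a final record with no terminator
        records.append(bytes(current))
    return records

mixed = split_records(b"abc\x00def\nxyz")
long_line = [len(r) for r in split_records(b"a" * 1000)]
binary = split_records(b"hi\xffyo")
```

- Because a null or control byte ends a record here, the record count
- no longer equals the number of newline-terminated lines, which is why
- -n is unreliable for such input.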
-
- ========================================================================
- Regular Expressions
- ========================================================================
-
- The following expressions match a single character:
-
- c Any ordinary character, other than one of the special
- pattern-matching characters, matches itself.
-
- . A . (period) matches any single character.
-
- [string] A string enclosed in square brackets matches any one
- character in the string.
-
- [.-.] A range is two characters separated by a dash and
- enclosed in square brackets. It matches any
- character that is within the range.
-
- [^string] A string (or range) enclosed in square brackets and
- preceded by a ^ (circumflex) matches any character
- except for the characters in the string (or range).
-
- Strings and ranges may be combined as needed, as in:
- [a-m0-9xyz], which matches a thru m, 0 thru 9, x, y,
- or z.
-
- \c The \ (backslash) followed by any character matches
- that character. This is useful for matching the
- following special characters: . * [ ] { } ^ $ \
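-
- The single-character expressions above can be tried out with a short
- sketch. It uses Python's re module, whose syntax happens to agree with
- this grep for these particular constructs (an assumption that holds
- only for the patterns shown); the sample strings are made up.
-

```python
import re

# Each pattern is paired with one character (or short string) it should match.
checks = {
    "c.t": "cut",          # . matches any single character
    "[xyz]": "y",          # any one character from the string
    "[a-m0-9xyz]": "7",    # strings and ranges combined
    "[^0-9]": "q",         # any character except a digit
    r"\.": ".",            # the backslash makes the period literal
}
results = {pat: bool(re.fullmatch(pat, text)) for pat, text in checks.items()}

# n lies outside a-m and is not x, y, or z, so this does not match.
no_match = re.fullmatch("[a-m0-9xyz]", "n")
```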
-
- The single-character expressions can be combined into regular
- expressions as follows:
-
- * Matches zero or more occurrences of the previous
- character.
-
- {m} Matches exactly m occurrences of the previous
- character.
-
- {m,} Matches at least m occurrences of the previous
- character.
-
- {m,n} Matches at least m but no more than n occurrences of
- the previous character.
-
- m and n must be integers from 0 to 255, inclusive.
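-
- The repetition operators can be checked with a small sketch. Python's
- re module is used as a stand-in because it accepts the same *, {m},
- {m,}, and {m,n} forms; unlike this grep, it does not limit m and n to
- the range 0 to 255. matches_whole is a hypothetical helper.
-

```python
import re

def matches_whole(pattern, text):
    """True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

# b* : zero or more b characters after the a
star = [matches_whole("ab*", s) for s in ("a", "ab", "abbb")]

# a{2} : exactly two
exact = [matches_whole("a{2}", s) for s in ("a", "aa", "aaa")]

# a{2,} : two or more
at_least = [matches_whole("a{2,}", s) for s in ("a", "aa", "aaaa")]

# a{1,3} : between one and three
between = [matches_whole("a{1,3}", s) for s in ("", "a", "aaa", "aaaa")]
```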
-
- A regular expression can be restricted to match a string that begins on
- the first character of the line, ends on the last character of the line,
- or both, as follows:
-
- ^pattern The pattern matches a string that begins on the first
- character of a line.
-
- pattern$ The pattern matches a string that ends on the last
- character of a line.
-
- ^pattern$ The pattern matches an entire line.
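-
- The effect of the ^ and $ anchors can be sketched as follows, again
- with Python's re module standing in for this grep's matcher (an
- assumption) and with made-up sample lines.
-

```python
import re

lines = ["Usage", "Usage: grep", "See Usage", ""]

starts = [l for l in lines if re.search("^Usage", l)]   # begins the line
ends = [l for l in lines if re.search("Usage$", l)]     # ends the line
whole = [l for l in lines if re.search("^Usage$", l)]   # is the entire line
blank = [l for l in lines if re.search("^$", l)]        # empty lines only
```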
-
-
- ========================================================================
- Examples
- ========================================================================
-
- grep "the cat" *.txt \*.bat
-
- This command searches the files in the current directory that
- end in .txt and the .bat files in the root directory for the
- string "the cat". Only exact case matches are listed on
- standard output: occurrences of "The cat" and "the Cat" are not
- listed.
-
-
- grep -i "the cat" *.txt \*.bat
-
- This command is similar to the previous one except that it
- performs a case-insensitive search. Occurrences of "The cat"
- and "the CAT" would be listed in addition to any occurrences of
- "the cat".
-
-
- grep -sn "the {1,}cat" *.txt
-
- This command searches all .txt files in the current directory
- and in all subdirectories (recursively) for the pattern. The
- pattern consists of the word "the", followed by one or more
- spaces, followed by the word "cat". Therefore, it would match
- lines in which "the" and "cat" are separated by a single space
- or by several spaces. The filename(line number) string is
- prepended to each matching line.
-
-
- grep -i "[a-z][a-z0-9_]{0,}(" *.c
-
- This command searches all .c files in the current directory for
- function prototypes and function declarations. The pattern
- matches any string that begins with a letter, is followed by
- zero or more letters, numbers, or underscores, and ends with an
- open parenthesis.
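-
- A rough sketch of what this pattern selects is shown below. It uses
- Python's re module, where a literal parenthesis must be written as \(
- (this grep takes a bare parenthesis), and re.IGNORECASE in place of
- the -i flag. The C source lines are invented for the illustration.
-

```python
import re

# Same pattern as the example, with the parenthesis escaped for Python.
pattern = re.compile(r"[a-z][a-z0-9_]{0,}\(", re.IGNORECASE)

# Made-up source lines; only the first two contain an identifier followed by "(".
source = [
    "int main(void)",
    "static int count_lines(FILE *fp);",
    "x = y + 1;",
    "#include <stdio.h>",
]

hits = [line for line in source if pattern.search(line)]
found = [pattern.search(line).group() for line in hits]
```

- The same pattern would also select lines that merely call a function,
- since a call is an identifier followed by an open parenthesis too.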
-
-
- t2bm -i <myfile.txt | grep "&cont\."
-
- This command searches the output of t2bm for the "&cont."
- string. Note that the period in the pattern is escaped with a
- backslash. Without the backslash, grep would interpret the "&cont."
- pattern as the "&cont" string followed by any single character.
-
-
- grep "^Usage$" *.doc
-
- This command searches all .doc files in the current directory
- for lines that consist of nothing but the string "Usage".
-
-
- grep "streams\[" *.c
-
- This command searches all C files in the current directory for
- lines that contain the "streams[" string. Note that the "["
- character has to be escaped in the pattern so that it is
- interpreted by grep as a bracket and not as the beginning of a
- set or range.
-
-
- grep -v "^$" *.doc
-
- This command searches all .doc files in the current directory
- and lists all lines that are not blank. Note that the ^$
- pattern matches any blank line.
-