- $Id: TXTCUT.DOC 1.2 1997/06/02 00:25:45 brian Exp $
-
- txtcut : The 'txtcut' command
- By: Brian Yoder.
-
- (c) Copyright International Business Machines Corporation 1997
- All rights reserved.
-
- ========================================================================
- Introduction
- ========================================================================
-
- Files that contain blank lines, comments preceded by # (pound signs), a
- varying amount of whitespace between tokens, and other features that make
- them well-documented and readable (such as INI-style ASCII text files)
- are not easily processed by a Korn shell or REXX script. Additional
- logic can be used to skip over blank lines and remove comments, but this
- logic adds undesirable complexity to the script and slows it down.
-
- One solution is txtcut. This program prepares a text file, stripping out
- comments and blank lines and handling simple strings. It was originally
- developed as a fast preprocessor for the AIX cut command. However, the
- output of txtcut can be piped into rxqueue, giving OS/2's REXX the
- ability to easily and quickly handle text-style INI-like files. Even awk
- and Perl (of which very good native OS/2 ports exist) can benefit from
- txtcut.
-
- Information can be stored in an INI-style text file in an easily-
- maintained and readable fashion using comments, strings, and blank
- space. The txtcut program preprocesses this text file, removing comments
- and blank lines, processing simple strings, and delimiting the tokens in
- a consistent manner. The tokens produced by txtcut can be easily
- extracted using the AIX cut command, a REXX script, or even an awk or
- Perl script.
-
- REXX, awk, Perl, and the *ix shells cover a lot of ground, but I have
- long wished for the capabilities of txtcut to make the basic AIX and
- OS/2 toolsets more complete. Text INI-style files that are highly
- readable by people *and* easily processed by command scripts: txtcut
- gives you the best of both worlds!
-
- ========================================================================
- Command Usage
- ========================================================================
-
- txtcut [ -dchar ] [ -n ] [ -l ] [ -c ] [ textfile ]
-
- This program prepares a text file for the AIX cut command. If the name
- of the text file is missing, then txtcut reads from standard input.
-
- For each line that contains one or more tokens, txtcut writes one line
- to stdout that contains the tokens. A delimiter character is placed
- between each pair of tokens.
-
- The default delimiter character is a tab. You may use the -d option to
- change it to another character.
-
- Flags:
-
- -dchar Specify a character to be used as the token delimiter. This
- character is specified just as it is for the cut command.
- The default is a tab character.
-
- -n List the filename as the first token in each line. If the text
- file is being read from standard input, then the filename is
- listed as "(stdin)". See example 4 for more information.
-
- -l List the line number within the file as a token.
-
- -c List the number of tokens on the line as a token. This count
- does not include the additional tokens that may be added by the
- -n, -l or -c flags.
-
- -? List brief help.
-
- ========================================================================
- Format of a text file
- ========================================================================
-
- Blank lines are ignored. Lines that begin with a # or ; are assumed to
- be comments and are ignored.
-
- Each nonblank, noncomment line consists of one or more tokens. A token
- can be:
-
- An = sign.
-
- A string of characters enclosed by either double quotes, single
- quotes, or square brackets.
-
- A series of contiguous nonblank characters.
-
- If a non-string token that begins with a # or a ; is encountered on a
- line, it and the remaining tokens are assumed to be comments and are
- ignored.
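-
- The rules above can be sketched in Python. This is an illustration, not
- the txtcut source: the names tokenize and txtcut_lines are made up here,
- and three behaviors that the text does not spell out are assumptions:
- an = sign splits a word even without surrounding blanks, the enclosing
- quotes or brackets are dropped from a string token (example 1 shows this
- for double quotes only), and an unterminated string runs to the end of
- the line.

```python
from typing import Iterable, List, Optional

# Opening character of a string token -> its closing character.
CLOSERS = {'"': '"', "'": "'", "[": "]"}


def tokenize(line: str) -> List[str]:
    """Split one line into tokens; return [] for blank or comment lines."""
    tokens: List[str] = []
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch.isspace():
            i += 1
        elif ch in CLOSERS:
            # String token. Assumption: delimiters are dropped and an
            # unterminated string extends to the end of the line.
            end = line.find(CLOSERS[ch], i + 1)
            if end == -1:
                end = n
            tokens.append(line[i + 1:end])
            i = end + 1
        elif ch in "#;":
            # A non-string token starting with # or ; begins a comment.
            break
        elif ch == "=":
            tokens.append("=")
            i += 1
        else:
            # Assumption: a word ends at whitespace or at an = sign.
            j = i
            while j < n and not line[j].isspace() and line[j] != "=":
                j += 1
            tokens.append(line[i:j])
            i = j
    return tokens


def txtcut_lines(lines: Iterable[str], delim: str = "\t",
                 name: Optional[str] = None,
                 line_numbers: bool = False,
                 count: bool = False) -> List[str]:
    """Emulate the -d, -n, -l and -c flags on an iterable of lines."""
    out: List[str] = []
    for lineno, line in enumerate(lines, start=1):
        tokens = tokenize(line)
        if not tokens:
            continue  # blank and comment-only lines produce no output
        prefix: List[str] = []
        if name is not None:
            prefix.append(name)               # -n
        if line_numbers:
            prefix.append(str(lineno))        # -l
        if count:
            prefix.append(str(len(tokens)))   # -c, excludes added tokens
        out.append(delim.join(prefix + tokens))
    return out
```

- Under these assumptions, tokenize('aaa bbbb "cccc ddddd" # note')
- returns the three tokens aaa, bbbb, and cccc ddddd.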
-
- ========================================================================
- Example 1
- ========================================================================
-
- Assume that the sample.txt file contains the following:
-
- # This is a comment line
-
- f1 = a b c # First line of tokens
- x y z # Second line of tokens
- aaa bbbb "cccc ddddd" # Last line of tokens
-
- Run the command:
-
- txtcut -d: sample.txt
-
- The txtcut command writes the following to stdout:
-
- f1:=:a:b:c
- x:y:z
- aaa:bbbb:cccc ddddd
-
- ========================================================================
- Example 2
- ========================================================================
-
- Using the same sample.txt file in example 1, we'll cut the 3rd field
- from each line of the file. Run the command:
-
- txtcut sample.txt | cut -f3
-
- The following is written to stdout:
-
- a
- z
- cccc ddddd
-
- ========================================================================
- Example 3
- ========================================================================
-
- Again, using the same sample.txt file in example 1, we'll cut the 3rd
- field from each line of the file. But this time we'll pipe the file
- into txtcut's standard input. And we'll use a colon for the delimiter:
-
- cat sample.txt | txtcut -d: | cut -f3 -d:
-
- Again, the following is written to stdout:
-
- a
- z
- cccc ddddd
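-
- A script that reads txtcut output directly can do the same field
- selection itself. The sketch below mirrors what cut -f does for a single
- field; cut_field is a hypothetical helper name and is not part of txtcut.

```python
from typing import Iterable, List


def cut_field(lines: Iterable[str], n: int, delim: str = "\t") -> List[str]:
    """Return field n (1-based) of each delimited line, like cut -fN -dX."""
    out: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if delim not in line:
            # cut passes lines without a delimiter through unchanged
            # (unless -s is given, which this sketch does not model).
            out.append(line)
            continue
        fields = line.split(delim)
        # A field number past the end of the line yields an empty string.
        out.append(fields[n - 1] if n <= len(fields) else "")
    return out
```

- Applied to the three lines that txtcut -d: writes for sample.txt,
- cut_field(lines, 3, ":") gives a, z, and cccc ddddd, matching the
- output above.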
-
- ========================================================================
- Example 4: Use of -n, -l, and -c flags
- ========================================================================
-
- Using the same sample.txt file in example 1, we will add the filename,
- the line number within the file where each line of tokens was found,
- and the number of tokens in the line. The following commands and the
- output of each illustrate this:
-
- ------------------------------  ----------------------------------
- Command                         Output
- ------------------------------  ----------------------------------
- txtcut -d: sample.txt           f1:=:a:b:c
-                                 x:y:z
-                                 aaa:bbbb:cccc ddddd
-
- txtcut -d: -n sample.txt        sample.txt:f1:=:a:b:c
-                                 sample.txt:x:y:z
-                                 sample.txt:aaa:bbbb:cccc ddddd
-
- txtcut -d: -l sample.txt        3:f1:=:a:b:c
-                                 4:x:y:z
-                                 5:aaa:bbbb:cccc ddddd
-
- txtcut -d: -n -l sample.txt     sample.txt:3:f1:=:a:b:c
-                                 sample.txt:4:x:y:z
-                                 sample.txt:5:aaa:bbbb:cccc ddddd
-
- txtcut -d: -n -l -c sample.txt  sample.txt:3:5:f1:=:a:b:c
-                                 sample.txt:4:3:x:y:z
-                                 sample.txt:5:3:aaa:bbbb:cccc ddddd
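-
- When all three flags are used, each output line carries enough
- information for a script to rebuild a record and check it. The following
- Python sketch assumes the field order shown in this example (filename,
- line number, token count, then the tokens); Record and parse_record are
- hypothetical names, not part of txtcut.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    filename: str
    lineno: int
    tokens: List[str]


def parse_record(line: str, delim: str = ":") -> Record:
    """Parse one line produced with the -n, -l and -c flags together."""
    fields = line.rstrip("\n").split(delim)
    if len(fields) < 3:
        raise ValueError("expected filename, line number and token count")
    filename, lineno, count = fields[0], int(fields[1]), int(fields[2])
    tokens = fields[3:]
    # The -c count excludes the tokens added by -n, -l and -c, so it must
    # equal the number of remaining fields. A mismatch suggests that a
    # token (or the filename) contained the delimiter character.
    if len(tokens) != count:
        raise ValueError(
            "line %d: expected %d tokens, found %d"
            % (lineno, count, len(tokens))
        )
    return Record(filename, lineno, tokens)
```

- The count check is the reason to request -c here: a quoted string that
- contains the delimiter would otherwise be split silently. Choosing a
- delimiter that cannot occur in the data avoids the problem altogether.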
-