Text File | 1997-10-24 | 7KB | 182 lines

$Id: TXTCUT.DOC 1.3 1997/10/25 02:03:41 brian Exp $

txtcut : The 'txtcut' command

By: Brian Yoder.

(c) Copyright International Business Machines Corporation 1997
All rights reserved.

========================================================================
Introduction
========================================================================

Files that contain blank lines, comments preceded by # (pound signs), a
varying amount of whitespace between tokens, and other features that
make them well-documented and readable (such as INI-style ASCII text
files) are not easily processed by a Korn shell or REXX script.
Additional logic can be used to skip over blank lines and remove
comments, but this logic adds undesirable complexity to the script and
slows it down.

One solution is txtcut. This program prepares a text file, stripping
out comments and blank lines and handling simple strings. It was
originally developed as a fast preprocessor for the AIX cut command.
However, the output of txtcut can be piped into rxqueue, giving OS/2's
REXX the ability to easily and quickly handle text-style INI-like
files. Even awk and Perl (of which very good native OS/2 ports exist)
can benefit from txtcut.

Information can be stored in an INI-style text file in an easily
maintained and readable fashion using comments, strings, and blank
space. The txtcut program preprocesses this text file, removing
comments and blank lines, processing simple strings, and delimiting the
tokens in a consistent manner. The tokens produced by txtcut can be
easily extracted using the AIX cut command, a REXX script, or even an
awk or Perl script.

REXX, awk, Perl, and the *ix shells cover a lot of ground, but I have
wished for a long time for the capabilities of txtcut to make the basic
AIX and OS/2 toolsets more complete.

Text INI-style files that are highly readable by people *and* easily
processed by command scripts: txtcut gives you the best of both worlds!

========================================================================
Command Usage
========================================================================

     txtcut [ -dchar ] [ -n ] [ -l ] [ -c ] [ textfile ]

This program prepares a text file for the AIX cut command. If the name
of the text file is missing, then txtcut reads from standard input.

For each line that contains one or more tokens, txtcut writes one line
to stdout that contains the tokens. A delimiter character is placed
between each pair of tokens. The default delimiter character is a tab.
You may use the -d option to change it to another character.

Flags:

     -dchar  Specify a character to be used as the token delimiter.
             This character is specified just as it is for the cut
             command. The default is a tab character.

     -n      List the filename as the first token in each line. If the
             text file is being read from standard input, then the
             filename is listed as "(stdin)". See example 4 for more
             information.

     -l      List the line number within the file as a token.

     -c      List the number of tokens on the line as a token. This
             count does not include the additional tokens that may be
             added by the -n, -l or -c flags.

     -?      List brief help.

========================================================================
Format of a text file
========================================================================

Blank lines are ignored. Lines that begin with a # or ; are assumed to
be comments and are ignored.

Each nonblank, noncomment line consists of one or more tokens. A token
can be:

     An = sign.

     A string of characters enclosed by either double quotes, single
     quotes, or square brackets.

     A series of contiguous nonblank characters.

If a non-string token that begins with a # or a ; is encountered on a
line, it and the remaining tokens are assumed to be comments and are
ignored.

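The token rules above can be sketched in a few lines of Python. This is
only an illustration of the documented behavior, not the txtcut source.
The names tokenize and txtcut_lines are made up for the example, and two
points that the text does not settle are treated as assumptions: an =
sign always stands alone as a token, even when it touches a word, and a
string with no closing delimiter runs to the end of the line.

```python
# Hypothetical sketch of the documented token rules; not the real txtcut.
CLOSERS = {'"': '"', "'": "'", "[": "]"}  # string opener -> closer


def tokenize(line):
    """Return the list of tokens found on one line of text."""
    tokens = []
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch.isspace():
            i += 1                      # skip whitespace between tokens
        elif ch in "#;":
            break                       # comment: ignore the rest of the line
        elif ch == "=":
            tokens.append("=")          # assumption: = is always its own token
            i += 1
        elif ch in CLOSERS:
            end = line.find(CLOSERS[ch], i + 1)
            if end == -1:
                end = n                 # assumption: unterminated string
            tokens.append(line[i + 1:end])
            i = end + 1
        else:
            j = i                       # a run of contiguous nonblank characters
            while j < n and not line[j].isspace() and line[j] != "=":
                j += 1
            tokens.append(line[i:j])
            i = j
    return tokens


def txtcut_lines(lines, delimiter="\t"):
    """Join the tokens of every line that has any, one output line each."""
    out = []
    for line in lines:
        tokens = tokenize(line)
        if tokens:                      # blank and comment-only lines vanish
            out.append(delimiter.join(tokens))
    return out
```

Because a # or ; is only checked where a token would start, a word such
as a#b stays a single token in this sketch, and a # inside a quoted
string does not start a comment.
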
========================================================================
Example 1
========================================================================

Assume that the sample.txt file contains the following:

     # This is a comment line

     f1 = a b c                  # First line of tokens
     x y z                       # Second line of tokens
     aaa   bbbb  "cccc ddddd"    # Last line of tokens

Run the command:

     txtcut -d: sample.txt

The txtcut command writes the following to stdout:

     f1:=:a:b:c
     x:y:z
     aaa:bbbb:cccc ddddd

========================================================================
Example 2
========================================================================

Using the same sample.txt file in example 1, we'll cut the 3rd field
from each line of the file. Run the command:

     txtcut sample.txt | cut -f3

The following is written to stdout:

     a
     z
     cccc ddddd

========================================================================
Example 3
========================================================================

Again, using the same sample.txt file in example 1, we'll cut the 3rd
field from each line of the file. But this time we'll pipe the file
into txtcut's standard input. And we'll use a colon for the delimiter:

     cat sample.txt | txtcut -d: | cut -f3 -d:

Again, the following is written to stdout:

     a
     z
     cccc ddddd

========================================================================
Example 4: Use of -n, -l, and -c flags
========================================================================

Using the same sample.txt file in example 1, we will add the filename,
the line number within the file where each line of tokens was found,
and the number of tokens in the line.

The following commands and the output of each illustrate this:

     ------------------------------   ----------------------------------
     Command                          Output
     ------------------------------   ----------------------------------
     txtcut -d: sample.txt            f1:=:a:b:c
                                      x:y:z
                                      aaa:bbbb:cccc ddddd

     txtcut -d: -n sample.txt         sample.txt:f1:=:a:b:c
                                      sample.txt:x:y:z
                                      sample.txt:aaa:bbbb:cccc ddddd

     txtcut -d: -l sample.txt         3:f1:=:a:b:c
                                      4:x:y:z
                                      5:aaa:bbbb:cccc ddddd

     txtcut -d: -n -l sample.txt      sample.txt:3:f1:=:a:b:c
                                      sample.txt:4:x:y:z
                                      sample.txt:5:aaa:bbbb:cccc ddddd

     txtcut -d: -n -l -c sample.txt   sample.txt:3:5:f1:=:a:b:c
                                      sample.txt:4:3:x:y:z
                                      sample.txt:5:3:aaa:bbbb:cccc ddddd
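
A script that consumes lines produced with -n, -l, and -c together can
peel off the three leading fields and use the token count as a sanity
check. The Python sketch below is one possible way to do that. The
names Record and parse_record are hypothetical, and the field order
(filename, line number, count, tokens) is taken from the last command
shown above. It assumes that neither the filename nor any token
contains the delimiter; a filename with a drive letter, for instance,
would clash with a colon delimiter.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    """One output line of 'txtcut -n -l -c' (hypothetical container)."""
    filename: str
    line_number: int
    token_count: int
    tokens: List[str]


def parse_record(line, delimiter=":"):
    """Split one line into filename, line number, count, and tokens."""
    fields = line.rstrip("\n").split(delimiter)
    if len(fields) < 4:
        raise ValueError("expected filename, line number, count, and tokens")
    filename = fields[0]
    line_number = int(fields[1])
    token_count = int(fields[2])
    tokens = fields[3:]
    # The count excludes the three added fields, so it must match exactly.
    # A mismatch suggests the delimiter also occurs inside a field.
    if len(tokens) != token_count:
        raise ValueError("token count does not match: %r" % line)
    return Record(filename, line_number, token_count, tokens)


if __name__ == "__main__":
    for text in ("sample.txt:3:5:f1:=:a:b:c",
                 "sample.txt:5:3:aaa:bbbb:cccc ddddd"):
        rec = parse_record(text)
        print(rec.filename, rec.line_number, rec.tokens)
```

Choosing a delimiter that cannot occur in the data (the default tab is
often a good candidate) keeps this kind of split reliable.
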