$Id: TXTCUT.DOC 1.3 1997/10/25 02:03:41 brian Exp $
txtcut : The 'txtcut' command
By: Brian Yoder.
(c) Copyright International Business Machines Corporation 1997
All rights reserved.
========================================================================
Introduction
========================================================================
Files that contain blank lines, comments preceded by # (pound signs), a
varying amount of whitespace between tokens, and other features that
make them well-documented and readable (such as INI-style ASCII text
files) are not easily processed by a Korn shell or REXX script.
Additional logic can be used to skip over blank lines and remove
comments, but this logic adds undesirable complexity to the script and
slows it down.
One solution is txtcut. This program prepares a text file, stripping out
comments and blank lines and handling simple strings. It was originally
developed as a fast preprocessor for the AIX cut command. However, the
output of txtcut can be piped into rxqueue, giving OS/2's REXX the
ability to easily and quickly handle text-style INI-like files. Even awk
and Perl (for which very good native OS/2 ports exist) can benefit from
txtcut.
Information can be stored in an INI-style text file in an easily-
maintained and readable fashion using comments, strings, and blank
space. The txtcut program preprocesses this text file, removing comments
and blank lines, processing simple strings, and delimiting the tokens in
a consistent manner. The tokens produced by txtcut can be easily
extracted using the AIX cut command, a REXX script, or even an awk or
Perl script.
REXX, awk, Perl, and the *ix shells cover a lot of ground, but I have
long wished for the capabilities of txtcut to make the basic AIX and
OS/2 toolsets more complete. Text INI-style files that are highly
readable by people *and* easily processed by command scripts: txtcut
gives you the best of both worlds!
========================================================================
Command Usage
========================================================================
txtcut [ -dchar ] [ -n ] [ -l ] [ -c ] [ textfile ]
This program prepares a text file for the AIX cut command. If the name
of the text file is missing, then txtcut reads from standard input.
For each line that contains one or more tokens, txtcut writes one line
to stdout that contains the tokens. A delimiter character is placed
between each pair of tokens.
The default delimiter character is a tab. You may use the -d option to
change it to another character.
Flags:
   -dchar  Specify a character to be used as the token delimiter. This
           character is specified just as it is for the cut command.
           The default is a tab character.
   -n      List the filename as the first token in each line. If the
           text file is being read from standard input, then the
           filename is listed as "(stdin)". See example 4 for more
           information.
   -l      List the line number within the file as a token.
   -c      List the number of tokens on the line as a token. This count
           does not include the additional tokens that may be added by
           the -n, -l or -c flags.
   -?      List brief help.
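The way the extra tokens combine can be sketched in a few lines of
Python. This is only an illustration, not txtcut's source: the function
name format_line and its parameters are hypothetical, and the ordering
(filename, then line number, then token count, then the tokens) is
inferred from the output shown in example 4.

```python
def format_line(tokens, delim="\t", filename=None, lineno=None, count=False):
    """Join tokens with delim, adding optional -n, -l, -c style fields."""
    prefix = []
    if filename is not None:   # like -n: the filename comes first
        prefix.append(filename)
    if lineno is not None:     # like -l: then the line number
        prefix.append(str(lineno))
    if count:                  # like -c: count excludes the added fields
        prefix.append(str(len(tokens)))
    return delim.join(prefix + list(tokens))


print(format_line(["f1", "=", "a", "b", "c"], ":", "sample.txt", 3, True))
```

With no optional fields requested, the function simply joins the tokens
with the delimiter, which is a tab unless another character is given.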
========================================================================
Format of a text file
========================================================================
Blank lines are ignored. Lines that begin with a # or ; are assumed to
be comments and are ignored.
Each nonblank, noncomment line consists of one or more tokens. A token
can be:
   An = sign.
   A string of characters enclosed by either double quotes, single
      quotes, or square brackets.
   A series of contiguous nonblank characters.
If a non-string token that begins with a # or a ; is encountered on a
line, it and the remaining tokens are assumed to be comments and are
ignored.
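The rules above can be approximated by a short Python sketch. It is not
txtcut's actual implementation, and the name tokenize is hypothetical.
It also makes assumptions the text does not settle: an = sign ends a
run of nonblank characters (so a=b yields three tokens), the enclosing
quotes or brackets are dropped from a string, and an unterminated
string runs to the end of the line.

```python
# Closing character for each kind of string opener.
CLOSERS = {'"': '"', "'": "'", "[": "]"}


def tokenize(line):
    """Return the list of tokens on one line; empty for blank/comment lines."""
    tokens = []
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch.isspace():
            i += 1                      # skip whitespace between tokens
        elif ch in "#;":
            break                       # non-string token starting a comment
        elif ch == "=":
            tokens.append("=")          # an = sign is a token by itself
            i += 1
        elif ch in CLOSERS:
            end = line.find(CLOSERS[ch], i + 1)
            if end == -1:
                end = n                 # assumption: unterminated string
            tokens.append(line[i + 1:end])
            i = end + 1
        else:
            j = i                       # run of contiguous nonblank characters
            while j < n and not line[j].isspace() and line[j] != "=":
                j += 1
            tokens.append(line[i:j])
            i = j
    return tokens


print(tokenize('aaa bbbb "cccc ddddd" # Last line of tokens'))
```

A line for which the function returns an empty list corresponds to a
line that produces no output.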
========================================================================
Example 1
========================================================================
Assume that the sample.txt file contains the following:
   # This is a comment line
   f1 = a b c               # First line of tokens
   x y z                    # Second line of tokens
   aaa bbbb "cccc ddddd"    # Last line of tokens
Run the command:
   txtcut -d: sample.txt
The txtcut command writes the following to stdout:
   f1:=:a:b:c
   x:y:z
   aaa:bbbb:cccc ddddd
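Because = is always its own token, assignment-style lines are easy to
pick out of this output. The following Python sketch shows one possible
way a script might do so. The name parse_assignments is hypothetical,
and treating every token after the = as part of the value is an
assumption made for illustration only.

```python
def parse_assignments(lines, delim="\t"):
    """Map name -> list of value tokens for lines shaped like name = values."""
    settings = {}
    for line in lines:
        tokens = line.rstrip("\n").split(delim)
        # Keep only lines whose second token is the = sign.
        if len(tokens) >= 2 and tokens[1] == "=":
            settings[tokens[0]] = tokens[2:]
    return settings


sample_output = ["f1:=:a:b:c", "x:y:z", "aaa:bbbb:cccc ddddd"]
print(parse_assignments(sample_output, delim=":"))
```

Lines without an = in the second position, such as x:y:z, are skipped.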
========================================================================
Example 2
========================================================================
Using the same sample.txt file in example 1, we'll cut the 3rd field
from each line of the file. Run the command:
   txtcut sample.txt | cut -f3
The following is written to stdout:
   a
   z
   cccc ddddd
========================================================================
Example 3
========================================================================
Again, using the same sample.txt file in example 1, we'll cut the 3rd
field from each line of the file. This time, however, we'll pipe the
file into txtcut's standard input and use a colon as the delimiter:
   cat sample.txt | txtcut -d: | cut -f3 -d:
Again, the following is written to stdout:
   a
   z
   cccc ddddd
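A script that reads txtcut's output directly can do the same field
selection itself. The Python sketch below is a simplified stand-in for
cut -f: the name cut_field is hypothetical, and unlike the real cut
command it silently skips lines that have fewer fields than requested.

```python
def cut_field(lines, field, delim="\t"):
    """Return the 1-based field from each delimited line."""
    selected = []
    for line in lines:
        parts = line.rstrip("\n").split(delim)
        if len(parts) >= field:         # simplification: skip short lines
            selected.append(parts[field - 1])
    return selected


# Lines as txtcut -d: would write them for sample.txt.
output = ["f1:=:a:b:c", "x:y:z", "aaa:bbbb:cccc ddddd"]
print("\n".join(cut_field(output, 3, ":")))
```

Because the quoted string was emitted as a single token, its embedded
space survives the split.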
========================================================================
Example 4: Use of -n, -l, and -c flags
========================================================================
Using the same sample.txt file in example 1, we will add the filename,
the line number within the file where each line of tokens was found, and
the number of tokens in the line. The following commands and the output
of each illustrate this:
------------------------------  ----------------------------------
Command                         Output
------------------------------  ----------------------------------
txtcut -d: sample.txt           f1:=:a:b:c
                                x:y:z
                                aaa:bbbb:cccc ddddd

txtcut -d: -n sample.txt        sample.txt:f1:=:a:b:c
                                sample.txt:x:y:z
                                sample.txt:aaa:bbbb:cccc ddddd

txtcut -d: -l sample.txt        3:f1:=:a:b:c
                                4:x:y:z
                                5:aaa:bbbb:cccc ddddd

txtcut -d: -n -l sample.txt     sample.txt:3:f1:=:a:b:c
                                sample.txt:4:x:y:z
                                sample.txt:5:aaa:bbbb:cccc ddddd

txtcut -d: -n -l -c sample.txt  sample.txt:3:5:f1:=:a:b:c
                                sample.txt:4:3:x:y:z
                                sample.txt:5:3:aaa:bbbb:cccc ddddd