DELIMIT
Version 1.0
Copyright 1993 Jefferson P. Carey
Delimit is a program that scans columnar reports (stored as ascii text
files) and extracts relevant data, writing data to an ascii file in comma-
delimited format. A full explanation will follow, but I believe a simple
example is the best way to show you the capabilities of Delimit. In a
nutshell, Delimit can take a report like this:
-----------------------------------------------------------------------------
                                  XYZ Corporation                     Page: 1
                             Sales Commission Report
                        For The Month Beginning 01/01/1993

Salesperson ID   Item Number   Qty Sold   Cost Each   Item Sales   Commission
--------------   -----------   --------   ---------   ----------   ----------
8342             981239872            5       74.95       374.75        56.21
                 987243873           23       14.95       343.85        51.58
                 989123783            3      274.85       824.55       123.68
                                                      ----------   ----------
                                            Totals:      1543.15       231.47

Salesperson ID   Item Number   Qty Sold   Cost Each   Item Sales   Commission
--------------   -----------   --------   ---------   ----------   ----------
8573             981239872            4       74.95       299.80        44.97
                 987243873           27       14.95       403.65        60.55
                 989123783            6      274.85      1649.10       247.37
                                                      ----------   ----------
                                            Totals:      2352.55       352.89

                                  XYZ Corporation                     Page: 2
                             Sales Commission Report
                        For The Month Beginning 02/01/1993

Salesperson ID   Item Number   Qty Sold   Cost Each   Item Sales   Commission
--------------   -----------   --------   ---------   ----------   ----------
8342             981239872            6       74.95       449.70        67.46
                 987243873           25       14.95       373.75        56.06
                 989123783            4      274.85      1099.40       164.91
                                                      ----------   ----------
                                            Totals:      1922.85       288.43

Salesperson ID   Item Number   Qty Sold   Cost Each   Item Sales   Commission
--------------   -----------   --------   ---------   ----------   ----------
8573             981239872            3       74.95       224.85        33.73
                 987243873           22       14.95       328.90        49.34
                 989123783            5      274.85      1374.25       206.14
                                                      ----------   ----------
                                            Totals:      1928.00       289.21

                                      Grand Totals:      7746.55      1162.00
-----------------------------------------------------------------------------
and create a file containing the data from the report in comma-delimited
format, like this:
"01/01/1993",8342,981239872,5,74.95,374.75,56.21
"01/01/1993",8342,987243873,23,14.95,343.85,51.58
"01/01/1993",8342,989123783,3,274.85,824.55,123.68
"01/01/1993",8573,981239872,4,74.95,299.80,44.97
"01/01/1993",8573,987243873,27,14.95,403.65,60.55
"01/01/1993",8573,989123783,6,274.85,1649.10,247.37
"02/01/1993",8342,981239872,6,74.95,449.70,67.46
"02/01/1993",8342,987243873,25,14.95,373.75,56.06
"02/01/1993",8342,989123783,4,274.85,1099.40,164.91
"02/01/1993",8573,981239872,3,74.95,224.85,33.73
"02/01/1993",8573,987243873,22,14.95,328.90,49.34
"02/01/1993",8573,989123783,5,274.85,1374.25,206.14
What's the point? In many organizations, out-of-date, unfriendly, and
inflexible computer systems still prevail. Many of these systems are
capable of outputting a variety of highly informative reports (such as the
one shown above), but lack the capability to be easily customized to
provide other types of data output. Essentially, the data exists in the
system, but you can only see it presented in ways that the system
designers intended (unchangeable reports). In many cases the data
presented in such reports could be of even greater value if it could be
extracted and analyzed using software (databases, spreadsheets, etc.) on
a PC. Herein lies the value of Delimit.
Any modern data analysis PC software that's worth a dime can import
data from a comma-delimited ascii file (exactly the kind of file created
by Delimit). Once the data is available to the PC software, the
possibilities for analysis (and even new reports) are endless.
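
As a small illustration of such an import, here is a sketch in Python (not part of Delimit) that reads three lines of the sample output with the standard csv module and totals the Item Sales field. The function name total_item_sales is hypothetical, and the field position (sixth) is taken from the sample output shown earlier.

```python
import csv
import io
from decimal import Decimal

# First three lines of the sample output shown earlier.
sample_output = '''"01/01/1993",8342,981239872,5,74.95,374.75,56.21
"01/01/1993",8342,987243873,23,14.95,343.85,51.58
"01/01/1993",8342,989123783,3,274.85,824.55,123.68
'''

def total_item_sales(text):
    """Sum the sixth field (Item Sales) over every record in the text."""
    reader = csv.reader(io.StringIO(text))
    return sum((Decimal(row[5]) for row in reader), Decimal("0"))

print(total_item_sales(sample_output))  # prints 1543.15
```

The result agrees with the first "Totals:" line of the sample report, which is a quick way to confirm that no data lines were lost.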
At this point, if you still don't understand the purpose of Delimit, this
program probably isn't going to be of use to you. Do me a favor and
pass it on to a friend who might be interested. You could be doing your
friend a favor as well.
On the other hand, if you are faced with the same situation I've just
described, read on. The rest of this document will explain how to use
Delimit, and includes examples for all of the features.
Just one thing before we get started. This program is being released as
shareware...with a twist. Individuals using it for personal use, and
nonprofit organizations using it in their nonprofit ventures, are free to
use Delimit without paying the registration fee, if they choose. Anyone
else using Delimit (for-profit businesses), beyond a reasonable trial
period (use your own judgement here), must pay a registration fee of
$24.95 to continue to use the program. Registered users will receive a
disk containing the latest version of Delimit, an upgrade notice when a
newer version of the program is available, and a discount off the cost of
registering the newer version. Please note that individuals and
nonprofit organizations electing to use Delimit without paying the
registration fee are only entitled to free use of the program, and not to
these additional benefits of registered users.
Yes, you could easily "cheat" and use Delimit without paying the
registration fee. But, my sincere hope is that those who use it will
appreciate its real value (time saved, information gained, money saved,
etc.) and will realize that their $24.95 is a worthwhile investment.
Delimit required a great deal of personal time and effort to develop. I
appreciate all of you who support my work through your registration.
Thank you!
To register your copy of Delimit, print the file REGISTER.TXT, fill it
out and enclose payment, and mail it to the address at the bottom of the
form.
Using Delimit
To use Delimit you need to create a configuration file for the report you
want to process. Don't worry, this configuration file is quite simple to
make. Below is the configuration file that was used to process the report
shown earlier in the documentation. I'll explain each line in this
configuration file next.
-------------------------------------------------------------------------------------
[Settings]
InputFile=sample.txt
OutputFile=output.txt
DiscardFile=discard.txt
FilterDefault=Exclude
BlankFieldFill=True
IncludeOperator=And
ExcludeOperator=And
Trimming=True
StringDelimiter="
FieldDelimiter=,
[Include]
18,11,Numeric
[Exclude]
[Fields]
1,14,Numeric
18,11,Numeric
32,8,Numeric
43,9,Numeric
55,10,Numeric
68,10,Numeric
[Occasionals]
25,23,"For The Month Beginning",49,10,Alpha
-------------------------------------------------------------------------------------
[Settings] section
In this section you set the values of several parameters that affect the
operation of Delimit. Each parameter, and its possible values, is
explained below.
InputFile
This parameter specifies the name of the report that you want to
process. You may specify a drive and path if the file is not in the
current directory. This parameter is required.
OutputFile
This parameter specifies the name of the file where you want
Delimit to send its output (the comma delimited data). You may
specify a drive and path if the file will not be in the current
directory. This parameter is required.
DiscardFile
Later in the configuration file you will be able to specify which
lines Delimit should "throw out" when processing the report. In
our example report we are only interested in lines with item sales
figures, and all other lines should be discarded. If you include
this parameter in your configuration file, the discarded lines will
be written to the specified file. This feature is useful for checking
that the proper lines were discarded when you are working on
creating a correct configuration file. After running Delimit, you
can look at the contents of the discard file and make sure no good
lines were discarded. This parameter is optional.
FilterDefault
This parameter tells Delimit whether to keep or discard lines in
the report by default. The valid values for this parameter are
"Include" and "Exclude". In some cases it will be easiest to specify
which lines contain data, so by default Delimit should exclude
lines from the report (i.e. it will discard (exclude) a line unless it
meets the criteria you have specified for keeping a line --
FilterDefault=Exclude). In other cases, it will be easiest to
specify which lines to exclude (FilterDefault=Include). For the
example report, we are going to specify which lines to keep so
FilterDefault=Exclude.
BlankFieldFill
This parameter determines whether Delimit will fill a blank field
with the most recent nonblank value of that field. The valid
values for this parameter are "True" and "False". In the sample
report, the salesperson id is shown on the first line for each
salesperson, but on successive lines this field is blank. In this
case, BlankFieldFill=True so that salesperson id's will be carried
down to successive lines in the comma delimited file, until a new
salesperson id is found. If BlankFieldFill=False, the first few
lines of the comma delimited file would have looked like this:
"01/01/1993",8342,981239872,5,74.95,374.75,56.21
"01/01/1993",,987243873,23,14.95,343.85,51.58
"01/01/1993",,989123783,3,274.85,824.55,123.68
"01/01/1993",8573,981239872,4,74.95,299.80,44.97
"01/01/1993",,987243873,27,14.95,403.65,60.55
IncludeOperator
Later in this documentation I will explain how to specify
conditions that lines must meet in order to be included in
processing and output to the comma delimited file. At times it
might be necessary to specify more than one condition that a line
must meet to be included. This parameter specifies whether those
conditions should be combined with an AND or an OR. For
example, you can specify that a line must meet condition x AND
condition y, or you can specify that a line must meet condition x
OR condition y. Valid values for this parameter are "And" and
"Or".
ExcludeOperator
Later in this documentation I will explain how to specify
conditions that lines must meet in order to be excluded from
processing and output to the comma delimited file. At times it
might be necessary to specify more than one condition that a line
must meet to be excluded. This parameter specifies whether those
conditions should be combined with an AND or an OR. For
example, you can specify that a line must meet condition x AND
condition y, or you can specify that a line must meet condition x
OR condition y. Valid values for this parameter are "And" and
"Or".
Trimming
This parameter determines if Delimit will trim spaces from the
beginning and end of fields that are written to the comma
delimited file. Valid values for this parameter are "True" and
"False".
StringDelimiter
This parameter determines which ascii character will be used to
delimit strings in the output file. The default is the double quote
character ("). Either specify a single character (such as " or ') or
specify the ascii value of a character in three digit decimal form
(such as 047 or 179).
FieldDelimiter
This parameter determines which ascii character will be used to
separate fields in the output file. The default is the comma
character (,). Either specify a single character (such as , or |) or
specify the ascii value of a character in three digit decimal form
(such as 047 or 179).
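
The two ways of writing a delimiter (a single character, or a three digit decimal value) can be sketched in Python as follows. The function name parse_delimiter is hypothetical, and the rule for telling the two forms apart (one character versus exactly three digits) is an assumption based on the description above.

```python
def parse_delimiter(setting):
    """Turn a StringDelimiter or FieldDelimiter setting into one character."""
    if len(setting) == 1:
        return setting            # a literal character such as , or |
    if len(setting) == 3 and setting.isascii() and setting.isdigit():
        return chr(int(setting))  # a three digit decimal ASCII value
    raise ValueError("expected one character or three digits: " + repr(setting))

print(parse_delimiter("|"))    # prints |
print(parse_delimiter("047"))  # prints /
```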
[Include] section
In this section, you specify the conditions that each line in the report
must meet in order to be included in the comma delimited file. The
format of the lines in this section is "column number, number of
characters, condition". The column number and number of characters
specify the characters that must meet the condition. The condition can
be a string, a set of characters, or one of the words "Alpha", "Numeric",
"Blank", or "NonBlank". In the example report the line "18,11,Numeric"
specified that the 11 characters, starting in column 18, must be a
number for the line to be included. Here are some more examples:
The character in column 11 must be 'A', 'B', or 'C':
11,1,{ABC}
The 3 characters starting in column 11 must be "Abc":
11,3,"Abc"
The first 5 characters on the line must be blank:
1,5,Blank
At least one of the first 5 characters must not be blank:
1,5,NonBlank
None of the 10 characters starting in column 35 can be a number:
35,10,Alpha
The 10 characters starting in column 35 must be a number:
35,10,Numeric
Note: " 123 " is a number, while "123 456" is not a valid
number.
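
The condition types above can be sketched in Python as a single test function. This is an illustration under stated assumptions, not Delimit's own logic: the name meets_condition is hypothetical, a "number" is assumed to be an optional sign followed by digits with an optional decimal part, "Alpha" is assumed to mean "contains no digits", and a set such as {ABC} is assumed to require every character in the span to belong to the set.

```python
import re

# Assumption: optional sign, digits, optional decimal part.
NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")

def meets_condition(line, column, length, condition):
    """Test `length` characters starting at 1-based `column` against a condition."""
    # Lines shorter than the span are padded with spaces.
    text = line[column - 1:column - 1 + length].ljust(length)
    if condition == "Blank":
        return text.strip() == ""
    if condition == "NonBlank":
        return text.strip() != ""
    if condition == "Numeric":
        return NUMBER.fullmatch(text.strip()) is not None
    if condition == "Alpha":
        return not any(ch.isdigit() for ch in text)
    if condition.startswith("{") and condition.endswith("}"):
        allowed = set(condition[1:-1])
        return all(ch in allowed for ch in text)
    if condition.startswith('"') and condition.endswith('"'):
        return text == condition[1:-1]
    raise ValueError("unrecognized condition: " + condition)

print(meets_condition(" 123 ", 1, 5, "Numeric"))    # True
print(meets_condition("123 456", 1, 7, "Numeric"))  # False
```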
If you put more than one line in this section, the conditions you specify
on each line will be combined with one of the logical operators AND or
OR, as determined by the value of the IncludeOperator parameter. For
example:
The first character must be an 'A' AND the next 10 characters
must be a number:
[Settings]
IncludeOperator=And
[Include]
1,1,{A}
2,10,Numeric
The first character must be an 'A' OR the next 10 characters must
be a number:
[Settings]
IncludeOperator=Or
[Include]
1,1,{A}
2,10,Numeric
[Exclude] section
The exclude section is identical to the include section, except that it is
used to specify the lines that should be excluded rather than included.
Complex conditions can be specified using a combination of
FilterDefault, IncludeOperator, ExcludeOperator, [Include], and
[Exclude] settings. Some examples follow.
Include all lines that have a number in the first 10 characters
AND have a '/' in columns 50 and 53 (a good way to search for
dates of the form MM/DD/YY) but do not have the word "Deleted"
beginning in column 17:
[Settings]
FilterDefault=Exclude
IncludeOperator=And
[Include]
1,10,Numeric
50,1,{/}
53,1,{/}
[Exclude]
17,7,"Deleted"
Exclude any lines in which the first 5 characters are blank OR
contain the word "Total", unless there is a number in the 6
characters beginning in column 30:
[Settings]
FilterDefault=Include
ExcludeOperator=Or
[Exclude]
1,5,Blank
1,5,"Total"
[Include]
30,6,Numeric
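
One way to read the two examples above is sketched below in Python. It is an interpretation, not a statement of how Delimit is implemented: with FilterDefault=Exclude a line is kept only if it meets the include conditions and is not excluded, and with FilterDefault=Include a line is kept unless it is excluded, except that meeting the include conditions rescues it. The names combine and keep_line are hypothetical, the "And" defaults are an assumption, and each list holds the True/False result of one condition line.

```python
def combine(results, operator):
    """Combine per-condition results with "And" or "Or"; no conditions means no match."""
    if not results:
        return False
    return all(results) if operator == "And" else any(results)

def keep_line(filter_default, include_hits, exclude_hits,
              include_operator="And", exclude_operator="And"):
    """Decide whether a report line is kept (True) or discarded (False)."""
    included = combine(include_hits, include_operator)
    excluded = combine(exclude_hits, exclude_operator)
    if filter_default == "Exclude":
        # Discard by default: keep only included lines that are not excluded.
        return included and not excluded
    # Keep by default: discard excluded lines unless they are also included.
    return included or not excluded

# First example: number present, both slashes present, "Deleted" found.
print(keep_line("Exclude", [True, True, True], [True]))                   # False
# Second example: first 5 characters blank, but a number in column 30.
print(keep_line("Include", [True], [True, False], exclude_operator="Or"))  # True
```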
[Fields] section
In the Fields section, you specify which columns from the included lines
contain the data that you want sent to the output file. Each line in the
Fields section is of the form "column number, number of characters, field
type". The column number and number of characters specify which
characters to extract from the line in the report. The field type is either
of the words "Alpha" or "Numeric". If the field is alpha, it will be
enclosed in quotes in the output file. Numeric fields will not be enclosed
in quotes. For example, in the sample output file line below, the first
field was specified as Alpha and the second field was specified as
Numeric.
"John Doe", 25
You may specify an unlimited number of fields. They will be sent to the
output file in the order specified in the configuration file.
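
The extraction described in this section can be sketched in Python as follows. This is a minimal illustration, not Delimit's code: the function name extract_fields and its keyword arguments are hypothetical stand-ins for the Trimming, StringDelimiter, and FieldDelimiter settings, and each field is given as a (column, length, type) tuple.

```python
def extract_fields(line, fields, trimming=True,
                   string_delimiter='"', field_delimiter=","):
    """Cut fixed-position fields out of a report line and join them."""
    values = []
    for column, length, field_type in fields:
        # Column numbers in the configuration file start at 1.
        text = line[column - 1:column - 1 + length]
        if trimming:
            text = text.strip()
        if field_type == "Alpha":
            text = string_delimiter + text + string_delimiter
        values.append(text)
    return field_delimiter.join(values)

# A 10 character name field followed by a 4 character number field.
line = "John Doe".ljust(10) + "25".rjust(4)
print(extract_fields(line, [(1, 10, "Alpha"), (11, 4, "Numeric")]))
# prints "John Doe",25
```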
[Occasionals] section
An occasional is a combination of both an Include and a Field
specification. In many reports, data is listed only occasionally at the
beginning of a section or at the top of a page. In the sample report
shown earlier, the reporting period is only shown at the top of each page
on the line that contains the text "For The Month Beginning". To
include this data at the beginning of each line in the output file, an
occasional was specified in the configuration file. The format of the lines
in the Occasionals section is "column number, number of characters,
condition, field column, field characters, field type". The first three
parameters specify the condition that occasional lines meet, and the last
three parameters specify the position on the line and type of the data to
be written to the output file. For example:
To state that any line containing the text "For The Month Beginning"
starting in column 25 also contains a 10 character Alpha field
starting in column 49, specify:
25,23,"For The Month Beginning",49,10,Alpha
The fields specified in the Occasionals section will not be output as the
occasional lines are encountered, but instead, at the beginning of each
and every line that is included from the report. Take a look at the
sample output file to see how this works. You may have more than one
occasional, with each one being output at the beginning of each included
line.
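
The way an occasional value is remembered and then attached to later lines can be sketched in Python. This is only an illustration of the idea: the function name tag_with_occasional is hypothetical, it handles a single string-match occasional, and it leaves out the Include and Exclude filtering that Delimit would also apply to the remaining lines.

```python
def tag_with_occasional(lines, column, text, field_column, field_length):
    """Pair each ordinary line with the latest occasional value seen above it."""
    current = ""
    tagged = []
    for line in lines:
        start = column - 1
        if line[start:start + len(text)] == text:
            # Occasional line: remember its field, but do not output the line.
            field_start = field_column - 1
            current = line[field_start:field_start + field_length].strip()
        else:
            tagged.append((current, line))
    return tagged

report = [
    " " * 24 + "For The Month Beginning 01/01/1993",
    "first data line",
    " " * 24 + "For The Month Beginning 02/01/1993",
    "second data line",
]
for period, line in tag_with_occasional(report, 25, "For The Month Beginning", 49, 10):
    print('"' + period + '",' + line)
```

Each data line is printed with the reporting period from the most recent "For The Month Beginning" line in front of it, just as the date appears at the start of every line in the sample output file.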
Running Delimit
Once you have created a configuration file, just type DELIMIT followed
by the name of the configuration file, then press the Enter key. A
sample report and configuration file have been included with Delimit.
The sample report is named SAMPLE.TXT and the configuration file for
processing this report is named SAMPLE.CFG. To run Delimit on this
report, just type DELIMIT SAMPLE.CFG and press the Enter key. The
results will be written to the files OUTPUT.TXT and DISCARD.TXT.
Contacting the Author
I can be reached on CompuServe. My ID is 70413,1360.
You may also contact me in writing at:
Jeff Carey
3735 Eastmont Avenue
Bloomington, IN 47403
I'd greatly appreciate any suggestions for improvement, constructive
criticisms, or even compliments!