Text File | 1995-01-01 | 45.4 KB | 1,102 lines |
-
- ===========================================================================
- ===========================================================================
- ============================ ============================
- ============================ ============================
- ============================ PARSE-O-MATIC ============================
- ============================ ============================
- ============================ ============================
- ===========================================================================
- ===========================================================================
-
-
-
- ---------------------------------------------------------------------------
- | |
- | HERE ARE A FEW OF THE THINGS PARSE-O-MATIC CAN DO FOR YOU: |
- | |
- | Importing Exporting Automated Editing |
- | Text Extraction Data Conversion Table Lookup |
- | Retabulation Info Weeding Selective Copying |
- | Binary-File to Text Report Reformatting Wide-Text Folding |
- | Auto-Batch Creation Comm-log Trimming Tab Replacement |
- | Character Filtering Column Switching Time-Saving! |
- | |
- | "Parse-O-Matic is a wonderful time saver .... Each report that |
- | I can convert from our ... accounting system saves our company |
- | about 500 man hours per year" -- R. Brooker (a happy POM user) |
- | |
- ---------------------------------------------------------------------------
-
-
-
- Parse-O-Matic is Copyright (C) 1992, 1994 by:
-
- Pinnacle Software, CP 386 Mount Royal, Quebec, Canada H3P 3C6
- U.S. Office: Box 714 Airport Road, Swanton, Vermont 05488 USA
-
- Support Line (514) 345-9578 --- Free Files BBS (514) 345-8654
-
-
-
- ---------------------------------------------------------------------------
- | |
- | |
- | This is a SHAREWARE product. That means we would like you to |
- | pass around unregistered copies to other people. If you have |
- | a modem, you can upload a copy to a local BBS, or give a copy |
- | to a friend who will save time and effort with Parse-O-Matic. |
- | |
- | |
- ---------------------------------------------------------------------------
-
-
-
- ===========================================================================
- OVERVIEW
- ===========================================================================
-
-
- This manual contains the following sections:
-
- INTRODUCTION
- ------------
- Why you need Parse-O-Matic -- an example
- Parse-O-Matic to the rescue!
- How it works
-
- FUNDAMENTALS
- ------------
- The Parse-O-Matic command
- The POM file
- Padding for clarity
-
- COMMAND WORDS
- -------------
- Command formats
- The SET command
- The IF command
- The BEGIN and END commands
- The OUT and OUTEND commands
- The MINLEN command
- The IGNORE command
- The ACCEPT command
- The TRIM command
- The PAD command
- The INSERT command
- The CHANGE command
- The SPLIT command
- The CHOP command
- The LOOKUP command
- The LOOKFILE command
- The LOOKCOLS command
- The LOOKSPEC command
-
- TERMS AND TECHNIQUES
- --------------------
- Values
- Delimiters
- Illegal characters
- Incrementing
- Line counters
- Tracing
- Examples
-
-
-
- ===========================================================================
- INTRODUCTION
- ===========================================================================
-
-
- Parse-O-Matic is a programmable file-parser. Simple enough for a non-
- programmer to master, it can help out in countless ways. Here are some
- of the things Parse-O-Matic can do: Importing, Exporting, Automated
- Editing, Text Extraction, Data Conversion, Table Lookup, Retabulation,
- Info Weeding, Selective Copying, Binary-File to Text, Tab Replacement,
- Report Reformatting, Wide-Text Folding, Auto-Batch Creation, Character
- Filtering, Column Switching and more!
-
-
-
- ----------------------------------------
- WHY YOU NEED PARSE-O-MATIC -- AN EXAMPLE
- ----------------------------------------
-
- There are plenty of programs out there that have valuable data locked away
- inside them. How do you get that data OUT of one program and into another
- one?
-
- Some programs provide a feature which "exports" a file into some kind of
- generic format. Perhaps the most popular of these formats is known as a
- "comma-delimited file", which is a text file in which each data field is
- separated by a comma. Literal strings -- which might themselves contain
- commas -- are surrounded by double quotes. So a few lines from a
- comma-delimited file might look something like this (an export from a
- hypothetical database of people who owe your company money):
-
- +-------------------------------------------------------------------------+
- | "JONES","FRED","1234 GREEN AVENUE","KANSAS CITY","MO",293.64 |
- | "SMITH","JOHN","2343 OAK STREET","NEW YORK","NY",22.50 |
- | "WILLIAMS","JOSEPH","23 GARDEN CRESCENT","TORONTO","ON",16.99 |
- +-------------------------------------------------------------------------+
-
- Unfortunately, not all programs export or import data in this format.
- Even more frustrating is a program that exports data in a format that is
- ALMOST what you need!
-
- If that's the case, you might decide to spend a few hours in a text editor,
- modifying the export file so that the other program can understand it. Or
- you might write a program to do the editing for you. Both solutions are
- time-consuming.
-
- An even more challenging problem arises when a program which has no export
- capability does have the ability to "print" reports to a file. You can
- write a program to read these files and convert them to something you can
- use, but this can be a LOT of work!
-
-
-
- ----------------------------
- PARSE-O-MATIC TO THE RESCUE!
- ----------------------------
-
- Parse-O-Matic is a utility that interprets text and fixed-length files and
- converts them to other formats. It can help you "boil down" reports into
- their essential data. You can also use it to convert NEARLY compatible
- file formats.
-
-
- ------------
- HOW IT WORKS
- ------------
-
- You need three things:
-
- 1) The Parse-O-Matic program
- 2) A Parse-O-Matic "POM" file (to tell Parse-O-Matic what to do)
- 3) The input file
-
- The input file is usually a report from another program, or a fixed record
- length data file. We've provided several examples of typical input files.
- For example, the file EXAMPLE2.TXT comes from the AccPac accounting
- software. AccPac is a great program, but its export capabilities leave
- something to be desired. Parse-O-Matic can help!
-
-
-
- ===========================================================================
- FUNDAMENTALS
- ===========================================================================
-
-
- This documentation assumes that you are an experienced computer user. If
- you have trouble, you might ask a programmer to help you -- POM file
- creation is a little like programming!
-
-
-
- -------------------------
- THE PARSE-O-MATIC COMMAND
- -------------------------
-
- The basic format of the Parse-O-Matic command line is:
-
- POM pom-file input-file output-file
-
- Here's an example, as you would type it at the DOS command line:
-
- POM POMFILE.POM REPORT.TXT OUTPUT.TXT
-
- For a more formal description of the command line, start up POM by typing
- this command at the DOS prompt:
-
- POM
-
- ------------
- THE POM FILE
- ------------
-
- The POM file is a text file with a .POM extension. The following
- conventions are used when interpreting the POM file:
-
- - Null lines and lines starting with a semi-colon (comments) are ignored.
-
- - A POM file may contain up to 500 lines of specifications.
- Comment lines do not count in this total.
-
- A POM file contains no "loops" (to use the programming term). Each line of
- the input file is processed by the entire POM file. If you'd like it
- expressed in terms of programming languages, here's what POM does:
-
- +-------------------------------------------------------------------------+
- | START: If there's nothing left in the input file, go to QUIT. |
- | Read a line from the input file |
- | Do everything in the POM file |
- | Go to START |
- | QUIT: Tell the user you're finished! |
- +-------------------------------------------------------------------------+
-
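The loop above can be sketched in Python. This is only an illustration of the
control flow described here, not Parse-O-Matic's actual code: the names
run_pom and echo are hypothetical, and each POM command is modeled as a plain
function. Variables are assumed to persist from one input line to the next,
because the SPLIT example in this manual sets variables in one segment and
outputs them in the next.

```python
def run_pom(input_lines, pom_commands):
    """Apply every command in the POM file to each input line, in order."""
    variables = {}   # assumed to persist from one input line to the next
    output = []
    for line in input_lines:            # START: stop when nothing is left
        variables["$FLINE"] = line      # read a line from the input file
        for command in pom_commands:    # do everything in the POM file
            command(variables, output)
    return output                       # QUIT


def echo(variables, output):
    """Hypothetical stand-in for a POM line such as OUTEND |{$FLINE}."""
    output.append(variables["$FLINE"])


print(run_pom(["first line", "second line"], [echo]))
```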
-
-
- -------------------
- PADDING FOR CLARITY
- -------------------
-
- Spaces and tabs between the words and variables in a POM file line are
- generally ignored (except in the case of the OUT and OUTEND commands). You
- can use spaces to make your POM files easier to read.
-
- Additionally, in any line in the POM file, the following terms are ignored:
-
- = THEN ELSE
-
- These can be added to make the lines easier to read. For example, the IF
- command can be written in any of the following ways:
-
- Very terse:          IF PRICE "0.00" BONUS "0.00" "1.00"
-
- Padded with spaces:  IF PRICE   "0.00"      BONUS   "0.00"      "1.00"
-
- Fully padded:        IF PRICE = "0.00" THEN BONUS = "0.00" ELSE "1.00"
-
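One way to picture why the three spellings are equivalent is to drop the
padding terms before interpreting the line. The Python sketch below does
that; the tokenizing rule and the names TOKEN and significant_tokens are
assumptions made for illustration, not a description of how Parse-O-Matic
actually reads a POM file. Only the upper-case spellings shown above are
treated as padding.

```python
import re

PADDING_TERMS = {"=", "THEN", "ELSE"}

# A token is a quoted literal, a name with a [start end] substring, or a word.
TOKEN = re.compile(r'"[^"]*"|[^\s\[]+\[[^\]]*\]|\S+')


def significant_tokens(pom_line):
    """Return the tokens of a POM line with the padding terms removed."""
    return [tok for tok in TOKEN.findall(pom_line) if tok not in PADDING_TERMS]


terse = 'IF PRICE "0.00" BONUS "0.00" "1.00"'
padded = 'IF PRICE = "0.00" THEN BONUS = "0.00" ELSE "1.00"'
print(significant_tokens(terse) == significant_tokens(padded))
```

Because a quoted literal keeps its quotes in this sketch, a literal such as
"=" is not mistaken for a padding term.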
-
-
- ===========================================================================
- COMMAND WORDS
- ===========================================================================
-
-
- For ease of learning, the command words are explained in the following
- order:
-
- +-------------------------------------------------------------------------+
- | |
- | COMMANDS WHICH WILL... LIST OF COMMANDS |
- | ---------------------------------- --------------------------------- |
- | Break up an input line into fields SET IF |
- | Control processing flow BEGIN END |
- | Generate output OUT OUTEND |
- | Accept or reject input MINLEN IGNORE ACCEPT |
- | Alter fields TRIM PAD INSERT CHANGE |
- | Preprocess input SPLIT CHOP |
- | Look up data in another file LOOKUP LOOKFILE LOOKCOLS LOOKSPEC |
- | |
- +-------------------------------------------------------------------------+
-
-
- Here is a quick-reference table of all the commands. The following
- conventions are used in the table:
-
- "var" means a variable that is being set.
- "value" means a variable whose value is being read.
- Square brackets [like this] indicate optional items.
-
-
- ------------------------------------------- ------------------------------
- COMMAND FORMATS EXAMPLE
- =========================================== ==============================
- SET var1 value1 SET NAME $FLINE[20 26]
- IF value1 value2 var1 value3 [value4] IF X = "Y" THEN Z = "N"
- ------------------------------------------- ------------------------------
- BEGIN value1 value2 BEGIN LINECNTR = "3"
- END END
- ------------------------------------------- ------------------------------
- OUT [value1 value2] |output-picture OUT "X" "X" |{PRICE}
- OUTEND [value1 value2] |output-picture OUTEND "X" "X" |{$FLINE}
- ------------------------------------------- ------------------------------
- MINLEN number MINLEN "15"
- IGNORE value1 value2 IGNORE PRICE "0.00"
- ACCEPT value1 value2 ACCEPT $FLINE[1 3] "YES"
- ------------------------------------------- ------------------------------
- TRIM var1 spec1 character TRIM PRICE "R" "$"
- PAD var1 spec1 character len PAD SERIALNUM "L" "0" "10"
- INSERT var1 spec1 value1 INSERT PRICE "L" "$"
- CHANGE var1 value1 value2 CHANGE DATE "/" "-"
- ------------------------------------------- ------------------------------
- SPLIT from to [,from to] [...] SPLIT 1 250, 251 300
- CHOP from to [,from to] [...] CHOP 1 250, 251 300
- ------------------------------------------- ------------------------------
- LOOKUP var1 value1 LOOKUP PHONENUM "FRED JONES"
- LOOKFILE value1 LOOKFILE "C:\TABLES\DATA.TBL"
- LOOKCOLS value1 value2 value3 value4 LOOKCOLS "1" "3" "8" "255"
- LOOKSPEC value1 value2 value3 LOOKSPEC "Y" "N" "N"
- ------------------------------------------- ------------------------------
-
-
- The commands are explained in more detail (and in the same order) in the
- following sections.
-
-
-
- ---------------
- The SET Command
- ---------------
-
- FORMAT: SET var1 value1
-
- SET assigns a value to a variable. The usual reason to do this is to set a
- variable from the input line (represented by the variable $FLINE) prior to
- cleaning it up with TRIM. For example, if the input line looked like this:
-
-   JOHN       SMITH     555-1234   322 Westchester Lane    Architect
-   |          |         |          |                       |
-   Column 1   Col 12    Col 22     Col 33                  Col 57
-
- then we could extract the last name from the input line with these two POM
- commands:
-
- SET NAME = $FLINE[12 21] (Sets the variable from the input line)
- TRIM NAME "R" " " (Trims any spaces on the right side)
-
- SET would first set the variable NAME to this value: "SMITH     "
- After the TRIM, the variable NAME would have the value: "SMITH"
-
- You will also use SET if you plan to include a substring of $FLINE in the
- output, since the OUT and OUTEND commands do not recognize substrings after
- the "|" marker, only complete variables.
-
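For readers who think in terms of a programming language, the SET and TRIM
pair above behaves like a slice followed by a right-strip. The Python sketch
below assumes that column numbers are 1-based and inclusive, as the example
implies; the helper name substring is hypothetical.

```python
def substring(value, start, end):
    """Return columns start..end of value (1-based, inclusive)."""
    return value[start - 1:end]


# Rebuild the sample input line with fields at columns 1, 12, 22, 33 and 57.
line = ("JOHN".ljust(11) + "SMITH".ljust(10) + "555-1234".ljust(11)
        + "322 Westchester Lane".ljust(24) + "Architect")

name = substring(line, 12, 21)    # like SET NAME = $FLINE[12 21]
trimmed = name.rstrip(" ")        # like TRIM NAME "R" " "
print(repr(name), repr(trimmed))
```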
-
-
- --------------
- The IF Command
- --------------
-
- FORMAT: IF value1 value2 var1 value3 [value4]
-
- If value1 contains value2, var1 is set to value3. Otherwise, it is set to
- value4. If value4 is missing, nothing is done (i.e. var1 is not changed).
- Here's an example of the IF command...
-
- SET EARNING = $FLINE[20 26]
- TRIM EARNING "A" " "
- IF EARNING = "0.00" THEN BONUS = "0.00" ELSE "1.00"
-
- This would obtain the value between columns 20 and 26, remove any spaces,
- then check if it equals "0.00". If it does, the variable BONUS is set to
- "0.00". If not, BONUS is set to "1.00".
-
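The IF command can be modeled as a small function. The sketch below is an
illustration only, and the name pom_if is hypothetical. It follows the
"contains" wording of the format description; note that under that reading a
value such as "10.00" would also match "0.00", whereas the example's
explanation speaks of the value being equal to "0.00".

```python
def pom_if(variables, value1, value2, var1, value3, value4=None):
    """Set var1 to value3 when value1 contains value2, else to value4."""
    if value2 in value1:
        variables[var1] = value3
    elif value4 is not None:
        variables[var1] = value4
    # With no value4, var1 is left unchanged when the test fails.


variables = {}
pom_if(variables, "0.00", "0.00", "BONUS", "0.00", "1.00")
print(variables["BONUS"])
pom_if(variables, "12.50", "0.00", "BONUS", "0.00", "1.00")
print(variables["BONUS"])
```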
-
-
- --------------------------
- The BEGIN and END Commands
- --------------------------
-
- The format for the BEGIN and END commands is as follows:
-
- BEGIN value1 value2
- :
- Dependent code
- :
- END
-
- If value1 equals value2, then the dependent code (the POM lines between
- the BEGIN and the END) is executed. If value1 does not equal value2,
- then the dependent code is skipped.
-
- It is traditional in programming to indent code that appears in blocks
- such as Parse-O-Matic's BEGIN/END technique. This makes the logic of
- the program easier to understand. For example:
-
-   BEGIN datatype = "Employee"
-     SET phone   = $FLINE[ 1 10]
-     SET address = $FLINE[12 31]
-   END
-
- BEGIN/END blocks can be nested. That is to say, you can have BEGIN/END
- blocks inside other BEGIN/END blocks. Here is an example, with arrows
- to indicate the levels of each BEGIN/END block...
-
-   BEGIN datatype = "Employee"     <---------------------
-     SET phone    = $FLINE[ 1 10]                        |
-     SET address  = $FLINE[12 31]                        |
-     SET areacode = phone[1 3]                           |  First
-     BEGIN areacode = "514"        <-------   Second     |  Level
-       SET local = "Y"                     |  Level      |  Block
-       SET tax   = "Y"             <-------   Block      |
-     END                                                 |
-   END                             <---------------------
-
- In this case, the "inner" block (starting with BEGIN areacode = "514")
- would only be reached if the "outer" block (BEGIN datatype = "Employee")
- was true. If the outer block was false, the inner block would never be
- executed.
-
- A nested BEGIN/END block must always be completely inside the outer
- block. Study the following (incorrect) example:
-
-   BEGIN datatype = "Employee"        <----
-     SET phone    = $FLINE[ 1 10]          | First
-     SET areacode = phone[1 3]             | Level
-     BEGIN areacode = "514"    <---        | Block?
-       SET local = "Y"             |       |
-   END                             |  <----
-       SET tax   = "Y"             |
-     END                       <--- Second Level Block?
-
- Parse-O-Matic does not pay attention to the indenting -- it is only a
- tradition we use to make the file easier to read. The code will be
- understood this way:
-
-   BEGIN datatype = "Employee"     <---------------------
-     SET phone    = $FLINE[ 1 10]                        |  First
-     SET areacode = phone[1 3]                           |  Level
-     BEGIN areacode = "514"        <---   Second         |  Block
-       SET local = "Y"                 |  Level          |
-     END                           <---   Block          |
-     SET tax   = "Y"                                     |
-   END                             <---------------------
-
- You can nest BEGIN/END blocks up to 25 deep. (It is quite unlikely you
- will ever actually need that much nesting.) Here is an example of code
- that uses nesting up to three deep:
-
-   BEGIN datatype = "Dog"           <----------------------------------
-     SET breed = $FLINE[1 10]                                          | First
-     BEGIN breed = "Collie"         <-----------------------           | Level
-       SET sound = "Woof"                                   | Second   | Block
-       BEGIN name = "Spot"          <------  Third          | Level    |
-         SET attitude = "Friendly"         | Level          | Block    |
-       END                          <------  Block          |          |
-     END                            <-----------------------           |
-     BEGIN breed = "Other"          <-----------------------  Another  |
-       SET sound = "Arf"                                    | Second   |
-       SET attitude = "Unknown"                             | Level    |
-     END                            <-----------------------  Block    |
-   END                              <----------------------------------
-
- Once again, the indentation is for clarity only and does not affect the
- way the POM file runs. However, you will find that it makes your POM
- file much easier to understand.
-
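The rule that indentation is ignored can be illustrated with a short Python
sketch that pairs each END with the most recent unmatched BEGIN. This is a
model of the behavior described above, not Parse-O-Matic's own code; the
function name match_blocks is hypothetical, and the sketch recognizes only
the upper-case spellings BEGIN and END.

```python
def match_blocks(pom_lines):
    """Pair each BEGIN with its END by order alone, ignoring indentation."""
    open_blocks = []   # line numbers of BEGINs still waiting for an END
    pairs = []
    for number, text in enumerate(pom_lines, start=1):
        words = text.split()
        keyword = words[0] if words else ""
        if keyword == "BEGIN":
            open_blocks.append(number)
        elif keyword == "END":
            if not open_blocks:
                raise ValueError(f"line {number}: END without BEGIN")
            pairs.append((open_blocks.pop(), number))
    if open_blocks:
        raise ValueError(f"BEGIN on line {open_blocks[-1]} has no END")
    return sorted(pairs)


misleading = [
    'BEGIN datatype = "Employee"',
    '  SET phone = $FLINE[ 1 10]',
    '  SET areacode = phone[1 3]',
    '  BEGIN areacode = "514"',
    '    SET local = "Y"',
    'END',
    '    SET tax = "Y"',
    '  END',
]
print(match_blocks(misleading))
```

In this sample the END on line 6 closes the inner BEGIN on line 4, whatever
the indentation suggests, and the END on line 8 closes the outer block.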
-
-
- ---------------------------
- The OUT and OUTEND Commands
- ---------------------------
-
- FORMAT: OUT[END] [value1 value2] |output-picture
-
- The OUT command generates output without an end-of-line (i.e. carriage
- return and linefeed characters). The OUTEND command generates output and
- also adds an end-of-line.
-
- When value1 matches value2 (or if the comparison is omitted), a line is
- output to the output file, according to the output picture. Within the
- output picture, all text is taken literally (i.e. " is taken to mean
- literally that -- a quotation mark character).
-
- The only exception to this is variable names, which are identified by the
- { and } characters. For example, a POM file that contained the following
- single line:
-
- OUTEND "X" = "X" |{$FLINE}
-
- would simply output every line from the input file (not very useful!).
-
- The "X" = "X" part of the command is the comparator which controls when
- output occurs; if both parts of the comparator are forced to the same
- value, output will always occur.
-
- NOTE: For efficiency, OUT does not write immediately to the output file; it
- accumulates the output until it reaches 255 characters before writing. You
- must do an OUTEND command to ensure that the data is actually written.
-
- You cannot use substrings after the "|" marker. Thus, the following line
- is NOT legal:
-
- OUTEND $FLINE[1 3] = "IBM" |{$FLINE[1 15]}
-
- The correct way to code this is as follows:
-
- SET CODE = $FLINE[1 15]
- OUTEND $FLINE[1 3] = "IBM" |{CODE}
-
- This would output the first 15 characters of any line that contains the
- letters IBM in the first three positions.
-
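The way an output picture is filled in can be sketched in Python as a simple
substitution of {NAME} markers. This is an illustration under assumptions:
the function name expand_picture is hypothetical, variables are stored under
upper-case names (the manual states that case is not significant in variable
names), and the sketch simply fails on an undefined variable, since the
manual does not say what Parse-O-Matic does in that case.

```python
import re


def expand_picture(picture, variables):
    """Replace each {NAME} in an output picture with that variable's value."""
    def value_of(match):
        return variables[match.group(1).upper()]   # names ignore case
    return re.sub(r"\{([^{}]+)\}", value_of, picture)


variables = {"NAME": "SMITH", "PRICE": "22.50"}
# Everything outside the braces, including the quotation marks, is literal.
text = expand_picture('"{name}",{PRICE}', variables)
print(text)
```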
-
-
- ------------------
- The MINLEN Command
- ------------------
-
- FORMAT: MINLEN number
-
- MINLEN specifies the minimum length a line must be to be considered for
- parsing. If you omit the MINLEN command, the minimum length is assumed to
- be 1. That is to say, all lines of 1 character or more will be processed
- and shorter lines (null lines, in other words) will be ignored.
-
- MINLEN is useful for ignoring brief information lines that clutter up a
- report that you are parsing. For example, in the sample file EXAMPLE2.POM,
- the MINLEN command is set to 85 to ensure that all lines shorter than 85
- characters will be ignored. This simplifies the coding considerably.
-
- The longest allowable input line is 255 characters, unless you use the
- SPLIT or CHOP command (described later).
-
-
-
- ------------------
- The IGNORE Command
- ------------------
-
- FORMAT: IGNORE value1 value2
-
- When value1 contains value2, the input line is ignored and all further
- processing on the input line stops. The usual format of this command is as
- in this example:
-
- IGNORE $FLINE[3 9] = "Date"
-
- This would skip any input line that contains the word "Date" between
- columns 3 and 9 ($FLINE is the line just read from the input file).
-
-
-
- ------------------
- The ACCEPT Command
- ------------------
-
- FORMAT: ACCEPT value1 value2
-
- The ACCEPT command accepts the input line if value1 contains value2. For
- example, if the entire POM file read as follows:
-
- ACCEPT $FLINE[15 17] = "YES"
- OUTEND "X" = "X" |{$FLINE}
-
- then any input line that contains "YES" starting in column 15 would be sent
- to the output file. All other lines would be ignored.
-
- CLUSTERED ACCEPTS: Sometimes you have to check more than one value to see
- if the input line is valid. You do this by using "clustered ACCEPTs",
- which are several ACCEPT commands in a row.
-
- Briefly stated, if you have several ACCEPTs in a row ("clustered"), they
- are all processed to determine if the input line is acceptable or not. If
- even one ACCEPT matches up, the line is accepted. To express this in more
- detail...
-
- When value1 contains value2, the line is accepted, and processing of the
- POM file continues for that input line, even if the immediately following
- ACCEPTs do NOT produce a match. After all, we've already got a match!
-
- If value1 does NOT contain value2, Parse-O-Matic looks at the next command
- in the POM file. If it is not another ACCEPT, the input line is ignored.
- If it is another ACCEPT, maybe it will produce a match -- so Parse-O-Matic
- moves to that command.
-
- The following POM file uses clustered ACCEPTs to accept any line that
- contains the name "FRED" or "MARY" between columns 5 and 8, or contains the
- word "MEMBER" between columns 20 and 25.
-
- SET NAME = $FLINE[5 8] (Set the variable)
- ACCEPT NAME = "FRED" (Look for FRED)
- ACCEPT NAME = "MARY" (Look for MARY)
- ACCEPT $FLINE[20 25] = "MEMBER" (Look for MEMBER)
- OUTEND "X" = "X" |{$FLINE} (Output the line if we get this far)
-
- The following example would NOT work, however:
-
- ACCEPT $FLINE[20 25] = "MEMBER"
- SET NAME = $FLINE[5 8]
- ACCEPT NAME = "FRED"
- ACCEPT NAME = "MARY"
- OUTEND "X" = "X" |{$FLINE}
-
- It would not work because the ACCEPTs are not clustered; if the first
- ACCEPT fails, the input line will be rejected as soon as the SET command is
- encountered. The next two ACCEPTs would not be reached in such a case.
-
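The clustering rule can be expressed as a short Python sketch. It is a model
of the rule as described above, not the program's real logic. Commands are
represented as hypothetical tuples, ("ACCEPT", variable, text) or ("OTHER",),
and the variables are assumed to have been set already.

```python
def line_survives(commands, variables):
    """Return True if the input line gets past every ACCEPT cluster."""
    i = 0
    while i < len(commands):
        if commands[i][0] == "ACCEPT":
            _, name, wanted = commands[i]
            if wanted in variables[name]:
                # A match: skip the ACCEPTs that immediately follow.
                i += 1
                while i < len(commands) and commands[i][0] == "ACCEPT":
                    i += 1
                continue
            # No match: the line survives only if another ACCEPT comes next.
            if i + 1 >= len(commands) or commands[i + 1][0] != "ACCEPT":
                return False
        i += 1
    return True


clustered = [("ACCEPT", "NAME", "FRED"), ("ACCEPT", "NAME", "MARY"),
             ("ACCEPT", "TAG", "MEMBER"), ("OTHER",)]
unclustered = [("ACCEPT", "TAG", "MEMBER"), ("OTHER",),
               ("ACCEPT", "NAME", "FRED"), ("ACCEPT", "NAME", "MARY"),
               ("OTHER",)]

fred = {"NAME": "FRED", "TAG": "GUEST "}
print(line_survives(clustered, fred), line_survives(unclustered, fred))
```

With the clustered layout the FRED line is accepted; with the unclustered
layout the same line is rejected at the first non-ACCEPT command.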
-
-
- ----------------
- The TRIM Command
- ----------------
-
- FORMAT: TRIM var1 spec1 character
-
- Removes characters from var1. This is usually used to remove blanks.
-
- spec1 can be: A=All B=Both ends L=Left side only R=Right side only
-
- For example:
-
- SET PRICE = $FLINE[20 26]
- TRIM PRICE "A" ","
- TRIM PRICE "L" "$"
-
- This would remove all commas from the variable "PRICE", and remove the
- leading dollar sign. Thus:
-
- If the input contained the string: "$25,783"
- The first TRIM would change it to: "$25783"
- The second TRIM would change it to: "25783"
-
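The four TRIM settings correspond closely to familiar string operations. The
Python sketch below is an illustration that assumes a single trim character;
the function name pom_trim is hypothetical.

```python
def pom_trim(value, spec, character):
    """Remove a character from all of value, both ends, the left or the right."""
    if spec == "A":
        return value.replace(character, "")
    if spec == "B":
        return value.strip(character)
    if spec == "L":
        return value.lstrip(character)
    if spec == "R":
        return value.rstrip(character)
    raise ValueError(f"unknown TRIM spec: {spec!r}")


price = "$25,783"
price = pom_trim(price, "A", ",")   # like TRIM PRICE "A" ","
price = pom_trim(price, "L", "$")   # like TRIM PRICE "L" "$"
print(price)
```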
-
-
- ---------------
- The PAD Command
- ---------------
-
- FORMAT: PAD var1 spec1 character len
-
- PAD makes var1 a specified length, padded with a specified character.
-
- spec1 is "L", "R", or "C" (Left, Right or Center)
- character is the character used to pad the string
- len is the desired string length
-
- For example, if the variable ABC is set to "1234" ...
-
- PAD ABC "L" "0" "7" left-pads it 7 characters wide with zeros ("0001234")
- PAD ABC "R" " " "5" right-pads it 5 characters wide with spaces ("1234 ")
- PAD ABC "C" "*" "8" would center it, 8 wide, with asterisks ("**1234**")
-
- If the length is less than the length of the string, it is unchanged. For
- example, if you set variable XYZ to "PINNACLE", then
-
- PAD XYZ "R" " " "3"
-
- would leave the string as-is ("PINNACLE").
-
- Thus, PAD cannot be used to shorten a string. If it was your intention to
- make XYZ 3 letters long, it would be appropriate to use the SET command:
-
- SET XYZ = XYZ[1 3]
-
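The PAD examples above can be reproduced with a short Python sketch. Note
that "L" adds the padding on the left, which corresponds to right-justifying
the text. The function name pom_pad is hypothetical, and where the padding
cannot be divided evenly in "C" mode, the side that receives the extra
character is an assumption (the manual does not say).

```python
def pom_pad(value, spec, character, length):
    """Pad value to the given length; longer values are returned unchanged."""
    if spec == "L":
        return value.rjust(length, character)    # padding goes on the left
    if spec == "R":
        return value.ljust(length, character)    # padding goes on the right
    if spec == "C":
        return value.center(length, character)   # padding on both sides
    raise ValueError(f"unknown PAD spec: {spec!r}")


print(pom_pad("1234", "L", "0", 7))
print(pom_pad("1234", "C", "*", 8))
print(pom_pad("PINNACLE", "R", " ", 3))
```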
-
- ------------------
- The INSERT Command
- ------------------
-
- FORMAT: INSERT var1 spec1 value1
-
- The INSERT command inserts text on the left or right of var1, or at a
- "found text" position.
-
- spec1 is "L" or "R" (Left or Right) or a find-string (e.g. "@HELLO")
- value1 is the value to be inserted
-
- For example, if the variable ABC is set to "Parse-O-Matic", then
-
- INSERT ABC "L" "Register " would set ABC to "Register Parse-O-Matic"
- INSERT ABC "R" " is super" would set ABC to "Parse-O-Matic is super"
-
- You can use a find-string to insert text at the first occurrence of the text
- you specify. For example:
-
- INSERT ABC "@-O-Matic" "!" would set ABC to "Parse!-O-Matic"
-
- If the find-string is not found, nothing is done.
-
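The three forms of INSERT can be modeled as follows. This Python sketch is an
illustration based on the examples above; the function name pom_insert is
hypothetical. The text is placed immediately before the found string, which
is what the "Parse!-O-Matic" example shows.

```python
def pom_insert(value, spec, text):
    """Insert text on the left, on the right, or before a found string."""
    if spec == "L":
        return text + value
    if spec == "R":
        return value + text
    if spec.startswith("@"):
        position = value.find(spec[1:])   # first occurrence only
        if position == -1:
            return value                  # not found: nothing is done
        return value[:position] + text + value[position:]
    raise ValueError(f"unknown INSERT spec: {spec!r}")


print(pom_insert("Parse-O-Matic", "@-O-Matic", "!"))
```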
-
-
- ------------------
- The CHANGE Command
- ------------------
-
- FORMAT: CHANGE var1 value1 value2
-
- The CHANGE command replaces ALL occurrences of value1 with value2. This is
- more powerful than TRIM, but is not as efficient. Here is an example of
- the CHANGE command in action:
-
- SET DATE = $FLINE[31 38]
- CHANGE DATE "/" "--"
-
- If the SET command assigned DATE the value: "93/10/15"
- Then the CHANGE command would convert it to: "93--10--15"
-
-
-
- -----------------
- The SPLIT Command
- -----------------
-
- FORMAT: SPLIT from-position to-position [,from-pos'n to-pos'n] [...]
-
- The maximum length of an input line from a text file is 255 characters. If
- your input file is wider than that, you must break up the file into
- manageable chunks, using the SPLIT command. This command lets you specify
- the way in which each input line is broken up so that it will look like
- several SEPARATE lines.
-
- For example, if your input lines were up to 300 characters wide, you could
- specify:
-
- SPLIT 1 255, 256 300
-
- This would break up each line as if it was two lines. (If some of the
- lines were less than 256 characters they would still be treated as two
- lines, though the second line would be null (i.e. empty).)
-
- You can specify up to 100 splits (use multiple SPLIT commands if
- necessary). With SPLIT, Parse-O-Matic can handle input records of up to
- 32767 characters.
-
- The best way of handling SPLIT or CHOPped files is to use a combination of
- $SPLIT (explained in more detail later) and BEGIN/END. For example:
-
- SPLIT 1 250, 251 300
- BEGIN $SPLIT = "1"
- SET a = $FLINE[ 1 10]
- SET b = $FLINE[11 20]
- END
- BEGIN $SPLIT = "2"
- SET x = $FLINE[ 1 10]
- SET y = $FLINE[11 20]
- OUTEND |{a} {b} {x} {y}
- END
-
- This would output the data which appears (in the input file) in columns
- 1-10, 11-20, 251-260 and 261-270.
-
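The effect of SPLIT on one wide input line can be sketched in Python. This is
a model only: the function name split_line is hypothetical, and ranges are
taken to be 1-based and inclusive. Each segment is then processed as if it
were a separate line, with the segment number playing the role of $SPLIT.

```python
def split_line(line, ranges):
    """Cut one input line into segments; ranges are 1-based and inclusive."""
    return [line[start - 1:end] for start, end in ranges]


ranges = [(1, 250), (251, 300)]     # like SPLIT 1 250, 251 300
wide = "A" * 250 + "B" * 50         # a 300-character input line
for number, segment in enumerate(split_line(wide, ranges), start=1):
    # number stands in for $SPLIT; segment stands in for $FLINE
    print(number, len(segment), segment[:10])
```

Columns 11-20 of the second segment are columns 261-270 of the original
line. A line shorter than 251 characters still yields two segments, the
second of which is empty.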
-
-
- ----------------
- The CHOP Command
- ----------------
-
- FORMAT: CHOP from-position to-position [,from-pos'n to-pos'n] [...]
-
- The CHOP command works the same way as the SPLIT command, with one
- exception: it informs Parse-O-Matic that the input is a fixed-record-
- length file. In other words, it means that the input records are
- distinguished by having a particular (and exact) length, rather than being
- separated by end-of-line characters (Carriage Return, Linefeed) as is the
- case for a standard text file.
-
- Thus, if you have an input file containing fixed-length records, each of
- which is 200 characters wide, you could specify it like this:
-
- CHOP 1 200
-
- If the input record is more than 255 characters, you must break it up into
- smaller chunks. For example, if the input record was 300 characters wide,
- you could break it up like this:
-
- CHOP 1 250, 251 300
-
- By using CHOP, Parse-O-Matic can handle input records up to 32767
- characters wide. You can use the $SPLIT variable to manage your use of
- CHOP. See the example in the section describing the SPLIT command.
-
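Reading a fixed-record-length file amounts to cutting the data every N
characters instead of at end-of-line characters. The Python sketch below
illustrates the idea for CHOP 1 200 style input; the function name
chop_records is hypothetical, and returning a short final record as-is is an
assumption, since the manual does not describe that case.

```python
def chop_records(data, record_length):
    """Cut data into consecutive records of record_length characters."""
    return [data[i:i + record_length]
            for i in range(0, len(data), record_length)]


data = "AAAA" + "BBBB" + "CCCC"     # three 4-character records, no line ends
print(chop_records(data, 4))
```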
-
-
- ------------------
- The LOOKUP Command
- ------------------
-
- FORMAT: LOOKUP var1 value1
-
- The LOOKUP command will search for value1 in a text file (the name of which
- is specified either by the LOOKFILE command or the /L startup parameter).
- When POM finds it, it sets var1 to another value found on the same line.
-
- Let us suppose you created a text file, named NAMES.TBL, which looks like
- this:
-
- R. REAGAN Ronald Reagan
- D. EISENHOWER Dwight Eisenhower
- G. BUSH George Bush
- : :
- Column 1 Column 18
-
- This file can be used to look up a name, as in this POM file:
-
- LOOKFILE "NAMES.TBL"
- LOOKCOLS "1" "17" "18" "34"
- SET oldname = $FLINE[21 37]
- TRIM oldname "R" " "
- LOOKUP newname = oldname
- OUTEND |{oldname} {newname}
-
- The LOOKFILE command specifies the name of the look-up file. The LOOKCOLS
- command specifies the starting and end columns for both the "text-to-look-
- for" field (known as the key field) and the "text-to-replace-with" field
- (known as the data field).
-
- The LOOKUP command will look for oldname in NAMES.TBL. If oldname was set
- to "G. BUSH", LOOKUP would set newname to "George Bush". If, however,
- oldname was set to "G. WASHINGTON", which doesn't appear in NAMES.TBL,
- newname would be set to "" (that is to say, an empty string).
-
- There is no limit to the number of lines that you can put in a look-up
- file. However, the more lines there are, the longer it will take to process
- (because there is more to search). The maximum length of a line in a
- look-up file is 255 characters.
-
- In the look-up file, null (empty) lines are ignored. You can also include
- comments in the file by starting the line with a semi-colon:
-
- ; Some of the Presidents of the United States
- R. REAGAN Ronald Reagan
- D. EISENHOWER Dwight Eisenhower
- G. BUSH George Bush
-
- The LOOKUP command can be used for more than just names, of course. You
- could use it to look up prices, phone numbers, addresses and so on.
-
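The look-up described in this section can be sketched in Python as a linear
search over the table lines. This is an illustration, not the program's
implementation: the function name lookup is hypothetical, trailing spaces in
the key field are assumed to be ignored when comparing, and the comparison
ignores case and the data field is stripped of spaces, following the
LOOKSPEC defaults given in this manual.

```python
def lookup(table_lines, key, key_cols=(1, 17), data_cols=(18, 34)):
    """Return the data field of the first line whose key field matches key."""
    wanted = key.upper()                        # default: case is ignored
    for line in table_lines:
        if line == "" or line.startswith(";"):  # skip null lines and comments
            continue
        key_field = line[key_cols[0] - 1:key_cols[1]].rstrip(" ")
        if key_field.upper() == wanted:
            return line[data_cols[0] - 1:data_cols[1]].strip(" ")
    return ""                                   # not found: empty string


table = [
    "; Some of the Presidents of the United States",
    "R. REAGAN".ljust(17) + "Ronald Reagan",
    "D. EISENHOWER".ljust(17) + "Dwight Eisenhower",
    "",
    "G. BUSH".ljust(17) + "George Bush",
]
print(lookup(table, "G. BUSH"))
```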
-
-
- --------------------
- The LOOKFILE Command
- --------------------
-
- FORMAT: LOOKFILE value1
-
- The LOOKFILE command specifies the name of the look-up file for the next
- LOOKUP command. This lets you use several look-up files in one POM file.
- For example:
-
- SET name = $FLINE[1 20]
- ; Look up full name
- LOOKFILE "NAMES.TBL"
- LOOKCOLS "1" "25" "30" "50"
- LOOKUP fullname = name
- ; Look up phone number
- LOOKFILE "PHONE.TBL"
- LOOKCOLS "1" "25" "30" "40"
- LOOKUP phone = name
- ; Output result
- OUTEND |{name} {fullname} {phone}
-
- If you only have one look-up file, you may omit the LOOKFILE command and
- specify the file name on the command line, using the /L parameter. For
- example, you could write a POM file like this:
-
- SET name = $FLINE[1 20]
- ; Look up full name
- LOOKCOLS "1" "25" "30" "50"
- LOOKUP fullname = name
- ; Output result
- OUTEND |{name} {fullname}
-
- Your POM command could then look like this:
-
- POM MYPOM.POM INPUT.TXT OUTPUT.TXT /LC:\MYFILES\NAMES.TBL
-
- This technique allows you to use several different look-up files with the
- same POM file, simply by changing the command line.
-
-
-
- --------------------
- The LOOKCOLS Command
- --------------------
-
- FORMAT: LOOKCOLS value1 value2 value3 value4
-
- The LOOKCOLS command specifies the starting and ending columns for the
- key and data fields in a look-up file (see the explanation of the LOOKUP
- command for an overview of look-up files).
-
- value1 specifies the starting column for the key field
- value2 specifies the ending column for the key field
- value3 specifies the starting column for the data field
- value4 specifies the ending column for the data field
-
- You can specify a null value to indicate "same as last time". For example:
-
- SET name = $FLINE[1 20]
- LOOKFILE "NAMES.TBL"
- LOOKCOLS "1" "25" "30" "50"
- LOOKUP fullname = name
- LOOKFILE "PHONE.TBL"
- LOOKCOLS "" "" "" "40"
- LOOKUP phonenum = name
- OUTEND |{name} {fullname} {phonenum}
-
- The second LOOKCOLS command uses the same numbers for the first three
- values that the first LOOKCOLS command used.
-
- If you do not specify a LOOKCOLS command, the default values are:
-
- Key Field: Starting column = 1
- Ending column = 10
- Data Field: Starting column = 12
- Ending column = 255
-
- This is equivalent to LOOKCOLS "1" "10" "12" "255".
-
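The "same as last time" rule for null values can be pictured as merging the
new settings over the current ones. The Python sketch below is illustrative;
the names DEFAULT_COLS and apply_lookcols are hypothetical.

```python
DEFAULT_COLS = ["1", "10", "12", "255"]   # the documented defaults


def apply_lookcols(current, new_values):
    """Keep the current setting wherever the new value is null ("")."""
    return [old if new == "" else new
            for old, new in zip(current, new_values)]


cols = apply_lookcols(DEFAULT_COLS, ["1", "25", "30", "50"])
cols = apply_lookcols(cols, ["", "", "", "40"])   # like LOOKCOLS "" "" "" "40"
print(cols)
```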
-
-
- --------------------
- The LOOKSPEC Command
- --------------------
-
- FORMAT: LOOKSPEC value1 value2 value3
-
- The LOOKSPEC command configures the way the next LOOKUP command will work.
-
- value1 = Trim ("Y" or "N" -- default "Y")
- value2 = Sorted ("Y" or "N" -- default "N")
- value3 = Case-sensitive ("Y" or "N" -- default "N")
-
- The Trim setting specifies whether or not the data field should have spaces
- stripped off both ends.
-
- The Sorted setting specifies whether or not the look-up file is sorted by
- key field. Searching a sorted file is much faster than searching an
- unsorted one.
-
- The Case-sensitive setting specifies whether or not LOOKUP should distin-
- guish between upper and lower case when searching. The default setting is
- "N" (No), so that LOOKUP would find "John Smith", even if it appeared in
- the look-up file as "JOHN SMITH". It is usually safest to set Case-
- sensitivity to "N", but if you set it to "Y", searching is slightly faster.
-
- You can specify a null value ("") to indicate "same as last time". For example:
-
- SET name = $FLINE[1 20]
- LOOKFILE "DATA.TBL"
- LOOKCOLS "1" "25" "30" "50"
- LOOKSPEC "Y" "Y" "Y"
- LOOKUP fullname = name
- LOOKCOLS "" "" "60" "70"
- LOOKSPEC "N" "" ""
- LOOKUP phonenum = name
- OUTEND |{name} {fullname} {phonenum}
-
- The second LOOKSPEC command uses the same settings for Sorted and Case-
- sensitivity as the first one, but specifies a different Trim setting.
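
To show how the three settings might interact, here is a rough Python
model (not Parse-O-Matic code). The function name lookup is hypothetical.
The sketch assumes that keys are compared with their padding stripped,
that a sorted table is searched with a binary search, and it returns None
on a miss; how Parse-O-Matic itself handles these details is not modelled.

```python
def lookup(table, key, trim=True, is_sorted=False, case_sensitive=False):
    """Search a list of (key, data) pairs; return the data or None.

    Assumption of this sketch: keys are compared without their padding,
    and a sorted table is ordered under the same comparison rule.
    """
    def norm(text):
        text = text.strip()
        return text if case_sensitive else text.upper()

    wanted = norm(key)
    data = None
    if is_sorted:
        lo, hi = 0, len(table)
        while lo < hi:                      # binary search on a sorted table
            mid = (lo + hi) // 2
            if norm(table[mid][0]) < wanted:
                lo = mid + 1
            else:
                hi = mid
        if lo < len(table) and norm(table[lo][0]) == wanted:
            data = table[lo][1]
    else:
        for table_key, table_data in table:  # linear scan, any order
            if norm(table_key) == wanted:
                data = table_data
                break
    if data is None:
        return None
    return data.strip() if trim else data   # Trim affects the data field only


table = [("JOHN SMITH ", "  555-1234 "), ("MARY JONES", "555-9999")]
print(lookup(table, "John Smith"))                       # 555-1234
print(lookup(table, "John Smith", case_sensitive=True))  # None
```

The binary search is why a sorted file can be searched faster: each
comparison discards half of the remaining entries.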
-
-
-
- ===========================================================================
- TERMS AND TECHNIQUES
- ===========================================================================
-
-
- ------
- VALUES
- ------
-
- A value can be specified in the following ways:
-
- "text" A literal text string
- VARNAME The name of a variable
- VARNAME[start end] A substring of a variable
- VARNAME[start] A single character
- VARNAME+ Incremented variable (see explanation below)
-
- Variable names can be up to 8 characters long. There is no distinction
- between upper and lower case in the variable name. You can create up to
- 220 variables and literals.
-
- Parse-O-Matic predefines several variables. They are:
-
- $FLINE = The line just read from the file (max. length 255 characters)
- $FLUPC = The line just read from the file, in uppercase
- $BRL = The { character (used in OUT)
- $BRR = The } character (used in OUT)
- $TAB = The tab character (Hex $09; ASCII 09)
- $SPLIT = The CHOP or SPLIT number you are currently processing
-
- Since $FLINE has a maximum length of 255 characters, you will have to use
- the SPLIT or CHOP command if your input file is wider than that. The
- $SPLIT variable reports which segment you are processing. For example,
- if you had this command...
-
- CHOP 1 255, 256 380
-
- then $SPLIT would be set to "1" when it was processing columns 1 to 255,
- and it would be set to "2" when it was processing columns 256 to 380.
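
The relationship between the column ranges and $SPLIT can be sketched in
Python (an illustration only, not Parse-O-Matic code). The function name
chop is hypothetical, and the sketch assumes 1-based columns with both
ends of each range included.

```python
def chop(line, ranges):
    """Yield (split_number, segment) for each 1-based inclusive range."""
    for number, (start, end) in enumerate(ranges, start=1):
        yield str(number), line[start - 1:end]


wide_line = "A" * 255 + "B" * 125           # a 380-column input line
for split, segment in chop(wide_line, [(1, 255), (256, 380)]):
    print(split, len(segment))              # "1 255", then "2 125"
```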
-
-
-
- ----------
- DELIMITERS
- ----------
-
- If you need to specify a quotation mark, use "". For example:
-
- IGNORE $FLINE = "He said ""Hello"" to me."
-
- This would ignore lines containing: He said "Hello" to me.
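
The doubling rule can be expressed as a short Python sketch (not
Parse-O-Matic code; the function name parse_literal is hypothetical):

```python
def parse_literal(token):
    """Decode a quoted literal in which "" stands for one quotation mark."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise ValueError("literal must be enclosed in quotation marks")
    return token[1:-1].replace('""', '"')


print(parse_literal('"He said ""Hello"" to me."'))  # He said "Hello" to me.
```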
-
-
-
- ------------------
- ILLEGAL CHARACTERS
- ------------------
-
- No command can contain these ASCII characters:
-
- HEX DECIMAL NAME
- --- ------- --------------------
- $00 0 NULL
- $0A 10 LF (Linefeed)
- $0D 13 CR (Carriage Return)
-
- Of course, CR and LF do still appear at the end of each line in a text file.
-
-
-
- ------------
- INCREMENTING
- ------------
-
- Only numeric incrementing is supported. Attempting to increment another
- type of variable will result in an error.
-
- - Incrementing "1" gives you "2"
- - Incrementing "9" gives you "10"
-
- The first time a variable is referenced, it has a null value. If you
- increment this, it will be changed from "" (i.e. null) to "1".
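
These rules can be summarized in a small Python sketch (not Parse-O-Matic
code). The function name increment is hypothetical, and the sketch treats
only strings of plain digits as numeric, which is an assumption.

```python
def increment(value):
    """Increment a numeric string; a null value ("") becomes "1"."""
    if value == "":
        return "1"
    if not (value.isascii() and value.isdigit()):
        raise ValueError("only numeric incrementing is supported")
    return str(int(value) + 1)


print(increment("1"))   # 2
print(increment("9"))   # 10
print(increment(""))    # 1
```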
-
-
-
- -------------
- LINE COUNTERS
- -------------
-
- If your input record is divided over several lines (due to its original
- format or perhaps because you used the SPLIT or CHOP command), it is
- helpful to set up a line counter. The following example would extract the
- first six characters of the second line of input records that span three
- lines (designated lines 0, 1 & 2; LineCntr is null, "1" and "2" respectively):
-
- IF LineCntr = "1" THEN MyField = $FLINE[1 6]
- OUTEND LineCntr = "1" |{MyField}
- IF LineCntr = "2" THEN LineCntr = "" ELSE LineCntr+
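
The flow of the counter in the example above can be traced with a Python
sketch (not Parse-O-Matic code). The function name extract_second_lines
is hypothetical. It assumes the three commands run once per input line,
in the order shown, and that the counter starts out null.

```python
def extract_second_lines(lines):
    """Collect the first six characters of the 2nd line of each 3-line record."""
    line_cntr = ""                       # a new variable starts out null
    output = []
    for fline in lines:
        if line_cntr == "1":             # second line of the record
            output.append(fline[0:6])
        if line_cntr == "2":             # third line: reset for next record
            line_cntr = ""
        else:                            # null -> "1", "1" -> "2"
            line_cntr = "1" if line_cntr == "" else str(int(line_cntr) + 1)
    return output


records = ["rec1-a", "ABCDEFGH", "rec1-c", "rec2-a", "123456789", "rec2-c"]
print(extract_second_lines(records))     # ['ABCDEF', '123456']
```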
-
-
-
- -------
- TRACING
- -------
-
- By setting the DOS environment variable POM to ALL (SET POM=ALL), you can
- generate a trace file named POM.TRC. This is helpful if you have trouble
- understanding why your file isn't being parsed properly. But be sure to
- test it with a SMALL input file; the trace is quite detailed, and it can
- easily generate a huge output file.
-
- To save space, you can specify a particular list of variables to be traced,
- rather than tracing everything. For example, to trace only the variable
- PRICE, enter this DOS command:
-
- SET POM=PRICE
-
- To trace several variables, separate the variable names by slashes, as in
- this example:
-
- SET POM=PRICE/BONUS/NAME
-
-
- --------
- EXAMPLES
- --------
-
- Most of these techniques are demonstrated by the examples provided with the
- standard Parse-O-Matic package. To see these examples, switch to your
- Parse-O-Matic directory and type START at the DOS prompt.
-
-
-
-
-