Text File | 1988-04-01 | 11KB | 276 lines

GDT.COM Horizontally Aligns Material on a Given Marker Byte

by Robert A. Magnuson
DMB, DCRT, NIH
Bethesda, MD 20892

Mar 1988
Revised Apr 1988

GDT (for Generalized Decimal Tab) is a filter that reads lines from a set of files--or from stdin if no files are specified. Each line is written to stdout, shifted such that the first occurrence of a marker byte is in a specified column. A line with no marker byte is shifted as if a marker byte were just beyond the end. An option permits truncation. Without that option the shifting stops just short of truncation. Lines that could not be fully aligned for that reason can optionally be written to stderr.

[This document is intended to be read from the screen. It contains some characters which will probably not print correctly on a printer.]

A DOS command line that invokes GDT contains a number of arguments. The syntax is expressed by the following diagram:

     ┌──────────────────────────────┐
     │     ┌─ d ─┐ delete marker    │                     ┌──────────┐
GDT ─┴ / ┬─┤     ├─────────────────┬┴─ <marker> ── <col> ─┴┬ <file> ┬┴──
         │ ├─ e ─┤ write truncated │                       └────────┘
         │ │     │ lines to stderr │
         │ └─ t ─┘ truncation OK   │                       3/27/88
         └─────────────────────────┘

First there may be optional arguments specifying various GDT options. Then there is the required marker byte, the first occurrence of which in each line read determines the horizontal shifting of that line. Next, there is the required column specification. Then come the filenames, which can be wildcarded. If no filenames appear, GDT gets its input from stdin.

The arguments can optionally be enclosed in double quotes. The enclosing quotes are stripped off and not seen by GDT. Should you need to have a double quote within an argument, the argument must be double quoted and the internal double quote must be escaped by preceding it with a backslash. This treatment of the double quotes is done by the argc/argv mechanism of the C compiler. [GDT is implemented in Borland Turbo C.]
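
The shifting rule stated at the top of this document can be illustrated with a short Python sketch. It is not GDT's source (GDT is written in Turbo C). The function name align_line and its parameters are hypothetical, and two points are assumptions rather than documented behavior: that "truncation" means dropping non-blank characters from the left of a line (leading blanks may be consumed freely), and that a deleted marker lets the following text close up into the marker's column.

```python
def align_line(line, marker, col, truncate=False, delete=False):
    """Shift one line so its first `marker` lands in 1-based column `col`.

    Returns (shifted_line, aligned). `aligned` is False when alignment would
    have required truncation that was not permitted.
    """
    pos = line.find(marker)
    if pos < 0:
        pos = len(line)                     # no marker: act as if it sat just past the end
    elif delete:
        line = line[:pos] + line[pos + 1:]  # assumed /d behavior: text closes up
    shift = (col - 1) - pos
    if shift >= 0:
        return " " * shift + line, True     # shift right by padding with blanks
    need = -shift                           # characters to remove on the left
    if truncate:
        return line[need:], True            # assumed /t behavior: drop them regardless
    blanks = len(line) - len(line.lstrip(" "))
    drop = min(need, blanks)                # stop just short of losing text
    return line[drop:], drop == need
```

For example, align_line("47.23", ".", 10) pads the line with seven blanks so that the period falls in column ten, while align_line("1923.50", ".", 3) returns the line unchanged and flags it as not aligned.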
GDT exits with ERRORLEVEL set to one if some lines needed to be truncated--but weren't because there was no /t option. In that case the produced alignment is incorrect for those lines. The ERRORLEVEL is set to zero when no truncation problems arose, and to two for syntax problems.

OPTION ARGUMENTS:

GDT options are specified by the presence or absence of various option letters in option arguments. Any option argument must begin with a slash and must appear in front of the other kinds of arguments. Any number of legal option letters can appear in an option argument. Thus, you can have multiple option arguments, perhaps each with a single option letter (and each beginning with a slash), or just one option argument containing all of the option letters desired. CURRENTLY, ALL LEGAL GDT OPTION LETTERS ARE lower case. For the sake of readability, the above syntax diagram shows only the case where all option letters appear in one option argument.

A GDT syntax error occurs when illegal option letters appear, or when required arguments are missing or invalid. Both the <marker> and <col> arguments are required. The marker must be entered as a single byte, as \d..d, or as \xh..h--where the d's are decimal digits, and the h's are hexadecimal digits.

When a syntax error occurs GDT prints a boxed syntax diagram containing terse instructions on how to use GDT. This mechanism can be deliberately tripped in order to get on-screen help. The suggested way is to invoke GDT with no arguments--thus causing the no-<marker> syntax error.

The /d option specifies that the markers are to be deleted from the shifted lines. The marker is searched for from left to right. The first one found is used for alignment. Any further markers are ignored and remain as text.

The /e option is used for error reporting. Lines that could not be aligned without truncation are copied to stderr.

The /t option freely permits truncation. No errors are reported, and the ERRORLEVEL will not be one.
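
The three accepted spellings of the <marker> argument (a single byte, \d..d, or \xh..h) can be made concrete with a small Python sketch. The function name parse_marker is hypothetical. Rejecting values above 255 and accepting only a lower-case x are assumptions, since this document does not say how GDT treats those cases.

```python
import string


def parse_marker(arg):
    """Return the marker's byte value (0..255) or raise ValueError."""
    if len(arg) == 1:
        value = ord(arg)                    # a single literal byte, e.g. "." or "@"
    else:
        if arg.startswith("\\x"):           # \xh..h : hexadecimal digits
            digits, base, allowed = arg[2:], 16, string.hexdigits
        elif arg.startswith("\\"):          # \d..d : decimal digits
            digits, base, allowed = arg[1:], 10, string.digits
        else:
            raise ValueError("marker must be one byte, \\d..d, or \\xh..h")
        if not digits or any(c not in allowed for c in digits):
            raise ValueError("bad digits in marker: %r" % arg)
        value = int(digits, base)
    if value > 255:                         # assumption: a marker is exactly one byte
        raise ValueError("marker value out of range: %r" % arg)
    return value
```

With this sketch the arguments @, \64 and \x40 all name the same marker byte, the at-sign.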
EXAMPLES:

Let the file AMOUNTS contain the following lines:

   47.23
   1923.50
   .57
   32768
   1.27
   4.12
   588.90

The indentation shown is to distinguish the displayed contents from the explanatory text. I.e., each line in AMOUNTS starts in column one without any leading blanks. A copy of this data with the decimal points aligned in, say, column ten is obtained via

   gdt . 10 AMOUNTS

Here the marker is a period, and the alignment column is 10. The results will come out on the screen, i.e., stdout, as follows:

          47.23
        1923.50
            .57
       32768
           1.27
           4.12
         588.90

The following material

   Julius @Caesar
   Charles @De Gaulle
   Hernando @de Soto
   Ferdinand Victor Eugene @Delacroix
   Office of the @Emperor
   Kublai @Khan
   @Office of the President
   Aleksandr Sergeyevich @Pushkin
   Baron von @Richthofen
   @Royal Canadian Air Force
   Bachelor of @Theology
   Vincent @van Gogh

when filtered through GDT via

   gdt/d @ 30

comes out as

                         Julius Caesar
                        Charles De Gaulle
                       Hernando de Soto
        Ferdinand Victor Eugene Delacroix
                  Office of the Emperor
                         Kublai Khan
                                Office of the President
          Aleksandr Sergeyevich Pushkin
                      Baron von Richthofen
                                Royal Canadian Air Force
                    Bachelor of Theology
                        Vincent van Gogh

Now you may wonder how the at-sign markers got into the file, how the material got sorted, etc. Much of this can be done by FP.EXE and CHP.EXE (which are pattern utilities). Suppose we start with NAMES, an unsorted file of the material to be massaged:

   ---------- NAMES
   [1]Charles @De Gaulle
   [2]Kublai Khan
   [3]Bachelor of Theology
   [4]Aleksandr Sergeyevich Pushkin
   [5]@Office of the President
   [6]Office of the Emperor
   [7]Julius Caesar
   [8]Vincent @van Gogh
   [9]@Royal Canadian Air Force
   [10]Ferdinand Victor Eugene Delacroix
   [11]Hernando @de Soto
   [12]Baron von Richthofen

We assume here that most names are to be sorted by the last word first. Therefore, at-sign markers appear only where needed by exception (in lines 1, 5, 8, 9 and 11). GDT is used at the end of the processing.
The required processing is:

   ■ Separate out the lines with the at-signs,
   ■ For the other lines, precede the last word with the at-sign mark,
   ■ Recombine the lines just automatically marked with the others,
   ■ For each line, put the mark and following bytes in front, followed by a new temporary mark (here a smiley face), followed by the stuff before the mark,
   ■ Sort,
   ■ Rearrange each line in its former order (removing the smiley faces),
   ■ Finally, via GDT, align the mark on the chosen column.

This can be done automatically via the following batch file:

   ---------- Z.BAT
   [1]:split off already marked names to temp file,
   [2]: and, at the same time,
   [3]:take unmarked names, precede last word with mark,
   [4]:then append to temp file
   [5]fp/vou (tmp) @ names|chp "\( ?\)\([a-z]+\)$" \1@\2>>(tmp)
   [6]:put the mark and following stuff in front,
   [7]:followed by a smiley face, followed by stuff before mark,
   [8]:then sort, then rearrange,
   [9]:then align at-sign mark on column 30
   [10]chp \(.*\)\(@.*\) \2☺\1 (tmp) | sort | chp \(.*\)☺\(.*\) \2\1 |gdt/d @ 30

Comments and explanations follow:

Note that the /u option of FP is used below. You will need a version of FP.EXE dated 4/01/88 or later for this option to be in FP.

   [5]fp/vou (tmp) @ names|chp "\( ?\)\([a-z]+\)$" \1@\2>>(tmp)

Here FP (for Find Pattern) first writes to the file (TMP) those lines in the file NAMES that contain an at-sign. These lines are the ones that are not selected, and are written because of the /u option. At the same time FP selects those lines that do NOT have an at-sign. (The /v option inverts the selection, and the /o option omits the filename line--which would be ---------- NAMES.)

Those lines are piped to CHP (CHange via Pattern). The pattern consists mainly of two backslash-parenthesized patterns. The first matches zero or one blank. The second matches one or more letters (irrespective of case). The ending dollar sign matches end of line. Thus \2 is the last word, and \1 is the preceding space.
(There would be no space if a line had just one word on it.) The result consists of the blank (if there was one), the at-sign, and the last word. What about the material to the left of the rightmost word? CHP writes out intact all unmatched material. This is then appended (because of the >> redirection) to the temp file that already contains the lines that had at-signs. Now every line has an at-sign in front of the major-sort part.

The [a-z] character set would have to be enlarged if any of the names were hyphenated or contained apostrophes.

   [10]chp \(.*\)\(@.*\) \2☺\1 (tmp) | sort | chp \(.*\)☺\(.*\) \2\1 |gdt/d @ 30

Now, before sorting, the major-sort part needs to be put in front. Therefore, the first CHP--reading from the temp file--puts all the material preceding the at-sign into \1, and puts the at-sign and all following bytes into \2. The result is to be the two parts interchanged with a smiley face in between. The smiley face is inserted so that the end of the last name (in case of multiple words there) can be found later. This is piped to SORT whose stdout is, in turn, piped to the second CHP. There, the two parts are interchanged again and the smiley face is removed. Finally GDT does its thing.

You may wonder about the smiley face used temporarily to mark the end of the major-sort part. Other characters could have been used, but this temporary marker should be lexicographically below a blank in order to sort correctly.
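
For readers without FP, CHP and SORT at hand, the marking, swapping and sorting steps described above can be approximated in Python. This is only a sketch of the idea, not a replacement for Z.BAT. All function names are hypothetical. The temporary marker is taken to be the byte 0x01 (the smiley face in the IBM PC character set), and the sort is assumed to ignore letter case, which is what reproduces the ordering shown in this document.

```python
import re

MARK = "@"
TEMP = "\x01"   # assumed smiley-face byte; it sorts below a blank (0x20)


def mark_last_word(line):
    """Put the mark in front of the last word unless the line is already marked."""
    if MARK in line:
        return line
    return re.sub(r"( ?)([A-Za-z]+)$", r"\1" + MARK + r"\2", line)


def to_sort_form(line):
    """'before@after' becomes '@after' + TEMP + 'before'."""
    before, _, after = line.partition(MARK)
    return MARK + after + TEMP + before


def from_sort_form(line):
    """Undo to_sort_form and drop the temporary marker."""
    key, _, before = line.partition(TEMP)
    return before + key


def sort_names(lines):
    """Return the marked lines ordered by the text that follows the mark."""
    swapped = [to_sort_form(mark_last_word(line)) for line in lines]
    # Assumption: case is ignored when comparing, as the document's result implies.
    return [from_sort_form(s) for s in sorted(swapped, key=str.upper)]
```

Applied to the twelve lines of NAMES, sort_names yields the marked list shown earlier, beginning with "Julius @Caesar" and ending with "Vincent @van Gogh". That list is then ready for the final alignment step, gdt/d @ 30.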