- Introduction.
- Index of Demonstration Programs.
- A Bare Minimum Program.
- Establishing Communication.
- Reading and Writing Files.
- Families of Files.
- Further Examples.
- :Introduction.
-
- The purpose of CNVPRG.HLP is to introduce the programming language Convert,
- a pattern-directed language whose commands are written as examples, in a
- style typified by: "if you see this, then do that..." This is not a HELP
- file from first principles, however, but a slightly more advanced one. Another
- file, CNVRT.HLP, which outlines the language, should be consulted first. This
- file shows how to construct programs, beginning with a totally trivial
- example showing how to organize a file containing a Convert program. However,
- an editor can elaborate even this program into something more useful. It and
- some of the other programs to be taken up are good seed programs.
-
- The first step is to consider input and output. Although Convert has a default
- exchange between the console and the program, most programs will work with
- disk files, and it is necessary to know what facilities are available to
- use the disk. But first, we show a program which interacts exclusively with
- the console. Even when disk files are involved, console interaction can always
- play a part in the program.
-
- The next step is simple reading and writing involving just a single file in
- each of these two activities.
-
-
- Going on from individual files, there is frequent occasion to work with
- families of files, using MS-DOS's wildcard conventions. This gives us an
- opportunity to use some of the function skeletons which communicate with
- MS-DOS to execute advanced directory operations. It is also an opportunity to
- show how a list of tasks can be built up, which will be attended to one by one.
-
- Finally, a few moderately complicated examples are shown, which are quite
- convenient utilities in their own right.
-
- These programs are all relatively straightforward, in that they do not
- involve any but the simplest pattern matching and searching. In one of
- the sample programs, SENTEN.CNV, there is an exception when the definition
- of a "sentence" in terms of Convert is undertaken. Typesetters and writers
- use constructions just slightly beyond the definitions given; but we do not
- pursue the matter in any detail.
-
- :Index of Demonstration Programs.
-
- Program Section Panel
- ------- ------- -----
-
- SAMPLE.CNV C 4
- VOWEL.CNV D 10
- COPY.CNV E 1
- SENTEN.CNV E 4
- PYP.CNV E 9
- PAK.CNV F 3,4
- UPAK.CNV F 7
- BORRA.CNV F 9
- FIND.CNV G 2,3,4
- KWIK.CNV G 6
- BINCOM.CNV G 8,9
-
- :A Bare Minimum Program.
-
- The simplest possible Convert program is:
-
- (()()()())
-
- It defines no patterns.
- It defines no skeletons.
- It uses no variables.
- It does nothing.
-
- In spite of the fact that it does nothing, it IS a program, and gives us
- a way to get started. We could also write it in the form:
-
- ((
- )(
- )()(
- ))
-
- In the second form it is easy to insert more lines, since its structure as a
- quadruple is already established. The null list of variables can't contain any
- spaces, carriage returns, or line feeds, so it is sandwiched on the third line.
-
- To compile and execute a program we must place it in a disk file. From the
- very beginning we should follow good programming habits and document our
- program. Its file needs a name, which could be the name of the program itself.
- Since we are all forgetful, especially after we have dozens of disks lying
- around, it is a good idea to place the name of the file at the very beginning
- of the file. That way, it will always show up on listings; we can also peek at
- the file with a TYPE in MS-DOS, and if names, dates, revision numbers, comments
- and the like are at the front of the file they can be scanned rapidly.
-
- If a program ends with the word "end" it will be evident that something has
- happened to the file if this final remark turns up missing.
-
- Without being very original, SAMPLE.CNV could be the name of a practice file,
- containing the following text:
-
- [SAMPLE.CNV]
- [A. Programmer, 15 March 1984]
- [A sample of Convert programming]
- (()()()())
- [end]
-
- The square brackets make the text they enclose into comments for REC.
-
- [SAMPLE.CNV]
- [A. Programmer, 15 March 1984]
- [A sample of Convert programming]
-
- [This panel shows a complete program. Comments, such as this one, can be
- used liberally to explain and document a program. Although defaults are
- provided, CONVERT.REC expects to find the following four components of the
- program, in order:
-
- 1. file name (as a comment)
- 2. special comments (Include or Exclude)
- 3. the first subroutine (or the main program, if there
- are no subroutines)
-
- More blank lines may be present, and any number of additional comments.]
-
- [main program]
- (()()()())
-
- [end]
-
- :Establishing Communication.
-
- There might be programs that do not require input or produce output; think of a
- program that tests memory, which simply runs until it fails. Otherwise, sources
- and destinations must be established; then the program can receive information
- and record the results of its computation. If disk files are going to be used,
- MS-DOS places the names of such disk files on the command line. If no files are
- specified, it is a natural assumption that the console is to be used.
-
- A Convert program will be executed by typing a command line similar to
-
- REC86 SAMPLE D:FILE.EXT
-
- because REC allows the command line tail to be passed along to the program
- which it is going to load and execute. The information D:FILE.EXT will be
- forwarded to the workspace during the initialization which accompanies every
- Convert program. Of course, there may be more than one argument and the
- arguments need not be file names. Any lowercase letters in the argument
- list, however, are received by REC as uppercase letters.
-
- To complete the exchange between the program and its environment, once the
- program has finished, anything left in the workspace is typed at the console.
-
- To see how this works, suppose that we use the file SAMPLE.CNV containing the
- null main program (()()()()). Since it does nothing, it will end up by typing
- whatever it finds in its workspace. We can try it with various command lines,
- to see what happens.
-
-
- 1) A>rec86 sample file.ext
-
- FILE.EXT
- A>
-
-
- 2) A>rec86 sample
-
-
- A>
-
-
- Note in the first trial that COMMAND.COM has converted lower case letters to
- upper case. The second trial shows the null argument list being output as
- a blank line.
-
-
- If the program is to use the file which has been designated, that file has to
- be opened, used, and closed; this means that we are going to have to include
- some substance in the program. First, the disk and filename can be bound to
- variables while ignoring the extension. A suitable program would be:
-
- [main program]
- (()()(8 9)(
- (<8>:<9>(or, ,.,<>),<<
- >>(%Or,<8>:<9>.OLD)<<
- >>(%Ow,<8>:<9>.NEW)<<
- >>(a)<<
- >>);
- (<9>,@:<9>):
- ))
-
- This program uses only one filename, but the input file will be distinguished
- from the output file because the former has the extension OLD, the latter the
- extension NEW. The as yet undefined function "a" will do the processing of the
- file. A detail which should be noted is the way in which we make sure that the
- variable <8> has a value to which to bind. Note further that the filename can
- terminate with a space, a dot, or the end of the workspace text.
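-
- As a rough illustration only, the binding performed by the two rules above
- can be mimicked in Python (this is not Convert). The name split_tail is
- hypothetical, and reading <8> as "everything up to the first colon" is an
- assumption about how the pattern matches.

```python
def split_tail(tail, placeholder="@"):
    """Return (drive, name) from a command tail such as 'D:FILE.EXT'."""
    # Second rule: with no colon in the text, prefix "@:" and try again,
    # so that the drive variable always has something to bind to.
    if ":" not in tail:
        tail = placeholder + ":" + tail
    drive, rest = tail.split(":", 1)
    # First rule: the name ends at the first space or dot,
    # or at the end of the text if neither occurs.
    end = len(rest)
    for stop in (" ", "."):
        pos = rest.find(stop)
        if pos != -1:
            end = min(end, pos)
    return drive, rest[:end]


print(split_tail("D:FILE.EXT"))   # ('D', 'FILE')
print(split_tail("FILE.EXT"))     # ('@', 'FILE')
```

- The extension is discarded in both cases, which is what lets the program
- attach its own OLD and NEW extensions to the same name.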
-
-
- A slightly more elaborate main program is desirable. Since it is likely that
- the input file will be read by many different rules throughout the program,
- the single skeleton (R) can be defined once and for all in the main program
- to do this reading. We can also prime the function "a" with an initial line.
-
- [main program]
- (()(
- ((%r,<8>:<9>.OLD)) R
- ((%W,<8>:<9>.NEW,<=>(^MJ))) W
- )(8 9)(
- (<8>:<9>(or, ,.,<>),<<
- >>(%Or,<8>:<9>.OLD)<<
- >>(%Ow,<8>:<9>.NEW)<<
- >>(a,(R))<<
- >>);
- (<9>,@:<9>):
- ))
-
- We have also included function W, in which the skeleton <=> represents the
- argument given in references to W; for instance, (W,end) writes on the output
- file a line consisting of the word "end" and the control characters CR and LF
- (^M and ^J).
-
- Most programs will make automatic use of disk files whose names are derived
- from the command line, so the main program in each case will be similar to
- the one shown in the last panel. Typically, it will open the files to be used,
- execute an auxiliary program, and then close the files that it opened. Some
- general purpose reading and writing skeletons may be defined at this level,
- which is also the level at which disk assignments and generic file names can
- be bound. The choice of high numbers like 8 and 9 for these variables is simply
- a personal choice which leaves the low numbers free for use in other programs
- which will occupy the same file.
-
- If the program being prepared is to work with a family of files, supposing that
- an ambiguous file name were given on the command line, additional programming
- will be required to trace down all files in the directory which correspond to
- the ambiguous name and save references to them for later use in the program.
-
- Convert allows access to the "standard input" and "standard output" files
- by specifying the null string as the file name. Thus a program may read its
- input from a file or the keyboard depending on whether or not standard input
- redirection is specified on the command line; a similar situation holds for
- the standard output, which goes to the screen unless redirected. If keyboard
- input and screen output are needed regardless of redirection, (%r,TTY:) and
- (%W,TTY:,...) or (%t,...) will serve.
-
- To explore the uses of the console as a default "file" consider the following:
-
- [VOWEL.CNV]
- (()()()(
- (stop,goodbye!);
- ((or,A,E,I,O,U),(%t,VOWEL)(%r)):
- (,(%t,other)(%r)):
- ))
-
- This program types out a comment according to the initial letter of whatever
- is typed in response to the prompt at the console. It requires a lower case
- "stop" to terminate the program. In the combination (%t,....)(%r), %t erases
- its argument, so it leaves a clean workspace into which is inserted the
- response following the next prompt.
-
- This program requires no variables because it does not use any portion of the
- workspace, as dissected by a variable-containing pattern, in creating the new
- contents of the workspace. Generally, variables are not required when all the
- rules of a set are of the form (recognition, response). A program which made
- substitutions from a table, or classified intervals, would not use variables.
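-
- The (recognition, response) form can be pictured as a table consulted from
- the top down. The Python sketch below is only an analogy to the first version
- of VOWEL.CNV: it assumes that a pattern is tested against the beginning of
- the workspace, and it replaces the console by a list of replies. RULES and
- run are hypothetical names.

```python
# Each entry: (test on the workspace, message typed, repeat the rule set?)
RULES = [
    (lambda ws: ws.startswith("stop"), "goodbye!", False),            # rule ends in ;
    (lambda ws: ws[:1] in ("A", "E", "I", "O", "U"), "VOWEL", True),  # rule ends in :
    (lambda ws: True, "other", True),                                 # null pattern
]


def run(replies):
    """Feed console replies to the rule set and return the messages typed."""
    typed = []
    pending = iter(replies)
    workspace = ""                       # a null command tail
    while True:
        # The first rule whose test succeeds is the one applied.
        for matches, response, repeat in RULES:
            if matches(workspace):
                break
        typed.append(response)
        if not repeat:
            return typed
        # Running out of replies is treated here like typing stop.
        workspace = next(pending, "stop")


print(run(["Apple", "b", "stop"]))   # ['other', 'VOWEL', 'other', 'goodbye!']
```

- No part of the reply is carried into the response, which is why the table
- needs no variables.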
-
-
-
- Let us try this same program again, using %W instead of %t. For the sake of
- variety, let us also mention the console specifically using (%r,TTY:) for (%r).
-
- [VOWEL.CNV]
- (()()()(
- (stop,goodbye!);
- ((or,a,e,i,o,u),(%W,,vowel)(%r,TTY:)):
- ((or,A,E,I,O,U),(%t,VOWEL)(%r)):
- (,(%W,,other)(%r,)):
- ))
-
- Note the following details:
-
- 1) "goodbye!" does not need %T because it is the last thing placed
- in the workspace, to be typed as the program exits to MS-DOS.
-
- 2) %W is followed by TWO commas because we have to distinguish the
- message it will type from the file name; by definition the standard
- output is not spelled out by name, but we have to show up its absence
- somehow. Notice that %W will not add CR,LF, as %t does.
-
- 3) (%r,) works; but is redundant, the same as (%r).
-
- Suppose that we make a hasty copy of this program -- including only minimal
- comments -- and give it a trial run. The following (annotated) transcript
- might result:
-
- A>rec86 convert vowel ;first compile VOWEL.CNV
- ... ;CONVERT.REC will output something here
- A>rec86 vowel ;now execute VOWEL.REC
- other ;this happens in reply to the null workspace
- > a<CR>vowel ;<CR> moves cursor, reply overwrites "> a"
- > A<CR> ;<CR> denotes the user's CR
- VOWEL ;reply on new line
- > b<CR>other ;different response, actually overwrites "> b"
- > stop<CR> ;time to quit
- goodbye! ;acknowledgement
- A> ;back to MS-DOS
-
- The treatment of a and A was different -- %W types what it sees, and if you
- want a CR,LF you have to put one in, say as (^MJ). The prompt showed up on a
- new line because %r always prefaces the prompt with a CR,LF. %t does the same
- because it is intended for debugging or for message transmission direct to the
- console, where it is a good idea to start everything off on a new line.
-
-
- When a program is running, one often forgets how to stop it, or even what
- kind of data it is expecting. The purpose of a null-argument message is to
- supply this kind of information. A rule (<>,(%t,...message...)(%r,TTY:)):
- may be included; this will display a message whenever the workspace is
- completely empty, and will read a line from the keyboard to be acted upon
- by the rule set.
-
- A good null-argument rule for this program would be
-
- (<>,(%t,To identify upper and lower case vowels...(^MJ)<<
- >>type any character, either shifted or regular.(^MJ)<<
- >>Type stop to quit.)(%r)):
-
- Note the CR-LF pairs inserted in the argument to %t; if not present, the
- argument would be output as a single, long line.
-
- With practice one begins to pick up little formatting details. For example,
- the carriage return which terminates console input is echoed without a line
- feed, so a reply would fall on the same line as the prompt; knowing this, we
- can supply the line feed in our own output.
-
- The next panel shows a finished version of VOWEL.CNV.
-
-
- [VOWEL.CNV]
- [Harold V. McIntosh, 16 March 1984]
- [Rev. for MS-DOS by G. Cisneros, 1.10.1990]
-
- [main program]
- (()()()(
- (<>,(%t,To identify upper and lower case vowels...(^MJ)<<
- >>type any character, either shifted or regular.(^MJ)<<
- >>Type stop to quit.)(%r)):
- (stop,goodbye!);
- ((or,a,e,i,o,u),(%W,,(^J)vowel)(%r,TTY:)):
- ((or,A,E,I,O,U),(%t,VOWEL)(%r)):
- (,(%W,,(^J)other)(%r,)):
- ))
-
- [end]
-
- :Reading and Writing Files.
-
- Once the basics of transmitting the MS-DOS command line to a Convert program
- have been mastered and one has prepared a seed program, it can be copied to
- start a new program. A good place to begin is with a copying program, which
- is not all that useful, but which IS simple.
-
- [COPY.CNV]
- (()()()( ((^Z),); (,(W,<=>)(R)): )) a
-
- [main program]
- (()(
- ((%r,<8>:<9>.OLD)) R
- ((%W,<8>:<9>.NEW,<=>(^MJ))) W
- )(8 9)(
- (<8>:<9>(or, ,.),<<
- >>(%Or,<8>:<9>.OLD)<<
- >>(%Ow,<8>:<9>.NEW)<<
- >>(a,(R))<<
- >>);
- (<9>,@:<9>):
- ))
-
- There are fine points to be perceived in the program of the preceding panel.
-
- 1) The program "a" is written on one line; it is harder to read but
- since it is quite short, it is nicer to save the space.
- 2) %r will read the block of information which corresponds to it, one
- single line - delimited by but not including its CR,LF - unless
- formatted reading has been requested. %r will NEVER deliver a ^Z unless
- it is the first character delivered. (Well, almost NEVER. A formatted
- read could include a ^Z, but that could also be considered poor
- programming practice.) In non-pattern-directed reads, %r ALWAYS
- returns ^Z upon exhausting the file, and on any further read attempts.
- 3) The terminal rule in "a" leaves a null workspace, not one containing
- ^Z. For users of certain brands of terminals, this avoids a
- disconcerting flash on the screen as the final workspace is typed
- on the console prior to returning to the operating system.
- 4) The pair (W,<=>)(R) in "a" could be (R)(W,<=>) instead since each
- function has its private workspace. However, (W,<=>)(R) uses less
- total space. Note the use of the "same" skeleton <=>, whose value
- is the workspace contents (the argument to which the rule applied).
- 5) Note also the <=> used in skeleton W defined in the main program,
- where its value is whatever argument string W is invoked to operate
- on.
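-
- Points 2 and 3 can be seen in a small Python analogy of the copying loop.
- It is a sketch under stated assumptions, not a rendering of REC's actual
- buffering: the file is a string in memory, and make_reader and copy are
- hypothetical names.

```python
EOF_MARK = "\x1a"    # ^Z
CRLF = "\r\n"


def make_reader(text):
    """Return a function giving one line per call, without its CR,LF,
    and ^Z once the text is exhausted and on every later call."""
    lines = text.split(CRLF)
    if lines[-1] == "":
        lines.pop()              # no empty "line" after the final CR,LF
    source = iter(lines)
    return lambda: next(source, EOF_MARK)


def copy(text):
    """Copy text line by line in the manner of function a in COPY.CNV."""
    read = make_reader(text)
    written = []
    workspace = read()                          # (a,(R)) primes the loop
    while not workspace.startswith(EOF_MARK):   # ((^Z),); ends with nothing left
        written.append(workspace + CRLF)        # (W,<=>) adds CR,LF to the line
        workspace = read()                      # (R) fetches the next line
    return "".join(written)


print(repr(copy("one\r\ntwo\r\n")))   # 'one\r\ntwo\r\n'
```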
-
- Since simple copying of a file is easy enough to do with COPY, it might be a
- bit more interesting to look at programs which are capable of fancier maneuvers
- than that. First of all, when working with written text, the sentence is a much
- more natural unit than a line; indeed the discrepancy between the two accounts
- for much of the complexity involved in "word processors."
-
- What is a sentence? Traditionally, it begins with a capital letter and ends
- with a period; the period is the more important of the two. However, there
- are a few exceptions -- quoted periods, triple dots sometimes used to express
- continuation, the period that goes inside the quoted expression which lies at
- the end of a sentence. Starting with its beginning, a sentence is recognized by
-
- <-->.
-
- but we can incorporate the exceptions by making a series of definitions:
-
- [non-terminal] ((or,(and,<[1]>,(nor,.,<'>,<">)),..(ITR,.))) q
- [balanced quote] ((ITR,(or,<:q:>,<"><:r:><">,<'><:r:><'>))) r
- [sentence] (<:r:>(or,.,<"><:r:>.<">,<'><:r:>.<'>)) s
-
- We still have to filter out things like captions and section numbers, but <:s:>
- is a certain approximation to a sentence recognizer.
-
- The following program ought to read the file named on the command line, and
- type it out sentence by sentence on the console.
-
- [SENTEN.CNV]
- (()()(0 1)(
- (<0>(^Z),<0>);
- ( <0>,<0>):
- (<0> (ITR, )<1>,<0> <1>):
- (<0>(^MJ)<1>,<0> <1>):
- (,(%t,<=>)(R)):
- )) a
-
- [main program]
- ((
- ((or,(and,<[1]>,(nor,.,<'>,<">)),..(ITR,.))) q
- ((ITR,(or,<:q:>,<"><:r:><">,<'><:r:><'>))) r
- (<:r:>(or,.,<"><:r:>.<">,<'><:r:>.<'>)) s
- )(
- ((%r,<8>,<:s:>)) R
- )(8)(
- (<8>,(%Or,<8>)(a,(R)));
- ))
-
- When a file containing the program of the last panel is prepared and compiled,
- its execution reveals several oversights.
-
- 1) Not all sentences end with a period - exclamation points and question
- marks, and sometimes three dashes, are also terminators.
-
- 2) As written, the provision for singly or doubly quoted expressions
- does not foresee their nesting with alternate parity.
-
- 3) What programmers take for a single quote is an ASCII accent; ASCII
- doesn't have an apostrophe, so the accent is used for that too!
-
- 4) Abbreviations, especially initials in proper names, are followed
- by periods. Beware the division FILE.EXT in MS-DOS file names.
-
- 5) Tabular material, formulas, and program examples don't show periods.
- Inserts may have periods of their own - decimal points for example.
- Paragraph numbers, captions, and headers are all non-sentences.
-
-
-
-
-
- The foregoing sequence, containing an attempt at a sentence recognizer, shows
- two contradictory aspects of Convert programming. On the one hand, Convert has
- the power to give a quick description of natural characteristics of text. On
- the other hand, we see that natural language is subtly beyond any short and
- simple analysis. If we strive for perfection, it will elude us; but if we
- settle for a cursory solution of a casual problem we will fare much better.
-
- In the case of a sentence recognizer, we will do pretty well just picking out
- periods, and slightly better with periods followed by spaces or CR,LF's.
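-
- The cursory rule just mentioned, a period followed by a space or CR,LF, is
- easy to try out. The following Python sketch uses a regular expression rather
- than a Convert pattern, so it is an analogy only; the name sentences is
- hypothetical, and treating a period at the very end of the text as a
- terminator is an added assumption.

```python
import re

# A period ends a sentence only when a space, CR, or LF follows it,
# or when nothing at all follows it.
SENTENCE = re.compile(r".*?\.(?=[ \r\n]|$)", re.DOTALL)


def sentences(text):
    """Return the sentences found in text, trimmed of surrounding whitespace.
    Text after the last terminator is not returned."""
    return [match.group().strip() for match in SENTENCE.finditer(text)]


sample = "See FILE.EXT now. It costs 3.5 units.\r\nDone"
print(sentences(sample))   # ['See FILE.EXT now.', 'It costs 3.5 units.']
```

- File names and decimal points survive because their periods are followed by
- a letter or digit; abbreviations followed by a space would still be split.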
-
- To continue surveying simple copying programs, consider some frequent tasks
- which CP/M's PIP can perform, and how even more general movements could be
- achieved. Convert contains some "character arithmetic" functions which were
- placed there to allow certain kinds of copying.
-
- &u - make uppercase
- &l - make lowercase
- &a - zero parity bit (CP/M's convention for ASCII)
- &s - set parity bit (used by some editors)
-
- The functions in the & family process a character string of arbitrary length;
- the easiest way to use them is line by line until the end-of-file comes up.
-
- There are further functions in the & family; &h would be useful for generating
- hexadecimal dumps from binary program files because it replaces each byte in
- its argument string by a two-byte printable ASCII equivalent using hexadecimal
- "digits." Individual functions of the & family could be incorporated in the
- COPY.CNV example of a previous panel just by modifying the definition of the
- skeleton W:
-
- ((%W,<8>:<9>.NEW,(&u,<0>)(^MJ))) U
- ((%W,<8>:<9>.NEW,(&l,<0>)(^MJ))) L
- ((%W,<8>:<9>.NEW,(&a,<0>)(^MJ))) A
- ((%W,<8>:<9>.NEW,(&h,<0>)(^MJ))) H
-
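- For readers who want to check what each character function does to a string,
- here is a Python analogy, offered as a sketch only. It takes the "parity bit"
- to be the high-order bit of each byte, and it assumes that &h produces
- uppercase hexadecimal digits; the function names are hypothetical.

```python
def upper(data):          # &u - make uppercase
    return data.upper()


def lower(data):          # &l - make lowercase
    return data.lower()


def zero_parity(data):    # &a - clear the high-order bit of every byte
    return bytes(b & 0x7F for b in data)


def set_parity(data):     # &s - set the high-order bit of every byte
    return bytes(b | 0x80 for b in data)


def hex_pairs(data):      # &h - two printable hexadecimal digits per byte
    return data.hex().upper().encode("ascii")


print(upper(b"Convert 86"))        # b'CONVERT 86'
print(zero_parity(b"\xc1\x42"))    # b'AB'
print(hex_pairs(b"\x1a\xff"))      # b'1AFF'
```

- Note that the output of hex_pairs is twice as long as its input, which is why
- a dump is best made from short fixed-length blocks rather than whole lines.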
- Rather than having five special purpose programs, let us think of how to
- incorporate all five options into a single program. There are two evident
- choices:
-
- 1. Incorporate the option in the command line tail
- and parse it.
- 2. Solicit the option from the console.
-
- The latter is likely to be the more instructive; it also leaves open the
- possibility that a command line argument would be a sort of batch file.
-
- [PYP.CNV]
- [Harold V. McIntosh, 16 March 1984]
-
- [A Convert program exhibiting some of the characteristics of PIP.COM]
-
- [option] (()()(0)((X,(^Z)); (<0>,(%t,In file )(b,(Q))); )) a
- [input file] (()()(1)((<1>,(%t,Out file )(c,(Q))(%C,<1>)); )) b
- [output file] (()()(2)((<1>,); (<2>,(%Or,<1>)(%Ow,<2>)(d,<0>)(%C,<2>)); )) c
- [choose] (()( ((%r,<1>)) R)()( (C,(e,(R))); (U,(f,(R))); (L,(g,(R)));
- (A,(h,(%r,<1>,<[128]>))); (H,(i,(%r,<1>,<[16]>))); )) d
- [copy] (()()(0)( ((^Z),); (<0>,(%W,<2>,<0>(^MJ))(R)): )) e
- [upper] (()()(0)( ((^Z),); (<0>,(%W,<2>,(&u,<0>)(^MJ))(R)): )) f
- [lower] (()()(0)( ((^Z),); (<0>,(%W,<2>,(&l,<0>)(^MJ))(R)): )) g
- [ascii] (()()(0)( (<>,); (<0>,(%W,<2>,(&a,<0>))(%r,<1>,<[128]>)): )) h
- [dump] (()()(0)( (<>,); (<0>,(%W,<2>,(&h,<0>)(^MJ))(%r,<1>,<[16]>)): )) i
- [loop] (()()()( ((^Z),);
- (,(%t,Option? <<
- >><(>c/copy, u/upper, l/lower, a/zero parity, h/hex dump<)>)<<
- >>(a,(Q))): )) x
- [main] (()( ((&u,(%r,(&u,<9>)))) Q)(9)( (<9>,(%Or,(&u,<9>))(x)(%E)); ))
-
- [end]
- -
- This program is rather densely packed to make it fit in a single panel, but
- its structure is quite straightforward.
-
- main place batch file or TTY: in <9>, open if necessary, call x
-
- x loop: solicit option, call a, quit for ^Z
-
- a bind option to <0>, solicit input file, call b
- but return immediately for option X
-
- b bind input file name to <1>, solicit output file, call c
-
- c bind output file to <2>, open input and output, call d
-
- d call e, f, g, h, i according to option selected
-
- others repeat the appropriate action until end-of-file
- e - option C - simple copy
- f - option U - make uppercase
- g - option L - make lowercase
- h - option A - remove parity bit
- i - option H - hexadecimal dump
-
- Commentary regarding the program PYP.CNV:
-
- 1. A null file command line will give us the opportunity to define
- a "batch" file, or to interact through the keyboard if we give a
- null response. A file given on the command line defines a "batch"
- file, whose lines should contain the expected keyboard response.
-
- 2. The program is illustrative, not foolproof; little is done about
- possible error reports from BDOS unless BDOS itself takes over.
-
- 3. Files are closed with %C; this frees space in REC's pushdown list.
-
- 4. ASCII oriented processing terminates on a ^Z, but block processing
- ends when %r returns a null string (no more to be read).
-
- 5. If TTY: is designated as the output device, we can watch the results
- on the console screen.
-
- :Families of Files.
-
- Programming with a single input file and a single output file requires only the
- Convert functions %O, %r, %W, %C, and %E. They open and close files, and read
- and write data. Each is based on the analogous BDOS function, from which its
- operation differs only slightly. For example, opening a file for writing will create
- a previously nonexistent file, or else erase a previously existing file. When
- reading, the nonexistence of the file returns the phrase "Not Found" in the
- workspace. %E closes all open files and frees all associated storage.
-
- The read function reads one single line unless directed to read another format
- by including a pattern in its parameter list. In writing, only the contents of
- the workspace is sent to the output file. Naturally, some buffering is needed
- by these functions to make them compatible with MS-DOS.
-
- Other file handling functions are available in DOS, notably those which treat
- ambiguous file names, and allow the renaming and deleting of files. The two
- search functions, %S for "search" and %A for "search again" may be used to
- track down all the instances of an ambiguous file name at the beginning of a
- program. Then they may be read out one by one as the files they represent are
- processed. It is a good idea to save everything at once at the beginning of a
- program; this avoids the inadvertent reprocessing of a file just created.
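-
- The reason for collecting the whole list first can be shown with a Python
- analogy. This is a sketch, not a model of %S and %A: the directory is a plain
- list of names, fnmatchcase follows Python's shell-style rules (which only
- approximate the MS-DOS wildcard conventions), and snapshot and process_all
- are hypothetical names.

```python
from fnmatch import fnmatchcase


def snapshot(directory, pattern):
    """Return the names in directory that match the ambiguous name."""
    return [name for name in directory if fnmatchcase(name, pattern)]


def process_all(directory, pattern):
    """Process every matching file, creating a .NEW file for each one."""
    todo = snapshot(directory, pattern)       # list fixed before any work begins
    for name in todo:
        stem = name.split(".")[0]
        directory.append(stem + ".NEW")       # new file, itself matching A*.*
    return todo


directory = ["A1.TXT", "A2.TXT", "B.TXT"]
print(process_all(directory, "A*.*"))   # ['A1.TXT', 'A2.TXT']
print(snapshot(directory, "A*.*"))      # now also lists A1.NEW and A2.NEW
```

- Had the search been resumed after each file was processed, the newly created
- files would have turned up as candidates in their turn.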
-
- There is a fairly straightforward main program, which is shown in the HELP
- file CNVRT.HLP, which can be used to gather up all the files corresponding
- to an ambiguous file reference.
-
- The following example is slightly more complex, because it derives the name of
- an output file from the first reasonable instance of the ambiguous reference
- which it encounters. It is another variant on PIP, which has the capacity to
- join several files into a single file, as would be done by the command line:
-
- PIP UNION=A,B,C,D
-
- The variation consists in joining the files in a way that will preserve their
- individuality so that they can later be separated from one another. For binary
- files this is hard without prefacing the union with some sort of directory,
- but for ASCII files some kind of mark can be used to separate them.
-
- If the mark is ASCII text, we have to have some assurance that it will not
- occur naturally in the texts that we are going to join. For example it is
- risky to use the word end because it is a segment of render, trend, endeavour,
- and many others. Quoting it is safer, but to say that "end" was a terminator
- wouldn't work in this very file. Non-text, such as ^Z, would be safer but would
- confuse PIP or TYPE. ASCII claims that ^\ is a "file separator"; it might do.
-
- [PAK.CNV]
- [Harold V. McIntosh, 18 March 1984]
- [Make composite file from many individual files.]
-
- [transcribe file]
- (()()()( ((^Z),(%W,(P),(^\MJ)));
- (,(%W,(P),<=>(^MJ))(R)): )) a
-
- [main loop - run through files]
- (()(
- ((%r,<7><8>)) R
- )(0 8)(
- [avoid .PAK & .BAK] ((and,<-->(^@),<-->.(or,B,P)AK)<0>,<0>):
- [separate filename] (<8>(^@)<0>,<<
- [open file] >>(%Or,<7><8>)<<
- [insert its name] >>(%W,(P),[<8>](^MJ))<<
- [copy file] >>(a,(R))<<
- [close file] >>(%C,<7><8>)<<
- [go to next] >><0>):
- )) x
-
- [more ...]
-
- [form file list]
- (()()(0)( (Not Found(^@)<0>,<0>); (<[9]><0>,(%A)(^@)<0>): )) y
-
- [choose and open output file]
- (()(
- (<6>.PAK) P
- )(6)(
- [no more files] (Not Found,<=>);
- [avoid .BAK, .PAK] (<-->.(or,B,P)AK,(%A)):
- [separate filename] (<[9]><6>.,<<
- [open .PAK file] >>(%Ow,(%T,(P)))<<
- [now process list] >>(x,(y,(%S,<7><1>)(^@)))<<
- [close .PAK file] >>(%C,(%T,(P)))<<
- >>);
- )) z
-
- [main program]
- (()()(1 7)( ((and,(or,<[1]>:,)(ITR,<-->\),<7>)<1>,(z,(%S,<7><1>))); ))
-
- [end]
-
-
-
- PAK.CNV is cluttered, but still long enough to require two panels. Even so,
- it is a simple succession of nested programs:
-
- main bind disk unit, file name (which is probably an ambiguous
- reference) locate first instance of the file
-
- z search for first plausible family name, which with the
- extension .PAK, will become the output file. Set up a loop
- which will open the output file, run through the files to
- be loaded into it, and finally close it.
-
- y form the list of candidate files to be packed
-
- x the main loop, which opens each acceptable file (.BAK, and
- .PAK files are rejected), reads it and writes it into the
- .PAK file, then closes it (necessary, since there is a limit
- to how many files may be simultaneously open)
-
- a responsible for copying each individual file
-
- The packed files are separated by a line containing ^\ (1CH); it is easier
- for unpacking if this mark occupies its own line.
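-
- The layout of a packed file, as PAK.CNV writes it, is a bracketed name line,
- the lines of the file, and a separator line. The Python sketch below builds
- and takes apart that layout in memory; it is an analogy under assumptions
- (strings instead of disk files, hypothetical names pack and unpack), not a
- translation of the Convert programs.

```python
FS = "\x1c"      # ^\ , the ASCII file separator
CRLF = "\r\n"


def pack(files):
    """files maps each name to a list of text lines; return the packed text."""
    out = []
    for name, lines in files.items():
        out.append("[" + name + "]" + CRLF)          # header line with the name
        out.extend(line + CRLF for line in lines)    # the file, line by line
        out.append(FS + CRLF)                        # separator on its own line
    return "".join(out)


def unpack(packed):
    """Invert pack(); raise ValueError where a header line is malformed."""
    files = {}
    lines = iter(packed.split(CRLF)[:-1])    # text after the last CR,LF is ignored
    for header in lines:
        if not (header.startswith("[") and header.endswith("]")):
            raise ValueError("Bad .PAK file")
        body = []
        for line in lines:                   # continues from the same position
            if line == FS:
                break
            body.append(line)
        files[header[1:-1]] = body
    return files


packed = pack({"A.TXT": ["alpha", "beta"], "B.TXT": []})
print(repr(packed))
print(unpack(packed))
```

- A line inside a file which consisted of the separator alone would end that
- file prematurely, which is the risk discussed above in choosing the mark.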
-
- There is, of course, a complementary program which restores the original
- programs from the packed file. It is somewhat simpler to write because the
- file names to be used are predetermined and only have to be read out of the
- text, taking advantage of the fact that they follow the separator ^\. About
- the only new technique to be found in this example is the cycle of opening,
- writing, and closing the files embedded in the master file.
-
- The complementary program is shown in the next panel.
-
- There is one detail concerning file acquisition which is common to all
- the programs we are showing: The pattern in the main program binds a
- (possibly null) pathname to variable <7>. This may contain a disk identifier
- and subdirectory names terminated in backslashes. It is needed to rebuild
- complete pathnames to the files being referenced, since the BDOS functions
- invoked by %S and %A return simple file names, not pathnames. The pattern
- <[9]> in the second rule of function y and third rule of z skips over
- attribute, time stamp and length information provided by %S and %A.
-
- Note that the PAK file name (generated by skeleton P in function z) does not
- include variable <7>, so that the packed file is created in the current
- directory of the current disk (which may well be different from whatever is
- bound to <7>).
-
- [UPAK.CNV]
- [Harold V. McIntosh, 18 March 1984]
-
- [locate file name]
- (()( (<0>.<1>) V ((%W,(V),<=>(^MJ))) W )(0 1)(
- ((^Z),);
- ([<0>.<1>],(%Ow,(%T,(V)))(b,(P))(%C,(V))(P)):
- (,Bad .PAK file);
- )) a
-
- [transcribe file]
- (()()()( ((^\),); (,(W,<=>)(P)): )) b
-
- [main program]
- (()( ((%r,<9>)) P )(9)(
- ((and,<9>,<-->.PAK),(%Or,<9>)(a,(P)));
- ((ITR,<-->\)<-->.<[1]>,Bad extension);
- (<-->.<>,<=>PAK):
- (,<=>.PAK):
- ))
-
- [end]
-
- As a final example of a program which can scan a series of files, let us
- consider one which makes selective erasures from the directory. Service
- programs with this capability are not rare; let us make this one more
- interesting by giving it the capability of scanning the file to be erased
- to facilitate the decision whether to erase it or not. To do this it employs
- the function &p which replaces each non-printable ASCII character by a dot.
- It is the function used in DDT.COM and some other programs to permit listing
- general binary files without risking the untoward action of some of the ASCII
- control characters.
-
- To check whether a file is null -- that is, a directory entry possessing zero
- sectors -- or just to refresh your memory for a file you have forgotten about,
- type an x to have the first 70 bytes of the file placed on the screen, but
- filtered by the "dot" function.
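-
- The effect of the "dot" filter on the first 70 bytes can be pictured with a
- short Python analogy. It is a sketch only: the exact set of characters which
- &p regards as printable is assumed here to run from space through tilde, and
- dots and preview are hypothetical names.

```python
def dots(data):
    """Replace every byte outside the printable ASCII range by a dot."""
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in data)


def preview(data, count=70):
    """Return the filtered form of the first count bytes of a file."""
    return dots(data[:count])


print(preview(b"MZ\x90\x00Hi\r\n"))   # MZ..Hi..
print(repr(preview(b"")))             # '' for a null file
```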
-
- A useful interactive program should be as liberal with messages, supplementary
- advice, and comments as necessary to make it helpful to the user. There is
- also an art to tastefully concealing all the additional information and
- handholding from the experienced user who does not want to endure lengthy
- explanations during every session.
-
-
-
- [BORRA.CNV]
- [G. Cisneros, 2Jan84; HVM, 21Mar84]
-
- [Get next name]
- (()( (<7><1>) F )(1 2)(
- (q:,);
- (<1>(^@)<2>,<<
- >>(%t,Erase (F)? <(>y/erase, q/quit, x/examine, other/keep<)>)<<
- >>(d,(&l,(%r,TTY:)))<2>): )) a
-
- [Delete, Quit, Show, or Keep]
- (()()()( (y,(%D,(F))); (q,q:(%W,TTY:,(F): Kept; end.));
- (x,(%Or,(F))(%t,(&p,(%r,(F),<[70]>)))(%C,(F))(&l,(%r,TTY:))):
- (,(%W,TTY:,(F): Kept)); )) d
-
- [Assemble directory entries in WS]
- (()()(0)( (Not Found(^@)<0>,<0>); (<[9]><0>,(%A)(^@)<0>): )) b
-
- [Main program: search for first]
- (()()(7)( ((and,(or,<[1]>:,)(ITR,<-->\),<7>),<<
- >>(a,(b,(%S,<=>)(^@)))); ))
- [end]
-
- :Further Examples.
-
- To round out our presentation of input-output and file handling programs,
- we show some service routines. They are presented here in a very abbreviated
- form to confine them to a single panel, but having followed the discussion
- of how to run through families of disk files, how to add more interactive
- console messages to the programs, and so on, anyone could adapt them.
-
- One of the useful utilities included in Ward Christensen and Randy Suess'
- CBBS (R) programs, which were available from them at one time, was FIND.COM.
- It scanned a family of disk files to locate any of a series of phrases placed
- on the command line. The evident purpose of this utility was finding lost
- messages when some mishap befell the disk which was in the system.
-
- This program was generalized to FYNDE.COM, included as number 165.12 in SIG/M
- disk #165. For the purpose of comparison, we have used Convert to reproduce the
- original FIND.COM. As a binary program it is much longer and much slower, but
- it was written and tested during an afternoon and can be modified in several
- directions as fast as the source can be changed with an editor and
- recompiled. To get the full generality of FYNDE.COM, Convert's ability to
- compile and execute Convert programs from within Convert programs can be used.
-
- [FIND.CNV]
- [Harold V. McIntosh, 22 March 1984]
-
- [A program which will scan a family of files looking for a keyword.
-
- The control line
-
- REC86 FIND FAMILY.*
-
- will prompt for a key phrase,
-
- > Search phrase?
-
- and then report all the lines in the search family which contain that
- word or phrase. Tabs may be included in the phrase. The exact case shift
- shown will be used, as well as the exact number of spaces. Totals per file
- and a grand total will also be reported.]
-
-
-
-
- [More ...]
-
- [scan file]
- (()()(0)(
- [end] ((^Z),);
- [found] (<--><6>,(C)(T)(%W,TTY:,(K): <=>(^MJ))(R)):
- (,(,(K))(R)):
- )) a
-
- [main loop - run through files]
- (()( ((%r,<7><8>)) R ((%r,CTR:LINE)) K
- ((,(%r,CTR:CASE))) C ((,(%r,CTR:TOTL))) T )(0 8)(
- [avoid binary files] ((and,<-->(^@),<-->(or,EXE,COM,OBJ)(^@))<0>,<0>):
- [separate filename] (<8>(^@)<0>,<<
- [initialize counter] >>(%W,CTR:LINE,1,1)<<
- [initialize instance] >>(%W,CTR:CASE,0,1)<<
- [open file] >>(%Or,<7><8>)<<
- [type filename] >>(%t,-----(>) File: <7><8>(^MJ))<<
- [scan file] >>(a,(R))<<
- [close file] >>(%C,<7><8>)<<
- [report instances] >>(%t,Lines Found: (%r,CTR:CASE)(^MJ))<<
- [go to next] >><0>):
- )) x
-
- [More ...]
- [form file list]
- (()()(0)( (Not Found(^@)<0>,<0>); (<[9]><0>,(%A)(^@)<0>): )) y
-
-
- [bind search phrase, look for file]
- (()()(6)( (<6>,(x,(y,(%S,<7><8>)(^@)))); )) z
-
-
- [main program]
- (()()(7 8)(
- ((and,(or,<[1]>:,)(ITR,<-->\),<7>)<8>,<<
- >>(%Or,CTR:LINE)<<
- >>(%Or,CTR:TOTL)<<
- >>(%Or,CTR:CASE)<<
- >>(%t,Search phrase?)<<
- >>(z,(%r,TTY:))<<
- >>(%t,Total Lines Found: (%r,CTR:TOTL))<<
- >>);
- ))
-
- [end]
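-
- For readers who want to check their reading of the listing, the behavior
- described in the FIND.CNV header comment can be sketched in Python. This is
- not a translation of the Convert code: the function names are hypothetical,
- the output layout only approximates the messages above, and matching is
- assumed to be an exact, case-sensitive substring test.
-
```python
import glob
import os

# Assumption: these are the binary file types the listing chooses to skip.
SKIPPED_EXTENSIONS = {".exe", ".com", ".obj"}


def find_in_lines(lines, phrase):
    """Return (line_number, text) pairs for lines containing phrase exactly."""
    return [(number, line.rstrip("\r\n"))
            for number, line in enumerate(lines, start=1)
            if phrase in line]


def find_in_family(pattern, phrase):
    """Scan every file matching pattern; print hits, return the grand total."""
    total = 0
    for name in sorted(glob.glob(pattern)):
        if os.path.splitext(name)[1].lower() in SKIPPED_EXTENSIONS:
            continue
        print(f"-----> File: {name}")
        with open(name, "r", errors="replace") as handle:
            hits = find_in_lines(handle, phrase)
        for number, text in hits:
            print(f"{number}: {text}")
        print(f"Lines Found: {len(hits)}")
        total += len(hits)
    print(f"Total Lines Found: {total}")
    return total
```
-
- A call such as find_in_family("FAMILY.*", "phrase") plays the role of the
- control line plus the answer to the prompt.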
-
-
- One possible variant on the theme of FIND.CNV is to produce the line bearing
- the phrase sought in the form of a KWIC index. KWIC means "keyword in context,"
- and is a technique deriving from the days of punched cards. Textual material,
- for example a bibliography, was scanned for the presence of a certain phrase,
- or keyword; cards bearing the designated phrase were listed on the printer. For
- the presence of the keyword to be more obvious, the line was rotated, so that
- the keyword occupied a central position in the printed line, the same position
- for all the lines so that they could be quickly scanned to see how each one of
- them used the target word or phrase.
-
- KWIC indices can be elaborated to a considerable degree. For example, the
- keywords can be derived from the source text itself, listing all possible
- words as they occur in all possible sentences, after discarding such trivial
- occurrences as a, and, the, and other high-frequency English words.
-
- Beware of the program shown in the next panel -- it processes only a single
- file and not a family of files. However, simple modifications would add this
- capability, permit the use of more than one keyword, rotate the line rather
- than just windowing it, and so on.
-
- Note the use of <[-25]>, which matches the remaining text EXCEPT the rightmost
- 25 bytes, and fails if the remaining text is fewer than 25 bytes long.
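-
- The windowing idea can be sketched in Python as follows. The sketch rests on
- assumptions: kwic_lines is a hypothetical name, the first occurrence of the
- keyword in a line is the one used, tabs are expanded to 8-column stops before
- matching, and a short left context is padded with spaces so that the keyword
- always starts in the same column.
-
```python
def kwic_lines(lines, keyword, width=25):
    """Return one windowed line per input line that contains keyword.

    The left context is the last `width` characters before the keyword,
    right-justified; the right context is the first `width` characters
    after it. Lines without the keyword are skipped.
    """
    results = []
    for line in lines:
        line = line.rstrip("\r\n").expandtabs(8)
        position = line.find(keyword)
        if position < 0:
            continue
        left = line[:position][-width:].rjust(width)
        right = line[position + len(keyword):][:width]
        results.append(f"{left} {keyword} {right}")
    return results


if __name__ == "__main__":
    sample = ["a technique deriving from the days of punched cards",
              "cards bearing the designated phrase were listed"]
    for entry in kwic_lines(sample, "cards"):
        print(entry)
```
-
- Because every left segment has the same length, the keyword lines up in a
- single column, which is what makes the listing quick to scan.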
-
- [KWIC.CNV]
- [Harold V. McIntosh, 22 March 1984, G. Cisneros, 8.9.1990]
- [[KWIC Index]]
-
- [Bind keyword]
- (()( ( ) S )(8)( (<8>,(b,(e,(%r,<9>)))); )) a
- [KWIC line] (()()(0 1)( ((^Z),);
- (<0><8><1>,(%t,(c,<0>) <8> (d,<1>))(e,(%r,<9>))):
- (,(e,(%r,<9>))): )) b
-
- [left segment] (()()(0)( (<[-25]><0>,<0>); (<0>,(S)<0>): )) c
-
- [right segment] (()()(0)( ((and,<[25]>,<0>),<0>); )) d
-
- [find tabs] (()()(0 1 2)(
- ((and,<[8]>,<0>(^I)<1>)<2>,(f,<0>)(e,<1><2>));
- ((and,<[8]>,<0>)<2>,<0>(e,<2>));
- (<0>(^I)<1>,(f,<0>)(e,<1>)): )) e
- [expand tabs] (()()(0)( ((and,<[8]>,<0>),<0>); (<0>,<0> ): )) f
-
- [main program] (()()(9)( (<9>,(%Or,<9>)(a,(%r,TTY:Keyword? ))); ))
- [end]
-
- Another of the utilities on the disk SIG/M #165 is BINCOM.COM, which may
- be used to compare two binary files to see whether they are identical. Even
- though it contains no adjustment to pick up synchronism after encountering
- an insertion or deletion, it is still a very useful program. One use consists
- in verifying that a disassembly has been correctly done by comparing the newly
- assembled binary program with the original binary source; as discrepancies are
- found they can be used to refine the source code.
-
- In the next panel we show BINCOM.CNV, which is the same program written with
- Convert. The source is quite concise, less than a page of code. The running
- speed is partly limited by the rate of transmission to the terminal, but it
- cannot help being slow in comparison to the assembly language program.
- Output is written to the standard output, so that redirection may be used
- if a list of discrepancies is to be kept for browsing with an editor.
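-
- As an illustration of the comparison itself, here is a minimal Python sketch.
- It is not derived from BINCOM.COM or BINCOM.CNV: the function names and the
- report layout are hypothetical. Like the programs discussed, it compares byte
- against byte at the same offset and makes no attempt to resynchronize after
- an insertion or deletion.
-
```python
def compare_bytes(first: bytes, second: bytes):
    """Compare two byte strings position by position.

    Returns (bytes_compared, mismatches, shorter) where mismatches is a list
    of (offset, byte_from_first, byte_from_second) and shorter is "first",
    "second", or None when the lengths are equal.
    """
    compared = min(len(first), len(second))
    mismatches = [(offset, first[offset], second[offset])
                  for offset in range(compared)
                  if first[offset] != second[offset]]
    if len(first) < len(second):
        shorter = "first"
    elif len(second) < len(first):
        shorter = "second"
    else:
        shorter = None
    return compared, mismatches, shorter


def report(first: bytes, second: bytes) -> None:
    """Print discrepancies to standard output so they can be redirected."""
    compared, mismatches, shorter = compare_bytes(first, second)
    for offset, a, b in mismatches:
        print(f"Byte {offset:04X} different: {a:02X} {b:02X}")
    if shorter is not None:
        print(f"{shorter} file shorter")
    print(f"{compared} bytes compared")
    print(f"{len(mismatches)} mismatches found")
```
-
- To compare two files, read each one in binary mode and pass the contents to
- report.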
-
- Should a modification of BINCOM be attempted, the Convert version is clearly
- advantageous; not only was the program set up with about an hour's work, any
- modification will require a similar time scale. For example, the bytes examined
- could be tested to see whether they were among the 8086 instructions which use
- an address. Knowing that two programs were closely similar except for the
- widespread occurrence of address shifts caused by insertions or deletions would
- make the comparison of two versions of a program much easier.
-
- [BINCOM.CNV]
- [Harold V. McIntosh, 22 March 1984]
- [Convert version of program SIG/M 165.04 which will compare two binary files]
- [read two] (()()(3 4)(
- (:<>,);
- (:<[1]><>,(%W,,<1> shorter(^MJ)));
- (<[1]>:<>,(%W,,<2> shorter(^MJ)));
- (<3>:<3><>,(,(%r,CTR:BYTE))(1):(2)):
- ((and,<[1]>,<3>):<4>,<<
- >>(%W,,Bytes (&Dh,(%r,CTR:BYTE)) different: <<
- >>(&h,<4><3>) (&p,<3><4>)(^MJ))<<
- >>(,(%r,CTR:MISM))<<
- >>(1):(2)):
- )) a
- [main] (()( ((%r,<1>,<[1]>)) 1 ((%r,<2>,<[1]>)) 2
- )(1 2)( (<1> <2>,<<
- >>(%Or,<1>)(%Or,<2>)(%Or,CTR:BYTE)(%Or,CTR:MISM)<<
- >>(a,(1):(2))<<
- >>(%W,,(%r,CTR:BYTE) bytes read(^MJ)<<
- >>(%r,CTR:MISM) mismatches found(^MJ)));
- (,Usage: rec86 bincom file1 file2);
- ))
- [end]
- :[CNVPRG.HLP]
- [Harold V. McIntosh, 27 March 1984]
- [Rev.: G. Cisneros, 23 January 1986]
- [Rev. for MS-DOS: G. Cisneros, 8 October 1990]
- [end]