Columbia Kermit

home *** CD-ROM | disk | FTP | other *** search

/ Columbia Kermit / kermit.zip / archives / k12.tar.Z / k12.tar / k12mit.not < prev next >

Wrap

Internet Message Format | 1992-07-12 | 28KB

Date: 7/11/92 15:00 EDT From: Charles Lasner (lasner@watsun.cc.columbia.edu) To: pdp8-lovers@ai.mit.edu Subj: Announcement of new Kermit-12 utilities Currently available at watsun.cc.columbia.edu via ftp in /pub/ftp/kermit/d/k12*.* are two new files: k12enc.pal and k12dec.pal. The new files have internal dates of 8-Jul-92. Several other documentation files with similar dates are also available. Anyone who last retrieved Kermit-12 files prior to Feb, 1992 should retrieve much of the current collection which has changed since then. These are the newly updated ENCODE and DECODE programs for moving around OS/8 binary files in a reliable manner. There is several new feature added to both: the ability to deal with "raw" image data sent as a single file, and also sent split into two files. Consider the following usage: .R ENCODE *BIGGY.EN<RXA1:/I=756 When the input specification is a device only AND the /I switch is given AND a non-zero equals parameter is given (and neither /1 nor /2 are given; see below), then the ENCODEd file is created containing the entire logical device for the length specified in the equals parameter. In the example, the length of an RX01 in OS/8 terms is used. Note that the device need not be an OS/8 "legal" structure, and there is no individual input file significence either. The user must set the length to whatever is appropriate. Too large a number could result in program abort due to handler-reported input error. The minimum length is 1; a zero-value is the same as leaving out the parameter. (This is a restriction of OS/8 itself.) Input data always starts at OS/8 logical record 0000. Note that the device must be definable within the scope of the selected OS/8 handler itself, since it is used for all input I/O using "raw" sequential calls. For example, an OS/8 bootable RX01 standard-format diskette would be acceptable for encoding here, while a WPS document disk would not be. The reason is that the handler does not access track zero (not a WPS issue) and also does all reads in 12-bit mode, while WPS documents are written in 8-bit mode, portions of which are lost when using a 12-bit mode handler. By using another (user-written) handler that supplies 8-bit transfers, the WPS document disk CAN be successfully transferred to ENCODE format intact. (Note: there are no current 8-bit mode handlers for RX02 and RX50, just RX01; this is a separate problem being addressed elsewhere.) Output DECODing is accomplished in a symmetrical manner: .R DECODE *RXA1:<BIGGY.EN/I=756 When used in this manner, the equals parameter need not be exact, but must be at least as large as the parameter used when encoding. A smaller value can be used to limit the transfer deliberately where appropriate. For example, using =7 will only transfer the first seven records of the ENCODEd file onto the target device. If the ENCODEd file contains a valid OS/8 device in complete image form, this amount will suffice to transfer only the directory to the target device. This effectively aborts the transfer, so the user can look at the directory with DIR, etc., to determine suitability of the larger transfer, etc. Warning: This usage writes over the output device starting at logical record 0000 (or higher; see below). The (possibly present) previous OS/8 structure will be destroyed. This is the intended function of this usage as the new data is meant to be moved to the beginning of an empty device. Image splitting: As pointed out by a user, if the only device available on a PDP-8 system is being used to acquire an image copy of the media, then this is likely to be impossible to carry out. This is the case in some systems whose only disk device is a pair of RX01, or RX02, or RX50 in the case of the DECmates. The reason is that the image of a device is likely to be larger (as an ENCODEd file) than the OS/8 capacity of the device itself! Since the repeat-compression of the encoding process isn't a reliable phenomena (it's data-dependent), it is likely that some disk-image files will be too large. Of course, if less than an entire device need be transferred, the equals parameter could be set to a shorter value where appropriate, but this prevents a complete transfer. (Alternatively, the data should be placed on a disk where the unused space has a repetitive pattern, such as that left after a format operation, or all zeroes, etc. This would allow the repeat-compression of the ENCODE process to reduce these areas to insignificent quantities in the ENCODEd file.) Of course, this still doesn't address the problem of sending an entire device, such as a fully-loaded diskette. To solve this problem, an additional option has been implemented to split the data into two nearly equal parts before encoding. Assuming the worst case of no repeat compression optimization, the ENCODEd image file representing half of the target disk data should fit on a bootable OS/8 disk. This disk would only contain the system head, the DECODE.SV program, and the ENCODEd data file. From this minimal system, the data can be written to another drive. Then the entire process can be repeated using a similar disk containing instead the second-half ENCODEd file. (There is no hope for a machine with only one disk!) Assuming our original RX01 example again: .R ENCODE *BIGGY1.EN<RXA1:/I/1=756 *BIGGY2.EN<RXA1:/I/2=756 This creates two image encoding files, each containing half of the device image. (In the case of an odd size, the second half is one longer than the first half.) Note that the size must be stated exactly so that the proper split occurs. .R DECODE *RXA1:<BIGGY1.EN/I/1=756 *RXA1:<BIGGY2.EN/I/2=756 This creates the disk image from the ENCODEd files. Note that the size must be stated exactly as it was when the original files were created so that both the length and position of the files (especially the second-half file) are correct. To aid in restoration of the disk, the (FILE statement within the ENCODEd file indicates that it is a block-image file, not an individual file with a file name. Additionally, the =xxxx value used to create the file is given to guide the decoding process. If the ENCODEd files are split files, then it is further indicated that the file is a first-half or second-half of an image file as necessary. As of this version (2.1), the ENCODE program also inserts a (REMARK indicating the program version and other useful information. Overall program operation for all other operations remains unchanged. As before, program documentation is contained in the source files. Except for the new image transfer features, the programs are totally compatible with their previous respective versions. (Note, the latest and previous versions are NOT compatible with the original "ancient" version which was "provisionally" released while the encoding format was still being formulated! Thus, there is one incompatible original version, and all following versions are compatible except for added features.) cjl ------------------------------ Date: 3/11/92 00:48 EST From: Charles Lasner (lasner@watsun.cc.columbia.edu) To: pdp8-lovers@ai.mit.edu Subj: Announcement of new Kermit-12 utilities Currently available at watsun.cc.columbia.edu via ftp in /pub/ftp/kermit/d/k12*.* are two new files: k12enc.pal and k12dec.pal. The new files have internal dates of 1-Mar-92. These are the newly updated ENCODE and DECODE programs for moving around OS/8 binary files. There is a new feature added to both, the ability to deal with "raw" image data. Consider the following usage: .R ENCODE *BIGGY.EN<RXA1:/I=756 When the input specification is a device only AND the /I switch is given AND a non-zero equals parameter is given, then the ENCODEd file is created containing the entire logical device for the length specified in the equals parameter. In the example, the length of an RX01 in OS/8 terms is used. Note that the device need not be an OS/8 "legal" structure, and there is no individual input file significence either. The user must set the length to whatever is appropriate. Too large a number could result in program abort due to handler-reported input error. The minimum length is 1; a zero-value is the same as leaving out the parameter. (This is a restriction of OS/8 itself.) Input data always starts at OS/8 logical record 0000. Note that the device must be definable within the scope of the selected OS/8 handler itself, since it is used for all input I/O using "raw" sequential calls. For example, an OS/8 bootable RX01 standard-format diskette would be acceptable for encoding here, while a WPS document disk would not be. The reason is that the handler does not access track zero (not a WPS issue) and also does all reads in 12-bit mode, while WPS documents are written in 8-bit mode, portions of which are lost when using a 12-bit mode handler. By using another (user-written) handler that supplies 8-bit transfers, the WPS document disk CAN be successfully transferred to ENCODE format intact. (Note: there are no current 8-bit mode handlers for RX02 and RX50, just RX01; this is a separate problem being addressed elsewhere.) Output DECODing is accomplished in a symmetrical manner: .R DECODE *RXA1:<BIGGY.EN/I=756 The equals parameter need not be exact, but must be at least as large as the parameter used when encoding. A smaller value can be used to limit the transfer deliberately where appropriate. For example, using =7 will only transfer the first seven records of the encoded file onto the target device. If the encoded file contains a valid OS/8 device in complete image form, this amount will suffice to transfer only the directory to the target device. This effectively aborts the transfer, so the user can look at the directory with DIR, etc., to determine suitability of the larger transfer, etc. Warning: This usage writes over the output device starting at logical record 0000. The (possibly present) previous OS/8 structure will be destroyed. This is the intended function of this usage as the new data is meant to be moved to the beginning of an empty device. Overall program operation for all other operations remains unchanged. As before, program documentation is contained in the source files. Except for the new image transfer feature, the programs are totally compatible with their previous respective versions. (Note, the latest and previous versions are NOT compatible with older versions!) There are no plans to upgrade the .BOO format encode and decode programs, since the .BOO format has no checksum protection. This is deemed a necessary feature in a complete device image transfer, due to the likelihood of a long file, etc. Since the maximum length of an OS/8 file is limited to 4088 records, the raw input will of necessity be limited to somewhat less than this size of records, depending on the efficiency of the run-length compression inherent in the data encoding. cjl ------------------------------ Date: 10/18/91 20:00 EDT From: Charles Lasner (lasner@watsun.cc.columbia.edu) To: pdp8-lovers@ai.mit.edu Subj: Announcement of additional KERMIT-12 utilities. While no changes have been made to the body of KERMIT-12 itself, several things have been changed/added. At the request of the KERMIT distribution service (KERMSRV) certain files have been slightly modified so they are acceptable to that bitnet, etc. facility. (Seems to be a problem with LRECL>80.) All files are now 80 or less. Except for the .DOC file, all it took was a little "cosmetic surgery" on a few lines. FTP'd copies are mostly unaffected. Most of the problems have to do with interpretation of the inter-page FF character being treated as the first character of the "record" in this non-stream-oriented system. At this time there is no actual doc file, as the file K12MIT.DOC is merely a truncation of the listing of K12MIT.PAL as passed through PAL8 and CREF. Anyone with a system big enough to support a 200K+ long source file can create this file themselves. In addition, due to certain quirks within PAL8 and CREF "beating" against unix line conventions, the file K12MIT.DOC at watsun.cc.columbia.edu was slightly different from the precise output of the assembly process, but again, only a cosmetic change. Since this file greatly exceeded the KERMSRV restriction, it has been withdrawn in favor of the source fragment equivalent to it taken directly from K12MIT.PAL. This source fragment is short enough that even an RX01-based OS/8 system can create the listing file from it thus recreating the original K12MIT.DOC locally. All this will disappear in the future when a "proper" doc file appears. In the meantime, K12MIT.DOC in whatever form it is available contains hardware hints and kinks, assembly options, and other info useful to users and anyone interested in the "innards" of the program, as well as an edit history of how K12MIT got to be where it is now starting from its "grandfather" K08MIT. It ends at the first line of the code in K12MIT.PAL, but includes all of the special purpose definitions particular to the various devices supported, such as DECmate I, DECmate II, etc. Any changes to customize KERMIT-12 are still accomplished using the separate patch file K12PCH.PAL which is unchanged. New files cover two areas: 1) direct loading without KERMIT-12, and 2) .BOO format support. 1) Many users have the hardware for running KERMIT-12, but don't already have it or another suitable program to acquire it yet, a real "catch-22" situation. Towards that end, a set of utilities has been provided to directly load KERMIT-12 without already having it. Most PDP-8 sites do have access to some other machine. Hopefully, the serial connection to be used is fairly "clean" and error-free, or at least some of the time. These programs depend on this fact. This could either be a connection to a remote multi-user system or something like a null-modem connection to a nearby IBM-PC. The programs assume only a few things: a) The connection is error free. b) The other end doesn't absolutely require anything be sent to it to send data to the PDP-8 end. (The -8 end will not send ^S/^Q or anything like that because this is unnecessary; all data goes only into PDP-8 memory directly.) c) The other end will send the data at a time controlled from its end, or after at most one character sent from the PDP-8 end of the link. The first situation is illustrated by the example of a PC connected to the -8. The -8 program is started, and it waits indefinitely after the -8 user presses any one key. (The corresponding character is sent to the PC where it is ignored.) The PC end is initiated with a command such as COPY K12FL0.IPL AUX: and the data goes to the -8. The second situation is illustrated by a remote system where a command such as TYPE K12FL0.IPL is available. The delimiting CR is not typed at this time, and will be finished later by the loading program. The initial connection up until the TYPE command is not covered by the loading program itself, so the user must supply a basic comm program, which is possible to accomplish in about 10 words or less if the rates are "favorable", or worst-case, a terminal can be used and the line switched over to the -8 at the appropriate time. In any case, CR or other appropriate character is hit on the -8 and the loading program echoes it down the line (and on the console) to initiate the data down-load. d) The other end is assumed to send the file verbatim without insertion of <del> characters (octal 177) and upper-case/lower-case is preserved. If all of these assumptions are met, then the down-load accomplishs a partial acquisition of K12MIT.SV, the primary binary file of KERMIT-12. The process must be repeated several times to acquire all portions. If a local compare utility is available that can compare absolute binary files, perhaps the process can be totally repeated to assure reliable results by comparing runs. The method used is borrowed from the field-service use of a medium-speed serial port reader on the -8 for diagnostic read-in. This reader is *almost* compatible with the device 01 reader such as the PC8E. The difference is that the *real* PC8E is fully asynchronous, whereas the portable reader just spews out the characters without any protocol. The PC8E can't drop any characters in theory, although there are reports of misadjusted readers that drop characters at certain crucial data rates. (The PC8E runs at full speed if possible, and failing this falls back to a much slower speed. All operations depend on the use of the hardware handshakes of the IOTs etc., so nothing should be lost but throughput. Misadjusted readers may drop characters when switching over to the slower mode.) The reason the field reader is acceptable is that it is used only to load diagnostics directly into memory using the RIM and BIN loaders. These minimal applications can't possibly fall behind the reader running at full speed. This is the same principle used here to down-load KERMIT-12. The loading program is a 46 word long program suitable to be toggled into ODT and saved as a small core-image program. The user starts the program and then (at the appropriate time) presses one key (usually CR if it matters) and the loader waits for remote input. As the other end sends the data, it is directly loaded into memory. There is a leader/trailer convention, just like paper-tape binary, so at end-of-load the program exits to OS/8 at 07600. At this time the user issues a SAVE command. This completes the down-load of a single field of K12MIT.SV. At the current time, there are actually two fields of K12MIT.SV, namely 00000-07577 and 10000-17577, and there are two such loaders. There is no check for proper field, so the proper loader must be used with the proper data, else the fields will get cross-loaded and will certainly fail. Once the two fields are obtained as separate .SV files (named FIELD0.SV and FIELD1.SV) they can be combined using ABSLDR.SV with the /I switch (image mode) set. The resultant can be saved as K12MIT.SV. This, if all went well, is identical in every way to the distributed K12MIT.SV (which is only distributed in encoded form; see below). Actual file differences will only exist in the extraneous portions of the file representing the header block past all useful information and the artifacts of loading which represent 07600-07777 and 17600-17777 which are not used. This is the normal case for any OS/8 system when any file is saved. Merely saving an image twice will cause this to happen. At this point, K12MIT.SV can be used as intended, namely to acquire, via KERMIT protocol, the entire release. It is recommended that the provisional copy of K12MIT.SV be abandoned as soon as the encoded copy is decoded since the encoding process provides some assurances of valid data (using checksumming, etc.). This process can be accomplished on any KL-style -8 interface including PT08, etc., or on the printer port of VT-78 and all DECmates. When used on the DECmates, there may be some minor problems associated with the down-load which may have to be done as the first use of the printer port after power-on, or some other restriction. The loader includes a suggested instruction for DECmate use if problematic (and raises the program length to 47 words). Also, due to observed bugs in the operating system (OS/278 only), there are restrictions on the use of ABSLDR.SV that cause certain command forms to fail while other seemingly equivalent forms succeed! This is documented in the latest K12MIT.BWR file in the distribution. The command form stated in the K12IPL.PAL file is the only known form that works correctly on these flawed systems. The format for down-load files is known as .IPL or Initial Program Load format. It consists of a leader containing only lower-case letters (code 141-177 only) followed by "printable" data in the range 041 (!) through 140 (`). Each of the characters represents six bits of data, to be read left to right as pairs, which load into PDP-8 12-bit memory. The implied loading address is always to start at 0000 of the implied field. The leader comment contains documentation of which field of data from K12MIT.SV it is. The trailer consists of one lower-case character followed by anything at all. This is why it is crucial that DEL (177) not appear anywhere in the body of the file. Throughout the file, all codes 040 or less are ignored. This allows for spaces in the lower-case leader for better readability, and for CR/LF throughout the entire file. CR/LF is added every 32 words (64 characters) to satisfy certain other systems' requirements. The trailer contains documentation on a suggested SAVE command for the particular data just obtained. 2) PDP-8 ENCODE format is the format of choice to obtain binary OS/8 image files because of the validation techniques employed, etc. This is the standard method of distributing K12MIT.SV as well as other "critical" files such as TECO macros and other image files. In the MS-DOS world there exists another very popular format known as .BOO encoding. It would be useful to support this format on the PDP-8 as well. .BOO format files are smaller because they use six-bit encoding instead of five-bit encoding, or at least in theory. Both ENCODE and .BOO use repeat compression techniques, but ENCODE can compress 12-bit words of any value, while .BOO only compresses zeroes and that itself is based on a byte-order view of the data. PDP-8 programs often include large regions of non-zero words such as 7402 (HLT) which would not compress when looked at as bytes. Such files would show compression ratios quite different from the norm. In any case, .BOO format is useful on the PDP-8 because it allows inter-change with .BOO files created on other systems, such as PCs. This allows the exchange of unusually formatted files, such as TECO macros between PDP-8s and PCs. (Both systems support a viable version of TECO.) The new KERMIT-12 utilities include a .BOO encoder and .BOO decoder, known as K12ENB.PAL (or ENBOO.PAL) and K12DEB.PAL (or DEBOO.PAL) respectively. They use .BOO encoded files unpacked in the standard OS/8 "3 for 2" order to preserve the original byte contents when the files originate from other systems. (Technically, .BOO format doesn't require this, but the obvious advantages dictate it. Anything encoded into .BOO format must merely have a 24-bit data structure encoded into four six-bit characters, so in theory any encoding of two adjacent PDP-8 12-bit words would be acceptable. By additionally supplying the bits in OS/8 pack/unpack order guarantees the inter-system compatibility as well.) There is an inherent weakness in the original .BOO format which must be addressed. .BOO format files always end on one of two data fields: either a repeat-zero compression field, or on a 24-bit field expressed as four characters. Should the data in a 24-bit field consist of only two or even one bytes, there are one or two extraneous null bytes encoded into the field to complete it. Presumably the need to add the extra bytes is to allow validation of the format. In any case, only the encoder knows just how many (0, 1, 2) bytes are extraneous. We can presume that if the last byte is non-zero, it is significant. If the last two are both zero, then the last or possibly both are extraneous with no way to tell. On PC systems, the general trend is to ignore these one or two extra bytes because so far there haven't been any complaints of failure. I have personally discovered that a widely used PC .BOO encoding program (written in C) erroneously adds two null bytes as a short compression field beyond the data! This is not a .BOO format issue, but rather a genuine program bug. Apparently few PC users are concerned that encoding their files prevents transparent delivery to the other end. In the OS/8 world, the situation is quite different. Each OS/8 record is 256 words or 384 bytes. If even a single byte is added, this creates an additional all-zeroes record. Besides wasting space, it is conceivable that such a file could be dangerous to use under OS/8 depending on content. (Certain files, such as .HN files are partially identified by their length. File damage, such as lengthening a file from two to three records will confuse the SET utility, etc.) Many files cannot be identified as having been artifically lengthened (and may be hard to shorten!), so this must be avoided. I have invented a fix for the problem: repeat compression fields are expressed as ~ followed by a count. 2 means two null bytes and is thus the smallest "useful" field to be found. (It takes two characters to express what would take 2-2/3 characters in encoded format. One null would only take 1-1/3 characters, not two, so this case is vestigial, but must be supported for the benefit of brain-dead encoders.) The value of 0 means a count of literally zero, thus ~0 is a "NOP" to a decoder. I have successfully tested MS-DOS programs written in BASIC and C that decode .BOO files successfully even if ~0 is appended to the end with no ill effects. (They correctly ignored the appended fields.) In my encoding scheme, ~0 at the end of a data field containing trailing zeroes means to "take back" a null byte. ~0~0 means to take back two null bytes. Thus files encoded with ENBOO.PAL either end in a repeat-compression field as before, or in a data encoding field possibly followed by ~0 or ~0~0 if necessary. The corresponding DEBOO.PAL correctly decodes such files perfectly. Should files encoded with ENBOO reach "foreign" systems, they will do what they always do, i.e., make files one or two bytes too long occasionally, with no other ill effects. Files originating from such systems will certainly be lacking any trailing correction fields and will cause DEBOO to perform as foolishly as MSBPCT. Extraneous null bytes will appear at the end of the file in OS/8 just as in MS-DOS in this case. (Note that if the file length is not a multiple of 384 bytes, additional bytes are added by DEBOO as well, but this is not a design weakness of .BOO format. It is caused by the clash of fixed record size and a variable size format.) Hopefully, files originating on OS/8 will be decoded on OS/8 as well, thus preserving file lengths. Most "foreign" files will probably be ASCII, so the ^Z convention will allow removal of trailing null bytes at either end. It is hoped that MS-DOS and other systems "upgrade" their .BOO format files to be compatible with the PDP-8 version. All KERMIT-12 files are available via the normal distribution "paths" of anonymous FTP and/or KERMSRV. The user is directed to the file /ftp/pub/kermit/d/k12mit.dsk as a "roadmap" to the entire distribution. Each .PAL file includes assembly instructions. Most use non-default option switches and non-default loading and saving instructions, so each must be carefully read. The development support files (TECO macro, .IPL generator, recent copies of PAL8, CREF, etc.) are included in the total collection. Development is not possible on RX01 systems due to inadequate disk space, but RX02's are barely adequate with a lot of disk exchanges. (Future versions may require larger disks for development.) Charles Lasner (lasner@watsun.cc.columbia.edu)