Arawak OS/2 Shareware

Text File | 1992-10-21 | 9.6 KB | 270 lines

Having become extremely frustrated by VIEW.EXE's penchant for windows
that come and go, without even opening large enough to see everything
in them, I thought I'd try to turn .INF files into something more
conventional. While I don't have code to offer, I can tell you what I
learned about .INF format--it was enough to produce more-or-less
readable more-or-less plaintext from .INFs. I offer this in the hope
that somebody will give the community a really nice, tasteful,
convenient, doesn't-use-too-much-screen-real-estate .INF browser to
replace VIEW.EXE.

All of this was developed by looking at .INF files without any
documentation of the format except what VIEW.EXE showed for a
particular feature. I don't have a lot of personal interest in refining
this document with additional escape sequences, etc., but I would be
happy to correspond with someone who wanted to fill in the details, or
to clarify anything that may be confusing. If someone could point us to
an official document describing the format, that would be most helpful.

-- Carl Hauser (chauser.parc@xerox.com)

All numeric quantities are least-significant-byte first in the file
(little-endian).
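
As a small illustration of the byte order just described, here is a
minimal sketch of helpers that assemble int16 and int32 values from raw
file bytes. The names read_u16le and read_u32le are my own and are not
part of the format or of VIEW.EXE; they assume the caller has already
read the file into memory.

```c
#include <stdint.h>

/* Assemble a 16-bit unsigned value stored least-significant-byte first.
   Hypothetical helper name; p must point at 2 readable bytes. */
uint16_t read_u16le(const unsigned char *p)
{
    return (uint16_t)((unsigned)p[0] | ((unsigned)p[1] << 8));
}

/* Assemble a 32-bit unsigned value stored least-significant-byte first.
   Hypothetical helper name; p must point at 4 readable bytes. */
uint32_t read_u32le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Reading byte by byte like this avoids depending on the byte order or
alignment rules of the machine doing the decoding.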

**** Types ****

  bit1    1 bit boolean
  int4    4 bit unsigned integer
  char8   8 bit character (ASCII more-or-less)
  int8    8 bit unsigned integer
  int16   16 bit unsigned integer
  int32   32 bit unsigned integer

**** The File Header ****

Starting at file offset 0 the following structure can overlay the file
to provide some starting values:

{ char8 unknown[8];        // unknown purpose
  int16 ntoc;              // 16 bit number of entries in the tocarray
  int32 tocstart;          // 32 bit file offset of the start of the
                           // tocarray
  int32 tocstrlen;         // number of bytes in file occupied by the
                           // table-of-contents strings
  int32 tocstrtablestart;  // 32 bit file offset of the start of the
                           // strings for the table-of-contents
  int16 nslots;            // number of "slots"
  int32 slotsstart;        // file offset of the slots array
  int32 dictlen;           // number of bytes occupied by the
                           // "dictionary"
  int16 ndict;             // number of entries in the dictionary
  int32 dictstart;         // file offset of the start of the dictionary
}

I think there's more to the header and that it describes the index, but
I didn't decode that.

**** The table of contents array ****

Beginning at file offset tocstart, this structure can overlay the file:

{ int32 tocentrystart[ntoc];  // array of file offsets of tocentries
}

**** The table of contents entries ****

Beginning at each file offset, tocentrystart[i]:

{ int8 len;          // 8 bit length of the entry including this byte
  bit1 haschildren;  // following nodes are a higher level
  bit1 hidden;       // this entry doesn't appear in VIEW.EXE's
                     // presentation of the toc
  bit1 extended;     // extended entry format
  bit1 stuff;        // ??
  int4 level;        // nesting level
  int8 ntocslots;    // number of "slots" occupied by the article for
                     // this toc entry
}

If the "extended" bit is not 1, this is immediately followed by

{ int16 tocslots[ntocslots];  // indices of the slots that make up
                              // the article for this entry
  char8 title[];              // the remainder of the tocentry
                              // until len bytes have been used
}

If extended is 1 there are intervening bytes that (I think) describe
the kind, size and position of the window in which to display the
article. I haven't decoded these bytes, though in most cases the
following tells how many there are. Overlay the following on the next
two bytes:

{ int8 w1;
  int8 w2;
}

Here's a C code fragment for computing the number of bytes to skip:

    int bytestoskip = 0;
    if (w1 & 0x8) { bytestoskip += 2; }
    if (w1 & 0x1) { bytestoskip += 5; }
    if (w1 & 0x2) { bytestoskip += 5; }
    if (w2 & 0x4) { bytestoskip += 2; }

Skip over bytestoskip bytes (after w2) and find the tocslots and title
as in the non-extended case.

**** The Slots array ****

Beginning at file offset slotsstart (provided by the file header) find

{ int32 slots[nslots];  // file offset of the article
                        // corresponding to this slot
}

**** The Dictionary ****

Beginning at file offset dictstart (provided by the file header) and
continuing until ndict entries have been read (and dictlen bytes have
been consumed from the file) find a sequence of null-terminated
strings. Build a table mapping i to the ith string.

{ char8* strings[ndict];
}

**** The Article entries ****

Beginning at file offset slots[i] the following structure can overlay
the file:

{ int8 stuff;          // ??
  int32 localdictpos;  // file offset of the local dictionary
  int8 nlocaldict;     // number of entries in the local dictionary
  int16 ntext;         // number of bytes in the text
  int8 text[ntext];    // encoded text of the article
}

**** The Local dictionary ****

Beginning at file position localdictpos (for each article) there is an
array:

{ int16 localwords[nlocaldict];
}

**** The Text ****

The text for an article then consists of words obtained by referencing
strings[localwords[text[i]]] for i in [0..ntext), with the following
exceptions. If text[i] is greater than nlocaldict it means

  0xfa => end-of-paragraph
  0xfc => if in-an-example then end-of-line
          else spacing = !spacing    // see below
  0xfd => if in-an-example then end-of-line
          else spacing = TRUE
  0xfe => space
  0xff => escape sequence            // see below

When spacing is true, each word needs a space put after it. When false,
the words are abutted and spaces are supplied using 0xfe or the
dictionary. Examples are entered and left with 0xff escape sequences.
The variable "spacing" is initially TRUE.

**** 0xff escape sequences ****

These are used to change fonts, make cross references, enter and leave
examples, etc. The general format is

{ int8 FF;       // always equals 0xff
  int8 esclen;   // length of the sequence including esclen (but
                 // excluding FF)
  int8 escCode;  // which escape function
}

escCodes I have partially deciphered are

  0x02 or 0x11 => (esclen==3) goto horizontal position. The remaining
      byte is an int8 describing the position. 0x00 is the left margin
      and starts a new line; I don't know the units for other values.
  0x04 => (esclen==3) change font. The remaining byte is an int8
      denoting the font: I've determined 0x00 is normal, 0x01 is italic
      and 0x02 is bold in VIEW's presentation.
  0x05 or 0x07 => (esclen varies) beginning of cross reference. The
      next two bytes of the escape sequence are an int16 index of the
      tocentrystart array.
      The remaining bytes describe the size, position and
      characteristics of the window created when the cross-reference is
      followed by VIEW. I have not decoded this.
  0x08 => (esclen==2) end of cross reference introduced by escape code
      0x05 or 0x07.
  0x0B => (esclen==2) begin example. Set spacing to FALSE.
  0x0C => (esclen==2) end example. Set spacing to TRUE.
  0x0F => if esclen==5, an inlined cross reference: the title of the
      referenced article becomes part of the text. This is probably the
      case even if esclen is not 5, but I don't know the decoding. In
      the case that esclen is 5, I don't know the purpose of the byte
      following the escCode, but the two bytes after that are an int16
      index of the tocentrystart array.
  0x19 => (esclen==3) change font? I haven't checked VIEW's decoding of
      the next byte. I used the same decoding as for 0x04.
  0x1C => (esclen==2) I don't know its function. I just ignored it.

I doubt that this is an exhaustive list of the possible escape codes,
but it covers most of what I found in the Control Program, REXX, and
CSet/2 references. With a little more work and some playing with the
info compiler to produce (chosen-plaintext, ciphertext) pairs it
shouldn't be hard to pick out the whole decoding including the window
positions.

One other transformation I had to make was of the character box
characters. Maybe these are standard but they weren't in the font I was
using. These characters appear in strings in the dictionary.
They are given here in octal together with their translation:

  020, 021   => blank seems satisfactory
  037        => solid down arrow: used to give direction to a line in
                the syntax diagrams
  0263       => vertical bar
  0264       => left connector: vertical bar with short horizontal bar
                extending left from the center
  0277, 0300 => top right or bottom left corner; one is one, the other
                is the other and I can't tell which from my translation
  0301       => up connector: horizontal line with vertical line
                extending up from the center
  0302       => down connector: horizontal line with vertical line
                extending down from the center
  0303       => right connector: vertical bar with short horizontal bar
                extending right from the center
  0304       => horizontal bar
  0305       => cross connector, i.e. looks like + only slightly larger
                to connect with adjacent chars
  0331, 0332 => top left or bottom right corner; one is one, the other
                is the other and I can't tell which from my translation

History:
  October 22, 1992: version for initial posting
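
For anyone wanting a starting point, the rules in the Text and 0xff
escape sequence sections can be put together roughly as follows. This
is only a sketch under assumptions, not the original decoder: the names
decode_text and append are made up, the dictionary and local dictionary
are assumed to be already loaded into arrays, paragraph ends are
rendered as a blank line, a byte below nlocaldict is treated as a local
dictionary index, and every escape other than begin/end example is
simply skipped using esclen.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append s to the NUL-terminated buffer out (capacity cap), truncating
   silently if it does not fit. Hypothetical helper. */
static void append(char *out, size_t cap, const char *s)
{
    size_t used = strlen(out);
    if (used + 1 < cap) {
        strncat(out, s, cap - used - 1);
    }
}

/* Decode one article's text bytes into out; returns the decoded length.
   strings[] is the global dictionary, localwords[] the article's local
   dictionary (both assumed already read from the file). */
size_t decode_text(const uint8_t *text, size_t ntext,
                   const uint16_t *localwords, size_t nlocaldict,
                   const char *const *strings, size_t ndict,
                   char *out, size_t cap)
{
    int spacing = 1;     /* "spacing" is initially TRUE */
    int in_example = 0;
    size_t i = 0;

    if (cap == 0) return 0;
    out[0] = '\0';

    while (i < ntext) {
        uint8_t b = text[i++];
        if (b < nlocaldict) {            /* assumption: valid local index */
            uint16_t w = localwords[b];
            if (w < ndict) append(out, cap, strings[w]);
            if (spacing) append(out, cap, " ");
        } else if (b == 0xfa) {          /* end-of-paragraph */
            append(out, cap, "\n\n");
        } else if (b == 0xfc) {
            if (in_example) append(out, cap, "\n");
            else spacing = !spacing;
        } else if (b == 0xfd) {
            if (in_example) append(out, cap, "\n");
            else spacing = 1;
        } else if (b == 0xfe) {          /* explicit space */
            append(out, cap, " ");
        } else if (b == 0xff) {          /* escape: esclen, escCode, ... */
            uint8_t esclen, code;
            if (i + 1 >= ntext) break;   /* truncated escape */
            esclen = text[i];
            code = text[i + 1];
            if (esclen < 2) break;       /* malformed: must cover escCode */
            if (code == 0x0b) { in_example = 1; spacing = 0; }
            else if (code == 0x0c) { in_example = 0; spacing = 1; }
            i += esclen;                 /* esclen counts itself, not 0xff */
        }
        /* any other byte value: meaning unknown, ignored here */
    }
    return strlen(out);
}
```

With strings = {"Hello", "world"} and localwords = {1, 0}, the text
bytes 01 00 fa decode to "Hello world " followed by a blank line. Font
changes, cross references and the box character translation are left
out of this sketch.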