OS/2 Shareware BBS: 6 File

OS/2 Help File | 1995-06-26 | 240KB | 3,665 lines

═══ 1. The SpHyDir Project ═══

SpHyDir is an Object Oriented tool that builds documents for Web Browsers such as Netscape, Mosaic, and Web Explorer. With SpHyDir the user concentrates on the important issues of content and overall document structure. Since SpHyDir automatically generates the HTML (Hypertext Markup Language), it is possible to generate flawless Web documents without studying obscure syntax diagrams.

A Web document is an ordinary text file that contains formatting instructions in the form of HTML "tags". Normally, these tags are processed by a Web Browser (such as Netscape Navigator) and are then viewed on a computer screen. If the final image "looks right" then many authors are satisfied. This is reasonable when producing a single personal home page.

A library of information should adopt a common document structure. Similar information should have a common format. The same type of information should be presented in the same way in all files. The reader should be able to jump to related information, or to view a sequence of documents covering a larger topic. Traditional printed books solve this problem with chapters, page numbers, a table of contents, an index, and the editorial control of a professional publisher.

SpHyDir is the Structured Professional Hypertext Directory Manager.

It is Structured because it sees a Library that contains Documents that are made up of interlinked HTML Files. Each File contains Sections (Chapter, Topic, Appendix) which in turn contain figures, paragraphs, lists, and tables. Each of these document elements is presented as an Object and the overall document is structured as a tree of such Objects.

It is Professional because it makes it easy for the author to control more of the advanced Web features than any other tool. Simple HTML editors and Word Processing packages insert only the most basic tag options.
A professional job requires more: alternate text should be provided if the user chooses not to fetch images automatically, image sizes should be included in each reference so the browser can operate at maximum efficiency, parameters should be used to generate special effects efficiently rather than transmitting images, etc.

It is Hypertext because links between documents and to other files can be made through simple Drag-and-Drop. To build a link to a remote Web resource, just display the resource in Web Explorer. SpHyDir will pick the URL reference out of the WE window and attach it to text or to an image in the document under construction.

The Directories of an HTML library contain related files. An editor or word processor handles one file at a time. SpHyDir manages the structure of interrelated files and automatically generates navigational information such as Next and Previous pointers.

SpHyDir runs in OS/2 Warp and follows the Workplace model of its user interface. The user drags a document into the workarea. New elements are added to the document by dragging empty paragraph, image, list, or forms objects and dropping them into the document. Hypertext links are created by dragging files from the library or URL references from a Web Browser and dropping them on existing text or images.

SpHyDir only reads Web (HTML) documents. However, after editing it produces both a revised *.HTM file and a second *.IPF file that is input to the OS/2 Help compiler. IPF source can be used to generate INF documentation and HLP program help files. INF files can be viewed in OS/2 and (with a tool from IBM) in Windows. Viewing information from a local hypertext file is faster, and there are additional keyword search functions not available through the Web.

The document you are now viewing (and its related files) is also available as SPHYDIR.INF. Since this represents a useful example of many SpHyDir features, the source is available for download along with the SpHyDir program.

═══ 2. Project Status ═══

SpHyDir is updated irregularly. Generally something is posted at the start of the week and critical bug fixes can be posted at any time. Check here for the latest information.

═══ 2.1. June 26 ═══

On request, the IPF generation has been brought back to its prior level of support. SpHyDir should be able to generate INF files for HTML 2.0 constructs. At this time there is no attempt to map HTML 3.0 attributes to IPF or to try to deal with Tables and other new features.

The SpHyDir.INF file is now up to date. Characters that are in the Latin-1 set (and therefore in Code Page 850) should display correctly in the INF file. If someone configures the ENTITIES, CHARIN, and CHAROUT tables for another code page, sets CODEPAGE in CONFIG.SYS, runs IPFC, and does not get proper display of characters in any standard Latin IBM code page, please report back to the author.

═══ 2.2. June 21 ═══

[Note: a defective version of the June 21 code was posted between Midnight and 11:00 EDT on that date. Please replace it with the corrected version.]

═══ 2.2.1. Editing International Character Sets ═══

A World Wide Web has to deal with international character sets. Unfortunately, there are more characters in the World than can be easily handled in any simple one-byte encoding. Three solutions are available.

1. Unicode provides a two-byte character set that can handle all the Western languages and Chinese, Japanese, and Korean. This is the ultimate solution, but it is new and there are no good tools available.

2. There is a family of one-byte character sets that provide coverage for most languages. Since no single code can include the Western "Latin" alphabet and Hebrew, Arabic, and Cyrillic, the ISO proposed a set of 8859-x (where x=1...n) character sets covering each major alphabet group. The 8859-1 (also called "Latin 1") character set covers all the languages of Western Europe (and America, Australia, etc.).
Starting with HTML 2.0, the Web standards hold that an HTML document is assumed to be in 8859-1 unless stated otherwise. The HTTP and some of the HTML <HEAD> area conventions may provide for the use of other 8859-x tables for other alphabets, though the Netscape Browser, for example, only supports 8859-1 and Japanese.

3. Since a complex document may include characters (including Math and other special symbols) from different 8859-x character sets, HTML has a notation for "Entities". An Entity reference begins with the "&" escape, then contains the character name, and ends in a semicolon. For example, it is becoming widely accepted that the copyright symbol © can be represented by the Entity reference "&copy;".

From the beginning, SpHyDir read and created Entity references for the three special control characters "<", ">", and "&" (&lt; &gt; and &amp;). The Entities were converted to the native character for easy display and editing. When HTML was generated, these characters were converted back to Entities.

Starting with the May 28th release, SpHyDir supported foreign language Entity names, but without any translation. The "&" character was simply converted to a Smiley Face dingbat character (so that the real "&" character could be treated as normal text in expressions like "PC Lube&Tune"). A user could create new Entity references by entering the Smiley Face character in the Text Edit Window. (You can enter any character in the PC set by typing its decimal value on the keypad while holding down the Alt key. Since Smiley Face is 1, hold down Alt, type "1" on the numeric pad, and release Alt.)

This made the entry of foreign characters possible, but not natural. The name of a landmark in the Stuttgart area had to be rendered with the Entity reference to the German sharp s (&szlig;) as Schlo&szlig;kirche (castle church). The best approach would be to display foreign characters natively, just as SpHyDir translates the < > and & characters natively.
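
The round trip described above (Entities become native characters for editing and are turned back into Entities when HTML is generated) can be sketched in a few lines of Python. This is only an illustration, not SpHyDir's own code: the table `ENTITY_TO_CHAR` and the function names are hypothetical, and the table holds just the five Entities mentioned in this section.

```python
import re

# Hypothetical miniature entity table (name -> native character).
# Only the entities mentioned in the text are listed here.
ENTITY_TO_CHAR = {"lt": "<", "gt": ">", "amp": "&", "szlig": "ß", "copy": "©"}
CHAR_TO_ENTITY = {char: name for name, char in ENTITY_TO_CHAR.items()}

# A named reference: "&", a name, and the closing semicolon.
ENTITY_PATTERN = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def decode_entities(html_text):
    """Replace known named entity references with native characters."""
    def replace(match):
        # Unknown names are left exactly as written rather than guessed at.
        return ENTITY_TO_CHAR.get(match.group(1), match.group(0))
    return ENTITY_PATTERN.sub(replace, html_text)


def encode_entities(native_text):
    """Replace every character that has a named entity with its reference."""
    return "".join(
        f"&{CHAR_TO_ENTITY[ch]};" if ch in CHAR_TO_ENTITY else ch
        for ch in native_text
    )
```

With this sketch, `decode_entities("Schlo&szlig;kirche")` yields the native spelling for editing, and `encode_entities("PC Lube&Tune")` escapes the bare ampersand when HTML is written back out. The decoder makes a single pass, so text such as `&amp;lt;` is decoded once and not a second time.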

After trying unsuccessfully to come up with some magic application that would solve the problem, SpHyDir now caves in and handles the problem with the OS/2 standard Code Page solution. Now Schloßkirche will display in its native format in the SpHyDir Workarea and in the Text Edit window, provided that you are using Code Page 850 or convert the supplied tables to work with some other Code Page.

The original PC character set is known today as Code Page 437. It is the default and is the set you get if you don't know enough to change it in CONFIG.SYS. The 437 set does not have all the characters in the "Latin 1" group. However, it does have the most important Western European characters. Although SpHyDir recommends and supplies tables for the improved Code Page 850 (along with instructions to change the CODEPAGE statement in CONFIG.SYS to make it the default), a user who insists on keeping the old 437 Code Page could change the supplied tables to support it.

SpHyDir ships files named ENTITIES.850, CHARIN.850, and CHAROUT.850. They should be copied to the root directory of the HTML library. ENTITIES.850 provides a mapping between the Entity names and the code assignments for the corresponding characters in the 850 Code Page. This is just a text file and it can be used as a model to produce an ENTITIES.437 if you insist on using the old defective 437 Code Page. It can also be changed to any other national use Code Page supported by OS/2. The other two files provide translate tables between the ISO 8859-1 encoding and the Code Page locations of the corresponding characters in the IBM numbered Code Page. These are a bit more tedious to edit, but it can be done.

SpHyDir doesn't do anything special about the keyboard. The assumption must be that a user has already selected a Code Page and Keyboard layout that allow the entry of characters to the ordinary editors and windows of the system.
Rather than trying to duplicate (or worse, to override) that support, SpHyDir simply provides a mechanism, completely under the user's control, to translate HTML entities into the standard operating system character support.

If a file is encoded in the ISO 8859-1 standard, the SpHyDir supplied translate tables will convert it to Code Page 850 as it is read in. This version of SpHyDir is biased to Entity notation, so when SpHyDir generates HTML it will use Entity notation for all characters that have values with defined Entity names. It will convert back to ISO 8859-1 only the characters that do not have Entity names.

This version of SpHyDir will not display as native characters anything entered with the "numeric" entity notation. Only named characters will be translated for native display.

A new section of the SpHyDir document describes Code Pages in general and the ENTITIES definition file in particular.

═══ 2.3. June 20 ═══

Saw a request on the net for JPEG support. Added JPG and JPEG (case insensitive) to the list of extensions regarded as IMG file types. At this time, SpHyDir does not know how to find the size of JPEG files, so it does not add WIDTH and HEIGHT attributes as it does for GIF.

Discovered a user was still using TARGET. It was reported that if you drag the TARGET object from the toolbar (which was, by the way, blank now for some reason) and try to save the file, SpHyDir crashes. Well, the TARGET object isn't supposed to be there, which is why the tool is blank. However, there really still was a TARGET tool under that blank face, and yes, when you insisted on using it the HTML generation failed. This, however, also exposed a problem with the ToolChest, since SpHyDir would fail if the user created a tool with an unsupported type. Anyone who ran SpHyDir in the last few weeks has a TARGET tool in the ToolChest that won't go away until the TOOLDEF.TXT file is deleted from the root directory of the HTML Library.
So HTML Generation is now protected against invalid object types.

═══ 2.4. June 15 ═══

Corrected a bug deleting Links.

Added most remaining Netscape non-standard extensions. Note: Netscape has this terrible idea to make BORDER an attribute of IMG taking a numeric value. This contrasts with the use of BORDER in TABLE, which is just a Yes or No. SpHyDir II doesn't currently have the ability to distinguish valid ranges of values when the same attribute has different types of values in different tags. Reluctantly, BORDER is now defined as taking any type of value. So while it was possible previously to simply select BORDER from the Properties popup menu of a TABLE object and have it set to the right thing, now when you select BORDER you will get a dialog box and you have to type in "Yes" and press OK.

═══ 2.5. IMG and FIG ═══

SpHyDir has been struggling with the problem of Image Objects. The HTML 2.0 standard allows IMG tags to appear in the middle of sentences, headers, and captions. An IMG is regarded as a large letter. This is why the official ALIGN options for IMG (TOP, MIDDLE, BOTTOM) relate to how the image is aligned vertically with the characters immediately preceding and following it.

That is not the way that most people want to treat large images. Netscape proposed two additional alignments (LEFT, RIGHT) that allow text to flow around the margin of an image, much as word processors handle inserted graphics. HTML 3.0 also includes these values for ALIGN, but for the most part it wants to use FIG.

A Figure is enclosed in the <FIG>..</FIG> tags. The FIG tag contains a SRC attribute that points to a graphic file, much as IMG. FIG has a number of extensions over IMG that might prove useful if any browser supported it. However, at this point no mainstream Web browser knows about FIG and all ignore it. Of course, they do not ignore the markup inside the FIG tag, just the tag itself.

The ordinary text content of the FIG tag is supposed to be displayed on non-graphic browsers as an alternative to the image:

    <FIG SRC=monalisa.gif>
    A famous painting of a woman smiling.
    </FIG>

The IMG tag has an ALT attribute that contains alternate text for the same purpose. However, the FIG tag can contain much larger descriptions with paragraphs, lists, headings, and all other document elements. A major objective is to allow detailed description of diagrams for readers who are visually impaired and use text-to-speech browsers.

Currently, however, no major browser supports the FIG tag. Following standard practice, a browser ignores tags that it does not understand, but does not ignore their contents. This makes it difficult in the near term to use the FIG contents for their intended purpose, since they will be displayed by graphic browsers like Netscape, Web Explorer, and Mosaic.

However, FIG appears to be a way out of the hole into which SpHyDir has fallen. That hole was caused by the desire to have an Image Object. An Object is a nice big thing. You can drop GIF files on it. You can link to it easily. You can parameterize it with properties. The problem, of course, is that an object cannot occur between a couple of words, and since HTML 2.0 wants to regard the IMG tag as part of the ordinary paragraph text, this in the long run produced all sorts of problems.

The FIG is a much more suitable starting point for a Document Object. It stands alone, like a paragraph. It even aligns LEFT, CENTER, RIGHT, and JUSTIFY like a paragraph. It cannot appear in the middle of a sentence or in a Heading.

Since FIG isn't currently supported by browsers, there has to be a migration strategy. As with "<CENTER><P ALIGN=CENTER>", the SpHyDir approach to HTML migration is to do the thing "every which way" it can be done so that every browser must do precisely what is intended.
Of course, there is no way to fake the advanced features that caused FIG to be invented in the first place. The browsers will just have to catch up. However, it is possible right now to generate a FIG that can replace standalone and text-wraparound images:

    <FIG SRC=monalisa.gif ALIGN=LEFT>
    <IMG SRC=monalisa.gif ALIGN=LEFT ALT="A famous painting of a woman smiling">
    </FIG>

1. An HTML 3.0 graphical browser will understand FIG, display the GIF file, and ignore the contents.

2. An HTML 2.0 graphical browser will not understand FIG and will ignore it. It will not ignore the contents, and so will process the IMG tag. Again the GIF file is displayed.

3. A non-graphical browser may or may not understand FIG, but in any case will ignore it because the browser doesn't display images. It will then look at the IMG tag and again decide not to display the GIF, but it will print the ALT text.

Although the GIF file appears in both the FIG and IMG tags, the two are mutually exclusive. One or the other will be processed, but not both. As to the ALIGN=LEFT, this is valid for a FIG in HTML 3 and is a widely supported Netscape extension of IMG in HTML 2.

SpHyDir 1 did not deal with IMG tags that were embedded in the middle of a paragraph or Heading. SpHyDir II accommodated such IMG tags by creating another "dingbat" sequence in the text. The 0x08 character, which looks like a box with a hole, or roughly "[o]", is used to start and end an IMG reference. Between these two dingbats are the name of the GIF file, then optionally a blank and the alternate text.

However, in the first cut SpHyDir II continued to extract IMG tags from the start of a paragraph (or when they form a paragraph by themselves). They become IMG objects. Later on, SpHyDir II tries to decide if they should be merged back into the paragraph that follows them. The intent is to change this.
Images that are contained in a paragraph of their own, or that have ALIGN values LEFT or RIGHT, will be extracted as before to form a separate object. However, the object will then be regenerated as a FIG+IMG construct as described above. This will allow the Image Object to begin to have Properties derived from the more powerful FIG tag, instead of limiting it to the current IMG Properties.

Conversely, IMG tags that fall at the start of a paragraph containing other text, and that have ALIGN values of TOP, MIDDLE, or BOTTOM, will be treated as embedded images and will be represented by the 0x08 dingbat character sequence.

═══ 2.6. June 14 Update ═══

The Text Edit Window now allows drag and drop.

Position the cursor or select a phrase of text. Now Drag and Drop a GIF file anywhere on the Text Edit Window. This will generate an "embedded IMG". If no text is selected, the dingbat character and file name will be placed at the cursor position. If text is selected, then the selected text will become ALT alternate text within the dingbats.

Select some text. Hold down Ctrl-Shift and drop any file from the HTML library on the Text Edit Window. The previously selected text becomes a hypertext link to the file. Previously it was necessary to save the text, Link-Drop a file on the Paragraph Object in the workarea, and then select the text from the Hotword Selection window. Allowing links to be formed directly in the Text Edit window simplifies this process. Note that links created this way are saved only if the rest of the edited text is saved. Pressing the Cancel Button or closing the Text Edit window cancels the new Links as well.

Select text and drop a URL Object created by Web Explorer from the Workplace on the Text Edit Window. The selected text will be converted to a Link to the remote resource represented by the URL.

Do not select text. Leave the cursor at an insertion point in the paragraph (usually after a blank).
Drop a URL Object created by WE on the Text Edit Window. The title of the remote resource is inserted at the point of the cursor and becomes a hotlink to the document itself.

Unfortunately, it is not possible to drop Link Manager list items on the Text Edit Window. To maintain the integrity of the data being edited, the Text Edit Window locks up the underlying Workarea until the edit completes. This also blocks the Link Manager from functioning. So links have to come from the WPS environment, or you must save the text and use the Link Manager as it has traditionally been used.

Extra blank spaces and lines after the <LI>, <TH>, and <TD> tags were removed. They were errors picked up by Netscape, producing undesired results.

SpHyDir now declines to insert some of the ending tags that nobody else bothers to generate. Tables generated too many </TH>, </TD>, and </TR> tags. It made the HTML ugly and hard to read.

The temporary dialog for new Tables has been replaced by a more polished dialog box. Enter the number of rows and columns and select by a checkbox if they are to be labelled.

Select a Table Row Object. Click the Second Mouse Button. Select Create Another from the popup menu. A new row is created with the same number of label and cell objects as the previous row. [Adding a new column is harder and is left to a later date.]

When generating an Ordered or Unordered list, SpHyDir II has added an extra step because List Points no longer contain text. One might have to drop a new Point on the list, then go back and drop a new Paragraph on the Point. A shortcut is to pop up the Second Mouse Button window for the previous Point and choose Create Another. This not only creates another Point object, but it also creates another Paragraph under it and opens the Text Edit window directly. More generally, Create Another populates the new Point with another object of the same type as the first object contained in the old Point, so the trick works for Points of Images as well.

Forms bugs caused by the rewrite were fixed: TYPE did not default to TEXT, the NAME attribute was generated twice, and SUBMIT was incorrectly generated as HIDDEN.

═══ 2.7. June 12 SpHyDir II ═══

SpHyDir II now appears stable enough to remove some of the disclaimers. There are certain to be problems with the new HTML 3.0 tags that nobody is using, but more bugs have been fixed in the old code than seem to be problems with the new. SpHyDir II is now the "standard" distribution. Where the old code is mentioned, it is called "SpHyDir 1". The documentation has been updated.

A user had problems with the small characters. SpHyDir now remembers the WPS font that has been dropped on the Workarea, Properties Table, Edit Windows, and Link Manager. The Link Manager window can now be widened.

═══ 2.7.1. Entity Syntax ═══

Newer versions of the HTML standards pointed out a number of details about Entities. An ampersand is only regarded as a possible entity if it is followed by letters or numbers. The sequence "A & P" is legal. An entity doesn't require an ending ";" except to separate it from characters that could be part of the entity name. "A &amp P" is also valid. SpHyDir now recognizes these forms, though on output it always generates the full "A &amp; P" in the output HTML.

═══ 2.7.2. Target Objects become ID Property ═══

After an initial false start, SpHyDir 1 support for Targets (the object corresponding to the HTML <A NAME=xxx> tag) became stalled. The problem is that HTML standards and use permitted the <A> tag to include text and Headings. Unfortunately, this might mean a construct of the form:

    <A NAME=FUZZY>the end of one topic.
    <H2>Now for Something Completely Different</H2>
    On a completely unrelated matter, </A>

This is perfectly legal HTML, but any attempt to make structural sense out of it is hopeless.

HTML 3.0 presented a much better idea. Labels can be assigned to headers or paragraphs with the ID attribute.
This presents a Hypertext label whose location and purpose is unambiguous. Unfortunately, ID is not widely supported.

SpHyDir II takes its inspiration from the "Recommended" syntax of HTML 2 and 3. "Recommended" practice holds that an anchor should go inside a Header rather than including the header. This would produce

    <H2><A NAME="Python Introduction">
    Now For Something Completely Different</A></H2>

One big advantage is that this syntax is effectively interchangeable with the preferred (but currently not widely supported) HTML 3 construct:

    <H2 ID="Python Introduction">
    Now For Something Completely Different</H2>

Now a reasonable strategy appears. It eliminates ambiguity, gets rid of the Target object (which was cute but a problem), and provides for the sane migration from HTML 2 to 3.

First, ambiguous structure is resolved by asserting that legacy HTML should have been following the Recommended practice. The <A NAME=xxx> tag is logically associated with the very next thing that follows it no matter where the </A> is located. The Name, however, becomes an attribute of the object that contains the next thing that follows the <A>.

"<A NAME=X>Fred. <H2>Mary" - Assigns the name X to the Paragraph or other text object containing "Fred". The name has nothing to do with the following Mary section.

"Fred.<A NAME=X><H2>Mary" - Assigns the name X to the Section associated with the Header for Mary. Although the <A> tag is "outside" and "before" the Header, nothing else comes between the tag and the start of the Header. The name applies to the first thing that follows the <A> tag, not the superficial location of the <A> tag itself.

"Fred.<H2><A NAME=X>Mary" - HTML 2.0 Recommended practice. SpHyDir will convert the previous case to this when generating new HTML. That is, SpHyDir will move the <A> tag inside the <H2> tag.

"Fred.<H2 ID=X>Mary" - Recommended HTML 3.0 practice. Unfortunately, many browsers don't support this yet.
SpHyDir will recognize it and "backlevel it" to the previous case of Recommended HTML 2.0 practice. Later on, when the browsers catch up, this will become the syntax that SpHyDir will produce and SpHyDir will "upgrade" HTML 2 to 3.

The Target button has now been restored to the Link Manager window. This time it works. Pressing the Target button displays all the target labels in the current document tree. They can be dragged and dropped onto Images and text to form hyperlinks, just as previous Link Manager entries were used. SpHyDir does not intend to extend the reach of the Target button outside the current document. Rather, XSpO programs will be developed to identify targets in other documents or databases.

Although it is common practice to give short names to targets, HTML allows the ID/NAME value to be long and to have multiple words if it is quoted. Note that names are case sensitive. Using somewhat more descriptive names is helpful when a hypertext link must be selected from a long list of available labels.

═══ 2.7.3. CENTER Again ═══

CENTER and ALIGN=CENTER should be handled correctly in most cases. There is one area where problems can arise. When an IMG appears at the start of a paragraph, SpHyDir tries to break it out as a separate object. Objects are easier to change, since you can update properties with the table and can drop a new GIF file on the icon. Thus

    <P><IMG SRC=xxx ALIGN=MIDDLE>This is typical.</P>

is processed by SpHyDir to produce an Image Object and a Paragraph Object. The ALIGN=MIDDLE on the Image is the flag that warns SpHyDir to shuffle the image back "inside" the paragraph when HTML is generated. This is even harder to code than it is to describe. It becomes impossible, however, when you add CENTER:

    <CENTER>
    <P><IMG SRC=xxx ALIGN=MIDDLE>This is typical.</P>
    </CENTER>

The problem is that CENTER is not an attribute of IMG tags.
The "MIDDLE" value means to align the image vertically so that the text that follows is at the middle of the image. It has nothing to do with CENTER, which is horizontal alignment. An IMG can have ALIGN values of TOP, MIDDLE, BOTTOM, and with extensions LEFT and RIGHT. However, the way to center an IMG is to put it inside a centered paragraph as above (Netscape) or below (HTML 3.0):

    <P ALIGN=CENTER><IMG SRC=xxx ALIGN=MIDDLE>This is typical.</P>

SpHyDir is left with three bad choices. One approach is to give up entirely on Image Objects. This reduces the drag and drop functionality of the system. A second approach is to allow Paragraphs to be "opened" to expose embedded images as objects. This would be a major revision of current use. So, as the third choice, SpHyDir will try to salvage the current approach from the onslaught of new HTML features that make it more difficult.

═══ 2.7.4. Groupies ═══

Suppose you want to center two lines. One HTML 3.0 approach is to apply the center attribute to each separately:

    <P ALIGN=CENTER>Tastes Great!</P>
    <P ALIGN=CENTER>Less Filling!</P>

This does the job, but if you change your mind it becomes necessary to uncenter each separately. The Netscape extension does the job:

    <CENTER>Tastes Great!<P>Less Filling!</CENTER>

but there are a number of technical semantic problems with the CENTER tag that make it unlikely to survive standardization. The HTML 3.0 view is to group the lines:

    <DIV CLASS=BUDLITE ALIGN=CENTER>
    <P>Tastes Great!</P>
    <P>Less Filling</P>
    </DIV>

Alignment on a DIV applies to everything inside it. The best use of CLASS names in this context is unclear.

SpHyDir tries to migrate everything in the direction of the formal standard. To that purpose, SpHyDir introduces the Group Object, which is logically associated with a DIV tag. The Insert - Structure - Group option of the Second Mouse Button popup will create an empty group.
Alternately, Mark a range of objects and select Group from the popup to create a group containing all the marked objects (leaving them where they were).

However, the final use of Group and DIV is not clear. The standard provides little direction, and there is no body of use on the Net to point in the right direction. Clearly the current SpHyDir casual approach to Section Objects should be formalized by creating <DIV CLASS=xxx> markups. However, if a document spans multiple files in a tree, how does one determine the CLASS names (VOLUME, CHAPTER, SECTION, SUBSECTION, APPENDIX, etc.)? It should also be possible to convert a Group Object into a Section Object (by adding a Title) or demote a Section to a plain Group. What other transformations are needed?

═══ 2.8. June 5 SpHyDir II Beta ═══

SpHyDir II supports most of the HTML 3.0 and Netscape extended functions. Anything missing will be added quickly.

This code is Beta because a large amount of core logic had to be ripped up and reorganized. There has not been enough time to test everything. For the next few weeks, before using this code make an archive copy of the original file. Check carefully for any individual element that might have been dropped out of the document because of a bug. Please report problems back to the author.

Rewriting the documentation is one of the tasks ahead. As a result, SpHyDir II Beta is available only by FTP from pclt.cis.yale.edu in the sphydir subdirectory of /pub. Source will temporarily be unavailable to Professional users, though the key that enables Professional features on SpHyDir 1 continues to work on II.

═══ 2.8.1. Properties Table ═══

An HTML tag has attributes. SpHyDir II document objects have properties. There is largely a one to one correspondence between attributes and properties. HTML 3.0 adds a ton of attributes to previously fairly simple tags.
Even the <P> tag can now become

    <P ID="HomeTown" ALIGN=CENTER CLEAR=ALL>

SpHyDir II needs a way to display and change all these new properties for each object. The Properties Table is modelled after similar windows in Visual Basic and Delphi. Since PM doesn't exactly have the same kind of controls, SpHyDir settles for a Container (the same type of object as an open WPS folder or the Workarea) set to Details View.

There are two columns in the table, a property description and a value. It seemed to be confusing to list all the properties that every object might have, so the table lists only those with a significant value. However, if you point to the whitespace of the table (below the last entry) and click the second mouse button, a popup menu will list all the properties known to be valid for this type of object. During the Beta period, SpHyDir may be a bit fuzzy about this selection and may include a few properties that belong to a larger class of objects of which the current object is a particular case. For example, the properties of all the forms objects are jumbled together and need to be sorted out.

The value column of the table can be directly edited. Since this is a container, changing the value this way uses the same technique as renaming a file in WPS. Hold down Alt and click on the old value. A box appears around the old value and it can be edited. Clicking elsewhere in the table completes the process and saves the new value. As with WPS file renaming, this interface is not ideal.

Since the value column may be narrow and awkward, an alternative strategy is to doubleclick the property. This pops up a box with a bit more room to change the old value, and some edit rules that are a bit nicer to use.

If a property has a list of possible values, the list can be displayed by clicking on the property with the second mouse button. The possible values pop up as a menu.
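
The correspondence between an object's properties and the attributes of its tag can be sketched as follows. This is a hedged illustration in Python rather than anything taken from SpHyDir itself: the function name `render_start_tag` is hypothetical, properties with an empty value are skipped (mirroring the table's habit of listing only significant values), and every value is quoted for simplicity even though HTML allows simple values to be written without quotes.

```python
from html import escape


def render_start_tag(tag_name, properties):
    """Render a properties table (name -> value) as an HTML start tag."""
    parts = [tag_name]
    for name, value in properties.items():
        if value == "":
            # Properties without a significant value produce no attribute.
            continue
        # Quote every value and escape any embedded quote characters.
        parts.append(f'{name}="{escape(value, quote=True)}"')
    return "<" + " ".join(parts) + ">"
```

For example, the properties ID=HomeTown, ALIGN=CENTER, and CLEAR=ALL on a Paragraph object would be rendered by this sketch as `<P ID="HomeTown" ALIGN="CENTER" CLEAR="ALL">`.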
SpHyDir II does not feel strongly that it really knows the absolutely correct list of possible values. First, the HTML 3.0 standard changes a lot. Second, the same attribute name can have different possible values in different contexts. So the user is free to type in values that are not in the list. It's just that the second mouse button popup menu cannot be used to set other values.

It is a restriction for the near term that the values "Yes" and "No" may not be used for any property other than a logical switch. So don't try to title a section "No", because that won't work. Switches correspond to attributes whose presence signals an option, such as "COMPACT" in a list. Setting a logical switch to "No" will eventually delete it from the properties table, because "No" is the default setting for a switch and corresponds to the attribute not being present in the tag.

A few properties are changed implicitly by dropping things on an object. For example, dropping a GIF file on the Document object changes the Background property and produces Netscape/HTML3 backgrounds.

The Title of the Document or of a Section appears both as a property and as the caption of the Workarea Object. It can be edited by doubleclicking the Section object.

═══ 2.8.2. New Features ═══

The HEAD tags are now parsed. Along with the attributes of the BODY tag, they generate attributes of the document. Support is provided for BASE, ISINDEX, LINK (REV=HOME, TOC, INDEX, GLOSSARY, HELP, BOOKMARK), and the BODY Netscape attributes for background and color control. META attributes will be added if anyone can send me a list.

    <BLOCKQUOTE> What, Me Worry? <CREDIT>Alfred E. Newman </BLOCKQUOTE>

This construct should be properly supported. At this time, SpHyDir accepts both BLOCKQUOTE and BQ on input but generates BLOCKQUOTE in the output file. Although BQ is recommended, it is not supported by all current browsers, while BLOCKQUOTE is universal.
It has been observed that <CREDIT> is not supported by WE.

The sequence

    <UL>Some text. <LI>One. <LI>Two. </UL>

is upgraded so that the isolated text is rendered as

    <LH>Some text.</LH>

List header text is displayed as the caption of the List Object and can be changed by doubleclicking the object. Similarly, the construct

    <Table>Some text. <TR> etc.

is upgraded to

    <CAPTION>Some text.</CAPTION>

In general, SpHyDir II will put ending tags in all output even when the tags can be legally omitted. In general, SpHyDir will put all text in some block, with <P>...</P> as the default. In the previous case the <P> is not appropriate because <CAPTION> (and <CREDIT>, <LH>, and a few other such things) are themselves the block container and are not allowed to contain other blocks.

SpHyDir 1 tried to promote all IMG references to objects. There is a certain simplicity when this can be done, because you can drop a GIF file on an IMG object to set the file association. However, SpHyDir 1 was therefore unable to place IMG references in a Heading or in the middle of a sentence.

An IMG can now appear in the previously unsupported places. In this context, the IMG is treated as "honorary text". A dingbat (corresponding to the PC character for the value 8) appears before and after the embedded image. Between the dingbats, the first word is the name of the GIF file and the remaining text is treated as ALT text.

There is currently no nice support for creating new embedded images. It will be added by the end of the beta period. For now, you can always type this stuff in manually. In the Text editor window, position the insert point where you want the image to go. Hold down Alt, press the "8" key on the numeric pad, and release the Alt key. The dingbat appears. Type the name of the file in the usual HTML format, say "../icons/face.gif". Type alternate text if you choose. End by repeating the Alt trick to create a second dingbat.
(This can also be used to enter other unsupported tags and markup.)

No attempt has been or will be made to do HTML 3.0 Math markup.

No support is provided for the horizontal tab <TAB> tag.

FIG has not been attempted this week. It should not be hard, but needs a bit of study to get it just right.

It is not clear just how many of the proposed new forms of character emphasis should be supported. <DFN>, <Q>, <LANG>, <AU>, <PERSON>, <ACRONYM>, <INS>, <DEL>, <BIG>, and <SMALL> seem to be stretching things a bit. It is not clear that all of them will actually survive the standardization process, especially since few browsers do anything particularly meaningful with the HTML 2.0 character formatting tags that already exist.

Most of the attributes in the HTML 3.0 standard are supported. A bunch of Netscape stuff was added, but a few more things are needed before the end of the Beta. To know what is supported, select an object of the appropriate type, then click the second mouse button on the whitespace of the Properties Table. If the attribute doesn't show up in the list, it's not supported. Write me about it.

═══ 2.8.3. Table ═══

A major feature of HTML 3.0 is tables. They allow information to be laid out in columns. The rows and columns may have labels. Each cell of the table contains any type of document element (paragraphs, images, buttons, etc.). Architecturally, a table is a two dimensional version of an Unordered List.

Viewed as HTML, Tables involve a large number of confusing tags. They are hard to edit by hand. SpHyDir allows you to construct specialized tables, but it will automate the process of building simple N by M tables. Even if you want something special, like a heading that spans two columns, it may be easier to let SpHyDir start by generating the normal table and then change or delete the automatically generated entries that you don't need.

When you use the Table Object, a dialog box pops up. During the Beta it is a bit cheesy.
You can abort the dialog, leave a bare Table object, and add elements yourself (or you will be able to once all the elements are available in some toolbar). Alternately, specify the number of rows and columns and choose whether labels are to be generated or not for each. A simple 2x3 table might look like:

    CL0  CL1  CL2  CL3
    RL1  X11  X12  X13
    RL2  X21  X22  X23

Where CL# are the column labels, RL# are the row labels, and X## are the cells. HTML is going to ravel this out row by row. The nasty part is getting the tags right.

If SpHyDir II is asked to construct this table, it will produce a Table Object containing three Row Objects. The first Row Object contains the four Label Objects for the columns. The second and third Row Objects contain one Label Object (the row label) and three Cell objects. Initially, all are empty.

    TABLE
      ROW
        LABEL (CL0)
        LABEL (CL1)
        LABEL (CL2)
        LABEL (CL3)
      ROW
        LABEL (RL1)
        CELL (X11)
        CELL (X12)
        CELL (X13)
      ROW
        LABEL (RL2)
        CELL (X21)
        CELL (X22)
        CELL (X23)

The table is then filled in by dropping Paragraph Objects (or Image or any other document object) on each Label and Cell object to provide contents for that label or cell. The twelve objects that need to be assigned contents are all at the third level of the tree (under the three Row Objects that are in turn under the one Table Object). It is fairly easy to see what needs to be done.

During the Beta period, SpHyDir II may not support all the defined table attributes. VALIGN, COLSPEC, ROWSPEC need some study.

Currently SpHyDir doesn't have tools to create new Rows or Cells. This is by design. The intent is that the table be expanded by selecting an existing object, clicking the second mouse button, and then choosing Create Another from the menu popup. If SpHyDir can dope out the table, it would then be able (after asking for your intention) to create all of the objects needed for another column or row. However, this is not currently available.

═══ 2.8.4. Things ═══

After learning more about HTML details, it became clear that SpHyDir 1 had made a big mistake. List Points should not contain text. Semantically, a proper list is of the form:

    <UL>
    <LI><P>First point.</P></LI>
    <LI><P>Second point.</P></LI>
    </UL>

Nobody ever actually codes a list this way, so it is easy to miss. In HTML 2.0, the <LI> and <P> tags have no attributes, so they appear to be redundant. Then along comes HTML 3.0. Now proper construction of the list is "Recommended", and any tool that plans to read in HTML had better understand this implied structure because <LI> and <P> tags now have meaningful attributes. The SpHyDir 1 view that a List Point object contained the text of the implied paragraph has now become unworkable.

When SpHyDir II reads a document with lists, it creates a second level of tree indentation. The List object contains Point objects, and each Point object now contains paragraphs and stuff. You can no longer doubleclick a Point to get the Text Edit window.

Since the opportunity presented itself, a Point in a definition list has as one of its properties the term from the <DT> clause. This can be changed with the Properties Table.

This mess had to get cleared up before it was possible to do tables. A Table looks like a List. The Points of the Table are Rows, which are themselves like a nested List. The Labels and Cells act like points. If this thing was going to be added, then the original List Points had to get cleared up.

This produces a generalization about Things that contain Stuff. In addition to the obvious Things (the Document, Section objects, and the three types of Lists) there are ten other Things that contain other objects: Points, Table Cells, ADDRESS, BLOCKQUOTE, DIV, FIG, FN, NOTE, BANNER, and CENTER. It did not seem to make sense to structurally distinguish DIV and BANNER (a few months ago they were merged in an earlier version of the HTML proposal).
CENTER is an obsolete construct that has not been cleanly replaced. Points, Cells, Address, and BlockQuote seem to need their own objects. The rest SpHyDir II will try to collect under the category of a GROUP object.

The icon for a group is a brightly colored Folder. It is an objective that Group become an option of the second mouse button popup menu for the workarea when items are marked. Choosing Group will create a new Group item and place all the marked objects in the group. The collection can then be assigned properties by assigning the property to the Group that contains it. In particular, this is the preferred way to center a collection of things.

There may be a transition during the Beta period for anyone using the previous SpHyDir 1 haphazard approach to <CENTER>. In a few weeks, SpHyDir will have dug itself out of the hole. Lacking any Group object, SpHyDir 1 assigned a Centered attribute to every object between <CENTER> and </CENTER>. The new thinking holds that if you choose to center a collection of objects, then the objects must collectively form a group with common properties. So SpHyDir will create the group for you and will use both the new <DIV ALIGN=CENTER> and the old <CENTER> syntax. Currently, however, CENTER may not work right.

═══ 2.8.5. Internal Reorganization ═══

There were two key decisions in SpHyDir II. The first was to completely reorganize key areas in order to make them ready for Object Oriented technology. The second was the choice to stick with existing Rexx and not use Object Rexx quite yet.

SpHyDir uses the services of VX-Rexx to store information. The Workarea that the user sees is a VX-Rexx container in Tree-Name view. What the user doesn't see is that the records in that container contain all the text and attributes.

SpHyDir 1 was designed around the HTML 2.0 standard. Since it was slowly moving toward formal adoption, it seemed that any design that handled 2.0 would be good enough to last for years.
Then the Netscape folks captured a big share of the market and pushed the 3.0 features to the front. This added a whole bunch of attributes that would break the SpHyDir 1 design for storing information.

One solution would be to build real Object Rexx classes and store the information there. Instead, SpHyDir II ripped out a lot of ugly, bug-prone logic and created a simpler (though possibly slightly less efficient) general purpose data store within the existing VX-Rexx support.

Internal control structures were generalized by more aggressive use of Stem variables. SpHyDir 1 processed HTML input and generated HTML output from large SELECT/WHEN blocks. SpHyDir II breaks this logic into a mass of small subroutines. The subroutine names are registered in stem tables indexed by the tag name, the attribute name, or the object type.

Rexx doesn't make it easy to initialize large syntax tables. Some tables are handled by simply listing all possible words in a character string. Rexx has some very nice WORDxxx functions that make it easy to manipulate such strings.

However, such wholesale changes will produce bugs when an isolated piece of the old code is missed during the update. A few routines handle most of the logic for creating new objects. However, initial debugging discovered that the New button in the Text Edit window was a special case that created a new paragraph without drag-and-drop or menus. That logic had to be updated also. The Beta period should identify any other special cases that slipped through the cracks.

═══ 2.9. May 28 Release ═══

IBM released the Web Explorer Beta (5/25) on Friday. It creates URL-file objects that can be dragged from WE to a disk directory to save interesting Web locations in WPS. You can drag these URL objects from WPS and drop them on SpHyDir objects to create Links, just as you previously dropped Link Manager URLs and XSpOs.

A ToolChest Window has been added.
The ToolChest is a container that is intended to provide an extension or alternative to the current Toolbar. Currently, however, the ToolChest simply duplicates a subset of the Toolbar objects (though it adds descriptive captions missing from the Toolbar).

═══ 2.9.1. Preserve Entities in HTML ═══

In HTML, an "entity" is a special character represented by a name preceded by "&" and ending in ";". Because they have special significance to the syntax, "<", ">", and "&" must be represented in HTML documents as "&lt;", "&gt;", and "&amp;". SpHyDir previously supported only these three entities, based on the incorrect assumption that all the other entities existed only to support ISO accented characters that would be better displayed using the international character set. However, ISO editing got put off, and a more careful examination of HTML 3.0 entities shows that they will include ISO, Greek, math, dingbats, and many other characters not found in any single code page.

To allow simple editing of the "&" character, SpHyDir has to change the introducer to some funny character. Therefore, as HTML is parsed, the leading "&" of an entity is converted to the 0x01 ("smiley face") PC character, and it is converted back to "&" on output. There is no explicit GUI support for entities, but as with any funny character you can enter it from the keyboard. To get a copyright symbol (©), hold down ALT, press the 1 key on the numeric pad, then release ALT (now you have a smiley face), then type the name "copy" and a trailing ";". Funny characters can be deleted or edited just like any other character.

═══ 2.9.2. ../ICONS/tiger.gif Image Left of Heading ═══

Although it doesn't solve the entire problem, SpHyDir now has limited support for putting an image in a Section title. It allows one image to appear in front of the title.
Worse, except for the H1 at the start of the document, you cannot create this construction with normal SpHyDir drag and drop but instead have to (ick) edit the HTML file with a plain text editor. Clearly there is room for improvement.

In HTML terms, the construct looks like the following (from the PCLT home page):

    <H1>
    <IMG SRC="exitsign.gif" ALIGN=MIDDLE WIDTH="218" HEIGHT="171">
    Welcome to PC Lube and Tune
    </H1>

The <IMG> tag has to come after the <Hn> and it must have an ALIGN value (MIDDLE generally looks best).

To put an image in front of the H1 tag in the document, drag the Image Object from the toolbar and drop it on the Document Object. The Image Object will be created just before the first Section object. Now drop a GIF file on the Image Object and set its ALIGN attribute to MIDDLE (or TOP or BOTTOM).

Once the Image is set up, it will be read in by SpHyDir and regenerated properly. So the worst case is to set it up manually once.

═══ 2.10. SpHyDir II - Statement of Direction ═══

When SpHyDir was first created, some objectives were announced in the documentation. Subsequent developments have shown some of these claims to be ill advised. It seems appropriate to provide users with advance warning of a change in direction. The term "SpHyDir II" is now being introduced to reflect some new ground rules that will be required to make further progress.

The biggest mistake was to promise that SpHyDir would generate HTML that would pass through a validator. Effectively, that ties it to HTML 2.0 syntax (for which there are standards) at a time when everyone is moving rapidly to HTML 3.0 or "Netscape" extensions long before rigorous validation is possible. SpHyDir users, or at least the more vocal of them, want the extensions now.

There seems to be no limit on the number of structures that HTML can include, nor the number of attributes that will be added to HTML tags.
There is no room at the top of the screen in the Toolbar or entry fields for everything that the new language features support. If the number of features grows much larger, then simple icons will not be enough to remember what's what. Furthermore, users are asking for specialized features that may require user customization.

At the same time, HTML remains a poorly specified language. Important syntax changes occur from one version to the next. There are some differences between the "human" explanation of what is going on and the formal syntax descriptions. But the most important feature is that there is an enormous amount of invalid HTML on the Web that is accommodated because the mistakes don't prevent the browsers from displaying the correct image to users, and the final image is the only thing that seems to count. Worse, the standards documents explicitly mention common invalid constructions and urge browsers to accommodate them.

When people are first learning C, it is a common mistake to code

    if (a=5) ...

when the correct statement is

    if (a==5) ...

Yet no matter how common the error is, nobody would expect a C compiler to accommodate the user and automatically "correct" the program. Web tools, however, are expected, perhaps even required, to accommodate HTML syntax violations. Still, SpHyDir cannot advance to support more complicated syntax without some rigour.

SpHyDir has to resolve a conflict between two goals:

    <UL>
    <LI>To support the common use of ordinary people.
    <LI><P>To encourage "recommended" practice</P></LI>
    </UL>

HTML has levels of conformance. "HTML.Recommended" holds that text should be contained in a block (P, PRE, BQ) instead of standing alone. The "<LI>text" construction is thus not "Recommended" but is widely used.

There is an important difference between the human explanation of what is going on here and the formal semantics.
Books and articles on the "Complete Moron's Guide to HTML" will divide tags into those that create paragraph breaks (P, H1..H6, CENTER, UL, OL, DL, LI, HR, etc.) and those that don't (B, I, A, IMG). BR creates a line break but not a paragraph break, though <BR><BR> might be hard to distinguish visibly on most browsers. The non-rigorous explanation would be that <LI><P> seems redundant because <LI> itself generates the necessary break.

HTML is rigorously defined in a document called the "DTD". The DTD defines a %block as a P, UL, OL, DL, PRE, BQ, FORM, etc. At the Recommended level, a UL contains LI structures, and an LI contains %blocks. At the Recommended level, ordinary text is not supposed to be in the document BODY or in a list or list element. It is only supposed to appear as the contents of a P, PRE, BQ, header, etc. The DTD identifies "<LI>text" as not Recommended, but it doesn't specify what to do about it.

The DTD allows some ending tags to be omitted. The effect of a previous tag ends when a new tag is encountered that cannot be contained within the current structure. Thus "<LI> [stuff] <LI>" implies "<LI> [stuff] </LI><LI>" because one list item cannot occur within the previous list item.

Having said this, there is not one shred of usable standard to transform tolerated HTML into Recommended HTML.

    <LI>Speak softly and carry a big stick</LI>
    <LI><P>Speak softly and carry a big stick</P></LI>

Once the leading <P> is added, the ending </P> can be deduced because the </LI> cannot be inside the paragraph. However, no amount of DTD will ever explain why the free text in the list item should have been turned into a paragraph in the first place. This is reasonable, because the reader could argue that BLOCKQUOTE is a justifiable alternative to the P tag in the case of this particular famous phrase.

SpHyDir can only accomplish its original objective if its HTML parsing is heuristic. A parse that is driven simply by syntax tables will not be quite enough.
This means that SpHyDir will keep, though it ought to clean up, its current logic in the Read_HTML and Parse_Block routines.

At the same time, SpHyDir won't really work as an object oriented environment unless users can create their own objects. Sometimes people want to support new or experimental tags. Sometimes an author uses the same special construction in all documents. There have been requests, for example, for a document toolbar object of the form:

    <P><A HREF= ><IMG SRC= ></A>...<A HREF= ><IMG SRC= ></A></P>

One can imagine creating this with an XSpO (an external Rexx program dropped onto the Workarea), but it would then lose its identity. HTML 3.0 provides the solution with the CLASS attribute.

    <P CLASS="TOOLBAR">
    <A HREF= ><IMG SRC= ></A>...
    <A HREF= ><IMG SRC= ></A>
    </P>

CLASS allows most block tags to be assigned a user specified category name. CLASS definitions are intended to be hierarchical and there is some implication in the standard of one class being derived from another with inheritance.

SpHyDir can also make object distinctions based on the value of other attributes in the tag. For example, the current code distinguishes between <BR>, which is treated as part of paragraph text, and <BR CLEAR=ALL>, which is treated as a document object like <HR>.

The proposal, then, is to allow new SpHyDir objects to be defined externally. The new objects would appear in the ToolChest Container window (maybe in the menu popup if it can be changed dynamically). New objects would be recognized as the HTML is read in by a new Tag name, the presence of an attribute or a special value assigned to an attribute on an existing tag, or a CLASS attribute specification.

This is not designed to allow SpHyDir to assign user defined objects to raw HTML from external sources. The only claim is that SpHyDir should be able to read back HTML that it had previously written and recognize and redisplay extended constructions.
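The three recognition criteria in the proposal (a new tag name, an attribute or attribute value on an existing tag, or a CLASS specification) can be pictured as a small rule table. The Python sketch below is only an illustration of that idea, not SpHyDir's Rexx implementation. The object type names, the `SIDEBAR` tag, and the rule layout are all hypothetical.

```python
import re

# name, optionally followed by ="quoted value" or =bareword
ATTR_RE = re.compile(r'([A-Za-z][\w-]*)(?:\s*=\s*(?:"([^"]*)"|(\S+)))?')


def parse_tag(tag_text):
    """Split '<P CLASS="TOOLBAR">' into ('P', {'CLASS': 'TOOLBAR'})."""
    inner = tag_text.strip().lstrip("<").rstrip(">").strip()
    name, _, rest = inner.partition(" ")
    attrs = {}
    for m in ATTR_RE.finditer(rest):
        value = m.group(2) if m.group(2) is not None else (m.group(3) or "")
        attrs[m.group(1).upper()] = value
    return name.upper(), attrs


# Each rule: (object type, tag name, required attribute, required value).
# All object type names here are made up for the example.
RULES = [
    ("Toolbar", "P", "CLASS", "TOOLBAR"),    # CLASS attribute specification
    ("BreakObject", "BR", "CLEAR", "ALL"),   # special value on an existing tag
    ("Sidebar", "SIDEBAR", None, None),      # hypothetical new tag name
]


def recognize(tag_text, default="builtin"):
    """Return the extended object type for a tag, or `default` if no rule fits."""
    name, attrs = parse_tag(tag_text)
    for obj_type, tag, attr, value in RULES:
        if name != tag:
            continue
        if attr is None:
            return obj_type
        if attr in attrs and (value is None or attrs[attr].upper() == value):
            return obj_type
    return default


print(recognize('<P CLASS="TOOLBAR">'))  # Toolbar
print(recognize("<BR CLEAR=ALL>"))       # BreakObject
print(recognize("<BR>"))                 # builtin
```

A tag that matches no rule falls through to whatever built-in handling already exists, which fits the stated limit that only constructions SpHyDir itself wrote need to be recognized.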
Simple user objects could be defined based on the fundamental attributes that SpHyDir currently potentially assigns to each object (an icon, caption, text content, name, variable name, variable value, etc.). Objects could also be defined that contain other objects from a certain set of types.

In its simplest version, this will be used to create "macros". For example, a SECTION-like object could be created named CHAPTER. It would be managed like an ordinary SECTION, but it would generate something more complicated:

    <DIV CLASS="CHAPTER" CLEAR="ALL">
    <HR SIZE=6>
    <H1 ALIGN="CENTER">This is the ordinary Section title</H1>
    [ordinary section contents]
    </DIV>

The second time through, SpHyDir II will recognize its own construction by the DIV tag with the CHAPTER class. It will match up the </DIV> ender to determine the scope. It then has to suck up and discard the boilerplate tags (the HR SIZE=6 in this case). Although there is an HR object in the SpHyDir vocabulary, this particular HR is part of the formal expansion of a CHAPTER object and should not generate a separate object. Except in the HTML that it generates, the CHAPTER object would then behave in every way as if it were the existing SECTION object.

Now comes the tricky part. Programmers should instantly recognize that, in object oriented lingo, this example creates the CHAPTER class as a subclass of the SECTION class, inheriting SECTION's methods (mostly the way to edit titles and its behavior as a container) but overriding a few methods (HTML parsing and generation). It seems likely that many new objects will have behavior exactly modelled on Paragraph, Section, or Image objects. SpHyDir might add a few new built-in objects on which new things can be constructed. However, full implementation of the concept will require that everything be rewritten in Object Rexx, and that will be disruptive enough to be put off until it is unavoidable.
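The CHAPTER round trip described above (generate the expansion, then recognize it on the next read and discard the boilerplate) might be sketched as follows. This is a Python illustration under simplifying assumptions, not the planned Rexx or Object Rexx code. The function names are hypothetical, and the regular expression only handles a chapter that contains no nested DIV.

```python
import re


def generate_chapter(title, body_html):
    """Expand a CHAPTER object into the HTML shown in the text."""
    return (
        '<DIV CLASS="CHAPTER" CLEAR="ALL">\n'
        '<HR SIZE=6>\n'
        '<H1 ALIGN="CENTER">%s</H1>\n'
        '%s\n'
        '</DIV>' % (title, body_html)
    )


CHAPTER_RE = re.compile(
    r'<DIV\s+CLASS="CHAPTER"[^>]*>\s*'   # opening DIV with the CHAPTER class
    r'<HR\s+SIZE=6>\s*'                  # boilerplate rule: matched, then discarded
    r'<H1[^>]*>(?P<title>.*?)</H1>\s*'   # the ordinary Section title
    r'(?P<body>.*?)\s*</DIV>',           # contents up to the first </DIV> ender
    re.IGNORECASE | re.DOTALL,
)


def parse_chapter(html):
    """Recognize a generated CHAPTER; return its title and contents, or None.

    Simplification: the first </DIV> ends the chapter, so nested DIVs
    are not handled by this sketch.
    """
    m = CHAPTER_RE.search(html)
    if m is None:
        return None
    return {"type": "CHAPTER", "title": m.group("title"), "body": m.group("body")}


html = generate_chapter("Getting Started", "<P>First paragraph.</P>")
print(parse_chapter(html))
```

The HR never appears in the parsed result, which corresponds to the requirement that the boilerplate rule not generate a separate HR object.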
SpHyDir has no dependencies on other programs, though it is distributed with a few utilities (GBM, RCS) that have proven useful. However, the new release of GOSERVE (2.30) is getting too slick to ignore. Since GOSERVE uses Rexx, and SpHyDir is written in Rexx, a closer relationship should be worked out between the two programs.

Currently, when SpHyDir wants to test a document, it calls Web Explorer or Netscape with its local file name. However, hyperlinks from this first document will not work if they are designed for another server and have fully qualified URLs, or if the document contains a BASE statement with the production server's name. This made support for BASE a long requested but seemingly impossible objective.

Any machine running SpHyDir can probably run GOSERVE in the background. GOSERVE can be told to serve documents out of the HTMLLIB directory tree. The final trick is to override the name of the production server machine with a pointer to the local loopback IP address. This can be accomplished by adding entries in the \TCPIP\ETC\HOSTS file with values like:

    127.0.0.1 sphydir
    127.0.0.1 pclt.cis.yale.edu

If TCP/IP is set to check the HOSTS file first, then with these entries any URL for "http://pclt.cis.yale.edu/pclt/sphydir/status.htm" will be redirected to the GOSERVE running on the local machine, which will then fetch pclt/sphydir/status.htm out of the HTMLLIB directory. BASE will then work, and all the URLs that point back to PCLT work. Of course, to FTP files to the real PCLT server I need to use a different alias for that machine (or temporarily change the HOSTS file).

GOSERVE also provides an environment to test FORMS and CGI-like programming. Some days it seems like an HTML form displayed in the Web Explorer window would be a more flexible way to choose options and configure SpHyDir or document objects than popup VX-Rexx windows.
In any event, future releases of SpHyDir may move toward almost requiring that GOSERVE and either Web Explorer or seamless Netscape be running.

═══ 2.11. May 15 Release ═══

Document Objects now have a popup menu. Click the second mouse button to display Open, Settings, Insert, Create Another, Mark, Delete, etc.

Insert provides a quick way to add a new Paragraph, Image, Point, or List without going to the Toolbar. You can also Insert a Horizontal Rule, for which there is no tool (though you have been able to generate it with Alt-H for quite some time). Create Another creates a second object like the current object (say another paragraph or point). Mark duplicates the old Alt-L. Delete duplicates the old Ctrl-D. Currently you can only Open the contents of the object (this duplicates the current DoubleClick function). Settings is under construction.

═══ 2.12. May 8 Release ═══

SpHyDir now refreshes the title of a subdocument as it reads the parent file in. Thus if you change the title of a subdocument, the pointer to it will be changed the next time that the parent (or the entire tree) is processed.

═══ 2.12.1. Test (F5) ═══

The File-Test (F5) operation now dynamically communicates with running copies of popular browsers. SpHyDir first generates a temporary copy of the document in TEMPDOC.HTM in the HTMLLIB root directory. In previous releases it then started Web Explorer to view the document. Now this is the last alternative. Before launching a new WE, SpHyDir now tries two things:

1. SpHyDir first attempts to establish a DDE link with a running copy of Netscape. For this to be useful, Netscape should be running in seamless WINOS2 mode. SpHyDir passes Netscape a request to display the TEMPDOC file.

2. If Netscape is not found, then SpHyDir looks through the windows on the screen for a running version of Web Explorer. If one is found and it is already viewing an older version of TEMPDOC, then SpHyDir sends it an F5 to refresh the document.
If it is running and viewing something else, SpHyDir enters the name and path of TEMPDOC in its entry area and sends an Enter key.

Netscape has known problems running seamless on OS/2. Fortunately, the most serious issues involve user interaction with the menus. By controlling Netscape from SpHyDir, interaction is minimized. However, the code to control both Netscape and WE is new and may require some fine tuning.

═══ 2.12.2. The BR Object ═══

The design of a <BR> object finally became clear when a user reported by e-mail that he was having trouble with Netscape extensions. Normally, an Image either appears by itself or is aligned with a single line of text. The Netscape ALIGN=LEFT (which SpHyDir has "supported" from the start) causes multiple lines of text to flow through the space left to the right of the image. Unfortunately, this option doesn't just flow text; it also "sucks up" any following images. The Netscapism for breaking this pattern and starting the next line under the image is to add a <BR CLEAR="ALL"> tag.

I have previously noted the need for a <BR> object to separate buttons in a form. The problem was to distinguish, when reading in the HTML, a <BR> acting as an object from a <BR> in the middle of a paragraph that acts instead like a character or a CR/LF pair. Although SpHyDir has tried to avoid non-standard HTML, it seems very compact to declare that <BR CLEAR="ALL"> would be recognized as the Object and plain <BR> as the character.

Browsers should ignore attributes they don't understand. The only problem occurs if you try to validate HTML that contains Netscape extensions with a validator looking for HTML 2.0. If you don't want <BR CLEAR="ALL">, then don't create the object.

Incidentally, to create such a tag, position at the next object and press Alt-B. The BR Object will be positioned in front of the currently selected object. For the most part, the BR and HR objects are very similar. Neither has a specific icon at the moment.
═══ 2.12.3. Backup using RCS ═══

Backup of HTML files has been a serious issue. First, SpHyDir cannot be subject to terribly aggressive testing between weekly releases. If a syntax error occurs, SpHyDir can abort in the middle of writing a file. If SpHyDir encounters HTML that it doesn't recognize, information can be lost. The previous strategy of saving the old copy of the file in the BACKUP directory addressed only part of the problem.

SpHyDir now introduces the heavy artillery. For other text files, the most powerful free software system is the RCS version control package from Unix. When fully exploited, it allows several people to check out and work on files in a shared library. It remembers changes to the file and who made the changes. It is possible to reconstruct older versions of the data if something goes wrong.

Use of RCS is optional. If it is not used, SpHyDir continues to make a copy of the previous version of the file in the BACKUP subdirectory.

For each original data file, RCS builds a control file that keeps a copy of its current contents and the information needed to recover any previous versions. The first version is "1.1" and each time the file is changed a new version number is generated. By default, RCS will archive f:\pclt\sphydir\status.htm in a control file named f:\pclt\sphydir\RCS\status.htmv. That is, it stores files in the RCS subdirectory of the path where the data file is found, and it adds the suffix character "v" to the file type.

Some future version of SpHyDir may maintain enough variables for a document to allow individual decisions about what to manage under RCS. Currently, however, if you use RCS at all you have to use it as the backup for the entire library. SpHyDir is triggered to use it if there is an RCS subdirectory under the HTMLLIB root.
On my machine, "f:\pclt" is the HTMLLIB for PCLT articles, so SpHyDir looks for "f:\pclt\RCS" to decide to use RCS, for all of the documents in all of the directories under f:\pclt.

Unexpectedly, RCS will not create the needed subdirectory automatically, and it will not quite work correctly if the subdirectory doesn't exist. A future version of SpHyDir may fix this, once I develop more confidence about the best arrangement. For now, manually create RCS subdirectories throughout your HTMLLIB tree if you intend to use this facility. If you forget, RCS will create the control file in the same directory as the data file and you can create the RCS subdirectory and move the file to it later on. This is not a problem on HPFS volumes, but it could present an issue on FAT directories where "HTMV" might get truncated to "HTM". RCS backup is probably not a good idea for SpHyDir users with only FAT directories.

Before generating new HTML, SpHyDir backs up the previous version of a file by issuing the command:

ci -xv -l -m"backup" -t-"backup" xxxx.htm(l)

This runs CI.EXE (Check In) of the RCS version control system. The -m and -t parameters provide dummy log messages so the program does not prompt for a description of changes or of the file. The -l parameter checks the file back out immediately (so that it remains in the library and can be rewritten). The -xv adds a "v" letter on the end of the file type (htmv or htmlv) to provide the file type of the RCS control file.

SpHyDir does not check the PATH for a copy of the RCS executables. It tries to use RCS based on the existence of a directory in HTMLLIB. It is the user's responsibility to install RCS on the OS/2 system before using this facility. Unzip the RCS567PC.ZIP distribution file and copy the contents of the BIN32 subdirectory to a library in your PATH. RCS is now available on the same FTP file servers as SpHyDir itself.

SpHyDir is using RCS to provide a super safe backup, not to do true version control.
There is no provision to check out locked files, or to check in and unlock a final version. However, the user can build real version control outside SpHyDir by issuing RCS commands before or after running SpHyDir to provide real parameters and version numbers.

RCS is a serious system with some heavy duty manuals. SpHyDir's only direct function is to call CI.EXE to generate the backup. Comparing different versions of a file, or recovering old versions from the backup, requires direct use of the other RCS commands. RTFM. If RCS appears to be too complicated, feel free to continue to use the old SpHyDir BACKUP.

Use of RCS was requested in E-mail by a user several weeks ago. Initially it seemed like a really bad idea. The problem is that RCS and all the other version control systems track changes by line. Since SpHyDir only puts a CR/LF line break at the end of paragraphs, it appears to have really, really long lines. Change one word, and RCS regards the entire paragraph as changed. Flowing the text into 80 character lines would not make much difference, because any change in one section will flow changes onto all the subsequent lines of the paragraph. However, there are no better version control mechanisms, and after some consideration the long lines do not appear to be unworkable.

Before reporting any bugs, please realize that the version being "checked in" to RCS is not the version on the screen. What is being checked in is the old version on disk from before the current edit session. So if you read a file in, make a ton of changes, press F2 to save it, and look at the Console window, do not be surprised if it reads:

F:\PCLT\sphydir\RCS/STATUS.HTMv <-- F:\PCLT\sphydir\STATUS.HTM
file is unchanged; reverting to previous revision 1.3

It is not saying that there are no changes in the current version, just that there were no changes in the old version that you are just about to replace.
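For reference, the backup command quoted earlier in this section can be written out as an argument list with each parameter annotated. This is a Python sketch for illustration only; the function name backup_command is invented, and the quotation marks around "backup" are omitted on the assumption that they are command-line quoting rather than part of the message. The command itself is not run here, since that requires RCS to be installed.

```python
def backup_command(html_file):
    """Build the CI.EXE argument list used to back up one file."""
    return [
        "ci",
        "-xv",        # control file type is the data file type plus "v"
        "-l",         # check the file back out at once so it can be rewritten
        "-mbackup",   # dummy log message, so there is no prompt for changes
        "-t-backup",  # dummy file description, so there is no prompt for it
        html_file,
    ]
```

On a system with CI.EXE in the PATH, a list like this could be handed to subprocess.run to perform the check-in.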
RCS is a Unix utility that has been ported to the OS/2 environment using the EMX package. EMX is a version of the GNU development tools and the GCC compiler. These tools are normally found in the /unix subdirectory of the OS/2 files at ftp.cdrom.com and ftp-os2.nmsu.edu. The minimum files needed are the EMX runtime DLL library (emxrt.zip) and the RCS distribution (rcs567pc.zip). They will also be added to the SpHyDir FTP directory.

This facility is currently more "experimental" than the rest of SpHyDir, so its output is not suppressed. The CI command writes to "standard output" and VX-Rexx captures that file and displays it in the VX-Rexx Console window. If the service seems to work well, this output will be suppressed in a future release. Meanwhile, the Console window will have to be manually closed when SpHyDir ends.

═══ 3. SpHyDir Project Objectives ═══

Produce the highest quality HTML documents automatically.
Upgrade obsolete syntax to current "Recommended" practice.
Add additional markup to get the best possible results on all known viewers.
Provide a transition to new standard features.
Build larger documents from many small, structurally interrelated hypertext files.
Easily generate links to other files, to target labels in certain files, and to remote documents (by extracting URL references while a Browser displays the remote file).
Present an entirely different approach to Web document construction. Microsoft and Word Perfect have interfaces from their Word Processors. Oracle promises an interface from Oracle Book. HTML editors are being written all the time. If SpHyDir isn't completely different (and IMHO better) then it isn't worth the effort.
Automatically generate navigational links, copyright notices, and other standard features at the beginning or end of every document.
Support all the features of HTML 3.0 and common Netscape extensions.
Provide for user extensions both to document structure and library management.
Provide direct links to Netscape and Web Explorer to test document changes. SpHyDir is not WYSIWYG, but you can immediately format what you are structurally editing to see how it will look.
Simplify the construction of data entry forms (entry fields, check boxes, radio buttons, push buttons) and tables.

SpHyDir II supports all the tags and attributes of the current HTML 3.0 draft standard. SpHyDir should be able to process any Web document that uses these features correctly and in context. However, HTML is a formatting language and SpHyDir is a document structure tool. Incorrect syntax, or the use of a tag out of context to achieve a particular effect, can confuse the analysis. In particular, the use of <H6> to get "fine print" where no heading is actually intended will certainly produce bad results.

SpHyDir II is Object Oriented. It examines an input HTML document and produces a tree of Objects that corresponds to the apparent document structure. The simplest Object is a paragraph that contains text. Other objects include the Image (for inserted graphics), ordered and unordered Lists, Tables, Forms, Horizontal Rules, etc.

Each object has properties. There is a fairly close tie between the Properties of an Object (in SpHyDir) and the Attributes of a Tag (in HTML). When an object is selected in the tree, its properties are displayed in the Properties Table. This behavior is intentionally modelled on tools like Visual Basic and Delphi. However, since most HTML attributes have default values that can be ignored, the SpHyDir properties table only shows the items that have been assigned an explicit value.

The casual user can concentrate on text, graphics, and basic document structure (sections, lists of points, hypertext links). If more advanced features become needed, SpHyDir can display all the legal properties that any object is permitted. For example, any Paragraph can have an ID (jump-to label), ALIGN (LEFT|CENTER|RIGHT|JUSTIFY), CLEAR (LEFT|RIGHT|ALL), and NOWRAP.
SpHyDir will list the common standard values but allows the user to type in other values (such as entering "100 pixels" as the value of the CLEAR property). With this approach, SpHyDir doesn't require the user to be familiar with HTML, but it also doesn't prevent the HTML expert from using the more obscure language options. The author can "ease into" advanced features.

═══ 4. The SpHyDir Idea ═══

To create a personal home page or an ad layout, one must concentrate on graphic layout. To publish a large body of useful, interrelated information on the Web, it is more important to focus on content and the organization of the entire library. This is the purpose of PC Lube & Tune and so it is the design objective of SpHyDir.

The SpHyDir program icon is configured with the path to a library of Web files. SpHyDir will only edit files and build links to the subdirectories that fall under that starting point (the "HTML Library"). To edit a file, drop its icon into the SpHyDir workarea window. SpHyDir reads in the HTML and converts it to a sequence of Document Objects. These Objects correspond to paragraphs, images, sections (chapters, topics), numbered lists, bullet lists, tables, and so on. The Objects are arranged in a tree, because the document contains chapters, the chapters contain paragraphs, images, and lists, the lists contain points, and so on.

Most of the objects that SpHyDir creates correspond directly with elements of the HTML language. A few have to be invented and several more have to be guessed. The future HTML standard (3.0) will include Divisions that break the document up into chapters. Current HTML (2.0) doesn't support this, and few Web documents include the HTML 3 features. So SpHyDir has to invent the "Section" object by looking for Header tags that are part of the 2.0 standard. The assumption is that a Header normally starts something.
Therefore, everything after a Header (up to the next Header of the same type) must be a Section of the document.

At first the SpHyDir objects may seem a bit awkward. Dividing everything up formally into paragraph objects and list objects is more precise than normal word processing. However, once SpHyDir has decomposed the original HTML into document objects, and those objects have been updated, SpHyDir is in a position to generate a document with flawless, precise HTML syntax.

There are a lot of erroneous documents in the Web. Some documents display correctly on one browser but are wrong on another browser. Few Web authors are HTML experts, and there are many misunderstandings. SpHyDir converts the HTML to something that most people instinctively understand (chapters, paragraphs). In many cases it will upgrade obsolete or "deprecated" HTML elements to current "recommended" use.

═══ 5. SpHyDir is not for Everyone ═══

HTML marks up documents so that they look good. SpHyDir assumes that the markup corresponds to valid document structure. Some things display nicely but are impossible to structure. For example, because <H6> produces very tiny text, it is sometimes used to get "fine print":

<H2>Lease a new car for $200 a month</H2>
<H6>engine not included</H6>

SpHyDir requires that all H1..H6 tags be used to start sections. Also SpHyDir doesn't preserve the heading numbers, just their relative position compared to each other. In the previous example, SpHyDir would change the H6 to an H3 because that is the next number down from H2.

A large number of Web documents have invalid HTML. They display as intended because the browsers don't complain about errors that do not affect formatting.
For example, when someone wants to print in big letters, they frequently use heading tags:

<H1>Get Rich Quick<P>Act Now<P>Limited Time Offer</H1>

<P> tags are not permitted inside a header, but most browsers tolerate this construction, using H1 to change font and /H1 to revert back to normal size. SpHyDir expects Headings to be a simple character string as the standard specifies. Paragraphs are other types of objects, and headings cannot contain objects.

SpHyDir II attempts to include almost all the valid syntax in HTML 3.0 and Netscape. The Math support will be omitted for a very long time. The FIG structure will be supported when it is more widely used by browsers. Netscape extensions will not be supported when they seem to directly overlap more appropriate HTML 3.0 constructs.

HTML goes through revisions. Old constructions that have been replaced are called "deprecated" in the standard. An even tighter reading of the standard is called "recommended." SpHyDir reads the HTML in, understands it, and then generates new HTML based on the structure. It can automatically upgrade old "deprecated" files to "recommended". For example, it will automatically convert <MENU> and <DIR> to <UL> and will convert <XMP> and <LISTING> to <PRE>. If you want to keep the old stuff as is, then SpHyDir is not the right choice.

There are some constructions that the HTML standard permits, but maybe only because the DTD language in which the standard is written cannot express certain rules well. SpHyDir requires that a Definition List have sequences of one term (DT) and one definition (DD). The Definition can have multiple paragraphs. The sequence:

<DT>canned <DD>packaged in a can <DD>fired from a job

appears to be technically valid. It even has a certain obvious meaning (one term with two definitions). The HTML DTD standard says that a <DL> tag can only have <DT> or <DD> contents, but it doesn't specify how many or in what order.
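The stricter Definition List rule described above (one DT followed by one DD, repeated) can be stated as a small check. The Python below is only a sketch of that rule, with an invented function name; it takes the tag names found directly inside a <DL> and says nothing about how SpHyDir itself reacts to a list that fails.

```python
def is_strict_definition_list(tags):
    """True when the tags form DT, DD, DT, DD, ... pairs and nothing else."""
    names = [tag.upper() for tag in tags]
    if len(names) % 2 != 0:
        return False  # a term or a definition is left without a partner
    return all(
        names[i] == "DT" and names[i + 1] == "DD"
        for i in range(0, len(names), 2)
    )
```

The "canned" example passes the looser DTD rule but fails this check, because its contents are DT, DD, DD.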
Some very bad HTML uses <DL><DD> <DD> </DL> to get a certain level of indentation. If you like this sort of thing, find another editor. SpHyDir "understands" tag names and attributes. The name is the part of the tag that follows "<" and the attributes follow the name as either a keyword or keyword, equals sign, and a value. If SpHyDir doesn't explicitly support the tag name, it copies the tag as ordinary text. If it understands the tag name but not the attribute, it discards the attribute. HTML 3.0 has introduced some attributes whose use is unclear. There is, for example, a LANG attribute that may assign an ISO standard abbreviation for the language and country. According to the standard, "it can be used by the parsers to select language specific choices for quotation marks, ligatures, and hyphenation rules". It is not really clear that this is useful. There is a much stronger requirement, for changing from Latin 1 to other character sets, which is not addressed by this description. SpHyDir may choose to skip features of the HTML 3.0 draft that are unclear or appear poorly thought out. If any user needs an attribute that has been omitted, please E-mail the author with a description of its use. SpHyDir builds its internal tables keyed to the tag, object, and attribute. Unfortunately, several attribute names have meanings that depend on context. The NAME attribute can be a variable name (in FORMS related objects) or it can be the label of a jump (in the <A> tag). The ALIGN attribute has one set of values for an IMG, a second set for CAPTION, and another set for Paragraphs, Headings, and Divisions. The worst thing, however, is that ALIGN is also a switch that appears with no value in tables. In some contexts SIZE means WIDTH while elsewhere it is HEIGHT. There is no way that SpHyDir can ever make sense out of this mess, but it will try to "correct" some of these ambiguities for the normal end user who is not an HTML expert. 
Near term, SpHyDir may offer to generate HTML attribute values that are not valid for the attribute name used in its current context.

SpHyDir is not written tightly enough to trap its own syntax errors and recover. Rexx simply stops the program when it encounters a problem. Since Rexx is an interpreted language, syntax errors may only be detected during execution. When the program aborts, it can leave the output file half-written. This is the primary reason for making a backup of the previous copy of the file before generating a new copy.

═══ 6. How to Get SpHyDir ═══

SpHyDir is a copyrighted program which is a personal project and property of the author. It is made available on the network and may be used free of charge under license terms distributed with the package. Essentially, you agree to leave in all HTML documents produced by SpHyDir the credit that appears at the bottom of all of these Web pages: "This document generated by SpHyDir, another fine product of PC Lube and Tune." This arrangement is called "Personal SpHyDir." If a large organization wants to generate more professional looking documents and omit the credit, other licensing arrangements can be made with the author.

The following references are correct. They work with Web Explorer and Netscape and conform to current HTTP and HTML standards. If they don't work on your Browser, get a better Browser. Otherwise, you can fetch the files with FTP from pclt.cis.yale.edu. They are in the SPHYDIR subdirectory of PUB. If you have trouble with your browser, then read the trailing tutorial on Web handling of binary files to figure out what went wrong. With a good browser, just select the name of any desired files and save them to disk on your machine. All are compressed with the ZIP utility from the INFOZIP project.

SPHYDIR.ZIP - The basic SpHyDir package. Includes the program and some sample External Rexx "XSpO" scripts.

VROBJ21C.ZIP - The VX-Rexx 2.1 runtime library at Patch Level C (VROBJ.DLL).
This Dynamic Link Library must be in one of the directories listed in your LIBPATH statement in CONFIG.SYS. This file is also required for many other freeware and shareware packages, so you may already have a copy of this file. After June 23, SpHyDir will be generated with the "C" version of this module, and may complain if it is started on a system with only the "B" level of the runtime. Current information about VX-Rexx is available from the vendor Watcom.

SPHYDOC.ZIP - A copy of all these HTML pages and their associated GIF files. Unlike other PCLT documents, the SpHyDir documents may be downloaded and copied. This provides a good example of lots of SpHyDir use.

GBM.ZIP - A freeware package written by an IBM employee and distributed through a number of sources. This OS/2 program converts between a number of popular image formats (GIF, TIFF, XBM, BMP, etc.) and can crop or resize images. Use this package to convert BMP or Clipboard images into GIF suitable for including in a Web document.

RCS is a programmer's Revision Control system ported to OS/2 from Unix. It archives updates to a source file and keeps a change history. You can display differences and recover any previous version of the file. SpHyDir doesn't require its use, but a professional HTML editor quickly learns the value of keeping a history of all document updates.

For a simple and configurable Web server that can run on the same OS/2 machine, PCLT recommends the GOSERVE package from Mike Cowlishaw. There is also a supplementary collection of routines named GOHTTP that adds better CGI and forms support from D. L. Meyer.

═══ 6.1. Distributing Binaries through the Web ═══

Fetching a ZIP file through the Web should be a trivial matter. Unfortunately, a number of popular Browsers (particularly NCSA Mosaic) don't do a reasonable job of handling such files.

Web Servers support the HTTP (HyperText Transfer) Protocol.
The first version of HTTP (0.9) simply transmitted Web files back to the reader. The current standard (1.0) precedes each file with a statement of its data type in Internet MIME style. This allows the Browser to distinguish between HTML, plain Text, ZIP binaries, and MPEG movies.

Web Browsers can also read files using the FTP protocol. With FTP, the server doesn't provide any indication of the data type, but the file name contains an extension that usually indicates the type of data (*.ZIP, *.JPG, *.GIF, etc.). In the early days of the Web, HTTP was generally used to distribute HTML files, and FTP was generally used to distribute other binary formats.

No Operating System records the MIME file type in the disk directory. So most HTTP servers look at the file type and create a MIME data type based on the extension of the file requested. Thus if a browser fetches SPHYDIR.ZIP using FTP, it will decide that it is a ZIP file because of the *.ZIP extension, but if it fetches the same file using HTTP from the same server, the Server will look at the *.ZIP extension, decide that it is a ZIP file, send the MIME header with that information, and the Browser will react accordingly.

The problem is that a lot of Web Browsers have developed the convention that anything that comes over HTTP protocol should be either displayed on the screen or played through the speakers, while files that come over FTP can be saved to disk if they have a file extension that makes that seem right. Nothing in the standards says any such thing. Architecturally, a URL can call up ftp:, gopher:, or http: protocols to fetch a file. What you do with the file should then be determined by the type of data and not by the protocol used to fetch it. But it is hard to convince some Browsers to save a ZIP file to disk if it came over HTTP protocol. In most cases, the ZIP file is actually on disk in the Browser's CACHE directory, but it may be hard to find. When in doubt, fall back on plain FTP.

═══ 7. Using SpHyDir ═══

This section will explain:

Object content, properties, and links
Steps to install SpHyDir
How to begin editing existing HTML files
How to create a new HTML file
How to edit text in paragraphs or headings
How to delete document objects
How to add new sections, paragraphs, lists, etc.
How to link to other files or remote Web documents
How to save the document and exit
How to move paragraphs and sections around

SpHyDir converts the document to a sequence of objects. Each object has three attributes: content, properties, and links.

Content

The content of a file is the text or program contained in that file. A word processing file may also contain formatting information. The content of a SpHyDir Paragraph Object is the text of the paragraph, along with all the HTML language features that operate at the level of words or characters and therefore cannot be turned into larger objects.

To access the content of a file, doubleclick its icon on the desktop. To access the content of a SpHyDir paragraph, doubleclick its icon. This opens the SpHyDir Text Edit Window. For convenience, it is also possible to doubleclick other objects that have Headings, Titles, or Captions. This opens a dialog box that allows the heading to be changed. For example, doubleclicking the point in a Definition List allows the defined term to be edited. Although the Heading, Title, Caption, or Term can be opened as if it were content, these features are really Properties and can also be viewed that way.

Properties

In Visual Basic or Delphi, each GUI object has Properties. Properties include the size, location, font, color, enabled status, and caption. SpHyDir objects have Properties that derive from HTML attributes. They may include horizontal alignment (LEFT, MIDDLE, RIGHT), size, source file (for images), label, shape, and so on. HTML 3.0 creates all sorts of attributes for each type of object.
The SpHyDir workarea has a Properties Table that displays the current meaningful values of Properties for the currently selected document object. Click on a different object, and the table changes to reflect the new object. If the user doubleclicks a Property line in the Properties Table, a dialog box appears that allows the value to be changed. Selecting a property and clicking the Second Mouse Button pops up a list of the common values that the Property can take (if it is associated with a list of alternatives). The doubleclick dialog box allows a property to be set to values that are not in the popup list if the user is familiar with extended syntax. Clicking with the Second Mouse Button in the unused part of the Properties Table pops up a list of Properties that are valid for the object. At this time, SpHyDir does not allow the user to add Property names to an Object, so if SpHyDir doesn't support an HTML attribute, send a note to the author.

Links

A link connects an Image or section of text in a paragraph to another file in the library or to a remote network resource. Links are formed by dropping the shadow of a file, a Web Explorer URL object, or Link Manager database entry on a document object. If such an object is dropped on a paragraph, the Hotword Selection window opens. Highlight the word or phrase associated with the link and click the OK button.

The SpHyDir user interface has been modelled on the native behavior of the OS/2 Workplace. A document object can be deleted by dropping it on the OS/2 Shredder. Clicking the Second Mouse Button generates a Popup menu of operations on the object. However, SpHyDir document objects are not files, so they cannot be printed by dropping them on a printer, nor can they be moved to a folder.

═══ 7.1. How is SpHyDir Installed? ═══

Although SpHyDir is a big Rexx program, through the magic of the Watcom VX-Rexx Development Environment it is packaged as SPHYDIR.EXE. It can be placed in any program library.
The VX-Rexx runtime module VROBJ.DLL must be located somewhere in the LIBPATH, but there are so many VX-Rexx programs in use that this step may have already been performed.

SpHyDir is distributed with a number of useful freeware utilities. The GBM package from IBM can be used to view and convert bitmap files. The RCS package can be used to maintain a log of changes to the HTML files. Neither is required for SpHyDir to work, but they are helpful.

Make a copy of the production library of HTML files on the OS/2 machine that will be doing the editing. The PCLT library, which appears to be "http://pclt.cis.yale.edu/pclt/" to Web Browsers on the Internet, is "D:\HTTP\PCLT" on the NT machine that acts as the server. The "D:\HTTP" is configured to the server program as the starting point for all HTTP file references. The OS/2 machine on which the files are prepared stores a copy of the files in "F:\HTTP\PCLT" and establishes "F:\HTTP" as the "HTML Library" for SpHyDir editing. This is also configured to GOSERVE as the starting point for HTML service. Files are edited and tested on the OS/2 machine, then transferred to the production server.

The SpHyDir library can be specified with the SET HTMLLIB environment variable. Otherwise, it will be taken as the active directory when SpHyDir starts. SpHyDir can be configured to operate on several different file structures by creating several SpHyDir program objects, each with a different initial current directory.

SpHyDir remembers parameters such as window size and location in the file SPHYDIR.INI in the root directory of the library. Therefore, if there are several SpHyDir program objects with several directories, each will have its own version of the saved parameters.

═══ 7.2. How to load an HTML document ═══

There are three ways to start SpHyDir on an existing HTML file.

1. If the SpHyDir Workarea is open and is either empty or the previous file can now be discarded, then drag the WPS icon of the file over and drop it in the whitespace of the workarea (not on any individual icon or caption). SpHyDir will abandon any old file and will read in the new HTML.

2. If SpHyDir is not running, but a WPS Program Object has been constructed for it, then drop any WPS Icon for an HTML file on the SpHyDir program icon. SpHyDir will start up and read in the file.

3. It is possible to associate a Program Object for SpHyDir with files of the type *.HTM or *.HTML. Then SpHyDir will be automatically launched when any such file is opened. However, this is not always the right thing. Sometimes it is useful to launch Web Explorer to view the file after it has been formatted. It is also sometimes useful to read HTML files into the System Editor and view the raw tags. So SpHyDir is not the only tool that can be used to view such files.

There is also one common practice that will not work. The File pulldown menu does not have an Open option. This is a personal choice of the author and may be regarded as part of the program design. Choosing a file by name from the standard file dialog is a lot less attractive than drag and drop from the Workplace folders.

═══ 7.3. How to create a new HTML file? ═══

Drag the first tool (the one that looks like a book) from the upper left corner of the toolbar and drop it on the whitespace of the Workarea. SpHyDir clears the workarea to start a new document. A window pops up asking for the filename. Type the name as though it were a part of a URL. For example, type "sphydir/sample.htm" to create a new file in the "sphydir" subdirectory of the current library.

═══ 7.4. How to Edit Text (and Stuff) ═══

In the Workarea, the document is represented by icons. The text is displayed to the right of the paragraph icons as a "caption".
OS/2 allows captions to be edited, but this doesn't provide a very nice environment. It is much simpler to "Open" the paragraph by doubleclicking on the icon or on the caption.

Opening the paragraph displays the Text Edit window. The paragraph text is loaded into a Multiline Edit control. Words wrap automatically to the next line. A very large paragraph will activate the vertical scroll bars. Within the Text Edit window, the text is just ordinary ASCII data. It can be Cut or Copied to the OS/2 Clipboard, and characters from other programs can be pasted into the paragraph. All the usual rules about selecting text and using special keys like Del and End hold within the Text Edit window.

Although word processors and the EPM editor can display text in multiple fonts and colors, a simple Multiline Edit control can use only one font. This is not a terribly serious problem, because HTML overpowers the ability of any WYSIWYG editor. Although it is fairly simple to do italics and bold, how can any simple editor distinguish formats labelled EM, STRONG, CODE, SAMP, KBD, VAR, CITE, DFN, PERSON, ACRONYM, ABBREV, BIG, and SMALL? These are the reasonably named forms of character emphasis. HTML 3.0 threatens to add another dozen even more obscure types of character tagging.

Special functions are represented in the Text Edit window by special "dingbat" characters that are not part of any standard Web character set. There are four of these dingbat functions:

1. When a file or URL is linked to a "hotword" through the Link Manager, a pair of inward pointing triangle characters bracket the link. If SpHyDir could make the triangles Read-Only it would. Editing or deleting these characters can cause trouble, because the URL for the link is stored separately. To remove a link, display the Link Manager window, select the URL for the link, and delete it. However, if the two triangles are left alone, the text in the middle can be changed to alter the hotword phrase.

2. Character emphasis, and for that matter, any unrecognized tags in the original HTML, are embedded in the text. Because the "<" and ">" characters are presented as normal text, they must be replaced with dingbat characters. Thus "<CITE>Debt of Honor</CITE> by <PERSON>Tom Clancy</PERSON>" is going to appear pretty much as seen here, except that each "<" will be replaced by an upward pointing triangle and every ">" will be replaced by one that points down. The simplest way to apply character emphasis is to select text with the mouse and then use the "Emphasis" menu in the Text Edit window to generate the appropriate tag. Unlike the previous case of hotwords, character emphasis has no special structural significance. These tags can be edited to change the type of emphasis or the tag can be deleted entirely.

3. HTML defines named sets of foreign, math, and special use characters. In HTML, such a character is referenced by an Entity. An Entity reference starts with "&" and then continues with the name of the character. A semicolon delimits the end of the Entity if it is immediately followed by normal characters. For example, the copyright symbol © is represented as "&copy;". SpHyDir wants to allow users to type the "&" character as needed, so when the HTML is read in the "&" is replaced by the special PC character that looks like a "smiley face" and has the numeric value 1. As with all special PC characters, it can be generated from the keyboard by holding down the Alt key and typing the number on the keypad. So to generate a copyright character, hold down Alt, press 1, release Alt. A smiley-face now appears on the screen. Type "copy;". When the paragraph is written out as HTML, the smiley-face will be turned into an "&" and the browsers will display the Entity correctly.

4. Small Images can be embedded in the middle of text. Normally this is used for icons. SpHyDir represents this with a dingbat that looks like a box with a circle in the middle, something like "[o]".
To generate such a reference, hold down Alt, press "8" on the numeric pad, release Alt. Now type the file name of the icon/image. Optionally type a space and then alternate text. End by repeating the Alt "8" sequence. All other information is a property of the object and appears in the properties table at the top right corner of the workarea. Each property has a name and current value. Properties are either character strings, numbers, or choices. To edit a string or number property, hold the ALT-key and click on the old value. When a property must have a value from a list, it may be faster to click on the property with the second mouse button. The list of available values will then popup as a menu and a new value can be selected. ΓòÉΓòÉΓòÉ 7.5. How to delete objects ΓòÉΓòÉΓòÉ The WPS approach is to drag the object to the desktop Shredder. The keyboard approach is to select the object and press Ctrl-D. The Mouse approach selects the object, presses the second mouse button to popup the menu, and selects "Delete to Clipboard". It would be reasonable to expect the Del key to delete things. Unfortunately, Del is a character delete key and is used to correct mistyping in other windows. It is difficult to know exactly which window has the focus. A few bad experiences where an attempt to correct a mistyping accidentally deleted whole sections of a document suggested that Del was simply too dangerous as the Object Delete key. The Delete operation applies to everything contained within the selected object. Delete a list and all the points and paragraphs are also deleted. A user complained that the Second Mouse Button menu made it too easy to delete objects. To address this problem, a previous simple delete was converted to "Delete to Clipboard". As explained elsewhere, SpHyDir maintains a special Clipboard-like window that is able to hold the special document objects. Delete to Clipboard makes a copy of the objects in the clipboard window and deletes them from the Workarea. 
If a mistake was made, they can be restored by selecting a location and pressing Shift-Ins.

═══ 7.6. How to Add to the Document ═══

At the top left of the workarea window there is a set of icons alternately viewed as the "Toolbar" and as "Templates". They do for the document what the OS/2 Template folder does for the rest of the Workplace. Drag the icon for a Section, Paragraph, Image, List, or other tool and drop it where you want the new element to go. This creates an empty element that needs to be filled in.

Objects can also be inserted by pressing the Second Mouse Button and choosing Insert from the popup menu. The menu includes the most commonly used objects (paragraphs, points, and lists) and some less common objects for which there was no room in the toolbar.

When the paragraph tool is dropped, the Text Edit Window opens and you may type text. You can also paste text from the Clipboard. At the bottom of the Text Edit Window there are buttons. One is labelled "New". Pressing the New button saves the current text in the current paragraph and creates a new empty paragraph into which the next information can be typed.

When the image tool is dropped, it creates an empty Image object. Drag a GIF file over from one of the OS/2 disk folders and drop it on the object to assign a file. Add alternate text in the entry field at the top of the workarea. Pushbuttons allow you to select the alignment of the image with any trailing text.

═══ 7.7. How to link to other files? ═══

To create a hypertext link to another file in the HTML library, open the WPS folder to display the file object. Hold down Ctrl-Shift and drag the file as if you were going to make a shadow of it on the desktop. Drop the icon on a Paragraph or Image object. If you drop on an image, then there is no further work. Images can have only one hypertext link and the entire image is the link.
If you drop on a paragraph, then the Hotword Selection window opens displaying the available text. Drag the mouse to "select" the word or phrase that will represent the link. Click the OK button. Hotwords are delimited in the text by an opening and closing triangle character. You may change the contents of the hotword area, but do not delete the funny triangle characters or SpHyDir will get confused.

SpHyDir has a Link Manager in the list of Windows to handle other types of links. When the Link Manager window is visible, the top list box displays the URLs of links from the currently selected Workarea document object. The lower list box can be used to select other links. There are two buttons on the bottom. Pressing the button with a Web Explorer icon displays all the items in the current Web Explorer hotlist. Pressing the Target button displays all the target labels in the current document tree. Select an element from the list, then drag and drop it on a paragraph or image in the workarea.

The Web Explorer will also produce URL objects in the Workplace. Dropping a WE URL object onto a Workarea object will also create a link to the referenced object.

SpHyDir provides XSpO programs to form links. An eXternal Sphydir Object is a Rexx program that can be dropped on SpHyDir windows. XSpOs provide a way for Rexx-literate users to customize or extend SpHyDir without mucking in the source. One XSpO can be dropped on the lower Link Manager list box and duplicates the function of the Web Explorer button. It can be used as a model for programs that extract hotlists from other sources. Another XSpO is dropped on a Paragraph or Image. It searches through the windows on the desktop to find Web Explorer and extracts the URL of the current document that WE is displaying. This provides a shortcut compared with creating a WE URL object and then dropping it on the SpHyDir workarea. Another XSpO shows how to generate a "Mailto" link.

═══ 7.8. How do I save the file and quit? ═══

When the workarea has the input focus, press F2 to generate HTML and continue editing, F3 to quit without generating, and F4 to generate HTML and then quit. The status message at the bottom of the window will indicate that HTML is being generated and then that it has been written. If you press F2 and nothing happens, then click once on the workarea to make sure it has the focus. If you try to quit and have modified the file in the Workarea, a message will pop up asking whether you want to Generate or Discard the file.

═══ 7.9. How do I move paragraphs around? ═══

You can drag an individual paragraph around, but only within the visible window. To move more data, to move a greater distance, or to move between files, there is special support to mark a range of objects, copy them to a "clipboard" and paste them somewhere else.

SpHyDir wants to create the image of selecting a range of objects, moving them around, copying them to the clipboard, and pasting them back into the file. However, the native support for selection, movement, and the real OS/2 clipboard is not able to handle this problem correctly. Reluctantly, SpHyDir has been forced to reinvent some of this infrastructure.

The user can select a range of objects to move within a document or to copy to another document. First select one object and press Alt-L as if you were establishing a "line mark" in the EPM or Kedit editors. Once you begin to mark a section of the document, you may extend the marked area forward or backward, but only within the current level of the document tree. The mark can be extended over but not into lists or subsections. Nor can the start or end of the mark be extended outside the section or list in which it is started.

Marking creates two new objects: Mark Start and Mark End. Initially these objects are placed around the currently selected object when Alt-L is pressed.
The Mark objects can then be "slid" forward or backward along the line that represents the current level of the tree. They cannot be slid into a subsection or list (to a lower level) nor can they be slid outside the section or list in which they started. The Mark can also be automatically adjusted by selecting another object at the same level of the tree and pressing Alt-L again (expanding the scope of the Mark just as additional lines are added in the EPM editor when you move to another line and press Alt-L a second time). Once a section has been marked, you can copy it to the Clipboard by pressing Ctrl-Ins (the standard OS/2 keyboard sequence for Copy). However, the OS/2 Clipboard really doesn't know how to hold SpHyDir objects, so the same effect is achieved by opening a new Window and copying all of the objects between the two marks (including all the objects contained in subsections and lists) from the workarea window to a second container that SpHyDir calls "The Clipboard". In the current release, the Clipboard window becomes visible (for debugging) though it can be minimized or can be dragged over to the side of the desktop. The objects in the Clipboard can then be moved to another part of the original document by selecting a destination object and pressing Shift-Ins (the OS/2 standard for Paste). They can be copied to another file by dragging another HTM file to the workarea (replacing the original source document) and then pasting from the Clipboard to a second document. However, Clipboard objects cannot be copied to another part of the same document. This fell out from the way the Clipboard got coded and, at the moment, it seems to be a useful feature. When the user marks objects and presses Ctrl-Ins, there were two programming choices. One choice copies all of the objects to the Clipboard. The alternative creates what is essentially a Shadow of the original record in the clipboard (what VX-Rexx calls a "shared record"). 
Like the Workplace shadow, the two objects are actually different views of the same data. If you were to edit the text of a paragraph after copying it to the SpHyDir Clipboard, the text of the Clipboard copy would also change. However, while a Workplace shadow cannot exist when the original is deleted, a VX-REXX shared record continues to exist until all of its related objects have been deleted. Thus the Clipboard copy of the data continues to exist after the original object in the workarea has been deleted or the entire document has been replaced.

A shared record can exist in two different containers, but there can be only one copy of the record per container. By choosing to use Shadows in the Clipboard instead of full copies, SpHyDir does not support duplicating large blocks of text within the same Hypertext document. When you select another location and press Shift-Ins, the SpHyDir Paste tries to copy the shared record from the Clipboard back to the original document. However, since there is already a copy of the record in the workarea and no container can have two copies of the same record, the Paste operation actually moves the old record from its previous location to the new position. If you delete the document in the workarea and load a new document (even a new copy of the original document) then a new set of records is created. Now the Clipboard has the only copy of the old records and Paste copies the information into the new document.

I am a bit suspicious of any feature that takes this much time to explain. On the other hand, a hypertext document should be short and it doesn't make a lot of sense to duplicate large blocks of text within such a file. There is a strong sense that the way this Mark and Clipboard logic works is probably the Right Way to handle this particular problem with this particular set of data. Only by gaining experience with this technique will it become clear if this is really the best approach.
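The move-versus-copy behavior of Paste described above can be modelled in a few lines. The following Python sketch is only an illustration of the "one copy of a shared record per container" rule; it is not VX-REXX code, and the names Record, Container, copy_marked, and paste are invented for this example.

```python
class Record:
    """A document object; every container that holds it sees the same text."""

    def __init__(self, text):
        self.text = text


class Container:
    """An ordered list of records with at most one copy of each record."""

    def __init__(self, records=()):
        self.records = list(records)

    def texts(self):
        return [record.text for record in self.records]


def copy_marked(clipboard, marked):
    """Ctrl-Ins: place the marked records themselves (not copies) in the clipboard."""
    clipboard.records = list(marked)


def paste(clipboard, target, after):
    """Shift-Ins: put the clipboard records into target, following `after`."""
    incoming = list(clipboard.records)
    if after in incoming:
        raise ValueError("destination is one of the records being pasted")
    # One copy per container: records already present are pulled out first,
    # so pasting into the same document moves them instead of duplicating them.
    target.records = [r for r in target.records if r not in incoming]
    position = target.records.index(after) + 1
    target.records[position:position] = incoming


a, b, c, d = (Record(t) for t in "ABCD")
workarea = Container([a, b, c, d])
clipboard = Container()

copy_marked(clipboard, [a, b])
paste(clipboard, workarea, after=d)
same_document = workarea.texts()        # ['C', 'D', 'A', 'B']: moved, not duplicated

a.text = "A (edited)"                   # the clipboard shows the same edit
x = Record("X")
new_document = Container([x])
paste(clipboard, new_document, after=x)
other_document = new_document.texts()   # ['X', 'A (edited)', 'B']
```

In the model, pasting into the document that still holds the records relocates them, while pasting into a freshly loaded document behaves like an ordinary copy, which matches the behavior described in the text.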
Note that the SpHyDir specialized Clipboard, Cut, and Paste apply only to the management of objects from the Workarea. Within the Text Edit window opened by double-clicking a paragraph or point, the behavior of text selection, Cut, Copy, and Paste is completely normal and operates through the normal OS/2 clipboard. Text data can be exchanged between another OS/2 program and the Text Edit window through the ordinary cut and paste mechanisms.

═══ 8. The Toolbar and Document Objects ═══

At the top of the Workarea there is a collection of icons that represent the Toolbar (or Template) area. These items can be dragged into the document to create new elements.

═══ 8.1. The Document Tool and Object ═══

The Document Tool serves two functions. If it is dropped on the "whitespace" of the workarea, away from any existing document objects (or anywhere in an empty workarea), then this tool is treated as a request to start a new document. It replaces the more traditional New option on the File pulldown menu. If there is an existing document in the Workarea, the user will be asked whether it is all right to abandon that document. Pressing ESC returns to the previous document and cancels the request. Otherwise, the Workarea is cleared and SpHyDir prompts for the name of the file.

The name is relative to the start of the document library and it should be given with the "/" notation commonly used in Web documents. SpHyDir cannot force the user to act rationally, but it is generally a good idea to organize the local library with the same structure that documents have on the production server. Thus the document http://pclt.cis.yale.edu/pclt/sphydir/status.htm would have the name "pclt/sphydir/status.htm" on the editing machine, corresponding to the part of the URL that follows the server machine name. The library on the editing machine may have some upper path components that are not visible.
If HTMLLIB is set to "F:\HTTP" (or if there is no HTMLLIB variable and F:\HTTP is the current directory when SpHyDir starts) then this disk letter and upper directory will be inserted in front of the URL part to form the actual OS/2 file name. After the file name has been supplied, the user is prompted for a Title. Whatever is entered becomes the initial caption of the Document object (used to set the <TITLE> tag in HTML) and of the first Section object (used to set the <H1> string). This seems to be redundant, but the two tags are used for different purposes in HTML. The <TITLE> describes the document (say in an index of the library). The <TITLE> text may not be displayed by some browsers, or it appears as the title bar of the window. The <H1> text appears in big letters. SpHyDir doesn't go one step further and add an initial paragraph object. In some cases, the document might start with an IMG or other object instead. Drag the appropriate tool from the toolbar and drop it on the Section tool to start adding content. Most of the Properties of the Document Object correspond to fields in the <HEAD> area of the HTML. This includes LINKs to other documents, and a BASE URL value. A few Properties are taken from HTML3/Netscape extensions to the <BODY> tag to assign a Background pattern and to control the color of text on top of that background. If a GIF file is dropped on the Document Object of a file in the Workarea, that file name is set as the value of the Background property for the document. If the Document tool is dropped on a previous document object, then it becomes a request to add a Subdocument link. Subdocument pointers are used to build a larger document out of many smaller Web files. A subdocument pointer to another file means that that file is a continuation of the current file. Any word or phrase can be hotlinked to another Web page. Ordinary hotlinks do not imply a relationship. 
Subdocument links indicate that the other file is a child belonging to the current Web page. There are some rules to good document construction. SpHyDir chooses not to enforce them, but their proper use is highly recommended. Most importantly, a file should only be a subdocument of one other file. If two different parents try to claim the same file as a subdocument, things are going to get all messed up. A Subdocument object can go anywhere, but best results will be obtained if all the Subdocument pointers are together at the end of a file. Although it may be tempting to make everything a subdocument of the highest level index page, this produces a structure that is awkward to handle. The entire library should not be related pages. Restrict subdocument relationships to files that are really part of the same paper, article, subject, tutorial, or brochure. Use ordinary hyperlinks from the main page or library table of contents. Drag the Document Tool over and drop it in the file as you would create a paragraph or point. The Subdocument definition is then completed by dragging the WPS icon for an HTM or HTML file over and dropping it on the newly created object. If the dropped file was previously processed by SpHyDir, the Title of that document can be extracted from the Extended Attributes of the file and will appear next to the Subdocument icon. Within the current file, the Subdocument object behaves like a Paragraph whose entire contents is the TITLE of another HTML file and where that TITLE text is a hypertext link to the other file. Every time SpHyDir loads a file with subdocument pointers, it checks each identified file to see what its current Title is. Title changes appear immediately in the caption of the Subdocument object and are written back when the parent document is regenerated. The Subdocument structure is also stored as Extended Attributes of the files in the library. 
Each *.HTM or *.HTML file has pointers to the files that it claims as subdocuments and to its "parent" (a file that claims it as a subdocument). The order in which the Subdocument objects appear in the parent establishes a Next and Previous ordering of the subdocument HTML files, which SpHyDir maintains and can use to generate uniform Headers and Trailers.

═══ 8.2. The Section Tool and Object ═══

Every ordinary paper document is organized into Chapters, Sections, and Subsections. That, after all, is what an Outline is all about. HTML has provision for Headings, but it is not all that clear about what exactly a Heading introduces.

HTML 3.0 addresses this by introducing a DIV tag. The concept is that large documents would be broken up into segments delimited by a <DIV CLASS=CHAPTER> or <DIV CLASS=APPENDIX> tag. However, these tags are not currently used. SpHyDir forms sections based on the appearance of an H1..H6 tag. Eventually, SpHyDir may automate the transition from HTML 2 to 3 and generate the DIV tags automatically. SpHyDir will recognize a DIV tag when it is encountered, but there is no strong body of use to know what to do with it. The author would appreciate E-mail if any important use of DIV is discovered on the Net.

To be successful, SpHyDir has to determine where the Section ends. There is no explicit HTML marking, but it can reasonably be assumed that the Section extends until there is a new Heading tag at the same or a higher logical level than the tag that started the Section. Lower level Headings are assumed to start subsections of the current Section object.

The Properties of a Section Object come from the attributes of the H1..H6 heading tags in the HTML 3 proposed standard. This produces a small confusion. In all other document Objects that contain other objects, an attribute of the container applies to all the objects that it contains. However, attributes on a Section Object apply only to the Heading.
For example, ALIGN=CENTER means that the Heading is centered and does not apply to the paragraphs in the Section.

═══ 8.3. The Paragraph Tool and Object ═══

The Paragraph Object contains text. In HTML, "text" is more broadly defined to contain ordinary characters, Entities, character emphasis tags (Bold, Italics, CITE, etc.), hyperlink hotwords, line breaks, and embedded images. SpHyDir is not currently prepared to break the paragraph down into any finer objects. So all this non-text "text" has to be encoded with special characters.

The user can doubleclick a paragraph to display all the text in the Text Edit window. Special characters can be added to the document using the standard rule for special keyboard entry (hold down Alt, type a number in decimal notation on the numeric pad, and release the Alt key). However, hotword links are only partially contained in the text. The rest of the hotword link is a URL that can be displayed in the Link Manager Window. Before deleting text containing a hotword, use Link Manager to delete the link and remove the inward pointing triangles.

Properties of the Paragraph Object are mostly derived from the attributes of the <P> tag in HTML 3.

═══ 8.4. The Image Tool ═══

The Image Tool represents a graphic insert. First, drag and drop the image tool to the location in the document where the image is logically positioned. Then finish the definition by dropping the WPS icon of a GIF file on top of the Image object. Currently SpHyDir requires the GIF data type (XBM and JPG may be added later). Since the IBM IPF system doesn't support GIF, generated IPF source uses the same file name and an extension of BMP. GIF files can be converted to BMP files using the GBM utilities or any of a number of other graphics programs.

HTML authors are reminded that there are still a number of disadvantaged users who try to surf the net using character-mode Unix.
To accommodate such people, an Image should have alternate text that describes the content or meaning of the image. This alternate text is entered as the value of the ALT Property. If an image has alternate text, it is displayed as the caption to the right of the icon for the object. Otherwise, the file name is the caption.

The Image Object was created by SpHyDir to provide an icon on which GIF files can be dropped and from which hyperlinks can be made. HTML doesn't really seem to regard Images as objects. Rather, HTML syntax seems to treat each image as an unusually large text character. The SpHyDir Object will work if the image appears all by itself or at the beginning of a paragraph.

If the image has to appear in the middle of a heading or paragraph text, then it cannot be promoted to full object status. SpHyDir calls such things embedded images. An embedded image is represented by a dingbat character corresponding to the value 0x08, followed by the name of the GIF file. Then optionally there is a blank and alternate text. The sequence ends with a second 0x08 dingbat. Embedded images have no properties, and the user cannot drop anything onto them. They can be part or all of a hotlink phrase.

When an image appears at the start of a paragraph, it could be rendered using the "inline image" syntax. However, SpHyDir extracts it from the paragraph and builds a separate object. The value of the ALIGN property for the Image Object determines how it will interact with the paragraph that follows it. ALIGN=NONE (a value made up by SpHyDir) displays the Image by itself. Other values of ALIGN position the following text to the right of the image, and in some cases the text flows around the image border.

HTML 3.0 introduces a FIG tag to extend the functions of the current IMG tag. Unfortunately, it is not widely supported and its use is currently not clear.
A later version of SpHyDir may provide automatic migration of current IMG syntax to the preferred FIG syntax after it becomes a viable alternative.

═══ 8.5. The Ordered List Tool ═══

An Ordered List Tool contains a sequence of numbered points. Although points are normally simple paragraphs, HTML allows a point to contain almost anything: paragraphs, images, even check boxes or radio buttons.

SpHyDir 1 tried to combine the functions of Points and paragraphs. This worked well 99% of the time. It caused trouble when the list item started with anything that wasn't ordinary text. SpHyDir II views lists the way that the HTML standard views them. A list contains points. The points contain paragraphs. The paragraphs contain text.

HTML 3.0 allows a list to begin with a header. This is represented by a <LH>text</LH> sequence before the first point. Following the recommendation of the standard, SpHyDir takes any free text that is found in old HTML documents outside the list points and turns it into a list heading so it is legal.

The properties of an ordered list are taken from HTML 3.0. The properties allow a second list to resume numbers where a previous list left off. Netscape has some nifty ideas to control format: whether items are listed as 1 2 3, A B C, a b c, or i ii iii.

═══ 8.6. The Unordered List Tool ═══

An Unordered List contains unnumbered points, generally delimited by a "bullet" character. Unordered lists follow the same rules as the previous discussion of Ordered Lists.

The properties of an Unordered list allow the bullet character to be replaced with another choice. The PLAIN attribute allows the bullet to be omitted entirely. SRC allows the bullet to be generated as a GIF image.

HTML has two obsolete list formats based on the <DIR> and <MENU> tags. SpHyDir will read such lists in, but will convert them to the preferred <UL COMPACT> tag.

═══ 8.7. The Definition (Glossary) List Tool ═══

A Definition List defines a set of terms. Each term is followed by an indented definition. In SpHyDir, the term is a Property of each point in the List. As with other lists, the Point then contains paragraphs that are the definition of the term. With this structure, SpHyDir requires a properly formed Definition list with alternating pairs of <DT> term <DD> definition.

═══ 8.8. The Point Tool ═══

The icon of a hand making a "point" represents the general list item. An Ordered, Unordered, or Definition List can contain only Points. Each point then contains paragraphs, images, and other document content objects.

In an ordered or unordered list, the Properties of the Point object correspond to the attributes of the <LI> tag. In a definition list, the Point object also has a "Term" Property which is the text contained in the <DT> tag. The <DD> tag ends the term and begins the paragraphs which are contained within the Point.

═══ 8.9. The Forms Tools ═══

The Form Tool creates an interactive area in which the Web user can enter data to submit a request or query. A Form can include all of the previous document objects, and data fields from the bottom row of the toolbar. To process a form, the server must execute a program written by the form designer. This makes the use of Forms an advanced topic that will be described in a separate section.

═══ 8.10. Missing Tools ═══

Less frequently needed Objects (and tools that have no particularly good icon and would look ugly on the Toolbar) can be inserted by selecting an object in the Workarea, pressing the Second Mouse Button, and choosing Insert from the popup menu. The most frequently used Tools (Paragraph and Image for example) can also be inserted this way. However, there are a few objects that can only be inserted from the menu.

The Horizontal Rule Object draws a horizontal line across the screen.
Its thickness can be controlled with the SIZE property (a Netscape extension supported by Web Explorer).

SpHyDir has a BR object. Normally a simple line break is honorary text and appears embedded inside paragraphs. The idea of a special break was introduced by Netscape, which needed it to clear dangling images. The <BR CLEAR=ALL> stops flowing text to the right or left of an image and starts at the first line clear of all images. The BR object is also useful in Forms to put a line break between fields, boxes, buttons, and other non-text objects.

═══ 9. The Problem of Position ═══

SpHyDir would be simple if the VX-Rexx and OS/2 programming interface allowed the user to drop tools and components in between two existing document elements. This would clearly indicate where the new element is to go. However, the environment requires the tools, files, and other objects to be dropped on top of existing components. This forces SpHyDir to invent some rules about positioning.

Sections and Lists contain things. In the Workplace, if you drop a file on the icon of a folder, the file goes into the folder. So the normal behavior is that if you drop anything (other than a Target) on a Section or a List, then the new element is added inside that Section or List in front of anything already there.

If you drop something on a Paragraph, Image, or Point then the new item goes after the thing you dropped it on. Thus to add a new Point to the end of an existing list, drop the Point tool on the last Point in the list. To add a new Point to the beginning of a list, drop the Point tool on the parent List object.

These rules seem to cover all but two cases. Lists can be nested inside other lists. When this occurs, there is no way to add a new outer point after the end of a nested inner list because every time you try to drop a point on the inner list icon the new point is positioned inside the inner list instead of after it in the outer list.
Similarly, there is no way to add one section after another because whenever you drop something on a section it goes inside it and not behind it. So there is an extra rule that if you hold down Ctrl when dropping a Point on a List the Point goes after the list, and if you hold down Ctrl when dropping the Section tool on an existing Section, the new Section goes behind the current section.

This is not entirely satisfactory. A section can go on for many screens, and it is somewhat unexpected to have to go many screens back to the start of a section in order to drop something on the section object and add it many screens down after the section end. I am waiting for a better idea to come to mind.

Originally, the idea was that Ctrl-dropping a tool placed the tool after the thing on which you dropped it. That seemed like a good rule, but it doesn't work with Sections because you can't put a paragraph, image, or list after a Section. Sections don't end, you see, until a new Section begins (in HTML terms, a section ends when a new H1...H6 header is encountered). The only thing that you can put behind a section is another Section. Everything else that you try to put behind the section ends up inside it anyway.

═══ 10. Managing Links ═══

Most HTML editors expect the user to type in the document referenced (the URL) in order to create a hypertext link. Since it is easy to make a mistake, authors are urged to test their documents thoroughly. SpHyDir provides a simpler and more reliable method of constructing links.

═══ 10.1. Forming Links from the Desktop ═══

Links to other files in the same library can be constructed using standard Workplace Shell behavior. Hold down Ctrl and Shift and drag the icon of another file in the library to a paragraph or image object in the SpHyDir Workarea. If the link is made to a paragraph object, the Hotword Selection window opens. Use the mouse to select a word or phrase and press the OK button.
Links to remote documents should be managed with the aid of Web Explorer. The SpHyDir philosophy holds that before you generate a link to a document, you should be able to display it in the Browser. There are two fairly direct ways that WE references can be used to generate SpHyDir links.

The simplest option is to use the ability of the current Web Explorer program to generate URL objects. Such objects can be dropped on the desktop, but it is better if they are stored as disk files. They can then be dropped on SpHyDir to generate a link to the corresponding resource.

SpHyDir also provides an XSpO Rexx program named WE_URL.CMD. First use Web Explorer to view the desired network resource. Then drop the WE_URL icon on a paragraph or image in the SpHyDir Workarea. The WE_URL program locates the Web Explorer window on the desktop, extracts the current URL from it, and passes it on to form a link.

The XSpO library supplied with SpHyDir also has MAILTO.CMD, an example of how to form a link that uses the Mailto URL.

═══ 10.2. Using the Link Manager Window ═══

To display the Link Manager, select it from the Window pulldown menu on the Workarea window. The Link Manager presents two list box areas and a pair of buttons.

The top list box shows the URLs of any links from the current document object selected in the workarea. The number of URLs listed should correspond to the number of pairs of inward pointing triangle dingbat characters in the text of the paragraph. The order of the URLs in the box also corresponds to the order of the hotword phrases in the paragraph. An Image object would have only one link.

To delete a link, select the URL in the top list box and press the Ctrl-D key. The URL is removed from the list, and the triangle dingbat characters around the previous hotword phrase will also disappear from the text. If a hotword phrase is to be deleted, it is important to remove the link first.
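The correspondence by position between hotword phrases and the URL list can be illustrated with a short Python sketch. It is only a model of the rule, not SpHyDir's Rexx code: the two triangle characters, the function names, and the exact form of the generated anchor tag are assumptions made for the example.

```python
import re

# Assumed stand-ins for the two inward pointing triangle dingbats; the actual
# character codes SpHyDir uses for them are not stated in this document.
OPEN, CLOSE = "\u25ba", "\u25c4"
HOTWORD = re.compile(re.escape(OPEN) + "(.*?)" + re.escape(CLOSE))


def pair_links(text, urls):
    """Pair the n-th bracketed hotword phrase with the n-th URL in the list."""
    return list(zip(HOTWORD.findall(text), urls))


def generate_html(text, urls):
    """Replace each bracketed phrase with an anchor, taking the URLs in order."""
    remaining = iter(urls)

    def anchor(match):
        return '<A HREF="%s">%s</A>' % (next(remaining), match.group(1))

    return HOTWORD.sub(anchor, text)


urls = ["status.htm", "index.htm"]
text = ("See the " + OPEN + "status page" + CLOSE +
        " or the " + OPEN + "library index" + CLOSE + ".")
html = generate_html(text, urls)

# The first hotword is deleted in the editor, but its URL is left in the list:
damaged = "See the " + OPEN + "library index" + CLOSE + "."
shifted = pair_links(damaged, urls)
# shifted is [('library index', 'status.htm')]: the surviving phrase now
# points at the URL that belonged to the deleted one.
```

The second half of the example shows why the link should be removed through the Link Manager before the hotword text is deleted.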
SpHyDir has no way to connect hotwords to URLs except to pair them off in order when generating HTML. If a hotword is deleted with the editor, then the following hotword gets paired with the URL of the deleted link, and the meanings of subsequent hotwords are similarly shifted.

The larger Link Manager list box proposes new links from a database. Two buttons are presented at the bottom to populate this list. The button with the Web Explorer icon fills the box with entries from the current Web Explorer hotlist. The Target button fills it with target labels from the current document.

A target is a label assigned to a section or paragraph in the middle of the document. HTML 2.0 generates such a label with the <A NAME=xxx> tag. HTML 3.0 also supports the ID attribute on most tags, as in <P ID=xxx>. SpHyDir supports both types of HTML, but its interface is modelled on HTML 3.0.

SpHyDir document objects have an ID Property. When it is set, ID appears in the Properties table for the current object. Any object can be labelled by adding the ID Property. Select the object, click with the Second Mouse button on the whitespace of the Properties table, and select ID from the list of properties. A dialog box appears in which a label value can be typed.

Programmers frequently assume that the labels must be short, or that they cannot contain spaces or special characters, or that they are all uppercase. All these things are wrong. The label can be any reasonable length, it can contain blanks, and it is case-sensitive. The label "Case" will not be matched by a search for "case". At this time, SpHyDir cannot guarantee that this or any other property can safely have special characters such as '<', '>', '&', or doublequote.

If links are to be formed manually, then it is probably a good idea to keep the name short. Type one character wrong, even in the wrong case, and the search misses its target.
However, if links are formed automatically by selecting a target from a list, then there is no chance of a mistyping. In this case, it makes sense to make the labels longer and more descriptive, so that they can be identified more easily in a larger database.

When the Target button at the bottom of the Link Manager is pressed, SpHyDir first searches the current file for all objects with an ID property. If this document is part of a larger document tree, it then goes up through the Parent Extended Attribute pointers to find the root document, and proceeds down through the tree locating all the other target labels.

At this point, it is not part of the SpHyDir plan to expand the scope of buttons in the Link Manager to other targets in the Library. Rather, XSpOs will be developed to populate the Link Manager list with specialized targets, such as glossary references. If anyone wishes to develop specialized XSpO routines, the names of all targets in an HTML file are stored by SpHyDir in the PCLT-SPHYDIR.TARGETS Extended Attribute.

═══ 11. The Subdocument Tree ═══

Most HTML editor tools operate on a single text file. However, good practice holds that hypertext documents should be divided into a large number of small files. Managing all these files and maintaining a consistent overall structure then becomes a serious problem.

═══ 11.1. The Library ═══

PC Lube and Tune has developed into a library structure that seems generally applicable. Because no one application can assume that it owns the entire server, the files fall under a common starting directory. During development, this is x:\PCLT on the author's machine. In distribution, the same structure becomes http://pclt.cis.yale.edu/pclt/ on the server.

SpHyDir gets the local library name from the HTMLLIB environment variable. In this case, "SET HTMLLIB=F:\PCLT" is put in CONFIG.SYS. All of the HTML and GIF files that SpHyDir processes have to fall on this disk under this directory.
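The mapping from a file in the development directory to its address on the server can be sketched in a few lines of Rexx. This is only an illustration under stated assumptions: ToUrl is a hypothetical routine and not SpHyDir's own code, the file name is made up, and the library directory is hard-coded here where a real program would presumably read the HTMLLIB environment variable.

```rexx
/* Sketch only: map a file in the local library to its server URL.   */
/* ToUrl is a hypothetical routine, not part of SpHyDir.              */
lib  = "F:\PCLT"                          /* value of HTMLLIB         */
base = "http://pclt.cis.yale.edu/pclt/"   /* same tree on the server  */
say ToUrl("F:\PCLT\sphydir\links.htm", lib, base)
exit 0

ToUrl: procedure
  parse arg file, lib, base
  lib = strip(lib, "T", "\")              /* drop any trailing "\"    */
  prefix = lib || "\"
  /* OS/2 file names are not case-sensitive, so compare in uppercase */
  if translate(left(file, length(prefix))) \== translate(prefix) then
    return ""                             /* not under the library    */
  rest = substr(file, length(prefix) + 1)
  return base || translate(rest, "/", "\")  /* "\" becomes "/"        */
```

For the sample file the routine returns http://pclt.cis.yale.edu/pclt/sphydir/links.htm. A file outside the library directory yields an empty string, which matches the rule that everything has to fall under the one starting directory.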
SpHyDir is then programmed to moderate between the native OS/2 file naming conventions (with "\") and the more general file naming conventions used in most hypertext links (with "/"). In concept, it should be possible to move the entire structure from OS/2 to a Unix server.

Although it is possible to dump all the files in one directory, the library becomes more manageable if each major subject has its own directory. Any large collection of related files can be collected in the same subdirectory.

═══ 11.2. Chapter and Verse ═══

It is possible for a collection of random short documents to be collected together in some free-form association. No structure would be needed for such a grouping. However, most collections of hypertext files actually started as a larger paper document. The material was broken into smaller files because it is best if each file on the Web is only a few screens long. However, the original structure of chapters, sections, and subsections is still logically present.

To accommodate this, SpHyDir supports the concept of a Subdocument. A Subdocument is a special kind of "paragraph" object in a file. Any word in an ordinary paragraph or point can be a hypertext link to another file. However, such links do not establish a relationship between the file containing the link and the file to which the link points. A Subdocument link, however, claims that the other file is logically a part of the file that references it.

When one file claims another as a Subdocument, then the first file is said to be the "parent" of the claimed file. A thousand different files can have ordinary hypertext links to the same Web page, but only one file can claim to be its parent. (This is a restriction that the user should obey. SpHyDir is not currently in a position to enforce it.)

Just as each library generally has a "front door" or "home page", so any collection of subdocuments has a starting point.
The "root" document is the one member of the group that has no parent. It points to subdocuments, and they in turn can point to other subdocuments.

═══ 11.3. Objects and Attributes ═══

Physically, a Subdocument Object produces a paragraph whose only content is the TITLE of the Subdocument. This TITLE is a hypertext link to the Subdocument. In addition, however, the Subdocument object has a structural effect upon the parent, the named document, and other subdocuments that are also claimed by the same parent.

Subdocuments are normally a series of chapters or sections. If the text were printed out, they would be printed and read in order. The order in which the Subdocument objects appear in the parent produces a Next/Previous relationship between the subdocuments themselves. HTML 2.0 doesn't have a formal method of expressing this relationship. HTML 3.0 will have syntax for Next and Previous links. Until this becomes widely available, SpHyDir manages the relationships itself.

In OS/2 a file can have Extended Attributes. The normal attributes are things like Date and Size. Extended Attributes are maintained by the application that creates the file. SpHyDir creates Extended Attributes for the HTML files to manage the larger logical document structure within the subdocument tree. One EA provides quick external access to the document TITLE without having to read through the HTML. Another lists all the Subdocuments that the current document claims. Another lists the parent, if any, of the current document. Another lists the Header text and levels of all the Sections contained within the document.

To create a Subdocument link, first drag the "Book" tool (the first one in the Toolbar) and drop it anywhere a paragraph or list point can go. The definition is completed by dragging the Workplace icon of another HTML file from the library and dropping it on the newly created object.
If the dragged file was previously generated by SpHyDir, then when it is dropped on the Subdocument object, SpHyDir will extract its TITLE (from the EA) and display it as the caption of the object. This title will appear in the final page on a line by itself, hypertext linked to the referenced file.

When HTML is generated for the current file, the list of Subdocument objects in the order that they appear will be stored as an Extended Attribute of the current file, and an Extended Attribute will be created on each of the referenced files pointing back to the current file as the parent.

Subdocument objects are not a formal construct of HTML 2.0, but there is some fully documented syntax that comes very close. When the Subdocument object is converted to HTML, it is generated in one of two forms (a paragraph or a list item):

<P><A HREF="xxx.htm" REL="Subdocument"> ...title...</A></P>

or

<LI><A HREF="xxx.htm" REL="Subdocument"> ...title...</A>

If SpHyDir processes an existing HTML document with the REL="Subdocument" attribute, it will try to convert it back to a subdocument object.

═══ 11.4. Next and Previous ═══

HEADER and TRAILER can contain variables which are replaced with current information. Variable names are enclosed in "[" and "]" characters. [Date] is replaced by the current date. [Doctitle] is replaced by the TITLE of the document. [Up] is replaced by the file that claims this as a subdocument. [Previous] and [Next] are replaced by the files that appear before and after this file in the Subdocument list of the Parent.

The [Up], [Next], and [Previous] relationships don't always exist. For example, the document at the top of the tree has no Up. The first document listed as a Subdocument has no Previous, and the last document has no Next. To accommodate this, any line in HEADER or TRAILER that references a non-existent variable is entirely deleted.
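The substitution rule can be sketched in Rexx as follows. This is an illustration and not SpHyDir's actual code: ExpandLine is a hypothetical routine, the file names and date are invented, and the sketch assumes that variable names must be written exactly as shown above. Only the known names are treated as variables, so any other "[" or "]" in a line is left alone as ordinary text.

```rexx
/* Sketch only: expand [Variables] in HEADER or TRAILER lines.        */
/* ExpandLine is a hypothetical routine, not SpHyDir's own code.      */
names        = "Date Doctitle Up Previous Next"
val.Date     = "26 Jun 1995"
val.Doctitle = "Managing Links"
val.Up       = "sphydir.htm"
val.Previous = ""                   /* first subdocument: no Previous */
val.Next     = "subdoc.htm"

line.1 = '[<A HREF="[Up]">Up</A>]'
line.2 = '[<A HREF="[Previous]">Previous</A>]'
line.3 = '[<A HREF="[Next]">Next</A>]'
line.4 = '<P><I> [Date] </I></P>'
line.0 = 4

do i = 1 to line.0
  out = ExpandLine(line.i)
  if out \== "" then say out        /* a dropped line comes back empty */
end
exit 0

ExpandLine: procedure expose val. names
  parse arg text
  do n = 1 to words(names)
    tag = "[" || word(names, n) || "]"
    key = translate(word(names, n)) /* stem tails are stored uppercase */
    p = pos(tag, text)
    do while p > 0
      if val.key == "" then return ""   /* relationship does not exist */
      text = left(text, p - 1) || val.key || substr(text, p + length(tag))
      p = pos(tag, text, p + length(val.key))
    end
  end
  return text
```

With these sample values the Up, Next, and Date lines are written with their variables filled in, and the Previous line disappears entirely because this file has no Previous.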
The idea is that you put on one line all the stuff that relates to a relationship, and when the relationship doesn't exist the entire package is deleted. An example HEADER might include the lines:

<P>
[<A HREF="[Up]">Up</A>]
[<A HREF="[Previous]">Previous</A>]
[<A HREF="[Next]">Next</A>]
</P>
<P><I> [Date] </I></P>

Every document gets a line containing the current date in italics. Above that line there may be 0-3 hyperlinks depending on the number of available relationships. If all three links are generated, then the line looks like:

[Up] [Previous] [Next]

with each word acting as a link.

═══ 11.5. The Document Tree Window ═══

The Window pulldown menu of the SpHyDir Workarea includes an option to display the Document Tree for whatever HTML file is currently in the Workarea. To build this window, SpHyDir checks for the Parent of the current file, and then for the parent of the parent, until it finally reaches the Root document. It then proceeds down through the Extended Attributes of the Root and all the subdocuments and sub-subdocuments. For each file, the TOC Extended Attribute lists all of the Headers in that file. The Document Tree window displays a complete cumulative Table of Contents for all of the files in the document tree structure.

It is intended to eventually create a TOC file and simplify the creation of references from one part of the tree to a section in another file. Currently, the major feature of this window is the ability, from the File pulldown menu, to trigger SpHyDir to regenerate HTML for all of the files in the tree. This is a convenient way to clean things up if the HEADER or TRAILER files have been changed or when the logical order of files has been rearranged.

═══ 12. XSpO - External SpHyDir Rexx Code ═══

It is nice to have code that understands Web Explorer, but other people use Netscape, Mosaic, or other browsers. SpHyDir can't handle every type of hotlist file.
The solution to this and other problems is an External SpHyDir Object. These are called XSpO's (pronounced "expo") but given that they are written in Rexx, it is acceptable to roll an "R" in front of the name and call it a "Rexx-spo".

An XSpO is an external Rexx program that resides as a CMD file on disk. If you click on the file with the second mouse button, open its Settings, and change the icon, you can give it some meaningful icon. Then you put a shadow of the file in a desktop folder, probably along with your program object for SpHyDir.

An XSpO acts something like a Tool. You drag it from its workplace folder and drop it somewhere in the SpHyDir program. The nature of the XSpO decides where it can be dropped. Unlike the Tools, an XSpO could be dropped on an entry area or list.

The XSpO interface will be extended whenever an idea comes to mind. Currently, the two supported uses of an XSpO are to fill the New Links list box in the Link Manager window and to add a URL Link to an object in the work area. Sample XSpO files are supplied with SpHyDir for both purposes and will be discussed here.

SpHyDir assumes that it has an XSpO whenever the user drops a CMD file on an object. When the object accepts an XSpO, it generates a Rexx Call to the file as an external procedure. Since the caller is running in the VX-Rexx environment, all of the VX-Rexx functions are available to the XSpO. However, it will be difficult to make use of them without 1) a copy of the VX-Rexx manual and 2) some hints from me about the environment.

An XSpO that uses VX-Rexx function calls to manipulate objects is said to be "dirty." The internal implementation of SpHyDir may change in the future, and such files may need to be changed. An object that supports XSpO will generally provide a convention using only arguments, the return value, and the stack. Details may differ from object to object. An XSpO that does not directly call VX-Rexx functions is said to be "clean."
The terms are relative, and it may be convenient to use "quick and dirty" techniques from time to time.

When an XSpO is called, it is always passed as an argument the name of the object on which it was dropped. There is no good way (currently) for XSpO's to declare a type, so the XSpO itself has to make sure it has been dropped in the right place and return without doing anything if it is called by the wrong object. Objects that call XSpO's should ignore any null return.

When an XSpO is dropped on the New Links listbox of the Link Manager window, it is passed no arguments other than the "New_Links" object name. The XSpO puts new list entries on the Rexx stack. Each entry begins with a URL (no blanks are allowed by SpHyDir in a URL) and then a Title (blanks are OK in the title). Each line in the Rexx queue is one list entry. Only the title will show up in the list box; the URL is kept as user data and is presented later on when the title is dragged to create a link. If the XSpO returns the word "CLEAR" from the function call, then the list box is cleared and the new list becomes its only contents. Otherwise, the new links are added in front of the existing links.

The following is the complete text of an XSpO that duplicates the existing Web Explorer Link Manager function. This file is distributed with SpHyDir and may be adapted to support other quicklist formats.

/* XSpO version of Web Explorer Links */
arg object
if object<>"NEW_LINKS" then return
exploreini=Value("ETC",,"OS2ENVIRONMENT")"\EXPLORE.INI"
strm_status = Stream( exploreini, "Command", "Open Read" )
if strm_status="READY:" then
  do while lines(exploreini)>0
    line=linein(exploreini)
    if line="[quicklist]" then leave
  end
do while lines(exploreini)>0
  line=linein(exploreini)
  parse var line "quicklist=" title
  if title="" then return
  url=linein(exploreini)
  queue url title
end
return "CLEAR"

The Workarea also supports XSpO's, but the only function currently supported is to add a Link to an object.
Dropping this type of XSpO on a Workarea object is simpler than adding the link to the Link Manager list and then dragging the list item over and dropping it on the object. A supplied XSpO uses some rather "dirty" logic to find the current URL in Web Explorer (you have to enable the WE option that displays the URL in a box at the top of the window). Dropping this XSpO on a paragraph, point, or image puts a link to whatever page is currently being displayed in WE (without requiring that the page be added to the Quicklist).

arg object
if wordpos(object, "NEW_LINKS WORKAREA")=0 then return
desktop = "?HWND1"
app = VRGet( desktop, "FirstChild" )
do while app<>""
  title=VRGet(app,"Caption")
  if substr(title,1,16 )="IBM WebExplorer " then do
    title=strip(substr(title,19),"B")
    kid= VRGet(app,"FirstChild")
    url=Searcher(kid)
    if url<>"" then queue url title
    if object="WORKAREA" then return "LINK"
    return "ADD"
  end
  app = VRGet( app, "Sibling" )
end
return ""

Searcher: procedure
parse arg w
do while w <> ""
  if VRGet( w, "Visible" ) = 1 then do
    class = VRGet( w, "ClassName" )
    caption = VRGet( w, "Caption" )
    if class="WC_ENTRYFIELD" then return caption
    subkid = VRGet( w, "FirstChild" )
    url= searcher(subkid)
    if url<>"" then return url
  end
  w = VRGet( w, "Sibling" )
end
return ""

═══ 13. Forms Support ═══

Modern HTML and Web Browser programs allow the user to enter data and make selections with standard GUI Boxes, Buttons, and Lists. Collectively, these features are known as "forms" support. There are two steps. First, the author must design the data entry form using HTML language elements. Second, a program must be written in some supported language to process the data that the user enters.

HTML forms provide a subset of the standard GUI dialog features that will be familiar to users of Visual Basic or other visual programming languages.
The user is presented with a set of single line and multiline text entry fields, checkboxes, radio buttons, selection lists, and push buttons. The user makes selections and enters data. Then a push button (or the Enter key) transmits data to the server.

═══ 13.1. Forms Handling Programs ═══

The data entered in a form has to be passed to locally written code that runs on the Web server machine. For a Unix machine, this program receives data through the "CGI" protocol. CGI specifies a particular way to pass information about the request, the remote machine, and the local server environment. Most CGI programs are written in either C or Perl.

However, SpHyDir runs in OS/2 and is written in Rexx. IBM has a very nice Web server package for this environment called GOSERVE. Each arriving request is passed to a locally customized Rexx filter program running as a subthread of the server. Whatever efficiency is lost using an interpreted language like Rexx is gained back by using threads instead of creating a new process for each request.

Although GOSERVE provides all the necessary forms support, it doesn't use precisely the same conventions as the CGI interface. SpHyDir will talk more generically about a "forms processing program" while other sources would probably call the same thing a "CGI program" without assuming that there could be any other kind of server.

Each GUI object in the HTML form is associated with a variable name. The data and selections are transmitted as a sequence of "name = value" pairs, where name is the variable name associated with a field or button and value is the data typed or the alternative selected. This sequence of name and value pairs must be decoded by the program that processes the request.

After the request is processed, the results are sent back to the remote user. Normally, the format of this result is another HTML file. Frequently, the response will also have Forms objects.
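As a rough illustration of the decoding step, the following Rexx sketch splits a string of form data into names and values. It assumes the usual URL encoding that browsers apply to form data (pairs joined by "&", blanks sent as "+", and special characters sent as "%" followed by two hex digits). The routine names DecodeForm and Unescape are hypothetical, the sample data is invented, and the exact form in which GOSERVE hands the data to a filter may differ.

```rexx
/* Sketch only: decode URL-encoded form data into name/value lists.  */
/* DecodeForm and Unescape are hypothetical, not SpHyDir routines.    */
data = "SAMPENTRY=Yale+University&SERVER=OS2&NOTE=R%26D"
call DecodeForm data
do i = 1 to fname.0
  say fname.i "=" fvalue.i
end
exit 0

DecodeForm: procedure expose fname. fvalue.
  parse arg rest
  n = 0
  do while rest \== ""
    parse var rest pair "&" rest        /* peel off one pair          */
    parse var pair key "=" val
    if key == "" then iterate
    n = n + 1
    fname.n  = translate(Unescape(key)) /* uppercase, like Rexx names */
    fvalue.n = Unescape(val)
  end
  fname.0 = n
  return

Unescape: procedure
  parse arg s
  s = translate(s, " ", "+")            /* "+" stands for a blank     */
  out = ""
  p = pos("%", s)
  do while p > 0
    hex = substr(s, p + 1, 2)
    if verify(hex, "0123456789ABCDEFabcdef") = 0 then do
      out = out || left(s, p - 1) || x2c(hex)   /* %XX is one byte    */
      s = substr(s, p + 3)
    end
    else do                             /* stray "%": copy it as is   */
      out = out || left(s, p)
      s = substr(s, p + 1)
    end
    p = pos("%", s)
  end
  return out || s
```

The names are folded to uppercase only because Rexx variable names are uppercase; that is a choice made for this sketch. The values are kept exactly as the user typed them.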
The contents of the response file will include some insertions based on the results of the previous request.

Thus a comprehensive tool to simplify Form processing has to solve three problems:

1. It must provide the user with an easy way to specify the GUI objects (entry fields, buttons, check boxes, and selection lists). SpHyDir does this by providing a Toolbar of GUI objects, just as Visual Basic and VX-Rexx solve the same problem with similar toolbars.

2. It must provide a simple way to decode the incoming "variable=value" pairs. The Rexx language (along with some helper functions provided by GOSERVE) makes this a trivial task, but it is not a very difficult problem in any language. SpHyDir provides a Rexx "helper" routine named SpHyDir_Decode in the SPHYHLPR.VRS file that provides this service.

3. It must provide a way to insert data into the reply sent back to the user.

Some existing programs generate the entire response with program statements:

printf("<TITLE>Response to Your Request</TITLE>\n");

This is tedious, difficult to read, and impossible to validate. A second approach scans an HTML file and inserts data:

<TITLE> %insert TITLETEXT </TITLE>

This is slow because it requires a syntax scan during every reply.

SpHyDir provides (IMHO) a better solution. The programmer uses SpHyDir to create an ordinary HTML file with text and forms objects. As SpHyDir generates the HTML, it separately tabulates the location of strings or insertion points that correspond to the various forms variable names. If the file is fetched as a *.HTM file, then everything goes out as it was designed. However, if a forms processing program wants to send the file back as a reply to a previous query, then it can call a helper routine (SpHyDir_Reply in the supplied Rexx-GOSERVE example) that extracts from the program the current value of all variables whose names correspond to the variable names assigned to the forms objects in the HTML file.
These current values from the program are inserted into the file as it is sent back to the user and populate the fields, boxes, buttons, and lists that are available for the next reply. Rexx is a particularly attractive language in which to do this kind of programming because access to its variable names and symbol table is simple and flexible.

The combination of SpHyDir, Web Explorer, GOSERVE, and Rexx-based Forms processing programs provides a simple but powerful Web development environment. However, local requirements will soon make it necessary to extend this development environment to real CGI programs running on Windows NT or Unix servers.

═══ 13.2. Forms are poorly Form-matted ═══

The ambiguities of HTML that cause problems for SpHyDir in normal text are made worse when Forms are processed. Consider a simple example:

The top line is a simple entry area for typed characters. The second line presents three alternatives using the "radio button" metaphor (only one can be selected, and choosing one deselects the others). The last line is a check box that can be set or cleared by clicking it.

In visual programming languages, such as Visual Basic, each radio button or check box has a "caption" defining the text that follows the box or button and describes the option. In this example, the captions are "HTTP", "Gopher", "FTP", and "BINARY". Occasionally, but less frequently, a Text Entry object would also have a caption (in this case "Identify a Server Machine:"). In any case, the Caption is an attribute of the object and is part of the object definition.

However, in HTML a box or button object is just the box or button itself. Any caption text is just ordinary "paragraph" text. There is no limit on its size, contents, or structure. Just as SpHyDir had to invent a chapter and section structure by looking at Heading tags, it must also construct GUI programming objects by assuming that the captions are reasonable and obvious.
All GUI objects (entry areas, buttons, boxes, and selection lists) must be inside a FORM area. However, the form can also contain ordinary text, images, ordered and unordered lists, sections, and everything else that is valid in a document. Unlike a paper form, where the instructions are usually separate so that the input can be easily processed, an HTML form can have the input widely scattered through the text. When the form is submitted, only the values of the entry fields and the selections made by the user are transmitted, not the captions and text.

A user will become confused, however, if each Radio Button option is accompanied by three screens full of explanation. The relationship between the buttons would be lost. Therefore, it is probably best if each field or button has a short clean caption. Furthermore, based on a universal GUI practice, the caption of a data entry area would usually come in front of the entry field (as in the example "Identify a Server Machine:"), while the label of a check box or radio button comes right after it.

In normal text, most of the SpHyDir objects start a new line. This is not true of Form Objects. If a browser can fit the next button on the same line, it will do so. The only way to be sure that there is a line break is to create a paragraph (<P>) tag.

In normal text, every SpHyDir object is "paragraph sized" or larger. SpHyDir knows to create a line break when paragraphs, ordered lists, and headers are encountered. But several forms objects may have to go on the same line. One idea would be to create a higher "grouping" object to which they might all belong, but SpHyDir is based on the principle that format should follow from document structure, and it seems wrong to create artificial structure to duplicate a format feature.

It has always been possible to create an empty paragraph. Simply drag the Paragraph tool to the document to create a new paragraph, then type nothing in it.
When the HTML is generated, this creates a line of the form:

<P> </P>

in the output. The problem is that SpHyDir ignores empty paragraphs when reading in normal text, so this structural element is lost when the document is re-edited. SpHyDir relaxes this rule, and will preserve empty paragraphs when they are encountered inside a Form structure. A form designer should drop an empty paragraph object between any two buttons, fields, or boxes that are supposed to appear on different lines.

═══ 13.3. Form Tools ═══

The Toolbar contains template objects for all the GUI elements that HTML supports.

If this document is viewed using a Web Browser, examples of the Forms objects will appear in the document. They are not connected to any processing program at this time. Attempting to submit anything from these form objects will return an error message. Just go back to the document and continue. Forms examples will not appear in the INF version of this material, because forms are not supported in IPF.

═══ 13.3.1. The Forms Tool ═══

Interactive form elements are valid only within a section of a document marked as a Form. The Form Tool creates such a section. Drag the Form Tool over and drop it anywhere in a document except within another Form. This creates a new level in the document tree. All other form objects, and all ordinary document objects, are valid within a Form section.

Each form must be associated with the name of a program that the server will run to process the data from the form. When the form object is created or selected, an entry area becomes visible into which a program identifier can be typed. The exact format for program identifiers depends on the type of server being used. On a Unix server, this is usually the name of a program in the "cgi-bin" subdirectory, as in "/cgi-bin/program". On other systems, this may be any program name.

═══ 13.3.2. The Single Line Text Entry Field ═══

The Entry Field Tool creates a "single line" text entry area. This is the type of field that would be used to read simple data like a name, phone number, E-Mail address, or book title.

The caption for the Entry Field Object is treated like paragraph text. When the field is created, or when the object is double-clicked, the standard Text Edit window opens. Although the user can type an arbitrary amount of text into the window, the caption should generally be short. When the object is closed, it is the caption and not the default field contents that appears next to the Entry Field Object in the SpHyDir Workarea.

An Entry Field object has attributes. When the object is created or is selected by clicking with the first mouse button, a set of fields becomes visible in the upper right section of the Workarea. Yes, these are also "entry fields", but they are part of the VX-Rexx application and not the HTML Forms. Many of these attributes are common or similar across all the Forms objects. For each type of object, the appropriate set of fields becomes visible.

The first (top) attribute is a variable name. When the form is submitted, the text entered into the field will be transmitted as the value of a "name=value" sequence. For example, entering "Yale University" into a field with this definition would transmit the sequence:

sampentry=Yale University

to the Web Server. This value, along with any other values from other fields in the form, will be passed to the program designated by the FORM object to handle the data.

In many cases, the Text Entry field will be initially empty and the user will be expected to type a value in. HTML allows an initial value to be transmitted from the server. This string will appear in the Text Entry field and will be transmitted back as its value if the user doesn't change it. A static default value can be entered in the second (long middle) field.
A default value can also be generated dynamically from a previous Web Server program that requested transmission of the current page. To allow this, SpHyDir creates a "symbol table" external but connected to the HTML source for the page. This table is attached as an Extended Attribute of the file in the OS/2 or NT file system, and is stored less elegantly as a separate file in Unix. For this field, the table would contain a line of the form:

ENTRY SAMPENTRY nnnn 18

where "ENTRY" is the type of forms object, "SAMPENTRY" is the name of the variable associated with the field, "nnnn" will be replaced with the byte offset in the file of the default value (in this example, the offset of the "S" in "Sample Entry Field"), and 18 is the length of the static default value.

An Entry field generates HTML text of the form:

<INPUT TYPE="TEXT" NAME="SAMPENTRY" VALUE="Sample Entry Field" SIZE="30" MAXLENGTH="30">

If no static default text is provided, a VALUE="" is generated to simplify the insertion of a dynamic default text from the symbol table. SpHyDir helper routines simplify the insertion of dynamic default text from forms processing programs.

The last attributes of an Entry Field include a checkbox to declare that this is a Password field (so the data typed in should be masked out) and two length fields. The first length specifies the size of the box; the second is the maximum amount of data that can be typed into the box. If the maximum amount is larger than the size of the box, or is omitted altogether, then when the user gets to the end of the box the previous characters shift left to make room.

═══ 13.3.3. The Multiline Entry Field ═══

A Multiline Entry (MLE) Object generates an area with scroll bars into which the user can type an arbitrary amount of text. This is usually used for freeform feedback (to send comments, suggestions, or complaints to the author). It can also be used to annotate information.
An MLE is a large object, so it has no formal caption. If you want to describe it, do so in the paragraph that precedes or follows it. The contents of the MLE object, which can be edited by double-clicking the object and opening the Text Edit window, is the static data that will appear as a default within the MLE window when it is displayed on the remote screen.

An MLE field in a Web Browser will not support font changes or hypertext links. SpHyDir may eventually get around to disabling these options in the Text Edit window. Meanwhile, when editing default text for an MLE, don't use italics, bold, or any of the other format tags.

An MLE is associated with a variable name. When the form is submitted, the new content of the MLE will be assigned as a value to that variable name. SpHyDir creates an entry in the Variables Extended Attribute with the type of "MLE", the name of the variable, the location of the start of the default text, and the length of the static default text. This can be used by the helper routine to insert an alternate string dynamically into the form as it is being transmitted. The content of such a string would be whatever HTML declares to be valid between the <TEXTAREA> and </TEXTAREA> tags.

An MLE also has a size specified as rows and columns. They appear in the two lower numeric boxes and can be changed to fit the application needs.

═══ 13.3.4. The Checkbox Tool ═══

The Checkbox Tool creates a standard GUI Checkbox object. A caption follows the Checkbox to describe the option. The caption is regarded as the "contents" of the object and may be edited by double-clicking the checkbox object to open the Text Edit window. Unlike the MLE, the checkbox caption is ordinary text and may contain emphasis (bold, italics) or hypertext links.

The Checkbox is associated with a variable name. When the checkbox is selected, a "name=ON" pair is returned.
A static default value can be set by clicking the "Checked" option when the checkbox object is currently selected.

A Checkbox has a variable name. It can also be statically assigned an initial value by checking the "Checked" checkbox for the Checkbox object. [This is about the fourth pass through this document, and it just gets worse as it gets more precise.]

There are different ways to express the value of a Checkbox variable. As a number it would be 0 or 1. In other contexts it might be "YES" and "NO" or "TRUE" and "FALSE". In HTML, the checkbox is turned on by adding the keyword "CHECKED" to the tag that defines it:

<INPUT TYPE="CHECKBOX" NAME="NOMAYO" CHECKED>

However, when the user submits the form and the box is checked, the variable name is returned with the value "ON" as in:

NOMAYO=ON

Clearly this is a muddy area and may be subject to further refinement. When SpHyDir generates the Variables EA for this field, the entry will have the form:

CHECKBOX NOMAYO nnnn 7

Where the type is CHECKBOX, the variable name is NOMAYO, nnnn is the byte offset in the file of the blank following the variable name, and the length is either 0 or 7 since the word "CHECKED" has seven letters and is either present or omitted.

═══ 13.3.5. The Radio Button Tool ═══

The RadioButton Tool is used to specify one of a set of mutually exclusive alternatives. Only one can be selected, and selecting that option automatically turns off the other alternatives.

[Example form: "The Web server is:" followed by a group of radio buttons]

The caption of the RadioButton, which can be edited by double-clicking the object to open the Text Edit window, is ordinary text and may have emphasis and hyperlinks. However, if the captions are large enough so that the alternatives cannot all fit on the same line, the user must provide additional HTML markup (such as the <HR> tag) to group related buttons together.

When a RadioButton Object is created or selected, three fields become visible at the top of the Workarea.
The first field provides the variable name for this button (and implicitly all other buttons that are part of the same grouping). The second field contains a string that will be assigned to the variable when this particular button is selected. Under these fields, a Checkbox allows this particular button to be selected as the default for the group. To be meaningful, only one button in each group can be checked as the default.

In Visual Basic and VX-Rexx, radio buttons have to be collected in a Group Box to be related to each other. In HTML forms, radio buttons are related by having the same variable name. The value assigned to that variable name distinguishes one button from another.

Radio Buttons pose a problem for the symbol table in the Extended Attribute. Up to this point, every HTML object produced one entry with its own variable name, and there was one insertion point for the value of that variable. However, each Radio Button has its own tag location, and to override a static default with dynamic information from a program, the "CHECKED" attribute in all of the tags has to be manipulated. So for every radio button, the Variables EA gets a separate entry:

RADIOBUT SERVER=UNIX nnnn 0
RADIOBUT SERVER=OS2 nnnn 0
RADIOBUT SERVER=NT nnnn 0

The "nnnn" in each line is the offset in the file of the blank that follows the name and precedes either ">" (if the length is 0) or "CHECKED>" (if the length is 7). An acceptable strategy is to process these entries in order, checking the current value of the program's "SERVER" variable against the possible matching strings "UNIX", "OS2", and "NT". If a match is made, then "CHECKED" is inserted into the HTML file. If there is no match and the length is 7, then the old "CHECKED" string is removed.

═══ 13.3.6. The Spin and Listbox Objects ═══

A Spin field displays a sequence of alternatives within a single window.
CUA rules suggest that the Spin choice is appropriate when the alternatives are ordered, but the Spin object also allows a small number of alternatives to be meaningfully displayed in a small space. In HTML terms, a Spin object corresponds to a SELECT tag with no SIZE parameter.

[Example form: "Get a dozen eggs:" followed by a Spin selection list]

An interesting feature here is that Web Explorer seems to mess up the order and selection rules. It defaults to the last alternative chosen, when the standard clearly says that the first is the default, and it seems to get "bigger" and "smaller" reversed.

A Listbox provides another way to display alternatives. It is probably more suitable if the number of options is large. This Object is also a SELECT list, but with the SIZE parameter specified.

For both selection objects, a static list of alternatives can be entered through the Text Edit window by double-clicking the object. Each alternative is typed on a separate line. Press Enter between alternatives. Do not use character emphasis or try to assign links to the alternatives.

List alternatives can be assigned dynamically by creating an array of character strings. For example, in Rexx a set of alternatives might be specified by the sequence:

account.0=3
account.1="Checking"
account.2="Savings"
account.3="Money Market"

If the user chose the second option, this would then feed back as the string "account=Savings" which the Rexx helper routines would use to assign the string "Savings" to the variable ACCOUNT in the next program. [A note to those who are not Rexx wizards: the scalar variable ACCOUNT is completely independent of the "stem" ACCOUNT. (with the trailing period). This strategy uses the stem to hold the list of alternatives, and uses the scalar to designate which alternative was selected.]

═══ 13.3.7. Pushbuttons ═══

After filling in the required fields, the user triggers an action on the server by pressing a Pushbutton.
If no Pushbutton object appears in the form, pressing the Enter key may also transmit data. A default Pushbutton with no options is labelled "SUBMIT". It will trigger transmission of the data, but will add nothing to the datastream itself. Multiple "SUBMIT" buttons would be indistinguishable from each other.

Each Pushbutton has attributes. The left entry box is the name of a variable. The right box is both the value assigned to the variable when the button is pushed and also the label placed on the face of the button.

When an explicit variable name is assigned to a Pushbutton object, an entry is also made in the Variables Extended Attribute. It identifies a type of "PUSHBUT", the variable name, the offset of the static value string, and its length. If the helper functions are used, they will check for a variable of the same name in the calling program and will substitute its current value in the Pushbutton definition. This means that the caption of the Pushbutton can be dynamically changed by the calling program.

A special version of the Pushbutton control is established if the Hidden attribute is checked when the button object is selected. A Hidden field doesn't appear on the user's screen, but it is passed back as part of the data stream to the next program. This can be used to pass a handle, transaction ID, or other state information from one screen to the next.

═══ 14. Bugs and Restrictions ═══

VX-Rexx 2.1B has a bug when moving a tree of records in a container. Suppose, for example, you decide to move one section in front of another. You can click on the sections to collapse the tree so that just the two icons are showing. You can then drag the second icon in front of the first. However, when you re-expand the tree, you will see that elements two or three levels down in the tree have been incorrectly reorganized.
For now, the safe way to move large sections of the document is to mark them with Alt-L and move them through the SpHyDir special "Clipboard" window.

Web Explorer creates unusual objects that cannot be directly dropped on the SpHyDir windows. To process a document, drag the document from the WE window to a folder in the HTML library on the current machine. SpHyDir can only process files that are in the library. Drop a URL object on the desktop or in a folder first, then use it to build a link.

═══ 15. Supported and Unsupported HTML ═══

SpHyDir II was restructured to simplify extensions. Most of the HTML 3.0 and Netscape tags and attributes are now supported, or will be shortly.

SpHyDir generates LINK tags for Subdocument relationships (Next, Previous, Up). It preserves, as properties of the Document Object, LINKs mentioned in the current HTML 3.0 draft (Home, TOC, Index, Glossary, Copyright, Help, and Bookmark). It plans to support Header and Trailer links for document specific boilerplate files. Other LINK tags are not preserved.

The SpHyDir objects have a place for every valid construction, but they may not support constructions that are invalid, even when frequently used. If there is a reasonable strategy, current incorrect markup may be "upgraded" to valid status. For example, lists may not contain any data outside the list items:

<OL>Text here is illegal, but there is a lot of it in practice.
<LI>This is an implied paragraph
<LI><P>This is an explicit paragraph</P></LI>
Text here is nominally illegal.
</OL>

SpHyDir will take one text string outside the points and "upgrade" it to the HTML 3.0 List Header <LH> contents. In other places, loose text may be upgraded to a <CAPTION> or <CREDIT>. However, when there is no place to put it, the text may get lost.

SpHyDir needs where possible to convert HTML constructs to the properties of an Object. A particular problem is created by hypertext labels generated by <A NAME=xxx>word</A>.
Since SpHyDir cannot manage properties for individual words, it assigns the NAME to the ID property of the Paragraph, Section, or other object in which the labelled word appears. SpHyDir intends to migrate this to the preferred <P ID=xxx> syntax of HTML 3.0 as soon as that syntax is universally supported. For now, SpHyDir rewrites the HTML by applying the <A NAME=xxx> tag to the entire text content of the Paragraph or Header in which it previously appeared. SpHyDir does not support two <A NAME=xxx> labels within the same paragraph or header.

In many visual programming languages, buttons and boxes have a caption. This is not an HTML concept. SpHyDir follows the more common practice to simplify use. In HTML, a check box is just the box:

<INPUT TYPE="CHECKBOX" NAME="BIN"> BINARY

Syntactically, the last word "BINARY" is outside the tag. It is just text. If SpHyDir didn't make any structural assumptions, it would just appear as ordinary paragraph text. However, SpHyDir depends on creating "objects" that are bigger than just a "[]" or "O". So the Entry field, Checkbox, and Radiobutton forms objects include text that functions as the "caption" of the object.

═══ 16. Character Sets ═══

Character set issues are usually overlooked in the US. However, a World Wide Web has to confront the problem of displaying information in languages other than English. This is a fairly difficult problem that must be approached carefully.

The most complete solution would be Unicode, a two-byte character set that includes every modern language in the world. This may prove important in the future, but its use today is premature.

A more modest solution is to use the ISO "8859" family of one-byte character sets. In particular, the ISO 8859-1 "Latin 1" character set supports all the Western European languages from Iceland, to the Nordic countries, to Italy.

There is little perspective in Connecticut about how people overseas actually configure their personal computers.
The screen is a more powerful device and can support many different character sets. The keyboard is more constrained. Through the years there have been many different approaches to the keyboard entry of foreign language character sets. If SpHyDir is going to provide an easy to use editing environment, the data entry is an important part of the problem.

Without any user input, SpHyDir now caves in to the OS/2 System design. It embraces the IBM architecture of Code Pages. The assumption is that IBM sells hardware and software overseas and if it insists on pushing an architecture like Code Pages, then that must be how people are actually using the system.

A few terms need to be defined:

character set
A character set is a collection of characters that completely addresses a particular need. For example, the upper and lower case alphabet is a character set that can be used to express all the common names of people in the US (since names like "Sally2" and "Bi$$" don't occur). The minimal useful computer character set is the 94 characters in the ASCII set (although for many purposes you can get along without ~ ` { } or ^). Extensions to this character set exist to support particular foreign languages or special purposes (math, APL).

font
A font is a set of instructions for drawing each character in a character set on a screen or printer. The system normally uses a small set of bitmap fonts to display characters of normal size. Algorithmic fonts such as Microsoft's TrueType or Adobe's ATM fonts can be displayed in any size.

code
A standard that assigns number values to every character in a character set, allowing those characters to be stored in a computer memory, on disk, or to be transmitted on a communications line. ASCII and EBCDIC are examples of codes. A code always has some control characters to represent the end of a line, a backspace, a tab, and other functions. In ASCII, the control character values are from 0 to 31 and in EBCDIC they are from 0 to 63.
code page
A Code Page is (essentially) a character code in which all the control values have been removed and replaced with additional printable characters. Code Page is mostly an IBM term, though it has rubbed off on Microsoft. It allows a display or printer to have some additional special use characters that can be displayed in contexts where the normal functions of control characters are not needed.

When IBM designed the PC in 1980 there were no general international standards for character sets and code pages beyond the standard ASCII set. The PC created a Code Page by filling in the remaining 256-94=162 code locations with a haphazard collection of box drawing, international, and dingbat (club, face, "small house") characters. Years later this was designated in IBM terms as Code Page 437.

Later on during the 1980's, the International Organization for Standardization (ISO) finally developed a set of one-byte character sets that extended the ASCII standard to cover other languages.

8859-1 covers Western Europe
8859-2 covers "Latin" Eastern Europe
8859-5 Cyrillic
8859-6 Arabic
8859-7 Greek
8859-8 Hebrew
8859-9 like 8859-1 but drop Iceland and add Turkey

The HTML 2.0 standard makes 8859-1 the default encoding for HTML documents. However, the HTTP and MIME standards allow a document to be encoded in any of the ISO 8859 family of code sets. It would be a mistake for SpHyDir to drop its USA-centered perspective only to adopt a slightly broader 8859-1 Western European perspective.

The "Latin 1" character set on which the 8859-1 code is based includes some characters which were not part of the IBM PC Code Page 437. Most of the vendors (Microsoft with Windows and NT, Adobe with PostScript and ATM, DEC) simply adopted 8859-1 as their standard code. IBM decided that it was too important to leave the basic box-drawing characters in their current location. Instead, they created Code Page 850, which includes all the Latin 1 characters but does not assign them to their 8859-1 code values.
The OS/2 Presentation Manager has a dummy Code Page 1004 that reflects the ISO 8859-1 character values. However, this is not recognized as a "real" Code Page number by most of the commands and OS/2 services that deal with such things.

Before beating up on IBM, it should be noted that the ISO 8859-1 standard may not be quite as useful as it first appears. While it is fairly simple to display 256 different characters (or more) on a computer screen or printer, it is very difficult to squeeze all those characters on the keyboard. Any one-byte code page will have too many characters for easy keyboard input, but not enough characters to handle the total information system requirement.

Long before modern computers and laser printers made a complete 8-bit code set possible, foreign countries had adopted variations on the old 7-bit "ASCII" character set. The idea was to give up a character you don't need for one that is more important in your country. The characters ` ~ ! @ # $ % ^ { } [ ] \ | < > could be replaced with Æ or Þ. These substitutions created other Code Pages in which the foreign use characters have been placed in the familiar ASCII location, and the ASCII characters that they displaced have been put somewhere else.

There is also the problem that in any large publishing project, the character sets quickly expand beyond any 256 character subset. Beyond French and Spanish, there are Hebrew, Arabic, Cyrillic, Greek, and then the problem of mathematical symbols, special punctuation, and the stupid box drawing characters that caused all the trouble in the first place. Some of this you can handle with GIF files, but the rest pose a problem.

HTML and current World Wide Web practice address this issue with Entities. The characters that are not part of the standard ASCII set are referenced by name. An Entity reference to a character begins with "&", then contains the character name, and ends with ";".
The special characters used in HTML syntax are converted to Entities, with < > and & referenced as &lt; &gt; and &amp; respectively. The Æ symbol is denoted &AElig; (short for "A-E ligature").

Going back to the earlier analysis, the Entity name allows HTML to refer to a character in a character set without becoming dependent on any particular code mapping. While a code mapping would limit you to 256 characters, the range of possible names is unlimited. Entities also allow you to accommodate Code Pages that either reflect historical accident (the original PC Code Page 437) or National Use subsets.

The CODEPAGE statement in the OS/2 CONFIG.SYS file specifies first the default Code Page number, and then an alternate value. IBM normally makes 437 the default to support obsolete DOS utilities. In modern use, particularly when someone edits HTML files, it makes more sense to at least make 850 the default:

CODEPAGE=850,437

For more information, look up CODEPAGE in the Command Reference file in the Information folder.

SpHyDir does not change the current Code Page. The whole idea behind the current SpHyDir strategy is that whatever Code Page the user has currently selected must be familiar. The user must already know how to deal with it and how to comfortably enter data in the local language. So SpHyDir converts HTML use to the Code Page environment rather than trying to change OS/2 to some other character set.

The number of the current code page is used as a file extension. SpHyDir searches the root directory of the HTML library (determined from the HTMLLIB environment variable or the current directory when SpHyDir starts up). It looks for three files: ENTITIES.xxx, CHARIN.xxx, and CHAROUT.xxx where xxx is the Code Page number. SpHyDir is distributed with *.850 versions of these three files for the recommended Code Page 850.

The ENTITIES file is an ordinary text file with entries to map the Entity names to values in the current code page.
For example, the ENTITIES.850 begins with the lines:

b5 Aacute Á Capital A, acute accent
b7 Agrave À Capital A, grave accent
b6 Acirc Â Capital A, circumflex accent

Only the first two items are significant. On the first line, "b5" is the hex representation of the value assigned to the character in the 850 code page and "Aacute" is the name of the entity (with the leading "&" and trailing ";" stripped off). The rest of the line is commentary.

SpHyDir has a builtin knowledge of the &lt;, &gt;, and &amp; Entity names. These are also the only Entities that can be mapped to a code value below 80 hex. All the other Entity names that SpHyDir will process specially come from the ENTITIES.xxx file. However, if SpHyDir encounters an Entity name that is not defined in the file, it simply converts the "&" to a Smiley Face dingbat character and retains it in its Entity form in the Workarea and Text Edit windows. Later on the Smiley Face is turned back to "&" when the HTML is generated.

The CHARIN.xxx and CHAROUT.xxx files provide translate tables to handle files encoded in the ISO 8859-1 character set. The CHARIN table translates characters from the HTML file with code values from A0 to FF hex to the corresponding codes in the current Code Page. The CHAROUT file provides a table to translate Code Page characters with a hex value of 80 to FF to the external ISO 8859-1 set.

SpHyDir provides CHARIN.850 and CHAROUT.850. Since the 850 Code Page contains the entire Latin 1 character set, this appears to be a fairly reasonable arrangement. The user is free to create a CHARIN.437 to support the older PC character set, but since it does not contain all the characters in the Latin 1 alphabet some characters may be lost on input. Also, the CHAROUT table cannot meaningfully translate PC dingbat characters (like the box drawing characters) that are not part of the 8859-1 set.
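The two translate tables can be illustrated with a short Python sketch. This is not the program that produced the distributed CHARIN.850 and CHAROUT.850 files, and the on-disk layout of those files is not described here, so the sketch only builds the mappings in memory. The function names and the choice to leave an untranslatable value unchanged are assumptions; the Code Page definitions come from Python's standard codecs.

```python
def build_charin(codepage):
    """Map ISO 8859-1 values A0..FF to values in the given Code Page.

    Characters the Code Page lacks are collected in `lost` and left
    untranslated in the table (an assumption made for this sketch).
    """
    table, lost = {}, []
    for value in range(0xA0, 0x100):
        char = bytes([value]).decode("latin-1")
        try:
            table[value] = char.encode(codepage)[0]
        except UnicodeEncodeError:
            table[value] = value
            lost.append(char)
    return table, lost


def build_charout(codepage):
    """Map Code Page values 80..FF to ISO 8859-1 values."""
    table = {}
    for value in range(0x80, 0x100):
        try:
            char = bytes([value]).decode(codepage)
            table[value] = char.encode("latin-1")[0]
        except UnicodeError:
            # Box drawing and other dingbats have no 8859-1 equivalent,
            # so the value is passed through unchanged (an assumption).
            table[value] = value
    return table


charin_850, lost_850 = build_charin("cp850")
charin_437, lost_437 = build_charin("cp437")
charout_850 = build_charout("cp850")

print(len(lost_850), "Latin 1 characters missing from Code Page 850")
print(len(lost_437), "Latin 1 characters missing from Code Page 437")
```

Run against Code Page 850 the `lost` list comes back empty, while Code Page 437 reports missing characters such as the capital A with acute accent, which matches the warning above about losing characters on input.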
Assuming that the user adopts this suggestion to make 850 the default Code Page:

If a CHARIN.850 file has been copied to the root directory of the HTML library, then immediately after reading in an HTML file somewhere in that library, SpHyDir uses the table in that file to translate any ISO 8859-1 extended code values to their corresponding Code Page values. Note that this simply shuffles one set of codes above hex 80 to another set of codes also above hex 80. Since all the HTML markup and entity names use standard ASCII characters below hex 80, the initial translation will not affect any of the subsequent syntax analysis.

If there is no CHARIN table, then any code value in the HTML file will be read into SpHyDir untranslated. It will display with whatever character the current Code Page assigns to that code value. However, without a CHAROUT table it will also be written back to HTML with its original code value. SpHyDir will not provide any help displaying or editing such characters, but it will not damage them if the user leaves them undisturbed.

When processing text, SpHyDir identifies an Entity from the leading "&". It will handle the &lt;, &gt;, and &amp; Entities automatically. Without an ENTITIES.850 file in the root directory of the HTML library, those are the only Entities that it knows about. With an ENTITIES file, it will look up any other Entity names that the file defines and replace the Entity reference with the code value in the current Code Page for the corresponding character. The character will then display normally in the Workarea document tree and in the Text Edit window.

Any Entity name that is not matched against the file remains an Entity. Since SpHyDir wants the "&" character to edit normally, and since a lot of dingbat characters are available, the "&" introducer is replaced by the Smiley Face character whose code value (in all Code Pages) is 01. When it goes to generate HTML, SpHyDir converts the Smiley Face dingbat back to an "&".
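The input side of this processing can be sketched in a few lines of Python. This is an illustration only, not SpHyDir's own code: the function names are invented, the sketch assumes every Entity reference ends with ";", and it assumes each ENTITIES line is well formed (a hex code followed by a name).

```python
import re

SMILEY = "\x01"  # code value 01, shown as the Smiley Face on a PC
BUILTIN = {"lt": "<", "gt": ">", "amp": "&"}
ENTITY_REF = re.compile(r"&([A-Za-z0-9]+);")


def parse_entities(lines):
    """Read ENTITIES.xxx style lines into a name -> code value table.

    Only the first two items on a line are used; the rest is commentary.
    """
    table = {}
    for line in lines:
        items = line.split()
        if len(items) >= 2:
            table[items[1]] = int(items[0], 16)
    return table


def decode_entities(text, table, codepage="cp850"):
    """Replace known Entities with characters, mark unknown ones."""
    def replace(match):
        name = match.group(1)
        if name in BUILTIN:
            return BUILTIN[name]
        if name in table:
            # Look the code value up in the Code Page to get the character.
            return bytes([table[name]]).decode(codepage)
        # Unknown name: keep the Entity form but swap "&" for the Smiley.
        return SMILEY + name + ";"
    return ENTITY_REF.sub(replace, text)


def restore_entities(text):
    """Turn the Smiley Face introducer back into "&" for output."""
    return text.replace(SMILEY, "&")


table = parse_entities([
    "b5 Aacute Á Capital A, acute accent",
    "b7 Agrave À Capital A, grave accent",
])
edited = decode_entities("&Aacute;vila &lt;1995&gt; &Zcaron;", table)
print(repr(edited))
print(restore_entities(edited))
```

In this example `&Aacute;` becomes a single displayable character, while `&Zcaron;` (not in the sample table) is held with the Smiley Face introducer and comes back out as `&Zcaron;`. The sketch restores only the Smiley Face; it does not re-escape "<", ">", or "&" on output.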
Thus anything that displays as a Smiley Face Entity in the Text Edit window will become a regular HTML Entity in the final file. It will not be translated to anything else.

Any text in a Paragraph or Header that contains extended code values (above hex 80) will be checked against the table built from the ENTITIES.xxx file. If a match is found, the character is replaced with an Entity reference to the character name.

Any extended code values that do not match Entity names will be translated by the CHAROUT.xxx table should it exist. These characters will remain as single byte codes. However, if they are proper Latin 1 characters then they should be assigned their 8859-1 values and should display properly with a browser.

If there is no CHAROUT table, any character in SpHyDir memory will be written to the HTML file without translation. If this happens to be a valid 8859-1 character, then it will display on most browsers.

Although 850 is the recommended International Code Page, many users may prefer other OS/2 Code Pages tailored to a particular country. It is trivial to generate another ENTITIES.xxx table file. Generating the CHARIN and CHAROUT tables is a bit more difficult, but the existing tables were generated with a C program and, given a bit more time, it may be possible for SpHyDir to provide these tables for other defined numbers:

852 Latin 2 (Czechoslovakia, Hungary, Poland)
857 Turkish
860 Portuguese
861 Iceland
863 Canada (French-speaking)
865 Nordic

This will remain an exercise unless some real user out on the Web reports that they use one of these Code Pages and would like them to be supported.

It is not clear if more is needed to support the right-to-left characters. For that matter, it is not clear if there is any Web Support for:

862 Hebrew-speaking
864 Arabic-speaking

Again, input from users would be helpful. I have not studied the scope of the National Use Code Pages. They may not include all of the Latin 1 characters.
Lacking official Entity names, and any usable Web standards, and any support from Browsers such as Netscape, it seems premature for PCLT to try to solve this problem all by itself. This is, however, an area where Entity notation has a substantial advantage over CHARIN/CHAROUT single character translation. A user with an Icelandic keyboard can still generate the occasional Turkish character as a named Entity even if that character cannot be natively displayed in the Text Edit Window. This is the reasoning behind the SpHyDir bias to generate output as Entity notation instead of as single byte 8859-1 encoding. If this isn't exactly what you want, please E-mail Howard.Gilbert@yale.edu with additional suggestions.
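The output preference described in this section (Entity notation first, then CHAROUT translation, then the untranslated value) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not SpHyDir's own routine: the function name is invented, the text is taken as bytes already in the current Code Page, and the two tables are tiny hypothetical stand-ins for ENTITIES.850 and CHAROUT.850.

```python
def encode_extended(data, entities, charout=None):
    """Write Code Page bytes as HTML bytes, preferring Entity notation.

    entities: Entity name -> code value in the current Code Page
    charout:  optional Code Page value -> ISO 8859-1 value table
    """
    names = {code: name for name, code in entities.items()}
    out = bytearray()
    for value in data:
        if value < 0x80:
            out.append(value)  # plain ASCII passes through
        elif value in names:
            out += b"&" + names[value].encode("ascii") + b";"
        elif charout is not None:
            out.append(charout.get(value, value))
        else:
            out.append(value)  # no CHAROUT table: written untranslated
    return bytes(out)


entities = {"Aacute": 0xB5}   # one line of a hypothetical ENTITIES table
charout = {0x9B: 0xF8}        # o-slash: 9B in Code Page 850, F8 in 8859-1
text = bytes([0x41, 0xB5, 0x9B])

print(encode_extended(text, entities, charout))
print(encode_extended(text, entities))
```

With the CHAROUT table the unmatched character is moved to its 8859-1 value; without it the original Code Page value is written unchanged. A complete ENTITIES table would name that character as well, so in practice most extended characters leave as Entity references.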