19 Jun 1995 - Preliminary Information

The Subdocument Tree

Most HTML editor tools operate on a single text file. However, good practice holds that hypertext documents should be divided into a large number of small files. Managing all these files and maintaining a consistent overall structure then becomes a serious problem.

The Library

PC Lube and Tune has developed into a library structure that seems generally applicable. Because no one application can assume that it owns the entire server, the files fall under a common starting directory. During development, this is x:\PCLT on the author's machine. In distribution, the same structure becomes http://pclt.cis.yale.edu/pclt/ on the server.

SpHyDir gets the local library name from the HTMLLIB environment variable. In this case, "SET HTMLLIB=F:\PCLT" is put in CONFIG.SYS. All of the HTML and GIF files that SpHyDir processes must reside on this disk under this directory. SpHyDir then translates between the native OS/2 file naming conventions (with "\") and the more general file naming conventions used in most hypertext links (with "/"). In principle, it should be possible to move the entire structure from OS/2 to a Unix server.
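The translation between the two naming conventions can be sketched in a few lines of Python. This is only an illustration, not SpHyDir's own code: the function name local_to_url is hypothetical, and the default library directory and server address are simply the example values quoted above.

```python
import ntpath
import os

# Example values taken from the text; any real installation would differ.
DEFAULT_LIB = r"F:\PCLT"
DEFAULT_BASE_URL = "http://pclt.cis.yale.edu/pclt/"


def local_to_url(local_path, lib_root=None, base_url=DEFAULT_BASE_URL):
    """Map a file under the library directory to its address on the server.

    Hypothetical helper: SpHyDir's real logic is not shown in the source.
    """
    if lib_root is None:
        lib_root = os.environ.get("HTMLLIB", DEFAULT_LIB)
    # ntpath applies backslash-style path rules on any host platform.
    root = ntpath.normcase(ntpath.normpath(lib_root))
    full = ntpath.normcase(ntpath.normpath(local_path))
    if not full.startswith(root + "\\"):
        raise ValueError(f"{local_path!r} is not under the library {lib_root!r}")
    # Keep the original spelling of the relative part, then switch "\" to "/".
    relative = ntpath.normpath(local_path)[len(root) + 1:]
    return base_url + relative.replace("\\", "/")
```

Under these assumptions, a file such as F:\PCLT\SPHYDIR\SUBDOC.HTM (an invented name) would map to http://pclt.cis.yale.edu/pclt/SPHYDIR/SUBDOC.HTM, and a file outside the library directory is rejected.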

Although it is possible to dump all the files in one directory, the library becomes more manageable if each major subject has its own directory. Any large group of related files can be collected in the same subdirectory.

Chapter and Verse

A collection of random short documents can be gathered in some free-form association, and no structure would be needed for such a grouping. However, most collections of hypertext files actually started as a larger paper document. The material was broken into smaller files because it is best if each file on the Web is only a few screens long. The original structure of chapters, sections, and subsections is still logically present.

To accommodate this, SpHyDir supports the concept of a Subdocument. A Subdocument is a special kind of "paragraph" object in a file. Any word in an ordinary paragraph or point can be a hypertext link to another file. However, such links do not establish a relationship between the file containing the link and the file to which the link points.

A Subdocument link, however, claims that the other file is logically a part of the file that references it. When one file claims another as a Subdocument, then the first file is said to be the "parent" of the claimed file. A thousand different files can have ordinary hypertext links to the same Web page, but only one file can claim to be its parent. (This is a restriction that the user should obey. SpHyDir is not currently in a position to enforce it.)

Just as each library generally has a "front door" or "home page", so any collection of subdocuments has a starting point. The "root" document is the one member of the group that has no parent. It points to subdocuments, and they in turn can point to other subdocuments.
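The parent and root relationships can be modelled with a pair of small functions. The sketch below is hypothetical and is not part of SpHyDir. In particular, it checks the one-parent rule, which the text says SpHyDir itself does not enforce. The names build_parent_map and find_root, and the file names in the usage note, are invented for illustration.

```python
def build_parent_map(claims):
    """Invert a {parent: [subdocuments in order]} mapping into {child: parent}.

    Raises ValueError if two different files claim the same subdocument.
    """
    parent_of = {}
    for parent, children in claims.items():
        for child in children:
            if child in parent_of and parent_of[child] != parent:
                raise ValueError(
                    f"{child} is claimed by both {parent_of[child]} and {parent}"
                )
            parent_of[child] = parent
    return parent_of


def find_root(name, parent_of):
    """Follow parent links upward until a file with no parent is reached."""
    seen = {name}
    while name in parent_of:
        name = parent_of[name]
        if name in seen:
            # A file that is its own ancestor would loop forever.
            raise ValueError(f"parent links form a cycle at {name}")
        seen.add(name)
    return name
```

For example, if book.htm claims ch1.htm and ch2.htm, and ch1.htm claims sec1a.htm, then find_root("sec1a.htm", ...) walks up through ch1.htm and stops at book.htm.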

Objects and Attributes

Physically, a Subdocument Object produces a paragraph whose only content is the TITLE of the Subdocument. This TITLE is a hypertext link to the Subdocument. In addition, however, the Subdocument object has a structural effect upon the parent, the named document, and other subdocuments that are also claimed by the same parent.

Subdocuments are normally a series of chapters or sections. If the text were printed out, they would be printed and read in order. The order in which the Subdocument objects appear in the parent produces a Next/Previous relationship between the subdocuments themselves. HTML 2.0 doesn't have a formal method of expressing this relationship. HTML 3.0 will have syntax for Next and Previous links. Until this becomes widely available, SpHyDir manages the relationships itself.
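Deriving Next and Previous from the order of the Subdocument objects amounts to looking at the neighbors in a list. A minimal sketch, assuming the parent's subdocuments are held as an ordered Python list (the function name neighbors and the file names are hypothetical):

```python
def neighbors(child, siblings):
    """Return (previous, next) for `child` within the parent's ordered list.

    A missing neighbor is reported as None: the first subdocument has no
    Previous and the last has no Next.
    """
    position = siblings.index(child)  # raises ValueError if child is not listed
    previous = siblings[position - 1] if position > 0 else None
    following = siblings[position + 1] if position + 1 < len(siblings) else None
    return previous, following
```

With the list ["ch1.htm", "ch2.htm", "ch3.htm"], the middle file has both neighbors, while the first and last each have only one.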

In OS/2 a file can have Extended Attributes. The normal attributes are things like Date and Size. Extended Attributes are maintained by the application that creates the file. SpHyDir creates Extended Attributes for the HTML files to manage the larger logical document structure within the subdocument tree.

One EA provides quick external access to the document TITLE without having to read through the HTML. Another lists all the Subdocuments that the current document claims. Another lists the parent, if any, of the current document. Another lists all the Header text and levels of all the Sections contained within the document.

To create a Subdocument link, first drag the "Book" tool (the first one in the Toolbar) and drop it anywhere a paragraph or list point can go. The definition is completed by dragging the Workplace icon of another HTML file from the library and dropping it on the newly created object. If the dragged file was previously generated by SpHyDir, then when it is dropped on the Subdocument object, SpHyDir will extract its TITLE (from the EA) and display it as the caption of the object. This title will appear in the final page on a line by itself, hypertext linked to the referenced file.

When HTML is generated for the current file, the list of Subdocument objects in the order that they appear will be stored as an Extended Attribute of the current file, and an Extended Attribute will be created on each of the referenced files pointing back to the current file as the parent.
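The bookkeeping described above can be pictured with an in-memory stand-in for the Extended Attributes. This is a sketch only: real EAs are read and written through OS/2 itself, and the field names below (title, subdocuments, parent, toc) are assumptions chosen to mirror the four attributes listed earlier, not SpHyDir's actual EA names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class DocAttributes:
    """Hypothetical model of the attributes kept for one HTML file."""
    title: str = ""
    subdocuments: List[str] = field(default_factory=list)      # claimed files, in order
    parent: Optional[str] = None                                # file that claims this one
    toc: List[Tuple[int, str]] = field(default_factory=list)   # (header level, header text)


def record_generation(store: Dict[str, DocAttributes],
                      current: str, claimed: List[str]) -> None:
    """Mimic the updates made when HTML is generated for `current`."""
    attrs = store.setdefault(current, DocAttributes())
    # The ordered list of Subdocument objects is stored on the current file.
    attrs.subdocuments = list(claimed)
    # Each referenced file gets a pointer back to the current file as parent.
    for child in claimed:
        store.setdefault(child, DocAttributes()).parent = current
```

Calling record_generation(store, "book.htm", ["ch1.htm", "ch2.htm"]) on an empty store leaves book.htm listing both chapters and each chapter naming book.htm as its parent.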

Subdocument objects are not a formal construct of HTML 2.0, but there is some fully documented syntax that comes very close. When the Subdocument object is converted to HTML, it is generated in one of two forms (a paragraph or a list item):
<P><A HREF="xxx.htm" REL="Subdocument"> ...title...</A></P>
or
<LI><A HREF="xxx.htm" REL="Subdocument"> ...title...</A>
If SpHyDir processes an existing HTML document that contains a link with the REL="Subdocument" attribute, it will try to convert it back to a Subdocument object.
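Recognizing these links in existing HTML is straightforward with a standard parser. The following sketch uses Python's html.parser module to collect every anchor marked REL="Subdocument". It illustrates the idea and does not describe how SpHyDir itself reads HTML. The class name SubdocumentFinder is hypothetical.

```python
from html.parser import HTMLParser


class SubdocumentFinder(HTMLParser):
    """Collect (href, title) pairs for anchors with REL="Subdocument"."""

    def __init__(self):
        super().__init__()
        self.links = []      # results, in document order
        self._href = None    # href of the subdocument anchor being read, if any
        self._text = []      # pieces of its link text

    def handle_starttag(self, tag, attrs):
        # The parser reports tag and attribute names in lower case.
        if tag != "a":
            return
        attributes = dict(attrs)
        if (attributes.get("rel") or "").lower() == "subdocument":
            self._href = attributes.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_subdocuments(html_text):
    """Return the subdocument links found in a string of HTML."""
    finder = SubdocumentFinder()
    finder.feed(html_text)
    finder.close()
    return finder.links
```

Ordinary links without the REL attribute are ignored, so only the structural links are returned.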

Next and Previous

HEADER and TRAILER can contain variables which are replaced with current information. Variable names are enclosed in "[" and "]" characters.
[Date] is replaced by the current date.
[Doctitle] is replaced by the TITLE of the document.
[Up] is replaced by the file that claims this as a subdocument.
[Previous] and [Next] are replaced by the files that appear before and after this file in the Subdocument list of the Parent.

The [Up], [Next], and [Previous] relationships don't always exist. For example, the document at the top of the tree has no Up. The first document listed as a Subdocument has no Previous, and the last document has no Next. To accommodate this, any line in HEADER or TRAILER that references a variable with no value is deleted entirely. The idea is that you put on one line everything that depends on a relationship, and when the relationship doesn't exist the entire line is dropped.

An example HEADER might include the lines:
<P>
[<A HREF="[Up]">Up</A>]
[<A HREF="[Previous]">Previous</A>]
[<A HREF="[Next]">Next</A>]
</P>
<P><I> [Date] </I></P>

Every document gets a line containing the current date in italics. Above that line there may be 0-3 hyperlinks depending on the number of available relationships. If all three links are generated, then the line looks like:
[Up] [Previous] [Next]
with each word acting as a link.
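The substitution rule is simple enough to sketch in Python. The code below is an illustration under stated assumptions, not SpHyDir's implementation: the function name expand is hypothetical, a variable without a value is represented by None, and only the five variable names listed above are recognized.

```python
import re

# Matches only the documented variables, so the literal "[" and "]" that
# surround each link in the example HEADER are left alone.
VARIABLE = re.compile(r"\[(Date|Doctitle|Up|Previous|Next)\]")


def expand(template, values):
    """Substitute variables line by line, dropping lines that cannot be filled."""
    kept = []
    for line in template.splitlines():
        names = VARIABLE.findall(line)
        # Any variable with no value removes the whole line.
        if any(values.get(name) is None for name in names):
            continue
        kept.append(VARIABLE.sub(lambda match: values[match.group(1)], line))
    return "\n".join(kept)
```

Applied to the example HEADER for the first subdocument of a parent, where Previous has no value, the line containing [Previous] disappears while the Up and Next lines and the date line are filled in.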

The Document Tree Window

The Window pulldown menu of the SpHyDir Workarea includes an option to display the Document Tree for whatever HTML file is currently in the Workarea.

To build this window, SpHyDir checks for the Parent of the current file, and then for the parent of the parent, until it finally reaches the Root document. It then proceeds down through the Extended Attributes of the Root and all the subdocuments and sub-subdocuments. For each file, the TOC Extended Attribute lists all of the Headers in that file.

The Document Tree window displays a complete cumulative Table of Contents for all of the files in the document tree structure. It is intended to eventually create a TOC file and simplify the creation of references from one part of the tree to a section in another file.
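The walk up to the Root and back down through the subdocuments can be sketched as follows. The three dictionaries stand in for the Parent, Subdocument, and TOC Extended Attributes. Their names, the function name cumulative_toc, and the shape of the returned entries are all assumptions made for this illustration.

```python
def cumulative_toc(current, parent_of, subdocs_of, headers_of):
    """Build a combined table of contents for the tree containing `current`.

    parent_of:  {file: parent file}
    subdocs_of: {file: [claimed subdocuments, in order]}
    headers_of: {file: [(header level, header text), ...]}
    Returns a list of (tree depth, file, header level, header text).
    """
    # Step 1: climb from the current file to the Root.
    root = current
    while parent_of.get(root) is not None:
        root = parent_of[root]

    # Step 2: descend from the Root, visiting subdocuments in their listed order.
    entries = []

    def visit(name, depth):
        for level, text in headers_of.get(name, []):
            entries.append((depth, name, level, text))
        for child in subdocs_of.get(name, []):
            visit(child, depth + 1)

    visit(root, 0)
    return entries
```

Whichever file in the tree is open in the Workarea, the result is the same list, because the traversal always starts again from the Root.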

Currently, the major feature of this window is the ability, from the File pulldown menu, to trigger SpHyDir to regenerate HTML for all of the files in the tree. This is a convenient way to clean things up if the HEADER or TRAILER files have been changed or when the logical order of files has been rearranged.


Copyright 1995 PCLT -- SpHyDir Web Document Manager -- H. Gilbert
May be distributed with SpHyDir program

This document generated by SpHyDir, another fine product of PC Lube and Tune.