A simple HTML document is illustrated in Figure 4-1.
<TITLE>The World-Wide Web</TITLE>
<H1>About The World-Wide Web</H1>
<P>The World-Wide Web is a <EM>distributed multimedia hypertext</EM> system.</P>
Figure 4-1 A Simple HTML Document.
Structural elements in the document are identified by start and end tags. For example, the <TITLE> and </TITLE> tags specify the title of the document, which is often displayed by a client. The <H1> and </H1> tags define a first-level heading. Clients will normally display headings differently from the body text: for example, a graphical client could display the heading using a larger or different font, whereas a text-based client could display it as centred text or in all capitals.
Figure 4-1 also illustrates the <EM> container. Text held in the container (which is defined by the <EM> start tag and the </EM> end tag) will be emphasised in some way. A graphical browser could render the emphasised text by displaying it in italics, whereas a browser with audio capabilities for the visually impaired could render the emphasis by a change in the tone of the voice output.
Figure 4-1 also shows the paragraph container. It is important to understand that the <P> tag is the start of a paragraph container and is no longer a paragraph separator (as many people mistakenly believe). If the </P> end tag is omitted, the next <P> tag implies it. In future versions of HTML it will be possible to specify paragraph attributes: for example <P ALIGN=CENTER>.
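The implied end tag can be illustrated with a short Python sketch. It is not how any particular browser works: the class name ParagraphCollector and the function paragraphs are hypothetical, and the sketch simply treats each new <P> start tag as closing any paragraph that is still open, using the html.parser module from the Python standard library.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collects paragraph text, treating a new <P> as an implied </P>."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._current = None  # text pieces of the open paragraph, or None

    def _close_paragraph(self):
        # Finish the open paragraph, if there is one.
        if self._current is not None:
            self.paragraphs.append("".join(self._current).strip())
            self._current = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if tag == "p":
            self._close_paragraph()  # a new <P> implies </P>
            self._current = []

    def handle_endtag(self, tag):
        if tag == "p":
            self._close_paragraph()

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def close(self):
        super().close()
        self._close_paragraph()  # end of input also ends the paragraph


def paragraphs(html):
    """Return the text of each paragraph container in the given HTML."""
    collector = ParagraphCollector()
    collector.feed(html)
    collector.close()
    return collector.paragraphs


if __name__ == "__main__":
    print(paragraphs("<P>First paragraph<P>Second paragraph</P>"))
```

In this sketch a document that omits </P> and one that includes it yield the same list of paragraphs.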
Although browsers will display the HTML document shown in Figure 4-1, for reasons of performance and upwards compatibility it is strongly recommended that HTML documents contain additional elements including the <HTML>, <HEAD> and <BODY> tags, as shown in Figure 4-2.
<HTML>
<HEAD>
<TITLE>The World-Wide Web</TITLE>
</HEAD>
<BODY>
<H1>About The World-Wide Web</H1>
<P>Information about the World-Wide Web is available <A HREF="http://info.cern.ch/hypertext/WWW/TheProject.html"> at CERN</A>.</P>
</BODY>
</HTML>
Figure 4-2 A Simple HTML Document.
The <HTML> container is used to define the extent of the HTML document. Within the HTML document there are two other containers: <HEAD> and <BODY>. The <HEAD> container provides information about the document itself. This can include the title of the document (as illustrated), copyright information, keywords and expiry dates (for use by caching software). It is important to make use of the <HEAD> container since, for example, an automatic indexing program which wishes to index the titles of HTML documents need parse only the information contained in the <HEAD> container. If the container is not present the entire document may have to be parsed, which will place unnecessary extra load on the server.
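The benefit to an indexing program can be sketched in Python. This is a minimal illustration, not the method used by any real indexer: the names HeadTitleParser and title_from_head are hypothetical, and the document is assumed to arrive in chunks. The parser stops as soon as it sees the </HEAD> end tag, so the body of the document is never read.

```python
from html.parser import HTMLParser


class _StopParsing(Exception):
    """Raised to abandon parsing once the </HEAD> end tag has been seen."""


class HeadTitleParser(HTMLParser):
    """Extracts the document title and stops at the end of <HEAD>."""

    def __init__(self):
        super().__init__()
        self.title = None
        self._in_title = False
        self._pieces = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
            self.title = "".join(self._pieces).strip()
        elif tag == "head":
            raise _StopParsing  # nothing after </HEAD> is needed

    def handle_data(self, data):
        if self._in_title:
            self._pieces.append(data)


def title_from_head(chunks):
    """Return (title, number of chunks read) for a document given in chunks."""
    parser = HeadTitleParser()
    consumed = 0
    try:
        for chunk in chunks:
            consumed += 1
            parser.feed(chunk)
    except _StopParsing:
        pass
    return parser.title, consumed


if __name__ == "__main__":
    document = [
        "<HTML><HEAD><TITLE>The World-Wide Web</TITLE></HEAD>",
        "<BODY><H1>About The World-Wide Web</H1>",
        "</BODY></HTML>",
    ]
    print(title_from_head(document))
```

If the document has no <HEAD> container the sketch reads every chunk, which is the extra work described above.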
Figure 4-2 also illustrates the use of the anchor <A> container, which is used to provide hypertext links. In the example the text "at CERN", which is contained between the <A> and </A> tags, will be highlighted in some way by the browser. Selecting this highlighted phrase will cause the client to send a request for the URL http://info.cern.ch/hypertext/WWW/TheProject.html using the HTTP protocol. The request will be sent to the server running on the system at info.cern.ch
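How a client might work out where to send the request can be sketched in Python. The names AnchorCollector and describe_links are hypothetical and this is only an illustration: it collects the HREF attribute and link text of each anchor, then uses urlsplit from the standard urllib.parse module to separate the protocol, the host name and the path.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class AnchorCollector(HTMLParser):
    """Collects (href, link text) pairs from <A HREF=...> containers."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # HREF of the anchor being read, or None
        self._text = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names are reported in lower case.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def describe_links(html):
    """Return the protocol, host and path of each hypertext link."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    described = []
    for href, text in collector.links:
        parts = urlsplit(href)
        described.append({
            "text": text,
            "protocol": parts.scheme,
            "host": parts.hostname,
            "path": parts.path,
        })
    return described


if __name__ == "__main__":
    sample = ('<P>Information about the World-Wide Web is available '
              '<A HREF="http://info.cern.ch/hypertext/WWW/TheProject.html">'
              ' at CERN</A>.</P>')
    print(describe_links(sample))
```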
Word processing tools have the advantage that they provide a consistent environment for existing users of word processors. However, they do have disadvantages. Because they are normally implemented as macros, they can be very slow, especially when used with large or complicated documents. There is also a danger that HTML markup which is embedded as hidden text could cause conflicts with other word processing tools if, for example, the word processed document were used by other users.
Figure 4-8 The ANT_HTML Macro.
Figure 4-9 Editing A Document From Arena.
Figure 4-10 A Document Converted Using LaTeX2html.
LaTeX2html is available at the URL ftp://src.doc.ic.ac.uk/packages/WWW/tools/translators/latex2html
Further information is available at the URL http://cbl.leeds.ac.uk/nikos/doc/www94/www94.html
RTFtohtml is available as a command line tool for a number of Unix platforms. An Apple Macintosh implementation is also available, and a beta version of an MS-DOS implementation was announced in November 1994.
An extension of the RTFtohtml program is known as RTFtoweb. This provides a number of additional features, including creation of hypertext links at user defined section breaks. Figure 4-11 illustrates a document on Exploring The World-Wide Web Using Mosaic For Windows which is available at the URL http://www.leeds.ac.uk/ucs/docs/tut50/tut50.html
Figure 4-11 Document Converted Using RTFtoweb.
In Figure 4-11 it should be noted that the document is automatically split into a number of files and that a hypertext table of contents is automatically generated. Chevrons (>> and <<), which can be used to move to the next or previous section, are also generated automatically.
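The general idea of splitting a document into linked files can be sketched in Python. This is not the RTFtoweb implementation: the function build_pages, the file names index.html and sectionN.html, and the page layout are all assumptions made for illustration. Given a list of section titles and bodies, it produces a table of contents page plus one page per section, each with chevron links to its neighbours.

```python
from html import escape


def build_pages(sections):
    """Build {filename: html} from a list of (title, body_html) pairs.

    File names and layout are hypothetical, chosen only for this sketch.
    """
    names = [f"section{i + 1}.html" for i in range(len(sections))]
    pages = {}

    # Hypertext table of contents linking to every section file.
    toc_items = "\n".join(
        f'<LI><A HREF="{name}">{escape(title)}</A>'
        for name, (title, _body) in zip(names, sections)
    )
    pages["index.html"] = f"<H1>Contents</H1>\n<UL>\n{toc_items}\n</UL>\n"

    for i, (title, body) in enumerate(sections):
        nav = []
        if i > 0:
            # << moves to the previous section (escaped for HTML).
            nav.append(f'<A HREF="{names[i - 1]}">&lt;&lt;</A>')
        if i < len(sections) - 1:
            # >> moves to the next section.
            nav.append(f'<A HREF="{names[i + 1]}">&gt;&gt;</A>')
        pages[names[i]] = (
            f"<H1>{escape(title)}</H1>\n{body}\n<P>{' '.join(nav)}</P>\n"
        )
    return pages


if __name__ == "__main__":
    demo = build_pages([
        ("Introduction", "<P>First section.</P>"),
        ("Using Mosaic", "<P>Second section.</P>"),
    ])
    for filename in sorted(demo):
        print(filename)
```

The first section has no << link and the last has no >> link, since there is nothing to move to in that direction.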
Further information about RTFtohtml is available at the URL ftp://ftp.cray.com/src/WWWstuff/RTF/rtftohtml_overview.html
The software is available at the URL ftp://ftp.cray.com/src/WWWstuff/RTF/latest/
In the UK it is available at the URL ftp://src.doc.ic.ac.uk/packages/WWW/tools/translators/rtftohtml
RTFtoweb is available at the URL ftp://ftp.rrzn.uni-hannover.de/pub/unix-local/misc/rtftoweb/html/rtftoweb.html
A number of tools are available for validating HTML documents. Some of the more popular ones are described below.
HoTMetaL is available for the X and Microsoft Windows platforms. Two versions of the software are available: a public domain version and a licensed version. HoTMetaL Pro, the licensed version, can be used to import and validate an existing document. The public domain version will give an error and refuse to load a document which contains invalid HTML.
HoTMetaL is available at the URL ftp://src.doc.ic.ac.uk/packages/WWW/Mosaic/html/hotmetal
Figure 4-13 HTML Validation Service.
A variation on this service is available at the URL http://www.cc.gatech.edu/grads/j/Kipp.Jones/HaLidation/validation-form.html
These services make use of the sgmls validation program.
The software can be installed on your local Unix system. It is available at the URL ftp://ftp.hal.com/pub/CGI/check-html.tar.Z
Figure 4-14 Installing The Check_HTML Script.
Support: Who wrote the software - an experienced software developer or a student as part of a computer project? Will the software continue to be developed and supported?
Quality: Does the software produce valid HTML?
Functionality: What facilities does the software provide?
Other Issues: If the software is based on a word processing package, what happens if the word processed document needs to be used by another word processor?
A review of Microsoft Windows HTML authoring tools is available at the URL http://werple.apana.org.au/~gabriel/html-editors/index.html
A list of HTML tools is available at the URL http://info.cern.ch/hypertext/WWW/Tools/Filters.html
Dan Connolly's HTML Design Notebook is available at the URL http://www.hal.com/%7Econnolly/html-design.html
The HTML specification is available at the URL http://www.hal.com/%7Econnolly/html-spec.html