HTML tools - Tips and traps

The HTML columns in March's issue of PC User magazine provoked a strong reaction from PC User reader and web author Robert Eldridge. Robert savaged a section of HTML source created by Rose Vines with WinWord 97 and urged us to make sure web page source is written "to degrade gracefully" in all web browsers.

We thought Robert made some good points in his email so we invited him to expand on it in this column. We asked him to share his thoughts on web authoring and, in particular, the pitfalls of using GUI net tools to create HTML source.

Here are our questions and Robert's replies:
Question - Helen & John
Can you give us your line by line critique of the HTML source in the "Not just a pretty face" box from page 89 of the March issue of PC User magazine? In particular, would you explain what the correct form should be.

Answer - Robert
I will deal with this in two parts, the head section and the body section. Firstly the head:

<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=
windows-1252">
<META NAME="Generator" CONTENT="Microsoft Word 97">
<TITLE>welcome</TITLE>
<META NAME="Version" CONTENT="8.0.3326">
<META NAME="Date" CONTENT="9/26/96">
<META NAME="Template" CONTENT="C:\PROGRAM FILES\MICROSOFT
OFFICE\OFFICE\HTML.DOT">
</HEAD>

I really wonder about all this <meta> content. Is it required? No. Does it provide any useful information for the page user? No. I did initially think the date might be informative but then realised it is not. Would CONTENT="10/2/96" mean the 10th of February or the 2nd of October? And in any event, is it the date the page was written or the date of the software?

What is missing and is required is the document type declaration.

I am currently researching which <!doctype> one should use on pages that do not validate to a standard. The absence of a <!doctype> now implies one at HTML level 2.0. Whilst the presence or otherwise of a <!doctype> has no apparent significant effect, the standard does require one and it could be important in future versions of browsers and other user agents.

Now to the body of the document:

<BODY TEXT="#000000" LINK="#0000ff" VLINK="#800080" BGCOLOR=
"#000000">

Whilst this follows the recommendation that if you set one colour you should set them all, it sets colours that contradict the very purpose of that recommendation by setting the background to "#000000" (black) and the text to "#000000" (black).

As the only text on the page has its colour set to dark blue by an attribute of the <font> tag, why not set the <body> text to "#000080" (dark blue)? That way browsers that don't support <font color=> at least might get to see the text in dark blue.

<FONT SIZE=2><P>&nbsp;</P>
</FONT><P>&nbsp;</P>
<P>&nbsp;</P>

Three paragraphs, each containing a non-breaking space (&nbsp;). They are used in this situation to put some "white space" at the start of the document, but why set the font size on the first 'empty' paragraph? The difference between <p><font size=2>&nbsp;</font></p> and <p>&nbsp;</p> is VERY small in terms of positioning the subsequent content.

This line also introduces some invalid code. The block level element <p> can contain a text level element like <font>, NOT the other way around.

Users of these types of constructions should also be aware that some older browsers do not support the named entity reference for a non-breaking space, &nbsp;. It is therefore slightly "safer" to use the numeric character reference alternative, &#160;. Seeing "&nbsp;" literally printed does look ugly on those browsers.

<P ALIGN="CENTER"><A HREF="main.html"><IMG SRC="IMAGES03.JPG"
BORDER=0 WIDTH=77 HEIGHT=161></A></P>

Adding the ALT= attribute (and some appropriate content) to the <IMG> element would provide some meaningful content for those browsing with 'images off' or using a text only browser. This is becoming even more important as a relatively high proportion of web users now "surf" with images off. I've seen estimates of 20 to 30%.

<B><FONT SIZE=4 COLOR="#000080"><P ALIGN="CENTER">Click to
enter</P>

Again, some more invalid code. The <P> block level element closes the text level <B> and <FONT> elements before their ending </B> and </FONT> tags.
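As a minimal sketch of one valid ordering (not the only possible fix), the paragraph can be made the outer element with the text level elements opened and closed inside it:

```html
<!-- The block level <P> is outermost; <B> and <FONT> open and close inside it. -->
<P ALIGN="CENTER"><B><FONT SIZE=4 COLOR="#000080">Click to enter</FONT></B></P>
```

Each element now closes in the reverse order to that in which it was opened, so nothing is left for the browser to "recover".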

Some also hold that "Click to..." is bad style. A person browsing with a keyboard does not "Click" and a blind person browsing using a speaking browser does not "Click". An alternative term such as "Select" comes to mind.

</FONT><FONT COLOR="#000080"><P
ALIGN="CENTER">&nbsp;</P></B></FONT></BODY>
</HTML>

This closing section of the page code is the part I was especially thinking of when I used the phrase "dog's breakfast".

Leaving aside the coding sequence errors, why set up a paragraph containing a space, add code to colour it, and add extra code to centre it when you can't see a space anyway, and all at the end of the page with no subsequent content?

In summary, the page could have been written in a valid manner, producing the same result in a wider range of browsers and with 47% fewer bytes, as follows:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD><TITLE>welcome</TITLE></HEAD>
<BODY TEXT="#000080" LINK="#0000ff" VLINK="#800080"
BGCOLOR="#000000">
<P>&#160;
<P>&#160;
<P>&#160;
<P ALIGN=CENTER><A HREF="main.html">
<IMG SRC="IMAGES03.JPG" BORDER=0 WIDTH=77 HEIGHT=161
ALT="Select to enter"></A>
<P ALIGN=CENTER><BIG><B>Select to enter</B></BIG>
</BODY>
</HTML>

   
Five "howlers" of Web design:
1 Pages that are just too large (in total byte size of text plus images/sounds).

2 Pages that require any or excessive horizontal scrolling at relatively common browser use conditions and resolutions.

3 Pages that require the user to download anything to use/view the page.

4 Pages that make excessive use of 'fancy features' - we have all seen the pages with:

Too many words blinking;
Backgrounds that make the text almost impossible to read;

Pages that take forever to load because they decided, without asking you, to offer some background music;

Weird colour schemes that have you reaching for your sunglasses;

Applets that take forever to download then only put up some banner or the like with no real meaningful content - "Look I can do this, wow" type of thing;

Fancy displays telling you the time in East Mongolia;

Excessive use of animated GIFs (especially large byte size ones).

5 Frame pages that have just got it all wrong like:

Don't have a noframes content alternative;

Don't offer any advantage over the non-framed alternative;

Require excessive scrolling in the frames or even worse have turned off scrolling preventing you from accessing all the contents including getting to a link;

Offer links to external pages but don't use target="_top" locking you into their frameset.
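The frame points above could be addressed along the following lines. This is only a sketch: the file names menu.html, content.html and noframes.html and the frame names are hypothetical, and the <!doctype> is left out because frames are not part of HTML 3.2.

```html
<!-- The frameset page. All file and frame names here are hypothetical. -->
<HTML>
<HEAD><TITLE>Example frameset</TITLE></HEAD>
<FRAMESET COLS="25%,*">
  <!-- SCROLLING is left at its default so long content can still be reached. -->
  <FRAME SRC="menu.html" NAME="menu">
  <FRAME SRC="content.html" NAME="content">
  <!-- Shown only by browsers that do not support frames. -->
  <NOFRAMES>
  <BODY>
  <P>This site uses frames. A <A HREF="noframes.html">non-frames version</A> is available.</P>
  </BODY>
  </NOFRAMES>
</FRAMESET>
</HTML>

<!-- Links inside menu.html -->
<!-- An internal link loads into the named content frame. -->
<A HREF="content.html" TARGET="content">Contents</A>
<!-- An external link replaces the whole frameset rather than loading inside it. -->
<A HREF="http://www.htmlhelp.com/" TARGET="_top">HTML help site</A>
```

With TARGET="_top" on the external link the visitor leaves the frameset cleanly instead of viewing someone else's site inside your frames.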

Question - Helen & John
What advice would you give for avoiding these errors in the future?

Answer - Robert
1 Validate the pages.

2 If you use software that 'saves in html format', have a good look at the results in a text editor and be liberal with the use of the sub-editor's red pencil to fix the code, as this type of software often creates masses of redundant code. Then validate.

3 With html authoring software, check to see if it's "broken", i.e. whether it will allow you to write invalid code. Use those programs with care. They often include an inappropriate <!doctype> by default and can also generate redundant code. Edit, then validate.

4 For elements and attributes that don't validate think about how the construction might degrade. Some old browsers are handy in this respect.

5 When hand crafting pages in an editor, try to mark up honestly and avoid the use of "kludges" to get styling results. For example, don't use <blockquote> merely to get an indent (in some browsers only). Be careful in the use of styling elements and attributes and be mindful of how they degrade in non-supporting browsers.


Question - Helen & John
What's your perspective on the pitfalls of advice such as "even though the source may not be kosher 3.2, the code works, so why bother?"

Answer - Robert
The key phrase here is "the code works". The resultant question is not "why bother?" but, in my opinion, "why does it work?"

If it works because the browser is "recovering" a result from invalid code then there is no guarantee that another browser, or another version of the same browser, will "recover" the page content in a similar or satisfactory manner. Syntactically incorrect code should not be relied upon and can't provide a proper source for a reliable outcome.

If it works but is not "kosher 3.2" because it uses elements and/or attributes that are proprietary then the author has to be aware of the consequences.

Whilst the use of a <blink> element, for example, generally is of little effect on browsers that don't support <blink>, other constructions can have serious effects. For example, if an author used white text (more likely if set in the <BODY>) to contrast with a black background in a <table> that was on a white page, a browser not supporting the bgcolor= attribute in the relevant <table>, <tr>, <th> or <td> tag might display white text on a white background. Very informative for the viewer using that browser!
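The table case can be illustrated with a short fragment. The colours and the cell text are assumptions chosen for illustration, and the second table is only one way of making the construction degrade more gently.

```html
<!-- Illustrative fragment; all colours and the cell text are assumptions. -->
<BODY TEXT="#000000" BGCOLOR="#ffffff" LINK="#0000ff" VLINK="#800080">

<!-- Risky: if BGCOLOR on <TD> is ignored, this is white text on the white page. -->
<TABLE>
<TR><TD BGCOLOR="#000000"><FONT COLOR="#ffffff">Special offer</FONT></TD></TR>
</TABLE>

<!-- More forgiving: dark text on a pale cell still reads on the white page
     when the cell background is not drawn. -->
<TABLE>
<TR><TD BGCOLOR="#ffffcc"><FONT COLOR="#000080">Special offer</FONT></TD></TR>
</TABLE>

</BODY>
```

The test to apply is simply to imagine the page with the unsupported attribute removed and ask whether the content can still be read.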


Question - Helen & John
How should a validator be configured so that users are not intimidated by getting hundreds of "errors" that don't seem like errors?

Answer - Robert
Other than setting the DTD that you are validating against you do not configure a validator. Validators provide a method for validating code against a DTD standard. A validator that allows you to "configure" your own standard is not, by definition, a validator but a checker.

You can get hundreds of errors for a number of reasons. The common ones are:

You haven't included an appropriate <!doctype> and the validator therefore defaults to validating against an HTML 2.0 DTD (which does not like tables etc).

Your authoring software inserted <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> or the like and you have used a lot of non-HTML 2.0 elements and/or attributes, but the validator uses the doctype to set the validation standard.

You have used a lot of proprietary tags and/or attributes, javascript 'attributes' such as OnMouseOver, and other such 'enhancements'.

You have made a lot of mistakes. These often are of the type where the tags overlap, for example <b><i>text</b></i>, or are incorrectly nested, for example <strong><p>text</p></strong>.
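For the last group of errors, the two examples just given could be corrected as sketched below, with each invalid form kept in a comment for comparison:

```html
<!-- Invalid (overlapping tags): <b><i>text</b></i> -->
<b><i>text</i></b>

<!-- Invalid (block level element inside a text level element): <strong><p>text</p></strong> -->
<p><strong>text</strong></p>
```

In both corrections the element opened last is closed first, and the block level <p> contains the text level <strong> rather than the other way around.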

   
Tips/tools for novices:
1 Proceed slowly, keep it simple whilst you learn and always focus on well organised content.

2 Have a basic understanding of html. There is a lot of good on line information available. Try http://www.htmlhelp.com/ for a start.

3 Don't use large (in terms of bytes) images or sounds.

4 Be considerate of other users with different setups to your own and learn how to also cater for their needs.

5 Validate your pages.

The <NOFRAMES>...</NOFRAMES> tags allow you to display text on those browsers that do not support frames. As well as text you can include a link to a non-frames version of your page. These tags must appear inside the <FRAMESET>...</FRAMESET> tags and are ignored by frames-aware browsers. We've used this option in the top.htm document.

Question - Helen & John
Given your experience in HTML newsgroups - what do you regard as the five main howlers of HTML design and what tips would you give to novices generally?

Answer - Robert
The most important thing to remember is that HTML stands for Hyper Text Markup Language. I take it that by "HTML design" you meant Web design.

HTML provides the method of 'marking' the content. This is the page title. This is a primary heading. This is a paragraph. This is a list. I emphasise this term. This is a quote. I cite this reference. This is a division in the contents. Anchor this content to another page. Elements (tags) are used to code the marking to the content.
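As a rough sketch, the 'markings' listed above map onto elements along these lines. The file name other.html is hypothetical and the choice of element for each item is one reasonable reading, not the only one.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD><TITLE>This is the page title</TITLE></HEAD>
<BODY>
<H1>This is a primary heading</H1>
<P>This is a paragraph. I <EM>emphasise</EM> this term.</P>
<UL>
<LI>This is a list item</LI>
<LI>This is another list item</LI>
</UL>
<BLOCKQUOTE><P>This is a quote.</P></BLOCKQUOTE>
<P>I cite <CITE>this reference</CITE>.</P>
<DIV>
<P>This is a division in the contents.</P>
</DIV>
<!-- other.html is a hypothetical file name -->
<P><A HREF="other.html">Anchor this content to another page</A></P>
</BODY>
</HTML>
```

Nothing in this page says how any of it should look; it only says what each piece of content is.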

The "user agent", typically a browser such as Netscape Navigator, Opera, Internet Explorer, Lynx and the like (but it could be a blind person's speaking browser) uses the tags to "markup" the content. How the browser chooses to present the marked up content is basically up to the browser. This means that variations on how the content is presented must be expected.

The author can additionally use other tags and attributes to suggest a style to the marked up content. Make this text bold. Draw a horizontal rule. Colour this text red. These tags must also be considered as suggestions only. Relying on this text being blue and that text red 'falls apart' when the viewer is using a monochrome display for example.
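A small sketch of the difference follows. The wording of the warning is invented for illustration.

```html
<!-- Style suggestions: a browser may or may not honour them. -->
<P><B>Make this text bold.</B></P>
<HR>
<P><FONT COLOR="#ff0000">Colour this text red.</FONT></P>

<!-- Fragile: the meaning is carried by colour alone. -->
<P><FONT COLOR="#ff0000">Do not switch off the power.</FONT></P>

<!-- Sturdier: the word "Warning" and the <STRONG> markup carry the meaning;
     the colour is only an extra hint. -->
<P><STRONG><FONT COLOR="#ff0000">Warning:</FONT></STRONG> do not switch off the power.</P>
```

On a monochrome display, or in a browser that ignores <FONT>, the last paragraph still tells the reader it is a warning.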

It also follows that the author has no control over what browser the user chooses or how they use it. "Designing" a layout that looks "good" in your browser with a full screen window at 640x480 may mean that it looks "ordinary" at 800x600 and is almost impossible to follow on a Mac or Web TV display with a maximum width of 500 to 600 (or whatever) pixels - let alone for the user with some small mono screen on their combined mobile phone cum web browser.

This is why HTML is described as a "portable" language. It isn't, and should not be, tied to a particular operating system, browser or user configuration. This is why it is used on the "World Wide Web". As soon as you impose particular settings that must be relied upon, the "World Wide" part is lost.
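The point about window sizes can be sketched with a table used for layout. The 640 pixel figure is taken from the discussion above; the cell contents are placeholders.

```html
<!-- Fixed width: any window narrower than 640 pixels must scroll sideways. -->
<TABLE WIDTH=640>
<TR><TD>Menu</TD><TD>Content</TD></TR>
</TABLE>

<!-- Relative width: the table is sized to whatever window the user has chosen. -->
<TABLE WIDTH="100%">
<TR><TD>Menu</TD><TD>Content</TD></TR>
</TABLE>
```

The second form leaves the decision about width with the user's browser, which is where a portable language expects it to be made.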

   


All text © 1997 Australian Consolidated Press - PC User Magazine