
2. HTML Specification

2.4 HTML and MIME

The World Wide Web initiative (WWW) links information throughout the world. To do this, WWW uses the Internet Hypertext Transfer Protocol (HTTP), which allows transfer representations to be negotiated between client and server. Results are returned in a MIME body part.

HTML is one of the representations used by WWW and is proposed as a MIME content type. The HTML content type is text/html, and it takes three optional parameters:

Level
The level parameter specifies the feature set used in the document. The level is an integer; a document at a given level may use any feature of that level or of a lower one. Levels are defined by this specification.

Version
To help avoid future compatibility problems, the version parameter may be used to give the version number of the specification to which the document conforms. The version number appears at the front of this document and within the public identifier for the SGML DTD.

Character sets
The charset parameter is reserved for future use. See Section 2.16 for a discussion of character sets and encodings in HTML.
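
As an illustration of the parameters described above, the following minimal sketch splits a Content-Type value such as "text/html; level=2; version=2.0" into its media type and parameters. The function names parse_content_type and html_level are hypothetical, and the parsing is deliberately naive: it assumes no parameter value contains a semicolon.

```python
def parse_content_type(value):
    """Split a Content-Type value into (media_type, params).

    Naive sketch: assumes no quoted parameter value contains ';'.
    """
    media_type, *raw_params = [part.strip() for part in value.split(";")]
    params = {}
    for item in raw_params:
        if not item:
            continue  # tolerate a trailing semicolon
        name, _, val = item.partition("=")
        # MIME parameter names are case-insensitive, so normalise them.
        params[name.strip().lower()] = val.strip().strip('"')
    return media_type.lower(), params


def html_level(value):
    """Return the integer level of a text/html value, or None if absent."""
    media_type, params = parse_content_type(value)
    if media_type != "text/html" or "level" not in params:
        return None
    return int(params["level"])


media_type, params = parse_content_type("text/html; level=2; version=2.0")
print(media_type, params)
print(html_level("text/html; level=2; version=2.0"))
```

Because all three parameters are optional, html_level returns None rather than guessing a default when no level is given.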

The actual character set used in the representation of an HTML document may be ISO 8859/1 or its 7-bit subset, ISO 646. An HTML document is not obliged to contain any characters above decimal 127. A transport medium such as electronic mail may constrain the number of bits available to represent a document, though the HTTP access protocol used by WWW always allows 8-bit transfer.

When an HTML document is encoded using 7-bit characters, numeric character references (see Section 2.16.2) and character entity references (see Section 2.16.3) may be used to encode characters in the upper half of the ISO 8859/1 Latin-1 set. Documents prepared in this way are suitable for mailing through systems limited to 7 bits.
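
The 7-bit encoding described above can be sketched as follows. The function name to_seven_bit is hypothetical, and the sketch makes two assumptions: it uses only numeric character references (not entity references), and it leaves markup characters such as "&" and "<" untouched.

```python
def to_seven_bit(text):
    """Replace Latin-1 characters above 127 with numeric character references."""
    pieces = []
    for ch in text:
        code = ord(ch)
        if code > 255:
            # Outside ISO 8859/1, so not covered by this sketch.
            raise ValueError("character %r is not in ISO 8859/1" % ch)
        # Characters 0..127 pass through; 128..255 become &#NNN; references.
        pieces.append(ch if code < 128 else "&#%d;" % code)
    return "".join(pieces)


print(to_seven_bit("caf\u00e9"))  # prints caf&#233;
```

Python's built-in text.encode("ascii", "xmlcharrefreplace") produces the same decimal references, although it does not reject characters outside Latin-1.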

NOTE: ISO 646 is, for all intents and purposes, equivalent to the ANSI standard for ASCII (American Standard Code for Information Interchange). The only notable differences between the two standards are the names assigned to the control characters that occupy positions 0 through 31 and position 127 (decimal) in that encoding. For encoding HTML documents, only three control characters in ISO 646 or ASCII are relevant (see Section 2.16.2). These are Carriage Return (CR) at position 13, Line Feed (LF) at position 10, and Horizontal Tab (HT) at position 9.
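
A small sketch of how a tool might check a document against the note above: it reports any control character other than CR, LF, and HT. The name unexpected_controls is hypothetical, and treating every other control character as unexpected is a choice made for this sketch, not a rule stated by the specification.

```python
RELEVANT_CONTROLS = {"\r", "\n", "\t"}  # CR, LF, HT


def unexpected_controls(text):
    """Return sorted code positions of control characters other than CR, LF, HT."""
    found = set()
    for ch in text:
        code = ord(ch)
        # Control characters occupy positions 0..31 and 127.
        is_control = code < 32 or code == 127
        if is_control and ch not in RELEVANT_CONTROLS:
            found.add(code)
    return sorted(found)


print(unexpected_controls("line one\r\n\tindented"))  # prints []
print(unexpected_controls("bad\x0bchar"))             # prints [11]
```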


HTML 2.0 Specification (Internet Draft) - 29 NOV 94