MIME Encapsulation of Aggregate HTML Documents WG (mhtml)

Thursday, August 14 at 0900-1130
================================

Chair: Einar Stefferud

AGENDA:

The MHTML WG will review Implementation Experience, and will review
and revise the Informational Draft which will document implementation
and deployment issues in an RFC for the internet community. The WG
will also consider possible protocol actions that may be required to
deal with errors uncovered by implementation efforts.

MHTML draft agenda for the IETF meeting in Munich 1997, Version 3
-----------------------------------------------------------------

This issue list is also available in neater format at URL:
http://www.dsv.su.se/~jpalme/ietf/mhtml-issue-list-0797v3.html

More information is also available at URL:
http://www.dsv.su.se/~jpalme/ietf/jp-ietf-home.html

***** Documents to be discussed during the meeting

2112 PS E. Levinson, "The MIME Multipart/Related Content-type",
03/12/1997. (Pages=9) (Format=.txt) (Obsoletes RFC1872)

2111 PS E. Levinson, "Content-ID and Message-ID Uniform Resource
Locators", 03/12/1997. (Pages=5) (Format=.txt)

2110 PS J. Palme, A. Hopman, "MIME E-mail Encapsulation of Aggregate
Documents, such as HTML (MHTML)", 03/12/1997. (Pages=19) (Format=.txt)

draft-ietf-mhtml-rev-01.txt
J. Palme, A. Hopman, "MIME E-mail Encapsulation of Aggregate
Documents, such as HTML (MHTML)"
ftp://ftp.dsv.su.se/users/jpalme/draft-ietf-mhtml-spec-07.txt

The same document is available in Microsoft Word format at URL:
ftp://ftp.dsv.su.se/users/jpalme/draft-ietf-mhtml-spec-07.doc

And difference from RFC 2110 in Microsoft Word 6 format at URL:
ftp://ftp.dsv.su.se/users/jpalme/draft-ietf-mhtml-spec-07dif.doc

Note that the official IETF file name of the new draft is
"draft-ietf-mhtml-rev-01.txt" and not "draft-ietf-mhtml-spec-07.txt".

draft-ietf-mhtml-info-06.txt
J. Palme, Sending HTML in E-mail, an informational supplement to
RFC 2110: MIME E-mail Encapsulation of Aggregate HTML Documents
(MHTML), version 06.
It can be retrieved by anonymous FTP from URL:
ftp://ftp.dsv.su.se/users/jpalme/draft-ietf-mhtml-info-06.txt

***** Issue 1: Exact matches in section 8.2

5.1 By exact matches, do we mean case-sensitive matches with no
resolution such as "file%20name" to "file name"?

Note: This should not be any problem if standards are adhered to,
since spaces are not legal in URLs. However, it is accepted practice
for Web browsers to accept many kinds of illegal URLs, and the two
most widely used products both accept spaces in URLs in hyperlinks in
HTML documents. How should such a URL be handled in the
Content-Location statement? Should the space be converted to %20 (then
the words about exact matching in mhtml-spec chapter 8.2.2 must be
changed) or should it be put in illegal format in the Content-Location
header, too?

The MHTML proposed standard (RFC 2110) at present says that URL-s in
e-mail headers are to be encoded using the encoding method of RFC
2017, and RFC 2017 refers to RFC 1738, which specifies that illegal
characters in URLs are to be encoded using the % method, for example a
space is encoded as %20.

Ed Levinson has proposed that the encoding method of RFC 2047 should
be used instead in the special case where RFC 1738 encoding would make
it impossible to make the exact match required by RFC 2110. The
advantage of this is that when the RFC 2047 encoding is reversed, we
get back the same string, and can do the exact match. If RFC 2017/RFC
1738 encoding is used, reversal may reverse too much, so that the
exact match will not work.

5.2 Does this apply only to relative Content-Locations without any
Content-Base? Should we say something about exactness of matchings
when URL-s are resolved using a Content-Base? If so, what?

5.3 What about the case where the URL is relative and unresolvable in
the header, but absolute in the HTML text? The present spec does not
say what should be done in that case.
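The difference raised in 5.1 between an exact match and a match after
%-decoding can be sketched in Python. This is only an illustration
and is not taken from any of the drafts: the function names
exact_match and match_after_decoding are hypothetical, and
urllib.parse.quote is used here as a stand-in for the RFC 1738 %
encoding.

```python
from urllib.parse import quote, unquote


def exact_match(html_url: str, content_location: str) -> bool:
    """Character-for-character comparison: case sensitive, no %-decoding."""
    return html_url == content_location


def match_after_decoding(html_url: str, content_location: str) -> bool:
    """Looser comparison (an assumption, not in RFC 2110): undo
    %-encoding on both sides before comparing."""
    return unquote(html_url) == unquote(content_location)


# Illegal URL as it might be written in the HTML text.
html_url = "file name.gif"
# The same URL %-encoded for use in a Content-Location header.
header_value = quote(html_url)

print(header_value)
print(exact_match(html_url, header_value))
print(match_after_decoding(html_url, header_value))

# "Reversal may reverse too much": after decoding, an HTML text that
# really contained "file%20name.gif" can no longer be told apart from
# one that contained "file name.gif".
print(unquote("file%20name.gif") == unquote("file name.gif"))
```

With an exact match the encoded header value fails to match the URL in
the HTML text, while the looser comparison matches but loses the
distinction between the two spellings.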
***** Here is an example which explains some of the choices:

Assume you have an HTML document which contains the following element:

   <IMG SRC="file name.gif">

and the owner of this HTML document requests that it is sent by
e-mail. What should the e-mail look like in this case?

(a) Content-Type: Text/HTML

    <IMG SRC="file name.gif">

    Content-Type: Image/GIF
    Content-Location: "file name.gif"

(b) Content-Type: Text/HTML

    <IMG SRC="file%20name.gif">

    Content-Type: Image/GIF
    Content-Location: "file%20name.gif"

(c) Content-Type: Text/HTML

    <IMG SRC="file name.gif">

    Content-Type: Image/GIF
    Content-Location: "file%20name.gif"

(a) is not in agreement with RFC 2017, which RFC 2110 refers to, so if
we choose (a), RFC 2110 or RFC 2017 must be changed.

(b) means you have to edit the HTML text before sending it, which is
not so nice, since you are then opening a big can of worms: Which
faults in the HTML should you correct before sending it via e-mail?

(c) requires change in the text about "exact match" in RFC 2110.

***** Issue 2: Precedence of Content-Base and Content-Location in section 5

If there is both a Content-Base and a Content-Location header, which
of them should take precedence in resolving URL-s in the HTML content?

***** Issue 3: Use of Content-Base and Content-Location for information

Should the Content-Base and Content-Location be allowed in cases where
they do not influence functionality, as a way of informing the reader
that a body part was taken from a certain web location?

***** Issue 4: Allow Content-Base, Content-Location outside Multipart/related?

Any reason to remove this passage in RFC 2110 section 4.1:

   These two headers may occur both inside and outside of a
   multipart/related part.

JP comment: The statement is true. The specific usage of Content-Base
and Content-Location described in RFC 2110 SHOULD only occur inside
Multipart/related, but these two headers can also occur as information
to the reader that the body part is also available at a certain URL.
And since Text/html can occur outside of Multipart/related
(Multipart/related is only needed when the Text/html contains links to
other body parts in the same message), Content-Base and
Content-Location can also occur outside of Multipart/related, and in
my opinion this text should not be removed. Possibly we could change
the paragraph to the following:

   These two headers may occur both inside and outside of a
   multipart/related part, but their usage for handling HTML links
   between body parts in a message SHOULD only occur inside
   Multipart/related.

***** Issue 5: Allow same Content-Location on two body parts in section 7

Should we allow the same Content-Location on two body parts, if they
resolve to different URLs (last paragraph of section 7 in mhtml-spec)?

Suggestion: Yes.

***** Issue 6: Content-Base in one part, not in another in section 8.2

Suppose there are two body parts in a multipart/related. One of them
has a Content-Base statement, the other does not. Example:

Part 1: Content-Type: Text/html
        Content-Base: http://foo.net

Part 2: Content-Type: Image/gif
        Content-Location: picture.gif

In this case, should relative-to-absolute conversion take place on
"picture.gif" in Part 1, so that it will not match the relative URL in
Part 2?

***** Issue 7: Robustness Principle in general

Should the standard include the new chapter 13. Robustness Principle
as suggested in draft-ietf-mhtml-spec-07, or should this chapter be
put into the informational draft draft-ietf-mhtml-info, or not be
published at all?

Note: In the present work of the IETF DRUMS working group, this kind
of information is included, under the title "4. Obsolete Syntax", in
the standard-to-be draft-ietf-drums-msg-fmt.

***** Issue 8: Robustness Principles, one by one

Every single subchapter in chapter 13. Robustness Principle is
controversial, and we should decide for or against having it (this
applies whether this chapter goes into the standard or the
informational document).
***** Issue 8.1: Content of the type parameter (section 13.1)

Should liberal implementations accept input where the type parameter
is wrong or omitted?

***** Issue 8.2: Quoting of the type parameter (section 13.2)

Should liberal implementations accept input where the type parameter
is not quoted?

***** Issue 8.3: Quoting of the start parameter (section 13.3)

Should liberal implementations accept input where the start parameter
is not quoted with angle brackets?

***** Issue 8.4: Content-Base/Location in multipart headings (section 13.4)

Should liberal implementations accept and try to use, if necessary,
Content-Base and Content-Location headers in multipart headings?

***** Issue 9: Allow Content-Base, Content-Location to be valid for object parts?

Any reason to change this passage in RFC 2110 section 4.1:

   These two headers are valid only for exactly the content heading or
   message heading where they occurs and its text. They are thus not
   valid for the parts inside multipart headings, and are thus
   meaningless in multipart headings.

***** Issue 10: Examples in chapter 9

Can some of the implementors, who have executable code which can check
examples, provide better examples? By better examples I mean examples
which are both correct and clarify the controversial points.

***** Issue 11: Revised proposed standard or draft standard

Are we aiming at revising RFC 2110 into a revised proposed standard or
into a draft standard?

***** Issue 12: Publishing of the info document

Is it time now to publish draft-ietf-mhtml-info-06.txt as an
informational RFC?

***** Issue 13: Charter and status of the working group

Is there any need for a discussion about the charter of the working
group, and about whether the working group should be designated as
"active" or "inactive"?
***** Issue 14: Value of type parameter to multipart/related

The present MHTML standard (RFC 2110 and RFC 2112) says that if the
root body part of a multipart/related is of type
multipart/alternative, then the type parameter of multipart/related
should be "multipart/alternative". It has been suggested that this be
changed, so that the type parameter tells what the main part of the
multipart/alternative is. One solution might be to change the syntax
of the type parameter so that it can, for example, have the value
"multipart/alternative;text/html" to indicate that the root is a
multipart/alternative whose primary alternative is of type text/html.

------------------------------------------------------------------------
Jacob Palme (Stockholm University and KTH)
for more info see URL: http://www.dsv.su.se/~jpalme