- Path: sparky!uunet!zaphod.mps.ohio-state.edu!news.acns.nwu.edu!uicvm.uic.edu!u35395
- Organization: University of Illinois at Chicago
- Date: Thu, 21 Jan 1993 15:20:53 CST
- From: C. M. Sperberg-McQueen <U35395@uicvm.uic.edu>
- Message-ID: <93021.152053U35395@uicvm.uic.edu>
- Newsgroups: comp.text.sgml
- Subject: sgml support and diff return code of 0
- References: <9301141548.AA19133@mingus.techno.com>
- <1993Jan15.192630.4767@informix.com> <93018.171821U35395@uicvm.uic.edu>
- <1993Jan19.023707.18716@informix.com>
- Lines: 47
-
- Robert Hartman writes (quoting me):
- > >the standard says very clearly that an end-tag inferred by the parser
- > >behaves the same way as an end-tag explicitly given in the input stream.
- > >If we require SGML processors to preserve that distinction from import
- > >to export, we really are requiring something more than SGML support from
- > >them.
- >
- > If you can write a parser that can do this correctly in all cases, more
- > power to you.
-
- I'm not sure what you mean by 'this' -- if you mean 'infer the
- existence of an end-tag in the SGML document', then any conforming
- SGML parser can do it. The standard defines quite clearly when a tag
- is omissible or not and how a parser can recognize the omission of a
- tag. So it's not magic. It's not even one of the areas in which
- different parsers differ substantially in their interpretations of the
- standard.
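
To make the point about inference concrete, here is a minimal sketch in Python. It is not an SGML parser: the single rule it applies (the end-tag of a hypothetical `p` element may be omitted, and is implied by the next `p` start-tag or by the end-tag of the enclosing element) is an assumption chosen for illustration, and the names `OMISSIBLE_END` and `parse` are invented here. What it shows is only that the event stream handed to the application is the same whether the end-tag was typed or inferred.

```python
import re

# Toy rule, assumed for illustration only: the end-tag of <p> may be omitted.
OMISSIBLE_END = {"p"}

# Matches a start-tag, an end-tag, or a run of character data.
TOKEN = re.compile(r"<(/?)(\w+)>|([^<]+)")

def parse(text):
    """Return a list of events; inferred and explicit end-tags look identical."""
    events, stack = [], []
    for closing, name, data in TOKEN.findall(text):
        if data:
            if data.strip():
                events.append(("data", data.strip()))
        elif not closing:
            # A <p> cannot contain another <p>, so an open one ends here.
            if stack and stack[-1] == name and name in OMISSIBLE_END:
                events.append(("end", stack.pop()))
            stack.append(name)
            events.append(("start", name))
        else:
            # Close any omissible elements still open inside this one.
            while stack and stack[-1] != name and stack[-1] in OMISSIBLE_END:
                events.append(("end", stack.pop()))
            if not stack or stack[-1] != name:
                raise ValueError(f"unexpected end-tag </{name}>")
            events.append(("end", stack.pop()))
    if stack:
        raise ValueError(f"unclosed elements: {stack}")
    return events

explicit = "<doc><p>one</p><p>two</p></doc>"
omitted = "<doc><p>one<p>two</doc>"
print(parse(explicit) == parse(omitted))
```

Nothing in the returned events records which of the two inputs was parsed, so an exporter working from them has no basis for reproducing the original tagging choice.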
-
- > From what little I know about parsers, it would seem to me to be much
- > easier to get one to pass the diff test than it would be to verify that
- > another makes all the right guesses.
-
- Easy, perhaps: all you have to do is save the input file. My argument
- was not that it's hard to preserve the input, but that it's logically
- inconsistent to insist that an implementation of a standard preserve
- distinctions which the standard defines as not having any legitimate
- significance.
-
- It's as though one insisted that a C compiler had to produce object code
- from which the source code could be regenerated, and the regenerated
- source code had to have all the comments, the white space, the macro
- definitions, and the newlines of the original source, as well as having
- trigraphs in exactly the places where the original source had trigraphs.
- Otherwise we are going to tell all our friends and colleagues that the
- C compiler is not fully conformant.
-
- Since the C standard specifies that a parser cannot require a user to
- use two spaces where one would suffice, or interpret two spaces
- differently from one (outside of literal strings), some might argue
- that a *conforming* C compiler *cannot possibly* preserve the information
- that one, or two, or seven, spaces were used here, or here, or here.
- Even if it's legal for a compiler to retain that information somehow,
- it would surely be rather strange to say that *only* a compiler which
- did that should count as fully conformant, which is what your
- original proposal boils down to for SGML parsers.
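
The white-space half of the analogy can be sketched the same way. The tokenizer below is a deliberately crude stand-in for a C lexer (no comments, no preprocessing, no multi-character operators); the pattern `TOKEN` and the function `tokens` are hypothetical names introduced for this example. It only illustrates that the count of spaces between tokens does not survive tokenization, while spaces inside a string literal do.

```python
import re

# A string literal, a word (identifier or number), or a single punctuation mark.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|\w+|[^\s\w]')

def tokens(src):
    """Split source text into tokens, discarding white space between them."""
    return TOKEN.findall(src)

tight = 'x = y + 1; s = "a  b";'
loose = 'x   =  y +   1 ;  s = "a  b";'

# Same token stream, however many spaces separated the tokens.
print(tokens(tight) == tokens(loose))

# Inside a literal the spaces are part of the token, so they still matter.
print(tokens('s = "a b";') == tokens('s = "a  b";'))
```

Everything downstream of such a tokenizer sees the same input for `tight` and `loose`, which is why preserving the difference would be extra machinery rather than part of conformance.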
-
- -C. M. Sperberg-McQueen
-