- Path: sparky!uunet!pipex!unipalm!uknet!mcsun!sunic!ugle.unit.no!nuug!ifi.uio.no!SGML
- From: SGML@ifi.uio.no (Erik Naggum)
- Newsgroups: comp.text.sgml
- Subject: Re: Overenthusiasm
- Message-ID: <23302M@erik.naggum.no>
- Date: 12 Aug 92 23:43:58 GMT
- References: <5591@ebt-inc.UUCP>
- Organization: Department of Informatics, University of Oslo, Norway
- Lines: 151
-
- Steve DeRose <sjd@uhura.uucp> writes:
- |
- | Erik wrote:
- |
- | >...Well, I certainly won't speak for Darrell, but there is no issue of
- | >a portability problem for SGML documents.
-
- The next sentence, which totally changes the impact of the statement
- quoted above, went like this:
-
- || There [is] _perhaps_ an issue of proper specifications for
- || semantic[s] in the SGML application so that application programs can
- || be written from them, and thus, portability or (perhaps more
- || properly) implementability _of_ SGML applications is the key.
-
- In other words, there's no portability problem for SGML _documents_, but
- there might be an implementability issue for SGML _applications_.
-
- The reason is, of course, that SGML doesn't deal with documents, it
- deals with document _types_, as defined in and by SGML applications.
- Any application program will necessarily have to be written with respect
- to the document types, or at least half the point of using SGML is all
- but lost.
-
- | Erik, if I mail you a document to do *any* processing on (other than
- | confirming its validity, which is merely solipsistic), it is still
- | hard. Please list for me any two SGML application programs meant to
- | do comparable processing on SGML documents, which can take an SGML
- | document and produce comparable results without a lot of painful
- | manual set-up first. They must work for any DTD.
-
- This is a straw man you erect for the sole purpose of cutting it down as
- if it were my argument, which it isn't. Also, you must have ignored
- everything I have said about SGML applications in order to say this.
-
- To repeat, an SGML document is an SGML document relative to an SGML
- application, part of which is specified in the prolog of the SGML
- document, and part of which is specified in the rest of the DTD (=
- Document Type Definition, not just "Declaration"). There can be no such
- thing as what you request, because you forget that processing semantics
- is not specified in the SGML document itself. Thus "comparable
- processing on SGML documents" begs the question of what "processing"
- means, and how to establish "comparable". If I don't have the rest of
- the DTD, "they must work for any DTD" is also an irrational requirement.
- Implementing the semantics of an SGML application takes work, and
- whether you regard it as "a lot of painful manual set-up" or as
- implementing the semantics of an SGML application in order to conform to
- it, probably makes a hell of a difference in the quality of the work,
- too.
-
- | Most of the benefits of SGML, ironically, derive not from SGML per
- | se but from a set of conventions that SGML users encourage.
-
- As seen from the viewpoint that you'd like to suffer as little "painful
- manual set-up" as possible, yes. As seen from the viewpoint of
- capturing essential characteristics which super-classes of document
- types have in common, it becomes not just "conventions", but the
- informal version of what architectural forms do formally. It's
- obvious that you can't have only one "architecture" (or set of
- "conventions"), if you wish to do real information processing, but
- that's what it seems that you imply. If so, you have limited SGML to a
- set of semantics which makes SGML no more than a markup language for the
- presentation of documents in various forms.
-
- I think, and I know that I saw this when I first started to study SGML,
- that SGML is much more abstract than this. SGML is, in fact, so
- abstract that many people have enormous problems relating to more than
- one concrete side of its application. (See the description of the blind
- men and the elephant in Goldfarb's SGML Handbook for an example of this
- problem.) James Mason pointed this out in his article, too, where he
- notes that people have been with SGML since the beginning, and gradually
- become aware of its applicability far from their original problem
- domain.
-
- | On another point, to suggest that SGML is not for representing
- | semantics, and yet "is intended to be a information representation
- | vehicle", is oxymoronic.
-
- Syntax and semantics are often found in layers, where the syntax of one
- layer is the semantics of a lower layer. This is what you get from data
- abstraction, and languages such as ASN.1. I have been forced to move up
- and down in such layers more than I appreciated while it was going on
- when I worked on communication protocols, but I have also seen that such
- layering where conscious omission of vast amounts of detail is the rule
- of the day, can be tremendously difficult to cope with. The reference
- model for open systems interconnection (OSI), for instance, provides you
- with fully seven double-function interfaces of "service data" and
- "protocol data" which really are the same thing seen from above and
- below, but it's crucial to keep the viewpoint in focus, and moving
- around in the OSI model requires effort even from trained experts.
-
- I doubt that they would be very pleased to hear that their work is
- "oxymoronic" because some listener doesn't realize that they're looking
- at the same thing from at least 13 different viewpoints at the same
- time. SGML is relatively simple by comparison, with only between three
- and five different viewpoints on the same data "active" at the same time
- (depending on how seriously you take character sets and the abstractness
- of the abstract syntax). (HyTime adds five more, and I got completely
- lost at first, until I could view SGML as only one viewpoint on a HyTime
- document, and could rid myself of the sequential nature of the SGML
- document instance.)
-
- "There has to be _some_ semantics, but it could be _any_ semantics,"
- should be thought of as a rule of thumb with SGML documents. SGML
- defines the syntax and representation of element types and, ultimately,
- document types, and provides representational semantics, which at a
- sufficiently high level of abstraction can be regarded as syntax for the
- higher-level language. Carsten Bormann has related that he thinks the
- semantics of an SGML document parsing instance is the element structure
- information set (ESIS). That's certainly one very important viewpoint,
- because it's what an SGML parser "does" with respect to the application
- program. Still, the ESIS is mere syntax to the application program,
- which has to impose its own semantics on the data.
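-
- A minimal sketch of this point, in Python, assuming a hypothetical
- ESIS-like list of (kind, value) events. The element names "memo", "to"
- and "body" and both functions are invented for illustration; no
- particular parser's output format is implied. The same events are
- mere syntax to two applications that each bring their own semantics:

```python
# Hypothetical ESIS-like event stream: (kind, value) pairs such as a parser
# might hand to an application. Element names are invented for illustration.
EVENTS = [
    ("start", "memo"),
    ("start", "to"),
    ("data", "Steve"),
    ("end", "to"),
    ("start", "body"),
    ("data", "Layering is the key."),
    ("end", "body"),
    ("end", "memo"),
]


def render_text(events):
    """One application semantics: present the data as labelled lines."""
    lines, stack = [], []
    for kind, value in events:
        if kind == "start":
            stack.append(value)       # entering an element
        elif kind == "end":
            stack.pop()               # leaving the innermost element
        elif kind == "data":
            lines.append(f"{stack[-1].upper()}: {value}")
    return lines


def index_recipients(events):
    """Another application semantics: collect recipients, ignore the rest."""
    recipients, stack = [], []
    for kind, value in events:
        if kind == "start":
            stack.append(value)
        elif kind == "end":
            stack.pop()
        elif kind == "data" and stack[-1] == "to":
            recipients.append(value)
    return recipients
```

- Neither function is "the" meaning of the events; each is one
- application's semantics laid over the same structure.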
-
- I'm trying to limit SGML to what it can "do", but I'm also trying not to
- limit what it can be used _for_. I see this as essential in viewing
- SGML and related standards as _enabling_ (to use James Mason's article
- as a reference). If you need more than SGML "does", you need to define
- an SGML application.
-
- A legitimate concern has been raised that validation should occur as
- close to the data as possible. Yes, I agree. That's why SGML provides
- NOTATION, for instance. The non-trivial (but still toy-size) SGML
- applications I have played with required three layers in the application
- program, which could be invoked independently so as to validate more and
- more of the structure before using it. Not only was it relatively
- simple to implement, it produced very pleasing results.
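-
- A rough sketch of such independently invocable layers, in Python. The
- record fields, the date notation and the three checks are assumptions
- made up for illustration, not the layers of the applications mentioned
- above:

```python
import re

# Hypothetical parsed record; field names are invented for illustration.
RECORD = {"id": "m1", "date": "1992-08-12", "refs": ["m0"]}
KNOWN_IDS = {"m0", "m1"}


def check_structure(rec):
    """Layer 1: the required fields are present."""
    return [f"missing field: {k}" for k in ("id", "date", "refs") if k not in rec]


def check_notation(rec):
    """Layer 2: data content follows its declared notation (here, a date)."""
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", rec.get("date", "")):
        return []
    return ["date is not in YYYY-MM-DD form"]


def check_references(rec, known_ids):
    """Layer 3: application-level constraint that references resolve."""
    return [f"unresolved reference: {r}" for r in rec.get("refs", []) if r not in known_ids]


def validate(rec, known_ids, depth=3):
    """Run the first `depth` layers in order; stop at the first layer with errors."""
    layers = [
        lambda: check_structure(rec),
        lambda: check_notation(rec),
        lambda: check_references(rec, known_ids),
    ]
    for layer in layers[:depth]:
        errors = layer()
        if errors:
            return errors
    return []
```

- Calling validate with depth=1 checks structure only, while the default
- runs all three layers, so a caller can validate more and more of the
- structure before using it.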
-
- The key here is layering, and understanding that SGML spans a number of
- layers, and realizing that you should talk to SGML at the highest layer,
- and not try to grope about in the lower representational layers which
- SGML can handle just fine all by itself.
-
-
-
- Again, sorry about the length of this. I really don't have the time to
- do proper editing and removal of redundancy. Normally, I don't post the
- first couple of drafts, but I can't afford that luxury now. I think this
- needs saying because it's been true for about three weeks, now, and some
- people tend to react as if I had published this stuff in a refereed
- scientific journal. I find it more important to get this out, than wait
- until the thread is all but forgotten. If you think otherwise, please
- let me know. (And thanks to those who have. :-)
-
- Best regards,
- </Erik>
-
- PS: Steve, could you fix your Reply-To or From line so that it points back
- to a place where you can be reached? My archive receipt generator has sent me
- many a complaint that "uhura" doesn't exist in the known e-mail universe.
-
- --
- Erik Naggum | ISO 8879 SGML | +47 295 0313
- | ISO 10744 HyTime |
- <erik@naggum.no> | ISO 10646 UCS | Memento, terrigena.
- <enag@ifi.uio.no> | ISO 9899 C | Memento, vita brevis.
-