- Path: sparky!uunet!spool.mu.edu!enterpoop.mit.edu!eru.mt.luth.se!lunic!sunic!aun.uninett.no!nuug!ifi.uio.no!enag
- From: SGML@ifi.uio.no (Erik Naggum)
- Newsgroups: comp.text.sgml
- Subject: Re: Precedence of SGML Operators?
- Message-ID: <19930107.004@erik.naggum.no>
- Date: 7 Jan 93 19:31:51 GMT
- References: <1802@igd.fhg.de> <19921228.013@erik.naggum.no> <93005.142005U35395@uicvm.uic.edu>
- Lines: 90
-
- I wrote:
- :
- | ... which in a different language could have been explained by some
- | arcane precedence rules at work; however, SGML has no notion of
- | operators or of operator precedence rules as these terms are used in
- | programming languages.
-
- I may have been unclear, so let me try again.
-
- The term "operator" as used in programming languages has no counterpart in
- SGML. Programming languages specify actions, operations, to be performed
- as a result of executing the code. SGML is a descriptive language, and has
- no notion of operations: Operations cannot be expressed in SGML alone.
- Therefore, there are no _operators_. The concept "operator" comes from a
- different world, and is conceptually incommensurate with SGML's concepts of
- "occurrence indicator" and "connector". Thus, there is no place for the
- concept "operator" in the SGML world at all.
-
- In some programming languages, there are rules to disambiguate the semantic
- order among multiple operators in a given expression, usually known as
- precedence rules, modified by parentheses. (These rules may be expressed
- in the grammar, but this is indistinguishable from binding directions and
- precedence rules to the language user.) In SGML, parentheses are not used
- to form expressions, but _groups_. In any given group, only one connector
- may be used (the semantics of using more than one is not defined by the
- language), so there is no issue of precedence among connectors, or of
- binding direction. Moreover, by virtue of the rule that only one connector
- can be used in a group, the connector effectively refers to all tokens in
- the group, even though it syntactically occurs between individual tokens.
- (This means that the connectors cannot even be understood as "binary".)
- Similarly, an occurrence indicator refers to the immediately preceding
- _token_ (a group is a token in this respect), and would have been a "unary
- post-fix operator" if the language had had operators.
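
To make the group rule concrete, here is a minimal sketch in Python of a reader for a simplified subset of model groups. It is not taken from ISO 8879 or from any existing parser: the names `parse_model_group`, `Name` and `Group` are hypothetical, and parameter entity references, separators other than spaces, and the other details of the real syntax are ignored. What it tries to show is that the connector is stored once, as a property of the whole group, and that an occurrence indicator is attached to the token immediately before it, where a parenthesized group counts as a token.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union

CONNECTORS = {",", "|", "&"}   # seq, or, and
OCCURRENCE = {"?", "+", "*"}   # opt, plus, rep


@dataclass
class Name:
    text: str
    occurrence: Optional[str] = None


@dataclass
class Group:
    tokens: List[Union["Name", "Group"]]
    connector: Optional[str] = None    # one per group, applies to all tokens
    occurrence: Optional[str] = None   # applies to the group as one token


def parse_model_group(source: str) -> Group:
    """Parse one parenthesized model group (hypothetical, simplified)."""
    text = "".join(source.split())     # drop whitespace
    group, pos = _group(text, 0)
    if pos != len(text):
        raise ValueError(f"unexpected text at position {pos}")
    return group


def _occurrence(s: str, i: int) -> Tuple[Optional[str], int]:
    # An occurrence indicator belongs to the token just before it.
    if i < len(s) and s[i] in OCCURRENCE:
        return s[i], i + 1
    return None, i


def _token(s: str, i: int) -> Tuple[Union[Name, Group], int]:
    if i < len(s) and s[i] == "(":
        return _group(s, i)            # a nested group is itself a token
    start = i
    while i < len(s) and (s[i].isalnum() or s[i] in "#.-"):
        i += 1
    if i == start:
        raise ValueError(f"expected a token at position {i}")
    name = Name(s[start:i])
    name.occurrence, i = _occurrence(s, i)
    return name, i


def _group(s: str, i: int) -> Tuple[Group, int]:
    if i >= len(s) or s[i] != "(":
        raise ValueError(f"expected '(' at position {i}")
    i += 1
    tokens: List[Union[Name, Group]] = []
    connector: Optional[str] = None
    while True:
        token, i = _token(s, i)
        tokens.append(token)
        if i >= len(s):
            raise ValueError("unclosed group")
        ch = s[i]
        i += 1
        if ch == ")":
            break
        if ch not in CONNECTORS:
            raise ValueError(f"unexpected {ch!r} at position {i - 1}")
        if connector is None:
            connector = ch
        elif ch != connector:
            # Mixed connectors are rejected; there is nothing to "bind".
            raise ValueError("only one connector may be used in a group")
    group = Group(tokens, connector)
    group.occurrence, i = _occurrence(s, i)
    return group, i
```

With this sketch, `parse_model_group("(a, (b | c)*, d?)")` yields an outer group whose single connector is `,`, containing a nested `|` group that carries the `*` indicator, while `parse_model_group("(a, b | c)")` raises `ValueError` instead of consulting any precedence table. The fact that no such table is needed anywhere in the code is the point of the paragraph above.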
-
- We have therefore established that a connector refers to all tokens in a
- group using it, instead of the tokens between which it occurs, and that the
- occurrence indicator is unary and post-fix with respect to the token to
- which it refers. This completely removes the concept "precedence" from the
- SGML world. Similarly, we previously removed the concept "operator" from
- the SGML world.
-
- This is why I said that there could've been some notion of precedence rules
- at work "in a different language".
-
- Numerous people try to think of SGML as "code" with "operators" and all
- sorts of familiar programming-language and/or regular expression concepts.
- There are important similarities, which enable us to think of SGML things
- in terms of familiar concepts and apply well known algorithms and
- techniques to SGML grammars, such as ambiguous content models and regular
- expressions, but the differences are also important, and can only be
- ignored at the peril of those who do. In fact, the differences are what
- makes SGML so elusive and hard to grasp for many people who would like it
- to be what they think it looks like. That's how we can get questions which
- confuse the element structure (a hierarchy, with all the attendant tree
- terminology and access modes), with its linearization and fragmentation
- into the entity structure, and how people can spend lots of time figuring
- out that "ambiguity" in SGML isn't the "ambiguity" from language theory
- (where the closest equivalent term is "non-deterministic").
-
- SGML isn't a programming language, and never the twain shall meet.
-
- C. M. Sperberg-McQueen <U35395@uicvm.uic.edu> writes:
- :
- | ... EN's remarks might lead one to believe that other SGML parsers
- | don't have bugs; that would be a wrong conclusion. Certainly most of
- | the ones I've used have a problem here or there.
-
- I regret if this was a possible interpretation. What I actually thought
- and wanted to express was that with all the marketing and hype that the
- Sema Group puts out about how their parser and tools support more SGML than
- others, they should be even more diligent than others in making sure that
- the marketing matches the product, and they have put their head on the
- block voluntarily. There is also a major difference between getting the
- grammar wrong, and having a problem here and there. This is the difference
- between a bug and a design flaw. (I'm painfully aware that products in the
- computer industry _never_ match their marketing, but that doesn't mean we
- should accept this state of affairs.)
-
- Thanks to Ed Blachman for noting that a new version of Mark-It has been
- published with this violation of the standard rectified.
-
- I hope this clarifies what I had in mind.
-
- Best regards,
- </Erik>
- --
- Erik Naggum ISO 8879 SGML +47 295 0313
- ISO 10744 HyTime
- <erik@naggum.no> ISO 9899 C Memento, terrigena
- <SGML@ifi.uio.no> ISO 10646 UCS Memento, vita brevis
-