- Path: sparky!uunet!cis.ohio-state.edu!ucbvax!RALVM13.VNET.IBM.COM!DRMACRO
- From: DRMACRO@RALVM13.VNET.IBM.COM ("Dr. "Eliot Kimber" Macro")
- Newsgroups: comp.text.sgml
- Subject: Data Representation in SGML
- Message-ID: <9207211343.AA07166@ucbvax.Berkeley.EDU>
- Date: 21 Jul 92 13:10:37 GMT
- Sender: daemon@ucbvax.BERKELEY.EDU
- Lines: 107
-
- Darryll Raymond writes:
-
- >In article <9207201503.AA25149@ucbvax.Berkeley.EDU>, DRMACRO@RALVM13.VNET.IBM.COM
- >("Dr. "Eliot Kimber" Macro") writes:
- >
- >>I do know that I have yet to be presented with a data form or
- >>set of relationships that I cannot express in an SGML tag set
- >>coupled with complementary processing applied to that tag set.
- >
- > Two comments. First of all, computationally speaking, this isn't
- >saying much. For if you allow yourself to use complementary processing,
- >then it doesn't matter *how* you express the original relationships, or
- >which ones you do specify. For you can smuggle all the complexity, as
- >well as any extra relationships you want, into your complementary processing.
- >If you start out with arbitrary computing power, it's not surprising that
- >you end up being able to do anything that's computable.
-
- I think we may be talking at cross purposes and that perhaps I have
- misunderstood both what you were getting at and what your purpose
- is.
-
- If your contention is that the representations and relationships
- definable *solely* in the semantics provided by the SGML
- language syntax as defined in ISO 8879 are not up to the
- task of data modeling in all the various ways that it might
- be done (the assumption being that some language could be
- invented that would allow that in some rigorous way), then I must
- agree. SGML was not designed to be an all-encompassing data
- modeling language all by itself. Quite the opposite.
-
- SGML was designed to do two things:
-
- 1. Separate as completely as possible information about document
- structure and content (what things are and how they relate to
- other things in the document structurally) from "formatting"
- or presentation specifications needed to derive a particular
- output rendering from that data. Doing this enables re-use
- of information in powerful ways.
-
- 2. Define a rigorous syntax for the markup that enables point 1
- such that any conforming SGML document will be parsed in exactly the
- same way by any conforming SGML system.
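-
- Point 1 can be illustrated with a minimal sketch in Python. This is
- an illustration only, not part of ISO 8879: XML stands in for SGML
- because Python's standard library has no SGML parser, and the
- element names (memo, title, para) and both rendering functions are
- hypothetical. The same structural data yields two presentations.
-
```python
# Illustrative only: XML stands in for SGML, and all element names
# and function names here are hypothetical.
import xml.etree.ElementTree as ET

DOC = """<memo>
  <title>Data Representation</title>
  <para>Markup says what things are.</para>
  <para>Style sheets say how they look.</para>
</memo>"""


def render_plain(root):
    """One presentation: plain text with an underlined title."""
    title = root.findtext("title")
    lines = [title, "=" * len(title)]
    lines += [p.text for p in root.iter("para")]
    return "\n".join(lines)


def render_outline(root):
    """A second presentation of the same data: a numbered outline."""
    paras = [p.text for p in root.iter("para")]
    return [f"{i}. {text}" for i, text in enumerate(paras, start=1)]


root = ET.fromstring(DOC)
print(render_plain(root))
print(render_outline(root))
```
-
- Neither renderer changes the document; the formatting decisions live
- entirely outside the marked-up data.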
-
- The SGML standard is explicit in indicating that it is up to
- the processing application to define what the semantics
- of the data are. This makes sense because different uses of
- the same data may use different semantics, so SGML should
- not impose any single semantic expression on that data.
- What's missing is a standard way of expressing those semantics.
-
- The SGML standard goes out of its way to avoid defining a
- language for expressing any semantics about relationships (even
- IDREFs only tell you that two things are connected--they don't
- tell you why). This is because the SGML standard forms the
- basis for a general data processing infrastructure upon which
- more complex applications can be built. As Erik said, and I
- agree 100 percent, the value of SGML is that it makes building
- applications on top of an SGML system easy (or even possible,
- where they were impractical before).
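-
- The IDREF point can be sketched the same way. Assumptions: XML
- again stands in for SGML, the attributes "id" and "target" are
- simply treated as ID and IDREF (in SGML those types would be
- declared in the DTD), and the two "applications" are hypothetical.
- The parse yields only the fact of a connection; each application
- decides what the connection means.
-
```python
# Illustrative only: the markup records that two elements are
# connected; the meaning is supplied by the application.
import xml.etree.ElementTree as ET

DOC = """<doc>
  <section id="s1"><title>Parsing</title></section>
  <section id="s2"><title>Processing</title><ref target="s1"/></section>
</doc>"""


def resolve_links(root):
    """Return (source id, target id) pairs: all the markup itself asserts."""
    ids = {el.get("id"): el for el in root.iter() if el.get("id")}
    links = []
    for parent in root.iter():
        for child in parent:
            if child.tag == "ref" and child.get("target") in ids:
                links.append((parent.get("id"), child.get("target")))
    return links


def as_cross_reference(links):
    """Hypothetical application A: a link means 'see also'."""
    return [f"{src}: see also {dst}" for src, dst in links]


def as_dependency(links):
    """Hypothetical application B: a link means 'src requires dst'."""
    return {src: dst for src, dst in links}


links = resolve_links(ET.fromstring(DOC))
print(as_cross_reference(links))
print(as_dependency(links))
```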
-
- Note that the draft DSSSL standard (ISO 10179) *does* define a
- standard language for describing the relationship semantics between
- arbitrary data elements in SGML documents.
-
- I would suggest that, instead of rejecting SGML because it
- doesn't define semantics, you propose an SGML language
- and related set of processors that do everything you want
- them to. This is what HyTime does. HyTime defines the
- data elements it recognizes *AND* the associated processing
- that must be performed. Thus, interchange is achieved
- at the processing level because the HyTime standard
- defines what processing must be done and how semantic
- relationships are defined and decoded.
-
- By the same token, a publishing system based on DSSSL
- and SGML enables interchange at the processing level
- because the DSSSL specification is a specification
- of the semantic relationships defined in a standardized
- language. Any conforming DSSSL processor will reach
- the same semantic conclusions about a given document.
-
- What I have learned from HyTime is that any way of expressing data,
- any set of semantic relationships, can be defined in terms of
- "architectural forms", which codify the types of things your system
- deals with, coupled with definitions of the required processing
- associated with each architectural form, thus codifying the semantic
- relationships between different forms. My forms may use SGML
- functions like IDREF to enable connections between elements, but my
- architecture definition defines what two elements being connected means
- semantically and in terms of processing that either must be performed,
- can be performed, or cannot be performed. I can, in other words,
- standardize, using SGML as the language for expressing my
- architectural forms, the representation and processing of any data
- form or relationship I want to. The processing part of my standard
- can be expressed textually, or it could be expressed in some general
- procedure definition language. In fact, I could use DSSSL
- specifications to specify my "semantic architectural forms". SGML is
- not a requirement for doing this, but SGML provides a powerful and
- convenient language for doing it (and for me, writing content models
- is second nature at this point).
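-
- As a rough sketch of that idea (not HyTime itself, and not any
- standard's actual mechanism): suppose each element names the form it
- conforms to in an attribute, and the architecture is a table from
- form name to required processing. The attribute name "form", the
- form name "link", the element names, and every function below are
- hypothetical, and XML stands in for SGML.
-
```python
# Illustrative only: a toy "architecture" that maps a form name to the
# processing required for every element declaring that form.
import xml.etree.ElementTree as ET

DOC = """<report>
  <note id="n1">Check the DTD.</note>
  <seealso form="link" target="n1"/>
  <xref form="link" target="n1"/>
  <para>No form declared here.</para>
</report>"""


def process_link(el, ids):
    """Processing required for the hypothetical 'link' form."""
    target = ids.get(el.get("target"))
    if target is None:
        raise ValueError(f"unresolved target in <{el.tag}>")
    return f"{el.tag} -> {target.text}"


# The architecture definition: form name -> required processing.
FORMS = {"link": process_link}


def apply_architecture(root):
    """Run each element's form handler; elements without a form are skipped."""
    ids = {el.get("id"): el for el in root.iter() if el.get("id")}
    results = []
    for el in root.iter():
        handler = FORMS.get(el.get("form"))
        if handler is not None:
            results.append(handler(el, ids))
    return results


results = apply_architecture(ET.fromstring(DOC))
print(results)
```
-
- Two differently named elements receive identical processing because
- they declare the same form; the tag names stay free for the document
- designer to choose.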
-
- Eliot Kimber Internet: drmacro@ralvm13.vnet.ibm.com
- Dept E14/B500 IBMMAIL: USIB2DK9@IBMMAIL
- Network Programs Information Development Phone: 1-919-543-7091
- IBM Corporation
- Research Triangle Park, NC 27709
-