- Path: sparky!uunet!ebt-inc!uhura!sjd
- From: sjd%ebt-inc@uunet.uu.net (Steve DeRose)
- Newsgroups: comp.text.sgml
- Subject: Straw man proposal for overlapping markup
- Message-ID: <5431@ebt-inc.UUCP>
- Date: 24 Jul 92 15:07:56 GMT
- Sender: news@ebt-inc.UUCP
- Organization: EBT
- Lines: 76
- Originator: sjd@uhura
-
- Here's a straw-man proposal for an upgrade to SGML for handling overlapping
- markup streams. I won't claim it's very pretty, but it has advantages:
-
- * It requires *no* changes to the syntax of SGML documents, but
- only an additional keyword permitted in ATTLIST declarations in DTDs.
-
- * A parser that doesn't know about it could ignore the keyword and
- produce acceptable output (such as the identical ESIS).
-
- * Unbounded numbers and types of overlaps can be encoded inline.
-
- * The element relationships are pretty human-readable and findable.
-
- * A parser that does know about the construct can at least validate
- that all tags for overlapping elements balance.
-
- The mechanism is somewhat similar to Eliot's recent proposal; it is
- also among several approaches discussed over the last
- four years by the relevant Text Encoding Initiative committees, and for
- slightly longer by the CHUG Hypermedia Working Group.
-
- Here it is, building on my previous example:
-
- <!ELEMENT speech.start - - EMPTY>
- <!ATTLIST speech.start connect STARTID #REQUIRED>
- <!ELEMENT speech.end - - EMPTY>
- <!ATTLIST speech.end connect ENDID #REQUIRED>
-
- <speech.start speaker='Sir John' connect=chunk5>
- <poetry-line>Greetings, O brave defender of our right,</poetry-line>
- <poetry-line>'Till now, I thought you knew not how to Write</poetry-line>
- <poetry-line>but heavy Morals.
- <speech.end connect=chunk5>
-
- <speech.start speaker='Melissa' connect=chunk7>
- These my Pens imploy</poetry-line>
- <poetry-line>for all my business was to pall Your Joy:</poetry-line>
- ...
- <speech.end connect=chunk7>
-
- Almost exactly as with ID and IDREF attributes, there are validity constraints:
-
- * The value of a STARTID attribute must be unique among all STARTID
- attribute values in a document instance.
-
- * The value of an ENDID attribute must be unique among all ENDID
- attribute values in a document instance.
-
- * Every STARTID attribute value in a document instance must also occur
- as exactly one ENDID attribute value *later* in the same document.
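-
- As a rough illustration only, the three constraints above could be checked
- in a single pass over the document. The sketch below is in Python and assumes
- a parser has already extracted the attribute values in document order as
- (kind, value) pairs; that representation and the function name are
- hypothetical. It also reports an ENDID with no earlier STARTID, which the
- constraints above imply but do not state outright.

```python
from typing import List, Tuple

# Hypothetical input: one (kind, value) pair per attribute, in document order.
# kind is "start" for a STARTID attribute and "end" for an ENDID attribute.
Marker = Tuple[str, str]


def check_overlap_markers(markers: List[Marker]) -> List[str]:
    """Return a list of constraint violations; empty means the markers balance."""
    errors: List[str] = []
    seen_starts = set()  # every STARTID value encountered so far
    seen_ends = set()    # every ENDID value encountered so far
    open_ids = set()     # STARTID values still waiting for their ENDID

    for kind, value in markers:
        if kind == "start":
            if value in seen_starts:
                errors.append(f"duplicate STARTID {value!r}")
            else:
                seen_starts.add(value)
                open_ids.add(value)
        elif kind == "end":
            if value in seen_ends:
                errors.append(f"duplicate ENDID {value!r}")
                continue
            seen_ends.add(value)
            if value in open_ids:
                open_ids.discard(value)
            else:
                # Assumption: an ENDID must follow its STARTID.
                errors.append(f"ENDID {value!r} has no earlier STARTID")
        else:
            raise ValueError(f"unknown marker kind {kind!r}")

    # Anything still open never found a later ENDID.
    for value in sorted(open_ids):
        errors.append(f"STARTID {value!r} has no later ENDID")
    return errors


# The two speeches from the example, plus a case that truly overlaps.
speeches = [("start", "chunk5"), ("end", "chunk5"),
            ("start", "chunk7"), ("end", "chunk7")]
overlapping = [("start", "a"), ("start", "b"), ("end", "a"), ("end", "b")]
print(check_overlap_markers(speeches))
print(check_overlap_markers(overlapping))
```

- Note that the check never consults nesting: overlapping ranges pass as long
- as each STARTID value is closed exactly once later in the document.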
-
- This is slightly easier to check than ID/IDREF matching, and guarantees a
- certain level of integrity. The semantics of the overlap are in the
- application's domain (say, whether it should be highlighted or editable,
- whether there are cross-constraints between overlaps, etc.).
-
- A further improvement to this construct would have you declare both attributes
- for the *same* GI, with the constraint that elements having
- matching STARTID/ENDID values must be of the same GI. I actually like this larger
- extension much better, because it strengthens validation and allows
- the GI to actually reflect the type of the overlapping element (rather than
- permitting only a naming convention such as "speech.start" and "speech.end").
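-
- A minimal Python sketch of the extra same-GI constraint, again assuming a
- parser has already produced the markers in document order. The
- (gi, kind, value) tuples and the function name are hypothetical, and the
- GI "stanza" in the usage lines is invented purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input: (gi, kind, value) per attribute, in document order.
# gi is the element's generic identifier; kind is "start" or "end".
Marker = Tuple[str, str, str]


def check_same_gi(markers: List[Marker]) -> List[str]:
    """Report STARTID/ENDID pairs whose two elements have different GIs."""
    errors: List[str] = []
    start_gi: Dict[str, str] = {}  # STARTID value -> GI of the opening element

    for gi, kind, value in markers:
        if kind == "start":
            # Remember only the first opener; uniqueness is a separate check.
            start_gi.setdefault(value, gi)
        elif kind == "end" and value in start_gi and start_gi[value] != gi:
            errors.append(
                f"{value!r}: opened by <{start_gi[value]}> but closed by <{gi}>"
            )
    return errors


# A matching pair, then a pair whose GIs disagree.
print(check_same_gi([("speech", "start", "chunk5"), ("speech", "end", "chunk5")]))
print(check_same_gi([("speech", "start", "chunk7"), ("stanza", "end", "chunk7")]))
```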
-
- I also see it as a plus that current parsers can handle documents using
- either construct without being upgraded, simply by letting the user
- change the DTD to read "ID" for "STARTID" and "IDREF" for "ENDID". This
- weakens validation and expressiveness, but documents can be created and
- meaningfully parsed right now.
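-
- That DTD substitution is mechanical enough to script. The following is a
- naive Python sketch, with a hypothetical function name; it rewrites every
- whole-word occurrence of the two keywords, so it assumes they do not also
- appear as attribute names or inside comments in the DTD.

```python
import re


def downgrade_dtd(dtd_text: str) -> str:
    """Replace the proposed keywords with ones existing parsers accept."""
    # Whole-word matches only, so names merely containing the keywords survive.
    dtd_text = re.sub(r"\bSTARTID\b", "ID", dtd_text)
    return re.sub(r"\bENDID\b", "IDREF", dtd_text)


dtd = (
    "<!ATTLIST speech.start connect STARTID #REQUIRED>\n"
    "<!ATTLIST speech.end connect ENDID #REQUIRED>\n"
)
print(downgrade_dtd(dtd))
```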
-
- Anybody see problems with this idea? It does not attempt to address
- Darrell's case of truly discontiguous elements, only the case of contiguous
- but overlapping elements. With a little tuning, would this make sense for
- some of the applications we've been discussing?
-
-
- Steve DeRose
-