- Path: sparky!uunet!ralvm13.VNET.IBM.COM
- From: drmacro@ralvm13.VNET.IBM.COM
- Message-ID: <19930111.081057.248@almaden.ibm.com>
- Date: Mon, 11 Jan 93 10:46:04 EST
- Newsgroups: comp.text.sgml
- Subject: Re: FrameBuilder
- Disclaimer: This posting represents the poster's views, not those of IBM
- News-Software: UReply 3.1
- References: <19930111.002@erik.naggum.no>
- Lines: 78
-
- Erik brings up some interesting points and I think he's generally
- correct. The abstract model I have in my mind for SGML systems
- is one where applications are separated from the data storage
- component (database) by a true SGML parser and entity manager
- that provides a completely-conforming SGML parsing "view" of
- the data and hides the details of how the data is stored. On
- top of this layer is what we might call a "logical access
- interface" that only presents the resolved element tree view,
- augmented by knowledge of data entities, something like this:
-
-
-              Application Layer
-               :  A       :  A
-               :  :       V  :
-               :  :    Logical Access Interface
-               :  :       :  A
-               V  :       V  :
-             SGML Parser/Entity Manager
-                    :  A
-                    V  :
-              Data storage and access
-
-
- In this model, applications can either interact directly with the SGML
- parser and entity manager or interact with the logical access
- interface. The first choice represents traditional "translator"
- applications where the application itself performs all document data
- handling (e.g., builds its own data structures for holding information
- about the document as it's processed). The other option is a "query
- interface", where the knowledge of the document structure and content
- is held and managed by the logical access interface as a service for
- the application. This effectively provides a more abstract view of
- the document data for the application. This is the sort of view that
- data retrieval systems normally want. How this view is created in
- practice is not important; it could even be built by parsing the
- document instance on demand, if the parser is fast enough.
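-
- A minimal sketch of the two access paths, in Python. Every name in
- it (ToyParser, LogicalAccess, the event kinds) is hypothetical, and
- the "parser" just replays a prepared event list; it is a stand-in
- for the layering only, not for real SGML parsing.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Event:
    kind: str   # "start", "end", "data", or "ndata" (hypothetical event kinds)
    value: str  # element name, character data, or data-entity name


class ToyParser:
    """Stands in for the SGML parser/entity manager layer.

    Callers see only a stream of parse events; how the data is
    stored stays hidden behind this class.
    """

    def __init__(self, storage: List[Event]):
        self._storage = storage

    def events(self) -> Iterator[Event]:
        yield from self._storage


@dataclass
class EntityRef:
    name: str  # reference to a data entity, kept in the tree


@dataclass
class Element:
    name: str
    children: list = field(default_factory=list)  # Element, str, or EntityRef


class LogicalAccess:
    """Builds and holds the resolved element tree for applications."""

    def __init__(self, parser: ToyParser):
        stack = [Element("#root")]
        for ev in parser.events():
            if ev.kind == "start":
                el = Element(ev.value)
                stack[-1].children.append(el)
                stack.append(el)
            elif ev.kind == "end":
                stack.pop()
            elif ev.kind == "data":
                stack[-1].children.append(ev.value)
            elif ev.kind == "ndata":
                stack[-1].children.append(EntityRef(ev.value))
        self.root = stack[0].children[0]  # the document element

    def find(self, name: str) -> List[Element]:
        """Return all elements with the given name, in document order."""
        found = []

        def walk(el: Element) -> None:
            if el.name == name:
                found.append(el)
            for child in el.children:
                if isinstance(child, Element):
                    walk(child)

        walk(self.root)
        return found


doc = [
    Event("start", "doc"), Event("start", "para"),
    Event("data", "Hello"), Event("ndata", "fig1"),
    Event("end", "para"), Event("end", "doc"),
]
parser = ToyParser(doc)

# "Translator" style: the application consumes parser events itself.
names = [ev.value for ev in parser.events() if ev.kind == "start"]

# "Query" style: the logical access interface holds the tree.
paras = LogicalAccess(parser).find("para")
```

- Both applications sit on the same parser layer; only the second
- delegates the document structure to the logical access interface.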
-
- This is a very abstract model and the methods of implementing
- it are many and varied. I'm not sure there's a meaningful
- difference between a system where SGML documents are parsed
- into a database (without loss of structural data or knowledge
- of data entities), from which all subsequent access is done,
- and a system where all data access is done against the raw
- SGML data itself. As long as the SGMLness of the data is
- preserved such that it can be restored at will, it conforms.
- Note that this is different from using "SGML-like" structures
- or parsing SGML into a form in which some SGML aspects are
- lost, even if the structure is preserved.
-
- For example, the definition of SFQL presented at SGML '92 had no
- mechanism for preserving knowledge of NDATA entities within
- element content. Therefore, it fails the test defined above
- because there is loss of vital SGML information. By the same token,
- if the validity of the data in an "SGML database" cannot be
- determined in exactly the same way as if the data was parsed
- normally, the database system fails.
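-
- The preservation test can be sketched as a round-trip check, again
- in Python. The two store functions and the event tuples are
- hypothetical, and comparing event streams is a simplification of
- full SGML equivalence (it says nothing about validity against a
- DTD). The lossy store illustrates the kind of loss described above;
- it is not a model of SFQL itself.

```python
from typing import Callable, Dict, List, Tuple

# (kind, value); kinds are "start", "end", "data", "ndata" (hypothetical)
Event = Tuple[str, str]
Row = Dict[str, object]


def faithful_store(events: List[Event]) -> List[Row]:
    """Hypothetical database load that keeps every event."""
    return [{"seq": i, "kind": k, "value": v}
            for i, (k, v) in enumerate(events)]


def lossy_store(events: List[Event]) -> List[Row]:
    """Hypothetical load that keeps elements and text but drops
    references to data (NDATA) entities."""
    kept = [(k, v) for k, v in events if k != "ndata"]
    return [{"seq": i, "kind": k, "value": v}
            for i, (k, v) in enumerate(kept)]


def restore(rows: List[Row]) -> List[Event]:
    """Rebuild the event stream from stored rows, in stored order."""
    ordered = sorted(rows, key=lambda r: r["seq"])
    return [(r["kind"], r["value"]) for r in ordered]


def preserves_sgml(store: Callable[[List[Event]], List[Row]],
                   events: List[Event]) -> bool:
    """True if the stored form can be restored without loss."""
    return restore(store(events)) == events


doc = [("start", "doc"), ("data", "See "), ("ndata", "fig1"), ("end", "doc")]

faithful_ok = preserves_sgml(faithful_store, doc)
lossy_ok = preserves_sgml(lossy_store, doc)
```

- Under these assumptions the faithful store passes and the lossy
- store fails, even though its element structure is intact.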
-
- I can see a tremendous utility in the library system Erik
- outlined, where a set of SGML access services are provided
- that make SGML applications independent of any given SGML
- parser or entity manager. I suspect that the first generally-
- available (which may mean free) or well-defined set of functions
- will become a de facto standard. It would be interesting to
- see such a library defined as part of OSF/1, for example, or
- as part of IBM's SAA architecture for application interfaces.
- The DSSSL standard may provide such a definition (I don't know),
- or it may at least specify what information about SGML documents
- such an interface would have to expose, which would be enough to
- produce a complete design in the abstract. I would want to see such
- an interface defined outside the scope of any implementations,
- even if it is provided as part of a vendor product.
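-
- As an illustration only, an interface defined apart from its
- implementations might look like the Python sketch below. The
- interface, both backends, and the regular expression are
- hypothetical; the regular expression recognizes only fully
- written-out start tags and is not an SGML parser.

```python
import re
from abc import ABC, abstractmethod
from typing import Iterable, List


class SgmlAccess(ABC):
    """Hypothetical access-service interface, independent of any parser."""

    @abstractmethod
    def element_names(self) -> List[str]:
        """Return element names in document order."""


class OnDemandBackend(SgmlAccess):
    """Scans the raw text on every call (stand-in for a real parser)."""

    _START_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9.-]*)")

    def __init__(self, text: str):
        self._text = text

    def element_names(self) -> List[str]:
        return self._START_TAG.findall(self._text)


class PreloadedBackend(SgmlAccess):
    """Answers from names loaded earlier (stand-in for a database)."""

    def __init__(self, names: Iterable[str]):
        self._names = list(names)

    def element_names(self) -> List[str]:
        return list(self._names)


def count_elements(source: SgmlAccess, name: str) -> int:
    """Application code: depends only on the interface."""
    return source.element_names().count(name)


text = "<doc><para>a</para><para>b</para></doc>"
from_text = count_elements(OnDemandBackend(text), "para")
from_db = count_elements(PreloadedBackend(["doc", "para", "para"]), "para")
```

- The application function never learns which backend it is
- talking to, which is the independence argued for above.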
-
- Eliot Kimber Internet: drmacro@ralvm13.vnet.ibm.com
- Dept E14/B500 IBMMAIL: USIB2DK9@IBMMAIL
- Network Programs Information Development Phone: 1-919-543-7091
- IBM Corporation
- Research Triangle Park, NC 27709
-