extensibility

Unparsed Entities and Their Uses


When working with a DTD, unparsed entities allow XML documents to reference non-XML data sources. The data sources may be graphics for display with the text, or they may be any other form of non-XML data that an application might need to supplement the information contained in the XML documents.

Unparsed entities look much like external parsed entities, but there are some key differences. An unparsed entity's declaration names a notation (introduced by the NDATA keyword), and the entity can be referenced only by name, in the value of an attribute declared with the type ENTITY or ENTITIES. When an application finds an element that carries an unparsed entity name as the value of such an attribute, it can look up the entity's system or public identifier, as well as the notation identifying its type, and optionally load or present the information the unparsed entity points to.
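The lookup described above can be sketched in Python with the standard library's expat bindings. This is only an illustration under stated assumptions: the sample document, the element and attribute names (report, figure, source), the entity name (sales-chart), the notation (gif), and the file path are all invented for the example, and resolve_unparsed_entities is a hypothetical helper, not part of any library.

```python
import xml.parsers.expat

# Hypothetical sample document: every name and path here is invented.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE report [
  <!ELEMENT report (figure)>
  <!ELEMENT figure EMPTY>
  <!ATTLIST figure source ENTITY #REQUIRED>
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY sales-chart SYSTEM "charts/sales.gif" NDATA gif>
]>
<report><figure source="sales-chart"/></report>
"""


def resolve_unparsed_entities(xml_text):
    """Return one record per unparsed entity referenced from an attribute."""
    notations = {}      # notation name -> identifiers
    entities = {}       # unparsed entity name -> identifiers and notation
    entity_attrs = set()  # (element, attribute) pairs typed ENTITY/ENTITIES
    found = []

    def on_notation(name, base, system_id, public_id):
        notations[name] = {"system_id": system_id, "public_id": public_id}

    def on_unparsed_entity(name, base, system_id, public_id, notation_name):
        entities[name] = {
            "system_id": system_id,
            "public_id": public_id,
            "notation": notation_name,
        }

    def on_attlist(elname, attname, attr_type, default, required):
        if attr_type in ("ENTITY", "ENTITIES"):
            entity_attrs.add((elname, attname))

    def on_start(name, attrs):
        for attname, value in attrs.items():
            if (name, attname) not in entity_attrs:
                continue
            # An ENTITIES value is a whitespace-separated list of names.
            for entity_name in value.split():
                entity = entities.get(entity_name)
                if entity is None:
                    continue  # undeclared name: nothing to look up
                notation = notations.get(entity["notation"], {})
                found.append({
                    "element": name,
                    "attribute": attname,
                    "entity": entity_name,
                    "system_id": entity["system_id"],
                    "public_id": entity["public_id"],
                    "notation": entity["notation"],
                    "notation_system_id": notation.get("system_id"),
                })

    parser = xml.parsers.expat.ParserCreate()
    parser.NotationDeclHandler = on_notation
    parser.UnparsedEntityDeclHandler = on_unparsed_entity
    parser.AttlistDeclHandler = on_attlist
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return found


if __name__ == "__main__":
    for ref in resolve_unparsed_entities(DOCUMENT):
        print(ref)
```

The sketch only collects the identifiers. What to do with them (fetch the file, hand it to a viewer chosen by notation, or ignore it) is left to the application, which matches the optional nature of unparsed entity processing.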

While unparsed entities can be declared in the main schema for a document, many documents declare them in their internal subset instead, rather than relying on a schema shared by multiple documents to anticipate their particular needs. As the number of documents grows, storing all unparsed entities in a centralized schema may produce enormous lists of declarations, most of which are used by only a few documents.

Note: Applications aren't required to process unparsed entities, and most 'generic' XML applications, like browsers, won't have any idea what to do with one. Unparsed entities tend to be most useful in limited situations involving custom applications. Another tool for referencing both XML and non-XML information from XML documents, XLink, is under development at the W3C.

Copyright 2000 Extensibility, Inc.

Suite 250, 200 Franklin Street, Chapel Hill, North Carolina 27516