
Moving Information Between Schemas: Transformation Tools


The clean hierarchical structures that XML uses to represent data make it much easier to create new representations of the same document. Given an XML parser and some basic tools for transforming XML documents according to a set of rules, converting the information stored in XML documents from one schema to another can be relatively painless. In addition to custom-built application code, two generic tools are available to XML developers for transforming XML content between schema vocabularies: architectural forms and Extensible Stylesheet Language Transformations (XSLT).
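To make the "custom-built application code" option concrete, here is a minimal sketch of a rule-driven conversion using Python's standard xml.etree.ElementTree parser. The element names (memo, subject, para) and the RULES table are hypothetical and chosen only for illustration; a real conversion would usually need more than one-to-one renaming.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: source element name -> target element name.
RULES = {"memo": "html", "subject": "h1", "para": "p"}

def convert(elem, rules):
    """Return a copy of elem with element names rewritten by the rules."""
    # Elements without a rule keep their original name.
    out = ET.Element(rules.get(elem.tag, elem.tag), dict(elem.attrib))
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(convert(child, rules))
    return out

source = ET.fromstring(
    "<memo><subject>Status</subject><para>All is well.</para></memo>"
)
result = ET.tostring(convert(source, RULES), encoding="unicode")
print(result)  # <html><h1>Status</h1><p>All is well.</p></html>
```

Because the rules live in a plain table, the same function can serve several target vocabularies by swapping the table.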

Architectural forms have their roots in HyTime, a standard for creating hypermedia with SGML, XML's predecessor. Architectural forms use information built into the schema to map elements and attributes from one schema to another, providing a flexible and powerful way to transform many different schemas into a few core architectures. While no official standard for using architectural forms with XML exists, several tools support them. James Clark's SP, an SGML parser, includes an architectural forms engine, while David Megginson's XAF provides architectural forms support for applications using the Simple API for XML (SAX). The best guide to architectural forms and their use in XML schemas is David Megginson's Structuring XML Documents (Prentice Hall, 1998).
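The core idea can be sketched in a few lines: each element carries an attribute, named after the target architecture, whose value is the architectural form the element maps to. The sketch below is a simplified illustration, not the behavior of SP or XAF. The architecture name "html" and the element names are hypothetical, the mapping attributes are written directly in the instance (in practice a schema would normally supply them as fixed defaults), and unmapped elements are simply dropped.

```python
import xml.etree.ElementTree as ET

ARCH = "html"  # hypothetical architecture name, doubling as the attribute name

def to_architecture(elem, arch):
    """Build the architectural view of elem, or return None if it is unmapped."""
    form = elem.get(arch)
    if form is None:
        return None  # simplification: unmapped elements are dropped entirely
    out = ET.Element(form)
    out.text = elem.text  # mixed-content tail text is ignored in this sketch
    for child in elem:
        mapped = to_architecture(child, arch)
        if mapped is not None:
            out.append(mapped)
    return out

source = ET.fromstring(
    '<memo html="body">'
    '<subject html="h1">Status</subject>'
    '<routing>internal</routing>'
    '<para html="p">All is well.</para>'
    '</memo>'
)
view = ET.tostring(to_architecture(source, ARCH), encoding="unicode")
print(view)  # <body><h1>Status</h1><p>All is well.</p></body>
```

Note that the mapping travels with the schema and the document rather than with a separate rule file, which is the distinguishing trait of this approach.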

The transformation component of the Extensible Stylesheet Language (XSL), known as XSLT, provides another set of tools for manipulating XML information. XSLT takes an XML document and a style sheet, itself written in XML syntax, as its inputs and produces a new document, typically XML, as its output. XSLT style sheets combine templates and declarative markup to describe the final output. Both XSL and XSLT are developed at the World Wide Web Consortium (W3C).
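The template-driven model can be imitated in plain Python to show its shape. This is not XSLT and does not use an XSLT processor: it is a rough sketch in which each hypothetical "template" is a function keyed by the element name it matches, and apply_templates plays the role of recursing into child elements. Unlike XSLT's built-in rules, this sketch silently drops elements that have no template.

```python
import xml.etree.ElementTree as ET

def apply_templates(node, templates):
    """Transform each child of node with the template matching its name."""
    results = []
    for child in node:
        template = templates.get(child.tag)
        if template is not None:  # children without a template are skipped
            results.append(template(child, templates))
    return results

def memo_template(node, templates):
    body = ET.Element("body")
    body.extend(apply_templates(node, templates))
    return body

def subject_template(node, templates):
    h1 = ET.Element("h1")
    h1.text = node.text
    return h1

def para_template(node, templates):
    p = ET.Element("p")
    p.text = node.text
    return p

# Hypothetical template table: matched element name -> template function.
TEMPLATES = {
    "memo": memo_template,
    "subject": subject_template,
    "para": para_template,
}

root = ET.fromstring(
    "<memo><subject>Status</subject>"
    "<routing>internal</routing>"
    "<para>All is well.</para></memo>"
)
output = ET.tostring(TEMPLATES[root.tag](root, TEMPLATES), encoding="unicode")
print(output)  # <body><h1>Status</h1><p>All is well.</p></body>
```

All of the transformation logic sits in the template table, and the source document carries no mapping information at all.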

From a schema developer's perspective, the primary difference between these two approaches is that architectural forms expect transformation information to appear inside a schema (though you can, of course, build a separate schema that is used only for transformations), while XSLT stores all of its transformation information in the style sheet. Coordinating your schema development with the expected final products is still important: otherwise, your schema might permit structures in your documents that your style sheet ignores because it was not prepared for them. When possible, build schemas in close coordination with the plans for transforming your XML documents, and document your schemas (and style sheets) so that developers who come later can figure out what you were doing.
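One inexpensive way to catch such mismatches is to compare the element names a schema permits against the names the transformation actually handles. The sketch below assumes both lists are already available as Python sets. The names in SCHEMA_ELEMENTS and TEMPLATE_MATCHES are hypothetical, and extracting them from a real schema or style sheet is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical inputs: names the schema permits, and names with a template.
SCHEMA_ELEMENTS = {"memo", "subject", "para", "routing"}
TEMPLATE_MATCHES = {"memo", "subject", "para"}

def unhandled_elements(doc, handled):
    """Return the sorted names of elements in doc that no template handles."""
    return sorted({elem.tag for elem in doc.iter()} - handled)

# Gaps visible from the schema alone, before any document exists.
schema_gaps = sorted(SCHEMA_ELEMENTS - TEMPLATE_MATCHES)

# Gaps found in one particular document instance.
doc = ET.fromstring(
    "<memo><subject>Status</subject>"
    "<routing>internal</routing>"
    "<para>All is well.</para></memo>"
)
document_gaps = unhandled_elements(doc, TEMPLATE_MATCHES)

print(schema_gaps)    # ['routing']
print(document_gaps)  # ['routing']
```

Running a check like this whenever the schema or the style sheet changes makes it harder for the two to drift apart unnoticed.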

Copyright 2000 Extensibility, Inc.

Suite 250, 200 Franklin Street, Chapel Hill, North Carolina 27516