Chapter 5. Transforming Documents

Table of Contents

Transformation Configuration Options
Creating a Scenario
Example Transformation Configurations
PDF Output
PS Output
TXT Output
HTML Output
HTML Help Output
JavaHelp Output
XHTML Output

XML is designed to store, carry, and exchange data, not to display it. To view the data we must either use an XML-compliant user agent or convert the document to a format that other user agents can read. This conversion is known as transformation.

In the current version of oXygen you can transform your XML documents to the following formats without leaving the application. To transform to a format that is not listed, install the tool chain required for that transformation and process the XML files created with oXygen according to the processor's instructions.

PDF

Adobe Portable Document Format (PDF) is a compact binary file format that can be viewed and printed on a broad range of hardware and software using the free PDF viewer from Adobe.

PS

PostScript is Adobe's printing technology for high-quality output on devices ranging from desktop printers to digital presses, platemakers, and large-format imagesetters. PostScript files can be viewed with viewers such as Ghostscript, but are more commonly created as a prepress format.

TXT

Text files are plain ASCII text and can be opened in any text editor or word processor.

XML

XML stands for EXtensible Markup Language and is a W3C standard. It is a markup language, much like HTML, but it was designed to describe data. XML tags are not predefined; you must define your own. XML uses a Document Type Definition (DTD) or an XML Schema to describe the data, and XML with a DTD or XML Schema is designed to be self-descriptive. XML is not a replacement for HTML. XML and HTML were designed with different goals:

  • XML was designed to describe data and to focus on what data is.

  • HTML was designed to display data and to focus on how data looks.

  • HTML is about displaying information; XML is about describing information.

XHTML

XHTML stands for EXtensible HyperText Markup Language, a W3C standard. XHTML is intended to replace HTML. While almost identical to HTML 4.01, XHTML is a stricter and cleaner version of HTML: it is HTML defined as an XML application.

All formatting during a transformation is controlled by an Extensible Stylesheet Language Transformations (XSLT) stylesheet. Specifying the appropriate stylesheet enables transformation to the formats above and the preparation of output files for specific user agents, including:

HTML

HTML stands for HyperText Markup Language and is a W3C standard for the World Wide Web. An HTML file is a text file containing markup tags that tell the Web browser how to display the page. An HTML file must have an htm or html file extension and can be created using a simple text editor.

HTML Help

Microsoft HTML Help is the standard help system for the Windows platform. Authors can use HTML Help to create online help for a software application or to create content for a multimedia title or Web site. Developers can use the HTML Help API to program a host application or hook up context-sensitive help to an application.

JavaHelp

JavaHelp software is a full-featured, platform-independent, extensible help system from Sun Microsystems that enables developers and authors to incorporate online help in applets, components, applications, operating systems, and devices. JavaHelp is a free product and the binaries for JavaHelp are redistributable.

Many other target formats are possible; these are the most popular. The basic condition for transformation to any format is that your document is valid against a given DTD and that the XSLT stylesheet used for the transformation is compatible with that DTD.

An XSL stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses the formatting vocabulary.

XSL consists of three parts:

XSL Transformations

XSLT is a language for transforming XML documents.

XML Path Language

XPath is an expression language used by XSLT to access or refer to parts of an XML document. XPath is also used by the XML Linking specification.

XSL Formatting Objects

XSL-FO is an XML vocabulary for specifying formatting semantics.
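
To make the role of XPath concrete, the following Python sketch selects parts of a small XML document with path expressions. The sample document and its element names (book, chapter, title) are invented for illustration. Python's standard xml.etree.ElementTree module supports only a limited subset of XPath, so this shows the idea rather than the full language available to an XSLT processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element and attribute names are
# assumptions made for this illustration only.
SAMPLE = """
<book>
  <chapter id="intro"><title>Introduction</title></chapter>
  <chapter id="transform"><title>Transforming Documents</title></chapter>
</book>
"""

def chapter_titles(xml_text):
    """Return the text of every chapter title, in document order."""
    root = ET.fromstring(xml_text)
    return [t.text for t in root.findall("./chapter/title")]

def title_of(xml_text, chapter_id):
    """Return the title of the chapter with the given id, or None."""
    root = ET.fromstring(xml_text)
    node = root.find(f"./chapter[@id='{chapter_id}']/title")
    return None if node is None else node.text

print(chapter_titles(SAMPLE))
print(title_of(SAMPLE, "transform"))
```

In a stylesheet, expressions of the same kind appear in attributes such as match and select to pick the nodes a template applies to.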

The oXygen installation package is distributed with Apache FOP (Formatting Objects Processor). FOP is an output-independent print formatter driven by XSL Formatting Objects. It is implemented as a Java application that reads a formatting object tree and renders the resulting pages to a specified output. Other FO processors can be configured in the Preferences->External FO Processors option for use in document transformation.
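
As a rough sketch of what an FO processor such as FOP consumes, the Python snippet below holds a minimal XSL-FO document and builds a command line that would render it. The page dimensions, the file names, and the assumption that a launcher script named fop is on the PATH are illustrative and are not taken from the oXygen configuration. The -fo, -pdf, -ps, and -txt switches are options of FOP's command-line interface.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"  # XSL-FO namespace

# Minimal XSL-FO document: one A4 page master and one block of text.
# Page size and margin are arbitrary values chosen for the example.
MINIMAL_FO = f"""<fo:root xmlns:fo="{FO_NS}">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page"
        page-height="29.7cm" page-width="21cm" margin="2cm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="12pt">Hello from XSL-FO</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
"""

def block_texts(fo_text):
    """Parse the FO document and return the text of each fo:block."""
    root = ET.fromstring(fo_text)
    return [b.text for b in root.iter(f"{{{FO_NS}}}block")]

def fop_command(fo_file, out_file, output="pdf"):
    """Build (but do not run) a FOP command line for the given output."""
    if output not in ("pdf", "ps", "txt"):
        raise ValueError(f"unsupported output: {output}")
    # Assumes a launcher named "fop" is on the PATH.
    return ["fop", "-fo", fo_file, f"-{output}", out_file]

print(block_texts(MINIMAL_FO))
print(fop_command("hello.fo", "hello.pdf"))
```

The returned list can be passed to subprocess.run once an FO processor is installed; inside oXygen the equivalent step is handled by the application itself.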

Transformation Configuration Options

The Configure Transformation dialog is used to specify how the transformation process is executed and what result is produced. The dialog also lets you start the transformation without closing it, and it supplies the settings used by the Apply Transformation function.

Open the Configure Transformation dialog by selecting XML->Configure Transformation (Ctrl+Shift+C).

Figure 5.1. The Configure Transformation Dialog

Complete the dialog as follows:

Scenarios

Because the transformation process offers considerable flexibility and choice, oXygen lets you save configurations as “Scenarios”. A saved scenario can be reused, which saves time. Scenarios are saved on a per-document basis.

XSLT Tab

Use the XSLT tab to specify the input XSL file to be used for the transformation. You can also add XSLT parameters and append header and footer URLs for inclusion in the transformation.

FOP Tab

Use the FOP tab to enable or disable the use of FOP during a transformation. FOP input may be taken from the XSLT output or from the source of the document being edited. oXygen is supplied with Apache FOP, but supports the definition and use of any third-party processor. The default output method is PDF, but PS and TXT are also configured. You may add and define any method supported by your FO processor.

Output Tab

Use the Output tab to specify the path where output files will be saved. When performing an XHTML transformation, you must provide the relative path to the image location so that image references resolve correctly and the images are displayed in the output files. This is not required when using FOP, because images are embedded in the output PDF or PS file; the option is therefore disabled during FOP transformations.
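
The relative image path can be reasoned about as follows. This Python sketch computes the relative path from an output directory to an image directory and builds an image reference from it. The directory layout and file names are hypothetical, and oXygen's own resolution logic is not described in this chapter, so treat this only as an illustration of relative path resolution.

```python
import posixpath

def relative_image_dir(output_dir, image_dir):
    """Relative path from the output directory to the image directory."""
    return posixpath.relpath(image_dir, start=output_dir)

def image_href(output_dir, image_dir, file_name):
    """Reference to use in the XHTML output for one image file."""
    return posixpath.join(relative_image_dir(output_dir, image_dir), file_name)

# Hypothetical layout: output in /project/out/xhtml, images in /project/images.
print(image_href("/project/out/xhtml", "/project/images", "dialog.png"))
```

If the output directory later moves, the relative path has to be recomputed, which is why the setting belongs with the output path.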

Creating a Scenario

Use the following procedure to create a scenario.

  1. Select XML->Configure Transformation (Ctrl+Shift+C) to open the Configure Transformation dialog.

  2. Click Duplicate Scenario to create a copy of the current “Scenario”.

  3. Double-click in the “Name” field to select the existing text.

  4. Type a new name.

  5. Click OK or Transform Now to save the “Scenario”.

Note

Multiple scenarios can be created for a document. Scenarios are saved on a per-document basis, so creating a scenario for one document does not make that scenario available to other documents.