Working with Documents

Creating New Documents

oXygen supports a number of Document Types. Use the following procedure to create documents.

Procedure 4.1. Creating new documents

  1. Select File-> New (Ctrl+N). The New dialog is displayed, listing the supported Document Types: XML, XSL, XML Schema, Relax NG Schema, Document Type Definition, Text File, Java File, C File, C++ File, Batch File, Shell File, Properties File, SQL File, PHP File and PERL File.

  2. Select a document type, then click OK. If XML was selected, the Create an XML Document dialog is displayed; otherwise, a new document is opened in the Editor Panel.

The Create an XML Document dialog enables definition of XML Document Prologs using either XML Schema or DTD system identifiers. As not all XML documents are required to have a Prolog, you may choose to skip this step by clicking OK. If the prolog is required, complete the fields as described in the following tables.

Figure 4.7. The Create an XML Document Dialog - XML Schema Tab

Complete the dialog as follows:

Use a DTD or XML Schema

When checked, enables selection between DTD and XML Schema.

URL

Specifies the location of an XML Schema Document (XSD).

Document Root

Populated from the elements defined in the specified XSD, enables selection of the element to be used as document root.

Namespace

Specifies the document namespace.

Figure 4.8. The Create an XML Document Dialog - DTD Tab

Complete the dialog as follows:

Use a DTD or XML Schema

When checked, enables selection between DTD and XML Schema.

System ID

Specifies the location of a Document Type Definition (DTD).

Document Root

Populated from the elements defined in the specified DTD, enables selection of the element to be used as document root.

Public ID

Specifies the PUBLIC identifier declared in the Prolog.
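As an illustration of what the dialog fields translate to, the following minimal Python sketch builds the two kinds of prolog described above. The function names are hypothetical and the output layout is an assumption; the text oXygen actually generates may differ.

```python
def xml_with_doctype(root, system_id, public_id=None):
    """Return a minimal document whose prolog names a DTD.

    root      -> the Document Root field
    system_id -> the System ID field (location of the DTD)
    public_id -> the optional Public ID field
    """
    if public_id:
        doctype = f'<!DOCTYPE {root} PUBLIC "{public_id}" "{system_id}">'
    else:
        doctype = f'<!DOCTYPE {root} SYSTEM "{system_id}">'
    return f'<?xml version="1.0" encoding="UTF-8"?>\n{doctype}\n<{root}/>\n'


def xml_with_schema(root, schema_url, namespace=None):
    """Return a minimal document that points at an XML Schema.

    root       -> the Document Root field
    schema_url -> the URL field (location of the XSD)
    namespace  -> the optional Namespace field
    """
    xsi = 'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
    if namespace:
        attrs = (f'xmlns="{namespace}" {xsi} '
                 f'xsi:schemaLocation="{namespace} {schema_url}"')
    else:
        attrs = f'{xsi} xsi:noNamespaceSchemaLocation="{schema_url}"'
    return f'<?xml version="1.0" encoding="UTF-8"?>\n<{root} {attrs}/>\n'


if __name__ == "__main__":
    print(xml_with_doctype(
        "book",
        "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd",
        "-//OASIS//DTD DocBook XML V4.2//EN"))
    print(xml_with_schema("note", "note.xsd"))
```

When no Public ID is given, the sketch falls back to a SYSTEM declaration, which needs only the DTD location.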

Opening, Saving and Closing Documents

As with most editing applications, oXygen lets you open existing documents, save your changes and close them as required.

Opening Documents

Documents can be opened using:

  • File-> Open (Ctrl+O) : Displays the Open dialog used to discover, select and open one or more files.

  • File->Reopen: Displays a list of recently opened document files. Select a file to open.

  • Open: Opens the selected file from the Project View.

In addition, oXygen supports opening files directly from the command prompt. Use the following command syntax: sh ./oxygen.sh FileToOpen.xml

When the Tree View is started, oXygen displays the current document from the Main Window.

Saving Documents

There are three save methods:

  • File-> Save (Ctrl+S) : Saves the current document. If the document does not have a file, displays the “Save As” dialog.

  • File->Save All: Saves all open documents. If any document does not have a file, displays the “Save As” dialog.

  • File->Save As: Displays the Save As dialog, used to name and save an open document to a file, or to save an existing file with a new name.

Closing Documents

To close documents use the following methods:

  • File->Close (Ctrl+W) : Closes only the selected tab. All other tab instances remain.

  • File->Close All: Closes all open documents. If a document is modified or has no file, a prompt to save, not to save, or cancel the save operation is displayed.

  • Close: Displayed when a tab is right-clicked. Closes the selected tab.

  • Close other files: Displayed when a tab is right-clicked. Closes the other files except the selected tab.

  • Close all Tabs: Closes all open tabs within the panel.

Creating Documents based on Templates

Templates are documents containing a predefined structure. They provide starting points on which to rapidly build new documents that repeat the same basic characteristics. oXygen installs a rich set of templates for a number of XML applications. You may also create your own templates and share them with other users.

The Templates dialog enables you to select templates that have already been created in previous sessions or by other users. Open the Templates dialog by selecting File->New from Templates.

Figure 4.9. The Templates Dialog

Open a template using the following options:

Standard

Populates the Templates list to show templates supplied with the oXygen installation package.

From File

Populates the Templates list to show personal templates previously saved.

From URL

Enables definition of a URL location containing Templates.

Templates List

Displays the available templates for Standard, From File and From URL options.

Procedure 4.2. Creating Documents based on Standard Templates

  1. Select File->New from Templates. The Templates dialog is displayed.

  2. Select the Standard option from the Load Templates Group. The Templates list displays standard oXygen templates.

  3. Scroll the Templates list and select the required Template Type.

  4. Click OK. A new document is opened that already contains structure and content provided in the template starting point.

Procedure 4.3. Creating Documents based on Personal Template Files

  1. Select File->New from Templates. The Templates dialog is displayed.

  2. Select the From File option from the Load Templates Group. The Templates list displays personal templates.

  3. Scroll the Templates list and select the required Template Type.

  4. Click OK. A new document is opened that already contains structure and content provided in the template starting point.

Procedure 4.4. Creating Documents based on URL Template Files

  1. Select File->New from Templates. The Templates dialog is displayed.

  2. Select the From URL option from the Load Templates Group. The From URL field is enabled.

  3. Enter the URL location of the templates, then click Load. The list of templates is retrieved from the URL and displayed in the Templates list.

  4. Scroll the Templates list and select the required Template Type.

  5. Click OK. A new document is opened that already contains structure and content provided in the template starting point.

Creating New Templates

oXygen enables user defined templates to be created. Templates are created by adding an existing document to the Template library.

Procedure 4.5. Creating New Templates

  1. Open the document that will be used to create the Template.

  2. Modify the structure and content as required.

  3. Select File -> Add to Templates. The Add to Templates dialog is displayed.

  4. Enter the name by which the template will be known, then click OK. The document is added to the list of Personal Templates.

  5. Test the template using the From File option.

Sharing Templates

oXygen stores Personal Templates in an XML file called .com.oxygenxml.templates.xml, located in the Home folder of the oXygen user. By copying this file to a Web Server folder and making it accessible via HTTP, other oXygen users can use the From URL option to access the templates.

Procedure 4.6. Sharing Templates

  1. Create one or more Personal Templates.

  2. Copy .com.oxygenxml\templates.xml to an accessible folder on your Web Server.

  3. Test the template using the From URL option.

Editing Documents

While editing a document is a simple procedure, there are some points of which you should be aware and which should make your editing more productive.

Working with Unicode

Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language. Unicode is an internationally recognized standard, adopted by industry leaders. Unicode is required by modern standards such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, etc., and is the official way to implement ISO/IEC 10646.

It is supported in many operating systems, all modern browsers, and many other products. The emergence of the Unicode Standard, and the availability of tools supporting it, are among the most significant recent global software technology trends. Incorporating Unicode into client-server or multi-tiered applications and websites offers significant cost savings over the use of legacy character sets.

As a modern XML Editor, oXygen provides support for the Unicode standard, enabling your XML application to be targeted across multiple platforms, languages and countries without re-engineering. Internally, the oXygen Editor uses 16-bit characters covering the Unicode character set.

On loading documents of the type XML, XSL, XSD and DTD, oXygen reads the document prolog to determine the specified encoding type. This is then used to instruct the Java Encoder to load support for and save using the code chart specified. In the event that the encoding type cannot be determined, oXygen will prompt and display the “Available Java Encodings” dialog. The “Available Java Encodings” dialog provides a list of all encodings supported by the Java platform.

Figure 4.10. Available Java Encodings Dialog

While in most cases you will use UTF-8, simply changing the encoding name will cause the file to be saved using the new encoding. The appendix Unicode Character Encoding provides a Matrix that matches common names with Java Names. It also explains what you should type in the XML prolog to cause the document to be saved as the required encoding.
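The effect of the encoding name on the saved bytes can be sketched outside oXygen. The following Python example uses the standard xml.etree.ElementTree module (not oXygen's Java encoder) to serialize the same document under two encodings; the sample element and text are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny sample document containing one non-ASCII character.
root = ET.Element("note")
root.text = "café"

# Same document, two encodings: only the bytes on disk differ.
utf8_bytes = ET.tostring(root, encoding="UTF-8")
latin1_bytes = ET.tostring(root, encoding="ISO-8859-1")

if __name__ == "__main__":
    print(utf8_bytes)
    print(latin1_bytes)
```

In UTF-8 the character é is stored as two bytes, while in ISO-8859-1 it is a single byte, and the serializer writes an XML declaration naming the non-default encoding so that a parser can read the file back correctly.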

To edit documents written in Japanese or Chinese, you will need to change the font to one that supports the specific characters (a Unicode font). For the Windows platform, use of “Arial Unicode MS” or “MS Gothic” is recommended. Do not expect Wordpad or Notepad to handle these encodings. Use Explorer or Word if you need to examine such XML documents outside oXygen.

Note

The naming convention used under Java does not always correspond with the common names used by the Unicode standard. For instance, while in XML you will use encoding=“UTF-8”, in Java the same encoding has the name “UTF8”.
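The prolog-based detection described in this section can be approximated in a few lines. The sketch below is written in Python rather than Java and is not oXygen's implementation: the helper names are hypothetical, the regular expression only handles ASCII-compatible encodings (not UTF-16), and Python's codecs registry stands in for the Java name mapping to show that one encoding can be known under several names.

```python
import codecs
import re

# Matches the encoding pseudo-attribute of an XML declaration at file start.
_DECLARATION = re.compile(
    rb"^<\?xml[^>]*?encoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']")


def declared_encoding(data, default="UTF-8"):
    """Return the encoding named in the prolog, or the default if none is declared."""
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]  # skip a UTF-8 byte order mark
    match = _DECLARATION.match(data)
    return match.group(1).decode("ascii") if match else default


def canonical_name(name):
    """Map an alias such as "UTF8" to the name used by the codec registry."""
    return codecs.lookup(name).name


if __name__ == "__main__":
    sample = b'<?xml version="1.0" encoding="ISO-8859-1"?><doc/>'
    print(declared_encoding(sample))
    print(canonical_name("UTF8"), canonical_name("UTF-8"))
```

Both "UTF8" and "UTF-8" resolve to the same codec here, which mirrors the point of the note: the name written in the prolog and the name used by the platform need not be spelled the same way.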

Streamline with Code-Insight

oXygen's Intelligent Code-Insight feature is an editing assistant that enables rapid, in line identification and insertion of structured language elements, attributes and in some cases their parameter options.

The Code-Insight assistant is displayed automatically whenever the < character is entered into a document, or when Ctrl+Space is pressed on a partial element or attribute name. Moving the focus to highlight an element and pressing the Enter key inserts both the start and end tags of the highlighted element into the document. If the Add Required Elements feature of Code-Insight is enabled, all the elements that the new element must contain, as specified in the DTD or XML Schema, are inserted automatically in the document.

After insertion, the cursor is positioned directly before the > character of the start tag if the element has attributes, to enable rapid insertion of any attribute supported by the element, or after the > character of the start tag if the element has no attributes.

Pressing the space bar directly after element insertion displays the assistant again, this time listing the attributes supported by that element. If an attribute supports a fixed set of parameters, the assistant displays the list of valid parameters. If the parameter setting is user defined and therefore variable, the assistant closes to enable manual insertion.

Figure 4.11. Code-Insight Assistant

The content of the Code-Insight assistant depends on the element structure defined in the associated DTD or XML Schema file.

The number and type of elements displayed by the assistant is sensitive to the current position of the cursor in the structured document. The child elements displayed within a given element are defined by the structure of the specified DTD or XML Schema. All elements that are not child elements of the current element are filtered out. The same filtering applies to element attributes and their parameters.
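The context-sensitive filtering can be pictured as a lookup in a table of allowed children. The Python sketch below is only a conceptual model: the CONTENT_MODEL table is a hand-written, hypothetical fragment rather than something read from a real DTD, and the suggest function is not part of oXygen.

```python
# Hypothetical content model: element name -> child elements it may contain.
CONTENT_MODEL = {
    "book": ["title", "chapter"],
    "chapter": ["title", "para", "sect1"],
    "sect1": ["title", "para", "example"],
}


def suggest(parent, typed=""):
    """List the child elements valid inside `parent` that start with `typed`."""
    children = CONTENT_MODEL.get(parent, [])
    return [name for name in children if name.startswith(typed)]


if __name__ == "__main__":
    print(suggest("chapter"))       # everything allowed inside <chapter>
    print(suggest("chapter", "s"))  # narrowed by a partially typed name
```

Elements that are not listed for the current parent never appear in the result, which is the filtering behavior described above.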

The DTD or XML Schema used to populate the Code-Insight assistant is determined by the following methods, in order of precedence:

  • From the file specified in the external subset of the document prolog. In this case oXygen reads the prolog and resolves the location of the DTD or XML Schema.

  • From the file specified in the oXygen Code-Insight dialog. oXygen will read the Code-Insight settings when the prolog fails to provide or resolve the location of a DTD or XML Schema.

Creating DTDs

When working with documents that do not specify a DTD, or for which the DTD is not known or does not exist, oXygen is able to learn the document structure and translate it to a DTD, which in turn can be saved to a file. In addition to being useful for quick creation of a DTD that can provide an initialization source for the Code-Insight assistant, this feature can also be used to produce DTDs for documents containing personal or custom element types.

Procedure 4.7. To create a DTD:

  1. Open the structured document from which a DTD will be created.

  2. Select XML->Learn Structure (Ctrl+Shift+L). oXygen learns the document structure and, when finished, displays the words “Learn Complete” in the Message Pane of the Editor Status bar.

  3. Select XML->Save Structure (Ctrl+Shift+S) to save the DTD currently stored in memory to file.

Note

The resulting DTD is only valid for documents containing the elements and structures defined by the document used as the input for creating the DTD. If new element types or structures are defined in a document, they must be added to the DTD in order for successful validation.
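To make the idea of learning a structure concrete, here is a minimal Python sketch that walks a document and emits a loose DTD from what it sees. It is an illustration under stated assumptions, not the algorithm oXygen uses: the function name learn_dtd is hypothetical, every content model is written as a repeatable choice, all attributes are declared as optional CDATA, and namespaces are not handled.

```python
import xml.etree.ElementTree as ET


def learn_dtd(xml_text):
    """Infer a permissive DTD from the elements and attributes of one document."""
    root = ET.fromstring(xml_text)
    children = {}    # element name -> child element names, in order first seen
    attributes = {}  # element name -> attribute names, in order first seen
    has_text = {}    # element name -> True if character data was seen inside it

    for elem in root.iter():
        kids = children.setdefault(elem.tag, [])
        attrs = attributes.setdefault(elem.tag, [])
        text_seen = bool(elem.text and elem.text.strip())
        for child in elem:
            if child.tag not in kids:
                kids.append(child.tag)
            if child.tail and child.tail.strip():
                text_seen = True  # text between child elements (mixed content)
        for name in elem.attrib:
            if name not in attrs:
                attrs.append(name)
        has_text[elem.tag] = has_text.get(elem.tag, False) or text_seen

    lines = []
    for name, kids in children.items():
        if kids and has_text[name]:
            model = "(#PCDATA|" + "|".join(kids) + ")*"
        elif kids:
            model = "(" + "|".join(kids) + ")*"
        elif has_text[name]:
            model = "(#PCDATA)"
        else:
            model = "EMPTY"
        lines.append(f"<!ELEMENT {name} {model}>")
        for attr in attributes[name]:
            lines.append(f"<!ATTLIST {name} {attr} CDATA #IMPLIED>")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = ('<book id="b1"><title>T</title><chapter><title>C</title>'
              '<para>text <emphasis>x</emphasis></para></chapter><index/></book>')
    print(learn_dtd(sample))
```

As the note above says, a DTD obtained this way only describes what the input document happened to contain; an element or attribute that never appeared in the sample is simply absent from the result.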

Working with XML Catalogs

When Internet access is not available, one or more XML catalogs can be added to the list in the dialog below so that local copies of the DTD and/or XML Schema files are used. When you add an XML catalog to, or delete one from, the list of XML catalogs in the Options -> Preferences -> XML Catalog pane, you should restart the application so that the changes take effect.

The Prefer option is used to specify whether oXygen will try to resolve first the PUBLIC or SYSTEM reference using the specified XML catalogs. If a PUBLIC reference is not mapped in any of the catalogs then a SYSTEM reference is looked up. The verbosity level specifies the types of output messages displayed and can have one of the values: debug, warn, info, error and fatal.
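The lookup order implied by the Prefer option can be sketched as follows. This Python example follows only the simplified description given in this section; the OASIS XML Catalogs specification defines more detailed rules, and the function name, the dictionary layout and the local paths are all assumptions made for illustration.

```python
def resolve(public_id, system_id, catalog, prefer="public"):
    """Return a local copy for a reference, or the original system identifier.

    catalog is assumed to look like:
        {"public": {public id: local path}, "system": {system id: local path}}
    """
    order = ["public", "system"] if prefer == "public" else ["system", "public"]
    identifiers = {"public": public_id, "system": system_id}
    for kind in order:
        identifier = identifiers[kind]
        if identifier and identifier in catalog.get(kind, {}):
            return catalog[kind][identifier]
    return system_id  # nothing mapped: fall back to the remote location


if __name__ == "__main__":
    public = "-//OASIS//DTD DocBook XML V4.2//EN"
    system = "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
    catalog = {"public": {public: "/local/dtd/docbookx.dtd"}, "system": {}}
    print(resolve(public, system, catalog))
```

If neither identifier is mapped, the sketch returns the original system identifier, which corresponds to fetching the file from its remote location.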

Formatting and Indenting Documents (Pretty Print)

In structured markup languages, the whitespace between elements that is created with the Space bar, the Tab key or repeated use of the Enter key is not recognized by the parsing tools. Often this means that when structured markup documents are opened, they are arranged as one long, unbroken line that seems to be a single paragraph.

While this is perfectly acceptable practice, it makes editing difficult and increases the likelihood of errors being introduced. It also makes the identification of exact error positions difficult. Formatting and Indenting, also called “Pretty Print”, enables such documents to be neatly arranged, in a manner that is consistent and promotes easier reading on screen and in print output.

Pretty print is in no way associated with the layout or formatting that will be used in the transformed document. This layout and formatting is supplied by the XSL style sheet specified at time of transformation.

Procedure 4.8. To format and indent a document:

  1. Open or focus on the document that is to be formatted and indented.

  2. Select XML->Format and Indent (Ctrl+Shift+P). While in progress, the Message Pane of the Editor Status bar will indicate “Pretty print in progress”. On completion, this will read “Pretty print successful” and the document will be arranged.

Note

Pretty Print automatically converts elements that contain no content from closed syntax to unclosed syntax.

Pretty Print requires that the structured document is “Well Formed”. If the document is not “Well Formed”, an error message is displayed. The message will usually indicate that a problem has been found in the form and will hint at the problem type. It will not highlight the general position of the error; to do this, run the “Well Formed” function by selecting XML->Check XML Form (Ctrl+Shift+W).
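The same two behaviors, indentation of a single-line document and refusal to format input that is not well formed, can be reproduced with Python's standard xml.dom.minidom module. This is a sketch for comparison only: the function name pretty_print and the sample data are assumptions, and the output layout is minidom's, not necessarily oXygen's.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def pretty_print(xml_text, indent="  "):
    """Return an indented copy of a well-formed document, else raise ValueError."""
    try:
        document = minidom.parseString(xml_text)
    except ExpatError as error:
        # The parser stops at the first well-formedness problem it meets.
        raise ValueError(f"document is not well formed: {error}") from error
    return document.toprettyxml(indent=indent)


if __name__ == "__main__":
    one_line = ("<catalog><cd><title>Empire Burlesque</title>"
                "<price>10.90</price></cd></catalog>")
    print(pretty_print(one_line))
```

Note that minidom works best on input that has no whitespace between elements, as in the single-line sample; input that is already indented gains extra blank lines.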

Using XPath Expressions

XPath is a language for addressing specific parts of an XML document. XPath, like the Document Object Model (DOM), models an XML document as a tree of nodes. An XPath expression is a mechanism for navigating through and selecting nodes from the XML document. An XPath expression is in a way analogous to a Structured Query Language (SQL) query used to select records from a database.

There are different types of nodes, including element nodes, attribute nodes and text nodes. XPath defines a way to compute a string-value for each type of node.

XPath defines a library of standard functions for working with strings, numbers and Boolean expressions.

Examples:

child::* Selects all element children of the context node (the children of the root node when evaluated from the root).

.//name Selects all elements named “name” that are descendants of the current node.

/catalog/cd[price>10.80] Selects all the cd elements that have a price element with a value larger than 10.80.
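The three expressions can be tried on a small sample with Python's standard xml.etree.ElementTree module. This is a sketch under two assumptions: the catalog data is invented for the example, and because ElementTree implements only a subset of XPath (it has no numeric comparison predicates), the price test is written as an ordinary Python filter rather than as an XPath predicate.

```python
import xml.etree.ElementTree as ET

# Invented sample data, shaped to fit the three expressions above.
CATALOG = """<catalog>
  <cd><title>Empire Burlesque</title><artist><name>Bob Dylan</name></artist><price>10.90</price></cd>
  <cd><title>Hide Your Heart</title><artist><name>Bonnie Tyler</name></artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)

# child::*  (abbreviated form: *), evaluated with <catalog> as context node
children = root.findall("*")

# .//name  (all <name> descendants of the context node)
names = root.findall(".//name")

# /catalog/cd[price>10.80], emulated with a filter over the <cd> children
expensive = [cd for cd in root.findall("cd")
             if float(cd.findtext("price", "0")) > 10.80]

if __name__ == "__main__":
    print([child.tag for child in children])
    print([name.text for name in names])
    print([cd.findtext("title") for cd in expensive])
```

A full XPath 1.0 engine would accept the third expression directly; the filter above only approximates it for this sample.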

Using XPath effectively requires at least an understanding of the XPath Core Function Library. Once you have this knowledge, the oXygen XPath expression field, part of the Editor toolbar, can be used to aid you in XML document development.

To find out more about XPath, the following URL is recommended: http://www.w3.org/TR/xpath

In oXygen the results of an XPath query are returned in the Message Panel. Clicking a record in the result list highlights the nodes within the editing panel.

Results are returned in a format that itself is a valid XPath expression:

- [FileName.xml] /node[value]/node[value]/node[value] -

Example 4.1. XPath Utilization with DocBook DTD

Our example is taken from a DocBook book based on the DocBook XML DTD. The book contains a number of chapters. DocBook defines a chapter as having a <chapter> start tag and a matching </chapter> end tag to close the element. To return all the chapter nodes of the book, enter //chapter into the XPath expression field, then press Enter. This returns all the chapter nodes of the DocBook book in the Message Panel. If your book has six chapters, there will be six records in the result list. Each record, when clicked, locates and highlights the chapter and all the nodes contained between the start and end tags of the chapter.

If we used XPath to query for all example nodes contained in the section 2 nodes of a DocBook XML document, we would use the following XPath expression: //chapter/sect1/sect2/example. If an example node is found in any section 2 node, a result will be returned to the Message Panel. For each occurrence of the element node a record will be created in the result list.

In our example an XPath query on the file oxygen.xml determined that:

- [oxygen.xml] /chapter[1]/sect1[3]/sect2[7]/example[1]

Which means:

In the file oxygen.xml, first chapter, third section level 1, seventh section level 2, the example node found is the first in the section.
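The positional form of such a result can be derived by counting same-named siblings on the way down the tree. The following Python sketch shows one way to compute it; the function name is hypothetical, the sample document is invented, and the paths start at the document element, so the exact shape oXygen prints may differ.

```python
import xml.etree.ElementTree as ET


def positional_paths(root, tag):
    """Return a path such as /book[1]/chapter[2]/example[1] for every `tag` element."""
    results = []

    def walk(elem, path):
        if elem.tag == tag:
            results.append(path)
        counts = {}  # how many children with each name have been seen so far
        for child in elem:
            counts[child.tag] = counts.get(child.tag, 0) + 1
            walk(child, f"{path}/{child.tag}[{counts[child.tag]}]")

    walk(root, f"/{root.tag}[1]")
    return results


if __name__ == "__main__":
    book = ET.fromstring(
        "<book><chapter><sect1><example/></sect1>"
        "<sect1><para/><example/><example/></sect1></chapter>"
        "<chapter><sect1><example/></sect1></chapter></book>")
    for path in positional_paths(book, "example"):
        print(path)
```

Each index counts only siblings with the same element name, which is why a path like this is itself a valid XPath expression that selects exactly one node.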

Tip

If your project consists of a main file with ENTITY references to other files, you can use XPath to return all the name elements of a certain type by querying the main file. The query covers all referenced files.

Using Check Spelling

The Check Spelling option enables you to check the spelling of the current document:

Figure 4.12. Check Spelling Dialog

Complete the dialog as follows:

Unrecognized Word

Contains the word that cannot be found in any of the existing dictionaries. The word is also highlighted in the XML document.

Replace with

The character string suggested as a replacement for the unrecognized word.

Guess

Displays a list of words suggested to replace the unknown word. Double clicking a word in this list automatically inserts it in the document and continues the spell checking process.

Dictionary

Displays a list with the available dictionaries.

Replace

Replaces the currently highlighted word in the XML document, with the selected word in the Guess list box.

Replace All

Replaces all occurrences of the currently highlighted word in the XML document, with the selected word in the Guess list box.

Ignore

Allows you to continue checking the document while ignoring the first occurrence of the unknown word. The same word will be flagged again if it appears in the document.

Ignore all

Ignores all instances of the unknown word in the whole document.

Learn

Includes the unrecognized word in the list of valid words so that the spell checker will no longer consider it for correction.

Options

Sets the configuration options of the Spell Checker.

OK

Closes the Spell Checker dialog.

Figure 4.13. Options Dialog

The Options dialog contains the global check spelling options:

Case sensitive

When checked, operations ignore capitalization errors.

Ignore mixed case words

When checked, operations do not check words containing case mixing (e.g. "SpellChecker").

Ignore word with digits

When checked, the Spell Checker does not check words containing digits (e.g. "b2b").

Ignore Duplicates

When checked, the Spell Checker does not signal two successive identical words as an error.

Ignore URL

When checked, ignores words that look like URLs or file names (e.g. "www.xxx.com" or "c:\boot.ini").

Check punctuation

When checked, punctuation checking is enabled: misplaced white space and wrong sequences, like a dot following a comma, are detected.

Enable auto replace

Enables the "Replace Always" feature.

Allow compounds words

When checked, all words formed by concatenating two legal words with a hyphen are accepted. If the language allows it, two words concatenated without a hyphen are also accepted.

Allow general prefixes

When checked, a word formed by concatenating a registered prefix and a legal word is accepted. For example, if "mini-" is a registered prefix, "mini-computer" is accepted.

Allow file extensions

When checked, accepts any word ending with registered file extensions (e.g. "myfile.txt", "index.html" etc.).

Suggestion

This option indicates the type of spell checker accuracy, which may be: "Favour speed over quality", "Normal" and "Favour quality over speed".

Using Search and Replace

The Search and Replace option enables you to perform the following operations on the current document:

  • find occurrences of a word or string of characters including white spaces and highlight the position in the editor.

  • replace occurrences of target defined in the “Find” field with a word or string of characters, including white spaces, defined in the “Replace” field.

  • find all occurrences of a word or string of characters including white spaces and return a result list to the Message Panel.

  • replace all occurrences of a word or string of characters including white spaces.

Figure 4.14. Find/Replace Dialog

Complete the dialog as follows:

Text to find

The target character string to search for.

Replace with

The character string with which to replace the target.

Find

Executes a find operation for the next occurrence of the target and stops.

Replace

Executes a replace operation for the target and stops.

Find all

Executes a find operation and returns all results to the Message Panel.

Case sensitive

When checked, operations are case sensitive.

Whole words only

When checked, only whole occurrences of a word are included in the operation.

Find in tags

When checked, the operation includes the content of the start and end tags of the XML elements.

Regular Expression

When checked, allows the use of any regular expression in PERL syntax.

Search from file start

Commences the operation from start of file, position 0:0.
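The way the check boxes combine can be modeled by building a single regular expression from them. The Python sketch below is an illustration only: the function and parameter names are hypothetical, Python's re syntax is close to but not identical with PERL syntax, and the Find in tags option is not modeled. Positions are reported as zero-based line:column pairs, so the start of the file is 0:0.

```python
import re


def build_pattern(text, case_sensitive=False, whole_words=False, regex=False):
    """Turn the dialog options into one compiled regular expression."""
    source = text if regex else re.escape(text)  # literal search unless regex is set
    if whole_words:
        source = rf"\b(?:{source})\b"
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(source, flags)


def find_all(document, pattern):
    """Return a zero-based (line, column) pair for every match in the document."""
    hits = []
    for match in pattern.finditer(document):
        start = match.start()
        line = document.count("\n", 0, start)
        column = start - (document.rfind("\n", 0, start) + 1)
        hits.append((line, column))
    return hits


if __name__ == "__main__":
    text = "Chapter one\nthe chapter and chapters\n"
    print(find_all(text, build_pattern("chapter")))
    print(find_all(text, build_pattern("chapter", whole_words=True)))
    print(find_all(text, build_pattern("chapter", case_sensitive=True, whole_words=True)))
```

Each additional option narrows the result list, which is the behavior to expect from the corresponding check boxes in the dialog.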

Using Search and Replace in Files Dialog

The Search and Replace in Files option enables you to perform the same operations as the Search/Replace option on any number of files located in a given path.

Figure 4.15. Search/Replace in Files

Complete the dialog as follows:

Text to Find

The target character string to search for.

Case Sensitive

When checked, operations are case sensitive.

Whole words only

When checked, only whole occurrences of a word are included in the operation.

Find in tags

When checked, the operation includes the content of the start and end tags of the XML elements.

Regular Expression

When checked, allows the use of any regular expression in PERL syntax.

Replace with

The character string with which to replace the target.

Make Backups with extension

During the replace process, oXygen makes backup copies of the modified files. The default extension is *bak, but you can change the extension as you prefer.

Specified Path

Choose the search path.

Path of current file

Use the path of the current file.

Project Files (File Filter)

Search the files from the current project using the specified file filter.

Find All

Executes a find operation and returns the result list to the Message Pane.

Replace All

Replaces all occurrences of the target contained in the specified files.

Use this option with caution.

Global search and replace across all project files does not open the files containing the targets, nor does it prompt on a per occurrence basis, to confirm that a replace operation must be performed. As the operation simply matches the string defined in the find field, this may result in replacement of matching strings that were not originally intended to be replaced.
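The combination of an unconditional replace with backup copies can be sketched in Python as follows. This is not oXygen's implementation: the function name, the parameters and the ".bak" backup extension are assumptions, matching is a plain literal string match, and files are assumed to be UTF-8 encoded.

```python
import shutil
from pathlib import Path


def replace_in_files(folder, file_filter, target, replacement, backup_ext=".bak"):
    """Replace `target` in every matching file, keeping a backup of each changed file."""
    changed = []
    for path in sorted(Path(folder).glob(file_filter)):
        if not path.is_file():
            continue
        original = path.read_text(encoding="utf-8")
        if target not in original:
            continue  # untouched files get no backup
        # Keep a copy of the file as it was before the replace operation.
        shutil.copy2(path, path.with_name(path.name + backup_ext))
        path.write_text(original.replace(target, replacement), encoding="utf-8")
        changed.append(path.name)
    return changed
```

A call such as replace_in_files("docs", "*.xml", "colour", "color") would rewrite every matching file in one pass without asking for confirmation, which is exactly why the caution above applies; the backup copies are the only way back.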

Debugging Your Documents

The W3C XML specification states that a program should not continue to process an XML document if it finds a well-formedness error. The reason is that XML software should be easy to write, and that all XML documents should be compatible. With HTML it was possible to create documents with lots of errors (for example, when you forget an end tag). One of the main reasons that HTML browsers are so big and incompatible is that each has its own way of figuring out what a document should look like when it encounters an HTML error. With XML this should not be possible.

However, when creating an XML document, errors are very easily introduced. When working with large projects or many files, the probability that errors will occur is even greater. Determining that your project is error free can be time consuming and even frustrating. For this reason oXygen provides functions that enable easy error identification and rapid error location.

Checking XML Form

XML with correct syntax is “Well Formed XML”. A “Well Formed XML” document is a document that conforms to the XML syntax rules, which include:

  • All XML elements must have a closing tag.

  • XML tags are case sensitive.

  • All XML elements must be properly nested.

  • All XML documents must have a root element.

  • Attribute values must always be quoted.

  • With XML, white space is preserved.

Using the Check XML Form function checks your project for any deviation from these rules. If any error is found the result is returned to the Message Panel. Each error is one record in the Result List and is accompanied by an error message. Clicking the record will open the document containing the error and highlight the approximate location.

Example 4.2. Check XML Form Error Message

In our example we will use the case where an end tag is missing from a DocBook listitem element. In this case running Check XML Form will return the following error.

F The element type “listitem” must be terminated by the matching end-tag “</listitem>”.

To resolve the error, click the record in the result list, which will locate and highlight the error's approximate position. Review the “listitem” elements, identify which one is missing an end tag and insert </listitem>.
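A comparable check can be run with any XML parser. The Python sketch below uses the standard xml.etree.ElementTree module to report the first well-formedness error together with its position; the function name is hypothetical, and the wording of the message comes from Python's parser, so it differs from the message oXygen displays.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None for well-formed input, else (line, column, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        line, column = error.position  # where the parser gave up
        return (line, column, str(error))
    return None


if __name__ == "__main__":
    broken = ("<itemizedlist>\n"
              "<listitem><para>one</para>\n"
              "</itemizedlist>")
    print(check_well_formed(broken))
```

As in oXygen, the reported position is where the parser noticed the problem (the end tag of the enclosing list), not where the missing </listitem> should have been, so the location is only approximate.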

Validating Documents

A “Valid” XML document is a “Well Formed” XML document, which also conforms to the rules of a Document Type Definition (DTD) or XML Schema, which defines the legal elements of an XML document.

The purpose of a DTD is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements.

The oXygen Validate XML function ensures that your document is compliant with the rules defined by an associated DTD or XML Schema.

Example 4.3. Validate XML Error Message

In our example we will use the case where a DocBook listitem element does not match the rules of the docbookx.dtd. In this case running Validate XML will return the following error.

E The content of element type “listitem” must match
“(calloutlist|glosslist|itemizedlist|orderedlist|segmentedlist|
simplelist|variablelist| caution|important|note|tip|warning|
literallayout|programlisting|programlistingco|screen|
screenco|screenshot|synopsis|cmdsynopsis| funcsynopsis|classsynopsis|fieldsynopsis|
constructorsynopsis| destructorsynopsis|methodsynopsis|formalpara|para|simpara|
address|blockquote|graphic|graphicco|mediaobject| mediaobjectco|informalequation|
informalexample| informalfigure|informaltable|equation|example|
figure|table|msgset|procedure|sidebar|qandaset|anchor|
bridgehead|remark|highlights|abstract|authorblurb|epigraph| indexterm|beginpage)+”.

As you can see, this error message is a little more difficult to understand, so an understanding of the syntax and processing rules for the DocBook XML DTD's “listitem” element is required. However, the error message does give us a clue as to the source of the problem, by indicating that “The content of element type ‘listitem’ must match”.

Luckily, most standards-based DTDs and XML Schemas are supplied with reference documentation. This enables us to look up the element and read about it. In this case we would want to learn about the child elements of “listitem” and their nesting rules. Once we have correctly inserted the required child element and nested it in accordance with the XML rules, the document will become valid on the next validation test.
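To see what the content model in the error message demands, the rule can be approximated by hand. The Python sketch below is not a DTD validator (the Python standard library does not include one): it checks only the listitem rule, the ALLOWED_IN_LISTITEM set is a partial subset of the names listed in the message above, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Partial, illustrative subset of the children named in the error message.
ALLOWED_IN_LISTITEM = {
    "para", "simpara", "formalpara", "itemizedlist", "orderedlist",
    "note", "tip", "warning", "programlisting", "example",
}


def check_listitems(xml_text):
    """Return (position, problem) pairs for listitem elements that break the rule."""
    problems = []
    root = ET.fromstring(xml_text)
    for position, item in enumerate(root.iter("listitem"), start=1):
        children = [child.tag for child in item]
        unexpected = [tag for tag in children if tag not in ALLOWED_IN_LISTITEM]
        if not children:
            # The trailing + in the content model requires at least one child.
            problems.append((position, "needs at least one child element"))
        elif unexpected:
            problems.append((position, "unexpected child: " + ", ".join(unexpected)))
    return problems


if __name__ == "__main__":
    sample = ("<itemizedlist><listitem>bare text</listitem>"
              "<listitem><para>ok</para></listitem>"
              "<listitem><title>x</title></listitem></itemizedlist>")
    print(check_listitems(sample))
```

The first listitem fails because text placed directly inside it does not count as a child element; wrapping the text in a para element, as in the second item, satisfies the rule.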