Microsoft XML Parser in Java
Release Notes for Version 1.6

November 6, 1997

Breaking Changes
XML Language Features
Object Model Changes
Bug Fixes

Please check the Release 1.8 Release Notes for changes in the latest version of the XML Parser.

Breaking Changes

The XML language recently changed to become case-sensitive. This is a breaking change from the 1.0 version of the parser, and case-sensitive parsing is enabled by default. For backward compatibility, the object model provides a switch that sets the parser back to case-insensitive mode, as follows:

    Document d = new Document();
    d.setCaseInsensitive(true);
    d.load("http://www.foo.com/example.xml");
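To see what case sensitivity means for existing documents, here is a minimal sketch that uses the DOM parser bundled with a current JDK (javax.xml.parsers), not the com.ms.xml classes described in these notes. The class name CaseSensitiveTags and the helper isWellFormed are made up for this illustration. It shows that a start tag and an end tag that differ only in case no longer match.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class CaseSensitiveTags {

    // Returns true if the JDK parser accepts the text as well-formed XML.
    // Hypothetical helper for illustration; not part of the MSXML Java API.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document is not well-formed
        } catch (Exception e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<ROOT></ROOT>")); // true
        System.out.println(isWellFormed("<ROOT></root>")); // false
    }
}
```

Documents that mixed tag case and loaded under version 1.0 are the ones most likely to need the setCaseInsensitive(true) switch.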

The other potentially breaking change is the introduction of ignorable white-space nodes, as defined in the W3C DOM Specification. This makes Element.getChild(index) unreliable, because white-space nodes may or may not be present and they shift the index. The following two examples, although semantically identical, produce two different object models:

Example 1

    <ROOT>
        <FOO/>
    </ROOT>

    DOCUMENT
    +---ELEMENT ROOT
        |---WHITESPACE 0xd 0xa 0x9
        |---ELEMENT FOO
        +---WHITESPACE 0xd 0xa

Example 2

    <ROOT><FOO/></ROOT>

    DOCUMENT
    +---ELEMENT ROOT
        +---ELEMENT FOO

This means that if you hold a reference to the ROOT element, getChild(0) will not always return the FOO element. A more reliable way to get the FOO element is as follows:

    Element root = document.getRoot();
    Element foo = root.getChildren().getChild(0);

This works because the default ElementCollection returned from getChildren() automatically filters out the white-space nodes.
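The same index shift can be reproduced with the standard org.w3c.dom API that ships with a current JDK. The sketch below is an illustration under that assumption, not the MSXML Java object model: WhitespaceChildren, parseRoot, and elementChildren are hypothetical names, and elementChildren only approximates the filtering that the default ElementCollection is described as doing.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class WhitespaceChildren {

    // Parse a string with the JDK's built-in DOM parser and return the root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Keep only element children, skipping white-space text nodes
    // and any other non-element node types.
    static List<Element> elementChildren(Element parent) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) child);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Element spaced = parseRoot("<ROOT>\r\n\t<FOO/>\r\n</ROOT>");
        Element compact = parseRoot("<ROOT><FOO/></ROOT>");

        System.out.println(spaced.getChildNodes().getLength());  // 3
        System.out.println(compact.getChildNodes().getLength()); // 1
        // Index 0 of the filtered list is FOO in both documents.
        System.out.println(elementChildren(spaced).get(0).getTagName()); // FOO
    }
}
```

In the spaced document the raw child at index 0 is a white-space text node, so any code that indexes into the unfiltered child list has to account for it.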

XML Language Features

Feature	Details
Conditional sections	In the DTD (INCLUDE and IGNORE keywords).
Namespaces	See separate XML Namespaces document.
XML encoding	Support for the encoding attribute on the <?XML ...?> tag was added. The encodings that are actually supported depend on the Java Virtual Machine installed on your computer. Under Internet Explorer 3.02, only ISO-10646-UCS-2 and ASCII are supported. Under the final release of Internet Explorer 4.0, UTF-8, ISO-10646-UCS-2, Shift_JIS, Big5, and ISO-8859-1 are supported. The parser also supports little-endian and big-endian storage formats and preserves the original byte order when the document is saved.
XML-SPACE	Implemented according to the XML specification. The default for the parser is to normalize white space.
RMD	The RMD attribute on the <?XML?> tag is now supported with the possible values NONE, INTERNAL, and ALL. Whichever value is used, well-formedness is still checked in the internal subset.
Floating ampersand	The parser can now parse the text "this & that" since the ampersand is not followed by a valid name character. This makes it possible to parse existing CDF files.
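The floating-ampersand rule in the table above can be sketched as a simple look-ahead test. This is a hypothetical illustration, not the parser's actual code: the class FloatingAmpersand and its methods are made-up names, the name-start test is a simplification of the XML name rules, and treating '#' as the start of a character reference is an assumption added here.

```java
public class FloatingAmpersand {

    // Simplified name-start test: a letter, '_' or ':'.
    // The XML specification allows more characters than this.
    static boolean isNameStart(char c) {
        return Character.isLetter(c) || c == '_' || c == ':';
    }

    // True when the '&' at index i cannot begin a reference: it is the last
    // character, or the next character is neither a name start nor '#'.
    static boolean isFloating(String text, int i) {
        if (text.charAt(i) != '&') {
            return false;
        }
        if (i + 1 >= text.length()) {
            return true;
        }
        char next = text.charAt(i + 1);
        return !(isNameStart(next) || next == '#');
    }

    // Rewrites every floating ampersand as &amp; and leaves everything else alone.
    static String escapeFloating(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (isFloating(text, i)) {
                out.append("&amp;");
            } else {
                out.append(text.charAt(i));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeFloating("this & that")); // this &amp; that
        System.out.println(escapeFloating("a &lt; b"));    // a &lt; b
    }
}
```

Note that under this test an ampersand followed directly by a letter, as in "AT&T", still looks like the start of an entity reference and is not treated as floating.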

Object Model Changes

Change	Details
Introduced ignorable white space	A new white-space node has been added that remembers all the white space between elements. This makes it possible to save the XML in exactly the same format as it was read.
Synchronized with C++ XML object model	Several changes were made to the object model in order to sync up with the XML object model provided by the C++ XML Parser. This makes JavaScript pages work the same regardless of whether the back-end parser is C++ or Java.
Improved document save options	New feature for selecting document save format: DEFAULT, COMPACT, or PRETTY. DEFAULT saves in original format, COMPACT has no white space, and PRETTY has new lines and tabbed indenting.
Pushed Name class up to API level	The Name class automatically tokenizes commonly used names. This can save a lot of memory and, as a result, can also speed up parsing. For example, the msnbc.cdf file creates only 58 unique Names and shares 3522 Name objects. All Name objects are created using a static method, as follows: Name foo = Name.create("FOO"); These names are stored in a hash table so that multiple instances of the same name share a single Name object. This is useful for XML tags and XML entities, so the APIs in the object model now take and return Name objects instead of strings wherever applicable, which lets clients receive the same benefits.
Added method on ElementFactory	A new method was added to ElementFactory to notify the factory when an Element has been completely parsed. This lets clients who provide their own factory know when an element that they created is complete.
Created new DTD abstraction	DTD handling code was extracted from the Document class and placed in a new class called DTD. This way the parser no longer has direct knowledge about the Document class.
Added ElementEnumeration	This can be used to iterate over the immediate children of a given node in the tree that have a matching tag name.
Added ElementCollection	This provides a collection interface similar to that already used in the Internet Explorer 4.0 C++-based XML object model.
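The sharing described for the Name class is a form of interning. The sketch below shows the general idea under stated assumptions: InternedName is a hypothetical stand-in written for this illustration, not the parser's Name class, and only the static create method mirrors the Name.create("FOO") call mentioned above.

```java
import java.util.HashMap;
import java.util.Map;

public final class InternedName {

    // One shared table: each distinct string maps to exactly one object.
    private static final Map<String, InternedName> TABLE = new HashMap<>();

    private final String text;

    private InternedName(String text) {
        this.text = text;
    }

    // Returns the existing object for this string, creating it on first use.
    public static synchronized InternedName create(String text) {
        return TABLE.computeIfAbsent(text, InternedName::new);
    }

    // Number of distinct names created so far.
    public static synchronized int uniqueCount() {
        return TABLE.size();
    }

    @Override
    public String toString() {
        return text;
    }

    public static void main(String[] args) {
        InternedName first = InternedName.create("FOO");
        InternedName second = InternedName.create("FOO");
        // Both variables refer to the same object, so an identity
        // comparison is enough to test for equal names.
        System.out.println(first == second); // true
        System.out.println(uniqueCount());   // 1
    }
}
```

Because equal names are the same object, a client can compare tag names with == instead of comparing strings character by character, which is one plausible source of the parsing speedup.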

Bug Fixes

Fixed bugs in	Details
Root-level tags	The handling of root-level comments and processing instructions (Misc elements) was broken. The XML specification allows for any number of Misc elements before and after the <!DOCTYPE> and root-level Element tags. This information is now also preserved in the object model for saving out in the same order.
DTD validation	These fixes include bugs reported by people who used the Alpha 1.0 release of MSXML and new bugs found internally. We also made improvements in error reporting and in making sure that the saved DTD looks the same as the original DTD.
Document save	Some things were missing in Document save that caused the resulting output to be invalid in some circumstances.
Entity handling	External parameter entities now fetch the external file and parse it (used by namespaces). Entities are now also stored as nodes in the tree, which means that they can now be saved properly.
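As an illustration of what preserving root-level Misc elements looks like in an object model, the following sketch lists the top-level nodes of a document using the org.w3c.dom API bundled with a current JDK. It is not the MSXML Java API; RootLevelMisc and topLevelNodeKinds are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RootLevelMisc {

    // Returns the kind of each top-level node of the document, in document order.
    static List<String> topLevelNodeKinds(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
        List<String> kinds = new ArrayList<>();
        NodeList children = doc.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            switch (children.item(i).getNodeType()) {
                case Node.COMMENT_NODE:
                    kinds.add("comment");
                    break;
                case Node.PROCESSING_INSTRUCTION_NODE:
                    kinds.add("processing-instruction");
                    break;
                case Node.ELEMENT_NODE:
                    kinds.add("element");
                    break;
                default:
                    kinds.add("other");
            }
        }
        return kinds;
    }

    public static void main(String[] args) {
        // A comment before the root element, then a comment and a
        // processing instruction after it.
        System.out.println(topLevelNodeKinds(
                "<!-- before --><ROOT/><!-- after --><?note done?>"));
    }
}
```

Keeping these nodes as ordered children of the document is what allows a save operation to write them back out in the same positions.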
