Microsoft XML Parser in Java
Release Notes for Version 1.6


November 6, 1997

* Breaking Changes
* XML Language Features
* Object Model Changes
* Bug Fixes

Please check the Release 1.8 Release Notes for changes in the latest version of the XML Parser.


Breaking Changes

The XML language recently became case-sensitive, and the parser now enforces case sensitivity by default. This is a breaking change from version 1.0 of the parser. For backward compatibility, the object model provides a switch that sets the parser back to case-insensitive matching, as follows:

    Document d = new Document();
    d.setCaseInsensitive(true);
    d.load("http://www.foo.com/example.xml");

The other potentially breaking change is the introduction of ignorable white-space nodes, as defined in the W3C DOM Specification. This makes Element.getChild(index) unreliable, because white-space nodes may or may not be present and shift the index. The following two examples, although semantically identical, result in two different object models:

Example 1:

    <ROOT>
    <FOO/>
    </ROOT>

    DOCUMENT
    +---ELEMENT ROOT
        |---WHITESPACE 0xd 0xa 0x9
        |---ELEMENT FOO
        +---WHITESPACE 0xd 0xa

Example 2:

    <ROOT><FOO/></ROOT>

    DOCUMENT
    +---ELEMENT ROOT
        +---ELEMENT FOO

This means that if you hold a reference to the ROOT element, getChild(0) will not always return the FOO element. A more reliable way to get the FOO element is as follows:

    Element root = document.getRoot();
    Element foo = root.getChildren().getChild(0);

This works because the default ElementCollection returned from getChildren() automatically filters out the white-space nodes.
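The effect of white-space nodes on child indexes can be reproduced outside this parser. The following is a minimal sketch that uses the standard JDK DOM API (javax.xml.parsers and org.w3c.dom), not the object model described in these notes. The class name WhitespaceChildren and the helpers parseRoot and elementChildren are hypothetical names chosen for illustration. The helper elementChildren plays a role similar to the filtering that getChildren() is described as doing.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Illustration only: uses the JDK DOM, not the parser from these notes.
final class WhitespaceChildren {

    // Parse an XML string and return its root element.
    static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Return only the element children, skipping white-space text nodes
    // and any other non-element nodes.
    static List<Element> elementChildren(Element parent) {
        List<Element> result = new ArrayList<>();
        NodeList nodes = parent.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) node);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Element spaced = parseRoot("<ROOT>\r\n\t<FOO/>\r\n</ROOT>");
        Element compact = parseRoot("<ROOT><FOO/></ROOT>");

        // Raw child counts differ because of the white-space text nodes.
        System.out.println("spaced raw children:  " + spaced.getChildNodes().getLength());
        System.out.println("compact raw children: " + compact.getChildNodes().getLength());

        // The filtered view gives the same answer for both documents.
        System.out.println("spaced first element:  " + elementChildren(spaced).get(0).getTagName());
        System.out.println("compact first element: " + elementChildren(compact).get(0).getTagName());
    }
}
```

In this sketch the raw child list of the first document contains white-space text nodes around FOO, while the filtered list contains only FOO in both cases, so index 0 is stable.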

XML Language Features

Feature: Details

Conditional sections: Supported in the DTD (INCLUDE and IGNORE keywords).

Namespaces: See the separate XML Namespaces document.

XML encoding: Support for the encoding attribute on the <?XML ...?> tag was added. The encodings that are actually supported depend on the Java Virtual Machine installed on your computer. Under Internet Explorer 3.02, only ISO-10646-UCS-2 and ASCII are supported. Under the final release of Internet Explorer 4.0, UTF-8, ISO-10646-UCS-2, Shift_JIS, Big5, and ISO-8859-1 are supported. The parser also supports little-endian and big-endian storage formats and preserves the original byte order when the document is saved.

XML-SPACE: Implemented according to the XML specification. By default, the parser normalizes white space.

RMD: The RMD attribute on the <?XML?> tag is now supported with the possible values NONE, INTERNAL, and ALL. Whichever value is used, well-formedness is still checked in the internal subset.

Floating ampersand: The parser can now parse the text "this & that", because the ampersand is not followed by a valid name character. This makes it possible to parse existing CDF files.
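The floating-ampersand rule above can be sketched as a one-character lookahead. This is not the parser's actual code. The class FloatingAmpersand and the method startsReference are hypothetical, and the test for a "valid name character" is deliberately simplified to a letter, an underscore, or a colon. Treating '#' as the start of a character reference is also an assumption of this sketch.

```java
// Illustration only: a simplified lookahead, not the parser's implementation.
final class FloatingAmpersand {

    // Returns true if the '&' at ampIndex looks like the start of an entity
    // or character reference, and false if it should be kept as literal text.
    static boolean startsReference(String text, int ampIndex) {
        if (ampIndex < 0 || ampIndex >= text.length() || text.charAt(ampIndex) != '&') {
            throw new IllegalArgumentException("No ampersand at index " + ampIndex);
        }
        int next = ampIndex + 1;
        if (next >= text.length()) {
            return false; // trailing ampersand, nothing can follow it
        }
        char c = text.charAt(next);
        // Simplified name-start check (assumption), plus '#' for character references.
        return Character.isLetter(c) || c == '_' || c == ':' || c == '#';
    }

    public static void main(String[] args) {
        String floating = "this & that";
        String entity = "this &amp; that";
        System.out.println(startsReference(floating, floating.indexOf('&'))); // false
        System.out.println(startsReference(entity, entity.indexOf('&')));     // true
    }
}
```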

Object Model Changes

Change: Details

Introduced ignorable white space: A new white-space node has been added that records all the white space between elements. This makes it possible to save the XML in exactly the same format as it was read.

Synchronized with C++ XML object model: Several changes were made to the object model to bring it in line with the XML object model provided by the C++ XML Parser. This makes JavaScript pages work the same regardless of whether the back-end parser is C++ or Java.

Improved document save options: A new feature selects the document save format: DEFAULT, COMPACT, or PRETTY. DEFAULT saves in the original format, COMPACT writes no white space, and PRETTY writes new lines and tabbed indenting.

Pushed Name class up to API level: The Name class automatically tokenizes commonly used names. This can save a lot of memory, and as a result it can also speed up parsing. For example, the msnbc.cdf file creates only 58 unique Names and shares them across 3522 Name references. All Name objects are created using a static method, as follows:

    Name foo = Name.create("FOO");

These names are stored in a hash table so that multiple instances of the same name share the actual Name object. This is useful for XML tags and XML entities, so the APIs in the object model now take and return Name objects instead of strings wherever applicable, which lets clients receive the same benefits.

Added method on ElementFactory: A new method was added to ElementFactory to notify the factory when an Element has been completely parsed. This lets clients that provide their own factory know when an element they have created is complete.

Created new DTD abstraction: DTD handling code was extracted from the Document class and placed in a new class called DTD. As a result, the parser no longer has direct knowledge of the Document class.

Added ElementEnumeration: This can be used to iterate over the immediate children of a given node in the tree that have a matching tag name.

Added ElementCollection: This provides a collection interface similar to the one already used in the Internet Explorer 4.0 C++-based XML object model.
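The hash-table sharing described for the Name class above is a form of interning. The following is a minimal sketch of that idea, not the parser's actual Name class. The class InternedName and its members are hypothetical, and only the pattern of a static create method backed by a hash table is taken from the description.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: a hypothetical stand-in for the Name class described above.
final class InternedName {

    // One shared table for all names; equal strings map to one object.
    private static final Map<String, InternedName> TABLE = new HashMap<>();

    private final String text;

    private InternedName(String text) {
        this.text = text;
    }

    // Return the single shared instance for this string, creating it on first use.
    static synchronized InternedName create(String text) {
        return TABLE.computeIfAbsent(text, InternedName::new);
    }

    // Number of distinct names created so far.
    static synchronized int uniqueCount() {
        return TABLE.size();
    }

    @Override
    public String toString() {
        return text;
    }

    public static void main(String[] args) {
        InternedName first = create("FOO");
        InternedName second = create("FOO");
        InternedName other = create("BAR");

        // Both requests for "FOO" yield the same object, so identity comparison works.
        System.out.println(first == second); // true
        System.out.println(first == other);  // false
    }
}
```

Because equal names share one object, callers in a design like this can compare names by reference instead of comparing strings character by character, which is one plausible source of the parsing speedup mentioned above.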

Bug Fixes

Fixed bugs in: Details

Root-level tags: The handling of root-level comments and processing instructions (Misc elements) was broken. The XML specification allows any number of Misc elements before and after the <!DOCTYPE> and root-level Element tags. This information is now also preserved in the object model so that it is saved in the same order.

DTD validation: These fixes cover bugs reported by people who used the Alpha 1.0 release of MSXML as well as new bugs found internally. Error reporting was also improved, and the saved DTD now matches the original DTD more closely.

Document save: Document save omitted some content, which made the resulting output invalid in some circumstances.

Entity handling: External parameter entities now fetch the external file and parse it (used by namespaces). Entities are now also stored as nodes in the tree, which means that they can be saved properly.


© 1998 Microsoft Corporation. All rights reserved.

 
