4.1 Getting A DOM Tree

The easiest way to get a DOM tree is to have it built for you. PyXML offers two alternative implementations of the DOM: xml.dom.minidom and 4DOM. xml.dom.minidom is also included in Python 2. It is a minimal implementation, meaning that it does not provide all of the interfaces and operations required by the DOM standard. 4DOM (XXX reference) is a complete implementation of DOM Level 2 (which is currently work in progress), so the examples here use 4DOM.

The xml.dom package includes the module xml.dom.ext.reader.Sax2, which provides the functions FromXmlStream, FromXml, FromXmlFile, and FromXmlUrl. Each constructs a DOM tree from its input (a file-like object, a string, a file name, and a URL, respectively) and returns a DOM Document object.

import sys
from xml.dom.ext.reader.Sax2 import FromXmlStream
from xml.dom.ext import PrettyPrint

# parse the document
doc = FromXmlStream(sys.stdin)
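
Where PyXML is not installed, a comparable tree can be obtained from xml.dom.minidom, the standard-library implementation mentioned above. The following is only a sketch for comparison, not part of 4DOM: the sample document and the variable names are invented for illustration, and minidom's parseString and parse are used as rough counterparts of FromXml and FromXmlStream.

```python
import io
from xml.dom import minidom

# Hypothetical sample document, invented for this illustration.
SAMPLE = "<greeting lang='en'>Hello, <b>DOM</b>!</greeting>"

# Build a tree from a string (roughly what FromXml does in 4DOM).
doc_from_string = minidom.parseString(SAMPLE)

# Build a tree from a file-like object (roughly what FromXmlStream does).
doc_from_stream = minidom.parse(io.BytesIO(SAMPLE.encode("utf-8")))

# Both calls return a Document node; documentElement is its root element.
root = doc_from_string.documentElement
print(root.tagName)               # greeting
print(root.getAttribute("lang"))  # en
```

Because minidom is a minimal implementation, code that relies on interfaces it does not provide would still need 4DOM.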