The easiest way to get a DOM tree is to have it built for you. PyXML
offers two alternative implementations of the DOM, xml.dom.minidom
and 4DOM. xml.dom.minidom is included in Python 2. It is a minimal
implementation, which means it does not provide all of the interfaces
and operations required by the DOM standard. 4DOM (XXX reference) is
a complete implementation of DOM Level 2 (which is currently a work in
progress), so we will use 4DOM in the examples.
One of the modules in the xml.dom package is xml.dom.ext.reader.Sax2, which provides the functions FromXmlStream, FromXml, FromXmlFile, and FromXmlUrl. Each constructs a DOM tree from its input: a file-like object, a string, a file name, and a URL, respectively. They all return a DOM Document object.
import sys
from xml.dom.ext.reader.Sax2 import FromXmlStream
from xml.dom.ext import PrettyPrint

# parse the document read from standard input
doc = FromXmlStream(sys.stdin)
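If PyXML is not installed, the same idea can be tried with xml.dom.minidom from the standard library. The following is only a rough sketch, not the 4DOM API: the helper names from_stream, from_string, and from_file are hypothetical wrappers invented here to mirror the stream, string, and file-name inputs described above, and the sample document is made up. The URL case is left out so that the example needs no network access, and minidom's toprettyxml is used as an approximate stand-in for PrettyPrint.

```python
import io
import os
import tempfile
from xml.dom import minidom

# Made-up sample document used only for this illustration.
XML_TEXT = "<greeting lang='en'><to>World</to></greeting>"


def from_stream(stream):
    """Hypothetical helper: build a Document from a file-like object."""
    return minidom.parse(stream)


def from_string(text):
    """Hypothetical helper: build a Document from a string."""
    return minidom.parseString(text)


def from_file(path):
    """Hypothetical helper: build a Document from a file name."""
    return minidom.parse(path)


# A file-like object (an in-memory byte stream here, rather than sys.stdin).
doc_from_stream = from_stream(io.BytesIO(XML_TEXT.encode("utf-8")))

# A string.
doc_from_string = from_string(XML_TEXT)

# A file name; a temporary file keeps the example self-contained.
fd, path = tempfile.mkstemp(suffix=".xml")
try:
    with os.fdopen(fd, "w") as handle:
        handle.write(XML_TEXT)
    doc_from_file = from_file(path)
finally:
    os.remove(path)

# Each call returns a Document node; its root element is documentElement.
root = doc_from_string.documentElement
print(root.tagName)
print(root.getAttribute("lang"))

# Serialize the tree with indentation for inspection.
print(doc_from_stream.toprettyxml(indent="  "))
```

All three helpers return a Document object, so code that only walks the tree through standard DOM attributes such as documentElement does not need to know which input form was used. Because minidom is a minimal implementation, code that depends on the fuller DOM Level 2 interfaces may still need 4DOM.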