Using the Low-Level Parser Interface

The low-level parser API gives you complete flexibility to do whatever you wish with the data in an XML document. To use the low-level parser API, you define a set of callback functions that the parser invokes as it encounters specific structures in the XML document. The code in this section shows how to use the low-level parser to print the data in an XML document. A sample implementation of each callback function is shown, followed by the code that creates and runs the parser.

The code in Listing 1-6 implements the first--and by far the longest--callback function, CFXMLParserCreateXMLStructureCallBack. This example implementation prints the type and the data string of each new XML structure as it is encountered.

Listing 1-6 Implementing the CFXMLParserCreateXMLStructureCallBack function
void *createStructure(CFXMLParserRef parser, CFXMLNodeRef node, void *info)
{
    CFStringRef myTypeStr;
    CFStringRef myDataStr;
    const CFXMLDocumentInfo *docInfoPtr;

    // Use the node's type code to determine what to print.
    switch (CFXMLNodeGetTypeCode(node)) {
        case kCFXMLNodeTypeDocument:
            myTypeStr = CFSTR("Data Type ID: kCFXMLNodeTypeDocument\n");
            docInfoPtr = CFXMLNodeGetInfoPtr(node);
            myDataStr = CFStringCreateWithFormat(NULL, NULL,
                CFSTR("Document URL: %@\n"),
                CFURLGetString(docInfoPtr->sourceURL));
            break;
        case kCFXMLNodeTypeElement:
            myTypeStr = CFSTR("Data Type ID: kCFXMLNodeTypeElement\n");
            myDataStr = CFStringCreateWithFormat(NULL, NULL,
                CFSTR("Element: %@\n"), CFXMLNodeGetString(node));
            break;
        case kCFXMLNodeTypeProcessingInstruction:
            myTypeStr = CFSTR("Data Type ID: kCFXMLNodeTypeProcessingInstruction\n");
            myDataStr = CFStringCreateWithFormat(NULL, NULL,
                CFSTR("PI: %@\n"), CFXMLNodeGetString(node));
            break;
        case kCFXMLNodeTypeComment:
            myTypeStr = CFSTR("Data Type ID: kCFXMLNodeTypeComment\n");
            myDataStr = CFStringCreateWithFormat(NULL, NULL,
                CFSTR("Comment: %@\n"), CFXMLNodeGetString(node));
            break;
        case kCFXMLNodeTypeText:
            myTypeStr = CFSTR("Data Type ID: kCFXMLNodeTypeText\n");
            myDataStr = CFStringCreateWithFormat(NULL, NULL,
                CFSTR("Text:%@\n"), CFXMLNodeGetString(node));
            break;
        case kCFXMLNodeTypeCDATASection:
            myTypeStr = CFSTR("Data Type ID: kCFXMLNodeTypeCDATASection\n");
            myDataStr = CFStringCreateWithFormat(NULL, NULL,
                CFSTR("CDATA: %@\n"), CFXMLNodeGetString(node));
            break;
        case kCFXMLNodeTypeEntityReference:
            myTypeStr = CFSTR("Data Type ID: kCFXMLNodeTypeEntityReference\n");
            myDataStr = CFStringCreateWithFormat(NULL, NULL,
                CFSTR("Entity reference: %@\n"), CFXMLNodeGetString(node));
            break;
        case kCFXMLNodeTypeDocumentType:
            myTypeStr = CFSTR("Data Type ID: kCFXMLNodeTypeDocumentType\n");
            myDataStr = CFStringCreateWithFormat(NULL, NULL,
                CFSTR("DTD: %@\n"), CFXMLNodeGetString(node));
            break;
        case kCFXMLNodeTypeWhitespace:
            myTypeStr = CFSTR("Data Type ID: kCFXMLNodeTypeWhitespace\n");
            myDataStr = CFStringCreateWithFormat(NULL, NULL,
                CFSTR("Whitespace: %@\n"), CFXMLNodeGetString(node));
            break;
        default:
            myTypeStr = CFSTR("Data Type ID: UNKNOWN\n");
            // Copy the constant so that, as in the other cases, the
            // returned string is one that endStructure may release.
            myDataStr = CFStringCreateCopy(NULL, CFSTR("Unknown type.\n"));
    }

    // Print the contents.
    printf("---Create Structure Called--- \n");
    CFShow(myTypeStr);
    CFShow(myDataStr);

    // myTypeStr is a CFSTR constant that this function does not own,
    // so it is not released here.

    // Return the data string for use by the addChild and
    // endStructure callbacks.
    return (void *)myDataStr;
}

Notice that the CFXMLParserCreateXMLStructureCallBack function returns the data string it created for the newly encountered structure. This return value can be any pointer you like; the parser keeps it and passes it back to you in both the CFXMLParserAddChildCallBack and CFXMLParserEndXMLStructureCallBack functions described below. If your CFXMLParserCreateXMLStructureCallBack function returns NULL, CFXMLParserAddChildCallBack and CFXMLParserEndXMLStructureCallBack are not called for that structure. The only exception is kCFXMLNodeTypeDocument; CFXMLParserEndXMLStructureCallBack is called for the document even if you return NULL from CFXMLParserCreateXMLStructureCallBack.
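The way the returned pointer travels between the callbacks can be modeled in plain C, without Core Foundation. The following is only a sketch under stated assumptions: ModelNode, ModelCallbacks, modelParse, modelDemo, and the logging callbacks are hypothetical names invented for illustration, and modelParse merely stands in for the real parser. It assumes that the children of a skipped structure are not reported either, and it does not model the document-node exception.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model types; these are NOT the Core Foundation declarations. */
typedef struct ModelNode {
    const char *label;                /* for example "key" or "Jane Doe" */
    const struct ModelNode *children; /* array of child nodes, or NULL */
    size_t childCount;
} ModelNode;

typedef struct ModelCallbacks {
    void *(*create)(const ModelNode *node, void *info);
    void (*addChild)(void *parent, void *child, void *info);
    void (*end)(void *xmlType, void *info);
} ModelCallbacks;

typedef struct ModelLog { char text[256]; } ModelLog;

/* Drives the callbacks in the order the text describes. */
static void modelParse(const ModelNode *node, void *parentValue,
                       const ModelCallbacks *cb, void *info)
{
    assert(node != NULL && cb != NULL);
    void *value = cb->create(node, info);
    if (value == NULL) {
        /* Skipped: neither addChild nor end is called for this node.
           Assumption: its children are not reported either. */
        return;
    }
    if (parentValue != NULL) {
        cb->addChild(parentValue, value, info); /* both values came from create */
    }
    for (size_t i = 0; i < node->childCount; i++) {
        modelParse(&node->children[i], value, cb, info);
    }
    cb->end(value, info); /* same pointer that create returned */
}

/* Sample callbacks: the "value" is simply the node's label. */
static void *logCreate(const ModelNode *node, void *info)
{
    ModelLog *log = info;
    if (strcmp(node->label, "Comment") == 0) {
        return NULL; /* ask the model parser to skip comments */
    }
    size_t used = strlen(log->text);
    snprintf(log->text + used, sizeof log->text - used, "create(%s) ", node->label);
    return (void *)node->label;
}

static void logAddChild(void *parent, void *child, void *info)
{
    ModelLog *log = info;
    size_t used = strlen(log->text);
    snprintf(log->text + used, sizeof log->text - used, "add(%s<-%s) ",
             (const char *)parent, (const char *)child);
}

static void logEnd(void *xmlType, void *info)
{
    ModelLog *log = info;
    size_t used = strlen(log->text);
    snprintf(log->text + used, sizeof log->text - used, "end(%s) ",
             (const char *)xmlType);
}

/* Models <key><!-- comment -->Jane Doe</key> and copies the call log to out. */
static void modelDemo(char *out, size_t outSize)
{
    static const ModelNode children[] = {
        { "Comment", NULL, 0 },
        { "Jane Doe", NULL, 0 },
    };
    const ModelNode root = { "key", children, 2 };
    const ModelCallbacks cb = { logCreate, logAddChild, logEnd };
    ModelLog log = { "" };

    modelParse(&root, NULL, &cb, &log);
    snprintf(out, outSize, "%s", log.text);
}
```

Calling modelDemo with a 256-byte buffer leaves the log "create(key) create(Jane Doe) add(key<-Jane Doe) end(Jane Doe) end(key) " in it. The comment never appears because its create callback returned NULL, and every pointer handed to the add-child and end callbacks is one that a create callback returned earlier.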

The parser invokes the CFXMLParserAddChildCallBack when it encounters a child of the most recently parsed structure. In this example, the CFXMLParserAddChildCallBack callback shown in Listing 1-7 simply prints out both of the strings to make clear the parent-child relationships of the XML structures being parsed.

Listing 1-7 Implementing the CFXMLParserAddChildCallBack function
void addChild(CFXMLParserRef parser, void *parent, void *child, void *info)
{
    printf("---Add Child Called--- \n");
    printf("Parent being added to: ");
    CFShow((CFStringRef)parent);
    printf("Child being added: ");
    CFShow((CFStringRef)child);
}
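Printing is only one use of the parent and child pointers. Because both are whatever your create callback returned, an add-child callback can just as well link application-defined nodes into a tree. The sketch below shows that idea in plain C; TreeNode, treeNodeCreate, treeAddChild, and treeFree are hypothetical names, and treeAddChild has the same shape as addChild in Listing 1-7 minus the parser argument.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical application-side node; a parser would see it only as void *. */
typedef struct TreeNode {
    char *label;
    struct TreeNode *firstChild;
    struct TreeNode *lastChild;
    struct TreeNode *nextSibling;
    size_t childCount;
} TreeNode;

/* What a create callback could return: a heap node owning a copy of label. */
static TreeNode *treeNodeCreate(const char *label)
{
    TreeNode *node = calloc(1, sizeof *node);
    if (node == NULL) {
        return NULL;
    }
    node->label = malloc(strlen(label) + 1);
    if (node->label == NULL) {
        free(node);
        return NULL;
    }
    strcpy(node->label, label);
    return node;
}

/* Appends child to the end of parent's child list, preserving document order. */
static void treeAddChild(void *parent, void *child, void *info)
{
    TreeNode *p = parent;
    TreeNode *c = child;
    (void)info; /* unused in this sketch */

    if (p->lastChild != NULL) {
        p->lastChild->nextSibling = c;
    } else {
        p->firstChild = c;
    }
    p->lastChild = c;
    p->childCount++;
}

/* Frees a node together with its whole subtree. */
static void treeFree(TreeNode *node)
{
    if (node == NULL) {
        return;
    }
    TreeNode *child = node->firstChild;
    while (child != NULL) {
        TreeNode *next = child->nextSibling;
        treeFree(child);
        child = next;
    }
    free(node->label);
    free(node);
}
```

With this design the tree owns its nodes, so an end-structure callback would not free them one by one; the application calls treeFree on the root once it has finished with the tree.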

The parser calls the CFXMLParserEndXMLStructureCallBack function, implemented in Listing 1-8, when it moves beyond a given structure. The xmlType parameter is a pointer to whatever data the CFXMLParserCreateXMLStructureCallBack function returned when the structure's open tag was first encountered. In this example implementation, the callback prints out a string indicating which structure has ended.

Listing 1-8 Implementing the CFXMLParserEndXMLStructureCallBack function
void endStructure(CFXMLParserRef parser, void *xmlType, void *info)
{
    // Leave evidence that we were called.
    printf("---End Structure Called for \n");
    CFShow((CFStringRef)xmlType);

    // Now that the structure and all of its children have been parsed,
    // we can release the string.
    CFRelease(xmlType);
}

The parser calls the CFXMLParserResolveExternalEntityCallBack function when it encounters an external entity reference. The example XML data in this chapter contains no entity references so this callback is not invoked. Listing 1-9 shows a minimal implementation.

Listing 1-9 Implementing the CFXMLParserResolveExternalEntityCallBack function
CFDataRef resolveEntity(CFXMLParserRef parser, CFXMLExternalID *extID, void *info)
{
    // extID carries the system ID and public ID of the external entity.
    printf("---resolveEntity Called---\n");
    return NULL;
}

The parser calls the CFXMLParserHandleErrorCallBack callback when it encounters an error condition. As shown in Listing 1-10, you can use the XML Services API to get both the error string and error location information from the parser. If you return false from this callback, the parser aborts. If you return true and the error is nonfatal, the parser continues processing.

Listing 1-10 Implementing the CFXMLParserHandleErrorCallBack function
Boolean handleError(CFXMLParserRef parser, CFXMLParserStatusCode error, void *info)
{
    char buf[512];
    const char *s;

    // Get the error description string from the parser.
    CFStringRef description = CFXMLParserCopyErrorDescription(parser);
    s = CFStringGetCStringPtr(description, CFStringGetSystemEncoding());

    // If the string pointer is unavailable, copy the characters into buf.
    if (!s) {
        if (CFStringGetCString(description, buf, sizeof(buf),
                               CFStringGetSystemEncoding())) {
            s = buf;
        } else {
            s = "(description unavailable)";
        }
    }

    // Report the exact location of the error.
    fprintf(stderr, "Parse error (%d) %s on line %d, character %d\n",
            (int)error, s,
            (int)CFXMLParserGetLineNumber(parser),
            (int)CFXMLParserGetLocation(parser));

    // Release the description only now; s may point into its storage.
    CFRelease(description);

    return false;
}
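The effect of the error callback's return value can be pictured with a small plain C model. This is a sketch, not the parser's actual logic: ModelError, ErrorSummary, modelReportErrors, and the two handlers are hypothetical names, and which errors count as fatal is decided here by a flag in the test data rather than by any real parser rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical error record; the fatal flag is an assumption of the model. */
typedef struct ModelError {
    int code;
    int line;
    bool fatal;
} ModelError;

/* Hypothetical client data, passed to the handler through info. */
typedef struct ErrorSummary {
    int reported;  /* number of errors the handler has seen */
    int firstLine; /* line of the first error seen */
} ErrorSummary;

typedef bool (*ModelErrorHandler)(const ModelError *error, void *info);

/* Reports errors in order. Stops when the handler returns false or when an
   error is fatal. Returns the number of errors reported and sets *completed
   to true only if every error was passed over. */
static size_t modelReportErrors(const ModelError *errors, size_t count,
                                ModelErrorHandler handler, void *info,
                                bool *completed)
{
    assert(handler != NULL && completed != NULL);
    for (size_t i = 0; i < count; i++) {
        bool keepGoing = handler(&errors[i], info);
        if (!keepGoing || errors[i].fatal) {
            *completed = false;
            return i + 1;
        }
    }
    *completed = true;
    return count;
}

static void recordError(const ModelError *error, ErrorSummary *summary)
{
    if (summary->reported == 0) {
        summary->firstLine = error->line;
    }
    summary->reported++;
}

/* Policy used by handleError in Listing 1-10: always abort. */
static bool stopOnFirstError(const ModelError *error, void *info)
{
    recordError(error, info);
    return false;
}

/* Alternative policy: ask to continue and let fatal errors end the parse. */
static bool continueIfPossible(const ModelError *error, void *info)
{
    recordError(error, info);
    return true;
}
```

In this model, stopOnFirstError ends the run after a single report, while continueIfPossible keeps collecting reports until a fatal error arrives or the input is exhausted. Passing a summary structure through info is one way to hand the collected details back to the code that started the parse.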

Listing 1-11 demonstrates how to create and invoke the parser.

Listing 1-11 Creating and invoking the XML parser
// First, set up the parser callbacks.
CFXMLParserCallBacks callbacks = {0, createStructure, addChild,
                                  endStructure, resolveEntity, handleError};

// Create the parser with the option to skip whitespace.
// The final argument is an optional client context; none is used here.
CFXMLParserRef parser = CFXMLParserCreate(kCFAllocatorDefault, xmlData, urlOut,
                                          kCFXMLParserSkipWhitespace,
                                          kCFXMLNodeCurrentVersion,
                                          &callbacks, NULL);

// Invoke the parser.
if (!CFXMLParserParse(parser)) {
    printf("parse failed\n");
}

// Release the parser when parsing is finished.
CFRelease(parser);

As you can see, once the callbacks have been implemented, the code to create and call the parser is quite simple. Listing 1-12 shows the output generated by the code in Listing 1-11.

Listing 1-12 Parser output
---Create Structure Called---
Data Type ID: kCFXMLNodeTypeDocument, Document: file://localhost/myPlist.xml
---Create Structure Called---
Data Type ID: kCFXMLNodeTypeProcessingInstruction, PI: xml
---Add Child Called---
Parent being added to: Document: file://localhost/myPlist.xml
Child being added: PI: xml
---End Structure Called for PI: xml
---Create Structure Called---
Data Type ID: kCFXMLNodeTypeDocumentType, DTD
---Add Child Called---
Parent being added to: Document: file://localhost/myPlist.xml
Child being added: DTD
---End Structure Called for DTD
---Create Structure Called---
Data Type ID: kCFXMLNodeTypeElement, Element: plist
---Add Child Called---
Parent being added to: Document: file://localhost/myPlist.xml
Child being added: Element: plist
---Create Structure Called---
Data Type ID: kCFXMLNodeTypeElement, Element: dict
---Add Child Called---
Parent being added to: Element: plist
Child being added: Element: dict
---Create Structure Called---
Data Type ID: kCFXMLNodeTypeElement, Element: key
---Add Child Called---
Parent being added to: Element: dict
Child being added: Element: key
---Create Structure Called---
Data Type ID: kCFXMLNodeTypeText, Text: Jane Doe
---Add Child Called---
Parent being added to: Element: key
Child being added: Text: Jane Doe
---End Structure Called for Text: Jane Doe
---End Structure Called for Element: key
---Create Structure Called---
Data Type ID: kCFXMLNodeTypeElement, Element: integer
---Add Child Called---
Parent being added to: Element: dict
Child being added: Element: integer
---Create Structure Called---
Data Type ID: kCFXMLNodeTypeText, Text: 1999
---Add Child Called---
Parent being added to: Element: integer
Child being added: Text: 1999
---End Structure Called for Text: 1999
---End Structure Called for Element: integer
---Create Structure Called---
Data Type ID: kCFXMLNodeTypeElement, Element: key
---Add Child Called---
Parent being added to: Element: dict
Child being added: Element: key
---Create Structure Called---
Data Type ID: kCFXMLNodeTypeText, Text: John Doe
---Add Child Called---
Parent being added to: Element: key
Child being added: Text: John Doe
---End Structure Called for Text: John Doe
---End Structure Called for Element: key
---Create Structure Called---
Data Type ID: kCFXMLNodeTypeElement, Element: integer
---Add Child Called---
Parent being added to: Element: dict
Child being added: Element: integer
---Create Structure Called---
Data Type ID: kCFXMLNodeTypeText, Text: 2000
---Add Child Called---
Parent being added to: Element: integer
Child being added: Text: 2000
---End Structure Called for Text: 2000
---End Structure Called for Element: integer
---End Structure Called for Element: dict
---End Structure Called for Element: plist

© 2000 Apple Computer, Inc. (Last Updated 14 July 2000)