MIME++ ToolBuzz Tutorial
1. How to Avoid Reading This Document
Depending on what you need to do with MIME++ ToolBuzz, you may not need to read this tutorial at all. For very simple tasks, you may be able to use code from one of the example programs, or use code snippets from the How-To document. It's probably worth your time to take a quick look at the example programs and the How-To document before reading this tutorial to see if you can find suitable code for your task. A brief look at the reference manual pages might also help you to get a quick familiarity with the library classes.
If you need to do more complex tasks — for example, if you need to make significant modifications to one of the example programs — then reading this tutorial is probably a good idea.
2. Prerequisites
To make the best use of MIME++ ToolBuzz, you should already be familiar with C++ and object-oriented programming. The more familiar you are with these topics, the easier it will be for you to learn and use MIME++ ToolBuzz. However, if you are quite new to C++ and object-oriented programming, then you should not be discouraged. Working with a well-designed C++ library like MIME++ ToolBuzz can be an excellent way to improve your skills.
Depending on what you are trying to do, you may also need to be somewhat familiar with the RFCs that define the MIME standard. MIME++ ToolBuzz is very powerful, and it allows you to do very many things. As some would say, it gives you enough rope to hang yourself. Therefore, if you need to do really advanced tasks, then you should probably understand the RFCs enough to be sure that you are creating valid MIME documents. To ease your burden somewhat, however, the example programs all follow the standards as closely as possible. So, if your own code follows the patterns found in the examples, you may get by without knowing the technical details of the RFCs.
3. Getting Started
Tutorials for programming languages often start with a simple example that prints the message “Hello, World!”. We will follow that lead in this MIME++ ToolBuzz tutorial by showing how to create a simple MIME message that contains the text “Hello, World!”.
3.1. Creating a Message
The following code snippet shows how easy it is to create a simple “Hello, World!” email message using MIME++ ToolBuzz:
// Create a new DwMessage object
DwMessage msg;
// Set the From, To, Subject, and Date header fields
msg.Headers().From().FromString("john@example.com");
msg.Headers().To().FromString("mary@example.com");
msg.Headers().Subject().FromString("Test message");
msg.Headers().Date().FromCalendarTime(time(0));
// Set the message body
msg.Body().FromString("Hello, World!\n");
// Finally, assemble the message into a string
msg.Assemble();
// Print it to see how it looks
cout << msg.AsString();
Even without the comments, you should be able to follow what this code snippet does. The code is largely self-documenting.
A key concept behind MIME++ ToolBuzz is illustrated by this example: the document object model. The document object model concept is explained later in this tutorial. For now, we note that the DwMessage object contains a DwHeaders object, which is accessed by the DwMessage::Headers member function. The DwMessage object also contains a DwBody object, which is accessed by the DwMessage::Body member function. A DwHeaders object contains objects, too. In the example above, these other objects are accessed via the member functions DwHeaders::From, DwHeaders::To, DwHeaders::Subject, DwHeaders::Date. This structure is what characterizes a document object model — namely, that objects contain other objects, similar to the way one “fragment” of a document contains other “fragments”.
Note: As we will see later, the objects that are accessed via certain member functions in DwHeaders are actually contained indirectly by the DwHeaders object. A DwHeaders object contains a collection of DwField objects, which themselves contain DwFieldBody objects. Member functions such as DwHeaders::From are shortcut functions (sometimes called convenience functions) that get you more directly to the object you really want. In the case of DwHeaders::From, you directly access a DwMailboxList object, which is a subclass of DwFieldBody. However, the DwMailboxList object is actually contained in a DwField object, which the DwHeaders object contains.
3.2. Parsing a Message
The opposite of creating a message is parsing a message. This code snippet shows how MIME++ ToolBuzz can be used to parse a message:
// Set the string value (not shown)
DwString msgStr = ...
// Create a new DwMessage object
DwMessage msg;
// Set the message content into the DwMessage object
msg.FromString(msgStr);
// Parse the message
msg.Parse();
// Extract the information we want
cout << "From: " << msg.Headers().From().AsString() << endl;
cout << "To: " << msg.Headers().To().AsString() << endl;
cout << "Date: " << msg.Headers().Date().AsString() << endl;
cout << "Subject: " << msg.Headers().Subject().AsString() << endl;
cout << "----------------------------------------" << endl;
cout << msg.Body().AsString();
As in the previous code snippet for creating a message, this code snippet also illustrates MIME++ ToolBuzz’s document object model.
Notice that in this example, the member function AsString is frequently used. Most objects in MIME++ ToolBuzz have a string representation, which can be set using FromString and can be retrieved using AsString. As we will see shortly, many classes in MIME++ ToolBuzz are subclasses of DwMessageComponent, and therefore, they inherit the member functions DwMessageComponent::FromString and DwMessageComponent::AsString.
4. Fundamental Concepts
4.1. A Document Object Model for MIME
A document object model is the result of applying object-oriented design to a document format. If you are an experienced web developer, you may be familiar with the document object model for HTML and XML that has been standardized in a W3C recommendation. Or, if you have experience with Microsoft’s Visual Basic for Applications, then you might be familiar with Microsoft’s document object model for Word or Excel. If you have had experience with these other document object models, then you will probably find the document object model that MIME++ ToolBuzz implements to be fairly easy to understand.
MIME++ ToolBuzz is based on a document object model for MIME. This means that the library uses C++ classes to represent the various fragments of a MIME document. (It makes sense to talk about a MIME document, because MIME is a document format, and not all MIME documents are email messages.) We have already seen some of these classes in the examples presented earlier: DwMessage, DwHeaders, DwBody, and so on. These classes correspond to the fragments of a MIME document: the entire message, the headers, the body, and so on. For almost any identifiable fragment of a MIME document, you will find a corresponding class in the MIME++ ToolBuzz class library.
Note: We use the term “fragment” to denote any identifiable substring of a complete MIME document. Why use the term “fragment” instead of some other term? Well, the term “part” is used in many discussions of MIME to denote specifically a body part. We also considered using the term “component”, but that also has other connotations in the world of programming, specifically in referring to COM objects or CORBA objects. So, we'll stick with the term “fragment”, since it doesn't really have any other connotations that we are aware of.
4.2. The Tree Structure of a Message
When MIME++ ToolBuzz represents a MIME document, it creates objects in an arrangement that reflects the arrangement of the fragments in the document. The fragments of a MIME document are nested, in the sense that some fragments are contained inside of other fragments. For instance, at the highest level, a message contains a collection of header fields and a message body. At the next level, the collection of header fields contains individual header fields, and the body contains zero or more individual body parts (it contains one or more body parts if it is a multipart message). The way to represent this relationship in the object model is by a tree structure, where the containment relationship translates to a parent-child relationship. Therefore, in MIME++ ToolBuzz, a DwMessage object is a node in a tree structure, and it contains two child nodes: a DwHeaders object and a DwBody object. If the objects represent a complete MIME document, then the DwMessage object is the root node of the tree — that is, all other nodes are descendants of the DwMessage node object. At the next level, the DwHeaders object contains zero or more child objects, which are instances of DwField, and the DwBody object may contain zero or more DwBodyPart objects (it contains one or more if the document is a multipart document).
Let’s take a look at what this tree structure looks like. We’ll use the example program named doctree. (The source code for this example program can be found in the directory examples/doctree.) The program takes the name of a text file as a command line parameter. The program reads the contents of this file, which is assumed to be a single, complete MIME document, and displays the tree structure of the MIME++ ToolBuzz objects on the standard output. For this example, we will use a file that contains the following multipart message in MIME format:
From: "Garret A. Hobart" <hobart@vp.gov>, "Levi P. Morton" <morton@who.org>
Sender: "Levi P. Morton" <morton@who.org>
To: Lewis Cass <cass@somewhere.net>, Charles Pinckney <pinckney@xyz.edu>
Subject: Test message, =?iso-8859-1?q?nat=FCrlich?=
Date: Wed, 17 May 2000 19:47:24 -0400
Message-ID: <NDBBIAKOPKHFGPLCODIGOEKFCHAA.morton@who.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="bOuNdArY.oUtEr"

This is a multi-part message in MIME format.

--bOuNdArY.oUtEr
Content-Type: multipart/alternative; boundary="bOuNdArY.iNnEr"

--bOuNdArY.iNnEr

Memo text

--bOuNdArY.iNnEr
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

<html><body> Memo text </body></html>

--bOuNdArY.iNnEr--

--bOuNdArY.oUtEr
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="1.txt"

First attachment

--bOuNdArY.oUtEr
Content-Disposition: attachment; filename="1.txt"; x-foo=1

Second attachment

--bOuNdArY.oUtEr--
The output from running the doctree program is the following:
+-DwMessage (multipart/mixed)
  +-DwHeaders
  | +-DwField (From)
  | | +-DwMailboxList
  | |   +-DwMailbox
  | |   +-DwMailbox
  | +-DwField (Sender)
  | | +-DwMailbox
  | +-DwField (To)
  | | +-DwAddressList
  | |   +-DwMailbox
  | |   +-DwMailbox
  | +-DwField (Subject)
  | | +-DwText
  | |   +-DwEncodedWord
  | |   +-DwEncodedWord
  | +-DwField (Date)
  | | +-DwDateTime
  | +-DwField (Message-ID)
  | | +-DwMsgId
  | +-DwField (MIME-Version)
  | | +-DwText
  | |   +-DwEncodedWord
  | +-DwField (Content-Type)
  |   +-DwMediaType
  |     +-DwParameter
  +-DwBody
    +-DwBodyPart (multipart/alternative)
    | +-DwHeaders
    | | +-DwField (Content-Type)
    | |   +-DwMediaType
    | |     +-DwParameter
    | +-DwBody
    |   +-DwBodyPart (text/plain)
    |   | +-DwHeaders
    |   | +-DwBody
    |   +-DwBodyPart (text/html)
    |     +-DwHeaders
    |     | +-DwField (Content-Type)
    |     | | +-DwMediaType
    |     | |   +-DwParameter
    |     | +-DwField (Content-Transfer-Encoding)
    |     |   +-DwMechanism
    |     +-DwBody
    +-DwBodyPart (text/plain)
    | +-DwHeaders
    | | +-DwField (Content-Type)
    | | | +-DwMediaType
    | | |   +-DwParameter
    | | +-DwField (Content-Transfer-Encoding)
    | | | +-DwMechanism
    | | +-DwField (Content-Disposition)
    | |   +-DwDispositionType
    | |     +-DwParameter
    | +-DwBody
    +-DwBodyPart (text/plain)
      +-DwHeaders
      | +-DwField (Content-Disposition)
      |   +-DwDispositionType
      |     +-DwParameter
      |     +-DwParameter
      +-DwBody
The program’s output shows the tree structure of the objects that represent various fragments of the message. The class name of each object is printed. Some additional information is printed in parentheses in order to provide better context for interpreting the output. This sample message is a good one, since it demonstrates many of the features of MIME++ ToolBuzz and the document object model.
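To make the layout of this output concrete, here is a minimal sketch of how such a tree could be rendered. It is not the source of the doctree example. The Node type and the renderTree functions are hypothetical stand-ins that use no MIME++ ToolBuzz classes, and the sketch assumes only the indentation rules visible in the output above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a node in the document tree (not a MIME++ class).
struct Node {
    std::string label;            // e.g. "DwField (From)"
    std::vector<Node> children;   // contained nodes, in document order
};

// Append one line for `node`, then recurse into its children.
// `prefix` carries the "| " or "  " columns inherited from the ancestors;
// `isLast` says whether `node` is the last child of its parent.
void renderTree(const Node& node, const std::string& prefix, bool isLast,
                std::string& out) {
    out += prefix + "+-" + node.label + "\n";
    // Below a last child the column stays blank; otherwise it shows a bar
    // that connects to the next sibling.
    const std::string childPrefix = prefix + (isLast ? "  " : "| ");
    for (std::size_t i = 0; i < node.children.size(); ++i) {
        renderTree(node.children[i], childPrefix,
                   i + 1 == node.children.size(), out);
    }
}

// Convenience overload: render a whole tree starting at its root.
std::string renderTree(const Node& root) {
    std::string out;
    renderTree(root, "", true, out);
    return out;
}
```

Calling renderTree on a root node labeled "DwMessage" with two children labeled "DwHeaders" and "DwBody" yields three lines, with the two children indented by two spaces under the root.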
If you have a Windows PC available, you can also explore the document object model in MIME++ ToolBuzz interactively using the MsgViewer utility, which can be found in the bin subdirectory. This utility contains a tree control on the left, which shows the tree structure of the MIME++ ToolBuzz objects, and a text control on the right, which shows the document fragment that corresponds to the currently selected node in the tree control.
When learning MIME++ ToolBuzz’s document object model, I highly recommend that you use either the MsgViewer utility or the doctree example to experiment with different MIME documents, especially multipart documents.
Note: There are some limitations to using these utilities. First, the text edit control used by the MsgViewer utility (a standard control provided by the Windows operating system) truncates content that is too large. I am not really sure what the maximum size is. Second, the evaluation version of the MIME++ ToolBuzz library limits the maximum message size.
4.3. A String Representation and a Broken-Down Representation
As you might have guessed, the common aspects of the MIME++ ToolBuzz classes that can be used as tree nodes have been factored into a common base class. This base class is DwMessageComponent. It’s a good idea to become familiar with the DwMessageComponent class, since it is the base class for many classes in MIME++ ToolBuzz.
One of the key concepts to understand in MIME++ ToolBuzz is that every node object — that is, every subclass of DwMessageComponent — has two different representations. First, there is the string representation. The string representation is just the string that contains the fragment of the MIME document corresponding to the node object. You can get or set the string representation using the member functions DwMessageComponent::AsString and DwMessageComponent::FromString, as we did in the example code earlier. Second, every node object also has a broken-down representation, which is the parsed form of the fragment. The best example of the string representation vs. the broken-down representation can be seen in the DwDateTime class, which represents a date-time value in an email message. The broken-down representation is just the collection of numbers that represent the year, month, day, hour, minute, second, and time zone. The string representation is the date-time as a string in the standard RFC 2822 format, such as “Fri, 13 Jul 2001 11:08:15 -0400”.
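The date-time case can be sketched without the library. The struct below is a hypothetical broken-down representation (its field names are illustrative and are not the members of DwDateTime), and assembleDateTime is an assumed helper that builds the corresponding string in the RFC 2822 layout. The day of the week is computed with Sakamoto's method, which is a choice made for this sketch only.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical broken-down date-time; not the layout of DwDateTime.
struct BrokenDownDateTime {
    int year, month, day;        // month is 1..12
    int hour, minute, second;
    int zoneMinutes;             // offset from UTC in minutes, e.g. -240 for -0400
};

// Day of week (0 = Sunday) for a Gregorian date, using Sakamoto's method.
int dayOfWeek(int y, int m, int d) {
    static const int t[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};
    if (m < 3) y -= 1;
    return (y + y / 4 - y / 100 + y / 400 + t[m - 1] + d) % 7;
}

// Build the string representation from the broken-down representation.
std::string assembleDateTime(const BrokenDownDateTime& dt) {
    static const char* const days[] = {"Sun", "Mon", "Tue", "Wed",
                                       "Thu", "Fri", "Sat"};
    static const char* const months[] = {"Jan", "Feb", "Mar", "Apr",
                                         "May", "Jun", "Jul", "Aug",
                                         "Sep", "Oct", "Nov", "Dec"};
    const int zone = std::abs(dt.zoneMinutes);
    char buf[64];
    std::snprintf(buf, sizeof buf, "%s, %d %s %04d %02d:%02d:%02d %c%02d%02d",
                  days[dayOfWeek(dt.year, dt.month, dt.day)],
                  dt.day, months[dt.month - 1], dt.year,
                  dt.hour, dt.minute, dt.second,
                  dt.zoneMinutes < 0 ? '-' : '+', zone / 60, zone % 60);
    return buf;
}
```

With the values 2001, 7, 13, 11, 8, 15 and a zone offset of -240 minutes, the helper produces the example string quoted above.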
One of the things that MIME++ ToolBuzz does very well is to convert from the broken-down representation to the string representation, and to convert from the string representation to the broken-down representation. When you convert from the broken-down representation to the string representation, we say that you are assembling the object. Conversely, when you convert from the string representation to the broken-down representation, we say that you are parsing the object. The base class DwMessageComponent makes assembling and parsing integral to the library’s design by offering the pure virtual functions DwMessageComponent::Assemble, which does the assembling, and DwMessageComponent::Parse, which does the parsing. These member functions are inherited, and in fact implemented, by each subclass of DwMessageComponent.
To see more plainly how assembling and parsing work, let’s look at a simple example. For this example, we’ll use DwMailbox, which represents an email address. (The term “mailbox” comes from the RFCs, particularly RFC 2822.) In the following code snippet, we set the broken-down representation of the mailbox, then assemble it to create the string representation:
// Create a DwMailbox object
DwMailbox mbox;
// Set the elements of the broken-down representation
mbox.SetFullName("John Doe");
mbox.SetLocalPart("jdoe");
mbox.SetDomain("example.org");
// Assemble the object to create the string representation
mbox.Assemble();
// See how it looks
cout << mbox.AsString() << endl;
Now we can see how the opposite conversion is done. In the following code snippet, we set the string representation, then parse it to obtain the broken-down representation:
// Create a DwMailbox object
DwMailbox mbox;
// Set the string representation
mbox.FromString("John Doe <jdoe@example.org>");
// Parse the object to create the broken-down representation
mbox.Parse();
// See how it looks
cout << mbox.FullName() << endl;
cout << mbox.LocalPart() << endl;
cout << mbox.Domain() << endl;
These simple code snippets show how assembling and parsing are done on a DwMailbox object. A similar procedure, of course, applies to other subclasses of DwMessageComponent. (As a simple exercise, try it with the DwDateTime class.)
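If it helps to picture what an object with two representations might look like inside, here is a self-contained toy. ToyMailbox is hypothetical and is not how DwMailbox is implemented. It borrows the member function names used above for readability, and it handles only the two simplest address forms (no quoting, comments, or groups).

```cpp
#include <cstddef>
#include <string>

// Hypothetical toy class, not DwMailbox. It keeps a string representation
// and a broken-down representation side by side.
class ToyMailbox {
public:
    // String representation.
    void FromString(const std::string& s) { mString = s; }
    const std::string& AsString() const { return mString; }

    // Broken-down representation. Setters do not touch mString.
    void SetFullName(const std::string& s) { mFullName = s; }
    void SetLocalPart(const std::string& s) { mLocalPart = s; }
    void SetDomain(const std::string& s) { mDomain = s; }
    const std::string& FullName() const { return mFullName; }
    const std::string& LocalPart() const { return mLocalPart; }
    const std::string& Domain() const { return mDomain; }

    // String -> broken-down. Accepts "Full Name <local@domain>"
    // or a bare "local@domain".
    void Parse() {
        std::string addr = Trim(mString);
        mFullName.clear();
        const std::size_t lt = mString.find('<');
        const std::size_t gt = mString.rfind('>');
        if (lt != std::string::npos && gt != std::string::npos && lt < gt) {
            addr = mString.substr(lt + 1, gt - lt - 1);
            mFullName = Trim(mString.substr(0, lt));
        }
        const std::size_t at = addr.rfind('@');
        if (at == std::string::npos) {
            mLocalPart = addr;
            mDomain.clear();
        } else {
            mLocalPart = addr.substr(0, at);
            mDomain = addr.substr(at + 1);
        }
    }

    // Broken-down -> string.
    void Assemble() {
        const std::string addr = mLocalPart + "@" + mDomain;
        mString = mFullName.empty() ? addr : mFullName + " <" + addr + ">";
    }

private:
    // Remove leading and trailing spaces and tabs.
    static std::string Trim(const std::string& s) {
        const std::size_t b = s.find_first_not_of(" \t");
        if (b == std::string::npos) return std::string();
        const std::size_t e = s.find_last_not_of(" \t");
        return s.substr(b, e - b + 1);
    }

    std::string mString;                        // string representation
    std::string mFullName, mLocalPart, mDomain; // broken-down representation
};
```

In this toy, the setters change only the broken-down members, so AsString keeps returning the old string until Assemble is called. Likewise, FromString changes only the string, and the accessors return stale values until Parse is called.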
4.4. Recursive Parsing and Assembling
We have seen how parsing and assembling are done, and we have seen how the node objects are arranged in a document tree. Because of the tree structure, it makes sense that the parsing and assembling should be done recursively. In other words, when you execute the parse method on a node object, it makes sense that the parse method should operate on all the child nodes of that object. Let’s see how this happens in MIME++ ToolBuzz.
Suppose you call Parse on a DwMessage object. This action will create the broken-down representation of the DwMessage object, which consists of a DwHeaders object and a DwBody object. After these objects are created, their string representations are set, and then MIME++ ToolBuzz automatically calls the Parse member function of these newly created objects. Let's consider what happens when Parse is called on the DwHeaders object. In this case, the broken-down representation of the DwHeaders object is created, which consists of a list of DwField objects. After these objects are created, their string representations are set, and then MIME++ ToolBuzz calls their Parse member functions. This kind of recursive parsing continues until the leaf nodes (nodes without any child nodes) are created.
A similar thing happens when you call Assemble on a DwMessage object. The order is a little different, however. Before the DwMessage object’s string representation is created, it first calls the Assemble member function of its contained DwHeaders and DwBody objects. Then the string representations of these objects are combined to create the string representation of the DwMessage object. Of course, when MIME++ ToolBuzz calls the Assemble member function of the DwHeaders object, it first calls the Assemble member function of each of the DwField objects that the DwHeaders object contains. And of course, each DwField object in turn calls the Assemble member function of the objects it contains, and so on.
The result of this technique of calling the parse or assemble methods recursively is that the entire document subtree rooted at the current node becomes completely consistent, in the sense that the string representations and the broken-down representations are in agreement. For instance, once you call Assemble for the root DwMessage object, you can be sure that for all the nodes in the tree structure, the string representations and the broken-down representations are consistent. Similarly, once you call Parse for the root DwMessage object, you can also be sure that the string representations and the broken-down representations are consistent for every node in the tree.
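The difference in order between the two recursions can be shown with a small model. The three structs below are hypothetical toys, not the MIME++ ToolBuzz classes, and they make simplifying assumptions: lines end in a bare "\n", header fields are never folded, and the body is kept as a plain string. Parse works top-down (split, hand each child its string, then parse the child). Assemble works bottom-up (assemble the children, then join their strings).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy classes that mimic the shape of the design described
// above. They are not the MIME++ classes.
struct ToyField {
    std::string text;          // string representation, e.g. "Subject: Hi\n"
    std::string name, value;   // broken-down representation

    void Parse() {
        name.clear();
        value.clear();
        const std::size_t colon = text.find(':');
        if (colon == std::string::npos) return;
        name = text.substr(0, colon);
        std::size_t start = colon + 1;
        while (start < text.size() && text[start] == ' ') ++start;
        const std::size_t end = text.find('\n', start);
        value = text.substr(start, end == std::string::npos
                                       ? std::string::npos : end - start);
    }
    void Assemble() { text = name + ": " + value + "\n"; }
};

struct ToyHeaders {
    std::string text;               // string representation
    std::vector<ToyField> fields;   // broken-down representation

    void Parse() {
        fields.clear();
        std::size_t pos = 0;
        while (pos < text.size()) {
            std::size_t end = text.find('\n', pos);
            end = (end == std::string::npos) ? text.size() : end + 1;
            ToyField field;
            field.text = text.substr(pos, end - pos);  // set the child's string
            field.Parse();                             // then parse the child
            fields.push_back(field);
            pos = end;
        }
    }
    void Assemble() {
        text.clear();
        for (ToyField& field : fields) {
            field.Assemble();     // children first
            text += field.text;   // then combine their strings
        }
    }
};

struct ToyMessage {
    std::string text;     // string representation
    ToyHeaders headers;   // broken-down representation: headers and body
    std::string body;

    void Parse() {
        const std::size_t sep = text.find("\n\n");  // blank line ends headers
        if (sep == std::string::npos) {
            headers.text = text;
            body.clear();
        } else {
            headers.text = text.substr(0, sep + 1);
            body = text.substr(sep + 2);
        }
        headers.Parse();
    }
    void Assemble() {
        headers.Assemble();
        text = headers.text + "\n" + body;
    }
};
```

After a call to Parse on a ToyMessage, changing the value of one field and then calling Assemble on the message rebuilds the field's string, the headers' string, and the message's string in that order, so all three levels agree again.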
4.5. The Is-Modified Flag
Finally, we consider the possibility of optimizing the recursion for the assemble method. If all the nodes that are direct or indirect descendants of a particular node object are completely consistent, then calling the assemble method on any one of these node objects is a complete waste. Fortunately, MIME++ ToolBuzz is smart enough to handle this situation correctly. The implementation uses an is-modified flag to keep track of which node objects are consistent, and which ones require that their assemble method be called in order to make the string representation and the broken-down representation agree. Therefore, whenever you change the broken-down representation of a node object, MIME++ ToolBuzz sets the is-modified flag for that object, which indicates that the assemble method must be called to make the broken-down representation and the string representation agree.
Of course, more is required than just setting the is-modified flag for a single node object. In most cases, the changes are made to the leaf node objects, but the assemble method is called on the root node. What is necessary is for the is-modified flag changes to propagate to the root node. So, if you change a DwMailbox object, which is a leaf node object, you will find that the is-modified flag is also set on the DwMessage root node. What happens is that when a node object sets its is-modified flag, it also notifies its parent to set its is-modified flag. This guarantees that the recursive operation of the assemble method works correctly.
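The propagation scheme can be modeled in a few lines. ToyNode below is a hypothetical class written for this tutorial, not DwMessageComponent, and its members (SetValue, WorkCount, and so on) are assumptions of the sketch. Nodes are linked by non-owning pointers, and a counter records how often a node really rebuilds its string, so the effect of the flag can be observed.

```cpp
#include <string>
#include <vector>

// Hypothetical toy node, not DwMessageComponent. A leaf holds a value;
// an inner node's string is the concatenation of its children's strings.
class ToyNode {
public:
    explicit ToyNode(ToyNode* parent = nullptr) : mParent(parent) {
        if (mParent) {
            mParent->mChildren.push_back(this);  // non-owning link
            mParent->SetModified();              // parent gained a child
        }
    }
    ToyNode(const ToyNode&) = delete;
    ToyNode& operator=(const ToyNode&) = delete;

    // Change the broken-down representation of a leaf.
    void SetValue(const std::string& value) {
        mValue = value;
        SetModified();
    }

    // Mark this node, then notify the parent so the flag reaches the root.
    void SetModified() {
        mIsModified = true;
        if (mParent) mParent->SetModified();
    }
    bool IsModified() const { return mIsModified; }

    // Rebuild the string representation only where something changed.
    void Assemble() {
        if (!mIsModified) return;   // subtree is already consistent
        ++mWorkCount;
        if (mChildren.empty()) {
            mString = mValue;
        } else {
            mString.clear();
            for (ToyNode* child : mChildren) {
                child->Assemble();
                mString += child->mString;
            }
        }
        mIsModified = false;
    }

    const std::string& AsString() const { return mString; }
    int WorkCount() const { return mWorkCount; }  // number of real rebuilds

private:
    ToyNode* mParent;
    std::vector<ToyNode*> mChildren;
    std::string mValue;    // broken-down representation (leaf only)
    std::string mString;   // string representation
    bool mIsModified = false;
    int mWorkCount = 0;
};
```

In this model, changing one leaf marks only that leaf and its ancestors. A later call to Assemble on the root rebuilds exactly those nodes and returns immediately for every untouched sibling subtree.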
5. Going Further
Having finished reading this tutorial, you should have a good understanding of the most important concepts behind MIME++ ToolBuzz. If you have not already done so, you may want to look at some of the example programs, which are in the examples subdirectory. You will probably find some useful code that you can use to get started on your own project.
To learn more about the individual classes, you can read the reference manual pages, which are in the doc/ref subdirectory. Load the HTML page doc/ref/mimepp.htm into your browser to get started.