Read Me About OTMP

1.0a1

OTMP is a sample code library that lets you call Open Transport synchronously from preemptively scheduled tasks (MP tasks). The library enables high-performance network programming from a common source base on both Mac OS 9 and Mac OS X. This package contains the library itself, a number of support modules, and a demo application.

On traditional Mac OS, the OTMP library requires Mac OS 9.0 or above. Carbon applications should work with CarbonLib 1.0.4 and later, assuming the underlying Mac OS version is sufficient. On Mac OS X the OTMP library requires Mac OS X Public Beta or later.

Justification

Fast networking requires preemptive scheduling. When network data arrives the system must buffer it until the application executes and reads the data. If the network is very fast, the memory required for this buffering can be huge. Therefore it's vitally important that the application run as quickly as possible after the arrival of network data. Cooperatively scheduled threads do not fit the bill.

For example, let's say your application uses Thread Manager threads for its networking. While inter-thread switch times are very low, one of those threads must eventually call WaitNextEvent in order to respond to user events. At that point the Process Manager yields time to other applications. If there are other applications running it's possible for your application to remain unscheduled for a long time; a typical value might be 6 ticks, or 0.1 seconds. If your application is receiving data from a gigabit network (1000 Mb/s, or approximately 100 MB/s), the system must buffer 10 MB of data (100 MB/s times 0.1 s) to prevent flow control on the wire. This is a prohibitively large buffer, which implies that a cooperatively scheduled application will never yield excellent network performance.

The traditional way to run your network application preemptively is to use notifiers. A notifier runs at deferred task time, so there's very little latency between the arrival of network data (an interrupt from the network hardware) and the execution of your application's code.

The big drawback to notifiers is that they run in an interrupt context: your application must do its processing and leave the notifier. Notifier-based applications must use a state machine execution model, which makes them hard to program and unwieldy to maintain. Unfortunately, given Mac OS's traditional lack of preemptive threads, notifiers are the only solution.

In Mac OS 8.6 Apple introduced a preemptive thread model to Mac OS in the form of multiprocessing (MP) tasks. While the name implies that these threads require a multiprocessor system, this is not the case. MP tasks work just fine, and are preemptively scheduled, on single-processor systems. Unfortunately, MP tasks can only call a very limited set of system services. In Mac OS 8.6 this set was too small to write useful network applications. However, in Mac OS 9.0 this set expanded such that MP task-based network applications became possible.

The two key changes in Mac OS 9.0 are:

  • MP tasks can now call the File Manager directly.
  • Certain Multiprocessing Services routines, most notably MPSetEvent, can now be called from interrupt contexts, including Open Transport notifiers and deferred tasks.

The first change is critical because network applications typically need to access the file system as well as the network. This is especially true for network applications where speed is a priority; such applications typically move a lot of data, and the file system is either the source or the sink for that data.

The second change is critical because it offers a low-latency communications mechanism between MP tasks and Open Transport. OTMP uses this mechanism to implement its simple and efficient interface to the network from MP tasks. The implementation is described later in this document.

In summary, high-speed networking requires a preemptively scheduled execution environment. You can use notifiers to implement high-speed networking, but the programming model is state machine-based and therefore unnecessarily difficult. OTMP allows you to efficiently access the network using a synchronous threaded execution model based on MP tasks.

Packing List

The sample contains the following items:

Using the Sample

The OTMP library comes with a test application, OTMPSimpleServerHTTP, which you can run to demonstrate the library functioning on your computer. OTMPSimpleServerHTTP is an evolution of the DTS sample code OTSimpleServerHTTP. The non-MP sample is a very simple HTTP server that is threaded using Thread Manager threads and Open Transport's "sync idle" programming model. The MP sample is threaded using preemptively scheduled threads (MP tasks), along with synchronous network access using OTMP.

To try out this sample, take the following steps.

  1. Set up a machine running Mac OS 9 or later with a statically allocated IP address.
  2. Make sure that no existing web server software is running. The easiest way to check for a running web server is to connect to the machine with a web browser and see if it responds.
  3. Duplicate the "Sample HTTP Source" folder (inside the OTMPSimpleServerHTTP folder) and rename the copy to the text of the machine's IP address (for example, "1.2.3.4").
  4. Run the application appropriate to your platform.
    • Mac OS X -- Run the OTMPSimpleServer-Carbon application. The sample binds to a privileged port (port 80, the HTTP port), so you must be running with superuser privileges.
    • Mac OS 9 -- Run the OTMPSimpleServer-PPC application. If you have CarbonLib prior to version 1.2 installed, you will need to disable it before running this sample. See the Caveats section for an explanation of why this is necessary.
  5. Connect to the machine with your web browser. You should see the sample's main web page (index.html) from the HTTP source folder.

Building the Sample

The sample was built using the CodeWarrior Pro 2 environment upgraded to Universal Headers 3.3.2. You should be able to build it in CodeWarrior Pro 6 without any problems. To build a sample, just open its project, select the appropriate target, and choose Make from the Project menu.

Adding OTMP Support to Your Code

The difficulty of adding OTMP support to your code depends on how your existing code is structured.

The remainder of this section describes the various steps in converting a sync/blocking/threaded program to use OTMP.

Being InContext

The first step in adopting OTMP is to switch your code to use the "InContext" version of the various OT routines. The rationale for this change is described in DTS Technote 1173 Understanding Open Transport Asset Tracking. If your code is Carbonized, you have already made this change. If your code is not Carbonized, you can make the change without Carbonizing your entire application using the technique shown in the DTS sample code OTClassicContext.

OT to OTMP

The next step is to switch your sync/blocking/threaded OT calls to use OTMP. It's easier to do this before converting to preemptive threads because you can debug your initial OTMP changes with a standard debugger (good MP task-aware debuggers are hard to find). You can safely call OTMP from system task time (including Thread Manager threads) as long as you install a yielder callback. To install a yielder callback, call the routine InstallOTMPMainThreadYielder (declared in "OTMP.h"). Make sure you read the header comments for this routine before calling it.
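
For example, a Thread Manager-based application might install a yielder like this (a minimal sketch; the exact callback prototype is declared in "OTMP.h", and the parameterless form shown here is an assumption):

    #include <Threads.h>
    #include "OTMP.h"

    // Hypothetical yielder: while OTMP spins waiting for the network,
    // give time to any other cooperative thread.
    static void MyYielder(void)
    {
        (void) YieldToAnyThread();
    }

    static void SetUpOTMPYielder(void)
    {
        // Install before calling OTMP from cooperatively scheduled code.
        InstallOTMPMainThreadYielder(MyYielder);
    }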

The next step is to change all of your Open Transport calls to be OTMP calls. OTMP exports two versions of each of the APIs it supports:

  • routines prefixed "OTMP" (for example, OTMPBind), which always use the Mac OS 9 implementation described later in this document, and
  • routines prefixed "OTMPX" (for example, OTMPXBind), which call through to the system's Open Transport compatibility library on Mac OS X and fall back to the OTMP implementation on Mac OS 9.

You should pick the API that best meets your requirements and then start converting your code. The regular expression "OT[A-Za-z0-9]+\(" does a good job of finding all of the calls to Open Transport in a source file. Search each of your source files for these calls and then, for each call, decide on the appropriate action.
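
For example, converting a typical bind call is largely mechanical (a sketch; OTMPBind takes an OTMPEndpointRef, here called mpEP, in place of the EndpointRef):

    static OSStatus BindIt(OTMPEndpointRef mpEP, TBind *reqAddr, TBind *retAddr)
    {
        // Before the conversion this read:
        //
        //      return OTBind(ep, reqAddr, retAddr);
        //
        // After the conversion the call shape is unchanged, but the
        // routine takes the OTMPEndpointRef obtained when the endpoint
        // was opened.
        return OTMPBind(mpEP, reqAddr, retAddr);
    }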

Once you have completed this step you should make sure your application still runs correctly before making the switch to MP tasks.

Important: For regular C programmers, EndpointRef is basically defined as a void *. This makes it easy to accidentally call a real OT routine with an OTMPEndpointRef [2553322]. If your program crashes with a PowerPC access exception in the first few lines of an OT routine, it's likely that you've made this mistake.

Important: The current version of OTMP does not provide any InetSvcRef routines. This is deliberate. Most users of InetSvcRef are calling OTInetStringToAddress to translate a DNS string to an InetAddress, which they then pass to OT. If this is the case, you should consider using an AF_DNS address instead. See DTS sample code OTSimpleDownloadHTTP for an example. Using AF_DNS addresses will help make your code more compatible with IPv6. If you use other InetSvcRef routines let me know and I will add those routines to OTMP.

Thread Manager to MP

The next step is to convert your Thread Manager threads to MP tasks. One quick way to flush out any dependencies on Thread Manager is to remove ThreadsLib from your project; the linker errors will then tell you exactly what Thread Manager routines you use and where you use them. You must then work out an MP task equivalent for the Thread Manager routines you are using. This can range from the very easy (MPCreateTask instead of NewThread) to the non-trivial (ThreadCurrentStackSpace). For more assistance in this area, see Adding Multitasking Capabilities to Applications Using Multiprocessing Services.
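
For example, the easy end of that spectrum looks like this (a sketch; MyTaskEntry and StartNetworkTask are hypothetical names, and UPP details are elided):

    #include <Threads.h>
    #include <Multiprocessing.h>

    static OSStatus MyTaskEntry(void *parameter);   // a TaskProc

    static OSStatus StartNetworkTask(void *myParam, MPTaskID *task)
    {
        // Before the conversion, this was roughly:
        //
        //      err = NewThread(kCooperativeThread, MyThreadEntry, myParam,
        //                      65536, kCreateIfNeeded, NULL, &thread);
        //
        // The MP equivalent.  Note that a TaskProc returns OSStatus and
        // that the task may only call MP-safe system services.
        return MPCreateTask(MyTaskEntry, myParam, 65536,
                            kInvalidID,          // no notification queue
                            NULL, NULL,          // termination parameters
                            kMPNormalTaskOptions, task);
    }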

However, converting your Thread Manager calls to MP calls is only the tip of the iceberg: you also have to find and fix any MP-unsafe code within your threads.

The OTMPSimpleServerHTTP sample code illustrates a number of the techniques required to turn Thread Manager code into MP task code.

Debugging

Once you have converted your code to use MP tasks, you then have the arduous task of debugging it. Unless you have access to an MP-aware debugger, the only good debugging technique is printf-style debugging. To this end, OTMP is bundled with a library, MoreMPLog, that supports a number of useful logging techniques. The easiest to grasp technique is, literally, printf-style debugging. MoreMPLog provides a routine, MPLogPrintfSlow, that you can use to print text to standard out from an MP task. This routine uses MPRemoteCall to do its job, so it isn't fast. Adding too many calls to MPLogPrintfSlow to your code will significantly slow it down (which may have an effect on the problem you're trying to debug).

A faster alternative to MPLogPrintfSlow is MPLogPrintf. This routine logs its text to a memory buffer. Before calling this routine you must call InitMPLog to create the log buffer. You can call this routine from any execution environment, including MP tasks, system task code (the main thread and other cooperative threads), and all forms of interrupts.
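
A typical setup might look like this (a sketch; the log-size parameter to InitMPLog and the exact printf-style prototypes are assumptions, so check the MoreMPLog header for the real declarations):

    #include "MoreMPLog.h"

    static OSStatus SetUpLogging(void)
    {
        // Create the in-memory log buffer; the 32 KB size is an
        // arbitrary choice.
        return InitMPLog(32 * 1024);
    }

    static void MyTaskBody(int state)
    {
        // Fast: logs to the memory buffer; callable from any
        // execution environment, including MP tasks and interrupts.
        MPLogPrintf("entering state %d\n", state);

        // Slow: uses MPRemoteCall to print to standard output;
        // callable from MP tasks only.
        MPLogPrintfSlow("entering state %d\n", state);
    }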

The MoreMPLog module also provides a number of convenience routines to log events and parameters to those events. These routines are named MPLog, MPLog1, and so on, and they have a number of useful features.

There are two ways of looking at the log. The simplest is to call the routine MPLogWriteToFile, which writes the contents of the log to a text file. This routine takes a non-synchronized snapshot of the log, so if threads are still logging you may see inconsistent results. It's often best to call this routine in the process of quitting your application, after you have shut down your MP tasks.

The second way of looking at the log is using MacsBug. The two key MoreMPLog state variables, the log's position and size, are CFM exported by the module. If you link with the appropriate options, these values will be visible to MacsBug. You can then dump out the log from within MacsBug.

As part of its startup processing, MoreMPLog defines a convenience macro, "MPLog", for finding the end of the log. When you execute this macro from within MacsBug it will search the log for the most recent entry and display the last 256 bytes of the log.

To read the log you need to understand how it is structured. Each log entry is surrounded by French quotes (option-\ and option-shift-\, « and »). The most recent entry is immediately followed by a bullet character (option-8, •). Note that the log can wrap around at the end of the buffer; the end of the buffer includes a text mark "___End Buffer___" so that you can easily spot it.
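
For example, an excerpt from the log might look something like this (the entry contents are hypothetical; only the framing characters and the end-of-buffer mark are fixed by MoreMPLog):

    «T_LISTEN»«accept 1»«T_DATA»«rcv 512»•
    ___End Buffer___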

How it Works

This section describes how the OTMP library works, in gruesome detail. If your goal is to ship your next cool Mac product, you can probably skip this section. If you want to learn about the gory details inside this library, read on, MacDuff!

The key difficulty in implementing OTMP was to provide blocking network I/O on Mac OS 9. On Mac OS X the Carbon Open Transport compatibility library provides convenient and efficient blocking networking, so on that platform OTMP just calls the system and lets it do the heavy lifting. However, things are not that simple on Mac OS 9.

On Mac OS 9 it's not possible to do networking directly from an MP task. Open Transport is not MP-safe. Therefore, OTMP must get the main task to do all the actual networking. There are three key problems to consider:

  • How does an MP task ask the main task to start a network operation?
  • How does the main task unblock the MP task when the operation completes?
  • How is access to the data shared between these two environments synchronized?

Each of these topics is discussed in a later section. But first, let's look at the core data structures used by OTMP.

Data Structures

OTMP uses three core data structures, as shown in Figure 1.

Figure 1. OTMP Data Structures

For every MP task that is prepared to call OTMP (every task that has called OTMPPrepareThisTask) there is an OTMPTaskData structure. This structure is referenced by a per-task variable (see MPGetTaskStorageValue) and is allocated when the task calls OTMPPrepareThisTask and deallocated when the task calls OTMPUnprepareThisTask. The only interesting field of this structure is waitEvents, an MP event group that is allocated along with the structure. This event group is used to block MP tasks when they are waiting for the main task to complete an operation.

Note: A per-task data structure is used to avoid memory allocations on the data path. It is not possible to allocate an event group per provider because there can be more than one outstanding blocking operation on the same provider (for example, an OTMPSnd and an OTMPRcv). On the other hand, allocating an event group for each operation would slow down every operation, including operations on the data path that should be quick. Using an event group per task is a good middle position. It's not possible for a task to do two blocking calls simultaneously, so an event group per task is sufficient. It is also efficient, because it uses a single lightweight call (MPGetTaskStorageValue) instead of a call that requires allocation (MPCreateEvent).
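
The setup might be sketched as follows (the structure layout and helper names here are assumptions; see OTMPPrepareThisTask in "OTMP.c" for the real code):

    #include <Multiprocessing.h>
    #include <OpenTransport.h>

    // Hypothetical layout of the per-task structure.
    struct OTMPTaskData {
        MPEventID waitEvents;       // blocks this task during operations
    };
    typedef struct OTMPTaskData OTMPTaskData;

    static TaskStorageIndex gTaskDataIndex;     // from MPAllocateTaskStorageIndex

    static OSStatus MyPrepareThisTask(void)     // hypothetical name
    {
        OTMPTaskData *  taskData;
        OSStatus        err;

        taskData = (OTMPTaskData *) MPAllocateAligned(sizeof(*taskData),
                                        kMPAllocateDefaultAligned, 0);
        err = (taskData == NULL) ? kOTOutOfMemoryErr : noErr;
        if (err == noErr) {
            err = MPCreateEvent(&taskData->waitEvents);
        }
        if (err == noErr) {
            // Stash the structure in per-task storage; the data path
            // recovers it with a single MPGetTaskStorageValue call.
            err = MPSetTaskStorageValue(gTaskDataIndex,
                                        (TaskStorageValue) taskData);
        }
        return err;
    }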

The next core data structure is the OTMPProvider. It wraps the underlying Open Transport provider (a reference to which is in the otEP field) and adds some additional fields. The most important extra field is waitRecords, a linked list of all operations blocked on this provider.

Note: This has to be a list rather than a single pointer because there can be more than one outstanding blocking operation on the provider.

Note: Although the "EP" in otEP and mpEP implies that OTMP deals exclusively with endpoints, that is not true. Just like OT, OTMP works on providers at the bottom level. While OTMP currently only supports one class of providers, the endpoint class, this may not always be true.

The final core data structure is the OTMPWaitRecord. This structure holds all the information needed for a single blocked operation.

The link field is used to chain together all of the wait records for a given provider. The concurrency control on this list is a tricky topic and is discussed below.

The waitingFor field holds an OTEventCode that describes the event that the operation is waiting for. The example in Figure 1 shows an OTMPBind operation, and hence the value is T_BINDCOMPLETE. When the provider's notifier (which is run by the main task) is called with this value, the main task will unblock the corresponding MP task.

It's important to recognize that this value is not matched exactly. For example, if an MP task does an OTMPRcv and no data is present, a wait record is queued with a waitingFor value of T_DATA. However, if the endpoint disconnects before more data arrives, the endpoint's notifier will be called with a T_DISCONNECT event, which will cause the wait record to be dequeued and the MP task to continue with a kOTLookErr. For more details, see the comments associated with the routine WaitRecordSearchProc in "OTMP.c".

The noteProc field of the wait record contains a pointer to a routine that is called when the waitingFor event is delivered to the notifier. It allows the routine to do event-specific notification processing. OTMPBind does not need this, so it sets this field to nil.

The waitEvents field contains the ID of the event group associated with the MP task that made the request. The code running in the main task sets an event in this event group when it wants to unblock the MP task (see Main to MP Communication). This field is kInvalidID when the request is made by cooperatively scheduled code (the main thread or a Thread Manager thread). In that case, the cooperative code polls the waitComplete Boolean and unblocks when the notifier sets it to true.

The waitResult field holds the result of the operation. Although this field is initialized to ioInProgress (1), it is not possible to poll this field because some OT routines (for example, OTMPIoctl and OTMPSnd) return positive values to indicate success. When the MP task unblocks it extracts the operation result from this field and returns it as the function result for the OTMP routine.

The final field of the wait record is labeled blue. This is a Blue action that is used to schedule an operation to run on the main task as soon as possible. It is basically a wrapper around a deferred task: it contains a pointer to a routine that is called at deferred task time. See MP To Main Communication for more details.
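
Putting these fields together, the wait record might be reconstructed as follows (a sketch assembled from the descriptions above; field types such as WaitNoteProc are assumed names, and the real declaration is in "OTMP.c"):

    // Hypothetical reconstruction of the wait record.
    struct OTMPWaitRecord {
        struct OTMPWaitRecord * link;           // next wait record on this provider
        OTEventCode             waitingFor;     // event that completes the operation
        WaitNoteProc            noteProc;       // event-specific processing, or nil
        MPEventID               waitEvents;     // caller's event group, or kInvalidID
                                                // for cooperatively scheduled callers
        Boolean                 waitComplete;   // polled by cooperative callers
        OTResult                waitResult;     // operation result; ioInProgress
                                                // until the operation completes
        MoreBlueAction          blue;           // schedules work on the main task
    };
    typedef struct OTMPWaitRecord OTMPWaitRecord;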

The wait record is actually a sub-field of other structures that hold additional parameters. The first wrapper structure is a StdParam. This holds the fields required for standard OT operations, that is, operations on a provider that might run asynchronously. Some OTMP operations are not standard operations (for example, OTMPOpenEndpointQInContext is non-standard because there is no endpoint present when the operation starts) but most operations are standard (including OTMPBind, which is used in this example).

Aside from the embedded wait record, the StdParam structure includes two additional fields. The mpEP field is a reference back to the OTMPProvider. This is required as part of the StdAction routine, which is discussed later. The second extra field is stdAction, a pointer to a callback routine used by StdAction.

The outermost level of wrapping around the wait record is a parameter block that is unique to each OTMP routine. This parameter block embeds the wait record (inside a StdParam, if the routine implements a standard operation) and the other parameters necessary for the routine. In this example we can see that BindParam includes a StdParam (which holds the mpEP parameter to OTMPBind) and two other fields to hold the reqAddr and retAddr parameters to OTMPBind.

One crucial aspect of these routine-specific parameter blocks is that they exist on the stack (that is, the parameter block is a local variable of the OTMP routine). The OTMP routine fills out the fields of the parameter block, starts the operation by scheduling a Blue action, and then blocks until the main task has completed the operation and unblocks it. By allocating the parameter block on the stack we minimize the overhead of memory allocations on the data path.
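
For the bind example, the nesting and the calling pattern might look like this (a sketch with assumed field and helper names; the real structures are in "OTMP.c"):

    // Hypothetical reconstruction of the wrapper structures.
    struct StdParam {
        OTMPWaitRecord          wait;       // embedded wait record
        struct OTMPProvider *   mpEP;       // back-reference to the provider
        StdActionProc           stdAction;  // callback run by StdAction
    };

    struct BindParam {
        struct StdParam         std;        // embedded standard parameters
        TBind *                 reqAddr;    // the reqAddr parameter to OTMPBind
        TBind *                 retAddr;    // the retAddr parameter to OTMPBind
    };

    // Schematically, OTMPBind allocates the parameter block as a
    // local variable, schedules the Blue action, and blocks:
    //
    //      struct BindParam pb;            // lives on the MP task's stack
    //      /* ... fill out pb ... */
    //      return QueueBlueAndWait(&pb.std.wait);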

MP to Main Communication

Communication from MP tasks to the main task is done via Blue actions, implemented in the module MoreBlueActions. A Blue action is much like a deferred task. The client supplies a MoreBlueAction structure to a routine MoreBlueActionInstall, which adds the structure to a linked list of structures (gMoreBlueActionList). The module then installs a deferred task (using DTInstall). When the deferred task callback (MoreBlueActionsRun) executes, it dequeues all of the Blue actions on the list and calls their associated action callback. MoreBlueActions uses a single deferred task to run all queued Blue actions, and uses an atomic lock, gMoreBlueActionsRunInUse, to track the usage of that deferred task.
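
In outline, a Blue action might look like this (a sketch; the field and callback names are assumptions, so see the MoreBlueActions header for the real interface):

    // Hypothetical layout of a Blue action.
    struct MoreBlueAction;
    typedef void (*MoreBlueActionProc)(struct MoreBlueAction *thisAction);

    struct MoreBlueAction {
        struct MoreBlueAction * link;       // next on gMoreBlueActionList
        MoreBlueActionProc      action;     // called at deferred task time
    };
    typedef struct MoreBlueAction MoreBlueAction;

    // MoreBlueActionInstall adds the structure to the global list
    // and, if the shared deferred task is not already pending (as
    // tracked by gMoreBlueActionsRunInUse), schedules
    // MoreBlueActionsRun with DTInstall.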

OTMP uses Blue actions to schedule operations to run on the main task. The blue field of the wait record is the Blue action used for that record. For a standard operation, the blue field contains a pointer to the routine StdAction. When this routine is called at deferred task time it does three things.

  1. On entry StdAction enters the provider's notifier. On exit it leaves the notifier. This locks out other notifications while the routine is running. This is a required part of the global synchronization model.
  2. StdAction calls the callback routine referenced by the stdAction field of the StdParam structure. This routine is different from a standard Blue action in that it returns a Boolean result indicating whether the operation is complete.
  3. If the callback routine returns true, StdAction completes the wait record, which unblocks the MP task. If the callback returns false, StdAction adds the wait record to the list of wait records associated with the provider. An event in the future (the delivery of an event to the provider's notifier) will complete the operation and unblock the MP task.

See the StdAction routine in "OTMP.c" for more details.
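
In outline, it behaves something like this (a schematic sketch, not the actual code; the helper routines are hypothetical):

    // Schematic of StdAction, run at deferred task time.
    static void StdAction(MoreBlueAction *thisAction)
    {
        struct StdParam *   pb;
        Boolean             done;

        // Recover the enclosing StdParam; this hypothetical helper
        // stands in for the pointer arithmetic in the real code.
        pb = GetStdParamFromBlueAction(thisAction);

        (void) OTEnterNotifier(pb->mpEP->otEP);     // 1. lock out notifications

        done = pb->stdAction(pb);                   // 2. for bind, BindStdAction
        if (done) {
            // 3a. The operation finished (or failed) immediately;
            // unblock the MP task with the result the callback left
            // in the wait record.
            CompleteWaitRecord(&pb->wait, pb->wait.waitResult);
        } else {
            // 3b. Queue the wait record on the provider; a future
            // notifier event will complete it.  (Hypothetical helper.)
            AddWaitRecord(pb->mpEP, &pb->wait);
        }

        OTLeaveNotifier(pb->mpEP->otEP);
    }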

Main to MP Communication

Communication from the main task to the MP task is done via an event group. The allocation of this event group is discussed above. After the MP task has initialized an operation parameter block, it schedules a Blue action to start the operation. It then blocks itself on its event group. When the operation completes, the code in the main task (typically running at deferred task time) is responsible for unblocking the MP task. It does this in three steps.

  1. It sets the waitResult field of the wait record to the result of the operation. After being unblocked the OTMP routine will return this as its function result.
  2. It sets the waitComplete field of the wait record to true. This will unblock a cooperatively scheduled caller of OTMP. See System Task Execution for more details on how cooperatively scheduled clients of OTMP work.
  3. If the waitEvents field of the wait record is not kInvalidID, it sets a bit in the event group using MPSetEvent. This will unblock an MP task caller of OTMP.

See the CompleteWaitRecord routine in "OTMP.c" for the exact code.
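
Schematically, those three steps look like this (a sketch; the signature is an assumption, so see "OTMP.c" for the exact code):

    // Schematic of CompleteWaitRecord, run on the main task.
    static void CompleteWaitRecord(OTMPWaitRecord *wait, OTResult result)
    {
        wait->waitResult   = result;    // 1. publish the operation result
        wait->waitComplete = true;      // 2. unblock cooperative callers
        if (wait->waitEvents != kInvalidID) {
            // 3. Unblock an MP task caller.  After this call the MP
            // task may run and destroy the (stack-based) wait record,
            // so it must not be touched again.
            (void) MPSetEvent(wait->waitEvents, 1);
        }
    }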

CompleteWaitRecord is called in two broad classes of scenario:

  • from StdAction, when the operation completes (or fails) without blocking, and
  • from the provider's notifier (BlueNotifier), when the event that a queued wait record is waiting for is finally delivered.

Global Synchronization Model

Whenever you write multithreaded code it is vital to synchronize accesses to data shared between the threads. OTMP is no exception to this rule. However, OTMP's job is particularly tricky because it lives on the boundary between two threading models, MP tasks and Open Transport (deferred tasks and notifiers).

MP tasks can use MP critical sections (mutexes) to synchronize access to their shared data. However, MP critical sections can not be used to synchronize with deferred tasks -- they are blocking operations, and it is not safe for the main task to block at deferred task time. On the other hand, the main task can use deferred tasks and notifiers to synchronize access to its shared data, but this will not stop MP tasks from accessing it. MP tasks can not call OTEnterNotifier!

OTMP's solution to this is to access all shared data from an OT notifier in the main task. The MP task is responsible for initializing an operation parameter block and then queuing a Blue action. The Blue action does all of the work. It's running as a deferred task so it can use standard OT mechanisms to synchronize access to data.

The key data structure that needs protection is the waitRecords list in the OTMPProvider. This list is a standard OT list and thus OTMP must ensure that only one thread of execution is touching it at any given time. It does this in the following way:

  • All code that touches the list runs on the main task, either inside the provider's notifier or at deferred task time as a Blue action.
  • Blue actions bracket their accesses to the list with OTEnterNotifier and OTLeaveNotifier which, on Mac OS 9, guarantees that the provider's notifier can not run (and hence can not touch the list) at the same time.

Note: The second point is the key reason why the OTMPX routines must call the system-supplied OT compatibility library on Mac OS X rather than just calling the corresponding OTMP routine. Mac OS X provides no concurrency guarantees between deferred tasks and OT notifiers.

A Walk Through OTMPXBind

To illuminate the topics covered in the previous sections, let's look at the various steps taken by OTMPXBind when it's called by an MP task.

  1. OTMPXBind tests whether it's running on Mac OS X. The technique used is described in DTS Q&A OV 03 Detecting Classic and Carbon Environments. If it's running on Mac OS X, it calls OTBind and returns. Mac OS X's Open Transport compatibility library is safe to call from MP tasks. If it's running on Mac OS 9, OTMPXBind calls OTMPBind. All of the remaining steps describe the process on Mac OS 9.
  2. OTMPBind initializes a stack-based BindParam structure to the state shown in Figure 1. It then calls an OTMP internal routine, QueueBlueAndWait.
  3. QueueBlueAndWait starts by calling MoreBlueActionInstall to schedule the Blue action associated with the operation. At this point the execution paths fork. The MP task continues to run QueueBlueAndWait, while the main task takes an interrupt and schedules a deferred task, which eventually runs the Blue action (in this case StdAction). I will discuss the MP task first, although you must remember that these operations can happen in different orders.
  4. The MP task continues the execution of QueueBlueAndWait. The next step is for the MP task to block waiting on the waitEvents field of the wait record. The task remains blocked until it is unblocked by the Blue action.
  5. At some point after calling MoreBlueActionInstall (step 3) the main task runs the Blue action, in this case StdAction. This routine calls OTEnterNotifier to prevent OT calling the endpoint's notifier. It then calls the standard action (the stdAction field of the StdParam structure), which in this case calls BindStdAction.
  6. BindStdAction is quite simple. It calls OTBind with the parameters stored in the BindParam parameter block. It is safe to call OT at this time because we're running as a deferred task on the main task. BindStdAction then returns, with a return result of either true (the OTBind call returned an error) or false (the OTBind call succeeded).
  7. At this point StdAction's behavior varies depending on the result from the standard action (BindStdAction). If the result is true, StdAction will complete the wait record and unblock the MP task by calling CompleteWaitRecord. However, for the sake of this discussion let's assume that the result is false. In that case StdAction adds the wait record to the endpoint's waitRecords list.
  8. Before returning, StdAction calls OTLeaveNotifier to unblock notifications on this endpoint (they were blocked in step 5). It's likely that the T_BINDCOMPLETE event has already occurred (the bind operation is very fast), in which case this call will actually deliver that event to the endpoint's notifier (BlueNotifier).
  9. BlueNotifier searches the endpoint's waitRecords list for a wait record that matches the event that is being delivered. In this case it finds the wait record that was added to the list in step 7. It removes that wait record from the list and looks at the noteProc field. If noteProc is not nil, BlueNotifier calls it. However, in the case of OTMPBind the noteProc field is nil, which indicates that this request does not require notification processing: BlueNotifier can complete the wait record by calling CompleteWaitRecord. It uses the notifier's result parameter as the operation result for the wait record.
  10. CompleteWaitRecord stores the operation result into the waitResult field of the wait record and then sets waitComplete to true. It then sets an event in the event group designated by the waitEvents field. This unblocks the MP task that was blocked in step 4. It then returns to BlueNotifier, which in turn returns to OT, which returns from OTLeaveNotifier to StdAction, which returns to the Blue actions module, which returns to the Deferred Task Manager. The critical thing is that none of these routines touch the wait record after the MP task has been unblocked. This is necessary because the wait record exists on the MP task's stack, and may well be destroyed as soon as the MP task unblocks.
  11. After being unblocked (in the step above) the MP task is now running the remainder of QueueBlueAndWait. This extracts the operation result from the wait record and returns it as its function result to OTMPBind, which returns the result to OTMPXBind, which returns the result to the original caller.

Neat, huh?

While this is an accurate description of the implementation of OTMPXBind, it is an idealized view of OTMP as a whole. There are a number of OT routines that do not have simple completion events, or have two completion events, or where OT breaks reentrancy invariants. Some of the trickier cases include OTMPXOpenEndpointQInContext, OTMPXCloseProvider, OTMPXLook, OTMPXConnect, OTMPXSnd, OTMPXRcv, OTMPXListen and OTMPXAccept. These corner cases are extensively documented in comments in "OTMP.c".

System Task Execution

As mentioned earlier in this document, it is possible to call OTMP from cooperatively scheduled code. This has a number of consequences. First, when a cooperative thread calls OTMP there's no need for an event group on which to block the thread while waiting for the network. This is good because the technique used to store the event group's ID (per-MP task storage) does not work well for multiple cooperative threads within the main task. Instead of blocking on an event group the cooperative thread simply spins on the waitComplete field of the wait record. While it's spinning it calls the thread yielder (installed using InstallOTMPMainThreadYielder), which allows the client application to yield time to other threads and processes.
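
The spin might be sketched as follows (assumed names; the real logic lives in QueueBlueAndWait in "OTMP.c"):

    // Schematic of the cooperative wait in QueueBlueAndWait, assuming
    // a gYielder global set by InstallOTMPMainThreadYielder
    // (hypothetical names).
    while ( ! pb->wait.waitComplete ) {
        if (gYielder != NULL) {
            gYielder();         // yield to other threads and processes
        }
    }
    err = pb->wait.waitResult;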

Reusing OTMP Techniques

It is possible to reuse the techniques described above to make any Mac OS 9 interrupt-safe system service available from MP tasks. For example, say you want to call the USB API synchronously from an MP task. You can do this by having the MP task fill out a parameter block, scheduling a Blue action that issues the asynchronous USB request from the main task, and blocking the MP task on an event group that the request's completion routine sets (using MPSetEvent) when the request completes.

Apple has used techniques similar to this to make the File Manager and Device Manager callable from MP tasks. You can extend this technique for any Mac OS API that is callable at interrupt time (for example, USB, FireWire, SCSI Manager, ATA Manager, Sound Manager).

Caveats

This sample has a number of caveats of which you should be aware.

Credits and Version History

If you find any problems with this sample, mail <DTS@apple.com> and we will try to fix them up.

Share and Enjoy.

Worldwide Developer Technical Support
7 Nov 2000