MPTSK: The Task Organization Layer

It is nice to be able to switch contexts whenever we choose, but we may desire more. Suppose, for instance, we get to a point in our program where we must wait for, say, keyboard input. Which thread do we switch to? Or suppose we want to send a message to another thread, say, telling the thread it's printout has finished? How do we pass the thread the message? What if the thread is busy?

The MPTSK cluster answers these questions by providing a general-purpose, integrated scheduler and message passing system based on mailboxes (from which the package gets its name). Essentially, MPTSK organizes threads into a powerful (multi)task force!

The MPTSK cluster introduces three new datatypes: tasks (MPTSK), messages (MPMSG), and mailboxes (MPBOX).

Tasks essentially enhanced threads. They are automatically scheduled and need not be explicitly switched between. They can also send and receive messages...

Messages can be any recordtype for maximum flexibility. The structure need only begin with an MPMSG field (for internal message management purposes). Indeed, even tasks and mailboxes are treated as messages...

Mailboxes are the thing which ties tasks and messages together, and makes Mailbox Multitasking work. A hybrid of the queue data structure, a mailbox either contains a list of messages or a list of tasks (waiting for messages). (See figure)

Figure: MPTSK: Mailbox ``Message Bank'' Illustration
\begin{figure}
\begin{verbatim}
Messages Mailbox Tasks
-------- ------...
...a task receives it. A task is queued until a message reaches
it.}
\end{figure}

Tasks may either send or receive messages on a mailbox. If a task sends a message and there are no tasks waiting for it, then the tasks adds the message to the end of waiting messages; if there are tasks waiting, then the task gives the message to the first task in line. On the other hand, if a task attempts to receive a message from a mailbox, and there are no messages waiting, then the task adds itself to the end of waiting tasks; if there are messages waiting, then the task takes the message and proceeds.

Think of a mailbox like a bank account and messages like money. Like a bank account, a mailbox has either a credit or deficit of messages. And like a bank account, messages can be deposited or withdrawn. This is a mailbox.

Of course, when people (tasks) come to the bank, they must wait in line. Likewise a mailbox services its tasks and messages in a first-come, first-serve order. This is a mailbox.

Though a simple concept, the MPBOX's mailbox turns out to be a remarkably versatile structure. It is used not only for inter-task communication, but also for scheduling, resource-management, even message-passing super-duper semaphores. For instance, suppose we view three messages waiting on the printers mailbox as representing the computer's three printers. When a task wants a printer, it simply receives from the printers mailbox. The message it gets tells it which printer it got. And if there are no printers available, the task is queued waiting for the printer in logical first-come, first-serve order; when a printer is available, the task will automatically be restarted. And since this FIFO ordering is holds for messages as well, the least-recently-used printer will always be chosen, insuring the three printers are evenly used.

If a processor is also thought of as a resource (which it is), the mailbox abstraction makes a good scheduler. We imagine a schedule mailbox, with tasks waiting to ``receive'' a processor to run them.6 Again, due to the FIFO ordering, tasks are automatically processed in a round-robin fashion. For more advanced, prioritized scheduling, we assign a priority to a task by assigning a mailbox on which to schedule it whenever it can be run. We list these schedule mailboxes in order of priority. Whenever we need to run a task, we search this (short) list sequentially until the first (highest priority) task is found (see figure).

Figure: MPTSK: A Mailbox Scheduler Example
\begin{figure}
\begin{verbatim}
Priority
Priority Level Schedule Tasks...
...imply searches down the schedule
list until it finds a task.}
\par
\end{figure}

A key advantage to mailbox multitasking is speed. Many multitasking systems have a single kernel task which schedules all processes and manages all resources. Therefore, to switch between tasks, and even to send a message, requires at least two context switches (into and out of the kernel). Mailbox requires no or at most one context switch to send a message, and only one to switch between tasks. This is because it does these kernel functions simply and within the current task. It achieves efficiency at the cost of some inter-task protection (which is acceptable since all the memory is shared anyway).

Also, usually only one processor is allowed to execute inside the kernel at once. Thus the kernel scheduler itself becomes the bottleneck as more processors are added, and for this reason it is unusual to find a shared-memory machine with more than half-a-dozen processors. Likewise, a mailbox can only be opened by one processor. But Mailbox Multitasking distributes message passing and scheduling between multiple mailboxes; and a processor can only open one mailbox at a time, and only momentarily. Though its scheduling algorithm is simple, it is fast and potentially independent of the number of processors. For this reason, Mailbox should be able to take great advantage of multiprocessing machines.

Figure: MPTSK: Message, Task, and Mailbox Datatype Internals
\begin{figure}
\begin{verbatim}
Message Header (for message management)
...
...re (to control multiprocessor access to the mailbox)\end{verbatim}
\end{figure}

An apparent disadvantage of Mailbox Multitasking is that it does not come with functions to allow one task to force another to stop or suspend, a feature common in other multitaskers. But this is because Mailbox has been designed to also work in multiprocessor environments, where the scheduler cannot automatically assume the other tasks are suspended while it executes a task. Though I do not have a multiprocessor machine at my disposal, I designed Mailbox to take advantage of multiprocessing environments should they become available, and to better understand issues involved in real concurrent environments.

Actually, Mailbox is easily extended to support sophisticated task control operations due to another feature, its uniformity. Everything, even scheduling, works by sending and receiving messages. More sophisticated operations follow logically and do not need to be tailored. For example, to yield the current task so other tasks can run (useful when polling some condition), the task merely attempts to ``receive'' a processor ``resource'' from its schedule mailbox. It is also possible for a task to specify which tasks (which schedule list) it wants to yield to. Thus, when optimizing performance, a task can implement its own scheduler and ``break'' the strict priority scheduling. Or, for modular programming, a task can duplicate it: a task can multitask itself!

Even the send and receive ``primitives'' are special instances of a general mptsk$\_xfer$() function, with flexibility akin to the mpthd$\_switch$(). Because all task switching and message passing is through a single function interface, any sort of message monitoring, performance monitoring, or task control is possible, simply by patching the desired code into this function. As we will see more of in the next subsection, instead of being big and complicated with every possible feature, Mailbox is simple, so the programmer can quickly tailor her own.

In short, the MPTSK implements prioritized multiprocessor scheduling and simple ``mailbox'' message-passing, based on threads. It introduces the message, task, and mailbox datatypes and a simple set of operations, with which many sophisticated scheduling, communication, and monitoring environments can be easily built.