
Initial Environment Setup

PVM supports three different models of programming, and the initial environment setup varies depending on the model in question. The setup consists of determining the total number of PVM tasks used in the PVM job (both those started by hand at a shell prompt and those started via pvm_spawn()) and using that total as the static initial number of tasks for MPI. If the program being ported relies on dynamic addition and deletion of hosts, you must change it to use a static number of hosts and tasks.

It is a common practice in PVM programs to start a task by hand, and then determine the machine configuration inside this task via the pvm_config() call, so as to dynamically spawn tasks on the machines in the current configuration. You must replace this practice with a static determination of the hosts and tasks that form an MPI parallel program.

The rest of this section discusses the three programming models supported by PVM and how to perform initial environment setup for each case.


Pure SPMD Program

In the pure SPMD program model, n instances of the same program are started as the n tasks of the parallel job, using the spawn command of the PVM console (or by hand on each of the n hosts simultaneously). None of these tasks spawns further tasks dynamically; that is, the tasks do not use pvm_spawn(). This scenario is essentially the same as the current MPI model, in which no tasks are dynamically spawned.

For this scenario, the initial parallel environment setup consists of specifying the hosts on which to run the n tasks. You can accomplish this setup using whatever mechanism the MPI implementation provides on top of the library; for example, a hostfile for mpirun, or the procgroup file for the MPICH implementation.
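
As an illustration only (the host names, file format, and mpirun options shown here are placeholders and vary between MPI implementations), a four-task pure SPMD job might be described by a machine file such as:

host1
host2
host3
host4

and started with a command along the lines of mpirun -np 4 -machinefile machines spmd_prog, where spmd_prog is the common executable and machines is the file above.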


General SPMD Model

In this model, n instances of the same program are executed as n tasks of the parallel job. However, one or more tasks are started by hand at the beginning, and these dynamically spawn the remaining tasks in turn.

Here, the change involves determining how many PVM tasks are started in total (those started by hand plus those spawned dynamically) and on which machines these tasks run. These two pieces of information translate directly into what the hostfile or procgroup file of the MPI setup requires: the number of MPI tasks and the hosts on which they are to run.

You must remove all instances of the pvm_spawn() call from the program. Most of the options of this call can be handled by translating them into the initial MPI setup. The PvmTaskDebug option has no counterpart in MPI, so the corresponding MPI task cannot be started in debug mode. The PvmTaskTrace option, and its subsequent use with a tool such as XPVM, can be translated to whatever profiling interface and tools are available in the given MPI implementation.

Similarly, you should also eliminate all calls to pvm_addhosts(), pvm_delhosts(), and pvm_config(). Finally, if the program has a pvm_halt() call, remove it also.
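
As a sketch of what gets removed, a general SPMD program in PVM often contains startup code of roughly the following form (the executable name, task count, and routine name are placeholders, not taken from any particular program):

#include <pvm3.h>

#define NTASKS 4

/* PVM: the hand-started instance spawns the remaining copies of itself. */
static void start_workers(void)
{
    int tids[NTASKS - 1];

    if (pvm_parent() == PvmNoParent)
        pvm_spawn("spmd_prog", (char **)0, PvmTaskDefault, "", NTASKS - 1, tids);
}

In the MPI version this routine disappears entirely: the hostfile or procgroup file tells mpirun to start all NTASKS instances, and every task proceeds directly to the MPI initialization described later in this section.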


MPMD Model

In an MPMD programming model, one or more distinct tasks (having different executables) are started by hand, and these tasks dynamically spawn other (possibly distinct) tasks. The initial setup change required for this model is similar to the one required for the general SPMD model discussed in the previous section; that discussion applies here too. The main difference here is that the task executables are different programs, and this information is encapsulated in the hostfile/procgroup file in the MPI paradigm.

The initial MPI environment setup thus consists of figuring out the number of instances of each distinct executable that constitute the parallel job, and using the total as the static initial number for the MPI environment. Again, you must remove all the pvm_spawn(), pvm_config(), pvm_addhosts(), pvm_delhosts(), and pvm_halt() calls in each PVM executable.
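
As a sketch only (the host names, paths, and exact file syntax are placeholders; the details depend on the MPI implementation and version), a procgroup file describing one master task on the local host and three workers elsewhere might resemble:

local 0
host2 1 /usr/people/user/bin/worker
host3 1 /usr/people/user/bin/worker
host4 1 /usr/people/user/bin/worker

Other implementations take the same information on the mpirun command line instead, for example with a colon-separated form such as mpirun -np 1 master : -np 3 worker.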


Common Environment Setup Changes

For all three models, you must remove from the program being ported all calls that query the library for virtual machine or task information, such as pvm_mstat(), pvm_pstat(), and pvm_tasks(). Any semantic dependence the program has on these calls, beyond the initial environment setup, must be handled separately in the resulting MPI program.

Since tasks cannot enroll in and leave the MPI run-time environment more than once, you must change all PVM tasks to reflect this requirement. Typically, a PVM task enrolls via the pvm_mytid() call; in the absence of this call, the first PVM call enrolls the calling task. Additionally, a task can call pvm_mytid() several times in a program, with or without interleaved pvm_exit() calls. If the calls are not interleaved with pvm_exit() calls, the calling task simply gets its task ID back from the PVM library on the second and subsequent pvm_mytid() calls. You can easily eliminate these subsequent pvm_mytid() calls from the program by saving the value of the task ID and passing it around.
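
For example, a minimal sketch of that change on the PVM side (the helper routines shown are hypothetical) is to obtain the task ID once and hand it to the routines that previously re-enrolled:

int tid = pvm_mytid();   /* single enrollment point */
do_setup(tid);           /* was: do_setup() calling pvm_mytid() internally */
do_work(tid);            /* was: do_work() calling pvm_mytid() internally */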

Replace the first pvm_mytid() call in a PVM program with the MPI_Init() routine, which must precede all other MPI routines and must be called exactly once. Because an MPI implementation can add its own command-line arguments to be processed by MPI_Init(), you must place all of the user's command-line processing (anything that accesses argc and argv) after MPI_Init(), as illustrated in the skeleton at the end of this section. This is in contrast to PVM, which does not add its own arguments to those of the tasks being started.

To find out the number of tasks in the parallel job and its own task ID, an MPI task must call the functions MPI_Comm_size() and MPI_Comm_rank(). Thus the initial portion of a typical MPI program looks like the following:

int taskId;    /* this task's ID (its rank within MPI_COMM_WORLD) */
int numTasks;  /* total number of tasks in the parallel job */

/* Initialize the MPI environment. */
MPI_Init(&argc, &argv);

/* Get the task ID and the total number of tasks. */
/* The MPI rank serves as the task ID. */
MPI_Comm_rank(MPI_COMM_WORLD, &taskId);
MPI_Comm_size(MPI_COMM_WORLD, &numTasks);
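
Putting these pieces together, a minimal skeleton of a ported task might look like the following; the option-parsing and work routines are placeholders, and the user's own argc/argv handling appears only after MPI_Init():

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int taskId, numTasks;

    /* MPI_Init() must precede all other MPI calls and be called exactly once. */
    MPI_Init(&argc, &argv);

    /* User command-line processing comes only after MPI_Init(), which may
       have removed implementation-specific arguments from argc and argv. */
    /* parse_options(argc, argv);   hypothetical application routine */

    MPI_Comm_rank(MPI_COMM_WORLD, &taskId);   /* this task's ID (rank) */
    MPI_Comm_size(MPI_COMM_WORLD, &numTasks); /* total number of tasks */

    printf("task %d of %d starting\n", taskId, numTasks);

    /* ... application work ... */

    /* Each task leaves the MPI environment exactly once. */
    MPI_Finalize();
    return 0;
}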
