LAM 6.2 Beta Installation Guide

Unpacking the distribution

The LAM distribution is packaged as a compressed tape archive, either lam62b.tar.Z or lam62b.tar.gz.

Uncompress the archive and extract the sources.

% gzip -d -c lam62b.tar.gz | tar xf -
or
% uncompress -c lam62b.tar.Z | tar xf -
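The two commands above can be folded into one Bourne shell function that unpacks whichever archive is present in the current directory. This is only a sketch: the function name unpack_lam is hypothetical and is not part of the LAM distribution.

```shell
# unpack_lam: hypothetical helper, not part of the LAM distribution.
# Unpacks lam62b.tar.gz if present, otherwise lam62b.tar.Z,
# into the current directory.
unpack_lam() {
    if [ -f lam62b.tar.gz ]; then
        gzip -d -c lam62b.tar.gz | tar xf -
    elif [ -f lam62b.tar.Z ]; then
        uncompress -c lam62b.tar.Z | tar xf -
    else
        echo "unpack_lam: no lam62b archive in $(pwd)" >&2
        return 1
    fi
}
```

Run unpack_lam from the directory that holds the archive; the sources are extracted into lam62b.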

Applying patches

The tape archive lam62b-patch.tar, if available, contains patches for serious bugs. Extract the patches from the archive in the top-level LAM source directory lam62b.

% tar xvf lam62b-patch.tar

Read the preamble to each patch. Apply the relevant patches to the specified files using your favourite editor. Alternatively, apply all the patches with the UNIX patch(1) utility, run from the source directory. If you have trouble applying patches, please read this.

% cat lam62b-patch[0-9][0-9] | patch -p0

Configuration

LAM 6.2 uses a GNU configure script to perform site-specific and architecture-specific configuration.

Change directory to the top-level LAM directory lam62b and run the configure script.

% ./configure {options}

or

% sh ./configure {options}

By default, the configure script sets the LAM install directory LAMHOME to /tmp/lam. This can be overridden with the --prefix option (see below).

The configure script will create configuration files config.mk, share/h/lam_config.h, share/h/rpi.tcp.h and share/h/rpi.shm.h. Inspect these files and sanity check the configured values.

The configure script recognizes the following options.

--prefix=PREFIX

Sets the installation location LAMHOME for the LAM binaries, libs, etc. to directory PREFIX. PREFIX must be specified as an absolute directory name.

--with-cc=CC

Use C compiler CC.

--with-cflags=CFLAGS

Use C compiler flags CFLAGS.

--with-fc=FC

Use Fortran compiler FC. Specify FC=no to disable Fortran support if you do not have a Fortran compiler or do not require such support.

--with-fflags=FFLAGS

Use Fortran compiler flags FFLAGS.

--with-rpi=RPI

Build with RPI transport layer RPI [RPI=tcp]. RPI must be one of tcp, sysv or usysv. If this option is not specified, the RPI transport layer defaults to tcp. Please refer to the release notes for descriptions of the RPI transport layers.

--with-tcp-short=BYTES

Use BYTES as the maximum size of a short message when communicating over TCP. Default is 64 KB.

--with-shm-short=BYTES

Use BYTES as the maximum size of a short message when communicating via shared memory. Default is 8 KB.

--with-shm-poolsize=BYTES

Use BYTES as the size of the shared memory pool.

--with-shm-maxalloc=BYTES

Use BYTES as the size of the maximum allocation from the shared memory pool.

--without-shortcircuit

Disable the send/receive short circuiting optimization. This optimization has not been tested as thoroughly as we would like, hence this option to disable it.

--with-select-yield

Force the use of select() to yield the processor.

--with-pthread-lock

Use a process-shared pthread mutex to lock access to the shared memory pool rather than the default SYSV semaphore. This option is only valid with the usysv RPI and on systems which support process-shared pthread mutexes.

--with-shared

Build shared libraries. This is currently only supported on Linux with the gcc compiler and on Solaris with the Sun compiler.

--with-signal=SIGNAL

Use SIGNAL as the signal used internally by LAM. The default value is SIGUSR2. To set the signal to SIGUSR1, for example, specify --with-signal=SIGUSR1.

--with-rsh=RSH

Use RSH as the remote shell command. For example, to use the secure shell ssh, specify --with-rsh=ssh.

Example:

% ./configure --with-rpi=usysv --with-cc=/bin/cc --with-cflags=-O4 --with-fc=no
Compile for the usysv RPI using the C compiler /bin/cc with options -O4 and disable Fortran support.
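As a further illustration, the sketch below assembles an option list in a Bourne shell script before running configure. The install location /usr/local/lam and the variable names PREFIX, RSH and CONFIGURE_OPTS are assumptions chosen for the example; only the option names come from the list above.

```shell
# Hypothetical site values; adjust to your system.
PREFIX=/usr/local/lam      # assumed install location
RSH=ssh                    # remote shell command

# --prefix requires an absolute directory name, so warn early if it is not.
case "$PREFIX" in
    /*) ;;                                            # absolute: accepted
    *)  echo "PREFIX must be absolute: $PREFIX" >&2 ;;
esac

# Assemble the options once so the same string can be logged and reused.
CONFIGURE_OPTS="--prefix=$PREFIX --with-rpi=tcp --with-rsh=$RSH --with-fc=no"
echo "./configure $CONFIGURE_OPTS"

# Run configure only when the script is present, i.e. inside lam62b.
# CONFIGURE_OPTS is left unquoted on purpose so it splits into separate options.
if [ -x ./configure ]; then
    ./configure $CONFIGURE_OPTS
fi
```

This configures the default tcp RPI, installs under the assumed prefix, uses ssh as the remote shell and disables Fortran support.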

Building LAM

Once the configuration step is complete, build LAM by running
% make install
in the top level LAM directory. This will build the LAM binaries and libraries and install them together with header files, system files and man pages in LAMHOME/bin, LAMHOME/h, LAMHOME/lib, LAMHOME/boot and LAMHOME/man. Preexisting files in these directories may be overwritten.
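A quick way to confirm that the install step populated the directories listed above is to test for each one. The following is a minimal sketch; check_lam_install is a hypothetical helper, not a LAM tool.

```shell
# check_lam_install: hypothetical helper, not a LAM tool.
# Prints each expected subdirectory that is missing under the given
# installation directory and returns non-zero if any is missing.
check_lam_install() {
    lamhome=$1
    missing=0
    for d in bin h lib boot man; do
        if [ ! -d "$lamhome/$d" ]; then
            echo "missing: $lamhome/$d"
            missing=1
        fi
    done
    return $missing
}
```

For an installation in the default location, call it as check_lam_install /tmp/lam. No output and a zero exit status mean all five directories exist.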

Name shifted MPI library

By default, the name-shifted PMPI_* entry points are not built into the LAM MPI library. To build a libpmpi containing these entry points, run make with the target 'profile' after building the executables and libraries as described above.
% make profile

Boot schema

A boot schema is a description of a multicomputer on which LAM will be run. You can create boot schema files (see bhost(5) for syntax) for typical configurations of the local multicomputer(s). Place these files under boot/ in the installation directory. They will be found by LAM tools such as lamboot(1), recon(1) and wipe(1).
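As an illustration, a simple boot schema lists one host name per line, with comment lines beginning with #. The sketch below writes such a file from the shell. The file name bhost.mycluster and the host names are hypothetical placeholders; bhost(5) remains the authority on the syntax.

```shell
# Write a boot schema listing three machines, one host name per line.
# The file name and host names are hypothetical placeholders.
cat > bhost.mycluster <<'EOF'
# boot schema for a three-node multicomputer
node1.example.com
node2.example.com
node3.example.com
EOF
```

Copy the resulting file into boot/ under the installation directory so that the LAM tools can find it.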

Using LAM

If the LAM installation directory is moved after it is built, users must set the LAMHOME environment variable to the new location. On each UNIX machine, users must add the LAM executable directory to their shell's search path. LAM executables are found under bin/ in the installation directory. These steps must be taken on each and every machine that might be part of a multicomputer running LAM. Set the variables in the shell's start-up file, not the .login file.
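For a Bourne-style shell, the settings might look like the sketch below, placed in a start-up file such as .profile. The location /usr/local/lam is an assumption for the example; substitute your own installation directory. The csh equivalents are shown as comments.

```shell
# Lines for a Bourne-style start-up file such as .profile.
# /usr/local/lam is an assumed location; use your installation directory.
LAMHOME=/usr/local/lam
PATH=$LAMHOME/bin:$PATH
export LAMHOME PATH

# csh-style equivalent, for .cshrc:
#   setenv LAMHOME /usr/local/lam
#   set path = ($LAMHOME/bin $path)
```

LAMHOME only needs to be set if the installation directory has been moved since the build; the search path entry is needed in either case.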

The recon(1) tool checks if LAM can be started on the given boot schema. There are several prerequisites that enable LAM to be started on a remote machine.

Refer users to the lam(7) manual page to get started using LAM tools and libraries.

Troubleshooting

Problems building

The configuration and build steps have been tested under Linux, Solaris, HP-UX, AIX and Irix. They should work on any reasonable UNIX variant.

If the build fails, please send the config.mk, config.log and share/h/lam_config.h files generated by the configure step, together with logs of the output from the configure and make steps, to lam@mpi.nd.edu.

To capture the output of the configure and make steps, you can use the script command or, if using a csh-style shell, the following technique

% ./configure {options} |& tee config.LOG
% make install          |& tee make.LOG
or, if using a Bourne-style shell,
% ./configure {options} 2>&1 | tee config.LOG
% make install 2>&1          | tee make.LOG

Problems running

Please refer to the LAM 6.1 FAQ for help with some common problems encountered when trying to run a LAM application.

Some problems not mentioned in the FAQ are described below.

Insufficient shared resources
If you see the following message
MPI_Init: LAM error: creating shared memory
then most likely you are either trying to create a shared memory segment larger than the system supports, or some system limit on the total number or size of shared segments has been reached. Depending on the situation, you can reconfigure your system to allow larger or more shared segments, wait for resources to be released by other tasks, or try running with a smaller shared memory pool (see --with-shm-poolsize above).

If you see the following error when using the sysv transport

MPI_Init: LAM error: creating semaphore
then most likely you have exceeded the maximum number of SYSV semaphores that the system supports. Recall that the sysv transport allocates np * (np - 1) semaphore sets, where np is the number of processes in the task using shared memory on a node. Either wait for resources to be released by other tasks or reconfigure your system to allow for more semaphores.
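To see how quickly this count grows, the formula can be evaluated directly in the shell. The function name sysv_semsets is hypothetical and used only for this illustration.

```shell
# sysv_semsets: hypothetical helper that evaluates np * (np - 1),
# the number of semaphore sets the sysv transport allocates on a node.
sysv_semsets() {
    np=$1
    echo $(( np * (np - 1) ))
}

sysv_semsets 4    # prints 12
sysv_semsets 8    # prints 56
```

Doubling the number of processes on a node from 4 to 8 therefore raises the requirement from 12 to 56 semaphore sets.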

Note: the ipcs(1) command is useful for seeing what shared resources are in use.

Clearing space

After LAM has been built, all of the objects can be removed by running the make(1) utility with the "clean" target in the source directory.
% make clean
If further space is required, the source directory can be taken off-line. Only the installation directory need be maintained on-line.