An introduction to PVFS2

Since the mid-1990s we have been in the business of parallel I/O. Our first parallel file system, the Parallel Virtual File System (PVFS), has been the most successful parallel file system on Linux clusters to date. This code base has been used both in production mode at large scientific computing centers and as a launching point for many research endeavors.

However, the PVFS (or PVFS1) code base is a terrible mess! For the last few years we have been pushing it well beyond the environment for which it was originally designed. The core of PVFS1 is no longer appropriate for the environment in which we now see parallel file systems being deployed.

While we have been keeping PVFS1 relevant, we have also been interacting with application groups, other parallel I/O researchers, and implementors of system software such as message passing libraries. As a result, we have learned a great deal about how applications use these file systems and how we might better leverage the underlying hardware.

Eventually we reached a point where it was obvious to us that a new design was in order. The PVFS2 design embodies the principles that we believe are key to a successful, robust, high-performance parallel file system. It is being implemented primarily by a distributed team at Argonne National Laboratory and Clemson University. Early collaborations have already begun with the Ohio Supercomputer Center and Ohio State University, and we look forward to additional participation by interested and motivated parties.

In this section we discuss the motivation behind and the key characteristics of our parallel file system, PVFS2.
