Consistency from the client point of view

We've discussed in a number of venues the opportunities that open up when true POSIX semantics are given up. In truth, very few file systems fully support POSIX semantics: ext3 file systems don't enforce atomic writes across block boundaries without special flags, and NFS file systems don't even come close. Nevertheless, many people claim POSIX semantics, and many groups ask for them without knowing the associated costs.

PVFS2 does not provide POSIX semantics.

PVFS2 does guarantee atomicity of writes to nonoverlapping regions, even noncontiguous nonoverlapping regions. That is to say, if the processes of your parallel application never write to the same bytes, then you will get what you expect on subsequent reads.
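
The access pattern this guarantee covers can be illustrated with a minimal sketch. It is not PVFS2 code: it runs against an ordinary local file, uses threads in place of the processes of a parallel application, and relies on the Unix-only os.pwrite call. The names BLOCK, NUM_WRITERS, ROUNDS, write_strided, run_demo and expected_contents are made up for this illustration. Each writer owns a strided, noncontiguous set of blocks that no other writer touches, so the final file contents do not depend on the order in which the writes complete.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4          # bytes per block (illustrative value)
NUM_WRITERS = 4    # number of concurrent writers (illustrative value)
ROUNDS = 3         # number of blocks each writer owns


def write_strided(fd, rank):
    """Write the blocks owned by `rank`: rank, rank + NUM_WRITERS, ..."""
    payload = bytes([ord("A") + rank]) * BLOCK
    for r in range(ROUNDS):
        # Noncontiguous offsets, never shared with another writer.
        offset = (r * NUM_WRITERS + rank) * BLOCK
        os.pwrite(fd, payload, offset)


def run_demo(path):
    """Run all writers concurrently, then read the whole file back."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        with ThreadPoolExecutor(max_workers=NUM_WRITERS) as pool:
            list(pool.map(lambda rank: write_strided(fd, rank),
                          range(NUM_WRITERS)))
        return os.pread(fd, BLOCK * NUM_WRITERS * ROUNDS, 0)
    finally:
        os.close(fd)


def expected_contents():
    """Contents that result if every byte holds its owner's data."""
    one_round = b"".join(bytes([ord("A") + k]) * BLOCK
                         for k in range(NUM_WRITERS))
    return one_round * ROUNDS


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = run_demo(os.path.join(tmp, "demo.dat"))
        print(data == expected_contents())
```

If two writers were given overlapping offsets instead, the result would depend on write ordering, which is the case this guarantee does not cover.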

This is enough to provide all the non-atomic mode semantics for MPI-IO. The atomic mode of MPI-IO will need support at a higher level. This will probably be done with enhancements to ROMIO rather than by forcing more complicated infrastructure into the file system. There are good reasons to do this at the MPI-IO layer rather than in the file system, but they are outside the scope of this document.

Caching of the directory hierarchy is permitted in PVFS2 for a configurable duration. This allows for some optimizations at the cost of windows of time during which the file system view might look different from one node than from another. The cache time value may be set to zero to avoid this behavior; however, we believe that users will not find this necessary.
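
The trade-off can be sketched with a small time-bounded cache. This is a conceptual illustration only, not the PVFS2 implementation: the TimedNameCache class, its parameters, and the dictionary standing in for the authoritative metadata are all hypothetical. A fake clock is used so the behavior is deterministic.

```python
import time


class TimedNameCache:
    """Hypothetical client-side cache: entries are trusted for `timeout` seconds."""

    def __init__(self, lookup, timeout, clock=time.monotonic):
        self._lookup = lookup    # function that asks the authoritative source
        self._timeout = timeout  # how long a cached entry may be reused
        self._clock = clock      # injectable so the demo can control time
        self._entries = {}       # name -> (value, time the value was cached)

    def get(self, name):
        now = self._clock()
        hit = self._entries.get(name)
        if hit is not None and now - hit[1] < self._timeout:
            return hit[0]        # still inside the window: may be stale
        value = self._lookup(name)
        self._entries[name] = (value, now)
        return value


if __name__ == "__main__":
    server = {"/data/run1": 100}  # stands in for the authoritative metadata
    now = [0.0]                   # fake clock, advanced by hand
    cache = TimedNameCache(server.get, timeout=5.0, clock=lambda: now[0])

    print(cache.get("/data/run1"))  # first lookup goes to the source
    server["/data/run1"] = 200      # another node changes the entry
    now[0] = 1.0
    print(cache.get("/data/run1"))  # inside the window: old value is returned
    now[0] = 6.0
    print(cache.get("/data/run1"))  # window expired: new value is fetched
```

With timeout set to zero the comparison in get never succeeds, so every call goes back to the source; this corresponds to the zero setting described above, at the cost of a lookup on every access.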