Implementation

To create a PVFS2 ``file'', three things have to happen: a metafile must be created to hold attributes, a collection of datafiles must be created to hold the data, and a dirent must be created in the parent directory to add the file into the name space.

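There are several options for implementing this correctly. We consider five schemes: dirent first, dirent second, dirent last, hash to metafile, and a scheme in which the servers coordinate creation.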

In the first four schemes, the clients are entirely responsible for creating all the components of a PVFS2 file.

In the dirent first scheme, the dirent is created as the first step, and the other objects are then allocated and filled in. The advantage of this approach is that clients that lose the race to create the file lose it on the first step (as opposed to the dirent last case, described below), so the minimum amount of redundant work occurs. However, a dirent that is created first cannot contain a valid handle, because the metafile has not been allocated yet. The creator must therefore modify the dirent a second time to fill in the right value once the metafile is allocated. This leaves a window of time during which the dirent exists but refers to a file that has no attributes and cannot be read.
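The ordering and the resulting window can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: the names (`insert_dirent`, `create_dirent_first`, `NameExists`) are hypothetical and are not PVFS2 interfaces, a single dictionary stands in for the directory server, and `None` stands in for the not-yet-valid handle.

```python
import itertools

class NameExists(Exception):
    """Raised when a dirent with this name is already present (hypothetical)."""

_handles = itertools.count(1)
directory = {}   # name -> metafile handle, or None while creation is in progress
metafiles = {}   # metafile handle -> list of datafile handles

def insert_dirent(name, handle):
    # Stands in for an atomic "create if absent" on the directory server.
    if name in directory:
        raise NameExists(name)
    directory[name] = handle

def create_dirent_first(name, num_datafiles, observe=None):
    insert_dirent(name, None)      # step 1: losing clients fail here
    if observe:
        observe(directory[name])   # what another client would see mid-create
    meta = next(_handles)          # step 2: allocate the metafile
    metafiles[meta] = [next(_handles) for _ in range(num_datafiles)]  # step 3
    directory[name] = meta         # step 4: second update of the dirent
    return meta

seen = []
winner = create_dirent_first("a", 2, observe=seen.append)
try:
    create_dirent_first("a", 2)    # a second creator of the same name
    loser_failed = False
except NameExists:
    loser_failed = True
```

In this sketch the losing client raises before allocating anything, and `seen` records the placeholder that is visible between steps 1 and 4.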

In the dirent second scheme, clients first allocate a metafile with parameters indicating that it isn't complete, then create the dirent. This means that every losing client will allocate a metafile and then have to free it. However, it also provides a valid set of attributes that can be seen during the window of time in which the file is being created. Datafiles are allocated last, so the client has to modify the distribution information in the metafile after the file has been added into the name space. Because a valid handle already exists in the name space at that point, the client-side mechanism for updating the distribution information once it is filled in is cleaner. Clients attempting to read/write a file with cached distribution information that isn't filled in will necessarily need to update their cache, and potentially wait for the creator to finish filling it in.
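A minimal sketch of this ordering follows, assuming a single toy `Server` object in place of the real set of servers. The class, the `complete` flag, and the function names are hypothetical stand-ins for "parameters indicating that it isn't complete"; they are not PVFS2 interfaces.

```python
import itertools
from dataclasses import dataclass, field

@dataclass
class Metafile:
    complete: bool = False                    # False until datafiles are recorded
    datafiles: list = field(default_factory=list)

class Server:
    def __init__(self):
        self._next = itertools.count(1)
        self.objects = {}    # handle -> Metafile or the string "datafile"
        self.dirents = {}    # name -> metafile handle

    def alloc(self, obj):
        handle = next(self._next)
        self.objects[handle] = obj
        return handle

    def free(self, handle):
        del self.objects[handle]

    def add_dirent(self, name, handle):
        # Atomic create-if-absent; returns False for clients that lose the race.
        if name in self.dirents:
            return False
        self.dirents[name] = handle
        return True

def create_dirent_second(srv, name, num_datafiles):
    meta = srv.alloc(Metafile())              # 1. incomplete metafile
    if not srv.add_dirent(name, meta):        # 2. the dirent decides the winner
        srv.free(meta)                        #    a loser frees its one metafile
        return None
    handles = [srv.alloc("datafile") for _ in range(num_datafiles)]  # 3. datafiles
    srv.objects[meta].datafiles = handles     # 4. fill in the distribution
    srv.objects[meta].complete = True
    return meta

srv = Server()
first = create_dirent_second(srv, "a", 3)
second = create_dirent_second(srv, "a", 3)    # same name: loses at step 2
```

Between steps 2 and 4 the dirent already points at a metafile with valid attributes, but a reader that finds `complete` unset would have to refetch the metafile before doing I/O.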

In the dirent last scheme, clients first allocate datafiles, then the metafile (filling in the distribution information), then finally fill in the dirent. This scheme has the benefit of providing a consistent, complete name space at all times. Its drawback is that the losers must do a lot of work on the client side to ``undo'' all the allocation that they performed before failing to obtain the dirent.
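To make the cost of the ``undo'' concrete, here is a small sketch that counts what a losing client has to release. It is a toy model with hypothetical names (`alloc`, `create_dirent_last`, `freed`), not PVFS2 code, and the check-then-insert on `dirents` stands in for an atomic dirent create.

```python
import itertools

_next = itertools.count(1)
objects = set()   # handles currently allocated
dirents = {}      # name -> metafile handle
freed = []        # handles released by losing clients

def alloc():
    handle = next(_next)
    objects.add(handle)
    return handle

def create_dirent_last(name, num_datafiles):
    datafiles = [alloc() for _ in range(num_datafiles)]  # 1. datafiles
    meta = alloc()                                       # 2. metafile, distribution known
    if name in dirents:                                  # 3. dirent insert fails for losers
        for handle in datafiles + [meta]:                #    undo every allocation made so far
            objects.discard(handle)
            freed.append(handle)
        return None
    dirents[name] = meta
    return meta

winner = create_dirent_last("a", 4)
loser = create_dirent_last("a", 4)
```

With four datafiles, the loser in this sketch frees five objects (four datafiles plus one metafile), whereas the name space never shows a partially created file.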

The hash to metafile scheme relies on a Vesta-like hashing scheme for finding metafiles directly. It is listed here just to keep it in mind; it's not a bad scheme, but we don't expect to use hashing at this time because of its costs in other operations. In this scheme the metafile is created first. Since all clients will hash to the same server, and a path name is associated with the metafile (in this scheme), the server would allow only one metafile to be created. After this the winner could allocate datafiles and finally create the dirent.

In the last scheme, the servers communicate among themselves to coordinate creation of all the objects that make up a file. The server holding the directory is told to create the PVFS2 file. It creates the metafile and datafiles before adding the dirent. The server scheduler can ensure that only one create completes.
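As a rough sketch of how a directory server could serialize creates, the following uses a lock in place of the server scheduler. Everything here is an assumption for illustration: `DirectoryServer` and `create` are hypothetical names, the datafiles live in the same object rather than on other servers, and the early check of the name is a simplification so that later requests allocate nothing.

```python
import itertools
import threading

class DirectoryServer:
    def __init__(self):
        self._lock = threading.Lock()   # stands in for the server scheduler
        self._next = itertools.count(1)
        self.dirents = {}               # name -> metafile handle
        self.metafiles = {}             # metafile handle -> datafile handles

    def create(self, name, num_datafiles):
        with self._lock:                # creates are handled one at a time
            if name in self.dirents:
                return None             # name already taken; nothing allocated
            meta = next(self._next)
            self.metafiles[meta] = [next(self._next) for _ in range(num_datafiles)]
            self.dirents[name] = meta   # dirent is added last
            return meta

server = DirectoryServer()
results = []

def client():
    results.append(server.create("a", 2))

threads = [threading.Thread(target=client) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

winners = [r for r in results if r is not None]
```

However the eight requests interleave, exactly one of them obtains a handle, and the clients never have to clean up after a lost race.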

We will implement the dirent second scheme.