Internal Distribution Representation

PVFS2 distributions are internally represented with the struct PINT_dist. This structure contains pointers to the distribution name, methods, and parameters, along with various sizes. The internal distributions are used on both the clients and the metadata server, and are also stored physically with the file metadata.
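To make the layout described above concrete, here is a minimal sketch of a structure with the same kinds of fields. It is an illustration only, not the actual PINT_dist definition: every name beginning with example_ is hypothetical, and the guess that the sizes describe the name and parameter blocks (so they can be stored with the metadata) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical method table; the real set of distribution methods
 * is not reproduced here. */
typedef struct example_dist_methods {
    /* Placeholder for one distribution-specific operation. */
    void (*describe)(const void *params);
} example_dist_methods;

/* Hypothetical stand-in for an internal distribution:
 * pointers to the name, parameters and methods, plus sizes. */
typedef struct example_dist {
    const char *dist_name;               /* distribution name            */
    const void *params;                  /* distribution parameters      */
    const example_dist_methods *methods; /* distribution methods         */
    size_t name_size;                    /* bytes in name, including NUL */
    size_t param_size;                   /* bytes in the parameter block */
} example_dist;

/* Fill in the structure and record the sizes of the variable parts. */
static void example_dist_init(example_dist *d,
                              const char *name,
                              const void *params,
                              size_t param_size,
                              const example_dist_methods *methods)
{
    d->dist_name = name;
    d->params = params;
    d->methods = methods;
    d->name_size = strlen(name) + 1;
    d->param_size = param_size;
}
```

Recording the sizes next to the pointers is one plausible way to let the name and parameter block be copied into, and read back from, the file metadata without knowing the concrete distribution type.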

When a user creates a file, the system distribution supplied by the user, or the default distribution if none is supplied, is exchanged for a corresponding PINT_dist structure. This structure is used for all further operations performed on the file and is stored in the metadata for the file.

Both the client and the server use the distribution methods to locate a specific byte range in a specific file when fulfilling a request from the client to the server. All of this processing is performed within the PINT request for the file and byte range. The main difference between client and server processing lies in how segments are built: on the client, segments represent the distribution of data across the various servers, whereas on a server they represent the distribution of data on that server.

Distribution parameters are defined in the exported header for the distribution (e.g. for the simple stripe distribution, the header file is pvfs2-dist-simple-stripe.h). The distribution methods are usually defined in a corresponding implementation file in the io/description subsystem (e.g. the simple stripe implementation is in io/description/dist-simple-stripe.c).
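As an illustration of how a parameter structure and a mapping method fit together, the following sketch computes where a logical file offset lands under a simple striping scheme. It is not taken from dist-simple-stripe.c: the names example_stripe_params, example_location and example_stripe_locate are hypothetical, and the single strip_size parameter is an assumption about what such a distribution needs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical parameter block for a striping distribution. */
typedef struct example_stripe_params {
    int64_t strip_size; /* bytes written to one server before moving on */
} example_stripe_params;

/* Result of mapping a logical offset: which data file object
 * (by server index) and the offset inside it. */
typedef struct example_location {
    uint32_t server_nr;
    int64_t physical_offset;
} example_location;

/* Map a logical file offset onto (server index, physical offset),
 * assuming strips are laid out across servers in order. */
static example_location example_stripe_locate(const example_stripe_params *p,
                                              uint32_t server_ct,
                                              int64_t logical_offset)
{
    /* One full stripe is one strip on every server. */
    int64_t stripe_size = p->strip_size * (int64_t)server_ct;
    int64_t full_stripes = logical_offset / stripe_size;
    int64_t leftover = logical_offset % stripe_size;

    example_location loc;
    loc.server_nr = (uint32_t)(leftover / p->strip_size);
    loc.physical_offset = full_stripes * p->strip_size
                          + leftover % p->strip_size;
    return loc;
}
```

With a 65536-byte strip over 4 servers, logical offset 70000 falls in the second strip of the first stripe, so the sketch returns server index 1 and physical offset 4464.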

The methods defined for each distribution allow it to completely specify how the file data is mapped to the PVFS2 disk abstraction, the data file object. The one possible exception is that distributions cannot currently express a preference for how data file objects are mapped to data servers. Support for this is planned for the near future; however, there is no current consensus on how to improve upon the current round robin mapping approach (see PINT_bucket_get_next_io).
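For readers unfamiliar with the term, a generic round robin assignment of data file objects to servers can be sketched as follows. This is not the implementation of PINT_bucket_get_next_io; the example_rr_state type, its fields, and both functions are hypothetical and only show the general technique of cycling through servers in order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cursor over the list of available data servers. */
typedef struct example_rr_state {
    uint32_t next;      /* index of the server to hand out next */
    uint32_t server_ct; /* total number of data servers (> 0)   */
} example_rr_state;

/* Return the next server index and advance the cursor, wrapping
 * around after the last server. */
static uint32_t example_rr_next(example_rr_state *s)
{
    uint32_t chosen = s->next;
    s->next = (s->next + 1) % s->server_ct;
    return chosen;
}

/* Assign n data file objects to servers in round robin order. */
static void example_rr_assign(example_rr_state *s,
                              uint32_t *server_for_object,
                              uint32_t n)
{
    for (uint32_t i = 0; i < n; i++) {
        server_for_object[i] = example_rr_next(s);
    }
}
```

Because the cursor ignores everything about the file and its distribution, a scheme like this gives the distribution no say in placement, which is the limitation described above.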