July 2000 Programmer's Challenge
RAID 5+
Mail solutions to:
progchallenge@mactech.com
Due Date: 11:59pm ET, Saturday, 1 July 2000
Those of you in the Information Systems business have certainly heard of Redundant Array of Independent (or Inexpensive) Disks, or RAID technology. As the name suggests, RAID stores data across multiple disks to provide improved performance and some level of protection against disk failure. Several RAID levels have been implemented, and one of the most common, RAID Level 5, employs disk striping (spreading blocks of data across multiple disks) and parity to optimize disk access for applications that perform random small read/write operations, and to protect against the failure of a single disk.
Our Challenge application, however, is a little more demanding. We’re operating a mission critical application, one that cannot afford to lose data even with a double hardware failure. Your Challenge is to implement a RAID-5+ system that has the performance advantages of RAID 5, but can continue functioning when two disk drives fail.
The prototype for the code you should write is:
/*
* ReadProc and WriteProc are callbacks that allow you to write to multiple drives
* simultaneously, simulating the effect of striping.
*/
typedef void (* WriteProc) ( /* conduct physical writes to multiple drives */
long startByte[], /* start write from startbyte[n] physical byte on disk n */
long numBytes[], /* write numbytes[n] bytes to disk n */
char *writeBuffer[], /* write to buffer[n] to disk n */
Boolean writeErr[]
/* returns writeErr[n]==true if disk n has a write error or parameters were bad */
);
typedef void (* ReadProc) ( /* conduct physical reads from multiple drives */
long startByte[], /* start read from startbyte[n] physical byte on disk n */
/* bytes startByte..startByte+numBytes-1 must be within 0..diskSize-1 */
long numBytes[], /* read numbytes[n] bytes from disk n */
char *readBuffer[], /* read into buffer[n] from disk n */
Boolean readErr[]
/* returns readErr[n]==true if disk n has a read error or parameters were bad */
);
/*
* InitRaid provides the problem parameters
*/
void InitRaid(
long numDisks,
/* problem size, you will have numDisks of real data, plus 2 disks for parity */
long diskSize, /* number of bytes in each disk */
long failureRate, /* expect 1 failure in each failureRate read/write attempts */
WriteProc physicalWrite,
/* procedure that allows you to write to numDisks+2 disks of size diskSize */
ReadProc physicalRead
/* procedure that allows you to read from numDisks+2 disks of size diskSize */
);
void RepairDisk(
long whichDisk /* index of disk that has been repaired, no more than 2 at one time */
);
/*
* RaidWrite and RaidRead ask you to write to the numDisks*diskSize bytes of
* storage. You use WriteProc and Read proc to actually write to the (numDisks+2)
* physical devices, using redundant writes to compensate for the loss of up to two
* disks. RaidWrite and RaidRead bytes are numbered 0..numDisks*diskSize-1.
* RaidWrite and RaidRead return true unless there is a problem performing the
* write/read.
*/
Boolean RaidWrite(
long startByte, /* write starting at this byte */
long numBytes, /* number of bytes to write */
char *buffer /* write bytes from this buffer */
);
Boolean RaidRead(
long startByte, /* read starting at this byte */
long numBytes, /* number of bytes to read */
char *buffer /* read bytes into this buffer */
);
void TermRaid(void);
This Challenge starts with a call to your InitRaid routine. InitRaid is provided with the number of disks (numDisks) available to your Raid 5+ implementation. Actually, numDisks represents the amount of data you need to store and protect; you actually have numDisks+2 disks available, with the additional 2 disks used to provide protection against disk failures. InitRaid is also provided with the size of the identically sized disks in bytes (numBytes), and the approximate failure rate of the disks (1 failure in failureRate read/write attempts). Finally, InitRaid is provided with two callback routines that provide you with read and write access to the physical disks in our simulated disk array, which we’ll discuss below.
The Challenge evaluation consists of a large number of RaidWrite and RaidRead operations. Each RaidWrite call provides a logical address to write to (startByte) in the range 0..numDisks*diskSize-1, a number of bytes to write (numBytes), and a buffer from which to write. Your code must write the data to the simulated disks provided using the WriteProc callback, providing for error correction by writing redundant information to other disks. If WriteProc returns a writeErr[n] value of true, the write to disk n failed, and you may need to compensate. RaidRead provides a logical address to read from, a number of bytes to read, and a buffer to read into. If ReadProc returns a readErr[n] value of true, the corresponding read operation failed, and you will need to use the redundant disks to reconstruct the lost information.
At the end of the evaluation, your TermRaid routine will be called. You should free any memory you have allocated.
From time to time, a failed disk may be repaired. You’ll be notified of such a repair by a call to RepairDisk. When a disk is repaired, you ought to reconstruct any necessary error correction information, because another disk may fail at any time. Up to two disks may be in "failed" mode at any given time.
WriteProc allows you to perform a write operation on all of the disks in the array simultaneously, writing numBytes[n] from writeBuffer[n] starting at physical location startByte[n] on disk n. Similarly, ReadProc allows you to perform a simultaneous read operation on all of the disks, reading numBytes[n] into readBuffer[n] starting from physical location startByte[n] on disk n. Unfortunately, the disk controllers in our simulated system are not very sophisticated: while disk I/O occurs in parallel, our disks are not able to chain sequential I/O operations. An I/O operation on the array requires read/write time proportional to the largest numBytes[n] value passed to any disk.
Our disks will have, say, 5msec average seek time, and 30MB/sec transfer rates. Your program’s score will be the sum of the seek and transfer times for each read/write operation, plus the execution time required by your program. The winner will be the solution that correctly performs all read/write operations with the lowest program score.
There are no specific memory restrictions on this Challenge, but you may not allocate memory to store all of the data written to the simulated disk array, or to persistently store any error correction information. You must use the simulated disks and the WriteProc and ReadProc callbacks for that purpose.
This will be a native PowerPC Challenge, using the CodeWarrior Pro 5 environment. Solutions may be coded in C, C++, or Pascal.
Test code for this Challenge will be available shortly.
You can get a head start on the Challenge by reading the
Programmer's Challenge mailing list. It will be posted to the list on
or before the 12th of the preceding month. To join, send an email to
listserv@listmail.xplain.com
with the subject "subscribe challenge-A". You can also join the
discussion list by sending a message with the subject "subscribe
challenge-D".
|