NetNews Usenet Archive 1992 #30

Text File | 1992-12-15 | 9.2 KB | 213 lines

Newsgroups: comp.arch.storage
Path: sparky!uunet!world!RAID7
From: RAID7@world.std.com (John OBrien)
Subject: RAID 7 vs. Earlier Raid 4
Message-ID: <BzBHH0.3v1@world.std.com>
Organization: The World Public Access UNIX, Brookline, MA
Date: Tue, 15 Dec 1992 20:12:35 GMT
Lines: 203

When I returned to the office from being out of town, there were copies of several INTERNET messages on my desk with a note from my boss, "several misconceptions out there". Indeed. My belief is that addressing those misconceptions from a technical perspective would be a service to those interested INTERNET parties. Here goes...

Randy Rorden of Sanyo Icon writes:

>I then read an article by John O'Brien of Storage Computer that
>was published in the Spring 1992 Computer Technology Review.
>In it, he mentions the above-listed three features that
>supposedly make RAID 7 different from other RAID levels. I do
>not agree that these "architectural" features constitute a
>different RAID level.

Mr. Rorden then cites [Patterson88][Katz89] and continues with his own commentary,

>That's because RAID levels define different ways of organizing
>data on disk drives and ways of providing redundancy so that
>lost data can be recovered when a drive fails, not how those
>drives may be connected, controlled, cached, or buffered.

Tom Wicklund of Intellistor writes:

>Before asking the status of RAID 7, wait for the term to be
>defined. RAID 7 is not part of the original RAID taxonomy that
>Berkeley defined. Their RAID 7 is a modified Berkeley RAID 4,
>with the RAID 7 definition a marketing ploy to say why their
>implementation of RAID 4 architecture is superior.

What was misconstrued above, about the Berkeley RAID Levels and taxonomy, and about RAID 7, can best be explained by considering the following two questions: (1) What are the criteria for Berkeley RAID levels?, and (2) How far does this taxonomy extend?

(1) WHAT ARE THE CRITERIA FOR BERKELEY RAID LEVELS?
Whenever there is disagreement about what was said, meant, or intended in a document, the most direct route to clarification is to simply go to that source document. "A Case for Redundant Arrays of Inexpensive Disks" was first published as a Computer Science Division report at Berkeley in December 1987 by Patterson, Gibson, and Katz. (This is the same report that Mr. Rorden cites as [Patterson 88] as it was republished by IEEE). There are three passages in this original paper of interest; one from the abstract, a second from the body, and a third from the conclusion.

ABSTRACT:

"Increasing performance of CPUs and memories will be squandered if not matched by a similar performance increase in I/O. While the capacity of Single Large Expensive Disk (SLED) has grown rapidly, the performance improvement of SLED has been modest. Redundant Arrays of Inexpensive Disks (RAID), based on the magnetic disk technology developed for personal computers, offers an attractive alternative to SLED, promising improvements of an order of magnitude in performance, reliability, power consumption, and scalability. This paper introduces five levels of RAIDs, giving their relative cost/performance, and compares RAIDs to an IBM 3380 and a Fujitsu Super Eagle."

BODY:

"To simplify the explanation of our final proposal and to avoid confusion with previous work, we give the taxonomy of five different organizations of disk arrays, beginning with mirrored disks and progressing through a variety of alternatives with differing performance and reliability. We refer to each organization as a RAID level."

CONCLUSION:

"This paper makes two separable points: the advantages of building I/O systems from personal computer disks and the advantages of five different disk array organizations, independent of disks used in that array.
The later point starts with the traditional mirrored disks to achieve acceptable reliability with each succeeding level improving

  o the effective performance per disk for supercomputer applications (characterized by a small number of requests per second for a massive amounts of information each time),
  o the transaction-processing performance (characterized by a large number of read-modify-writes to a small amount of information each time), or
  o the usable storage capacity"

Even from a cursory read of this material it is clear that the major criterion for establishing a RAID level is PERFORMANCE. In the abstract alone, the authors mention PERFORMANCE no fewer than 5 times. PERFORMANCE is also central to the taxonomy definition shown above, which appears in the body of the report. And finally, the authors tell us in the conclusion that PERFORMANCE and her twin sister, usable capacity, are the very yardsticks by which "each succeeding level" is measured.

Performance isn't the only criterion; by its stature and frequency of reference, though, it appears to be the most important. Does this mean that implementing a RAID with chips which work 20 ns faster implies a different RAID level? I don't believe any reasonable technologist would adopt such a view. In the RAID definition from the body of the report the authors add the qualification that "different organizations of disk arrays" also help define RAID levels. For example: when the idea of distributing parity from a single disk, RAID 4, to locations across multiple disks is added, the resulting "organization" is defined as a new level, specifically RAID 5.
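
As a minimal sketch of the RAID 4 versus RAID 5 distinction just described (this is not taken from the Berkeley paper), the Python below marks which disk in a group holds parity for each stripe. The function names, the choice of the last disk as the RAID 4 parity disk, and the particular rotation order used for RAID 5 are all assumptions made for illustration; actual implementations may place and rotate parity differently.

```python
def parity_disk_raid4(stripe: int, group_size: int) -> int:
    """RAID 4: parity for every stripe lives on one dedicated disk.

    Assumption: the dedicated parity disk is the last disk in the group.
    """
    return group_size - 1


def parity_disk_raid5(stripe: int, group_size: int) -> int:
    """RAID 5: parity location rotates across the disks, stripe by stripe.

    Assumption: parity starts on the last disk and moves one disk to the
    left per stripe; other rotation orders are equally valid.
    """
    return (group_size - 1 - stripe) % group_size


def layout(parity_fn, stripes: int, group_size: int) -> list[list[str]]:
    """Return one row per stripe, with 'P' on the parity disk and 'D' elsewhere."""
    rows = []
    for stripe in range(stripes):
        p = parity_fn(stripe, group_size)
        rows.append(["P" if disk == p else "D" for disk in range(group_size)])
    return rows


if __name__ == "__main__":
    for name, fn in (("RAID 4", parity_disk_raid4), ("RAID 5", parity_disk_raid5)):
        print(name)
        for row in layout(fn, stripes=4, group_size=4):
            print("  " + " ".join(row))
```

Run as a script, the RAID 4 layout keeps the P column fixed on one disk for every stripe, while the RAID 5 layout moves it to a different disk on each successive stripe.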
Similarly, consider the idea of constructing a RAID device where all disk heads can truly move asynchronously -- for writes as well as reads; where disk transfers -- queued as well as non-queued -- are managed asynchronously; and where an embedded OS simultaneously manages individual drive cache and central cache in a manner that behaves like a virtual solid state disk. The resulting tremendous increase in performance and the different disk organization clearly merit a RAID level different from existing levels. Enter: RAID 7.

(In March 1990 Storage Computer issued a paper which defined RAID levels 6 and 7. This was an attempt to talk seriously about RAID technology and architecture. In June of 1991, Mr. Randy Katz -- one of the original Berkeley authors -- published an article in Computer Technology Review describing an architecture which he denoted as RAID 6.)

RAID 7 is the first RAID level which outperforms the single spindle on all four performance metrics (Large Reads, Large Writes, Small Reads, and Small Writes). Expressed in other words:

  o RAID 4 write parallelism is one per group, since every write must also join the parity disk queue;
  o RAID 5 write parallelism is G/2 (where G is the group size), since every write has an associated parity disk -- that is, a single write requires two disks;
  o RAID 7 write parallelism is N-1 (where N is the number of drives in the array and N is always >= G).

Thus RAID 7 enjoys a strong performance advantage over RAIDs 4 and 5.

(Personally, I don't think there is anything magical about the name RAID 7 -- it is just an attempt to convey some relative understanding of performance and capacity as opposed to the other RAID numbers. I recall being drawn into a rather animated discussion with one of our customers who had just tested the RAID 7 against a solid state disk and he noted that it performed at about 70% of the speed of the SSD.
He argued strenuously that we should bypass describing it as a RAID 7, in favor of the term "virtual solid state disk", which he felt was more descriptive of its behavior.)

(2) HOW FAR DOES THIS TAXONOMY EXTEND?

Readers of the Berkeley paper might be reasonably grouped into three categories: (1) those who criticize the paper, (2) those who criticize the universe (everything but the paper), and (3) finally, the group to which I would claim membership, those who perceive the paper as a good example of fine academic work by some innovative and talented technologists.

The Berkeley authors did much to popularize disk arrays. That they may have made some simplifying assumptions in the paper which might not have been first choice for product implementors (transfer unit size, large write size) is not a cause for concern; let's recognize the paper for what it is -- a very valuable contribution of research -- and not criticize it for what it isn't -- an implementor's bible for all time to come.

Conversely, those who criticize the universe believe that since Berkeley defined levels 1 thru 5 in this paper, no scope of work can stretch those limits. I find it hard to appreciate the value of this kind of rigidity. This is a little like defining five levels of minicomputers in the late sixties, and not allowing any other definitions. What happens when new architectures (a la Tandem, Stratus) evolve? Are they not considered minis? Do we ignore them because they don't fit our original 1969 definition? Like all real contributions to the art, the Berkeley taxonomy needs augmenting from time to time -- if only to reflect industry/academic directions.

Those requiring more information on the RAID 7 extension to the Berkeley levels are welcome to a copy of a Storage Computer publication, "RAID AID: A Taxonomic Extension to the Berkeley Disk Array Schema". Just request by EMAIL (RAID7@world.std.com), Phone (603-880-3005), or FAX (603-889-7273).

John O'Brien
Storage Computer Corporation