linuxmafia.com 2016

Internet Message Format | 1996-04-27 | 11KB

From myrddin.imat.com!rick Sun Apr 28 01:15:46 1996
Path: myrddin.imat.com!rick
From: rick@hugin.imat.com (Rick Moen)
Newsgroups: sfpcug.general
Subject: Hard drives
Date: 28 Apr 1996 08:15:21 GMT
Organization: Imagine That
Lines: 196
Message-ID: <4lv9ep$1ri@myrddin.imat.com>
NNTP-Posting-Host: hugin.imat.com
X-Newsreader: TIN [version 1.2 PL2]

I have nothing against whacko technical claims. Indeed, some of my best friends make them -- and SFpcUG's general meetings are often an outstanding place to encounter them. However, it's nice to be able to figure out what's whacko, and what's not. Some of the more salient ones we've been hearing at our meetings have concerned hard drive technology, and I've been distressed to hear them accepted by listeners without much apparent thought. So, I thought I'd help out with some information on considerations that must be borne in mind, when evaluating claims about hard drives to determine if they make sense or not.

Your hard drive is an entire subsystem; its performance depends on the performance of _each_ of its components working in cooperation.

First, data must be physically read from the platters by the heads. Thus, one constraint on disk speed is physical disk access.

Then, the data pass through the electronics on the hard drive. So, the drive's data transfer rating is also a factor.

At that point, the data have passed out to the bus leading to the drive's host adapter (IDE or SCSI), which has a rating for the bus transfer rate and data width. Transfer rate is rated in megabytes per second, relative to a 1-byte-wide bus. (If you use the same transfer or clock rate on a 2-byte-wide bus, you get double the effective transfer rate.) Thus, this transfer rate is another relevant factor.

After passing through into the IDE or SCSI host adapter, the data pass over a bus (local, PCI, ISA, or whatever) to either the system RAM (Direct Memory Access = DMA mode) or the CPU (Programmed I/O = PIO mode).
The speed of this process is potentially a factor, as is the CPU burden imposed by PIO access.

Last, the software driver controlling this process in the operating system is fast or slow to some degree. So, driver speed is another factor.

OK, get the point of all this? If one aspect of this process is already the bottleneck for the drive subsystem as a whole, then doubling the speed of a _different_ element that's already faster than it _needs_ to be _doesn't improve performance_: That element was _already_ waiting for the slower one; you've merely increased its unusable extra capacity.

Examples: We've been hearing whacko claims about both IDE and SCSI.

(1) We're told that EIDE (more properly, ATA-2) equipment is faster, because its new PIO-4 and DMA-2 modes both run at an impressive 16 megabytes per second transfer rate along the IDE bus.

(2) We're told that you can't get good performance on SCSI, without using a faster-than-customary transfer rate along the SCSI bus and a wider-than-customary data path.

These claims' common thread is that they both focus on the data bus between the controller and the drive, as if that were the _only_ element in the drive subsystem whose performance mattered. In other words, the assumption is that the drive-controller link (whose top speed is measured by the "transfer rate" spec) is the bottleneck.

So, you as an intelligent reader need to ask yourself: Is this a valid assumption? For example, is _physical disk access_ the bottleneck, instead?

A disk's physical access speed is a little difficult to gauge, because the manufacturers don't really give you enough data to go by. They tell you the drive's rotation speed (e.g., 5200 RPM) and average seek time (e.g., 10 milliseconds). Those are roughly the specs of Western Digital's new (and quite good) Caviar series 2.0/2.3GB ATA-2 drives. I don't have the exact design details handy, but let's assume that they have 16 heads (8 disk platters), and 63 512-byte sectors per track.
As the platters rotate once past the heads, the heads can pick up 63 x 512 x 16 = 516,096 bytes of data (half a meg) without being moved (which is a much slower operation). At 5200 RPM, one rotation takes 60/5200 second (about 11.5 milliseconds), so under ideal conditions (contiguous data, readable without moving the heads), data can be read at 516,096 x 5200 / 60 = about 44.7 MB/second. That's awfully fast.

Unfortunately, in the real world, heads -=must be moved=-, to seek tracks elsewhere, other than where they happen to be. This is where the "10 millisecond" figure comes in: It's the amount of time the manufacturer figures it will take for an average journey from one track to another, to find data located on cylinders further in or further out. Upon arrival, _then_ the heads can resume reading or writing data.

Drives happen to spend a great deal of time seeking tracks, because data just aren't to be found consecutively ordered, a lot of the time, especially when control structures such as File Allocation Tables are kept on the outer tracks, and actual data are kept any old where (even after defragmenting). Memory buffers on the drives can help, but seldom exceed 128KB, or so (on ATA drives).

Because this is a complex situation, there are no neat answers for how fast physical access will be for given hardware. You have to put a drive on a fast interface, and simply measure the resulting real-world data retrieval rate that emerges from the forest of intangibles.

However, in the experience of many of us, there have simply been -=no=- IDE drives of any vintage that can physically read real-world data even as fast as 2 megabytes per second -- less than 1/4 the speed of even the oldest (and slowest) IDE controllers built to the original IDE (ATA) spec -- which ran at 8.3 megabytes per second.
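
For anyone who wants to rerun the arithmetic above, here is a rough Python sketch. The geometry (16 heads, 63 sectors of 512 bytes per track, 5200 RPM, 10 ms average seek) is the assumed geometry from this post, and RPM is converted to revolutions per second by dividing by 60. The function names, the request sizes, and the "one seek plus half a rotation per request" cost model are illustrative assumptions of mine, not measurements of any real drive. The heads parameter is there because, as a further assumption, a drive that reads from only one head at a time would deliver 1/16 of the ideal figure.

```python
# Back-of-envelope model of the drive discussed in the post.
# Geometry and seek time are the post's assumed figures; the cost
# model (one average seek + half a revolution per request) is a
# simplifying assumption for illustration only.

SECTOR_BYTES = 512
SECTORS_PER_TRACK = 63
HEADS = 16
RPM = 5200
AVG_SEEK_S = 0.010  # 10 milliseconds


def bytes_per_revolution(heads=HEADS):
    """Bytes passing under the given number of heads in one rotation."""
    return SECTORS_PER_TRACK * SECTOR_BYTES * heads


def ideal_rate(heads=HEADS, rpm=RPM):
    """Bytes/second with no seeks at all (RPM / 60 = revolutions/second)."""
    return bytes_per_revolution(heads) * rpm / 60.0


def effective_rate(request_bytes, heads=HEADS, rpm=RPM, seek_s=AVG_SEEK_S):
    """Bytes/second when each request pays one average seek plus half a
    revolution of rotational latency before its data can be transferred."""
    revolution_s = 60.0 / rpm
    transfer_s = request_bytes / ideal_rate(heads, rpm)
    return request_bytes / (seek_s + revolution_s / 2 + transfer_s)


if __name__ == "__main__":
    print("ideal, all heads : %.1f MB/s" % (ideal_rate() / 1e6))
    print("ideal, one head  : %.1f MB/s" % (ideal_rate(heads=1) / 1e6))
    for size in (4096, 65536):
        rate = effective_rate(size)
        print("%6d-byte requests: %.2f MB/s" % (size, rate / 1e6))
```

Under these assumptions, scattered small requests come out far below the 8.3 megabytes per second of the original ATA bus, which is the point: the seek term swamps everything else.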

So, we have a situation where the 16MB/sec theoretical capacity of ATA-2 interfaces is easy to understand and widely promoted, but this capability is rendered meaningless by a limit in _another_ area that's more difficult to quantify and to comprehend, and is therefore often ignored. (It should hardly be surprising to alert readers, however, to discover that a _physical_ process limits the speed of a related purely _electronic_ one.)

When a drive is rated at 10 milliseconds average access time, the assumption is that there'll be no order to successive locations that must be visited -- that "random" seeks are involved. Suppose you could arrange the next few dozen seeks, so as to minimise needed head travel, like a paper boy tossing newspapers in order of address, rather than visiting families alphabetically by last name (which would obviously take much longer)? This is what SCSI drives do -- intelligent seeking, aka "command queuing". In effect, this lowers average seek time dramatically. ATA drives do not do this, to their disadvantage.

Suppose you have more than one drive? It'd be nice for them to work simultaneously, right? SCSI drives do this, because the SCSI bus was designed to handle it: A feature called "SCSI disconnect" allows the host adapter to issue one drive a read/write order, trust it to carry out the task on its own ("disconnect" it), and start working with a second drive while the first's task is already underway. ATA drives do not support disconnect -- again, to their disadvantage.

So, the claim about 16MB/sec transfers along the ATA-2 bus is looking pretty shaky: Not only can you not find ATA-2 drives capable of supplying more than a small fraction of that, at the physical access level, but also multiple drives on the same chain can't even double-up their contributions very well, because of the disconnect problem.

What about the claims about SCSI needing to be double-wide and double-fast, before good throughput is possible?
Again, one needs to look at physical access issues: With SCSI disconnect, at least the drives' contributions are pretty much fully additive. However, again, the inherent slowness of the inevitable seek cycles holds down real-world disk-access rates. Compared with ATA drives, rotation speeds are about the same (usually 5400 to 7200 RPM, lately), and memory buffers are a little bigger (often 512KB), but not enough to make a huge difference. Average seek claims are also better (8-10 ms, instead of 10-12 ms), but not greatly so. I would estimate that you'd be really lucky, as a result, to get real-world physical read rates as high as 3 - 5MB/second.

Why, then, are there new SCSI controllers and drives that support two-byte-wide cables, instead of the usual one-byte-wide ones, and some that do 20MB/second transfers along the SCSI bus, instead of the standard 10MB/sec? (Standard values cited are those of SCSI-2.)

The answer is that some situations -- typically on servers -- call for "striping" data across multiple drives. A design called "RAID-5" calls for using, say, four hard drives on a single SCSI chain, with data spread across them in a manner designed to let the machine continue with all data intact, even if any one of the four drives dies. ("Mirroring" involves a similar effect, with pairs of drives holding redundant data.)

Because RAID-5 involves reading from all four drives at once, all four of them will be hitting the SCSI bus simultaneously with data, each as quickly as it can. This is one of the rare cases where disk access might saturate the 10MB/second bandwidth of a standard SCSI-2 drive/controller combination. In such cases, a wider data path and/or faster data transfer _is_ beneficial, as the SCSI bus _is_ the bottleneck.

However, -=you=- don't have a RAID array on your desktop machine, and neither do I. (At least, if you do, I want you as a friend, so I can get your throwaways.
;-> )

So, where did these claims come from, about how great 16MB/sec ATA-2 transfer rates are, and how inadequate 10MB/sec SCSI-2 ones are? I suspect that someone was just reading spec sheets, and assuming that bigger means better, without bothering to consider whether he was comparing apples and oranges.

Think of it this way: It's taken me 22 paragraphs to explain the subject properly, so you might see for yourself -- without taking my word for it -- where the problem with that line of reasoning lies. It's a great deal easier to just _parrot specifications_, and assume they tell the tale accurately. I'm not surprised that this happens.

However, now you'll be forewarned, the next time you hear them repeated as gospel truth -- and perhaps will have greater insight into the relevant issues.

-- 
Cheers,                    A post is just a post
Rick Moen                  My admin will deny.
rick@hugin.imat.com        The usual disclaimers apply
                           As news spools by.