From myrddin.imat.com!rick Sun Apr 28 01:15:46 1996
Path: myrddin.imat.com!rick
From: rick@hugin.imat.com (Rick Moen)
Newsgroups: sfpcug.general
Subject: Hard drives
Date: 28 Apr 1996 08:15:21 GMT
Organization: Imagine That
Lines: 196
Message-ID: <4lv9ep$1ri@myrddin.imat.com>
NNTP-Posting-Host: hugin.imat.com
X-Newsreader: TIN [version 1.2 PL2]
I have nothing against whacko technical claims. Indeed, some of my
best friends make them -- and SFpcUG's general meetings are often
an outstanding place to encounter them.
However, it's nice to be able to figure out what's whacko, and what's
not. Some of the more salient ones we've been hearing at our meetings
have concerned hard drive technology, and I've been distressed to hear
them accepted by listeners without much apparent thought.
So, I thought I'd help out with some information on considerations
that must be borne in mind, when evaluating claims about hard drives
to determine if they make sense or not.
Your hard drive is an entire subsystem; its performance depends on the
performance of _each_ of its components working in cooperation. First,
data must be physically read from the platters by the heads. Thus,
one constraint on disk speed is physical disk access. Then, the data
pass through the electronics on the hard drive. So, the drive's
data transfer rating is also a factor.
At that point, the data have passed out to the bus leading to the
drive's host adapter (IDE or SCSI), which has a rating for the bus
transfer rate and data width. Transfer rate is rated in megabytes
per second, relative to a 1-byte-wide bus. (If you use the same
transfer or clock rate on a 2-byte-wide bus, you get double the
effective transfer rate.) Thus, this transfer rate is another
relevant factor.
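As a small sketch of that width arithmetic (the function name and the
sample figures are illustrative only, not taken from any spec sheet):

```python
def effective_rate_mb_s(rated_mb_s, bus_width_bytes):
    """Effective rate for a bus whose rating assumes a 1-byte-wide path."""
    return rated_mb_s * bus_width_bytes

# Same rated transfer rate, one-byte-wide versus two-byte-wide bus.
narrow = effective_rate_mb_s(10, 1)
wide = effective_rate_mb_s(10, 2)
print(narrow, wide)
```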
After passing through into the IDE or SCSI host adapter, the data
pass over a bus (local, PCI, ISA, or whatever) to either the system
RAM (Direct Memory Access = DMA mode) or the CPU (Programmed I/O
= PIO mode). The speed of this process is potentially a factor,
as is the CPU burden imposed by PIO access. Last, the software
driver controlling this process in the operating system is fast or
slow to some degree. So, driver speed is another factor.
OK, get the point of all this? If one aspect of this process is
already the bottleneck for the drive subsystem as a whole, then
doubling the speed of a _different_ element that's already faster
than it _needs_ to be _doesn't improve performance_: It was
_already_ waiting for the slower element; you've merely increased
its unusable extra capacity.
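To make the point concrete, here is a minimal sketch that treats the
subsystem as a chain of stages whose sustained rate is that of the
slowest one. The stage names follow the description above; the rates
are hypothetical, picked only to illustrate the idea.

```python
def subsystem_rate(stage_rates):
    """Sustained throughput of a chain of stages: the slowest stage wins."""
    return min(stage_rates.values())

# Hypothetical rates in MB/sec, chosen only for illustration.
stages = {
    "physical disk access": 2.0,
    "drive electronics": 5.0,
    "drive-to-adapter bus": 8.3,
    "system bus": 33.0,
    "software driver": 20.0,
}

before = subsystem_rate(stages)
stages["drive-to-adapter bus"] *= 2  # double a stage that was not the bottleneck
after = subsystem_rate(stages)
print(before, after)
```

Doubling the bus figure leaves the result unchanged, because the
minimum is still set by physical disk access.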
Examples:
We've been hearing whacko claims about both IDE and SCSI. (1)
We're told that EIDE (more properly, ATA-2) equipment is faster,
because its new PIO-4 and DMA-2 modes both run at an impressive
16 megabytes per second transfer rate along the IDE bus. (2)
We're told that you can't get good performance on SCSI, without
using a faster-than-customary transfer rate along the SCSI bus
and a wider-than-customary data path.
These claims' common thread is that they both focus on the
data bus between the controller and the drive, as if that were
the _only_ element in the drive subsystem whose performance
mattered. In other words, the assumption is that the drive-
controller link (whose top speed is measured by the "transfer
rate" spec) is the bottleneck. So, you as an intelligent
reader need to ask yourself: Is this a valid assumption? For
example, is _physical disk access_ the bottleneck, instead?
A disk's physical access speed is a little difficult to gauge,
because the manufacturers don't really give you enough data to
go by. They tell you the drive's rotation speed (e.g., 5200 RPM)
and average seek time (e.g., 10 milliseconds). Those are roughly
the specs of Western Digital's new (and quite good) Caviar series
2.0/2.3GB ATA-2 drives. I don't have the exact design details
handy, but let's assume that they have 16 heads (8 disk platters),
and 63 512-byte sectors per track. As the platters rotate once
past the heads, they can pick up 63 x 512 x 16 = 516,096 bytes of
data (half a meg) without moving the heads (which is a much slower
operation). This rotation takes 1/5200 minute (about 11.5
milliseconds), so under ideal conditions (contiguous data, readable
without moving the heads), data can be read at 516,096 x 5200 / 60 =
about 45 MB/second. That's awfully fast.
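A quick check of the ideal-case arithmetic, keeping the per-minute
unit of RPM explicit. The geometry is the assumed one from the text
(not a data sheet), and the sketch follows the text's ideal case in
which every head contributes during a single rotation.

```python
SECTORS_PER_TRACK = 63
BYTES_PER_SECTOR = 512
HEADS = 16   # assumed in the text, not taken from a data sheet
RPM = 5200   # revolutions per MINUTE

bytes_per_rev = SECTORS_PER_TRACK * BYTES_PER_SECTOR * HEADS
ms_per_rev = 60_000 / RPM                       # one rotation, in milliseconds
ideal_bytes_per_second = bytes_per_rev * RPM / 60

print(bytes_per_rev)
print(round(ms_per_rev, 1))
print(round(ideal_bytes_per_second / 1e6, 1))   # megabytes per second
```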
Unfortunately, in the real world, heads -=must be moved=- to seek
tracks other than the one where they happen to be. This
is where the "10 millisecond" figure comes in: It's the amount
of time the manufacturer figures an average seek will take, to
find data located on cylinders further in or further out. Upon
arrival, _then_ the heads can resume reading or writing data.
Drives happen to spend a great deal of time seeking tracks,
because data just isn't to be found consecutively ordered, a lot of
the time, especially when control structures such as File Allocation
Tables are kept on the outer tracks, and actual data are kept any old
where (even after defragmenting). Memory buffers on the drives can
help, but seldom exceed 128KB, or so (on ATA drives).
Because this is a complex situation, there are no neat answers for
how fast physical access will be for given hardware. You have to
put a drive on a fast interface, and simply measure the resulting
real-world data retrieval rate that emerges from the forest of
intangibles. However, in the experience of many of us, there have
simply been -=no=- IDE drives of any vintage that can physically read
real-world data even as fast as 2 megabytes per second -- less
than 1/4 the speed of even the oldest (and slowest) IDE controllers
built to the original IDE (ATA) spec -- which ran at 8.3 megabytes
per second.
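A rough model shows why seeks drag the real-world figure down so far.
It assumes every request costs one average seek plus half a rotation
before any data move. The 5 MB/sec media rate and the request sizes
are hypothetical; only the 10 ms seek and 5200 RPM come from the
example drive above.

```python
def effective_mb_s(request_kb, seek_ms, rpm, media_mb_s):
    """Rough throughput when each request pays a seek plus half a rotation."""
    rotational_latency_ms = 0.5 * 60_000 / rpm        # average wait: half a turn
    request_mb = request_kb / 1024
    transfer_ms = request_mb / media_mb_s * 1000      # time actually moving data
    total_ms = seek_ms + rotational_latency_ms + transfer_ms
    return request_mb / (total_ms / 1000)

# Hypothetical request sizes; media_mb_s=5.0 is an assumed figure.
for kb in (4, 16, 64):
    rate = effective_mb_s(kb, seek_ms=10, rpm=5200, media_mb_s=5.0)
    print(kb, round(rate, 2))
```

Small scattered requests spend nearly all their time positioning the
heads, so the delivered rate sits far below the media rate no matter
how fast the interface is.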
So, we have a situation where the 16MB/sec theoretical capacity of
ATA-2 interfaces is easy to understand and widely promoted, but
this capability is rendered meaningless by a limit in _another_ area
that's more difficult to quantify and to comprehend, and is therefore
often ignored. (It should hardly be surprising to alert readers,
however, to discover that a _physical_ process limits the speed of
a related purely _electronic_ one.)
When a drive is rated at 10 milliseconds average seek time, the
assumption is that there'll be no order to successive locations that must
be visited -- that "random" seeks are involved. Suppose you could
arrange the next few dozen seeks, so as to minimise needed head travel,
like a paper boy tossing newspapers in order of address, rather than
visiting families alphabetically by last name (which would obviously
take much longer)? This is what SCSI drives do -- intelligent seeking,
aka "command queuing". In effect, this lowers average seek time
dramatically. ATA drives do not do this, to their disadvantage.
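The paper-boy idea can be sketched by comparing head travel for the
same requests served in arrival order and in a simple sweep order.
The cylinder numbers are made up, and the sweep is just one
elevator-style reordering for illustration. Real drive firmware
chooses its own queuing strategy, and this is not a claim about any
particular drive.

```python
def head_travel(start, cylinders):
    """Total cylinders crossed when visiting targets in the given order."""
    travel, position = 0, start
    for target in cylinders:
        travel += abs(target - position)
        position = target
    return travel

# Hypothetical queue of pending requests, by cylinder number.
queue = [980, 12, 640, 35, 900, 410]
start = 500

arrival_order = head_travel(start, queue)

# One simple reordering: sweep upward from the head, then back down.
upward = sorted(c for c in queue if c >= start)
downward = sorted((c for c in queue if c < start), reverse=True)
sweep_order = head_travel(start, upward + downward)

print(arrival_order, sweep_order)
```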
Suppose you have more than one drive? It'd be nice for them to
work simultaneously, right? SCSI drives do this, because the SCSI
bus was designed to handle it, and a feature called "SCSI disconnect"
allows the host adapter to issue one drive a read/write order, and
then trust it to carry out the task on its own ("disconnect" it),
and start working with a second drive while the first's task is
already underway. ATA drives do not support disconnect -- again,
to their disadvantage.
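An idealised timing sketch of what disconnect buys you. It assumes
each drive gets one request, all seeks take the same time, and only
the data transfers have to take turns on the bus. The 10 ms and 2 ms
figures are hypothetical.

```python
def serial_ms(n_drives, seek_ms, transfer_ms):
    """No disconnect: the bus waits out each drive's seek, one after another."""
    return n_drives * (seek_ms + transfer_ms)

def overlapped_ms(n_drives, seek_ms, transfer_ms):
    """With disconnect: seeks run in parallel; only bus transfers take turns."""
    return seek_ms + n_drives * transfer_ms

# Hypothetical figures: 10 ms to position the heads, 2 ms on the bus.
for n in (1, 2, 4):
    print(n, serial_ms(n, 10, 2), overlapped_ms(n, 10, 2))
```

With a single drive the two cases are identical; the gap opens up
only as drives are added to the chain.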
So, the claim about 16MB/sec transfers along the ATA-2 bus is
looking pretty shaky: Not only can you not find ATA-2 drives
capable of supplying more than a small fraction of that, at the
physical access level, but also multiple drives on the same
chain can't even double-up their contributions very well, because
of the disconnect problem. What about the claims about SCSI
needing to be double-wide and double-fast, before good throughput
is possible?
Again, one needs to look at physical access issues: With SCSI
disconnect, at least the drives' contributions are pretty much
fully additive. However, again, the inherent slowness of the
inevitable seek cycles holds down real-world disk-access rates.
Rotation speeds are about the same (usually 5400 to 7200 RPM,
lately), and memory buffers are a little bigger (often 512KB),
but not enough to make a huge difference. Average seek claims
are also better (8-10 ms, instead of 10-12 ms), but not greatly so.
I would estimate that you'd be really lucky, as a result, to get
real-world physical read rates as high as 3 - 5MB/second.
Why, then, are there new SCSI controllers and drives that support
two-byte-wide cables, instead of the usual one-byte-wide ones, and
some that do 20MB/second transfers along the SCSI bus, instead
of the standard 10MB/sec? (Standard values cited are those of SCSI-2.)
The answer is that some situations -- typically on servers --
call for "striping" data across multiple drives. A design called
"RAID-5" calls for using several hard drives (four, say) on a single
SCSI chain, with data spread across them in a manner designed to let
the machine continue with all data intact, even if any one of the
drives dies. ("Mirroring" involves a similar effect, with
pairs of drives holding redundant data.)
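How the array survives a dead drive can be sketched with XOR parity,
which is the mechanism RAID-5 relies on. This toy version keeps three
data blocks and one parity block and ignores the way real RAID-5
rotates parity among the drives. The block contents and helper name
are made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks, as if on three drives, plus parity on a fourth.
data = [b"disk", b"data", b"here"]
parity = xor_blocks(data)

# Pretend the second drive died: rebuild its block from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```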
Because RAID-5 involves reading from all four drives at once, all
four of them will be hitting the SCSI bus simultaneously with
data, each as quickly as it can. This is one of the rare cases
where disk access might saturate the 10MB/second bandwidth of a
standard SCSI-2 drive/controller combination. In such cases,
a wider data path and/or faster data transfer _is_ beneficial,
as the SCSI bus _is_ the bottleneck. However, -=you=- don't have
a RAID array on your desktop machine, and neither do I. (At
least, if you do, I want you as a friend, so I can get your
throwaways. ;-> )
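A back-of-the-envelope check on when the bus becomes the bottleneck,
using the 3 - 5MB/second per-drive estimate and the SCSI-2 figures
from above. The function names are illustrative, and the model simply
adds up the drives' contributions.

```python
def bus_demand_mb_s(n_drives, per_drive_mb_s):
    """Aggregate data rate if every drive delivers at once."""
    return n_drives * per_drive_mb_s

def saturates(n_drives, per_drive_mb_s, bus_mb_s):
    """True when the drives together can outrun the bus."""
    return bus_demand_mb_s(n_drives, per_drive_mb_s) > bus_mb_s

STANDARD_BUS = 10        # MB/sec, one byte wide
FAST_WIDE_BUS = 20 * 2   # 20MB/sec transfers on a two-byte-wide cable

print(saturates(1, 5, STANDARD_BUS))    # one desktop drive
print(saturates(4, 3, STANDARD_BUS))    # four-drive array, low estimate
print(saturates(4, 5, FAST_WIDE_BUS))   # same array on the faster, wider bus
```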
So, where did these claims come from, about how great 16MB/sec ATA-2
transfer rates are, and how inadequate 10MB/sec SCSI-2 ones are?
I suspect that someone was just reading spec sheets, and assuming
that bigger means better, without bothering to consider whether
he was comparing apples and oranges.
Think of it this way: It's taken me 22 paragraphs to explain the
subject properly, so you might see for yourself -- without taking
my word for it -- where the problem with that line of reasoning
lies. It's a great deal easier to just _parrot specifications_,
and assume they tell the tale accurately. I'm not surprised that
this happens. However, now you'll be forewarned, the next time
you hear them repeated as gospel truth -- and perhaps will have
greater insight into the relevant issues.
--
Cheers,                  A post is just a post
Rick Moen                My admin will deny.
rick@hugin.imat.com      The usual disclaimers apply
                         As news spools by.