- Xref: sparky comp.benchmarks:1691 comp.arch.storage:772
- Newsgroups: comp.benchmarks,comp.arch.storage
- Path: sparky!uunet!europa.asd.contel.com!darwin.sura.net!spool.mu.edu!sgiblab!sgigate!sgi!igor!jbass
- From: jbass@igor.tamri.com (John Bass)
- Subject: Re: Disk performance issues, was IDE vs SCSI-2 using iozone
- Message-ID: <1992Nov15.072855.3112@igor.tamri.com>
- Organization: TOSHIBA America MRI, South San Francisco, CA
- References: <36995@cbmvax.commodore.com> <1992Nov12.193308.20297@igor.tamri.com> <37043@cbmvax.commodore.com>
- Date: Sun, 15 Nov 92 07:28:55 GMT
- Lines: 150
-
- > First, I've only seen spurious drive errors once in my entire time
- >working with HD's
-
- Most people never see it ... they don't know what to look for. They
- just swap components in the system until it works.
-
- I've seen this problem at least a hundred times ... the most recent was a
- few months ago, when a local PC repair shop was baffled by repeated "disk
- errors" during backup and at no other time. Put a scope on the 12 volt line
- and it had nearly a volt of ripple when the tape drive was running ... in
- spec otherwise. The case before that was due to 3 disk drives ... average
- current on 12v was more than fine, and they used sequenced startup to
- prevent an excessive surge at power up. But when the brushless DC spindle
- motors all commutated at the same time (which happened about every 5-6
- minutes) ... they drank the small output caps on the switcher dry. Added a
- big cap to 12v and everything was happy.
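-
- For a rough sense of the sizing involved, here is a back-of-the-envelope
- calculation of the bulk capacitance needed to ride out such an event. The
- current step, event duration and ripple budget are illustrative assumptions,
- not measurements from that machine.
-
-     /* C = I * dt / dV: charge the cap must supply during the event.
-        All numbers below are illustrative assumptions. */
-     #include <stdio.h>
-
-     int main(void)
-     {
-         double i_step  = 3.0;    /* assumed extra load current, amps     */
-         double t_event = 0.002;  /* assumed commutation event, seconds   */
-         double dv_max  = 0.25;   /* ripple budget on the 12v rail, volts */
-
-         double c_farads = i_step * t_event / dv_max;
-
-         printf("bulk cap needed: %.0f uF\n", c_farads * 1e6); /* ~24000 */
-         return 0;
-     }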
-
- Another problem I see that few people properly diagnose is cold startup
- thermal gradient failure. Nearly all disk drives are spec'ed for a thermal
- gradient of only a few deg C/hr. Going from a chilly room temp to operating
- temp in 10 minutes is an effective 100 deg C/hr gradient ... many HDAs warp
- enough during this period to go seriously off track ... and cause loss of
- data. Some go off-track enough to partially overwrite the IDs and data in
- the next track, or at least produce enough signal interference to make
- normal operation difficult. Many of the in-field error rate problems I have
- seen have really been this problem ... solved by having the customer leave
- the machine on 24hrs/day.
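-
- To make that arithmetic concrete, here is a trivial sketch of the effective
- gradient calculation; the start and end temperatures are assumptions picked
- to match the scenario above, not the spec of any particular drive.
-
-     /* Effective thermal gradient for a cold start.  Temperatures are
-        illustrative assumptions for a chilly office. */
-     #include <stdio.h>
-
-     int main(void)
-     {
-         double t_start = 15.0;   /* chilly room temp, deg C (assumed)   */
-         double t_end   = 32.0;   /* HDA operating temp, deg C (assumed) */
-         double minutes = 10.0;   /* warm-up time from the example above */
-
-         double gradient = (t_end - t_start) / (minutes / 60.0);
-         printf("effective gradient: %.0f deg C/hr\n", gradient); /* ~100 */
-         return 0;
-     }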
-
- I have a diagnostic for this which fails more 8-16 head drives than you
- might expect! Some fail in systems due to chassis thermal gradient stresses
- or cooling air flow differences, yet work fine on the bench.
- The most recent case of this was also earlier this year ... for a
- hospital pharmacy ... recovering the client's data took nearly a dozen
- thermal cycles before every sector could be read without error.
-
- Bring the drive to the high end of its operating temp for 20 mins, then
- format and initialize every sector of the drive. Bring the drive to the low
- end of its operating range, and re-write every EVEN sector. Turn the drive
- off and cool it to 60 deg F in the system (office night conditions). Bring
- the ambient temp quickly to 80, start the system, and randomly write every
- ODD track, verifying both neighboring tracks. Then read-scan the drive
- (observing soft error rates) at the high and low ends of the operating
- range. Thermal-sink to 65 and do random EVEN track read scans during
- several in-system warm-up cycles.
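-
- A skeleton of how that cycle might be automated is sketched below. The
- chamber and raw-disk helpers (set_ambient, soak, format_all, write_sector
- and friends) are hypothetical placeholders for whatever thermal chamber
- control and controller access a given setup provides, not a real API.
-
-     /* Hypothetical skeleton of the thermal-cycle diagnostic above.    */
-
-     #define TEMP_OP_HIGH 122   /* deg F, assumed top of drive spec     */
-     #define TEMP_OP_LOW   41   /* deg F, assumed bottom of drive spec  */
-
-     /* Placeholders for chamber control and raw disk access -- NOT a
-        real driver interface. */
-     extern void set_ambient(int deg_f);
-     extern void soak(int minutes);
-     extern void power_on(void), power_off(void);
-     extern void format_all(void);
-     extern void write_sector(int track, int sector);
-     extern void write_track(int track);
-     extern void verify_track(int track);
-     extern void read_scan(int deg_f);
-
-     void thermal_diagnostic(int tracks, int sectors_per_track)
-     {
-         int t, s;
-
-         set_ambient(TEMP_OP_HIGH);  soak(20);
-         format_all();                      /* format + init every sector */
-
-         set_ambient(TEMP_OP_LOW);   soak(20);
-         for (t = 0; t < tracks; t++)       /* re-write every EVEN sector */
-             for (s = 0; s < sectors_per_track; s += 2)
-                 write_sector(t, s);
-
-         power_off();  set_ambient(60);  soak(8 * 60);  /* cool overnight */
-         set_ambient(80);  power_on();                  /* quick warm-up  */
-
-         for (t = 1; t < tracks; t += 2) {  /* ODD tracks (random order in */
-             write_track(t);                /* practice); check neighbors  */
-             verify_track(t - 1);
-             if (t + 1 < tracks)
-                 verify_track(t + 1);
-         }
-
-         read_scan(TEMP_OP_HIGH);           /* watch soft error rates     */
-         read_scan(TEMP_OP_LOW);
-         /* then thermal-sink to 65 deg F and read-scan EVEN tracks over
-            several in-system warm-up cycles */
-     }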
-
- An interesting side effect of thermal gradient off-tracking is PLL
- dropout in the data separator, which generates an error burst longer
- than the normal detect-and-correct range of the ECC ... and
- miscorrections occur. When doing data recovery on such systems,
- you can't always trust the "corrected" data from the drive.
-
-
- > You need to separate multi-tasking from multi-user. Single-user
- >machines (and this includes most desktop Unix boxes) don't have the activity
- >levels for the example you gave to have any relevance. It's rare that more
- >than one or two files are being accessed in any given second or even minute.
-
- I would have agreed with you up until last year ... X Windows and MS
- Windows have drastically changed the volume of disk I/O on PCs. At the same
- time, disk I/O has become a major wait factor in the startup and
- initialization of most GUI applications ... often contributing 80% of the
- minute or two big applications take to start up. In addition, applications
- seldom deal with simple ASCII files ... they are now full of font and
- layout info, if not bitmaps. The evolution was most noticeable on my Mac
- ... when I first got my 128K Mac I didn't understand why all the apps were
- so big ... today I don't think there are many apps that will even load on
- it ... many require more than the 1MB of a Mac Plus.
-
- Applications that delivered acceptable performance on a 60KB/sec filesystem
- 10 years ago now need 500KB/sec or more. MS-DOS and UNIX V7 filesystems
- don't deliver that kind of performance ... single-block filesystems
- are a thing of the past.
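-
- As a rough illustration of why a block-at-a-time filesystem caps out at a
- few tens of KB/sec, consider a drive that, in the worst case, waits a full
- rotation between single-block transfers. The spindle speed and cluster size
- below are typical-for-the-era assumptions, not figures from any particular
- drive.
-
-     /* Block-at-a-time vs. clustered throughput, worst case.  Figures
-        are typical-for-the-era assumptions, not measurements. */
-     #include <stdio.h>
-
-     int main(void)
-     {
-         double rpm     = 3600.0;            /* spindle speed (assumed) */
-         double rev_sec = 60.0 / rpm;        /* 16.7 ms per revolution  */
-         double block   = 512.0;             /* one filesystem block    */
-         double cluster = 8.0 * 1024.0;      /* 8KB read-ahead cluster  */
-
-         /* One 512-byte block per revolution ...                       */
-         double single    = block / rev_sec / 1024.0;    /* ~30 KB/sec  */
-         /* ... versus 16 contiguous blocks in that same revolution.    */
-         double clustered = cluster / rev_sec / 1024.0;  /* ~480 KB/sec */
-
-         printf("single-block: %.0f KB/sec   clustered: %.0f KB/sec\n",
-                single, clustered);
-         return 0;
-     }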
-
-
-
- >recognition of the costs and complexity involved should be there also.
- >Filesystems are complex beasts, and that complexity can lead to high
- >maintenance costs. Things that increase the complexity of an already
- >complex item for performance gains that are only rarely needed for the
- >application are suspect. Also, adding something to an already complex
- >object can be more costly than adding it to a simple object, because the
- >complexities interact.
-
- My point EXACTLY ... SCSI firmware & drivers have become too complex,
- in many cases more complex than the application & filesystem using them!
- Enter IDE: simple firmware and simple drivers. And with some minor
- issues resolved (like better host adapters), better performance.
-
- >>If this single user is going to run all the requests thru the cache
- >>anyway ... why not help it up front ... and queue a significant amount
- >>of the I/O on open or first reference. There are a few files were this
- >>is not true ... such as append only log files ... but there are clues
- >>that can be taken.
- >
- > Why should the single-user have to run everything through a cache?
- >I think direct-DMA works very nicely for a single-user environment, especially
- >if you don't own stock in RAM vendors (as the current maintainers of Unix,
- >OS/2, and WindowsNT seem to). Reserve the buffers for things that are likely
- >to be requested again or soon - most sequential reads of files are not re-used.
-
- Generally user applications don't ask for a whole file, properly aligned
- and an even number of sectors long ... but the filesystem can. Disk caches
- don't need an excessively large amount of DRAM to make a significant
- performance difference.
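-
- As a minimal sketch of what "queue the I/O on open or first reference"
- might look like inside a filesystem, assume a simple buffer cache with an
- asynchronous block read; cache_lookup() and start_read() below are
- hypothetical names standing in for whatever a given kernel provides.
-
-     /* Read-ahead on open / first reference -- a sketch, not a real
-        kernel interface. */
-
-     #define RA_BLOCKS 16     /* how far ahead to queue (assumed policy) */
-
-     struct inode { long size; long blocksize; /* ... */ };
-
-     extern int  cache_lookup(struct inode *ip, long blkno); /* 1=cached */
-     extern void start_read(struct inode *ip, long blkno);   /* async    */
-
-     /* Called from open(), or on the first read of a file that looks
-        sequential (i.e. not an append-only log).                        */
-     void readahead_on_open(struct inode *ip)
-     {
-         long nblocks = (ip->size + ip->blocksize - 1) / ip->blocksize;
-         long limit   = nblocks < RA_BLOCKS ? nblocks : RA_BLOCKS;
-         long b;
-
-         for (b = 0; b < limit; b++)
-             if (!cache_lookup(ip, b))
-                 start_read(ip, b);  /* whole, aligned blocks queued now;
-                                        the user's reads hit the cache   */
-     }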
-
- Actually, from my experience, most files opened and read sequentially
- will be opened and read sequentially again ... and soon. C compilers
- open the source once, then typically open 6-20 include files which
- will be used again, if not for another compile, then by the same user's
- next try of the same program. Ditto for the many default, startup
- and font files in an X/Motif environment. Makefiles, shell scripts
- and other such tools are another good example. Many small programs
- are also frequently re-used, especially those used by make & shell
- scripts. In development environments, common libraries are also
- often read.
-
- Caching all writes is also highly productive ... the editor writes out
- C source, to be read by the C compiler, which writes an object module read
- by the linker, which writes the executable to be read by the system loader
- when the user tests. The cycle repeats, deleting ALL the files created by
- the cycle ... if you hold the I/O long enough ... it really isn't I/O
- after all.
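-
- A toy sketch of that write-behind idea is below, assuming a simple list of
- dirty buffers; the structures, the FLUSH_DELAY policy and the helper
- functions are illustrative, not any real kernel's.
-
-     /* Delayed (write-behind) caching: dirty buffers age in memory, and
-        files deleted before the delay expires never reach the disk.     */
-     #include <time.h>
-
-     #define FLUSH_DELAY 30     /* seconds to hold dirty data (assumed)  */
-
-     struct buf {
-         int         dirty;
-         time_t      dirtied_at;
-         struct buf *next;
-     };
-
-     extern struct buf *dirty_list;        /* all buffers awaiting write */
-     extern void write_to_disk(struct buf *bp);
-     extern void discard(struct buf *bp);
-
-     /* Periodic flusher: only push buffers aged past the delay.         */
-     void flush_daemon(void)
-     {
-         struct buf *bp;
-         time_t now = time(0);
-
-         for (bp = dirty_list; bp != 0; bp = bp->next)
-             if (bp->dirty && now - bp->dirtied_at >= FLUSH_DELAY)
-                 write_to_disk(bp);
-     }
-
-     /* Called on unlink(): the .o files, temporaries and old executables
-        of an edit-compile-test cycle die here, before any disk I/O.     */
-     void cancel_pending_writes(struct buf *file_bufs)
-     {
-         struct buf *bp;
-         for (bp = file_bufs; bp != 0; bp = bp->next) {
-             bp->dirty = 0;
-             discard(bp);
-         }
-     }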
-
- Doing selective caching has very unpredictable negative side-effects,
- unless selected by the application/user ... and even then .....
-
- >>capable of servicing 486/RISC desktop machines ... and certainly in
- >>conflict with the needs of fileservers and multiuser applications engines
- >>found in 100,000's point of sale systems managing inventory and register
- >>scanners in each sales island -- Sears, Kmart, every supermarket, autoparts
- >>stores and a growing number of restrants and fast food stores.
- >
- > Those are interesting areas, but those are not desktop markets,
- >and those are not single-user markets. Those are large, multi-user (from the
- >point of the server) transaction and database server systems.
-
- Certainly NOT LARGE - but built from the same desktop PC machines we
- discuss ... few vendors build UNIX or Novell servers ... customers take
- these cheap single-user desktop boxes and put UNIX/Novell (and soon NT)
- on them to solve their needs. Most of the point-of-sale machines
- are 2-4 meg 386's with 40-80MB of disk ... the same or smaller than most
- MS Windows machines ... they just have an additional 8-16 serial ports!
-
- The days of the single-user, single-process desktop machine are
- nearly over. MultiFinder on the Mac, Windows for DOS, and peer-to-peer
- workgroup networks with file/resource sharing have just about
- obsoleted that old OS/hardware model. This new model is even
- more demanding of hardware than most character-based UNIX systems.
- Except for the simplest home computer applications
- (Johnny's word processor and games), desktop systems of the '90s
- are not just cute little boxes ... they need some serious punch.
-
-