- Path: sparky!uunet!ukma!bogus.sura.net!howland.reston.ans.net!spool.mu.edu!nigel.msen.com!msage
- From: msage@msen.com (Micro Sage Software)
- Newsgroups: comp.os.rsts
- Subject: Re: Exabyte Backup Failures
- Date: 28 Jan 1993 00:17:23 GMT
- Organization: Msen, Inc. -- Ann Arbor, Michigan
- Lines: 102
- Distribution: inet
- Message-ID: <1k78n2INNhvb@nigel.msen.com>
- References: <1k4rs4INN5tv@nigel.msen.com>
- NNTP-Posting-Host: garnet.msen.com
- X-Newsreader: TIN [version 1.1 PL8]
-
- Dan and Karen Sugalski (dan@msen.com) wrote:
- : Stephen Carter (stevedc@syma.sussex.ac.uk) wrote:
-
- : : We are running RSTS 9.7. We have had exabytes since I don't know how
- : : long, and have been VERY happy with the overall functionality.
-
- : : In July 1992 we swapped two 512MB Fujis for two 2GB Seagate ST42400
- : : SCSI discs hanging off a CMD SCSI controller. The users' data did
- : : expand (naturally!) but currently stands at (laboriously calculated)
- : : 2,146,696,704 bytes.
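
For readers who want to check the figure above, here is a minimal sketch of the arithmetic. It assumes the total was built up from counts of 512-byte disk blocks; the helper name `summarize_volume` is made up for illustration and is not part of any RSTS tool.

```python
BLOCK_BYTES = 512  # assumed disk block size used for the conversion


def summarize_volume(total_bytes: int) -> dict:
    """Express a byte count as whole blocks, decimal GB and binary GiB."""
    blocks, leftover = divmod(total_bytes, BLOCK_BYTES)
    return {
        "blocks": blocks,                  # whole 512-byte blocks
        "leftover_bytes": leftover,        # bytes that do not fill a block
        "decimal_gb": total_bytes / 10**9,
        "binary_gib": total_bytes / 2**30,
    }


if __name__ == "__main__":
    summary = summarize_volume(2_146_696_704)
    for name, value in summary.items():
        print(f"{name}: {value}")
```

The byte count divides evenly into 4,192,767 blocks, which puts the data just under 2 GiB, so it is close to the full nominal capacity of one of the new discs.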
-
- : : When we back up now, the run fails (after 13 hours) with the following:
-
- : : |
- : : |
- : : ?Error reading Backup set
- : : ?Data error on device
- : : ?Error reading Backup set
- : : ?Data error on device
- : : ?Unexpected error 14 in RSTRMS
- : : |
-
- : : On the face of it, with these data volumes it may appear to be a badly
- : : handled end-of-reel situation, so before folk flame me, I HAVE seen a real
- : : end-of-reel situation on the exabyte, and that was handled properly
- : : (Please mount volume 2 of Backup set etc etc)
-
- : : What is it?
-
- RSTRMS is just an overlay in BACKUP, not something that you explicitly
- called into being. It's part of the verification pass (the routine that
- actually reads the backup set). This problem is most evident on 11/44
- systems and much less of a problem on 11/84s.
-
- : Well, this sounds suspiciously like a problem that I've encountered on a
- : lot of customers' machines--namely that RSTS' TMSCP handler has pretty
- : poor error recovery. While I've never encountered an error in the middle
- : of a backup, the most anyone's ever dumped to the thing are two RA81s
- : (~750 meg of data, not counting free space left on the drives).
-
- Actually, the largest customer we have using one of these has 2 nearly
- full RA81s (~0.9 GB). They only encounter problems during restores.
-
- : What seems to happen is that somewhere between the tape drive and the
- : TMSCP handler, a packet gets lost, and the two get out of sync, never
- : again to talk. (We have to power the drive and computer off--a reboot
- : doesn't seem to be enough)
-
- Right. The reason VMS will work with the turkey drives is that its TMSCP
- handler is very pushy about things. It will keep hammering at a drive
- until something gives, whereas RSTS tries to be polite about it. Since
- these drives barely respond to SCSI at all, RSTS often loses.
-
- : The easiest way we've found to trigger it is to do a restore when there's
- : a load on the machine (100% guaranteed to cause a failure). The restore
- : works OK and the data gets restored, but the tape drive will not talk to
- : the computer any more. Symptoms include: Data Error on Device errors,
- : Magtape Select errors, Device Hung errors, and, my favorite, complete
- : machine lockups. (The latter can be cured by either inserting a tape or
- : ejecting the tape that's already in. Things pick up where they left off,
- : no harm done)
-
- : You might try upgrading to 10.1 (Skip 10.0 unless you're ready to apply
- : several dozen patches, including one that's not documented anywhere, but
- : is lurking on the upgrade tape). The TMSCP handling is reportedly better.
- : I don't have any experience with the drives on it, though, as I don't
- : know anyone with one of the tape drives that is using it.
-
- I second the 10.1 recommendation (with emphasis on skipping 10.0). Check
- with Colorado, as all our customers received their updates at about the
- same time (6 months ago). 10.0 will work okay if you get all the patches.
- Contact me if you would like them; I've already got them all typed in as
- an ONLPAT command file.
-
- : Alternatively, you may want to try putting a pause of some sort between
- : the BACKUP commands (maybe the big dump of data is causing a problem and
- : giving the drive a chance to flush buffers and such will help). If you're
- : doing the verify after each backup, then *don't*! Do both backups first,
- : then verify the tape.
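
The ordering suggested above (all backups first, with a pause after each one, then a single verify pass) can be sketched as follows. This is only an illustration of the sequencing, not an RSTS command procedure: the function name, the callables and the 60-second default are all hypothetical, and each callable stands in for whatever actually launches a BACKUP or verify step at your site.

```python
import time
from typing import Callable, Iterable


def backup_all_then_verify(
    backups: Iterable[Callable[[], bool]],
    verify: Callable[[], bool],
    pause_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run every backup step, pausing after each, then verify once at the end.

    Each callable returns True on success. The pause length is a guess; the
    idea is only to give the drive time to flush before the next step.
    """
    for job in backups:
        if not job():
            return False      # a failed backup makes the verify pointless
        sleep(pause_seconds)  # let the drive settle before the next step
    return verify()           # single verify pass over the finished tape


if __name__ == "__main__":
    steps = [lambda: True, lambda: True]  # placeholders for two BACKUP runs
    print(backup_all_then_verify(steps, lambda: True, pause_seconds=0.1))
```

Passing `sleep` as a parameter keeps the sketch easy to dry-run without actually waiting.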
-
- We verify tapes around here with a home-grown tape verification program.
- It's written in Macro-11 and takes advantage of streaming tape drives
- (gets them going pretty fast). It is used every night at about 5 sites on
- Exabytes, 3 of them using CMD controllers.
-
- Also, don't make your buffers too big. If they are too large, then RSTS
- will have to keep stopping and refilling them, which will cause the drive
- to stop streaming. That puts a lot of wear on an already rickety tape
- drive (get the hint that I have little love for Exabyte?). Make the block
- size MAXIMUM and leave the buffer size alone.
-
- What Version of RSTS are you running? What type of system and how much
- memory? The whole thing can be system dependent.
-
- If you'd like the verify program, contact me and I can E-Mail it to you.
- It's only about 20 blocks of source.
-
- Gerry Duprey EMAIL: gerry@msage.com
- Micro Sage Software Systems VOICE: (313) 663-0444
- 130 South First Street
- Ann Arbor, MI 48104 USA
-
-