- Path: sparky!uunet!enterpoop.mit.edu!spool.mu.edu!nigel.msen.com!dan
- From: dan@msen.com (Dan and Karen Sugalski)
- Newsgroups: comp.os.rsts
- Subject: Re: Exabyte Backup Failures
- Date: 27 Jan 1993 02:25:56 GMT
- Organization: Msen, Inc. -- Ann Arbor, Michigan
- Lines: 76
- Distribution: inet
- Message-ID: <1k4rs4INN5tv@nigel.msen.com>
- References: <1993Jan25.123329.11700@syma.sussex.ac.uk>
- NNTP-Posting-Host: garnet.msen.com
- X-Newsreader: TIN [version 1.1 PL8]
-
- Stephen Carter (stevedc@syma.sussex.ac.uk) wrote:
-
- : We are running RSTS 9.7. We have had exabytes since I don't know how
- : long, and have been VERY happy with the overall functionality.
-
- : In July 1992 we swapped two 512Mb Fuji's for Two 2Gb Seagate ST42400
- : Scsi discs hanging off a CMD Scsi controller. The users' data did
- : expand (naturally!) but currently stands at (laboriously calculated)
- : 2,146,696,704 bytes.
-
- : When we backup now, the run fails (after 13 hours) with the following
-
- : |
- : |
- : ?Error reading Backup set
- : ?Data error on device
- : ?Error reading Backup set
- : ?Data error on device
- : ?Unexpected error 14 in RSTRMS
- : |
-
- : On the face of it, with these data volumes it may appear to be a badly
- : handled end-of-reel situation, so before folk flame me, I HAVE seen a real
- : end-of-reel situation on the exabyte, and that was handled properly
- : (Please mount volume 2 of Backup set etc etc)
-
- : What is it?
-
- Well, this sounds suspiciously like a problem that I've encountered on a
- lot of customers' machines--namely that RSTS' TMSCP handler has pretty
- poor error recovery. I've never encountered an error in the middle of a
- backup myself, but the most anyone I know has ever dumped to the thing
- is two RA81s (~750 meg of data, not counting free space left on the
- drives).
-
- What seems to happen is that somewhere between the tape drive and the
- TMSCP handler, a packet gets lost, and the two get out of sync, never
- again to talk. (We have to power the drive and computer off--a reboot
- doesn't seem to be enough)
-
- The easiest way we've found to trigger it is to do a restore when there's
- a load on the machine (100% guaranteed to cause a failure). The restore
- works OK and the data gets restored, but the tape drive will not talk to
- the computer any more. Symptoms include: Data Error on Device errors,
- Magtape Select errors, Device Hung errors, and, my favorite, complete
- machine lockups. (The latter can be cured by either inserting a tape or
- ejecting the tape that's already in. Things pick up where they left off,
- no harm done)
-
- You might try upgrading to 10.1 (skip 10.0 unless you're ready to apply
- several dozen patches, including one that's not documented anywhere, but
- is lurking on the upgrade tape). The TMSCP handling is reportedly better.
- I have no experience with the drives under 10.1, though, as I don't know
- anyone with one of these tape drives who is running it.
-
- Alternatively, you may want to try putting a pause of some sort between
- the BACKUP commands (maybe the big dump of data is causing a problem, and
- giving the drive a chance to flush buffers and such will help). If you're
- doing the verify after each backup, then *don't*! Do both backups first,
- then verify the tape.
-
- Finally, check to see if there are any batch jobs that might be running in
- a different queue. (The bigger backup may be overlapping something else.)
- Also consider doing a SET SYSTEM/NOLOGINS in the com file before the
- backup starts. (You may want to shut down the other queues, as well as any
- other jobs that may be running)
-
- And, even more finally, back up the non-system disk *first*. It's been my
- experience that, once TMSCP loses it, the backup/restore job is immortal.
- (It looks to have pending async I/O--prio/runburst go to 128/127 but it
- doesn't die.) $SHUTUP won't be able to bring the system down, so you'll
- have to power off the machine. Dismount the big drive before you do--one
- fewer drive to clean...
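
For what it's worth, the ordering suggested above can be sketched as a small plan builder. This is only an illustration in Python, not RSTS DCL: the step names, the device names (`SY0:`, `DU1:`) and the 60-second pause are all made up for the example, and nothing here issues a real BACKUP command.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    action: str       # one of "nologins", "backup", "pause", "verify" (hypothetical labels)
    target: str = ""  # disk name, pause length in seconds, or "tape"


def plan_backup(system_disk, other_disks, pause_seconds=60):
    """Return the ordered steps for one backup run.

    Order follows the advice in the post: stop logins first, dump the
    non-system disks before the system disk, pause between dumps, and
    verify only once every dump is on tape.
    """
    steps = [Step("nologins")]
    # Non-system disks go first; the system disk is dumped last.
    order = list(other_disks) + [system_disk]
    for i, disk in enumerate(order):
        if i > 0:
            # Assumed pause length; the post only says "a pause of some sort".
            steps.append(Step("pause", str(pause_seconds)))
        steps.append(Step("backup", disk))
    # A single verify pass at the end instead of one after each dump.
    steps.append(Step("verify", "tape"))
    return steps


if __name__ == "__main__":
    for step in plan_backup("SY0:", ["DU1:"]):
        print(step.action, step.target)
```

Each step would still have to be translated by hand into whatever the site's command file actually uses; the sketch only pins down the sequence.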
-
-
- .Sigs? We don' need no steenkin' .sigs!
- Dan
-