- Newsgroups: comp.databases.sybase
- Path: sparky!uunet!caen!spencer
- From: spencer@med.umich.edu (Spencer W. Thomas)
- Subject: Dump integrity??
- Message-ID: <SPENCER.92Dec18122733@guraldi.med.umich.edu>
- Date: Fri, 18 Dec 92 12:27:33 EST
- Organization: University of Michigan
- Distribution: comp
- Nntp-Posting-Host: guraldi.itn.med.umich.edu
- Lines: 65
-
- We ran into a situation this week that really scares me. We were
- upgrading our server from 4.2 to 4.8, and made dumps of all our
- databases. During the upgrade process, it became necessary (see my
- other message about this) to restore one of the databases from the
- backup. We got an error message about an invalid logical page pointer
- (with a very large number, definitely invalid), and the load aborted.
- At this point, it was not possible to do ANYTHING with the database,
- including trying to drop it, except to LOAD it. If the backup we were
- loading had been the only one we had, we would have been in very sad
- shape, with no option but to totally rebuild the database partition
- from scratch.
-
- Once we got things rebuilt, I decided to lay down another dump, so I
- used 'mt fsf' to skip past the dumps that were already on the tape.
- It failed(!) with "I/O error". Hmm....
-
- I then tried using 'dd' to read the dump file. It also failed with an
- I/O error. Very suspicious... So, we trekked over to the machine
- room to replace the tape. The console log showed these errors:
-
- Dec 16 12:15:14 mendel vmunix: st1: forwardspace filemark failed
- Dec 16 12:15:14 mendel vmunix: st1 error: sense key(0x4): hardware error
- Dec 16 12:16:43 mendel vmunix: st1: forwardspace filemark failed
- Dec 16 12:16:43 mendel vmunix: st1 error: sense key(0x4): hardware error
- Dec 16 12:16:43 mendel vmunix: st1: file positioning error
- Dec 16 12:24:57 mendel vmunix: st1: read failed
- Dec 16 12:24:57 mendel vmunix: st1 error: sense key(0x4): hardware error
-
- I see some problems here:
-
- 1. There appears to have been no indication of the tape failure
- during the dump. This is a hardware problem, and cannot be blamed
- on Sybase.
- 2. There appears to be no way to verify (via the server) that a dump
- was successful, short of trying to reload it. (No "verify after
- write" mode for the dump, for example.)
- 3. The read error was NOT REPORTED by the load database command;
- instead, we got a bogus message about a bad logical page pointer.
- ****** This is totally inexcusable ******* We thought we had an
- integrity problem with the state of the database prior to the dump, and
- wasted a fair amount of time trying to discover why. If the error
- had been "I/O error during load" or some such, we could have
- responded in a more appropriate way.
- 4. The database was left in a state that was essentially impossible to
- recover from. It appears that once a load fails, the only thing
- you can do to that database is to load it successfully (you can't
- even drop it, unless you can convince the server that it is
- suspect, so that you can dbcc dbrepair (dropdb) it.)
- 5. For the most part, the manuals were totally useless in helping us
- recover. (We never got lost enough to try calling tech support, so
- I don't know whether they would have been any help, either).
-
- I am now trying to figure out how to do database dumps in such a way
- that I am confident that I will always be able to restore the data, or
- at least so I will know when a dump is not restorable so I can take
- action to make a good one! My current idea is this: After the dump
- runs each night, use Unix commands to back up the tape and read the
- dump file(s). This will at least let me discover media errors.
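
The read-back check described above could be sketched roughly as follows. This is only an illustration, not anything Sybase provides: the function name verify_readable and the 64 KB block size are assumptions, and on many tape drivers the read size must be at least as large as the record size the dump was written with. It reads a file or device to the end and reports the first I/O error, so it can reveal media errors but says nothing about whether the dump contents are logically valid.

```python
def verify_readable(path, block_size=64 * 1024):
    """Read `path` sequentially until end-of-file.

    Returns (bytes_read, error). `error` is None if every read
    succeeded, otherwise the OSError raised by the open or by the
    failing read. `block_size` is an assumed value; adjust it to
    the record size used when the dump was written.
    """
    total = 0
    try:
        # buffering=0 so each read() call maps to one read on the device
        with open(path, "rb", buffering=0) as f:
            while True:
                block = f.read(block_size)
                if not block:
                    # Empty read: end of file (a filemark on a tape device)
                    break
                total += len(block)
    except OSError as exc:
        # Media or driver errors (for example EIO) end up here
        return total, exc
    return total, None
```

After positioning the tape at the start of a dump file (for example with 'mt fsf'), one might call verify_readable on the no-rewind tape device and treat any non-None error as a sign that the dump needs to be redone on fresh media. The device name to pass depends on the system.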
-
- Just sign me,
- Unhappy but wiser (and mad as h***!)
- --
- =Spencer W. Thomas | Info Tech and Networking, B1911 CFOB, 0704
- "Genome Informatician" | Univ of Michigan, Ann Arbor, MI 48109
- Spencer.W.Thomas@med.umich.edu | 313-747-2778, FAX 313-764-4133
-