- Newsgroups: comp.databases
- Path: sparky!uunet!munnari.oz.au!metro!bnd2.bnd.oz.au!johnw
- From: johnw@bnd2.bnd.oz.au (John Warburton)
- Subject: Re: Hot Standby DBMS's
- Message-ID: <1992Aug19.001017.10999@bnd2.bnd.oz.au>
- Organization: B&D Doors, Sydney, NSW, Australia
- References: <9208172103.AA04378@hplwk.hpl.hp.com>
- Date: Wed, 19 Aug 1992 00:10:17 GMT
- Lines: 51
-
- From article <9208172103.AA04378@hplwk.hpl.hp.com>, by albert@HPLWK.HPL.HP.COM (Joseph Albert):
- >
- > Jonathan Edwards writes:
- >
- >>In the transaction-processing world, there is the concept of a 'hot standby'
- >>system, which is a geographically separated system containing a copy of the
- >>database, and capable of coming online very quickly. The replicated data must
- >>be close to current, and guaranteeing complete synchronization is required
- >>by some applications. A further feature is the ability to 'catchup'
- >>incrementally to missed changes after an outage, without a complete database
- >>copy. Our database (homebrew non-relational) does this.
- >>Are there any other databases that can do this?
- >
- > A more reliable fault-tolerance would be obtained from redundancy at a
- > level which is lower than the level of abstraction of the database.
- > Disks can be made redundant by having 2 or 3 physical disks, with
- > drivers that make them look like a single device. if one disk fails,
- .
- .
- .
- .
- lots more on physical fault tolerance deleted...
-
- That all sounds good for cpu, disk crashes etc, but what about when you plug
- humans into the loop?
-
- Four of our last five crashes were caused by human error, not hardware faults:
- mainly someone filling up the database disks so that no more writes were
- possible, or accidentally writing directly onto the database volume file .. sigh
-
- I think that for these sorts of problems, the reload of the database
- checkpoint before replaying transactions could have been avoided with a second
- machine that has the checkpoint written to it at the same time as to tape, and
- has the transaction logs sent to it every 15 mins. With the checkpoint already
- loaded there, all we would have to do is replay the transactions, saving us
- about 2 1/2 expensive hours.
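-
- To make the idea concrete, here is a rough toy sketch of the scheme: the
- standby is seeded from a checkpoint, receives batches of logged transactions,
- and only has to replay them at takeover. Everything in it (the Primary and
- Standby classes, the in-memory dict standing in for the database, the
- sequence numbers) is made up for illustration and is not how any particular
- DBMS does it.

```python
"""Toy model of a standby fed by a checkpoint plus periodically shipped logs.

All names here are hypothetical; a dict stands in for the database.
"""
from dataclasses import dataclass, field


@dataclass
class Standby:
    data: dict = field(default_factory=dict)
    applied_upto: int = 0                       # last sequence number replayed
    pending: list = field(default_factory=list)  # shipped but not yet replayed

    def load_checkpoint(self, snapshot, seq):
        # Done at checkpoint time (alongside the tape copy), not after a crash.
        self.data = dict(snapshot)
        self.applied_upto = seq
        self.pending.clear()

    def receive_log_batch(self, batch):
        # Called on every log shipment, e.g. every 15 minutes.
        self.pending.extend(batch)

    def take_over(self):
        # After a failure, only the shipped transactions need replaying.
        for seq, key, value in sorted(self.pending):
            if seq > self.applied_upto:
                self.data[key] = value
                self.applied_upto = seq
        self.pending.clear()
        return self.data


@dataclass
class Primary:
    data: dict = field(default_factory=dict)
    seq: int = 0
    unshipped: list = field(default_factory=list)

    def write(self, key, value):
        self.seq += 1
        self.data[key] = value
        self.unshipped.append((self.seq, key, value))

    def checkpoint_to(self, standby):
        standby.load_checkpoint(self.data, self.seq)
        self.unshipped.clear()

    def ship_logs_to(self, standby):
        standby.receive_log_batch(self.unshipped)
        self.unshipped = []


if __name__ == "__main__":
    primary, standby = Primary(), Standby()
    primary.write("order_1", "entered")
    primary.checkpoint_to(standby)           # checkpoint goes to the standby
    primary.write("order_1", "shipped")
    primary.ship_logs_to(standby)            # periodic log shipment
    primary.write("order_2", "entered")      # not shipped before the "crash"
    print(standby.take_over())
```

- Note that in this sketch anything written after the last shipment is missing
- on the standby, so the shipping interval bounds how much work would have to
- be re-entered.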
-
- The reason this looks good is that you do not have to replace your existing
- hardware with expensive fault-tolerant machines. All you need is a (secondhand?)
- machine like your production one, with cheaper disks. As long as it limps
- along until you fix the problem on the production machine, that's fine.
-
- comments?
-
- > Joseph Albert
- > albert@hplabs.hp.com
- --
- John Warburton Internet: johnw@bnd2.bnd.oz.au
- Systems Administrator Phone: +61 2 771 5566
- B & D Australia Fax: +61 2 771 6385
- Living on ice-cream and chocolate kisses...
-