NetNews Usenet Archive 1992 #18

Text File | 1992-08-18 | 2.6 KB | 62 lines

Newsgroups: comp.databases
Path: sparky!uunet!munnari.oz.au!metro!bnd2.bnd.oz.au!johnw
From: johnw@bnd2.bnd.oz.au (John Warburton)
Subject: Re: Hot Standby DBMS's
Message-ID: <1992Aug19.001017.10999@bnd2.bnd.oz.au>
Organization: B&D Doors, Sydney, NSW, Australia
References: <9208172103.AA04378@hplwk.hpl.hp.com>
Date: Wed, 19 Aug 1992 00:10:17 GMT
Lines: 51

From article <9208172103.AA04378@hplwk.hpl.hp.com>, by albert@HPLWK.HPL.HP.COM (Joseph Albert):
>
> Jonathan Edwards writes:
>
>>In the transaction-processing world, there is the concept of a 'hot standby'
>>system, which is a geographically separated system containing a copy of the
>>database, and capable of coming online very quickly. The replicated data must
>>be close to current, and guaranteeing complete synchronization is required
>>by some applications. A further feature is the ability to 'catchup'
>>incrementally to missed changes after an outage, without a complete database
>>copy. Our database (homebrew non-relational) does this.
>>Are there any other databases that can do this?
>
> A more reliable fault-tolerance would be obtained from redundancy at a
> level which is lower than the level of abstraction of the database.
> Disks can be made redundant by having 2 or 3 physical disks, with
> drivers that make them look like a single device. if one disk fails,

. . . . lots more on physical fault tolerance deleted...

That all sounds good for CPU failures, disk crashes etc, but what about
when you plug humans into the loop? 4 out of our last 5 crashes have been
human error & not hardware problems. Mainly filling up the database disks
so that no more writes were possible, or someone accidentally writing
directly onto the database volume file .. sigh

I think that these sorts of problems, and the resultant reload of the
database checkpoint & replay of transactions, could have been avoided with
a second machine that has the checkpoint written to it at the same time as
it is written to tape, and has the transaction logs sent to it every 15
mins. That saves the time spent reloading the checkpoint - all we would do
is replay the transactions - saving us about 2 1/2 expensive hours.

The reason this looks good is that you do not have to replace your
existing hardware with expensive fault tolerant machines. All you need is a
(secondhand?) machine like your production one, with cheaper disks - just
as long as it limps along until you fix the problem on the production
machine, that's fine.

Comments?

> Joseph Albert
> albert@hplabs.hp.com

--
John Warburton                  Internet: johnw@bnd2.bnd.oz.au
Systems Administrator           Phone:    +61 2 771 5566
B & D Australia                 Fax:      +61 2 771 6385
Living on ice-cream and chocolate kisses...
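
The standby scheme proposed in the post (checkpoint written to a second machine, transaction logs shipped to it at intervals, replay on takeover) could be sketched roughly as below. This is only an in-memory illustration under assumptions of my own: the class and method names (`Production`, `Standby`, `ship_logs_to`, `take_over`) are hypothetical, the database is a plain key/value dict, and a real system would move files between machines rather than call methods.

```python
from dataclasses import dataclass, field


@dataclass
class Standby:
    """Hypothetical standby machine: holds a checkpoint plus shipped logs."""
    checkpoint: dict = field(default_factory=dict)
    checkpoint_seq: int = 0                      # last transaction in the checkpoint
    shipped: list = field(default_factory=list)  # (seq, key, value) awaiting replay

    def load_checkpoint(self, snapshot, seq):
        # Written here at the same time as the tape copy.
        self.checkpoint = dict(snapshot)
        self.checkpoint_seq = seq
        # Logs already covered by the new checkpoint are no longer needed.
        self.shipped = [t for t in self.shipped if t[0] > seq]

    def receive_logs(self, batch):
        # Called once per shipping interval (every 15 minutes in the post).
        last = self.shipped[-1][0] if self.shipped else self.checkpoint_seq
        self.shipped.extend(t for t in batch if t[0] > last)

    def take_over(self):
        # The checkpoint is already in place, so only the replay remains.
        db = dict(self.checkpoint)
        for _seq, key, value in self.shipped:
            db[key] = value
        return db


@dataclass
class Production:
    """Hypothetical production machine with a sequential transaction log."""
    db: dict = field(default_factory=dict)
    log: list = field(default_factory=list)
    seq: int = 0
    shipped_upto: int = 0

    def write(self, key, value):
        self.seq += 1
        self.db[key] = value
        self.log.append((self.seq, key, value))

    def checkpoint_to(self, standby):
        standby.load_checkpoint(self.db, self.seq)
        self.shipped_upto = self.seq

    def ship_logs_to(self, standby):
        batch = [t for t in self.log if t[0] > self.shipped_upto]
        standby.receive_logs(batch)
        self.shipped_upto = self.seq
```

In this sketch, anything written on the production machine after the last call to `ship_logs_to` is not on the standby at takeover, so the shipping interval bounds how much work would have to be recovered from the production machine afterwards.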