Path: sparky!uunet!pmafire!news.dell.com!swrinde!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!linac!att!pacbell.com!decwrl!infopiz!lupine!motcsd!udc!mcdphx!ennews!anasaz!qip!briand
From: briand@anasaz (Brian Douglass)
Newsgroups: comp.databases
Subject: Re: Hot Standby DBMS's
Message-ID: <1992Aug25.063147.2764@anasaz>
Date: 25 Aug 92 06:31:47 GMT
References: <BtDAt7.L5J@world.std.com> <BtI4Bv.D8r@cup.hp.com> <BtI9KD.MAv@world.std.com>
Organization: Anasazi, Inc. Phoenix, Az
Lines: 61

In article <BtI9KD.MAv@world.std.com> edwards@world.std.com (Jonathan Edwards) writes:
>Since there has been quite a lot of traffic generated by my original
>post, I thought I should contribute how I have implemented the
>Remote Hot Standby feature in our proprietary non-relational database.
>There is intense market pressure for us to deliver relational-based
>solutions, thus my post to see if any such could do the job, so far
>fruitlessly.
>
>My database has a global journal stream which batches together the transaction
>data system-wide. A commit is typically a single disk write, and on a
>reasonably loaded system it averages below one write.
>There is never any need to flush data to its home location on disk -
>it can remain cached forever (but that's another story...).
>A normal (all disks intact) recovery takes a few minutes max.
>
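[Just to make the group-commit idea above concrete, here is a rough C
sketch of how a batched journal write like that might work. The names,
sizes, and structure are my own invention for illustration only, not
Mr. Edwards' actual code.]

    /* Batched ("group commit") journal writes: many transactions share one
     * write()+fsync(), so a commit averages at or below one disk I/O.
     * Sketch only -- error handling and concurrency control are omitted. */
    #include <string.h>
    #include <unistd.h>

    #define JBUF_SIZE 65536

    static int    jfd;               /* journal fd, opened elsewhere (O_WRONLY|O_APPEND) */
    static char   jbuf[JBUF_SIZE];   /* in-memory journal batch */
    static size_t jlen;              /* bytes currently buffered */

    /* Flush the whole batch with a single write; fsync() is the commit point. */
    static int journal_flush(void)
    {
        if (jlen == 0)
            return 0;
        if (write(jfd, jbuf, jlen) != (ssize_t)jlen || fsync(jfd) != 0)
            return -1;
        jlen = 0;
        return 0;
    }

    /* Append one transaction's journal records to the current batch.
     * Assumes len <= JBUF_SIZE. */
    static int journal_append(const void *rec, size_t len)
    {
        if (jlen + len > JBUF_SIZE && journal_flush() != 0)
            return -1;
        memcpy(jbuf + jlen, rec, len);
        jlen += len;
        return 0;
    }

The point is that any number of concurrent commits can ride on the same
fsync(), which is how the per-commit disk write count drops below one.
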
>Remote Hot Standby operates by simply keeping a copy of the journal file
>on a remote system. We typically use an Ethernet bridge over T1 lines.
>We usually operate synchronously - a transaction is not complete till
>both the local and remote journal writes are complete.

This is a different implementation of what I've been talking about:
duplicating the transaction stream (in Mr. Edwards' case, the journal
stream) so that it is processed identically by both the local and remote
machines.
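
[A crude sketch of the local side of such a synchronous scheme, using an
ordinary TCP socket rather than Mr. Edwards' raw-Ethernet protocol; the
names and framing are invented for illustration only.]

    /* Synchronous remote journal write: the transaction is not considered
     * committed until the standby confirms it has the batch.  Sketch only --
     * real systems need timeouts, sequence numbers, and failover logic. */
    #include <stdint.h>
    #include <unistd.h>

    extern int standby_sock;   /* connected socket to the standby (assumed) */

    static int write_all(int fd, const void *buf, size_t len)
    {
        const char *p = buf;
        while (len > 0) {
            ssize_t n = write(fd, p, len);
            if (n <= 0)
                return -1;
            p += n;
            len -= (size_t)n;
        }
        return 0;
    }

    /* Ship one journal batch and block until the standby acknowledges it. */
    int remote_journal_write(const void *batch, uint32_t len)
    {
        char ack;

        if (write_all(standby_sock, &len, sizeof len) != 0 ||  /* length prefix, host */
            write_all(standby_sock, batch, len) != 0)          /* byte order for brevity */
            return -1;
        if (read(standby_sock, &ack, 1) != 1 || ack != 'A')    /* wait for the ack */
            return -1;
        return 0;   /* only now does the local commit complete */
    }

The round trip for that ack over the T1 is presumably the latency
Mr. Edwards talks about next.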

>Obviously, latency is the critical performance factor here. Though because of
>journal batching, this hurts response time more than throughput.
>As a result, the communication protocol is absolutely minimal - I talk
>directly to the Ethernet driver and run my own protocol. I have learnt that
>most Ethernet interface boards have pretty pathetic performance.
>
>As journal data is received and written at the remote system, it is
>asynchronously being recovered there, just like a normal journal (roll-forward)
>recovery. Thus the remote system can recover and take over from the
>production system in a few minutes.
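
[And one way the standby side could look, if the journal really is
replayed as it arrives. Names are invented; apply_batch() stands in for
whatever roll-forward logic the real product uses.]

    /* Standby receive loop: persist each incoming journal batch, ack it,
     * then immediately roll it forward locally.  Illustrative only. */
    #include <stdint.h>
    #include <stdlib.h>
    #include <unistd.h>

    extern int  primary_sock;   /* connection from the primary (assumed)    */
    extern int  standby_jfd;    /* the standby's copy of the journal file   */
    extern void apply_batch(const void *rec, size_t len);  /* roll-forward  */

    static int read_all(int fd, void *buf, size_t len)
    {
        char *p = buf;
        while (len > 0) {
            ssize_t n = read(fd, p, len);
            if (n <= 0)
                return -1;
            p += n;
            len -= (size_t)n;
        }
        return 0;
    }

    void standby_loop(void)
    {
        uint32_t len;
        char ack = 'A';

        while (read_all(primary_sock, &len, sizeof len) == 0) {
            void *batch = malloc(len);
            if (batch == NULL || read_all(primary_sock, batch, len) != 0) {
                free(batch);
                break;
            }
            write(standby_jfd, batch, len);   /* keep the journal copy   */
            fsync(standby_jfd);               /* durable before we ack   */
            write(primary_sock, &ack, 1);     /* lets the primary commit */
            apply_batch(batch, len);          /* continuous roll-forward */
            free(batch);
        }
    }

If the roll-forward really does happen as the data arrives, takeover only
has to finish whatever is in flight, which would explain the few-minutes
figure.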

I think one difference is that while I suggest transactions be processed
as they are received on the remote system, Mr. Edwards' system seems to
roll forward only upon primary failure, or I suppose nightly. If I am
wrong in this, please clarify.

If the remote does do a "catch-up" each evening, how would you handle a
24x7 operation with your product?

[good stuff deleted]

>By the way, these systems handle Billions of dollars a day, which is what
>justifies so much custom engineering for reliability. Failing to get the
>day's work done on time can cost 100's of thousands of dollars in interest
>penalties alone. Losing data is unthinkable.
>
>Hope this was of interest - Jonathan

What type/size systems is this operation implemented on, if you can say?
Large Unix OLTP operations are still a rarity compared to traditional
mainframe solutions, and I'd love to hear what others are doing.

--
"... For I have sworn upon the altar of god, eternal hostility against
every form of tyranny over the mind of man." Thomas Jefferson
Brian Douglass briand@anasazi.com 602-870-3330 X657
