- Newsgroups: comp.sys.isis
- Path: sparky!uunet!zaphod.mps.ohio-state.edu!saimiri.primate.wisc.edu!aplcen.apl.jhu.edu!ddsdx2.jhuapl.edu!jmp
- From: jmp@ddsdx2.jhuapl.edu (Jim Pierce)
- Subject: RE: Challenges?
- Message-ID: <1993Jan22.141440.14520@aplcen.apl.jhu.edu>
- Summary: Client groups are a helpful model
- Keywords: client groups
- Sender: news@aplcen.apl.jhu.edu (USENET News System)
- Organization: Johns Hopkins University
- Date: Fri, 22 Jan 93 14:14:40 GMT
- Lines: 75
-
-
- This is posted in response to Ken Birman's request for challenges.
-
- Let's start with the Isis client-server model. (If I screw this up,
- Ken will let me know.) The idea is that the server is a collection of
- processes which perform some service and have been replicated for
- fault tolerance and/or load balancing purposes. The number of these
- processes is normally of no concern to the client, which is as it
- should be. A process can become a client of this group and request and
- receive services and never have to be aware of membership changes in
- the server group unless they all fail. If the client fails, the server
- members will all clean up any state they may have been keeping with
- respect to that client.
-
- Now to the problem we have been discussing with Ken.
-
- Suppose our server was coordinating the assignment of some resource.
- In one of our servers, the resource is a number (from a limited set of
- numbers) which the server guarantees is unique system-wide. The
- client requests a number and returns it when it is finished with the
- number. If the client fails, the server can recover the numbers
- assigned to that client. But wait a minute. Suppose that client is
- itself a replicated application (server) for fault tolerance and load
- leveling purposes. The failure of the client process that requested
- the number from the server does not mean that the number is available
- for reuse. This is because the other members of the client's group
- want to keep on going and are managing the use of the number within
- their application. Also, we want to be sure that the surviving members
- of the client's group know about any outstanding requests made by the
- failed member but not yet satisfied by the server.
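A minimal sketch of the problem, assuming a toy in-memory allocator that records ownership per client process. The class name `NumberServer`, its methods, and the process names are hypothetical and are not part of Isis; the point is only to show how reclaiming on a single process failure can reissue a number that a surviving replica is still using.

```python
class NumberServer:
    """Toy allocator that tracks ownership per client *process* (hypothetical)."""

    def __init__(self, pool):
        self.free = sorted(pool)   # numbers not currently assigned
        self.owner = {}            # number -> id of the process that requested it

    def request(self, client_pid):
        number = self.free.pop(0)  # hand out the smallest free number
        self.owner[number] = client_pid
        return number

    def release(self, number):
        del self.owner[number]
        self.free.append(number)
        self.free.sort()

    def client_failed(self, client_pid):
        # Reclaim everything the failed process asked for.
        for number in [n for n, p in self.owner.items() if p == client_pid]:
            self.release(number)


server = NumberServer(range(1, 4))
# "replica-a" and "replica-b" are two members of one replicated client.
n = server.request("replica-a")
# replica-a crashes; replica-b is still using n inside the application.
server.client_failed("replica-a")
# An unrelated client now receives the same number.
reissued = server.request("some-other-client")
```

Here `reissued` equals `n`, so the number is no longer unique even though part of the replicated client is still alive.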
-
- Before going on, I should mention that we have been thinking almost
- entirely in terms of asynchronous xbcasts. This makes the solution
- somewhat simpler conceptually from the client side in that if client
- members are shadowing each other's transactions with the server, the
- communication would be asynchronous for the shadows anyway.
-
- Back to the main thread. Suppose we could tell Isis that all the
- members of the client's group were a single logical client of the
- server. I view this as pg_client with a group address instead of a
- process address. When a member of the client group sends a message to
- the server group we would want it to be atomically sent to the other
- members of the client group as well. Now everybody knows everything.
- The server members would not care which member of the client group
- sent the message. When the server member(s) responsible for responding
- to the client's message send their response, it would go atomically to
- all the members of the client group as well as the server group. All
- members of both groups are always in the same logical state with
- respect to the services provided by the server group.
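The delivery pattern described above can be illustrated with a small single-process simulation. This is only a sketch under strong assumptions: `Member` and `group_send` are hypothetical names, and the plain loop stands in for an atomic, ordered multicast that a real toolkit would have to provide across failures.

```python
class Member:
    """A group member that records every message delivered to it."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def deliver(self, message):
        self.log.append(message)


def group_send(message, *groups):
    # Stand-in for an atomic multicast to several groups at once.
    # A single-threaded loop trivially gives all-or-nothing, same-order
    # delivery; a real system has to work for that guarantee.
    for group in groups:
        for member in group:
            member.deliver(message)


clients = [Member("c1"), Member("c2")]   # one logical client
servers = [Member("s1"), Member("s2")]   # the replicated server

# c1 asks for a number: the request reaches the servers and c1's peers.
group_send(("request", "c1", "need a number"), servers, clients)
# s1 answers: the reply reaches every client member and the other servers.
group_send(("reply", "s1", 7), clients, servers)

# Every member of both groups has seen the same messages in the same order.
same_view = all(m.log == clients[0].log for m in clients + servers)
```

If `c1` failed after the request, `c2` would already hold the request in its log and would still receive the reply, which is the property the scheme is after.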
-
- Suppose a client member fails. A group view change would occur in the
- client group. Survivors would reallocate their work or whatever makes
- sense for their group. They would be assured that they had seen all
- server requests made by the failed member before it failed and that
- any outstanding responses would be forthcoming without the request
- being repeated. Also no responses made by the server would have been
- lost with the failed member, so numbers do not get lost. As for the
- server members, they don't get a group view change if there are
- surviving members of the client group. When the last member of a
- client group fails, then the servers could see a group view change
- showing GV_CLDEPARTED. On this event, all state related to the logical
- client could be cleaned up. In our example, all numbers assigned to
- that logical client could be recovered and reused.
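A sketch of the server-side bookkeeping this implies, assuming ownership is keyed by the logical client (the group) rather than by process. `GroupAwareNumberServer` and its methods are hypothetical. For brevity the sketch tracks client membership inside the server; in the scheme described above the toolkit would track it and notify the servers only when the last member departs.

```python
class GroupAwareNumberServer:
    """Toy allocator that assigns numbers to a client *group* (hypothetical)."""

    def __init__(self, pool):
        self.free = sorted(pool)
        self.members = {}    # group name -> set of live member ids
        self.assigned = {}   # group name -> set of numbers held by the group

    def join(self, group, member):
        self.members.setdefault(group, set()).add(member)
        self.assigned.setdefault(group, set())

    def request(self, group):
        number = self.free.pop(0)
        self.assigned[group].add(number)
        return number

    def member_failed(self, group, member):
        self.members[group].discard(member)
        if not self.members[group]:
            # Last member gone: the logical client has departed,
            # so its numbers can be recovered and reused.
            self.free.extend(self.assigned.pop(group))
            self.free.sort()
            del self.members[group]


server = GroupAwareNumberServer(range(1, 4))
server.join("app", "a")
server.join("app", "b")
n = server.request("app")

server.member_failed("app", "a")
still_held = n not in server.free    # survivor "b" keeps the number

server.member_failed("app", "b")
recovered = n in server.free         # reclaimed once the whole group is gone
```

Keying state on the group means a single member failure changes nothing on the server side, and cleanup happens exactly once per logical client.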
-
- We also want it to be fast and cheap.
-
-
-
-
- Jim Pierce phone: (301) 953-6326
- The Johns Hopkins University fax: (301) 953-6141 (unclass)
- Applied Physics Laboratory email: pierce@capsrv.jhuapl.edu
- Johns Hopkins Rd.
- Laurel, MD 20723
-
-