About Fault Tolerance, Load Balancing, Replication, and Clustering

For sites with multiple servers, good performance depends on proper fault tolerance and load balancing. The following discussion of fault tolerance, load balancing, replication, and clustering is an overview only. For more detailed information, consult the documentation for your clustering application.

The following diagram shows a typical clustering configuration. Each server node has a local disk, which may contain information specific to that node. Both nodes share resources on a hard disk, although only one node at a time can access resources on the disk. The nodes are connected by a local, private interconnection. This configuration supports all of the replication and clustering features available in IIS. The shared resources make load balancing possible, and the private interconnection facilitates failover and failback.

Fault tolerance

Fault tolerance is the ability of a site to continue to provide service even if server nodes fail. It means that if one server node stops working, another server node can immediately pick up the request load with minimal disruption to users. The process in which the workload of one server node is automatically transferred to another node is called failover. The process in which the load is restored to the failed server node after it is back online is called failback. Clustering applications perform failover and failback automatically, so the user is typically unaware that either has taken place.
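
The failover and failback sequence described above can be sketched in a few lines of Python. This is a toy model only, not the behavior or API of any particular clustering application: the class name TwoNodeCluster, the node names, and the site names are all hypothetical, and a real cluster detects failures through its private interconnection rather than through explicit method calls.

```python
class TwoNodeCluster:
    """Toy model of failover and failback (hypothetical, not a real API)."""

    def __init__(self, nodes, preferred_owner):
        # preferred_owner maps each workload (for example, a Web site)
        # to the node that normally serves it.
        self.preferred_owner = dict(preferred_owner)
        self.current_owner = dict(preferred_owner)
        self.online = set(nodes)

    def node_failed(self, node):
        """Failover: move the failed node's workloads to a surviving node."""
        self.online.discard(node)
        if not self.online:
            raise RuntimeError("no surviving node to fail over to")
        survivor = sorted(self.online)[0]
        for workload, owner in self.current_owner.items():
            if owner == node:
                self.current_owner[workload] = survivor

    def node_restored(self, node):
        """Failback: return workloads to their home node once it is online."""
        self.online.add(node)
        for workload, home in self.preferred_owner.items():
            if home == node:
                self.current_owner[workload] = node


cluster = TwoNodeCluster(
    nodes=["node1", "node2"],
    preferred_owner={"site-a": "node1", "site-b": "node2"},
)

cluster.node_failed("node1")       # node2 now serves both sites
after_failover = dict(cluster.current_owner)

cluster.node_restored("node1")     # site-a moves back to node1
after_failback = dict(cluster.current_owner)
```

In this sketch, the mapping of sites to nodes changes on failure and changes back on recovery, while the site names that users request stay the same, which is why users typically do not notice either event.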

Load balancing

Load balancing means that two servers can support larger amounts of activity by distributing the request load. Rather than one server node being overloaded while the other is underused, the two nodes can share the load more evenly. Load balancing can be achieved by manually assigning each Web site to a specific server node. Either node can serve any Web site, but a given Web site cannot be served from both nodes at the same time.
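
As an illustration of manual assignment, the following Python sketch keeps a table that maps each Web site to exactly one node and then counts how a sample of requests would be divided. The site names, node names, and request counts are invented for the example, and the table is not an IIS or clustering configuration format.

```python
from collections import Counter

# Hypothetical manual assignment: each Web site is served by exactly one
# node at a time, so a site appears in this table only once.
SITE_TO_NODE = {
    "www.example.com": "node1",
    "shop.example.com": "node2",
    "intranet.example.com": "node1",
}


def node_for(site):
    """Return the node that currently serves the given Web site."""
    return SITE_TO_NODE[site]


def load_per_node(requested_sites):
    """Count how many of the requests each node would handle."""
    return Counter(node_for(site) for site in requested_sites)


# Invented sample traffic: one list entry per request.
sample_requests = (
    ["www.example.com"] * 3
    + ["shop.example.com"] * 4
    + ["intranet.example.com"] * 1
)

load = load_per_node(sample_requests)
```

With this sample traffic, each node handles four requests. If one site grew much busier than the others, an administrator would rebalance by editing the table, because a single site cannot be split across both nodes.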

Note: Typically, clusters are configured for both load balancing and fault tolerance.

Replication

Replication is the copying of content, configuration settings, or both from one server to another so that both servers can offer the same resources to users. For clusters that share a data storage device such as a disk drive, replication of content is not necessary. Replication of configuration settings is, however, necessary for all clusters, regardless of whether they share content. Many clustering applications support replication of both content and configuration settings. For more details, see your clustering application documentation. IIS comes with its own utility to replicate configuration settings from one machine to any number of other machines. For details on using this utility, see Replication and Clustering in IIS.
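
The idea of replicating configuration settings from one machine to any number of others can be sketched as follows. This is a conceptual illustration only and does not represent the IIS replication utility or its interface: the function name replicate_settings and the setting names are hypothetical, and plain Python dictionaries stand in for each machine's configuration store.

```python
import copy


def replicate_settings(source, targets):
    """Overwrite each target's settings with an independent copy of the source's.

    Settings that exist only on a target are removed, so every target ends
    up with exactly the same configuration as the source.
    """
    for target in targets:
        target.clear()
        target.update(copy.deepcopy(source))


# Hypothetical settings on the machine being replicated from.
source_config = {
    "default_document": "index.htm",
    "connection_timeout": 900,
    "mime_map": {".txt": "text/plain"},
}

# Two other machines whose settings have drifted or were never set.
node2_config = {"default_document": "default.htm", "stale_setting": True}
node3_config = {}

replicate_settings(source_config, [node2_config, node3_config])
```

The deep copy matters in this sketch: each target receives its own copy of nested settings, so a later change on one machine does not silently alter another until replication is run again.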

Clustering

Clustering allows two servers to appear as one to users. The computers are connected not only physically by cables, but also programmatically, through clustering software. This connection allows them to use features such as failover and load balancing that are not possible with stand-alone server nodes. As a result, users may not even be aware that a problem has occurred. Clustered servers can also share disk drives containing important information such as a database.