Ensuring the Availability of Applications and Services

Process for Planning Your Network Load Balancing Clusters

This section discusses guidelines you need to consider when you plan the Network Load Balancing clusters in your organization. When you plan your Network Load Balancing clusters, consider using the planning process represented in the flowchart in Figure 18.3.

Figure 18.3 Process for Planning Your Network Load Balancing Clusters

Determining Which Applications to Use with Network Load Balancing

Many applications work with Network Load Balancing. This section offers guidelines for determining which applications might be suitable.

In general, Network Load Balancing can scale any application or service that uses TCP/IP as its network protocol and is associated with a specific TCP or UDP port.

Network Load Balancing uses "port rules" that describe which traffic to load balance and which traffic to ignore. By default, Network Load Balancing configures all ports for load balancing. However, you can modify the configuration that determines how incoming network traffic is load balanced on a per-port basis. To modify the default behavior, you create port rules that cover specific port ranges.

Some examples of services and their associated ports are:

HTTP over TCP/IP: Web servers, such as Microsoft Internet Information Services (IIS): Port 80.
HTTPS over TCP/IP: HTTP over Secure Sockets Layer (SSL) for encrypting Web traffic: Port 443.
FTP over TCP/IP: FTP: Port 21, port 20, and ports 1024-65535.
TFTP over TCP/IP: Trivial File Transfer Protocol (TFTP) servers, which are used by applications such as Bootstrap Protocol (BOOTP): Port 69.
SMTP over TCP/IP: Simple Mail Transfer Protocol (SMTP), which is used by applications such as Microsoft Exchange: Port 25.
Microsoft Terminal Services: Port 3389.
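
As a rough illustration of how port rules partition traffic, the following Python sketch models a rule as a port range plus a protocol. The PortRule class, the two rule sets, and the is_load_balanced helper are hypothetical names invented for this example; they are not part of any Network Load Balancing interface, and the sketch simply treats traffic that matches no rule as not load balanced.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PortRule:
    """Hypothetical model of a port rule: an inclusive port range and a protocol."""
    start: int
    end: int
    protocol: str        # "TCP", "UDP", or "Both"
    load_balance: bool   # False means the traffic is deliberately ignored


def find_rule(rules: Sequence[PortRule], port: int, protocol: str) -> Optional[PortRule]:
    """Return the first rule whose range and protocol cover the given traffic."""
    for rule in rules:
        if rule.start <= port <= rule.end and rule.protocol in (protocol, "Both"):
            return rule
    return None


def is_load_balanced(rules: Sequence[PortRule], port: int, protocol: str) -> bool:
    """Assumption of this sketch: traffic with no matching rule is not load balanced."""
    rule = find_rule(rules, port, protocol)
    return rule is not None and rule.load_balance


# Default behavior described in the text: every port is load balanced.
DEFAULT_RULES = [PortRule(0, 65535, "Both", True)]

# A narrower, illustrative configuration for a Web cluster (HTTP and HTTPS only).
WEB_RULES = [
    PortRule(80, 80, "TCP", True),
    PortRule(443, 443, "TCP", True),
]
```

With DEFAULT_RULES, a Terminal Services connection on TCP port 3389 is load balanced; with WEB_RULES, only TCP ports 80 and 443 are.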

To be successfully load balanced, an application or service must be designed to allow multiple instances (multiple copies of a program) to run simultaneously, one on each cluster host. For example, an application must not make updates to a file that will in turn be synchronized with updates made by other instances unless it explicitly provides a means to do so. To avoid this problem, set up a back-end database server to handle synchronized updates to shared-state information.

In addition, Network Load Balancing is commonly used with:

VPN servers
A VPN server services an extension of a private network that encompasses links across shared or public networks such as the Internet.
Streaming media servers
Software (such as Microsoft Media Technologies) that provides multimedia support, allowing you to deliver content using Advanced Streaming Format over an intranet or the Internet.

Network Load Balancing is a good choice for VPN servers and streaming media servers after you have determined that your organization would benefit from load balancing PPTP or streaming traffic.

Note

Before load balancing an application in a Network Load Balancing cluster, review the application license or check with the application vendor. Each application vendor sets its own licensing policies for applications running on clusters.

When using Network Load Balancing with VPN servers to load-balance PPTP clients, it is important to configure the TCP/IP properties correctly to ensure compatibility with clients running earlier versions of Windows (for example, Windows 98 and Windows NT 4.0). To do this, assign only a single virtual IP address to the network adapter used by Network Load Balancing and do not assign a dedicated IP address on this subnet. This restriction does not apply for Windows 2000 clients.

For more information about configuring Network Load Balancing for VPN and other applications, see "Windows Clustering" in the Microsoft® Windows® 2000 Server Resource Kit Distributed Systems Guide.

Using Network Load Balancing to Deploy Terminal Server Clusters

Windows 2000 Terminal Services, when configured in Application Server mode, provides centralized application deployment and execution for remote users. You can use Network Load Balancing to distribute a large client base among a group of terminal servers. This is most appropriate in situations where the Terminal server application is largely stateless, such as providing a data entry application to a sales floor or warehouse.

If roaming users need to reconnect to existing Terminal server sessions, you cannot use Network Load Balancing to provide a complete roaming experience. Since Network Load Balancing routes users to cluster hosts based on the IP address, a user connecting from multiple locations, or one who uses DHCP and disconnects between sessions, is not always routed back to the same computer, and thus cannot reattach to a disconnected session. Even in such situations, Network Load Balancing can provide reconnections for dropped sessions, or persistent routing if the user's IP address is fixed. If maintaining disconnected sessions is not a requirement, then you can use Network Load Balancing effectively for any sort of terminal server application.
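
The effect of routing by client IP address can be sketched as follows. This is not the distribution algorithm that Network Load Balancing actually uses, which this section does not describe; it is a stand-in that hashes the client address to pick a host. The pick_host function, the host names, and the example addresses are all hypothetical.

```python
import hashlib
import ipaddress
from typing import Sequence


def pick_host(client_ip: str, hosts: Sequence[str]) -> str:
    """Map a client IP address to one host. Illustrative only, not the real algorithm."""
    if not hosts:
        raise ValueError("at least one host is required")
    packed = ipaddress.ip_address(client_ip).packed
    digest = hashlib.sha256(packed).digest()
    # Use the first four bytes of the digest as an integer and reduce it
    # to an index into the host list.
    index = int.from_bytes(digest[:4], "big") % len(hosts)
    return hosts[index]


# Hypothetical terminal servers in one cluster.
HOSTS = ["ts1", "ts2", "ts3", "ts4"]

# A fixed client address maps to the same host on every call, which is why
# dropped sessions can be resumed. A user whose address changes (a new DHCP
# lease or a different location) may map to a different host and so cannot
# reattach to the disconnected session.
same_host_each_time = pick_host("192.0.2.10", HOSTS) == pick_host("192.0.2.10", HOSTS)
```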

When you use Network Load Balancing, it is recommended that you configure all terminal servers to end disconnected sessions after a moderate time-out, such as 30 minutes. This configuration allows dropped sessions to be resumed, but does not allow disconnected sessions to persist for long periods. Persistent sessions can be a problem when the user is routed to a different computer, because the user is not reconnected to the original session. Sessions left open on multiple computers consume resources for each user or, in the worst case, lock users out because their resources are in use elsewhere.

When deploying a Terminal Services cluster using Network Load Balancing, each server needs to be able to serve all users. To facilitate this, you must store per-user information, system information, and common data in an accessible place, such as a back-end file server. Figure 18.4 represents an implementation of Network Load Balancing and Terminal Services.

Figure 18.4 Network Load Balancing Provides Balancing Among Terminal Servers

Note that separate servers are depicted for line-of-business database applications and for per-user data storage. Each of these servers needs to be implemented as a high-availability server using clustering or other appropriate technologies. Additionally, this implementation improves scalability by partitioning the workload so that multiple terminal servers can support the desired level of performance.

For more information about Terminal Services, see "Deploying Terminal Services" in this book.

Configuring Network Load Balancing Clusters for Servers Running IIS/ASP and COM+ Applications

Servers running COM+ applications are a key component of e-commerce sites. In the example shown in Figure 18.5, a server running COM+ handles object requests for the shopping basket of an online bookstore.

Figure 18.5 Deploying COM+ Applications on the Same Physical Servers as IIS

In order to ensure that these objects are available when you need them, and to maximize the performance of the site as a whole, it is recommended that you deploy COM+ on the same physical servers as IIS. By doing this, the application servers can take advantage of the scalability and availability gains afforded by the existing Network Load Balancing cluster without any need for an additional separate tier of dedicated servers running COM+.

Using a single physical Network Load Balancing cluster of servers that has been configured for both IIS/ASP and COM+, as opposed to creating a separate physical tier of application servers, reduces hardware and management costs because fewer servers are needed.

Identifying Network Risks

When you identify network risks, you identify the possible failures that can interrupt access to network resources. Single points of failure can include hardware, software, or external dependencies, such as power supplied by a utility company or dedicated wide area network (WAN) lines.

In general, you provide maximum availability when you:

Minimize the number of single points of failure in your environment.
Provide mechanisms that maintain service when a failure occurs.

In the case of Network Load Balancing, you also provide maximum availability when you:

Load balance only the applications that are appropriate to Network Load Balancing.
Make sure that application servers are properly configured for the applications they are running. For more information about proper configuration, see "Determining Server Capacity Requirements" later in this chapter.

A principal goal of Network Load Balancing is to provide increased availability. A cluster of two or more computers ensures that if one computer fails, another computer is available to continue processing client requests. However, Network Load Balancing is not designed to protect all aspects of your workflow in all circumstances. For example, Network Load Balancing is not an alternative to backing up data. Network Load Balancing only protects access to the data, not the data itself. Also, it does not protect against a power outage that would disable the entire cluster.

Windows 2000 Advanced Server has built-in features that protect certain computer and network processes during failure. These features include redundant array of independent disks (RAID) 1 (disk mirroring) and RAID 5 (disk striping with parity). When planning your Network Load Balancing environment, look for areas where these features can help you in ways that Network Load Balancing cannot.

Planning for Network Load Balancing

This section will help you determine the number of Network Load Balancing servers you require in your organization and how you need to configure them.

Cluster size is the number of cluster hosts participating in the cluster (up to 32 for Windows Clustering). It is based on the number of computers required to meet the anticipated client load for a given application.

For example, if you determine that you need six computers running IIS in order to meet the anticipated client demand for Web services, then Network Load Balancing will run on all six computers and your cluster will consist of six cluster hosts.

As a general rule, add servers until the cluster can easily handle the client load without becoming overloaded. The maximum cluster size you need is determined by network capacity on a given subnet. The exact number depends on the nature of the application.

Note

Always be sure that there is enough extra server capacity so that if one server fails, the remaining servers can accommodate the increased load.
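
The sizing rule in this note can be expressed as a short calculation. The sketch below assumes that load is spread evenly and that each host has a known capacity in the same unit as the peak load; the hosts_needed function and the numbers in the usage note are hypothetical, and the limit of 32 hosts is the one stated earlier in this section.

```python
import math

MAX_CLUSTER_HOSTS = 32  # upper limit on cluster size given in this section


def hosts_needed(peak_load: float, per_host_capacity: float,
                 tolerated_failures: int = 1) -> int:
    """Hosts required to carry peak_load even after tolerated_failures hosts fail.

    Assumes load is distributed evenly and all hosts have the same capacity.
    """
    if per_host_capacity <= 0:
        raise ValueError("per_host_capacity must be positive")
    if tolerated_failures < 0:
        raise ValueError("tolerated_failures cannot be negative")
    # Hosts needed to carry the load with no failures, rounded up.
    base = math.ceil(peak_load / per_host_capacity)
    total = base + tolerated_failures
    if total > MAX_CLUSTER_HOSTS:
        raise ValueError(
            f"{total} hosts exceeds one cluster; consider an additional cluster"
        )
    return total
```

For example, with a hypothetical peak of 2,500 requests per second and hosts that each handle 500, hosts_needed(2500, 500) returns 6: five hosts to carry the load and one spare.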

When the cluster subnet approaches saturation of the network, add an additional cluster on a different subnet. Use round robin DNS to direct clients to the clusters. You can continue to add clusters in this manner as the network demand grows. Since round robin DNS contains only cluster IP addresses, clients are always directed to clusters instead of to individual servers, and therefore never experience an outage due to a failed server. In some deployments requiring high bandwidth, you could use round robin DNS to split incoming traffic among multiple, identical Network Load Balancing clusters. In Figure 18.6, the client's request for www.reskit.com goes to DNS, which resolves the name to the virtual IP address of Network Load Balancing Cluster 1 (10.0.0.1), and the request is passed to that cluster. Subsequent requests are then sent to Cluster 2 (10.0.0.2) and Cluster 3 (10.0.0.3), and then continue in a round robin fashion.

Figure 18.6 Round Robin DNS Among Identical Network Load Balancing Clusters
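
The rotation shown in Figure 18.6 can be modeled with a toy resolver. This is a simplified sketch: the RoundRobinResolver class is hypothetical, it ignores client-side and intermediate DNS caching, and it returns a single address per query rather than a rotated list of records.

```python
from itertools import cycle
from typing import Dict, Sequence


class RoundRobinResolver:
    """Toy model of round robin DNS: each query returns the next cluster address."""

    def __init__(self, records: Dict[str, Sequence[str]]) -> None:
        for name, addresses in records.items():
            if not addresses:
                raise ValueError(f"no addresses configured for {name}")
        # One endlessly repeating iterator per host name.
        self._cycles = {name: cycle(addrs) for name, addrs in records.items()}

    def resolve(self, name: str) -> str:
        """Return the next address for name, wrapping around at the end."""
        return next(self._cycles[name])


# The name and the cluster virtual IP addresses come from Figure 18.6.
CLUSTER_RECORDS = {"www.reskit.com": ["10.0.0.1", "10.0.0.2", "10.0.0.3"]}
```

Four successive calls to resolve("www.reskit.com") on a new RoundRobinResolver(CLUSTER_RECORDS) yield 10.0.0.1, 10.0.0.2, 10.0.0.3, and then 10.0.0.1 again.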

Note

If you use network switches and you deploy two or more clusters, consider placing the clusters on individual switches so that incoming cluster traffic is handled separately. A switch is used to connect cluster hosts to a router or other source of incoming network connections.

It is important to note that a switch can be used to separate incoming traffic in cases where you have more than one cluster.

Determining Server Capacity Requirements

After you determine your cluster size, you are ready to configure individual cluster hosts. In general, you base the host configuration on the types of applications you plan to load balance and the client demand you anticipate on these applications. Some server applications, such as file and print servers, are extremely disk-intensive and require very large disk capacities and fast input/output (I/O). Be sure you consult the documentation for each application you plan to run, in order to determine how to configure the servers in your cluster.

While it might be possible to substitute two or three very powerful servers for a larger number of less powerful computers, it generally is more desirable to deploy the larger number. Using more servers allows the client load to be more widely distributed, so that if one server fails, the incremental impact on clients is reduced.
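
The trade-off can be made concrete with a small calculation. Assuming the load is spread evenly and is fully redistributed when one host fails, each surviving host takes on an extra 1/(N - 1) of its previous load, where N is the original number of hosts. The function name below is hypothetical.

```python
def load_increase_per_survivor(host_count: int) -> float:
    """Fractional load increase on each remaining host after one host fails.

    Assumes an even distribution before and after the failure.
    """
    if host_count < 2:
        raise ValueError("a cluster needs at least two hosts to survive a failure")
    return 1 / (host_count - 1)


# Three powerful servers: each survivor absorbs 50 percent more load.
three_hosts = load_increase_per_survivor(3)

# Six smaller servers: each survivor absorbs 20 percent more load.
six_hosts = load_increase_per_survivor(6)
```

Under these assumptions, the larger cluster needs less headroom on each host to ride out a single failure.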

Optimizing Network Load Balancing Clusters

There are some hardware and configuration choices you can make to improve cluster performance for Network Load Balancing. These choices are explained in the following sections.

If the cluster hosts are directly connected to a switch in order to receive client requests, incoming client traffic is automatically sent to all switch ports. In most applications, incoming client traffic is a small portion of total cluster traffic. However, if other clusters or computers are connected to the same switch, this cluster traffic consumes some of their port bandwidth.

To avoid this problem, you can connect all cluster hosts to a hub or repeater that is uplinked to a single switch port. In this case, all incoming client traffic flows from the hub or repeater to the single switch port for simultaneous delivery to all cluster hosts. If client traffic arrives at the switch from multiple upstream switch ports, you might want to add a second dedicated network adapter for each host that is connected to an individual switch port. The use of two network adapters per host on the cluster subnet helps to direct network traffic through the cluster hosts. Incoming client traffic flows through the switching hub to all hosts, while outgoing traffic flows directly to the switch ports.

Requirements for Network Load Balancing

An application can run on a Network Load Balancing cluster under the following conditions:

The connection with clients must be configured to use IP.
The application to be load balanced must use TCP or UDP ports.
Multiple identical instances of an application must be able to run simultaneously on separate servers. If multiple instances of an application share data, there has to be a way to synchronize the updates.

Network Load Balancing is designed to work as a standard networking device driver under Windows 2000 Advanced Server. Because Network Load Balancing provides clustering support for TCP/IP-based server programs, TCP/IP must be installed in order to take advantage of Network Load Balancing functionality. The current version of Network Load Balancing operates on Fiber Distributed Data Interface (FDDI) or Ethernet-based local area networks within the cluster. It has been successfully tested on 10 megabits per second (Mbps), 100 Mbps and gigabit Ethernet networks with a wide variety of network adapters.

Network Load Balancing takes up less than 1 megabyte (MB) of storage space and, depending on network load, uses between 250 kilobytes (KB) and 4 MB of RAM when operating within the default parameters. You can modify the default parameters to allow use of up to 15 MB of memory. Typical memory usage ranges between 500 KB and 1 MB.

For optimum cluster performance, plan to install a second network adapter on each Network Load Balancing host to handle network traffic addressed to the server as an individual computer on the network. In this configuration, the first network adapter for which Network Load Balancing is enabled handles the client-to-cluster network traffic addressed to the server as part of a cluster. Although a second network adapter is not required, it improves overall networking performance, for instance to access a back-end database. When Network Load Balancing is enabled in its default unicast mode, the second network adapter is required for communications between servers within a cluster, for example, to replicate files between servers.

For more information about system requirements and cluster and host parameters, see "Windows Clustering" in the Distributed Systems Guide.

Using a Router

Network Load Balancing can operate in two modes: unicast and multicast. Unicast support is enabled by default, which ensures that it operates properly with all routers. You might elect to enable multicast mode so that a second network adapter is not required for communications within the cluster. If Network Load Balancing clients access a cluster (configured for multicast mode) through a router, be sure that the router accepts an Address Resolution Protocol (ARP) reply for the cluster's (unicast) IP addresses with a multicast media access control address in the payload of the ARP structure. ARP is a TCP/IP protocol that uses limited broadcast to the local network to resolve a logically assigned IP address.

This allows the router to map the cluster's primary IP address and other multihomed addresses to the corresponding media access control address. If your router does not meet this requirement, you can create a static ARP entry in the router or you can use Network Load Balancing in its default unicast mode.

Some routers require a static ARP entry because they do not support the resolution of unicast IP addresses to multicast media access control addresses.
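
The distinction that such routers enforce can be checked mechanically. On Ethernet, a media access control address is a multicast (group) address when the least significant bit of its first octet is set. The sketch below applies that test; the helper names and the two example addresses are hypothetical and are not taken from any particular cluster configuration.

```python
def parse_mac(mac: str) -> bytes:
    """Parse a MAC address written with '-' or ':' separators into six bytes."""
    parts = mac.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError(f"expected six octets, got {len(parts)}")
    return bytes(int(part, 16) for part in parts)


def is_multicast_mac(mac: str) -> bool:
    """True when the group bit (lowest bit of the first octet) is set."""
    return bool(parse_mac(mac)[0] & 0x01)


# Hypothetical addresses for illustration only.
EXAMPLE_MULTICAST_MAC = "03-BF-0A-00-00-01"  # first octet 0x03, group bit set
EXAMPLE_UNICAST_MAC = "02-BF-0A-00-00-01"    # first octet 0x02, group bit clear
```

A router that rejects ARP replies pairing a unicast IP address with an address for which is_multicast_mac returns True is the kind of device that needs the static ARP entry described above.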