Achieving scalable Web servers is not a trivial task. There are various solutions to pick from, setup and configuration tasks to understand and perform, and many delicate dependencies between related but heterogeneous technologies. This section describes some of the major issues affecting successful scalability implementations.
Note: Although your immediate role and responsibilities as an administrator may not encompass all of the areas and issues described in this section, you can share the information with relevant members of your organization to help ensure successful development efforts.
Application architects must create designs that are inherently flexible by relying upon open standards that don't restrict the application's construction and implementation to vendor-specific interfaces and tools. Similarly, the Web developers who construct the designed application must be aware that they can significantly affect the application's scalability through the way they write their code, build their SQL queries, invoke thread management, access databases, and partition the application.
As you create Web applications, you will likely create specific variables that you intend to carry across multiple interactions between a user's browser and a site's Web server(s). Two popular approaches for accomplishing this are client variables, which are stored in a shared state repository, and session variables, which are stored in the memory of a specific server. The latter approach, however, introduces a significant challenge for a Web site that is supported by multiple servers. Once a user has begun a session and variables are stored on a specific server, the user must return to that server for the life of the session to maintain correct state information.
A good example that illustrates this concept is an e-commerce application that uses shopping carts. With this type of application, as a customer accumulates items in his or her cart, there must be a mechanism that ensures that the user can see the items as they are added. One approach is to store these items in session variables on a specific Web server. However, if you use this approach, there must also be a way to ensure that the user always returns to the same server for the life of the session. ClusterCATS for ColdFusion automatically handles this for you.
Another approach to solving the same problem is to store client variables in a back-end common state repository. This approach enables all Web servers comprising the cluster to access variables in a common, shared back-end data store, such as a database. However, you must be aware that this approach can potentially impact your site's performance.
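The difference between the two approaches can be sketched in a few lines. The following is a minimal illustration in Python rather than CFML, and every name in it (`WebServer`, `shop`, the `abc123` session identifier) is hypothetical. It assumes requests are handed to servers in simple rotation with no session-aware routing, and it uses an in-memory dictionary as a stand-in for the shared back-end data store.

```python
from itertools import cycle


class WebServer:
    """A hypothetical cluster member that can keep session data locally."""

    def __init__(self, name, shared_store=None):
        self.name = name
        self.local_sessions = {}          # per-server memory (session variables)
        self.shared_store = shared_store  # common back-end repository, if any

    def _cart(self, session_id):
        # Use the shared repository when one is configured, else local memory.
        if self.shared_store is not None:
            store = self.shared_store
        else:
            store = self.local_sessions
        return store.setdefault(session_id, [])

    def add_item(self, session_id, item):
        self._cart(session_id).append(item)

    def view_cart(self, session_id):
        return list(self._cart(session_id))


def shop(servers, session_id, items):
    """Send each request to the next server in turn, then view the cart."""
    rotation = cycle(servers)
    for item in items:
        next(rotation).add_item(session_id, item)
    return next(rotation).view_cart(session_id)


# Session variables in server memory: the cart is split across servers.
local_cluster = [WebServer("www1"), WebServer("www2")]
local_view = shop(local_cluster, "abc123", ["book", "lamp", "pen"])

# Client variables in a shared repository: every server sees the whole cart.
repository = {}
shared_cluster = [WebServer("www1", repository), WebServer("www2", repository)]
shared_view = shop(shared_cluster, "abc123", ["book", "lamp", "pen"])

print("local memory:", local_view)
print("shared store:", shared_view)
```

With local memory, the user sees only the items that happened to land on the server answering the final request. With the shared store the cart is complete, at the cost of a repository lookup on every request, which is the performance impact noted above.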
Web developers must think through the various user scenarios in which application session and state are affected and engineer appropriate mechanisms for elegantly handling such situations. The three most common ways to handle session data are:
Note: Storing session data on the server requires that a simple identifier be stored on the client, such as a cookie.
Whatever mechanism your architects and engineers use, it's important that they anticipate the scenarios in which maintaining an application's state is vital to a good user experience. See "Configuring session-aware load balancing".
Another major issue that Web developers must consider when constructing their applications is whether the application will be single-threaded or multi-threaded. Threading refers to how the application responds to multiple user requests for application services. A single-threaded application can handle only one user request for services at a time. If there are multiple requests for the same service, they are put into a queue, and the application responds to each individually in order. Single-threading is not a scalable approach for application services that you anticipate will be used by many users simultaneously.
A multi-threaded application can handle multiple requests for the same application service simultaneously by instantiating separate connections (threads) for each user request. Applications that anticipate lots of concurrent multi-user activity should ensure that the application functions are multi-threaded.
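To make the contrast concrete, the following Python sketch times the same batch of requests with one worker thread and then with several. It is an illustration only: `handle_request` is a hypothetical stand-in that simply waits 50 ms, as a service blocked on a database or network call might, and the thread pool is not meant to represent how ColdFusion schedules requests.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id):
    """Stand-in for an application service that waits on I/O for 50 ms."""
    time.sleep(0.05)
    return request_id


def serve(requests, workers):
    """Serve all requests with the given number of worker threads and time it."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(handle_request, requests))
    return results, time.perf_counter() - start


requests = list(range(8))
_, single_elapsed = serve(requests, workers=1)  # requests queue up behind each other
_, multi_elapsed = serve(requests, workers=8)   # requests are handled concurrently

print(f"single-threaded: {single_elapsed:.2f}s")
print(f"multi-threaded:  {multi_elapsed:.2f}s")
```

With one worker, each request waits for the one before it, so total time grows with the number of requests. With eight workers, the waits overlap and the batch finishes in roughly the time of a single request.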
ColdFusion by default provides multi-threaded application services. However, if your application contains particular functions that require single-user interaction, such as writing to shared application variables, you can easily restrict the function to be single-threaded by using the CFML tag <cflock>.
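The idea of restricting one section of code to a single thread at a time can be shown with an analogous construct in another language. The sketch below uses Python's `threading.Lock`; it is not CFML and does not show the attributes of the CFML locking tag. The names `hit_count`, `record_hits`, and `run` are hypothetical.

```python
import threading

hit_count = 0                      # shared, application-wide variable
hit_count_lock = threading.Lock()  # guards every update to hit_count


def record_hits(n):
    """Increment the shared counter n times, one thread at a time."""
    global hit_count
    for _ in range(n):
        with hit_count_lock:       # only one thread may run this block at once
            hit_count += 1


def run(threads=8, hits_per_thread=10_000):
    """Start several threads that all update the shared counter."""
    global hit_count
    hit_count = 0
    workers = [
        threading.Thread(target=record_hits, args=(hits_per_thread,))
        for _ in range(threads)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return hit_count


print(run())
```

Only the update to the shared variable is serialized. The rest of each request still runs concurrently, which is why it pays to keep the locked section as small as possible.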
Dynamic Web applications, those that allow users to modify a database, must ensure appropriate database concurrency handling. Database concurrency handling refers to how an application manages multiple concurrent user requests when accessing the same database records. If an application does not impose any database locking mechanism on multiple requests to update the same record, data integrity can be compromised in the database. In such a scenario, two users could make simultaneous modifications to a record, but only the last change would take effect.
For example, consider a Human Resources Web application on a company intranet. The HR Generalist adds two new employee records to the HR database by filling out a Web form because two new employees have just been hired. The Generalist enters most of the vital information into the records but doesn't yet have the new employees' phone extensions or HMO selections, and therefore leaves those fields blank. Later in the day, the HR Generalist's boss, the HR Director, obtains this information from both new hires and decides to enter it in the database herself. However, one of the new employees, after speaking with her husband, decides to change her HMO selection from the basic selection to the PPO choice, which allows greater flexibility in choosing physicians. The employee calls the HR Generalist to tell him of the change, and the Generalist says he will take care of it immediately. Unbeknownst to the HR Director, the HR Generalist adds the information into the employee records at the same time that the HR Director is attempting to add the outdated information.
In this scenario, if the application uses an appropriate database concurrency validation mechanism, such as a SQL WHERE clause, then the HR Director would receive a message informing her that she could not access the employee record because it was in use, thereby alerting her that the HR Generalist is trying to change the record. However, if the application did not use such a validation mechanism, the HR Director would overwrite the new data that the Generalist had just entered, resulting in data integrity problems. This simple example illustrates how important it is that your dynamic Web applications handle database concurrency issues well.
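One way a WHERE clause can act as a concurrency check is sketched below using Python's built-in `sqlite3` module. This is an assumption about how such a check might be built, not a description of any particular application: the `employee` table, its `version` column, and the `save_hmo` helper are all hypothetical. Each update succeeds only if the row still carries the version number the user saw when the form was loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, hmo TEXT, version INTEGER)"
)
conn.execute("INSERT INTO employee VALUES (1, NULL, 1)")


def read_employee(emp_id):
    """Return (hmo, version) as the user saw it when the form was loaded."""
    return conn.execute(
        "SELECT hmo, version FROM employee WHERE id = ?", (emp_id,)
    ).fetchone()


def save_hmo(emp_id, new_hmo, version_seen):
    """Update only if nobody changed the row since version_seen was read."""
    cur = conn.execute(
        "UPDATE employee SET hmo = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_hmo, emp_id, version_seen),
    )
    conn.commit()
    return cur.rowcount == 1  # False means the record changed underneath us


# Both users load the record before either one saves.
_, generalist_version = read_employee(1)
_, director_version = read_employee(1)

generalist_saved = save_hmo(1, "PPO", generalist_version)  # row matched, saved
director_saved = save_hmo(1, "Basic", director_version)    # stale version, no row matched

print(generalist_saved, director_saved, read_employee(1))
```

When `save_hmo` returns `False`, the application can tell the HR Director that the record has changed and ask her to reload it, rather than silently overwriting the Generalist's entry.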
The way an application has been partitioned and deployed dramatically affects its ability to scale. Therefore, a key development objective must be to ensure that each partition scales independently of the others, thereby eliminating application bottlenecks.
Application partitioning refers to the logical and physical deployment of an application's three core types of logic, or services -- presentation, business, and data access. If you are familiar with the concept of tiered client/server application development, you already understand the rationale for developing applications in this way. However, if you're not, we'll run through a short review to shed some light on this methodology's benefits.
An application, regardless of whether it is a Web application or a more traditional client/server application, has three main categories of logic, or services.
The way in which architects and Web developers decide to partition and deploy these core application services significantly affects the application's ability to scale. Although your development efforts may no longer be burdened with developing, distributing, customizing, and updating proprietary client software for your applications, the ubiquitous graphical user interface (GUI) -- the Web browser -- presents new interface issues and challenges. For example, you must ensure that your applications' presentation remains performance-friendly. It should minimize the number and size of graphic elements that must be downloaded to the client. Also, because not all browsers are yet able to display all emerging technologies cleanly, such as Java applets and frames, you should carefully evaluate their use in your applications.
Bear in mind these types of presentation guidelines to aid your applications' performance and user experience, and be sure to plan and test for the lowest common denominator that all browsers can accommodate.
Often, partitioning business services onto a dedicated business logic application server, separate from the primary application server, can yield better application organization and easier maintenance where the application warrants it. You can maximize your application's data services by carefully constructing them and by ensuring that a separate database server (in this case, a separate machine) is used to increase processor capacity for any database transactions.
These are several of the most important topics you and the developers creating your Web applications should consider early on. In doing so, you ensure that your Web applications are designed and coded with scalability in mind.
In addition to application design and construction considerations, you must also plan accordingly to avoid common bottlenecks that can negatively affect a Web application's performance.
Following are typical bottlenecks that can affect your application's ability to perform and scale well:
Improper Domain Name System (DNS) setup and configuration on Web servers is one of the most common problems administrators encounter. This section addresses the following topics:
DNS is a set of protocols and services on a TCP/IP network that allows network users to use hierarchical natural language names rather than computer IP addresses when searching for other computer hosts (servers) on the network. DNS is used extensively on the Internet as well as on private enterprise networks, including LANs and WANs.
The primary capability contained within DNS is its ability to map host names to IP addresses, and vice-versa. For example, suppose the Web server at Allaire has an IP address of 157.55.100.1. Most people would connect to this server by entering the domain name (www.allaire.com) and not the less friendly IP address. Besides being easier to remember, the name is more reliable because the numeric address could change for a variety of reasons, but the name can always be reserved.
Internet DNS is a powerful and successful mechanism that has enabled huge numbers of individuals and organizations to create easily locatable Web sites on the Internet. However, DNS by itself may not allow your Web site to perform and scale as it needs to, thus causing it to become unavailable and unreliable. Whether or not you use DNS by itself to load balance inbound traffic depends largely on the site's purpose and the amount of concurrent activity you expect on it. For instance, a low-volume, static site that provides only textual HTML information can likely be accommodated just fine by round-robin DNS. However, a dynamic e-commerce site on which you anticipate high volume likely won't perform or scale well if it is supported only by round-robin DNS.
To understand why, let's look further at the e-commerce example. Even if you have planned ahead and set up multiple servers to support this high volume site, if you rely only on DNS, it can only do two things: translate the natural language names to server IP address mappings so that users can find the site, and if you've enabled round-robin distribution for multi-server load balancing, it can distribute the load among each server in a rote, sequential distribution manner. However, if a spike in user activity occurs and causes servers to overload or fail, round-robin DNS will keep distributing the requests among all of the servers, even if some of them are no longer operational.
In short, Internet DNS is limited in its capabilities, and its round-robin distribution mechanism does not contain any intelligence that allows it to monitor, manage, and react to overloaded or failed servers. Consequently, DNS by itself is not a sound load balancing or failover solution for your business-critical sites. The load balancing and failover technology that ColdFusion provides, ClusterCATS, compensates for DNS limitations and allows you to create highly available, reliable, and scalable ColdFusion Web applications.
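The limitation is easy to see in a small simulation. The Python sketch below compares a plain rotation, which behaves like round-robin DNS, with a rotation that skips servers reported as down. It is a simplified illustration and does not represent how ClusterCATS works internally; the addresses and the `is_up` check are hypothetical.

```python
from itertools import cycle

SERVERS = ["192.168.0.1", "192.168.0.2", "192.168.0.3"]  # illustrative addresses


def round_robin(servers, request_count):
    """Plain rotation: every server gets its turn, whether or not it is up."""
    rotation = cycle(servers)
    return [next(rotation) for _ in range(request_count)]


def health_aware(servers, is_up, request_count):
    """Rotation that skips servers reported as down."""
    rotation = cycle(servers)
    assigned = []
    while len(assigned) < request_count:
        if not any(is_up(server) for server in servers):
            raise RuntimeError("no servers available")
        candidate = next(rotation)
        if is_up(candidate):
            assigned.append(candidate)
    return assigned


failed = {"192.168.0.2"}  # pretend this server has crashed


def is_up(server):
    return server not in failed


plain = round_robin(SERVERS, 6)
aware = health_aware(SERVERS, is_up, 6)
lost = sum(1 for server in plain if server in failed)

print("round robin sent", lost, "of", len(plain), "requests to a failed server")
print("health-aware assignments:", aware)
```

In this run, plain rotation keeps sending one request in three to the failed server, while the health-aware rotation spreads all six requests across the two servers that are still running.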
Following are core DNS elements that you must understand and be able to configure if your ColdFusion applications are to work well with DNS:
A Domain Name System is composed of a distributed database of names. The names in the DNS database establish a logical tree structure called the domain name space. On the Internet, the root of the DNS database is managed by the Internet Network Information Center (InterNIC). The top-level domains were originally assigned organizationally and by country. Two-letter and three-letter abbreviations are used for countries and various abbreviations are reserved for use by organizations.
A domain is a node on a network and all of the nodes below it (subdomains) that are contained within the DNS database tree structure. Domains and subdomains can be grouped into zones to allow distributed administration of the name space. More specifically, a zone is some portion of the DNS name space whose database records exist and are managed in a particular physical file. A single DNS server may be configured to manage one or multiple zone files. Each zone is anchored at a specific domain node. Zones are used for breaking up domains across multiple segments when you need to distribute the management of the domain to multiple groups and for replicating data more efficiently.
The following figure illustrates these concepts.
DNS servers store information about the domain name space and are referred to as name servers. Name servers typically have one or more zones for which they are responsible. The name server has authority for those zones. When you configure a DNS name server, you tell it all the other DNS name servers that are in the same domain.
There are three DNS record types that you must define and configure for each Web server in order for ColdFusion's load balancing and failover technology to work correctly. These records must be defined and configured on your local and primary DNS servers.
The A (address) record contains a host name to IP address mapping, where the natural language name is the primary name representing the IP address.
The PTR (pointer) record contains the IP address to host name mapping. This is the reverse lookup of the A record: given the IP address, it returns the natural language host name for that address.
CNAME is short for canonical name record. This record contains an alias name that maps to the primary host name of the Web server. For example, if you have a server named fred.yourcompany.com, you could assign it an alias of www1.yourcompany.com so that users never see fred.yourcompany.com in the event of a server redirection.
To see how all of these records work together, let's look at a simple example. Suppose there are two Web servers named fred.yourcompany.com and barney.yourcompany.com, with aliases of www1.yourcompany.com and www2.yourcompany.com, respectively. You don't ever want your users to see the primary host names (A records) for these servers in their browser; rather, you only want them to see their assigned aliases (CNAME records) when being redirected.
Therefore, your DNS entries would look like the following:
; Addresses for canonical names | | |
fred.yourcompany.com | IN A | 192.168.0.1 |
barney.yourcompany.com | IN A | 192.168.0.2 |
; Aliases | | |
www1.yourcompany.com | IN CNAME | fred.yourcompany.com |
www2.yourcompany.com | IN CNAME | barney.yourcompany.com |
; Round Robin | | |
www.yourcompany.com | IN A | 192.168.0.1 |
 | IN A | 192.168.0.2 |
www1.yourcompany.com | IN A | 192.168.0.1 |
www2.yourcompany.com | IN A | 192.168.0.2 |
To ensure that your site lookups and translations occur as intended, you must provide correct entries in your DNS records, as shown above. Also, if you want to enable round-robin DNS functionality, your round-robin entries must be done in the manner shown above. On the Windows platform, you make these DNS entries using the Domain Name Service Manager utility. On UNIX platforms, you make these DNS entries in the name.db file, which is read by the DNS server's Berkeley Internet Name Domain (BIND) software. See "Configuring ClusterCATS offline maintenance support (NT only)" for detailed procedures.
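How the three record types cooperate during a lookup can be sketched with plain dictionaries. The Python model below reuses the host names and addresses from the example, but it is only a toy resolver under stated assumptions: it performs no network queries, the PTR entries are assumed (they are not part of the listing above), and the `resolve` and `reverse` helpers are hypothetical names.

```python
# A records: host name -> one or more IP addresses
A_RECORDS = {
    "fred.yourcompany.com": ["192.168.0.1"],
    "barney.yourcompany.com": ["192.168.0.2"],
    "www.yourcompany.com": ["192.168.0.1", "192.168.0.2"],  # round robin
}

# CNAME records: alias -> primary (canonical) host name
CNAME_RECORDS = {
    "www1.yourcompany.com": "fred.yourcompany.com",
    "www2.yourcompany.com": "barney.yourcompany.com",
}

# PTR records: IP address -> host name (assumed entries for the reverse lookup)
PTR_RECORDS = {
    "192.168.0.1": "fred.yourcompany.com",
    "192.168.0.2": "barney.yourcompany.com",
}


def resolve(name, max_hops=8):
    """Follow CNAME aliases until an A record is found; return its addresses."""
    for _ in range(max_hops):
        if name in A_RECORDS:
            return A_RECORDS[name]
        if name not in CNAME_RECORDS:
            raise KeyError(f"no record for {name}")
        name = CNAME_RECORDS[name]
    raise RuntimeError("CNAME chain too long")


def reverse(address):
    """Reverse lookup: map an IP address back to its host name."""
    return PTR_RECORDS[address]


print(resolve("www1.yourcompany.com"))
print(resolve("www.yourcompany.com"))
print(reverse("192.168.0.2"))
```

A lookup of the alias is answered by following the CNAME to the primary host name and then reading that host's A record, while the name with two A records is what makes round-robin distribution possible.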
For additional detailed information about DNS and all of its components and how they work together, you can check out the following resources:
See also "Other Informational Resources".
Load testing is the process of defining acceptable benchmarks for your Web application's performance and then simulating load and measuring resulting response times and throughput against those benchmarks. You perform load testing to measure the application's ability to scale.
Load testing is important to your Web site's success because it lets you test its capacities before you deploy it, thereby enabling you to find problems and fix them before they are exposed to your users. Determining your site's purpose and the amount of traffic you anticipate it will receive may affect how you load test it.
Small sites that don't expect heavy concurrent loads may be able to organize actual users to access the site simultaneously to perform load testing. However, this is often a difficult activity to accomplish well because it introduces many human variables. Therefore, it is typically not a practice that we advocate. In fact, for larger business-critical systems that expect heavy concurrent load, this type of testing is not feasible and cannot provide satisfactory or realistic results.
A better approach to load testing is to use load simulation software. There are some excellent software load testing tools on the market that let you simulate heavy load hitting your Web server. By using the load testing software in conjunction with your defined benchmarks and formal test plans, you can confidently determine if your Web application is ready for deployment.
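The core of what load simulation software does can be sketched briefly: issue many requests concurrently, record response times, and compare the results with your benchmarks. The Python example below is a toy illustration, not a substitute for a commercial tool. `fake_request` is a hypothetical stand-in that sleeps instead of contacting a real server, and the threshold values passed to `meets_benchmark` are arbitrary.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(_):
    """Hypothetical stand-in for one page request; returns its response time."""
    start = time.perf_counter()
    time.sleep(0.01)  # pretend the server took about 10 ms to respond
    return time.perf_counter() - start


def run_load(request_fn, virtual_users, requests_per_user):
    """Issue requests from many simulated users at once and collect timings."""
    total = virtual_users * requests_per_user
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=virtual_users) as pool:
        timings = list(pool.map(request_fn, range(total)))
    elapsed = time.perf_counter() - start
    return {
        "requests": total,
        "mean_s": statistics.mean(timings),
        "max_s": max(timings),
        "throughput_rps": total / elapsed,
    }


def meets_benchmark(report, max_mean_s, min_throughput_rps):
    """Compare measured results with the benchmarks defined in the test plan."""
    return (
        report["mean_s"] <= max_mean_s
        and report["throughput_rps"] >= min_throughput_rps
    )


report = run_load(fake_request, virtual_users=10, requests_per_user=5)
print(report)
print("meets benchmark:", meets_benchmark(report, max_mean_s=0.5, min_throughput_rps=50))
```

In a real test, you would replace `fake_request` with a function that sends an HTTP request to the site under test, and you would take the thresholds from your written test plan.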
Another reason to load test is to verify your failover capabilities. Failover ensures that if a primary server within a cluster of servers stops functioning, then subsequent user requests are directed to another server within the cluster. Failover is addressed in more depth in "What is Web Site Availability?". Using the load testing software of your choice, you can essentially force a server redirection by designating a machine as "unavailable" or by shutting it down.
Note: ClusterCATS for ColdFusion uses the HTTP protocol to redirect packets of data from a failed server to an available server. Therefore, it is important to verify that your load testing tool can handle HTTP redirections properly before you initiate load testing.
One of the first things you need to do to be able to load test is purchase a load testing software tool and learn how to use it. There are a variety of good load testing software tools on the market, including RSW's e-TEST Suite (which includes a load testing component called e-LOAD), Segue's SilkPerformer, Rational Software's PerformanceStudio, and Mercury Interactive's LoadRunner. Each of these packages provides a substantial Web-enabled software testing solution that will help you effectively simulate and test load.
After you purchase, install, and learn to use the load testing software, you need to determine benchmarks that you want to or must achieve for your Web site to ensure a good user experience. Following that, you must formalize your testing strategy by designing and developing written test plans against which you'll execute your tests.
Once your test plans are written and approved, it's time to run the tests. After you do so, you need to capture and analyze the load testing results and report the statistics to the development team. From there, you'll need to reach consensus about what are the most serious problems you discovered, what are the necessary changes to make, and what is the best way to implement the fixes. After the changes are made and a new build of the application is available, you'll rerun the tests to look for performance improvements. Again, you'll reanalyze the testing results and continue this cycle until the site is operating within the established parameters that you've set. When your team agrees that the site scales well and is operating at peak performance under heavy stress, you're ready to deploy the application into a production environment.
Before starting your load testing, consider the following:
Make sure you understand your Web site's performance and scalability requirements before you start running tests against your site. Otherwise, you won't know what you're testing for and the statistics you capture won't have significance. Also, remember that the benchmarks you define should be customized for the current application; don't simply reuse benchmarks from an earlier site on which you may have worked. Web applications often differ in their design, construction, backoffice integration, and user experience requirements.
Create a test environment that matches, as closely as possible, the actual production environment in which the Web site will be hosted. If you don't simulate a similar network and bandwidth scenario, use the same types of servers, or ensure that the same versions of software (operating system, service packs, Web server, and third-party tools) reside on both the test and production servers, you can't anticipate problems or determine why they occur. The number of possibilities would be too large.
Load testing in a distributed environment can be problematic if the network on which you are performing your load tests becomes congested, resulting in poor response times. Additionally, if everyone else in the organization is using that network for their everyday activities, such as e-mail, source control, and file management, an increased load going over the network will likely cause significant network degradation for them. As they likely have nothing to do with the testing effort, this situation can cause great frustration.
In such a scenario, it may be more effective to physically sit in front of the server on which the application resides and perform the tests locally rather than bring the entire LAN or WAN to a slow crawl. Also, by testing locally, you are better able to rule out the network as the source of the scalability problems. Alternatively, you may be able to configure a separate subnet on the LAN or WAN that is distinct from the subnet on which everybody else in your environment uses network services.
You should now have a good overview of what scalability implies, the core elements that comprise it, some of the issues that affect successful implementations, and the tasks that must be performed to verify that your Web applications are able to achieve satisfactory scalability.
The next section describes Web site availability and reliability concepts and considerations.