LinuxWorld

February 1999


Ready for the breakthrough: The state of Linux clustering

Linux kernel 2.2.0 is a turning point when it comes to symmetric multiprocessing

Summary
Cameron Laird examines the current state of Linux multiprocessing, and looks at its future. He covers basic clustering concepts, the impact of the 2.2.0 Linux kernel, Beowulf clustering, high availability, reliability, and cluster management. (2,100 words)
By Cameron Laird

Is there a Linux cluster in your future? We'll find out, with a look at the state of Linux multiprocessing, paying particular attention to the role of clustering over the next year.

The what, how, and why of clustering
The central idea of computer clustering is simple: A cluster is any collection of two or more computers that can be accessed independently but also as a unit.

This simplicity clouds quickly, though, as we begin to examine how and why clustering is done. Several related but distinct concepts blur the boundaries of what exactly a cluster is.

Users cluster computers to get more -- more processing power, more reliability, or more manageability. Start by looking at a concrete example, say, a Web site that's beginning to gag on its hit load. It needs more cycles. How do you get them?

The most immediate choices are to move to one of the following:

  • A faster processor
  • A more capable processor
  • More processors in a single box
  • More boxes working together on the same job
  • More boxes working on the same job but with different data
  • More boxes cooperating in even more esoteric ways

Multiprocessing is any arrangement that uses more than one processor to solve a single problem. The most familiar variant now is symmetric multiprocessing, or SMP, in which two or more processors share the same backplane. SMP is a commodity in the contemporary 80x86 world, where dual-Pentium boxes are standard catalog items.
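
If you're curious what a given box's kernel actually sees, a quick way to check is to count the processor entries in /proc/cpuinfo. The short Python sketch below does just that; on a dual-Pentium SMP box running an SMP-enabled kernel it should report two processors.

    # Count the processors the kernel exposes by reading /proc/cpuinfo.
    def count_cpus(path="/proc/cpuinfo"):
        with open(path) as cpuinfo:
            return sum(1 for line in cpuinfo if line.startswith("processor"))

    if __name__ == "__main__":
        print("%d processor(s) visible to the kernel" % count_cpus())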

A multiprocessing operating system is hard to get right the first time; both Windows NT and Linux had problems with their early SMP installations. In this area, kernel release 2.2.0 is a watershed for Linux. Its impact will continue to unfold in the year ahead, but it's already significant. Linux 2.2.0 makes VA Research Founder Larry Augustin feel "comfortable" up to four-way processing. VA Research Chief Technology Officer Leonard Zubkoff is one of those now massaging BusLogic drivers to enable the kernel to scale properly to eight processors, and there are plenty of plans afoot for the kernel to manage dozens of processors.

And it isn't just insiders who are aware of the importance of 2.2.0. (Linux inventor Linus Torvalds called the kernel a "big weight off my back.") Recently, PC Week was one of several mainstream magazines to herald Linux 2.2.0 as "enterprise-ready" on the strength of its multiprocessing capability.

Remember that SMP boxes support the closest possible teamwork between processors. At the other extreme are several software schemes for heterogeneous distributed processing. These facilitate programming applications which do most of their work on an isolated node, with results occasionally communicated over data links of no-better-than-Internet reliability and speed. These arrangements sometimes receive press coverage after locating another prime number or decrypting an intractable test message.

Slightly closer couplings appear in the categories of job scheduling or load leveling software. These typically distribute atomic tasks -- one HTTP delivery or graphical rendering, say -- to whichever host in a compute farm is ready for another assignment.
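
To make the idea concrete, here is a toy load leveler in Python. The worker threads stand in for hosts in a compute farm, and the shared queue plays the part of the scheduler: whichever "host" finishes first simply pulls the next job. The host names and job mix are made up purely for illustration.

    # A toy load leveler: a shared queue of atomic tasks, with one worker
    # thread standing in for each host in a small compute farm.
    import queue
    import random
    import threading
    import time

    tasks = queue.Queue()
    for job_id in range(20):              # twenty independent jobs (hits, renders, ...)
        tasks.put(job_id)

    def host(name):
        while True:
            try:
                job = tasks.get_nowait()  # grab the next job as soon as we're free
            except queue.Empty:
                return                    # the farm is drained; this host goes idle
            time.sleep(random.uniform(0.1, 0.5))   # pretend to serve or render
            print("%s finished job %d" % (name, job))

    workers = [threading.Thread(target=host, args=("host%d" % n,)) for n in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()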

Clustering splits the difference between load leveling and SMP. It generally involves relatively homogeneous hardware, specialized fast interconnects whenever possible, and a range of useful scaling factors.

To solve our example problem of an overloaded Web server, different sites have taken each of these approaches (along with reliance on several special-purpose products tuned exclusively for scaling Web service). For academic surveys of the trade-offs, see Distributed and Parallel Computing and Designing and Building Parallel Programs (see Resources), authoritative textbooks that explain the software issues involved in cooperative computing.

That's how the landscape of multiprocessing technologies looks from cruising altitude. Now, what are your specific challenges and how might clustering help with them?

Intensive scientific computing
Suppose you need to schedule the flight crews for a global airline, simulate the magnetic fields around a supernova, predict the number of tornadoes Nebraska will have tomorrow, or locate the best spot to toss down your next 10 million dollars' worth of oil well. The most cost-effective way for you to get these high-performance computing (HPC) results is probably with a Linux cluster.

Supercomputers traditionally perform these kinds of calculations, shipping with specialized hardware and software to manage HPC complexity. They do this at a considerable price, however. For much less money you can lash together commodity hardware with Linux and Linux-specific clustering software to achieve the same computational throughput. That's the achievement of the Beowulf clustering project, and Linux's competitive advantage in HPC is on course to keep growing for the foreseeable future.

Beowulf, which originated at the NASA Goddard Space Flight Center, is known to many as the "Extreme Linux" package that Red Hat retails for $29.95. Beowulf has already achieved several impressive milestones. The Avalon project won a prize in the 1998 Gordon Bell competition for the supercomputing performance of its Beowulf cluster. Oak Ridge National Laboratory has received extensive press coverage for its Stone SuperComputer, a useful supercomputer built from obsolete parts.
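
Programs for Beowulf clusters are usually written against a message-passing library such as MPI or PVM, most often in C or Fortran. The sketch below uses mpi4py, a Python binding for MPI, purely to keep the example short: each node sums its own slice of the numbers from 1 to N, and node 0 collects the grand total.

    # A minimal message-passing sketch in the Beowulf style.
    # Run under an MPI launcher, for example: mpiexec -n 4 python partial_sum.py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()      # this node's id within the cluster
    size = comm.Get_size()      # how many nodes are cooperating

    N = 1000000
    local = sum(range(rank + 1, N + 1, size))      # each rank sums its own slice
    total = comm.reduce(local, op=MPI.SUM, root=0) # rank 0 gathers the grand total

    if rank == 0:
        print("sum of 1..%d computed on %d node(s): %d" % (N, size, total))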

Beowulf-style supercomputing is still a small minority of the field, and inertia isn't the only reason. The proprietary HPC vendors typically bundle plenty of value beyond raw computational cycles, including:

  • Compilers for special, parallelized languages
  • Hardware and software to simplify management of parallel nodes (so they can be hotswapped, assigned to specific tasks, monitored for routine maintenance, and so on)
  • Specialized hardware architectures to boost the effective data throughput

It's likely that all of these technologies will become more and more widely diffused and "open." Nothing intrinsic to Linux keeps it from incorporating such pieces. Augustin offers a mundane example of the spread of such techniques: several customers have come to VA Research with the need to pack a lot of boards into a small space. This might arise from simple constraints on office real estate, or perhaps limits on acceptable interconnect latency. His company has responded to this market by becoming accomplished in the details necessary to fill a standard seven-foot cabinet with economical processor cards while keeping uptime high. There are no particular secrets to this.

Augustin proudly states that his company's value-add lies in its execution of what's publicly known -- all the software VA Research writes is turned back for inclusion in the Linux core source.

We all know Linux makes it easy to turn old, unused Windows machines into valuable e-mail and Web servers. This has partly enabled the Internet boom of the last few years. Beowulf technology has a similar potential to help create qualitatively different uses for computers. Red Hat Director of Technical Alliances Robert Hart speculates that Extreme Linux opens up the possibility of dramatic new achievements in such computing-intensive areas as:

  • Realtime, 99.99 percent accurate, general voice recognition
  • Realtime, high quality speech synthesis
  • "Home" image rendering
  • Stock/financial market simulation
  • Really complex games, such as true flight simulation with high-quality audio-video output

Hart surmises this technology will actually be the "'killer application' for Linux in the first five years of the next century, quite possibly sooner."

While the race between Beowulf-style HPC and proprietary alternatives will be an interesting one for years to come, Microsoft appears to be a bystander. It has arranged a few publicity events to demonstrate the ability of NT clusters to take on computationally intensive jobs, and its Web pages on the subject are cogently written. However, as the Microsoft Cluster Server (MSCS) Overview states, the "algorithms and features in the current software must be extended and thoroughly tested on larger clusters before customers can reliably use a multinode MSCS cluster for production work, or gain enhanced cluster benefits." The software publicly available to support HPC on NT is surprisingly primitive, even when compared with Beowulf.

Linux availability
A second reason to cluster is for availability. This is the "even if one engine fails, we'll still be able to fly back to base" theory. There are several subtleties involved in this scheme that complicate Linux's availability case.

There's no question that availability is important. There are plenty of applications that need to be up around the clock, but they don't all require the flat-out horsepower of HPC. Think of all the Web servers, factory controllers, telecommunications switches, ATMs, medical and military monitors, flight-control systems, and stock-transaction data stores in the world. A blue screen of death for any of these generally has swift consequences: somebody loses money, a job, or more.

A distinction is often made between reliability and availability. A component is reliable if it lasts a long time before failing; a system is "highly available" if its pieces can fail safely. Microsoft's Wolfpack clustering technology, for example, "can automatically detect the failure of an application or server, and quickly restart it on a surviving server," as its Web site explains well. Failover capability like this is what most buyers think they're getting when they begin asking about Linux clusters.

They're in for disappointment. There's simply no standard failover for Linux now.
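
To see what those buyers are actually asking for, consider the heartbeat half of failover: a standby node watches its partner and takes over a service when the partner stops answering. The Python sketch below is a bare-bones illustration; the host name, timing values, and takeover command are all assumptions, and a production failover system adds IP takeover, shared-storage fencing, and much more.

    # Hypothetical heartbeat monitor for a two-node cluster: the standby
    # pings the primary and, after several missed beats, starts the service
    # locally. Names, intervals, and commands here are illustrative only.
    import subprocess
    import time

    PRIMARY = "primary.example.com"   # assumed name of the partner node
    INTERVAL = 2                      # seconds between heartbeats
    MISS_LIMIT = 3                    # missed beats before declaring failure

    def peer_alive(host):
        # One ICMP echo with a one-second timeout; exit code 0 means it answered.
        return subprocess.call(["ping", "-c", "1", "-W", "1", host],
                               stdout=subprocess.DEVNULL,
                               stderr=subprocess.DEVNULL) == 0

    missed = 0
    while missed < MISS_LIMIT:
        missed = 0 if peer_alive(PRIMARY) else missed + 1
        time.sleep(INTERVAL)

    print("primary unreachable -- taking over the service on this node")
    subprocess.call(["/usr/local/bin/start-webserver"])   # hypothetical takeover script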

That unpromising reality isn't the end of the story, though. It's true that Linux has a lot of catching up to do in managing availability, even compared with Wolfpack, let alone with more robust operating systems such as OpenVMS and Solaris.

However, three points are relevant here:

  1. This condition is likely to change over the next year
  2. For many jobs, reliability can substitute for availability
  3. An architecture that supports availability might not be as necessary as was once believed

The following sections explain why.

Linux's reliability
Many low-level rewrites are necessary before Linux will have the facilities to support high availability properly. The good news, though, is that many of these pieces are already in development because they're valuable even in isolation. Those that might appear in the standard distribution before long include:

  • Support for high-speed Ethernet, which could answer interconnect questions, at least temporarily

  • Filesystem experiments, which in Augustin's estimation should yield a trustworthy journaling filesystem within the next year or so

  • The Eddie Project, a beta-level initiative that now includes four components crucial to management of high availability, including IP migration and content replication

  • The soon-to-be-released Pacific HiTech Cluster Web Server software, designed to bring high availability to clusters of Apache Web servers

Even without these enhancements, very high reliability can substitute for high availability for some jobs. Sam Ockman is president of Penguin Computing and a former employee of VA Research. Penguin, like VA, packages turnkey Linux servers. Much of Penguin's business has to do with putting together boxes that simply don't break. Ockman rather gleefully describes how he's packed as many as 18 fans into a single host to ensure that operating temperature stays within bounds. What has this attention to detail achieved? "I don't want to jinx myself, but I'll tell you this -- we've never had to have a computer shipped back to us to be fixed."

There are plenty of other ways to improve reliability apart from failover or more sophisticated clustering schemes. Disk drives are usually the first component to go in a well-run shop, and RAID answers this frailty directly. Clean, stable power also heads off many problems.

Cluster manageability
The third motivation to cluster, beyond HPC and high availability, is manageability. This year's jargon for the idea is server consolidation. Although the term isn't always used consistently, the aim generally is to manage a span of computing resources in a uniform way. An example element of a manageable cluster is software that selectively reports on processes either one machine at a time (what the Unix ps command conventionally does) or across an entire cluster. Commercial vendors, including Hewlett-Packard and Digital Equipment (now Compaq), currently offer much more than Linux does in cluster manageability. As Beowulf-style installations become more widespread, it's likely that this sort of software will come online.
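
As a flavor of what such management software does, the Python sketch below runs ps on every node over ssh and tags each line with the host it came from. The node names are hypothetical; a real cluster manager would layer authentication, grouping, and monitoring on top of this simple idea.

    # A bare-bones "cluster ps": run ps on each node over ssh and prefix
    # every line of output with the host it came from.
    import subprocess

    NODES = ["node1", "node2", "node3"]    # assumed cluster members

    def cluster_ps(nodes):
        for host in nodes:
            result = subprocess.run(["ssh", host, "ps", "-eo", "pid,comm,%cpu"],
                                    capture_output=True, text=True)
            for line in result.stdout.splitlines()[1:]:    # skip the ps header line
                print("%-8s %s" % (host, line))

    if __name__ == "__main__":
        cluster_ps(NODES)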

Is clustering right for you?
If you need HPC and have more Linux expertise than money, a Beowulf cluster is a natural choice. Supercomputer manufacturers will continue to dominate the extreme top end of HPC, but Linux already leads in cost-effectiveness, and that lead should only grow.

Linux, frankly, isn't ready for high-availability requirements right now. Check back in a year, though, and this might have changed. This also means that, if you have good ideas on Linux clustering, this is the time to make them real, so that they can become part of the standard distributions.


About the author
Cameron Laird (along with Kathryn Soraiz) manages a software consultancy, Network Engineered Solutions, from just outside Houston, TX.

What people are saying:

There are a number of hardware products out there which perform clustering over networks separately from system/node clustering (Cisco, Alteon, HydraWEB, RND Networks, Arrowpoint, Resonate, HolonTech). These perform clustering and load balancing between several nodes in a cluster using an external hardware unit.
--Rawn Shah


 

Resources

  • Distributed and Parallel Computing
  • Designing and Building Parallel Programs

(c) 1999 LinuxWorld, published by Web Publishing Inc.