For more info:

Vendor clustering papers:
Compaq
Digital
HP
IBM
Microsoft
NCR
Stratus
Tandem

Wolfpack: Will this dog hunt? - Network World, 9/2/96.

Picking the best cluster - Consultants speak out, Network World, 9/2/96.

Strom runs his own consulting firm in Port Washington, N.Y., and has written hundreds of articles on computer networking. He maintains two Web sites: Web Informant, for Web-based marketing issues, and Web Compare, for Web server feature comparisons. He can be reached at david@strom.com.


Solving the clustering conundrum
A guide to choosing the server clustering system that's best for you.

By David Strom
Network World, 9/2/96

Server clustering is shaping up as a new weapon that vendors of Windows NT systems are using in their battle to win market share from Unix vendors, and to convince you that NT is ready to run your mission-critical applications.

What the NT vendors aren't telling you is that the number of applications that can take advantage of clustering is limited and that the Unix folks, who have been at this game a lot longer, pretty much have cornered the market for IP-based applications such as Web servers.

That said, there are certain applications where NT clusters will serve you well - database servers, for example. And things are bound to get better going forward, so it would behoove you to understand what to look for in a clustering system and how the various implementations differ.

At its core, server clustering is a simple idea: Take several servers and tie them together using a variety of hardware and software tricks to ensure either high availability or scalable performance. Ideally, the goal is to have both: a collection of hardware that doesn't fail and doesn't run out of gas as applications grow and consume more processing horsepower.

However, for such a simple notion, there are many subtle differences among the various implementations, which could have drastic consequences for corporate network architects and application builders.

First, let's be clear that clustering is different from symmetric multiprocessing (SMP), whereby more than one CPU resides inside a single server. With SMP, the operating system is aware of the multiple processors and divides its own tasks among the various CPUs. The multiple CPUs share memory, disk and other machine resources. However, when one of these shared resources fails, the entire machine stops working.

That limitation is overcome when SMP hardware is combined with clustering systems, providing the ultimate in high-reliability and scalable machines.

Clustering solutions have long been available in the minicomputer, mainframe and Unix worlds. IBM, Digital Equipment Corp., NCR Corp. and Tandem Computers, Inc. have been selling clustered servers for over a decade. Within the past year, these four vendors, as well as Intel Corp. and Compaq Computer Corp., have also begun selling PC clusters, primarily on Windows NT servers. Hewlett-Packard Co. is soon to follow. And Microsoft, trying to beef up NT clustering, has announced a major effort to standardize and port the best of the Unix world to NT under the code name Wolfpack.

Finding the right mix

Analysts generally agree there are only two basic clustering models. The difference between them is whether multiple copies of the same application have the ability to access the same file.

If they do, it is called shared disk or symmetric data access, and special software called a distributed lock manager has to mediate access among the various applications.
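To make the shared-disk idea concrete, here is a minimal sketch of what a lock manager does: it lets only one server at a time hold a given file. The class name, server names and file name are hypothetical, and the whole thing runs inside one process; a real distributed lock manager coordinates over the cluster interconnect and handles many cases this toy ignores.

```python
import threading


class ToyLockManager:
    """Illustrative only: grants each named resource to one server at a time."""

    def __init__(self):
        self._guard = threading.Lock()   # protects the ownership table
        self._owners = {}                # resource name -> server holding it

    def acquire(self, resource, server):
        """Return True if `server` now holds `resource`, False if another does."""
        with self._guard:
            holder = self._owners.get(resource)
            if holder is None or holder == server:
                self._owners[resource] = server
                return True
            return False

    def release(self, resource, server):
        """Release `resource` if `server` is the current holder."""
        with self._guard:
            if self._owners.get(resource) == server:
                del self._owners[resource]
                return True
            return False


dlm = ToyLockManager()
first = dlm.acquire("orders.db", "server-a")    # granted
second = dlm.acquire("orders.db", "server-b")   # refused while server-a holds it
dlm.release("orders.db", "server-a")
third = dlm.acquire("orders.db", "server-b")    # granted after the release
print(first, second, third)  # True False True
```

The point of the sketch is the refusal in the middle: two copies of an application can see the same file, but the lock manager decides whose turn it is.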

If they don't, it is called shared nothing or partitioned data access, and each server in the cluster owns a portion of the overall disk resources. Most of the NT clustering solutions support shared nothing, whereas most Unix alternatives can be configured to support both models.
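By contrast, a shared-nothing cluster needs no lock manager because every piece of data has exactly one owner. The sketch below assumes a hypothetical customer database split alphabetically across two servers; the partition boundaries and server names are made up for illustration and are not taken from any vendor's product.

```python
# Hypothetical partition table: (first initial, last initial, owning server).
PARTITIONS = [
    ("A", "M", "server-a"),
    ("N", "Z", "server-b"),
]


def owner_of(customer_name):
    """Return the single server that owns the partition holding this record."""
    initial = customer_name[0].upper()
    for low, high, server in PARTITIONS:
        if low <= initial <= high:
            return server
    raise ValueError("no partition covers " + repr(customer_name))


print(owner_of("Baker"))  # server-a
print(owner_of("Strom"))  # server-b
```

Because each request is routed to one owner, the servers never contend for the same file. The trade-off is that when a server fails, its partition has to be handed to a survivor before its data is reachable again.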

Within these models, however, a great deal of variation exists in several areas: the number of servers that can be connected; the nature of the connection; the protocols, client machines and applications supported; whether any software is required on each network client; how the cluster is managed; and whether you can mix different processor hardware inside one cluster.

Perhaps the most important issue is the mix of applications, client operating systems, and protocols that the clustering system supports. Each vendor offers a different combination, depending on whether you choose its NT or Unix system. It pays to read the fine print to determine whether what is supported will match up with your application needs.

The best use of clustering involves database applications: Here is where you'll get improved availability of the server as well as the scalable performance gained by adding extra hardware. However, the clustering hardware must support your particular database server, along with the specific clients and protocols used to access that server. That could be a tough situation.

For example, with Digital's first release of NT clustering, which began shipping earlier this summer, only applications that make use of Named Pipes and Server Message Block protocols will work. This means all IP-related applications such as Web servers aren't supported on Digital's NT clusters, although they will work on its Unix counterparts. And though Digital's NT cluster does support non-Windows clients, they have to manually reconnect to the server after a failure.

Compaq's On-Line Recovery Server, which has been shipping since last year, supports two applications - Oracle 7.3 and SQL Server 6.5 - and only Windows clients. It won't support Macintosh or OS/2 clients at all, and non-NT clients have to manually reconnect.

NCR's LifeKeeper, shipping since this past January, seems to have the widest support for all kinds of clients and applications; the product supports automatic reconnection of any NT client, for example. ''We have a richer set of features and functions, and can work with a wider array of software. While Compaq only runs with the latest versions of Oracle and SQL Server, we can run with older versions,'' says Martin Sinnott, director of Windows NT product marketing for NCR.

Finally, IBM announced its clusters will support Notes running on Unix, NT and OS/2, as well as an NT version of DB2, to ship later this year.

Server sweet spot

Beyond the right mix of applications, the next critical item is how many individual servers are supported by the cluster. Keep in mind each server can have more than one processor, enabling you to build ever more powerful servers. Most of the NT-based clusters today support two servers, whereas the Unix versions are more capable.

''For people worried about high availability, the sweet spot is three servers,'' says Eric Schott, a product manager for Unix clusters at Digital. Schott feels having two servers still allows for a small chance that two independent failures can bring down a cluster, such as a bad disk drive in one server and a broken power supply in a second. ''Having the third server in the cluster is insurance against these independent failures,'' he says.
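Schott's argument is easy to check with rough arithmetic. The sketch below assumes that servers fail independently and that each one is unavailable 1% of the time; that figure is a made-up illustration, not a number from Digital or any other vendor. Under those assumptions, the cluster is down only when every server is down at once.

```python
def cluster_outage_probability(p_server_down, servers):
    """Chance that all servers are down at once, assuming independent failures."""
    return p_server_down ** servers


P_DOWN = 0.01  # hypothetical: each server is unavailable 1% of the time

two_servers = cluster_outage_probability(P_DOWN, 2)
three_servers = cluster_outage_probability(P_DOWN, 3)

print("two servers:   %.6f%% of the time" % (two_servers * 100))
print("three servers: %.6f%% of the time" % (three_servers * 100))
```

With these assumed inputs, the third server cuts the chance of a total outage by another factor of 100. Real failures are rarely fully independent (a shared power feed or disk array can take out several servers together), so treat this as an upper bound on the benefit.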

Stratus Computer, Inc.'s Radio Clusters fits that bill. Radio Clusters, which began shipping in July, comes with support for six servers and is expandable to 24. It uses Pentiums running NT.

Compaq and others have plans to support as many as four servers in future versions, while vendors including NCR plan to go beyond two servers in their NT clustering products later this year.

The Unix market is equally diverse. Digital's Unix products support a maximum of eight servers; HP's Unix solutions support four.

In August, Groupe Bull was to begin selling clusters of eight servers in its Sagister line, and Tandem already supports hundreds of nodes in its ServerNet clusters.

Tying it all together

Another major difference involves how the clustered machines interconnect. Each vendor has a mixture of cables, network cards and other gear. Some components are proprietary, such as Tandem's and Compaq's, whereas others use fiber or fast Ethernet network cards that are available from many vendors.

Generally, you have three kinds of interconnection, says Mark Wood, a product manager at Microsoft: You can use the ordinary network connection to the server, or a shared SCSI bus that connects just the disk drives to multiple servers, or a dedicated, private link among the servers that is used in addition to the network connection. ''The last method is the best for high-availability applications, but it can also be the most costly,'' Wood says.

Both Digital's NT and Compaq's clustering solutions use the second method, connecting two machines to a common set of SCSI disks via separate SCSI cables. Compaq uses its own products for the disk drives and connectors. ''This means Compaq customers are locked into Compaq storage and cannot shop around for their disk drives. With Digital, you can choose the disks you want to buy,'' says Jane Wright, an analyst for Datapro Research. NCR's LifeKeeper can be configured to use either SCSI or a dedicated connection.

Digital uses a 100M bit/sec Memory Channel connection between machines in its Unix clusters. Although it's proprietary, it uses standard components and network connections. Tandem uses a 100M byte/sec connection for its ServerNet links, based entirely on its own equipment.
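Note the difference in units between the two figures: bits in one case, bytes in the other. A quick conversion, using only the rates quoted above and the standard eight bits per byte, puts them on the same footing. The variable names are ours, and the comparison covers raw link rates only, not delivered application throughput.

```python
BITS_PER_BYTE = 8


def mbit_to_mbyte(mbit_per_sec):
    """Convert a link rate from megabits/sec to megabytes/sec."""
    return mbit_per_sec / BITS_PER_BYTE


memory_channel_mbyte = mbit_to_mbyte(100)  # Digital figure as quoted: 100M bit/sec
servernet_mbyte = 100.0                    # Tandem figure as quoted: 100M byte/sec

ratio = servernet_mbyte / memory_channel_mbyte
print(memory_channel_mbyte, servernet_mbyte, ratio)  # 12.5 100.0 8.0
```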

Checklist items

Apart from interconnect hardware, you also need to consider whether additional software is required on the client end to support the cluster. At present, that's an easy task: only Digital's NT Cluster requires such software. ''This allows the servers to switch during failover,'' says Bob Guilbert, NT clustering marketing manager for Digital. It also provides for a single view of the cluster's resources from the client, which can be a boon for applications written to take advantage of this method - but few presently are.

Another critical difference is whether all the servers in the cluster have to be the same. This is important in situations where you want to add new versions of hardware to older clusters without having to upgrade every server. Digital offers the most flexibility, providing support for similar machines as long as they use the same brand of CPU. That means you can't mix Alpha and Intel versions of NT in the same cluster, but you can have servers using a mix of Intel chips - Pentiums and 486s, for example.

Next you should look at how the cluster handles server failures: How quickly does a switch from a failed server to a working server (called a failover) occur, and what happens when the primary server is repaired and brought back online (called failback)? Much depends on the cluster manager software, how it is set up, how often it detects whether each server is still alive, and so forth. Some of these parameters are configurable by the user, such as the time between each heartbeat, which is a series of packets sent out by each server to indicate that it is still working.
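The heartbeat mechanism can be sketched in a few lines. This is a generic illustration, not any vendor's cluster manager: the class name, the one-second interval and the three-missed-beats threshold are all assumptions, and timestamps are passed in by hand so the example runs the same way every time.

```python
class HeartbeatMonitor:
    """Illustrative only: flags servers whose heartbeats have stopped arriving."""

    def __init__(self, interval_s, missed_allowed):
        # A server is presumed dead after this much silence.
        self.timeout_s = interval_s * missed_allowed
        self.last_seen = {}  # server name -> time of last heartbeat (seconds)

    def record(self, server, now_s):
        """Note that a heartbeat from `server` arrived at time `now_s`."""
        self.last_seen[server] = now_s

    def failed_servers(self, now_s):
        """Return servers that have been silent longer than the timeout."""
        return sorted(
            name for name, seen in self.last_seen.items()
            if now_s - seen > self.timeout_s
        )


monitor = HeartbeatMonitor(interval_s=1.0, missed_allowed=3)
monitor.record("primary", now_s=0.0)
monitor.record("standby", now_s=0.0)
monitor.record("standby", now_s=4.0)       # primary has gone quiet by now
print(monitor.failed_servers(now_s=4.5))   # ['primary']
```

The two tunable numbers show the trade-off behind failover speed: a shorter interval or a lower threshold detects a dead server sooner, but raises the odds that a busy, healthy server is declared failed by mistake.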

Failbacks should happen without too much effort - otherwise, the entire high-availability notion of a cluster isn't worth much. Some products, such as the Digital and NCR NT clusters, can be restarted automatically, but this isn't the case with the Compaq products.

''It is ironic that when it is time to bring a failed Compaq server back up, both servers must be brought down simultaneously in order to restore the On-Line Recovery Server configuration,'' says Datapro's Wright.

Note that some vendors switch the server name or IP address from the failed to working servers. That can pose a problem if any of your applications expect a particular server name to be married to a specific MAC address.
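Here is a minimal sketch of why that can bite. It assumes a hypothetical two-node cluster in which the service IP address moves to the standby on failover while each machine keeps its own network card; the node names, IP address and MAC addresses are placeholders drawn from ranges reserved for documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    mac: str


def fail_over(bindings, service_ip, standby):
    """Return a new binding table with `service_ip` pointing at `standby`."""
    updated = dict(bindings)
    updated[service_ip] = standby
    return updated


primary = Server("node-1", "00:00:5e:00:53:01")   # placeholder MAC
standby = Server("node-2", "00:00:5e:00:53:02")   # placeholder MAC
SERVICE_IP = "192.0.2.10"                         # placeholder address

bindings = {SERVICE_IP: primary}
cached_mac = bindings[SERVICE_IP].mac             # an application remembers this

bindings = fail_over(bindings, SERVICE_IP, standby)
stale = cached_mac != bindings[SERVICE_IP].mac
print(stale)  # True: same IP address, different MAC after failover
```

Any application that checks the remembered MAC address against the one now answering for that IP will see a mismatch, which is worth testing for before you rely on a product that moves names or addresses between servers.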

Finally, there is the issue of how the cluster itself is managed. There are two basic options: Either manage the entire cluster as a single entity or be able to view the individual servers that make up the cluster. Neither option is necessarily better than the other; what matters more is using the same vendor for both network management and cluster software.

Given all this activity, it's clear that NT clustering products are going to improve over the next few years, possibly coming close to the features and capabilities of their Unix cousins. However, clustering will still have limited appeal for enterprise networkers. The number of applications, protocols and clients supported is still a small subset of that used in most organizations, making it a fairly narrow and expensive niche for the immediate future.

At present, your best bet is NCR's NT solution, with its wide support for clients and applications. And Unix is still the better choice if you need a Web server cluster or support for other IP applications.

