
When your Exchange server goes down

MS Exchange disaster-recovery wares

By Tom Henderson, Network World
May 03, 2004 12:00 AM ET

Network World - E-mail availability is an enterprise business mandate. To that end, we tested four high-availability products for arguably the most popular enterprise e-mail system: Microsoft's Exchange Server.

We tested Fujitsu/Softek's Softek Replicator 2.1.2; LeftHand Networks' SAN/iQ Software, Remote IP Copy Software and NSM 200 SAN Module combination; NSI Software's Double-Take for Windows 4.3; and XOSoft's WANSync HA Exchange 3.5.2 Build 45.


WANSync HA Exchange earns our Clear Choice designation because it adapted quickly to our setup, presented clear Exchange installation-specific options, and required no subsequent intervention to complete the processes of failover detection, failover and bringing our hotsite/back-up site online.

WANSync HA Exchange also is clearly built for an Exchange environment: It reads the Active Directory, the server registry and Exchange 2000/2003 files, and quickly gives an administrator a profile of where things are, how they're set and options for synchronization intervals. The other products tested treat Exchange more as a minor option - leaving the Exchange-specific details to administrators' customization and script-writing skills.

There are a number of ways to increase availability - shoring up the application platform with power protection, routing techniques and monitoring, for example. Our tests, however, assume a site's Exchange servers have become unavailable - gone from the network map entirely. This mandates that Exchange services become available from an alternate site.

This disaster simulation was simple for us. Using an SNMP trigger, we cut the power to our primary site's two servers: one running Exchange and acting as an Active Directory Global Catalog server, the other a forest-partitioned server in the same Active Directory domain.

The real test was to find out whether vendors' implementations could sense the primary site was down, then recast the hotsite's mirror to Exchange users. After the power failure we assessed how products detected the outage (all but the Replicator could). We then wanted to see the products bring Exchange services back online by using our VPN connection to fail over operations to the secondary site. We tracked how many messages were lost during failover and clocked time to availability.

WANSync HA Exchange and Double-Take for Windows do this automatically, but the latter lost some messages. LeftHand and Fujitsu/Softek require manual intervention to bring Exchange 2000/2003 back online at our simulated hotsite/back-up site location - and both lost messages. It should be noted that none of the products lost messages from the Exchange message stores - only messages in progress.

WANSync HA Exchange

A self-described 'switchover solution', WANSync HA Exchange is the only Exchange-specific product reviewed in this comparison, although it does retain XOSoft's WANSync technology that's used for other applications such as Microsoft SQL Server and Oracle 9.

Of two available deployment options, we chose the high-availability one (as opposed to the WANSync "file" method), which dictated the necessary steps to achieve a replicated server. These steps involved developing a scenario that brought the masters' settings together with their replicas'. This scenario mirrored all the settings necessary for the Active Directory, DNS, registry entries, Exchange-specific logs/file locations, and Exchange settings to be replicated to the secondary site.

We linked the source and replication server by making the replication server a 'switchover host' and chose a replication name. The auto-discovery process in WANSync HA Exchange queried our Active Directory and each host for its information. It then offered appropriate default selections of items such as source host monitoring (heartbeat, timeouts, IP pinging, for example) and let us run scripts before switchover or switchback.

WANSync HA Exchange can redirect DNS to point to the failover Exchange server, then change it back if necessary at a failback point. Also, it can change the IP addresses so that the remaining live Exchange server can be found if DNS can't (or shouldn't) be changed. In both cases, users might have to exit Outlook or their mail readers, and flush their DNS cache as we had to do.

Oddly, WANSync HA Exchange was the slowest of the four products tested to perform an initial synchronization of the message and public folder stores, taking more than 10 hours (see the Tracking failover performance chart).

There are three levels of replication between WANSync "master" and "replica" servers. The initial synchronization comes first, with options that let large chunks of data be replicated at a time. Moving the chunk size from small to large made no difference in WANSync's slow data copying as far as the initial replication is concerned. Subsequent replication is done at either the block level or the file level. Either was sufficient to let WANSync HA Exchange keep all 24 of the messages that were in queue at the failure point on the replica server.
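To make the block-level versus file-level distinction concrete, here is a minimal Python sketch. It is not XOSoft's implementation, which we did not inspect; the 4K-byte block size, the function names and the digest comparison are all assumptions chosen for illustration. It shows why a block-level pass can send far less data than a file-level pass when only a small part of a large store changes.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(master: bytes, replica: bytes,
                   block_size: int = BLOCK_SIZE) -> list[int]:
    """Indexes of blocks that differ between the master and replica copies."""
    m = block_digests(master, block_size)
    r = block_digests(replica, block_size)
    longest = max(len(m), len(r))
    # A block counts as changed if it is missing on either side or differs.
    return [
        i for i in range(longest)
        if i >= len(m) or i >= len(r) or m[i] != r[i]
    ]


# A five-block "file" in which only the third block (index 2) changed on the master.
replica_copy = bytes(BLOCK_SIZE * 5)
modified = bytearray(replica_copy)
modified[2 * BLOCK_SIZE + 10] = 0xFF
master_copy = bytes(modified)

dirty = changed_blocks(master_copy, replica_copy)
bytes_block_level = len(dirty) * BLOCK_SIZE  # resend only the changed blocks
bytes_file_level = len(master_copy)          # resend the whole file
print(dirty, bytes_block_level, bytes_file_level)
```

In this toy case the block-level pass resends one block while the file-level pass resends all five. A real replication product would track writes as they happen rather than rescanning both copies.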

Failback is the reverse of failover, and for all of the products tested, the time to re-synch was slightly shorter (but proportional) because most files were already populated on our disabled Exchange servers.

Of the four products tested, only WANSync HA Exchange prevented any outgoing messages from being lost at the time of failure. WANSync HA Exchange can auto-discover many Exchange facets such as file locations, Exchange specifics and logs.

The WANSync Manager is the core management application. It uses a Microsoft Management Console-like layout that allows fast perusal of paired (master and replica) servers, their settings and the settings that are made for failover.

Tracking failover performance
While XOSoft’s WANSync HA Exchange required the longest initial server synchronization time, the product earned our Clear Choice honors for its ability to fail over quickly without dropping pending messages.
Product | Time for initial synchronization via 10M bit/sec link | Average time to availability after failure (1) | Number of dropped message transactions during failure (2)
XOSoft WANSync HA Exchange | 643 minutes | 18 minutes | 0
NSI Software Double-Take for Windows | 530 minutes | 23 minutes | 5
LeftHand Networks SAN/iQ Software, Remote IP Copy Software, NSM 200 SAN Module | 521 minutes | 72 minutes | 6
Fujitsu/Softek Replicator | 536 minutes | 82 minutes | 24
Note that these scores are specifically for Exchange 2000/2003, not for other uses or applications.
(1) Average of two synchronizations.
(2) Twenty-four transactions are pending when primary server is shut down.
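For readers who want the chart in other units, the short Python sketch below converts synchronization time to hours and expresses dropped transactions as a share of the 24 that were pending. The figures are copied from the chart; the shortened product labels, variable names and one-decimal rounding are our own choices for illustration.

```python
PENDING = 24  # transactions pending when the primary server was shut down (note 2)

# (initial sync in minutes, average minutes to availability, dropped transactions)
results = {
    "WANSync HA Exchange": (643, 18, 0),
    "Double-Take for Windows": (530, 23, 5),
    "LeftHand SAN/iQ": (521, 72, 6),
    "Softek Replicator": (536, 82, 24),
}


def summarize(sync_min: int, avail_min: int, dropped: int) -> dict:
    """Convert one chart row into hours and a percentage of pending transactions."""
    return {
        "sync_hours": round(sync_min / 60, 1),
        "avail_min": avail_min,
        "dropped_pct": round(100 * dropped / PENDING, 1),
    }


summary = {name: summarize(*row) for name, row in results.items()}
for name, s in summary.items():
    print(f"{name}: {s['sync_hours']} h sync, "
          f"{s['avail_min']} min to availability, "
          f"{s['dropped_pct']}% of pending transactions dropped")
```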

NSI Double-Take For Windows

Double-Take for Windows is a byte-level replication system. Sources of datasets (drives, volumes and folders) are identified. Then target storage areas are calculated for size and subsequently allocated for replication. In many installations, one target might suit several data sources, but for Exchange, NSI recommends a one-to-one allocation if the targets are going to be used for possible failover. The initial mirroring took 530 minutes to replicate our datasets, which consisted of the Exchange executables and Exchange stores for both of our primary-site servers.

This initial synchronization was simple to set up and deploy, but additional steps, such as configuring batch files and locating Exchange files, are required.

In our two-forest, two-Exchange server example in which we simulated a headquarters and a manufacturing branch topology, four licenses of Double-Take were required, but only two licenses of Exchange server were required.

When failover occurs - as detected by a failure threshold that can be set for communication between servers - the software triggers options you set in a recommended batch file that calls NSI's ExchFailover command. This command starts Exchange on the failover hardware and sets DNS information to the target server.
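The threshold idea can be sketched in a few lines of Python. This is a generic illustration, not NSI's code: the class name, the count of consecutive missed heartbeats and the callback are all hypothetical, and the callback simply stands in for whatever the batch file would run.

```python
from typing import Callable


class FailoverMonitor:
    """Trigger failover once after `threshold` consecutive missed heartbeats."""

    def __init__(self, threshold: int, on_failover: Callable[[], None]) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.on_failover = on_failover  # hypothetical stand-in for the batch file
        self.missed = 0
        self.failed_over = False

    def record(self, heartbeat_ok: bool) -> None:
        """Feed in the result of one heartbeat check against the source server."""
        if self.failed_over:
            return  # already failed over; ignore further checks
        if heartbeat_ok:
            self.missed = 0  # a good heartbeat resets the count
            return
        self.missed += 1
        if self.missed >= self.threshold:
            self.failed_over = True
            self.on_failover()


events = []
monitor = FailoverMonitor(threshold=3, on_failover=lambda: events.append("failover"))

# One isolated miss is tolerated; three misses in a row trigger failover once.
for ok in [True, False, True, False, False, False, False]:
    monitor.record(ok)
print(events)
```

A lower threshold shortens time to availability but raises the odds of failing over on a brief network hiccup, which matters when the two sites are joined by a VPN link, as ours were.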
