When five nines just won’t cut it

Extreme availability

Four major themes emerged in Nemertes’ latest data center research, drawn from interviews with 82 data-center managers, CIOs, IT directors and other IT executives from 65 companies. Regardless of size or industry, companies were dealing with major changes centered on issues of consolidation, growth, availability and operational efficiency. This week we examine how availability expectations have changed over time, leading to “extreme availability: 100%,” as one IT executive put it.

The average consumer has become accustomed to shopping sites being available at 2 a.m. and sees that popular Internet sites are “never” down. These expectations bleed over into the workplace. Even in environments like higher education, where 24/7 availability would have been considered silly just a few years ago, each new batch of students raises expectations: “We have to have all systems available all the time now. Our students have been spoilt for availability and we are struggling to keep up,” says the CIO of a university.

In other industries such as financial services, e-commerce, retail and healthcare, high availability has always been required. But while any individual IT component may fail, the application “experience” is now expected to be always-on, even for non-critical applications. Unfortunately, these raised expectations are not matched by comparable increases in budgets, so IT organizations are trying to achieve more with less.

To meet the need for increased availability, more than half of all participants use a second data center as a “hot secondary” that serves end users while also mirroring the primary data center. Replication between data centers can occur at several levels (the SAN, the database or the application itself), and it can be synchronous or asynchronous. “Today there is no replication between data centers. The new design we are implementing includes SAN in both new locations and replicated [synchronously] between the two,” says the CIO of a university.
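To make that distinction concrete, here is a minimal sketch (my own illustration, not drawn from any participant’s design; all names are hypothetical) of what synchronous versus asynchronous replication means for a write arriving at the primary site:

```python
# Minimal sketch (illustrative only; the names are hypothetical) of the difference
# between synchronous and asynchronous replication to a second site.
import queue
import threading

secondary_copy = []       # stands in for the data held at the secondary site
pending = queue.Queue()   # writes waiting to be shipped asynchronously

def write_synchronous(record):
    """The primary does not acknowledge the write until the secondary copy
    has it, so the two sites never diverge (at the cost of waiting)."""
    secondary_copy.append(record)   # apply at the "secondary" before returning
    return "ack"

def write_asynchronous(record):
    """The primary acknowledges immediately and ships the write later, so the
    secondary can lag by whatever is still sitting in the queue."""
    pending.put(record)
    return "ack"

def shipper():
    """Background worker that drains queued writes to the secondary copy."""
    while True:
        secondary_copy.append(pending.get())

threading.Thread(target=shipper, daemon=True).start()
```

The trade-off is the same at every layer: the synchronous path never lets the two copies diverge but pays a round trip on every write, while the asynchronous path acknowledges immediately and accepts that the secondary lags by whatever is still queued.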

About one-quarter of our research participants deploy a tertiary data center as a backup site to the primary and secondary. Because of transmission delays, data can be replicated synchronously only over short distances, usually up to 50 kilometers; beyond that, it is usually replicated asynchronously. Some companies therefore set up a three-tier availability/recovery architecture: The primary and secondary data centers sit within a short distance of each other, no more than a few miles apart, and data is replicated synchronously between them. Thus, a building-level failure at the primary data center results in a failover to the secondary.
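A rough back-of-the-envelope calculation (mine, not a figure from the study) shows where the 50-kilometer limit comes from: assuming signals travel through optical fiber at roughly 200,000 km/s and ignoring protocol and switching overhead, every synchronous write must wait for at least one round trip to the secondary site.

```python
# Rough illustration (not from the article) of why synchronous replication is
# limited to short distances: every acknowledged write waits for at least one
# round trip over the fiber path. Assumes ~200,000 km/s in fiber and ignores
# protocol and switching overhead.

FIBER_SPEED_KM_PER_MS = 200.0   # ~200,000 km/s expressed in km per millisecond

def sync_write_penalty_ms(distance_km: float) -> float:
    """Minimum extra latency a synchronous write pays: one round trip."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_MS

for km in (10, 50, 500, 1000):
    print(f"{km:4d} km -> at least {sync_write_penalty_ms(km):.2f} ms added per write")
```

At 50 kilometers the penalty is about half a millisecond per write, which most applications tolerate; at hundreds of kilometers it grows to several milliseconds on every single write, which is why longer hauls fall back to asynchronous replication.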

The tertiary data center acts as the failover site for major disasters in which both the primary and secondary may be affected. Thus, even a major natural disaster such as Hurricane Katrina, which could destroy or disable both the primary and secondary data centers, need not disrupt the business. While the data in the tertiary data center is a few minutes out of date, it can immediately serve users across the WAN.

Availability requirements have finally stopped going up. Unfortunately, that is because for many companies they have settled at 100%. Even if 100% is impossible to achieve on a per-element basis, it is achievable at the application level with a combination of redundancy, load balancing, caching and backups.
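A rough sketch of why that is plausible (my own arithmetic, not a figure from the research): when failures are independent, redundancy multiplies out quickly, because the service is down only if every copy is down at once.

```python
# Minimal sketch (my own arithmetic, not from the Nemertes study): how redundancy
# pushes application-level availability toward 100% even when no single element
# reaches it. Assumes the copies fail independently and the service survives as
# long as at least one copy is up.

def combined_availability(per_element: float, copies: int) -> float:
    """Probability that at least one of `copies` independent elements is up."""
    return 1 - (1 - per_element) ** copies

for copies in (1, 2, 3):
    a = combined_availability(0.999, copies)   # each element: "three nines"
    print(f"{copies} redundant copies of a 99.9% element -> {a:.7%} available")
```

In practice failures are never perfectly independent, which is where load balancing, caching and tested backups earn their keep, but the arithmetic shows why the application can look far more available than any single element behind it.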

Now all we have to do is find the budget for it.
