by Glenn Gabriel Ben-Yosef

The evolution of resiliency

Feb 24, 2003
9 mins
Cisco Systems, Networking

An industry insider gives his take on the latest push for resiliency, and sorts through vendor strategies for creating self-managing networks.

The heat really is starting to get to you. They told you things would be better by 2003, your job would be more secure, your team hired back, your budget increased.

They lied.

You check your options. You can’t jump from the sinking ship to the lifeboat because you’re already in it. But wait – you remember that intelligence is not a miracle: Chance favors the prepared mind. No matter what we sell today, information needs to flow, and it flows through network services. Luckily, vendors have been improving the quality of networks for some time. You can prepare your network to become more resilient with products available today.

We are closer to the goal of the “lights-out” data center based on interoperability and open systems than we’ve ever been. The lights-out data center – our industry’s Holy Grail – runs with no human intervention, taking care of its own troubles through so-called self-healing automatic repair.

While you’ll want to stick with proven products, you won’t get much of a competitive edge by rolling out the same network configuration as the rest of your industry. Make time to listen to vendors espousing more theoretical, strategic approaches. Vendors such as Sun certainly are aiming high when it comes to plans for virtualizing computing, networking and storage. But today’s pie-in-the-sky strategy is tomorrow’s shippable product. The challenge for network executives is to balance market realities against vendor strategies for creating competitive, resilient infrastructures.

Cisco’s resiliency plan

Problems with information flow can basically occur in two places: in the device and in the network that connects devices. Both places are logical spots to improve resiliency.

Boosting the resiliency of the hardware and software in switches, routers and other network devices is relatively simple. The techniques we previously used to keep our infrastructures humming included keeping an off-site inventory of spare parts, maintaining redundant chassis, and keeping on-site hot-swappable components and redundant cold or warm failover components. Cold failover components were “connected and configured” but not yet booted up with software. Warm failover components were “prebooted.”

We now have newer, more intelligent techniques such as load balancing, hot failover components, and software logic and state information to keep things running smoothly.

In the network, we look to topology and protocol. We used to have more network choices such as thick and thin coax, Ethernet, Token Ring, ARCnet and FDDI. Today we can expect dual homing, fiber and IP, Category 5 copper and Ethernet, and 802.11b. WANs still have SONET, ATM and frame relay. While little can be done about inherent network-protocol issues, multiple data paths will increase reliability.

But resiliency is more than simply fixing what goes down. It is the ability to bounce back into shape or position, to recover strength after being stretched, bent or compressed. These attributes are exactly what Cisco says it hopes to provide for IP-based networks.

Cisco’s Globally Resilient IP (GRIP) is an example of one vendor’s effort at increasing availability regardless of the type of network architecture. “The whole idea is to give people a consistent end-to-end IP service experience,” says Charles Goldberg, a product manager in the Internet Technologies Division at Cisco. “We do this by just offering a software upgrade and not requiring anyone to change hardware.”

Intelligent layers

Vendors are pitching new tools and techniques for improving resiliency at each of the three major infrastructure layers: services, software and hardware.

| Where resiliency resides | Technology | What it does | How you benefit |
| --- | --- | --- | --- |
| Services | Sun’s N1, IBM’s Blue Typhoon | Lifts business process off infrastructure. | Process virtualization. |
| Software | Cisco’s Globally Resilient IP | Maintains Layer 2 connections during route processor failover. | Uninterrupted user experience. |
| Hardware | Redundant components, dual-homing | Provides continuous service during equipment and carrier failures. | Mitigates risk over carrier networks. |


GRIP, an IOS technology, addresses resiliency in four areas: the link layer (frame relay, PPP and ATM connections), routing, Multiprotocol Label Switching and IP services (ensuring gateway router availability).

Stateful Switchover (SSO) is a feature of the Resilient Link Layer component of GRIP. The “stateful” part of SSO means that should a route processor fail, Layer 2 state information will be maintained with the standby route processor. The benefit is that no ATM, frame relay, PPP, High-Level Data Link Control or other Layer 2 connections are lost. The router will continue forwarding packets on the last known route. Then, once route table convergence is completed with the latest topology, the forwarding tables are updated.

Cisco routers don’t maintain state for TCP session numbers, and Border Gateway Protocol (BGP) uses TCP. Therefore, in the event of a route processor failure, BGP must reconverge. Nonstop Forwarding (NSF) is Layer 3 technology that forwards packets while the existing Layer 2 connections are handed off to the new route processor during SSO.
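
The division of labor between SSO and NSF can be pictured with a toy model. The sketch below is not Cisco's implementation: the class and method names are hypothetical, and it assumes the forwarding table lives apart from the route processor so that it survives the failure.

```python
from dataclasses import dataclass, field


@dataclass
class RouteProcessor:
    """Hypothetical route processor holding two kinds of state."""
    name: str
    layer2_sessions: set = field(default_factory=set)  # mirrored to the standby
    routing_table: dict = field(default_factory=dict)  # rebuilt after failover


class Router:
    """Toy router with an active and a standby route processor."""

    def __init__(self):
        self.active = RouteProcessor("RP-A")
        self.standby = RouteProcessor("RP-B")
        # Assumption: forwarding entries are held outside the route processor.
        self.forwarding_table = {}
        self.converged = True

    def open_layer2_session(self, session_id):
        self.active.layer2_sessions.add(session_id)
        # The "stateful" part: Layer 2 state is copied to the standby at once.
        self.standby.layer2_sessions.add(session_id)

    def learn_route(self, prefix, next_hop):
        self.active.routing_table[prefix] = next_hop
        self.forwarding_table[prefix] = next_hop

    def fail_active(self):
        """Promote the standby; routing tables must be rebuilt from scratch."""
        survivor = self.standby
        self.active = survivor
        self.standby = RouteProcessor(
            "RP-spare", layer2_sessions=set(survivor.layer2_sessions)
        )
        self.converged = False

    def forward(self, prefix):
        """Nonstop forwarding: keep using the last known route."""
        return self.forwarding_table.get(prefix)

    def reconverge(self, fresh_routes):
        """Install the routes learned after the routing protocols reconverge."""
        self.active.routing_table = dict(fresh_routes)
        self.forwarding_table = dict(fresh_routes)
        self.converged = True


if __name__ == "__main__":
    router = Router()
    router.open_layer2_session("pvc-1")
    router.learn_route("10.0.0.0/8", "peer-1")
    router.fail_active()
    print(router.active.name, sorted(router.active.layer2_sessions))
    print("still forwarding via", router.forward("10.0.0.0/8"))
    router.reconverge({"10.0.0.0/8": "peer-2"})
    print("after convergence via", router.forward("10.0.0.0/8"))
```

The point of the model is the asymmetry: Layer 2 sessions are carried over intact, while the routing table starts empty on the new processor and traffic rides on stale but usable forwarding entries until convergence finishes.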

NSF SSO is available on the three major Cisco router hardware platforms that can support two route processors: the 7500, 10000 and 12000. The benefit of these combined Layer 2 and Layer 3 features is a faster switchover from the failed route processor to the standby route processor. The switchover time drops from about 30 seconds to a range that runs from zero seconds on the 12000 to 6 seconds on the 7500, according to tests conducted by independent lab Miercom on Cisco’s behalf. By running NSF SSO on your edge router, you probably won’t see much of a change in your next-hop router, so forwarding on the last known routes is unlikely to cause problems.
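
A quick back-of-the-envelope calculation shows what those switchover times mean over a year. The switchover figures are the ones reported in the text; the number of failovers per year is a made-up assumption that you should replace with your own history.

```python
# Switchover times in seconds, as reported in the article.
SWITCHOVER_SECONDS = {
    "without NSF SSO": 30,
    "7500 with NSF SSO (worst case)": 6,
    "12000 with NSF SSO": 0,
}


def annual_interruption_seconds(switchover_seconds, failovers_per_year):
    """Total seconds of interrupted forwarding per year."""
    return switchover_seconds * failovers_per_year


def reduction_percent(before, after):
    """Percentage reduction in switchover time."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    failovers_per_year = 4  # hypothetical; not a figure from the article
    baseline = SWITCHOVER_SECONDS["without NSF SSO"]
    for label, seconds in SWITCHOVER_SECONDS.items():
        total = annual_interruption_seconds(seconds, failovers_per_year)
        saved = reduction_percent(baseline, seconds)
        print(f"{label}: {total} s per year, {saved:.0f}% shorter than baseline")
```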

Cisco maintains what it calls “minimal and necessary state” information between the active and the standby route processor so that customers can run NSF SSO on older platforms such as the 7500, which has been in the market for about nine years with an installed base of about 130,000 units. That state information lets the standby route processor know which interfaces relate to which management interfaces. Other information, such as Open Shortest Path First or BGP routing tables, is not maintained, because Cisco says re-creating that information can be done before users know a stateful switchover occurred in their router or neighboring router. Stateful network address translation (NAT) maintains state for an internal IP addressing scheme. Features currently shipping include Nonstop Forwarding, Stateful Switchover, MPLS Fast Reroute – Node Protection, Multicast Sub-Second Convergence, IP Event Dampening, BGP Convergence Optimization and Stateful NAT. Cisco expects Gateway Load Balancing Protocol, Incremental SPF Optimization and Stateful IPSec to ship in the first quarter of this year.

GRIP interoperability with other vendors’ network gear is a question. The issue surrounds what state information is maintained and what is re-created. Maintaining more state information increases resiliency, but is more difficult to do. Re-creating state is slower but relies completely on industry standards.

Juniper, Procket Networks and Redback Networks are working with Cisco via the Internet Engineering Task Force (IETF) to implement some protocol changes that will enable restarting the TCP connections that BGP uses and then re-creating state, a promising compromise. Cisco says the IETF work is in the pre-Request for Comments stage. Still, Cisco has a history of introducing modifications to protocols, a tactic it might have to downplay should the market demand strict vendor interoperability. Vendors such as Alcatel and Avici say they hope to maintain all state information – including TCP session numbers – without protocol modifications, which is cleaner from an interoperability standpoint, but more difficult to pull off.

Fortifying Web applications

So, what else can you do to enhance your network while watching the Sun vision unfold, apart from adding a second route processor to your Cisco routers and upgrading IOS? For starters, secure your existing Web-based services and confidently extend your network to customers and partners through Secure Sockets Layer.


Virtualizing IT resources: Sun’s N1

While Cisco has been busy increasing the router’s resiliency, Sun wants to make the network invisible. Spelled out in its N1 strategy, Sun’s idea is to divorce the tool from the task by lifting application, file, print and other business services off the underlying hardware computing and connectivity platforms, such as servers and networks, as much as possible.

This smashes the notion of platform specialization and frees developers to code “conceptually” to business services. This vision of the virtualization of IT resources is attractive, but the climate might not yet be right for such a massive paradigm shift.

Sources close to the company say Sun’s steadfast commitment to N1 most likely stems from the “identity crisis” the company faces as it attempts to reinvent itself and live up to its reputation as an industry thought-leader. While Sun shook up the industry with the invention of Java, the vendor didn’t execute its own Java plans well and the technology ended up benefiting other companies more than Sun.

The N1 vision is an extension of the idea that “the network is the computer,” a phrase long championed by Sun CEO Scott McNealy. The goal is to provide elastic resources that support business processes. But customers will derive real value from the N1 plan only when they can virtualize storage and network assets along with server assets. While Sun might have been successful in virtualizing what it already had on the server side, customers aren’t convinced the vendor can make the necessary multivendor alliances for market success in those other two areas. Nor does Sun have the presence to create a critical mass of customers in storage and network gear by itself.

Sun’s vision is exciting, even if its execution is questionable. Still, as with Java, implementation of the vision could come from another vendor. IBM has a strategy similar to Sun’s N1. With a newfound strong presence in services and Utility Management Infrastructure initiatives such as Blue Typhoon (which aims to ease virtualization management), IBM could be that vendor. In this highly competitive market, the IBM edge is not as much in its technology as it is in its customer base. With so much of its revenue coming from midsize and large companies, IBM could start billing on a utility model, which would easily lead to virtualization and provide a real market for Sun’s vision.

More with less

The lagging economy means fewer IT initiatives might be funded. But more is riding on them as business remains as competitive as ever.

You need to establish what IT service level is reasonable for your industry, then set a course to attain and sustain that level. Fortunately, vendors are stepping up with products and services that help you do just that.

Ben-Yosef is principal analyst at Clear Thinking Research in Boston. He can be reached at ggb@cthinking.