Industry split on data center network standards

Despite proprietary and other alternatives already on the market, IETF upbeat on progress of TRILL

Despite competing standards and proprietary alternatives already on the market, the IETF insists that TRILL is gaining momentum as a method for solving Ethernet scalability problems in data center networks.

Indeed, the industry appears deeply fractured over the best approach, with some vendors backing the IETF's TRILL, some backing the IEEE's SPB, others offering proprietary protocols and still others advocating a combination of approaches.

TRILL was designed as a way to overcome limitations of Ethernet's Spanning Tree Protocol, a method for preventing network loops and for handling backup paths in the event of a failure. Spanning Tree is inefficient because it doesn't use all of the available paths between switches, and the routes are not always the shortest or fastest. Topology reconvergence in a Spanning Tree network is also slow, which limits scale and makes the network more susceptible to link failures.
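
To make the inefficiency concrete, here is a minimal sketch in Python. It assumes a hypothetical four-switch topology (the names C1, C2, E1 and E2 are invented for illustration) and uses a plain breadth-first tree as a stand-in for Spanning Tree. Real Spanning Tree elects a root by bridge ID and weighs port costs, so this only approximates the outcome: some links end up blocked, and some traffic takes a longer route than the physical topology allows.

```python
from collections import deque

# Hypothetical topology: two core switches (C1, C2) and two edge switches (E1, E2).
LINKS = [("C1", "C2"), ("C1", "E1"), ("C1", "E2"), ("C2", "E1"), ("C2", "E2")]


def neighbors(links):
    """Build an adjacency map from a list of undirected links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


def spanning_tree(links, root):
    """Return the links kept by a breadth-first tree rooted at `root`.

    Simplification: real Spanning Tree uses bridge IDs and path costs.
    """
    adj = neighbors(links)
    seen = {root}
    kept = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in sorted(adj[node]):
            if peer not in seen:
                seen.add(peer)
                kept.add(frozenset((node, peer)))
                queue.append(peer)
    return kept


def tree_hops(kept, src, dst):
    """Hop count between two switches using only the kept (forwarding) links."""
    adj = neighbors([tuple(sorted(link)) for link in kept])
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for peer in adj.get(node, []):
            if peer not in dist:
                dist[peer] = dist[node] + 1
                queue.append(peer)
    return dist[dst]


kept = spanning_tree(LINKS, root="C1")
blocked = [link for link in LINKS if frozenset(link) not in kept]

print("blocked links:", blocked)
print("C2 -> E1 hops over the tree:", tree_hops(kept, "C2", "E1"))
```

In this toy topology two of the five links carry no traffic, and frames from C2 to E1 detour through C1 even though a direct link exists.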

The IETF is attempting to address this deficiency in RFC 5556 with TRILL, which stands for Transparent Interconnection of Lots of Links. TRILL is a Layer 2 protocol that uses link state routing to map the network, discovering and calculating shortest paths between TRILL nodes called Routing Bridges, or RBridges. This enables shortest-path multihop routing so users can build large-scale Ethernet and Fibre-Channel-over-Ethernet data center networks.
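
The link state idea can be sketched in a few lines of Python. The example below is not TRILL itself: it assumes a hypothetical four-node link state database (the names RB1 through RB4 and the unit link costs are invented) and runs an ordinary Dijkstra shortest-path calculation from one node, recording every first hop that lies on a shortest path. It assumes all link costs are positive.

```python
import heapq

# Hypothetical link state database: each node knows every link and its cost.
LSDB = {
    "RB1": {"RB2": 1, "RB3": 1},
    "RB2": {"RB1": 1, "RB4": 1},
    "RB3": {"RB1": 1, "RB4": 1},
    "RB4": {"RB2": 1, "RB3": 1},
}


def shortest_paths(lsdb, source):
    """Dijkstra from `source`; returns (distance, equal-cost first hops) per node."""
    dist = {source: 0}
    next_hops = {source: set()}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry, a shorter path was already found
        for peer, cost in lsdb[node].items():
            nd = d + cost
            # The first hop toward `peer` is `peer` itself when leaving the
            # source; otherwise it is inherited from the node we came through.
            hops = {peer} if node == source else next_hops[node]
            if peer not in dist or nd < dist[peer]:
                dist[peer] = nd
                next_hops[peer] = set(hops)
                heapq.heappush(heap, (nd, peer))
            elif nd == dist[peer]:
                next_hops[peer] |= hops  # another equally short path
    return dist, next_hops


dist, next_hops = shortest_paths(LSDB, "RB1")
print("distance to RB4:", dist["RB4"])
print("first hops toward RB4:", sorted(next_hops["RB4"]))
```

Here RB1 reaches RB4 in two hops through either RB2 or RB3, so both links can carry traffic, which is the contrast with a single tree that blocks one of them.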

Ethernet switch market leader Cisco is shipping FabricPath for its Nexus 7000 switch, a technology that accomplishes the same tasks TRILL is intended to address while providing many more capabilities. Cisco says FabricPath is a "superset" of the TRILL standard.

Brocade also says its BrocadeOne fabric architecture is based on TRILL.

Juniper, though, just announced its QFabric line of data center and cloud fabric switches, which do not support TRILL at all but instead support a proprietary method for scaling Ethernet in data centers.

Indeed, Juniper is an outspoken TRILL detractor. At its QFabric announcement, Juniper Founder and CTO Pradeep Sindhu called TRILL "a solution looking for a problem" and "a means to scale Layer 2 networks, but most [data center] networks want to communicate at Layer 3."

"Layer 3 gets punted to a one-armed router and becomes a choke point," Sindhu said. "TRILL as applied to a data center is a joke."

HP is supporting both TRILL and a competing IEEE specification called Shortest Path Bridging (SPB). SPB is an extension to the Multiple Spanning Tree Protocol that also uses a link state routing protocol to allow switches to learn the shortest paths through an Ethernet fabric and dynamically adjust to topology changes.

"For the traditional enterprise customer segment, HP is actively working on TRILL based solutions including open standards participation with the IETF," states Dominic Wilde, chief technologist and senior director of HP Networking, in an email message to Network World. HP plans to release TRILL-compliant products in the second half of this year.

And HP will evolve current support for IEEE 802.1ah Provider Backbone Bridging (PBB) to SPB, Wilde says. PBB scales virtual LANs by encapsulating MAC addresses within MAC addresses.
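
MAC-in-MAC encapsulation can be pictured with a short Python sketch. The class and field names below are invented for illustration and are not taken from any vendor's implementation. The example assumes only what 802.1ah defines at a high level: an outer pair of backbone addresses plus a 24-bit service identifier wrapped around an unmodified customer frame.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerFrame:
    dst_mac: str
    src_mac: str
    payload: bytes


@dataclass(frozen=True)
class BackboneFrame:
    backbone_dst: str     # MAC address of the egress backbone edge switch
    backbone_src: str     # MAC address of the ingress backbone edge switch
    service_id: int       # 24-bit service identifier
    inner: CustomerFrame  # the original frame, carried unchanged


def encapsulate(frame, ingress_mac, egress_mac, service_id):
    """Wrap a customer frame in backbone addressing at the network edge."""
    if not 0 <= service_id < 2 ** 24:
        raise ValueError("service_id must fit in 24 bits")
    return BackboneFrame(egress_mac, ingress_mac, service_id, frame)


def decapsulate(bframe):
    """Strip the backbone header at the far edge and recover the original frame."""
    return bframe.inner


original = CustomerFrame("00:00:00:aa:aa:02", "00:00:00:aa:aa:01", b"hello")
wrapped = encapsulate(original, "00:00:00:bb:bb:01", "00:00:00:bb:bb:02", 10_000)
print(wrapped.backbone_dst, wrapped.service_id)
print(decapsulate(wrapped) == original)
```

Core switches in this model forward on the outer backbone addresses alone, so they never need to learn the customer MAC addresses inside, and the service identifier keeps tenants separate.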

"HP is continuing its investment in PBB by evolving to current standards such as SPB that collectively provides scalability between edge and core, multi-tenancy services, resiliency, and multi-pathing," Wilde says. "While these standards started life as solutions for service provider and carrier customers, they are also becoming more relevant to large scale data center enterprise environments."

HP is also utilizing its Intelligent Resilient Framework (IRF) network virtualization and clustering technology to extend the scalability and reliability of TRILL and SPB. IRF is designed to scale data center fabrics by virtualizing multiple physical IS-IS nodes into a single logical node. This keeps the hop count low for efficiency and faster convergence, Wilde says.

Huawei is also backing SPB, citing the same multi-tenancy service attributes HP likes, as well as the standard's maturity and OA&M capabilities, says Reg Wilcox, vice president of optical network marketing and product management for Huawei North America. And like HP, Huawei plans to extend and bridge the capabilities of both SPB and TRILL.

"Huawei ... recognizes that neither the IETF nor IEEE standard has every feature that every customer will require and we are working hard in both standards with other vendors to help bridge these gaps in the lowest-cost manner possible," Wilcox says.

Avaya and Alcatel-Lucent are implementing SPB, given their carrier roots.

Extreme Networks, meanwhile, recently announced that it will use Multi-System Link Aggregation (M-LAG) as an alternative to Spanning Tree, SPB and TRILL.

Extreme claims that, for the majority of virtualized data centers, M-LAG eliminates the drawbacks of Spanning Tree while providing the benefits of TRILL, without disrupting the network or requiring significant capital expense.

Extreme claims TRILL and SPB both require major upgrades.

"TRILL requires new packet encapsulation," says Shehzad Merchant, senior director of strategy for Extreme. "Typically most existing infrastructure does not support this and will require a 'rip and replace.'"

Not so, says Donald Eastlake, co-chairman of the IETF's TRILL Working Group.

"'Rip and replace' is certainly not required to gain substantial benefits from TRILL," Eastlake says. "TRILL can be incrementally deployed -- you can replace classic bridges one at a time by TRILL switches (or RBridges), although the benefits from TRILL increase if you replace more bridges in a LAN with RBridges."

M-LAG, meanwhile, only supports limited architectures, Eastlake says. Also, M-LAG might have very limited multi-pathing capability and may still run Spanning Tree to be sure there are no problems in the case of errors, Eastlake believes.

"Proprietary alternatives for Spanning Tree have been around for years, which I take as a further proof that the quarter-century-old Spanning Tree protocol leaves something lacking for some modern applications," Eastlake says. "Such alternatives, besides being proprietary, typically only work for restricted topologies. TRILL works for general LAN topologies."

SPB works too, for Avaya. The vendor chose SPB for its data center switches due to its simplicity and Layer 2 focus.

"It's part of our carrier heritage to enable flattening (of the network)," says Steve Bandrowczak, vice president and general manager of Avaya Data Solutions. "It features ease of implementation, extension into storage ... It's the right path."

Avaya also believes SPB -- IEEE 802.1aq -- will have broader industry support. But Eastlake says it only works in an area consisting of all contiguous Shortest Path Bridges, and it is limited to point-to-point links.

And it is not yet fully baked within the IEEE, Eastlake claims. Even though 802.1aq also uses the IS-IS routing protocol, it does so to configure the bridging mechanisms employed to forward frames. As such, it does not route, Eastlake says, so its multipath scalability is limited.

"That's why its support for Equal Cost MultiPath (ECMP) has been so limited and why its per-switch routing calculation overhead is exponentially higher than TRILL, making it less scalable than TRILL," he says.

Eastlake says IEEE 802.1 is looking to initiate a new project to try to add routing to SPB and overcome these "problems."

TRILL users pay the price for scale in terms of control, says Peter Ashwood-Smith, a contributing author on SPB.

"One mode of forwarding is like a shotgun blast of packets while the other mode of forwarding is like a rifle shot," he says. "TRILL does a shotgun blast while 802.1aq does a rifle shot. What we find talking to customers is that some want to spray packets and others want more control, and most want both depending on the type of traffic."

SPB also offers more predictability than TRILL, Ashwood-Smith says. Its rifle-type forwarding allows users to plan what is going to happen with offline tools.

"That's more difficult, and possibly impossible, with the shotgun type of forwarding," Ashwood-Smith says.

Also, TRILL might require more extensive equipment upgrades than SPB, he says. To support the shotgun-type forwarding, every card on every hop has to be upgraded, while an SPB implementation will only require upgrades to the ingress/egress cards, and will leave 40/100G Ethernet cards "untouched," Ashwood-Smith says.

"We therefore think we can offer a lower-cost solution to our customers with Ethernet, at least for the short term," Ashwood-Smith says.

As for calculations, SPB offers an option where users can turn off multicast state, which results in calculations that are "considerably faster" than TRILL's, Ashwood-Smith says. On the issue of standards progress, he says SPB was set back a bit by Nortel's bankruptcy and asset sell-off, and that a shotgun-type forwarding option is being added to the specification.

But pre-standard versions of SPB are in 20 or more live deployments, and a "substantial" demonstration of interoperability testing -- including five physical and 32 emulated switches from two vendors -- was done at a vendor's lab in Ottawa. Ashwood-Smith expects SPB to be ratified in the third quarter.

Meanwhile, a TRILL interoperability "plugfest" was completed at the University of New Hampshire InterOperability Laboratory last year and another is being planned, Eastlake says. The TRILL base protocol is an approved IETF standard, as are two IS-IS companion documents, though those companion documents are not yet published RFCs, he says.

The TRILL Working Group has been re-chartered and work is continuing on further TRILL development, including management, OA&M and the like. Cisco claims that its FabricPath technology is based on TRILL, and BLADE Network Technologies, which was acquired by IBM, is also backing TRILL, Eastlake says.

As a result, "I believe that TRILL has increasing momentum," Eastlake concludes.

Senior Editor Tim Greene contributed to this article.

Copyright © 2011 IDG Communications, Inc.