
Network World

Michael Morris

The Three Parts of MPLS Circuit Diversity

By michaeljmorris on Mon, 11/05/07 - 9:57pm.

Many organizations approach MPLS circuit diversity from a carrier diversity standpoint. Their thinking is that with separate carriers they will be safe, particularly from the dreaded "cloud meltdown". Unfortunately, this misses two other important diversity points in the circuit and puts too much emphasis on "cloud diversity".

First, there are three sections to a circuit, each of which must be considered for diversity.

1 - Local Loop
2 - Intermediate (Long-distance)
3 - MPLS Backbone

The local loop is where most people assume problems will occur, and generally, they are correct. Basic T-1s delivered by LECs are prone to outages for several reasons. First, many T-1s are really just 2-wire DSL lines that are converted to 4-wire T-1s at your office. So, you get to pay for a T-1 and get all the love of a DSL line. Next, T-1s are a CO tech's least concern. CO techs will bounce them, loop them, or reprogram them without blinking an eye. Have you ever wondered why your T-1 is so stable at night and bounces during the day?

The trick with local loop diversity is not just buying another T-1 from a different carrier. For example, say you have a Verizon MPLS T-1 to your office. For diversity you buy a Sprint MPLS circuit. But, no matter the backbone carrier, the LEC is still BellSouth (AT&T). Verizon and Sprint, being the good carriers they are, will look to buy the cheapest access T-1 they can find to maximize their profit... err... keep your costs low. So, they buy wholesale circuits from BellSouth and, dutifully, BellSouth gives them the cheapest circuits available, both on the same cable into the building. So, when that CO tech starts playing, both circuits go down.

The best way to get access diversity is to either have the MPLS carriers (Verizon and Sprint in the example) specifically tell BellSouth which other circuits to avoid (which costs more) or have your circuits delivered via a SONET ring. Surprisingly, many office buildings have SONET rings already installed. The trick is to find the fiber MUX in the building, identify the SONET ring ID, and then get the LEC to provide your T-1 from the ring. Yes, to maximize local loop diversity, you can get T-1s to different LEC COs, but that can be costly. If a SONET ring is available, even if it only goes to a single CO, that is most likely the better business decision.

The second section of a circuit, the long-distance (LD) portion, is often the most overlooked. Most MPLS carriers have POPs only in major cities. So, if you have a site in Horseheads, NY, your traffic is going to traverse an LD network to reach the MPLS POP in the nearest big city. The LD carrier can vary, but often it is the LEC in the region. So, even though you may have procured two diverse local access circuits for your site, they can end up riding the same LD network to the POP. LD outages are less frequent, but they do occur, usually due to major fiber cuts or weather events. When a major fiber cut occurs, some circuits will move to protection paths on the LD carrier's network, but not all. If both of your circuits are among those that do not, then you may be down for several hours.
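To put rough numbers on why a shared LD segment matters, here is a minimal Python sketch. It assumes failures are independent and uses made-up availability figures purely for illustration; the function name and the numbers are hypothetical, and the MPLS backbone is left out to keep the arithmetic short.

```python
def pair_availability(local_loop: float, long_distance: float, shared_ld: bool) -> float:
    """Estimate the availability of a site served by two circuits.

    local_loop and long_distance are per-segment availabilities (0.0 to 1.0).
    Assumes independent failures, which real networks only approximate.
    """
    if shared_ld:
        # The two local loops back each other up, but both circuits ride
        # one LD network, so that segment is a single point of failure.
        loops_up = 1 - (1 - local_loop) ** 2
        return loops_up * long_distance
    # Fully diverse: each circuit is its own loop + LD chain, and the site
    # is down only when both chains are down at the same time.
    single_circuit = local_loop * long_distance
    return 1 - (1 - single_circuit) ** 2


# Illustrative figures only, not measurements from any carrier.
shared = pair_availability(0.99, 0.999, shared_ld=True)
diverse = pair_availability(0.99, 0.999, shared_ld=False)
print(f"shared LD:  {shared:.6f}")
print(f"diverse LD: {diverse:.6f}")
```

Under these assumptions, the shared case can never do better than the LD segment on its own, no matter how diverse the local loops are.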

The final part of the circuit is the MPLS POP and backbone. This is what most people focus on for diversity because it's easy to see on a diagram. Two clouds look better than one, so let's get two clouds to be fully redundant. However, as I have pointed out before, global backbone outages are a once-a-decade issue, if ever. Buying services from two carriers introduces more complexity than it's worth. Focus on a single MPLS carrier and then get POP diversity for your big sites and router diversity in the same POP for your small sites. Complete POP outages are rare. I have only seen one myself. This strategy will give you the best protection on this part of the circuit.

Using this three-part approach to circuit diversity will give you the best protection against outages for your sites. While outages cannot be completely eliminated, following this model has led to great uptime in our network. We have gone over six months without a site isolation.
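One way to apply the three-part model is to record, for each circuit, what it actually rides on in each section and then look for overlaps. The Python sketch below is only an illustration: the `Circuit` fields, the `shared_segments` helper, and the identifier strings are hypothetical, and in practice you would have to get the real cable, LD path, and POP router details from your carriers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Circuit:
    name: str
    local_loop: str      # hypothetical label for the LEC cable or SONET ring
    long_distance: str   # hypothetical label for the LD network to the POP
    pop_router: str      # hypothetical label for the MPLS POP and edge router


# The three sections of a circuit, in order from the site to the cloud.
SEGMENTS = ("local_loop", "long_distance", "pop_router")


def shared_segments(a: Circuit, b: Circuit) -> list[str]:
    """Return the sections where two circuits depend on the same element."""
    return [seg for seg in SEGMENTS if getattr(a, seg) == getattr(b, seg)]


# Two carriers on paper, but the same LEC cable and LD network underneath.
primary = Circuit("Verizon T-1", "BellSouth cable A", "BellSouth LD", "Verizon POP router")
backup = Circuit("Sprint T-1", "BellSouth cable A", "BellSouth LD", "Sprint POP router")

print(shared_segments(primary, backup))
```

An empty list means no shared element was found among the values you recorded. It says nothing about elements the carriers did not disclose.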

Simple concept yet very important...

During the recent fires in Southern California we lost one of our OC3 circuits into our provider's PIP (MPLS) network. If we hadn't required path diversity, which actually cost us an arm and a leg, almost all of the branch offices would have been without access to our primary data centers.

Excellent blog entry. ;-)

About From the Field

Michael Morris is a communications engineering manager at a $3-billion high-tech company. His background is in enterprise WANs, working with telcos and developing large-scale routing designs. He has worked on networks at government and corporate organizations, including networks at two Fortune 10 companies. In his current role, he leads a team of 10 engineers responsible for large-scale IT networking projects and architectural standards for data networks, storage area networks, IP telephony, contact centers, and security. Michael is CCIE #11733 and recently became one of the first three Cisco Certified Design Experts (CCDE) ever (#20080002). He has 11 years of experience in networking and communications, including four years as a paratrooper in the U.S. Army. He has a bachelor's degree in MIS from the University at Buffalo and is working on his MBA from NC State University. In 2008, he was awarded the Network Professional Association (NPA) Professional Excellence and Innovation Award for his work on network architecture, templates and enterprise MPLS design.
