
Our Nexus Data Center Network - To vPC or not to vPC ....???

By michaeljmorris on Sun, 02/08/09 - 8:46pm.

In last week's blog about our new DC network based on the Nexus switch line, I introduced you to Kamal Vyas, the lead network engineer on my team. Kamal returns this week with another blog on our DC network design.


One of the CPOC’s most important tests was the Virtual Port-Channel (vPC) feature supported on the Nexus 7000. vPC is projected to be the new spanning tree killer and is a very promising feature, especially if you are planning to deploy a big L2 access domain, which is generally the case in large DC deployments with virtualization support.

vPC Summary:

  • Empowers port-channels to be spanned across two upstream switches
  • All ports in forwarding state; none in spanning tree blocked mode
  • Efficient use of all available bandwidth

For more information on vPC, please refer to the Cisco vPC white paper.

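[Diagram: test topology, with the N5K-SPT access switch on the left and the N5K-vPC access switch on the right, both uplinked to N7K-01 and N7K-02]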

We simulated two access designs. On the left was the traditional Spanning Tree access design. The right side was a 5K access switch configuration with the new vPC access design guidelines and recommendations.

vPC Access Design:

  • The right access switch, N5K-vPC (in the diagram), and VLANs 201 and 202 belong to the vPC access design.
  • vPC has a concept of primary and secondary switches. Here N7K-01 is configured as the vPC primary and N7K-02 as the vPC secondary.
  • As per vPC best practices, the vPC primary switch (N7K-01) is also configured as the Spanning Tree root and HSRP primary for VLANs 201 and 202.
  • The vPC keep-alive (ft) link is a 1 Gig L3 p2p link between the two Nexus 7Ks, whereas the vPC peer links (a regular L2 trunk) between the two chassis use 2x10 Gig ports (as per guidelines).

SPT Access Design:

  • The left access switch, N5K-SPT (in the diagram), and VLANs 101 and 102 belong to the SPT access design.
  • N7K-01 is configured as the Spanning Tree root and HSRP primary for VLAN 101, whereas N7K-02 is the Spanning Tree root and HSRP primary for VLAN 102.
  • The Spanning Tree access design uses a traditional looped triangle design, with odd VLANs preferring the left aggregation switch and even VLANs preferring the right one.
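
As a quick summary of the two lists above, the root and HSRP placement in this lab can be written as a small lookup. This is only an illustrative Python sketch: the function name and the two VLAN sets are made up for this example and are not part of any Cisco tool or of the test configuration itself.

```python
# Illustrative summary of the lab's root / HSRP placement (names are hypothetical).
VPC_VLANS = {201, 202}   # vPC access design, served by N5K-vPC
SPT_VLANS = {101, 102}   # SPT access design, served by N5K-SPT


def root_and_hsrp_primary(vlan: int) -> str:
    """Return the N7K that is Spanning Tree root and HSRP primary for a VLAN."""
    if vlan in VPC_VLANS:
        # vPC design: the vPC primary switch holds both roles for every VLAN.
        return "N7K-01"
    if vlan in SPT_VLANS:
        # SPT design: odd VLANs prefer the left switch, even VLANs the right one.
        return "N7K-01" if vlan % 2 == 1 else "N7K-02"
    raise ValueError(f"VLAN {vlan} is not part of this test bed")


if __name__ == "__main__":
    for vlan in (101, 102, 201, 202):
        print(vlan, root_and_hsrp_primary(vlan))
```

The point of the contrast is that the SPT design splits the roles per VLAN to use both uplinks, while the vPC design can pin both roles to one switch and still forward on all links.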

In our testing, most of the tests were meant to demonstrate functionality and to observe traffic patterns and convergence times under different failure scenarios. Performance testing was out of scope. We also compared the behavior of the vPC design with that of the traditional SPT looped triangle access design. Traffic generators simulated traffic flow from both core switches down to the access layer. Below is a snapshot of the test scenarios and the observed traffic patterns.

    Normal Conditions:
      vPC Access Design
    • All members of Po1 @N5K-vPC are in forwarding state. No spanning tree blocked VLANs/ports.
    • Even though N7K-01 is HSRP active for VLANs 201 and 202, both switches (N7K-01 and N7K-02) forward traffic out to the core switches, with minimal traffic passing across the vPC peer link.
    • Return traffic is CEF load-balanced to both N7Ks from the cores and forwarded directly to the N5K-vPC switch.
      SPT Access Design
    • Po1 @ N5K-SPT is in SPT forwarding state for VLAN 101 and blocking for VLAN 102; vice versa for Po2.
    • Traffic is forwarded out to the core switches by N7K-01 for VLAN 101 and by N7K-02 for VLAN 102.
    • Return traffic is CEF load-balanced to both N7Ks from the core switches and forwarded directly to the N5K-SPT switch for the respective VLANs.
    • Some traffic was observed across the L2 trunk between the two 7Ks (the vPC peer links).
    Failure of L2 Trunk/vPC peer link (shutdown Po1 @ N7K-01):
      vPC Access Design
    • Since N7K-02 is the secondary vPC switch, it shuts down all of its vPC port channels by design (Po10 goes hard down within 2 sec). Some minimal packet drops were observed. Po10 and its members are still forwarding @ N7K-01.
    • VLAN SVIs (interface VLANs) 201 and 202 go down on N7K-02, since there are no active interfaces in these VLANs.
    • Since these SVIs are down, N7K-02 stops advertising routes for the VLAN 201 and 202 subnets to the cores, which now receive only a single route from N7K-01.
    • Hence all traffic in both directions now flows through N7K-01.
    • Once the peer links are restored, N7K-02 brings back Po10 (within 3 - 4 secs) and advertises routes for the VLAN 201 and 202 subnets to the cores, and traffic flows as per the normal scenario. No packet drops observed.
      SPT Access Design
    • Since no loop exists any more, N5K-SPT starts forwarding VLANs 101 and 102 on both port channels.
    • HSRP for VLANs 101 and 102 goes into an active-active state on both N7Ks.
    • Root guard blocks VLAN 101 on Po3 @ N7K-02 and VLAN 102 on Po2 @ N7K-01.
    • Outbound (northbound) traffic for VLAN 101 is forwarded via N7K-01 and for VLAN 102 via N7K-02.
    • Incoming asymmetric traffic (traffic for VLAN 102 landing on N7K-01 and vice versa) is black-holed, as root guard is blocking the path to the VLANs. (Route tuning was not configured; it would be the way to go if the SPT design were the selected access design.)
    • Once the L2 trunk was brought back up, SPT converged back to the Normal Conditions state described above. Packet loss was observed during re-convergence as well.
    Failure of vPC Keep-alive Link (shutdown keep-alive interface @ N7K-01):
      vPC Access Design
    • No changes in any packet forwarding. Just a syslog message on both N7Ks specifying that the keep-alive link is down. This is not a traffic-disrupting condition by itself.
      SPT Access Design
    • Keep-alive link is not applicable to the SPT.
    Double failure scenario - 1 (vPC peer link failure followed by a vPC keep-alive link failure):
      vPC Access Design
    • When the peer link fails, the vPC access design converges to the peer link failure state described above.
    • Next, when the keep-alive link fails, a syslog message is generated on both N7Ks specifying that the keep-alive link is down.
    • Hence all traffic in both directions is still flowing through N7K-01. Once the peer links are restored, N7K-02 brings back Po10 and advertises routes for the VLAN 201 and 202 subnets to the cores, and traffic flows as per the normal scenario.
    • Minimal packet drops observed.
      SPT Access Design
    • Since the L2 trunk (vPC peer link) has failed, SPT converges to the peer link failure state of the SPT access design described above.
    • Keep-alive link is not applicable to the SPT.
    Double failure scenario - 2 (vPC keep-alive link failure followed by a vPC peer link failure):
      vPC Access Design
    • When the keep-alive link goes down, a syslog message is generated on both N7Ks specifying that the keep-alive link is down.
    • Next, when the peer link fails, this becomes a split-brain condition: the vPC members have no way to verify the state of the other peer.
    • No vPC port channels are shut down and both N7Ks keep forwarding packets.
    • Existing flows continue to be forwarded as before the failure, but learning of new flows is impaired, and uncertain forwarding (or a broken state) was observed for new flows.
    • Once the peer links and the keep-alive link are restored, traffic flows are restored as per the normal scenario.
      SPT Access Design
    • Since the L2 trunk (vPC peer link) has failed, SPT converges to the peer link failure state of the SPT access design described above.
    • Keep-alive link is not applicable to the SPT.
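
The vPC results above boil down to a small state machine: what matters is not only which links are down, but the order in which they failed. The Python sketch below is a toy model of the behavior observed in these tests, not of the NX-OS implementation. The class, method, and event names are hypothetical. It also assumes that restoring the keep-alive link while the peer link is still down changes nothing, a case the tests did not cover.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class VpcPair:
    """Toy model of the N7K-01 (primary) / N7K-02 (secondary) vPC pair."""
    peer_link_up: bool = True
    keepalive_up: bool = True
    secondary_suspended: bool = False  # N7K-02 has shut its vPC port channels

    def apply(self, event: str) -> None:
        if event == "peer_link_down":
            # The secondary suspends its vPC port channels only if the
            # keep-alive link was still up when the peer link failed.
            if self.peer_link_up and self.keepalive_up:
                self.secondary_suspended = True
            self.peer_link_up = False
        elif event == "peer_link_up":
            # Observed: N7K-02 brings Po10 back once the peer link returns.
            self.peer_link_up = True
            self.secondary_suspended = False
        elif event == "keepalive_down":
            # Observed: syslog message only, no forwarding change.
            self.keepalive_up = False
        elif event == "keepalive_up":
            # Assumption: no forwarding change (not covered by the tests).
            self.keepalive_up = True
        else:
            raise ValueError(f"unknown event: {event}")

    @property
    def split_brain(self) -> bool:
        """Both links down and nobody suspended: peers cannot see each other."""
        return (not self.peer_link_up
                and not self.keepalive_up
                and not self.secondary_suspended)

    def forwarding_switches(self) -> List[str]:
        """Switches still forwarding on vPC port channels (VLANs 201/202)."""
        if self.secondary_suspended:
            # SVIs 201/202 are down on N7K-02, so only N7K-01 advertises routes.
            return ["N7K-01"]
        return ["N7K-01", "N7K-02"]


def run(events: Iterable[str]) -> VpcPair:
    """Replay a sequence of link events against a healthy pair."""
    pair = VpcPair()
    for event in events:
        pair.apply(event)
    return pair


if __name__ == "__main__":
    for scenario in (["peer_link_down", "keepalive_down"],
                     ["keepalive_down", "peer_link_down"]):
        pair = run(scenario)
        print(scenario, pair.forwarding_switches(), "split brain:", pair.split_brain)
```

Replaying the two double failure scenarios shows the asymmetry: peer link first leaves only N7K-01 forwarding, while keep-alive first leaves both switches forwarding in a split-brain state.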

Hopefully this gives you an idea of where vPC technology is headed in the DC and how it differs from SPT-type implementations. Also, as I mentioned in my last blog, even though the testing was done with EFT code (not an official CCO release), we were able to demonstrate all of the expected behaviors above. As of last Friday (2/6), Nexus 7000 code 4.1.3 has been released on CCO and is available for deployments.


Still lots more to come about our new DC from Kamal.
