Recently, I’ve done a few labs with OTV in them. Each time, however, I had an OTV SME (subject matter expert) in the lab with me. They configured it and it just “worked magically.” No hiccups. No gotchas.
I really don’t seem to learn well if things are super simple and just "work." I seem to need to struggle with it and wrestle with it a bit. But most importantly, since I seem to learn visually, I need to "see the flow in my mind." Which means I need to Wireshark the "magic."
*For those of you new to OTV, I’m not going to explain it here, as there are plenty of great URLs and documents out there that go into great detail regarding OTV. Two I would recommend are:
So Let's Start Playing!
Let’s say we had a vSphere environment with a few ESXi hosts and some VMs. But for this blog let’s just look specifically at two of each: 2 ESXi hosts and 2 VMs. Ultimately, what we’d like to be able to accomplish is to vMotion one of the VMs over to ESXi #2.
The reality of the physical situation is not quite as simple as the above diagram. These 2 ESXi hosts are actually in 2 different data centers. Of course, it is a lab environment, so it isn’t overly complicated either. My "cloud" between the "data centers" is actually just a twinax cable and the 2 Nexus 7K VDCs running BGP between them.
I’m not going to address here all the varying options of what we could have done here. We will pretend that the team has already decided to go the OTV route and we need to get OTV up and running between the two data centers.
Begin with the End in Mind
What is the "end" for this? The "end" is "vMotioning" a VM from one ESXi to another. This means that we have to know the needs of the application. So we need to talk with the VMware experts on our team and learn how vMotion works and how it is set up in our environment.
What we learn is:
- the vMotion is enabled on vlan 158 on the VMkernel - which means we need to extend vlan 158 over OTV
- the VMs are on vlan 1121 – which means we need to extend vlan 1121 over OTV as well
Is that enough to know? Depends. It wasn't for me. But then again, I got burned. So now I check two more things:
- vMotion is truly enabled for the VMkernel (see the diagram below: the line to the right, beneath VLAN ID)
Below is a screen capture on just one of the ESXi hosts. I always check both sides, though. This is a lab, and hence a shared environment.
Configure and Avoid the Common “Pitfalls”
I’m far from being an OTV SME. In fact, I have only done multicast OTV thus far in all my labs, no unicast OTV. But I can at least share with you my "mental checklist" I now have when configuring OTV with multicast:
- MTU, MTU, MTU – Just like any L2VPN, there is encapsulation overhead behind the “magic” of extending a layer 2 vlan across layer 3 boundaries. Make sure to increase the MTU on every link between the OTV edge devices to take this into account.
- Site vlan – design, know, and configure the site vlan for each site. The site vlan must be the same for the OTV edge devices in the same physical site. Do NOT extend this vlan over OTV. Note: I tend to forget about site vlan when there is only 1 OTV edge device in each data center.
- Site ID – design, know, and configure the site ID for each site. The site ID must be the same for the OTV edge devices in the same physical site. The site ID must be unique between sites (West and East below must each have different site IDs)
- No shut the overlay interface – I tend to forget this.
- Extended Ping Between Join Interfaces – routing needs to work. We need to be able to do an extended ping between the IP addresses of the OTV edge device join interfaces.
- Multicast Control Group ASM – the control group runs in “any source multicast (ASM)”. This means that there needs to be a multicast RP defined, configured, and in the routing table.
- Multicast Data Group SSM – the data group runs in “source specific multicast (SSM)”. This means that devices need to enable SSM for whatever group range is being used here.
- Multicast functionally working end to end – from OTV edge device to OTV edge device
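Parts of this checklist can be sanity-checked on paper before touching a switch. Below is a minimal Python sketch of that idea, not anything that runs on the Nexus itself. Everything in it is hypothetical and for illustration only: the `EdgeDevice` fields, the `check_otv_plan` helper, the device names, the site vlan, the site IDs, and the multicast groups. It also assumes the commonly cited 42 bytes of OTV encapsulation overhead and the default 232.0.0.0/8 SSM range, so adjust both for your own environment.

```python
import ipaddress
from dataclasses import dataclass

# Assumption: 42 bytes is the commonly cited OTV encapsulation overhead.
OTV_OVERHEAD_BYTES = 42


@dataclass(frozen=True)
class EdgeDevice:
    name: str
    site: str                  # physical site label, e.g. "West"
    site_vlan: int
    site_id: int
    extended_vlans: frozenset  # vlans extended over OTV
    control_group: str         # ASM group address
    data_group: str            # SSM group range in CIDR notation
    path_mtu: int              # smallest MTU between the join interfaces


def check_otv_plan(devices, payload_mtu=1500, ssm_range="232.0.0.0/8"):
    """Return a list of human-readable problems found in the plan."""
    problems = []
    ssm = ipaddress.ip_network(ssm_range)
    needed_mtu = payload_mtu + OTV_OVERHEAD_BYTES
    by_site = {}
    for d in devices:
        by_site.setdefault(d.site, []).append(d)
        if d.path_mtu < needed_mtu:
            problems.append(f"{d.name}: path MTU {d.path_mtu} is below {needed_mtu}")
        if d.site_vlan in d.extended_vlans:
            problems.append(f"{d.name}: site vlan {d.site_vlan} is extended over OTV")
        if ipaddress.ip_address(d.control_group) in ssm:
            problems.append(f"{d.name}: control group {d.control_group} falls in the SSM range")
        if not ipaddress.ip_network(d.data_group).subnet_of(ssm):
            problems.append(f"{d.name}: data group {d.data_group} is outside the SSM range")

    site_of_id = {}
    for site, members in by_site.items():
        # Edge devices in the same site must agree on site vlan and site ID.
        if len({m.site_vlan for m in members}) > 1:
            problems.append(f"{site}: edge devices disagree on the site vlan")
        ids = {m.site_id for m in members}
        if len(ids) > 1:
            problems.append(f"{site}: edge devices disagree on the site ID")
        # A site ID must not be reused by a different site.
        for site_id in ids:
            other = site_of_id.setdefault(site_id, site)
            if other != site:
                problems.append(f"site ID {site_id} is used by both {other} and {site}")
    return problems


# Hypothetical lab values: one edge device per data center.
west = EdgeDevice("west-otv", "West", 10, 1, frozenset({158, 1121}),
                  "239.1.1.1", "232.1.1.0/28", 9216)
east = EdgeDevice("east-otv", "East", 10, 2, frozenset({158, 1121}),
                  "239.1.1.1", "232.1.1.0/28", 9216)

for problem in check_otv_plan([west, east]):
    print(problem)
```

With the values above the plan is clean, so nothing is printed. This only checks the design on paper. The no shut on the overlay interface, the extended ping, and the multicast verification still have to be done on the devices themselves.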
Time to vMotion and Sniff
Our end game was to get to the point where we could vMotion the VM over to Data Center East and sniff the vMotion session. So let’s get to the end game and see the magic underneath.
Above we can see the vMotion session. OTV carries the layer 2 extended vMotion TCP session inside GRE encapsulation.
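For anyone who wants to look for that GRE header in raw bytes rather than in the Wireshark GUI, here is a minimal Python sketch. The `gre_protocol_type` helper is a hypothetical name, and the sample packet is synthetic, not taken from my capture. In particular, the 0x8847 (MPLS unicast) protocol type and the IP addresses in the sample are assumptions chosen for illustration.

```python
import ipaddress
import struct

IPPROTO_GRE = 47  # IANA-assigned IP protocol number for GRE


def gre_protocol_type(ip_packet: bytes):
    """Return the GRE protocol type if this IPv4 packet carries GRE, else None."""
    if len(ip_packet) < 20 or ip_packet[0] >> 4 != 4:
        return None  # too short, or not IPv4
    ihl = (ip_packet[0] & 0x0F) * 4  # IPv4 header length in bytes
    if ip_packet[9] != IPPROTO_GRE or len(ip_packet) < ihl + 4:
        return None  # not GRE, or truncated before the GRE header
    # The base GRE header is 2 bytes of flags/version plus 2 bytes of protocol type.
    _flags_version, protocol_type = struct.unpack("!HH", ip_packet[ihl:ihl + 4])
    return protocol_type


# Synthetic outer IPv4 header (protocol 47) followed by a bare GRE header.
src = ipaddress.ip_address("192.0.2.1").packed
dst = ipaddress.ip_address("192.0.2.2").packed
outer_ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 24, 0, 0x4000, 64,
                       IPPROTO_GRE, 0, src, dst)
gre = struct.pack("!HH", 0, 0x8847)  # assumed protocol type, for illustration
sample = outer_ip + gre

print(hex(gre_protocol_type(sample)))  # prints 0x8847
```

The function expects bytes that start at the outer IP header, so strip the outer Ethernet header first if you feed it frames from a capture file.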
Voila! I now have a picture of the packet in my mind. That always helps me.
Hope it helps you too!