The Day VMware ate Cisco part 2: A deep dive with Nicira co-founder and the father of SDN, Martin Casado

Shortly after the VMware/Nicira announcement, I was able to have a conversation with Nicira co-founder and 'father of SDN' Martin Casado about the hypervisor network and its impact on the future of networking.

Page 2 of 3

Martin Casado: Let me just talk first about the notion of the hypervisor being the access layer to the network. When I first got into this, which was five years ago, it wasn't clear at all whether the access layer, the first-hop intelligence, was going to be on x86 or on a switching ASIC. If you go forward five years, it's almost certain that it's going to happen on x86. It's just that both the technology and the economics make more sense that way. So from the technology standpoint, if you have two VMs communicating with each other, the memory copy that you do on the x86 is going to be far faster than doing any sort of DMA-ing through a DMA engine turned into a switching path. Also, switching ASICs are fairly limited, and you don't need really high aggregation on the server. So it doesn't make sense.

If you look at it from the perspective of a single server, you can do pretty much everything that you need to for virtual networking without doing any sort of special ASIC offload. And the reason is that, from the perspective of a single server, you don't have to have cross-sectional bandwidth of 10 times 48 Gigs. So it's almost certain to me, and I strongly believe this, that the first hop doing networking intelligence is going to be on x86 and the vSwitch. And whether that vSwitch is owned by Cisco or by Microsoft or VMware or Red Hat or whomever, it is almost certainly going to be x86.

The second thing is, let's assume that what I just said is true, which is, if you look at the evolution of networking, that the x86 is now the first hop of network intelligence. Then you've got the interesting question, which is, within the customer environment, who owns and who controls this new network? And you can see a couple of paths, not that I've got this from experience, and I don't think we know the answer to that yet. In one world, the networking guys focus on building good physical networks. And they don't focus on all of the stuff that happens on the server. So, say you have a virtual data center where you can do virtual networking. Well, the networking guys build great physical fabrics from whatever gear they want to. And the goal of the physical fabrics is to be very quick and very simple to build out. And then the virtual networking piece becomes a piece of software and becomes a piece of application provisioning. So just as you said, when your application comes out, it has whatever interaction it needs at the virtual networking layer and everything is totally automated. You don't require a human being in the loop. So that's one way that this could play out.

Another way that it could play out in the field is if the networking guys actually have some interaction with, or have some purview over, the x86. So for example, they could be part of determining what a virtual network looks like, what sort of security policies it should get by default when a VM spins up, and what kind of technologies should be used to integrate these virtual networks with the physical network. And I think that this kind of scoping out of territories is still very much under discussion and playing out as we're going.

And for the third thing, I'm going to give you two bits of color on that. We have accounts in which the cloud teams will actually take over a network entirely. So they'll actually dictate what the physical hardware looks like, and of course they control all the software. And we have other accounts in which the networking guys specify very precisely which policies and technologies are used in the virtual networks. And I think that right now you see things across the board and it's anybody's guess where this will converge.

Art Fewell: A lot of enterprises have been focusing heavily on virtualizing traditional enterprise applications and may have had limited insight into what's been happening behind the scenes with XaaS development. For a lot of newer web apps, many enterprises are using ASPs, or they're using XaaS or what have you. And from what I have seen when I go and visit networking departments, lots of times I don't see a lot of awareness of what is happening with the latest in cloud applications, especially web-based customer-facing applications that are more strategic to the business. It seems there is often not a great deal of awareness of how much application development has morphed with modern distributed computing. And while we often still think, "I'm the network guy. I can go with a sniffer up there to help them debug at the packet level," the reality is that there are thousands of developers who are now much more capable of debugging complex application streams over the fabric. Given that newer distributed applications send much more complex communications over the fabric than in the past, I don't think the skill set of the average networking guy is right to do network-level troubleshooting and analysis, as today this requires extremely deep knowledge of the inner workings of an application.

Martin Casado: This is a very important point. This is an area where you will start to see virtual networking shine. Like you said, if you go and you look at a packet today, that packet tells you very little about what's going on. You don't really know who sent it. You don't really know where it's going to. You don't have any higher-level semantics. You just have IP addresses and ports, which are effectively meaningless from end to end. A port doesn't necessarily mean an application. An IP address collected yesterday could have been reassigned to a new host. MAC addresses can be overlapping. So it's very difficult to reconstruct something meaningful, given a packet trace.

When you have virtual networking solutions like what we've been doing at Nicira or what VMware is working on, all of the information that you need to reconstruct what's going on is already maintained by the system, because you have to maintain it in order to build a virtual network solution. So imagine, if you will, that you have a virtual networking system in place and it's collecting all of this debugging information and storing it in a database somewhere. And then you can take a packet trace, and while you're looking at the packet trace, you can correlate it with this database, and it will tell you the stuff that you're actually interested in.

It will say, "This packet was sent at this time, from this VM to this VM. The policy of the virtual network at the time looked like this." And then you can even ask questions about when that VM came up, when it went away, and who was logged on to that VM. And so we need to move away from this very low-level problem of looking at packet headers to high-level questions, such as: Who actually sent this? Where was it going? What did the network look like at the time that this happened? And so, if you'll bear an analogy, if you think about programming on a computer, the lowest-level thing you can do is look through memory to see what the program is doing, but with just memory addresses it is very difficult to reconstruct which part of the program is at which memory address and what is going on. But if you use a debugger, it will tell you the symbol and it will reconstruct the context. And so we'll have enough information now to reconstruct the context for these low-level packet traces.

And so I'm not sure that I would agree that network operators aren't capable of doing this type of debugging. I just think that the tool sets haven't evolved enough to help them do it. And so what we're going to see is a proliferation of tools into which you can feed network-level captures. And they're going to spit out high-level and very interesting events. And this is all part of this virtual networking revolution.
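The correlation Casado describes, matching a raw packet trace against state the virtual networking system already keeps, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `VirtualNetworkLog` class, its fields, and the sample addresses are hypothetical and do not reflect any Nicira or VMware API. It only shows why a timestamped history of IP-to-VM bindings lets the same IP address resolve to different VMs at different times.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    start: float  # time at which this VM took over the IP address
    vm: str       # VM name (hypothetical field)
    user: str     # who was logged on to the VM (hypothetical field)


class VirtualNetworkLog:
    """Hypothetical history store: for each IP, bindings sorted by start time."""

    def __init__(self):
        self._by_ip = {}

    def record(self, ip, start, vm, user):
        bindings = self._by_ip.setdefault(ip, [])
        bindings.append(Binding(start, vm, user))
        bindings.sort(key=lambda b: b.start)

    def lookup(self, ip, ts):
        """Return the binding that was active for `ip` at time `ts`, or None."""
        bindings = self._by_ip.get(ip, [])
        starts = [b.start for b in bindings]
        i = bisect_right(starts, ts)
        return bindings[i - 1] if i else None


def vm_name(binding):
    return binding.vm if binding else "unknown"


def annotate(packet, log):
    """Turn a (timestamp, src_ip, dst_ip) record into a VM-level description."""
    ts, src_ip, dst_ip = packet
    src = log.lookup(src_ip, ts)
    dst = log.lookup(dst_ip, ts)
    return f"t={ts}: {vm_name(src)} -> {vm_name(dst)}"


log = VirtualNetworkLog()
log.record("10.0.0.5", 100, "web-1", "alice")
log.record("10.0.0.5", 200, "db-7", "bob")    # same IP, reassigned later
log.record("10.0.0.9", 50, "cache-2", "carol")

print(annotate((150, "10.0.0.5", "10.0.0.9"), log))
print(annotate((250, "10.0.0.5", "10.0.0.9"), log))
```

The two packets carry identical IP headers but resolve to different sending VMs, which is the kind of context a bare packet trace cannot supply on its own.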

Art Fewell: I definitely anticipate so. And I think one of the big things with VMware being the acquirer is that we can anticipate seeing a lot of these tools emerge as part of VMware's own toolset and partner ecosystem. I find that to be a good thing for the industry. So OVS is not necessarily dependent, it doesn't have to be dependent, on OpenFlow. What do you think about how OpenFlow is going to play into the hypervisor network and the future of OVS?

Martin Casado: Let me talk about Open vSwitch first. We are absolutely committed to continuing to develop Open vSwitch and even to accelerating the development. So we will have as many guys on it, maybe even more, and we're going to continue to port it to many different platforms. It's going to continue to be 100 percent open and a solution for anybody to use. We're very committed to that. VMware is committed to that. As for OpenFlow support in Open vSwitch, right now there is a tremendous amount of work going on for OpenFlow 1.3 support, and this is coming from numerous organizations, including Nicira. You can expect that to be complete in the next couple of months. We'll have full OpenFlow 1.3 support. Open vSwitch will be available, it will be ported to multiple platforms, it will support OpenFlow 1.3, and it will be open. That is for sure.

Now in my experience, OpenFlow isn't 100 percent suitable right now to solve the full virtual networking problem. There are things that are required that it wasn't built to do. When OpenFlow was created, it was focused on controlling hardware forwarding pipelines. Nicira has the founding team that created OpenFlow, and it really hasn't gone far enough from those roots to be a hundred percent applicable to the problem of networking within the hypervisor. And so my guess is, going forward there are going to have to be some extensions to OpenFlow or some new protocol that is better suited for soft-switching at the edge. And at this point I can't really speculate what that will look like, but I do believe something like that will arrive.

Art Fewell: I definitely agree with what you said in your paper. And I have a follow-up. The real big significance here, something that I think is fairly obvious, is that Wall Street pushes everybody to grow, grow, grow. So Cisco didn't have much of a choice but to try to interject their strength in networking to be the centerpiece of the private cloud. But the way I view it, from the perspective of private cloud, networking is a component, the same as CPU resources are a component, and so on.

So when I think about what we're trying to do from the private cloud perspective, we want to look at CPU, we want to look at storage, we want to look at memory, and we want to look at network IO, and make utilization of those as high as possible to maximize our efficiency as a key goal. So to me that really says that, within the context of the fabric that connects a cloud container together, if we want to be able to maximize utilization across the fabric, it's really going to require a tight coupling to start to emerge from the application space.

A lot of people have said that hypervisor networking takes physical switches and dumbs them down a lot. There is kind of a commodity aspect, because I would anticipate VMware will set their own standards for integration with the physical fabric, but it does seem to me that the requirements to really optimize the private cloud are in some ways going to be more demanding than anything we've ever seen in mainstream networking: real-time resource reservation, real-time flow steering, and all of the features that we would need to optimize and keep 70 or 80 percent efficiency levels or whatever the target would be. And so it really seems to me that the future trajectory of not just the hypervisor but potentially of the physical fabric that is part of a cloud container, as separate from the fabric that is connecting different cloud containers together, is that this fabric is really going to have to become very tightly integrated with the hypervisor network over time. And it's not a pure, simple commodity thing.

Martin Casado: Exactly. And this is a very important point that you are hitting on. People look at these ventures and they think immediately, 'oh, this is about commoditization.' That is not the case at all. The physical network doesn't go away, and in fact, demands on it are going to become very rigorous. Because like you said, you're going to be placing workloads in different places. You're going to have different constraints that have to be enforced by the physical fabric. So for the traditional hardware vendors there is a lot of room to really innovate on creating good, differentiated high-speed fabrics. And what is nice about it is they can actually focus on building a fabric instead of also building something that has to handle all of the configuration of all of this other stuff that is needed for provisioning; that's going to go into the software layer. We don't know exactly how technologies will evolve going forward, but there's definitely still going to be differentiated hardware. I think that the technology is going to morph to allow a lot of the provisioning to happen in software at the edge, and then for the fabric to do the forwarding.

Art Fewell: I anticipate that in the coming years we will see a significant increase in software-based L4-7 services, and that over time improved software techniques and CPU improvements will allow many hardware-centric services to migrate to software. How do you think hypervisor L4-7 network services will evolve?

Martin Casado: I think there are two things going on here. First is the migration from hardware to software, but that has been happening slowly over time anyway. Many middle-boxes today have minimal hardware offload (say, SSL) with most of the function implemented in x86. Where we will see a bigger change is that the services must now become distributed. The software-defined datacenter means that any workload can be placed anywhere. In such an environment, you don't want to funnel traffic through a choke point, but rather keep all of the aggregate bandwidth of the underlying network by distributing the services throughout the network.
