It's Microsoft vs. the professors with competing data center architectures

Both SIGCOMM 2009 proposals seek to ease network control-plane complexity and virtual machine migration

Researchers from Microsoft and the University of California at San Diego have come up with divergent schemes to address shortcomings of data center architectures, particularly management and configuration burdens, and to promote the efficient use of virtual machines.

The two groups presented their findings at the SIGCOMM 2009 conference this week in Barcelona, and each had its own flavor. The Microsoft team sought high performance for all traffic regardless of demand, while the UCSD team focused on allowing the free migration of VMs, minimal configuration when adding new hosts to the network and quickly addressing failures.

Evolution of the router 

Microsoft's researchers also addressed VM migration and Layer 2-like addressing, but with a method that installs an agent on every endpoint, in contrast to the UCSD group's plan to tweak switch software and leave the endpoints alone.

The UCSD effort, led by Amin Vahdat, a professor of computer science at the school, proposes a blend of Layer 2 and Layer 3 connectivity for data centers that enables the massive scaling Layer 2 alone cannot support while avoiding the management and configuration demands of Layer 3.

[Diagram of data centers using PortLand]

They say their PortLand protocol could support a data center network of 100,000 servers without modifying any of the host machines. The group presented its findings in the research paper "PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric".

Making the addition of devices to the network plug-and-play -- with no configuration or modification of end devices -- was a key goal of PortLand, Vahdat says.

It would support VM migration, something Layer 3 addressing complicates because a VM that moves from server to server acquires a different IP address at each one. PortLand also introduces a hierarchical scheme for assigning pseudo MAC (PMAC) addresses that overcomes the memory limitations of most switches by shrinking the address tables each switch has to store, Vahdat says.

PortLand requires additional software that enables switches to discover their place in the data center topology. The software also enables each switch to assign a pseudo MAC (PMAC) address to every device directly connected to it.
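The PortLand paper encodes a PMAC hierarchically from the host's location in the topology. The sketch below, which is illustrative rather than taken from the paper (the `EdgeSwitch` class and field widths are assumptions based on the paper's pod.position.port.vmid layout), shows how an edge switch might mint PMACs for directly attached hosts:

```python
# Illustrative sketch of PortLand-style PMAC assignment by an edge switch.
# Assumed PMAC layout: 48 bits = pod(16).position(8).port(8).vmid(16).

def make_pmac(pod: int, position: int, port: int, vmid: int) -> str:
    """Pack the four hierarchical fields into a MAC-formatted string."""
    value = (pod << 32) | (position << 24) | (port << 16) | vmid
    return ":".join(f"{b:02x}" for b in value.to_bytes(6, "big"))

class EdgeSwitch:
    def __init__(self, pod: int, position: int):
        self.pod = pod
        self.position = position
        self.next_vmid = {}   # per-port counter for attached VMs
        self.pmac_of = {}     # actual MAC -> assigned PMAC

    def host_attached(self, actual_mac: str, port: int) -> str:
        """Assign a PMAC when a new host or VM is first seen on a port."""
        vmid = self.next_vmid.get(port, 0)
        self.next_vmid[port] = vmid + 1
        pmac = make_pmac(self.pod, self.position, port, vmid)
        self.pmac_of[actual_mac] = pmac
        return pmac

sw = EdgeSwitch(pod=2, position=1)
print(sw.host_attached("52:54:00:ab:cd:ef", port=3))  # 00:02:01:03:00:00
```

Because the PMAC embeds the host's location, upstream switches can forward on short prefixes instead of learning one flat entry per host.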

Under PortLand, switches maintain tables of PMAC prefixes and forward traffic to the appropriate switch until the traffic reaches the switch the destination device is attached to. That switch translates the PMAC to the actual MAC so the traffic can be delivered to the correct device, Vahdat says.
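That forwarding-and-rewrite pipeline can be sketched as follows; the class names and table layouts here are illustrative assumptions, not code from the paper:

```python
# Illustrative sketch of PMAC prefix forwarding: core switches match only
# on the pod prefix of the destination PMAC, and the egress edge switch
# rewrites the PMAC back to the host's actual MAC for final delivery.

def pmac_fields(pmac: str):
    """Unpack pod, position, and port from a PMAC string."""
    b = bytes(int(x, 16) for x in pmac.split(":"))
    pod = (b[0] << 8) | b[1]
    return pod, b[2], b[3]

class CoreSwitch:
    """Forwards on the pod prefix only: one table entry per pod."""
    def __init__(self, pod_to_link: dict):
        self.pod_to_link = pod_to_link

    def forward(self, dst_pmac: str) -> str:
        pod, _, _ = pmac_fields(dst_pmac)
        return self.pod_to_link[pod]

class EgressEdgeSwitch:
    """Rewrites the PMAC to the real MAC before delivering to the host."""
    def __init__(self, actual_mac_of: dict):
        self.actual_mac_of = actual_mac_of   # PMAC -> actual MAC

    def deliver(self, dst_pmac: str) -> str:
        return self.actual_mac_of[dst_pmac]

core = CoreSwitch({2: "link-to-pod-2", 5: "link-to-pod-5"})
edge = EgressEdgeSwitch({"00:02:01:03:00:00": "52:54:00:ab:cd:ef"})
print(core.forward("00:02:01:03:00:00"))   # link-to-pod-2
print(edge.deliver("00:02:01:03:00:00"))   # 52:54:00:ab:cd:ef
```

The point of the sketch: the core switch's table scales with the number of pods, not the number of hosts, which is what keeps switch memory demands small.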

To facilitate forwarding traffic, PortLand includes a Fabric Manager server, which performs a function analogous to a DNS server resolving hostnames to IP addresses. Rather than letting hosts broadcast for address resolution, switches redirect ARP requests from their connected hosts to the Fabric Manager, which replies with the PMAC that corresponds to the requested IP address.

The Fabric Manager maintains soft state about the network, so if it crashes, it can reconstruct the address information from the access switches using the PortLand protocol.

If the Fabric Manager crashes, the disruption to communication is negligible because the protocol falls back to broadcasting for address resolution, Vahdat says. When the Fabric Manager is running, lookups happen at wire speed.
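A minimal sketch of this proxy-ARP path with its broadcast fallback follows; the class and method names are assumptions for illustration, and the broadcast exchange is stubbed out:

```python
# Illustrative sketch of the Fabric Manager proxy-ARP path: an edge
# switch redirects a host's ARP request to the Fabric Manager, which
# answers from soft state; on a miss (e.g. just after a Fabric Manager
# restart) the switch falls back to an ordinary broadcast ARP.

class FabricManager:
    def __init__(self):
        self.pmac_for_ip = {}            # soft state: IP -> PMAC

    def learn(self, ip: str, pmac: str):
        self.pmac_for_ip[ip] = pmac

    def resolve(self, ip: str):
        return self.pmac_for_ip.get(ip)  # None on a miss

class EdgeSwitchArpProxy:
    def __init__(self, fm: FabricManager):
        self.fm = fm
        self.broadcasts = 0

    def handle_arp_request(self, ip: str) -> str:
        pmac = self.fm.resolve(ip)
        if pmac is None:
            # Fallback: flood the ARP request as plain Ethernet would.
            self.broadcasts += 1
            pmac = self.broadcast_arp(ip)
            self.fm.learn(ip, pmac)      # repopulate the soft state
        return pmac

    def broadcast_arp(self, ip: str) -> str:
        # Stand-in for a real flooded ARP request/reply exchange.
        return "00:02:01:03:00:00"

fm = FabricManager()
proxy = EdgeSwitchArpProxy(fm)
print(proxy.handle_arp_request("10.2.1.7"))  # miss: broadcast, then cached
print(proxy.handle_arp_request("10.2.1.7"))  # hit from soft state
```

The second request never leaves the rack: one broadcast repopulates the Fabric Manager, and every subsequent lookup is a unicast query.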

PortLand also respects the line drawn between devices network administrators control and the hosts controlled by system administrators. Rather than modifying the host MAC address directly using an agent and a server, the PortLand architecture has the switches translate MAC addresses to PMAC addresses. "We let the end host be what it is and make just small changes to the switch software and no changes to the switch hardware," Vahdat says.

Microsoft's scheme

Microsoft's team, led by Albert Greenberg, David Maltz and Parveen Patel, also deals with the addressing problem by introducing a two-tiered system: a location-specific IP address and an application-specific IP address that follows an application as it migrates to new VMs.

Under the Microsoft VL2 architecture, each server is associated with the location-specific IP address of the switch it is attached to. As with PortLand, a VL2 directory system maps the location IPs to the application IPs. A VL2 agent on each server retrieves the location-specific IP address of the switch nearest the destination server and encapsulates application packets in packets addressed to that switch.
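The agent's lookup-and-encapsulate step can be sketched as below; the class names, the address values, and the local cache are illustrative assumptions rather than details from the VL2 paper:

```python
# Illustrative sketch of the VL2 agent path: the directory maps an
# application-specific address (AA) to the location-specific address (LA)
# of the destination's switch, and the agent wraps the application packet
# in an outer packet addressed to that LA.

class Vl2Directory:
    def __init__(self):
        self.la_of_aa = {}               # AA -> LA of the attached switch

    def register(self, aa: str, la: str):
        self.la_of_aa[aa] = la

    def lookup(self, aa: str) -> str:
        return self.la_of_aa[aa]

class Vl2Agent:
    def __init__(self, directory: Vl2Directory):
        self.directory = directory
        self.cache = {}                  # assumed local AA -> LA cache

    def send(self, dst_aa: str, payload: bytes) -> dict:
        la = self.cache.get(dst_aa)
        if la is None:
            la = self.directory.lookup(dst_aa)
            self.cache[dst_aa] = la
        # Encapsulate: the outer header targets the switch's LA,
        # the inner header keeps the application address.
        return {"outer_dst": la, "inner_dst": dst_aa, "payload": payload}

directory = Vl2Directory()
directory.register("20.0.0.55", "10.0.3.1")  # AA -> switch LA
agent = Vl2Agent(directory)
pkt = agent.send("20.0.0.55", b"hello")
print(pkt["outer_dst"])                      # 10.0.3.1
```

When the application migrates to a VM under a different switch, only the directory's AA-to-LA mapping changes; the application address in the inner header stays the same.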

Deploying an agent and configuring servers is something PortLand avoids. But VL2 has other features that PortLand doesn't address. For example, VL2's directory server can refuse to provide the location-specific IP address if access policies deny the initiating server connectivity to the destination server. This gives VL2 the ability to enforce access control.
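A directory that refuses a lookup on policy grounds might look like the sketch below; the allow-list policy model here is an assumption chosen for brevity, not the mechanism the VL2 paper specifies:

```python
# Illustrative sketch: a VL2-style directory enforcing access control by
# refusing to resolve an application address when policy denies the
# requesting server. The allow-list of (source, destination) pairs is an
# assumed, simplified policy model.

class PolicyDirectory:
    def __init__(self, la_of_aa: dict, allowed_pairs: set):
        self.la_of_aa = la_of_aa
        self.allowed_pairs = allowed_pairs   # set of (src_aa, dst_aa)

    def lookup(self, src_aa: str, dst_aa: str):
        if (src_aa, dst_aa) not in self.allowed_pairs:
            return None                      # refusal: no LA, no connectivity
        return self.la_of_aa.get(dst_aa)

d = PolicyDirectory({"20.0.0.55": "10.0.3.1"},
                    allowed_pairs={("20.0.0.10", "20.0.0.55")})
print(d.lookup("20.0.0.10", "20.0.0.55"))    # permitted -> 10.0.3.1
print(d.lookup("20.0.0.99", "20.0.0.55"))    # denied -> None
```

Because a server cannot build the outer header without the location address, a refused lookup blocks the connection before a single packet crosses the fabric.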

Microsoft's researchers go beyond the ambitions of PortLand by looking at data center traffic patterns and designing a network topology that chooses paths for each traffic flow in a manner that avoids persistent congestion hot spots and provides uniform high capacity between any two servers in the data center.
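One way to realize per-flow path selection of this kind is to hash each flow to a randomly distributed intermediate switch, so packets within a flow stay in order while flows as a whole spread across the fabric. The sketch below illustrates that idea; the function name and the hashing scheme are assumptions, not VL2's exact mechanism:

```python
# Illustrative sketch of per-flow random path selection: each flow is
# hashed to one intermediate switch, keeping a flow's packets on one
# path while spreading different flows across all paths so no link
# becomes a persistent hot spot.

import hashlib

def pick_intermediate(flow_id: tuple, intermediates: list) -> str:
    """Deterministically map a flow (src, dst, port) to an intermediate."""
    key = "|".join(map(str, flow_id)).encode()
    digest = hashlib.sha256(key).digest()
    idx = int.from_bytes(digest[:4], "big") % len(intermediates)
    return intermediates[idx]

intermediates = [f"int-{i}" for i in range(8)]
flow = ("20.0.0.10", "20.0.0.55", 443)
print(pick_intermediate(flow, intermediates))
```

A flow always maps to the same intermediate (avoiding reordering), but across many flows the load spreads roughly uniformly over the eight intermediates.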

VL2 calls for a layer of densely interconnected aggregation switches with so many connections to a higher layer of intermediate switches that, in the event of a failure, performance degrades gracefully, the Microsoft researchers say.
