At Interop I was on a panel hosted by the inimitable Jim Metzler titled "Why Networks Must Fundamentally Change." Judging by the attendance alone, there certainly seemed to be enough interest in this topic to warrant a belief that there are pressures on IT and networking teams making them wonder how the network will change. The room was filled to capacity, with people sitting on the floor and standing along the back and side walls - a far cry from Interop panels of the past, where success was defined as having more attendees in the audience than panelists.
Now, you may assume that all is well and good and that if it isn't broken, you shouldn't fix it. You could also take the tack that technology evolution is a given and therefore change is inevitable. Either is fine with me. Let's start by looking at the major changes affecting the network that may be driving the need for different networking models:
- Traffic profiles have changed
- Servers are moving
- The data center is an increasingly structured topology in the largest networks
- Space, Power, Cooling
Traffic Profiles have Changed
Back in the 1990s, email connectivity to an executive's desktop was one of the primary driving forces for the deployment of local area networks (LANs), built primarily with Ethernet. We went through the datalink 'wars' in the early 1990s, and by the mid-1990s Ethernet was pretty much the predominant transport. Now, let's look at the traffic characteristics of e-mail: it is a non-real-time, asymmetric, bursty traffic pattern that goes from a fixed client to a fixed server and back, over a TCP connection. This is the traffic pattern that defined networking architectures from 1995 through today.
In order to deliver email to more users cost-effectively, the designers of network equipment built in oversubscription - which essentially means: deliver more ports at a lower price per port, with the trade-off that no port ever has full bandwidth capability. This was perfectly fine for the e-mail application, and many devices were oversubscribed at 4:1 or more and deployed in layers so that they could be 'fanned out' to cover an entire floor, building, or campus.
Three layers deep in a classic "Access - Aggregation - Core" model, 4:1 at each layer compounds to essentially 64:1 oversubscription in each direction - again, not a problem for bursty, non-real-time, asymmetric traffic flows. Everyone assumed Voice over IP would be the 'killer app' that changed LANs, but it essentially drove Power over Ethernet more than anything: a 64kb/s traffic flow doesn't get congested too often on a GbE LAN in the campus, and if it does, some simple QoS prioritization can deal with it quite handily. Some companies think video will change everything. It may on the Internet, by driving more bandwidth and being a veritable force majeure for service provider backbone upgrades, but it doesn't make a dent in the campus or data center: I mean, what is a compressed 3Mb/s 720p basketball game when I am moving it across a GbE LAN?
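The compounding above is simple multiplication, and a quick sketch makes it concrete (the 4:1 per-layer ratio is the figure cited here; real products varied):

```python
# How per-layer oversubscription compounds in a three-tier
# "Access - Aggregation - Core" design.

def end_to_end_oversubscription(per_layer_ratios):
    """Multiply each layer's oversubscription ratio together."""
    total = 1
    for ratio in per_layer_ratios:
        total *= ratio
    return total

# Three layers at 4:1 each compound to 64:1 in each direction.
print(end_to_end_oversubscription([4, 4, 4]))  # 64
```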
Today, new types of applications are becoming prevalent: whether you want to look at them as Web 2.0, SOA, grids, clusters, or clouds is a matter of which over-hyped term you prefer and which day of the week it is. The fundamental reality is that traffic patterns are changing in the data center network. The 'North-South/client-server' e-mail traffic pattern still exists, but it is no longer the primary driver of bandwidth. If, for instance, you go to a search engine and type in 'Squirrel, Point, and Dug,' you are likely to hit several hundred servers in a matter of milliseconds as the web server races to assemble an accurate response for you, serve you ads, account for the ads being served, stick you with cookies, and so on. The operators of these services have measured that tens of milliseconds is the difference between a customer getting the results they are looking for and a potential customer getting frustrated and going to the competition.
Compound this with storage over Ethernet and IP - whether iSCSI, NFS, NAS, your clustered file system of choice, or nascent FCoE - and there is even more traffic going from server to server, server to storage, storage to server, and every once in a while server to client. One measure a customer mentioned to me: in their facility, 90% of the traffic stayed 'inside' the data center and only 10% left to go to a client PC.
Servers are Moving
With the advent of the virtual machine hypervisor, the perfect hardware abstraction layer was created. The operating system does not need to know which specific hardware it is running on; it simply needs to know which hypervisor it is on. This enables application and workload portability across server vendor platforms and avoids some potential for vendor lock-in.
With live migration, or virtual machine mobility, you are able to move these virtual machines in near real time across a network. The higher the throughput and the lower the latency of the network, the more effectively this virtual machine mobility works. However, there is a caveat: if you want your TCP connections and IP addressing to stay intact, the receiving physical host must be capable of supporting the same IP address the moving virtual machine is actively using. This means both physical hosts have to be in the same subnet or the same VLAN, depending on which layer of the network you are looking at. Since the largest number of physical servers that can be supported this way is around 64 today, it doesn't change the addressing architecture too much - unless the servers are in different data centers, or are connected to different access layer switches that talk to different aggregation layer switches. If that is the case, the network architecture suddenly and dramatically gets in the way: either virtual machine mobility is impeded, or the network is redesigned.
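The same-subnet constraint is easy to check with Python's standard `ipaddress` module. A minimal sketch, using hypothetical addresses and a /24 mask purely for illustration:

```python
import ipaddress

def same_subnet(host_a, host_b, prefix_len):
    """Return True if two hosts share a subnet, i.e. a migrated VM
    could keep its IP address and active TCP connections."""
    net_a = ipaddress.ip_interface(f"{host_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{host_b}/{prefix_len}").network
    return net_a == net_b

print(same_subnet("10.1.1.10", "10.1.1.200", 24))  # True: migration can preserve the address
print(same_subnet("10.1.1.10", "10.2.5.17", 24))   # False: an L3 boundary is in the way
```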
People often ask me, "Can't I do this with DNS?" In short, no. DNS is cached at many client sites, ignoring your TTL. Additionally, DNS is cached on many PCs for the life of an application session. If you try to change the IP address of your backup server while you are in the middle of a 2GB backup, do not expect the connection to continue: a TCP connection is bound to the IP addresses at each end, not to a name. TCP doesn't work that way.
So we are left with a challenge. We want to build large, scalable, and stable networks - IP routing does this well. We also want to be able to move an IP address from one part of the data center to another, possibly to another data center, or maybe even to someone else's data center - and IP routing impedes that move and prevents it from being stateful today. Several companies are offering non-interoperable solutions for this now.
The data center is an increasingly structured topology in the largest networks
Again, most network switching equipment was designed for campus e-mail distribution. In the campus, laptops come and go, and so do desktops. Nobody wants to keep rigid control over which port a laptop plugs into, so the LAN was designed to be as 'plug and play' as possible. MAC address auto-learning and flooding, DHCP, and speed and duplex auto-negotiation all combined to make it so I can plug my laptop in just about anywhere, get an address, and do my job.
In the data center, especially in the largest data centers in the world, this may no longer be the case. The features that made life simple simply do not scale economically any more, as they force the network into a hierarchy with significantly sub-linear price/performance. In fact, it is often cheaper on a per-server basis to run a small network than a larger one: something no operator wants.
What if you could turn off all those protocols and force the network to be rigidly static, learning everything instead from the inventory management system that knows exactly which server lives on which interface? If something is plugged into the wrong port, don't learn and forward it; instead, create a change order to have it plugged in correctly. Don't like Spanning Tree? Disable it. If you can manually manage the IP and MAC addresses in your network, then you can intelligently route traffic flows to avoid loops. Eliminating MAC learning and IP ARP alone significantly cuts down on broadcast traffic, and this in turn improves server efficiency.
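The inventory-driven approach above can be sketched in a few lines. The inventory entries, MAC addresses, and port names here are entirely hypothetical; the point is the policy: only frames that match the inventory are forwarded, and mismatches generate a change order instead of being learned.

```python
# Hypothetical inventory: which MAC address belongs on which switch port.
inventory = {
    "00:1c:73:aa:bb:01": "Ethernet1",  # web-server-01
    "00:1c:73:aa:bb:02": "Ethernet2",  # web-server-02
}

def handle_frame(src_mac, ingress_port):
    """Forward only frames that match the inventory; flag everything else."""
    expected = inventory.get(src_mac)
    if expected == ingress_port:
        return "forward"
    # Unknown MAC or wrong port: don't learn it - open a change order.
    return f"change-order: {src_mac} expected on {expected}, seen on {ingress_port}"

print(handle_frame("00:1c:73:aa:bb:01", "Ethernet1"))  # forward
print(handle_frame("00:1c:73:aa:bb:01", "Ethernet7"))  # change-order: ...
```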
The largest networks in the world are looking hard at their data centers and searching for solutions that make them scale effectively. The solution is not 'more of the same' of what they have been doing for 10-plus years.
Space, Power, Cooling
These words never used to matter when many of the network systems in use today were being designed 10-15 years ago. Thus low-efficiency AC conversion, horizontal airflow, and over-investment in silicon at low density/utilization with a non-existent silicon refresh rate were the norm. If you have a switch with any of the following characteristics - side-to-side airflow, less than 90% AC conversion efficiency, or more than 50W per wirespeed 10GbE port - it was really not designed for the data center requirements of today.
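Those three disqualifiers make a simple checklist. A sketch, with the thresholds taken from the paragraph above and the example switch specs invented for illustration:

```python
def dc_ready(airflow, psu_efficiency, watts_per_10g_port):
    """Return the reasons a switch misses today's data center bar,
    or confirm that it meets it."""
    issues = []
    if airflow == "side-to-side":
        issues.append("side-to-side airflow fights hot-aisle/cold-aisle design")
    if psu_efficiency < 0.90:
        issues.append("AC conversion efficiency below 90%")
    if watts_per_10g_port > 50:
        issues.append("more than 50W per wirespeed 10GbE port")
    return issues or ["meets the bar"]

print(dc_ready("front-to-back", 0.94, 12))  # ['meets the bar']
print(dc_ready("side-to-side", 0.85, 80))   # all three issues
```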
We ran out of space in many data centers: so everyone in the vendor community built denser servers, storage, and switches. Enter the blade server for instance.
We ran out of cooling: this made APC and Liebert very happy as CRAC units became increasingly common.
We ran out of power, and the power companies didn't have any more to give. If you need more power, you may have to subsidize a new substation. In many cases it is cheaper to move to a new managed facility out of state and put in a high-performance, multi-gigabit WAN connection than it is to try to extend the life of an outdated asset. Even the government is 'getting it': the State of California is requiring a 30% power reduction for all state IT assets.
Space, power, and cooling are top-of-mind concerns for any data center administrator today. When you build a facility that is supposed to last fifteen years and the IT assets you populate it with make it obsolete every five, there is a fiscal challenge that must be met. How do we extend the lifecycle? What does the network need to look like to offer efficient service delivery? How can a provider achieve a better economy of scale to be profitable in delivering service to clients?
In next week's edition of this blog I will roll up your comments and suggestions, as well as put a few ideas of my own out there for how we can change the network, in the data center, to address some, if not all, of these challenges and harbingers of change.
Douglas Gourlay is the vice president of marketing at Arista Networks - the leading developer of 10Gb Ethernet switching platforms. In this role Gourlay is responsible for the global marketing and product management for Arista. Arista has recently won the ClearChoice award by NetworkWorld for top 10Gb Ethernet data center switch, and Best of Interop: Infrastructure and overall Best of Interop for the Arista 7500.
Prior to joining Arista Networks Gourlay was the vice president of Cisco’s Data Center Solutions Group, where he defined and executed Cisco’s global marketing strategy for data center, virtualization, and cloud computing. This included the Nexus and Catalyst data center switches, application and server load-balancing, storage networking, blade switching, and wide-area application services product families. Under his leadership Cisco’s data center segment grew from a nascent business to over $5B in annual revenue.
Since 1998 Gourlay has led and contributed to numerous hardware, software, and systems architecture developments across Cisco. He has served as senior director of product management for the Nexus Family of data center switches, director of product management for the Catalyst 6500 Series of LAN switches, and led product management for Cisco’s Application Delivery product family. Gourlay has filed or holds more than 20 patents in networking technologies.
Prior to his work at Cisco, Gourlay was an industry consultant and served as a US Army Infantry Officer. Gourlay is an avid pilot and can often be found tinkering on his Cirrus at Palo Alto Airport.