It's still early, but Cisco ACI shows promise as a new way to link networks and applications in a highly virtualized data center
Cisco's Application Centric Infrastructure (ACI) is a revolutionary re-thinking of how to provision and manage data center networks. While the early version we looked at has some rough edges, and Cisco still has some hard problems to solve, ACI has the potential to completely change the way that large, highly virtualized data center networks are configured and built.
Just so there’s no confusion, ACI is not Cisco’s version of Software Defined Networking (SDN). While SDN, for many network managers, is a solution in search of a problem, ACI is something entirely different. It’s Cisco’s attempt to solve one of the most significant problems facing data center managers: how to more closely link the provisioning of data center networks with the applications running over those networks.
The goal is to reduce human error, shorten application deployment times, and minimize the confusion that can occur when application managers and network managers speak very different vocabularies.
By capturing the “intent” of an application directly from the application owner, ACI lets the application owner control their network provisioning, creating a consistent and documented configuration in network elements. The data center network is presented as abstractions that make sense to the application owner. This hides the details that tie up network managers as data centers scale, such as subnets, VLANs, virtual routing and forwarding (VRF) instances, and access control lists (ACLs).
ACI proposes extending far beyond Cisco’s recently announced Nexus 9000 switches to include other Cisco switching products (especially the Nexus 2000 fabric extender hardware and Nexus 1000V VMware virtual switch) and third-party load balancers and firewalls. Because we were only able to look at preliminary software versions and beta-test hardware, it’s too early to say whether ACI is a home run or a foul ball.
What we can say is that ACI is an amazing new way to look at data center networking, and if Cisco’s engineers and partners can follow through on the promise and premise of ACI, network managers could find themselves spending a lot less time configuring, troubleshooting, and debugging data center configurations. And that’s a very, very good thing.
ACI = APIC plus optimized devices
ACI includes two main components. The first is the Application Policy Infrastructure Controller (APIC), which we saw running as a virtual machine. The APIC maintains a database of application configuration information, turns that information into network configurations, and pushes those configurations into devices.
The second part of the ACI product line is a set of network devices -- real and virtual -- that are optimized for ACI. This includes Cisco’s new Nexus 9000 switches when run in ACI mode and with ACI-supported line cards, and a new version of the Nexus 1000V virtual switch called the AVS that operates as part of VMware’s vSphere virtualization environment.
The APIC works with more than just ACI-optimized hardware on the network end. In our testing, Cisco showed off a number of third-party products, including Citrix and F5 load balancers, as well as some of its own hardware: the ASA firewall, Catalyst switches, Integrated Services Routers (ISR) and Aggregation Services Routers (ASR).
Adding new network hardware and middleboxes (such as firewalls and load balancers) is a matter of writing a “device package” that translates network configuration information into configuration commands.
Network managers and application owners can interact with the APIC in a number of ways. The most direct way is using APIC’s own GUI, a basic tool for defining application data flows. Network managers start in the APIC GUI by running device discovery and adding ACI-compatible hardware to the APIC, then providing network connectivity information and pools of resources.
From there, application owners take over by defining the endpoints in their applications and creating a “contract” for communications between the endpoints: who is allowed to talk to whom, and how; what quality of service is required; and what network services are needed. Aware of the slow demise of Fibre Channel, the ACI team included support for both traditional network flows and Ethernet-based storage flows.
Based on all this network and application information, the APIC takes over and translates the endpoint controls defined by the application owner into path controls needed within the network, and then takes responsibility for pushing those configurations into data center devices.
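To make the contract idea concrete, here is a minimal Python sketch of how an endpoint-level contract might be expanded into per-address rules. The class names (EndpointGroup, Contract), their fields, the addresses, and the rule format are all hypothetical illustrations; this is not the APIC's object model or API, which we have not seen documented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointGroup:
    """A named set of application endpoints (hypothetical model)."""
    name: str
    addresses: tuple


@dataclass(frozen=True)
class Contract:
    """Who may talk to whom, how, and at what service level (hypothetical model)."""
    consumer: EndpointGroup
    provider: EndpointGroup
    protocol: str
    port: int
    qos: str = "default"


def expand_contract(contract):
    """Expand one endpoint-level contract into per-address permit rules."""
    return [
        {
            "action": "permit",
            "src": src,
            "dst": dst,
            "protocol": contract.protocol,
            "port": contract.port,
            "qos": contract.qos,
        }
        for src in contract.consumer.addresses
        for dst in contract.provider.addresses
    ]


# Example: two web servers are allowed to reach one database server.
web = EndpointGroup("web", ("10.0.1.10", "10.0.1.11"))
db = EndpointGroup("db", ("10.0.2.20",))
web_to_db = Contract(consumer=web, provider=db, protocol="tcp", port=3306, qos="gold")

rules = expand_contract(web_to_db)
for rule in rules:
    print(rule)
```

The point of the sketch is the direction of translation: the application owner writes one statement about groups, and the controller fans it out into the many device-level rules that a network manager would otherwise type by hand.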
Because ACI aims to solve the larger problem of integration between applications and networks, there are more ways to interact with the APIC. Interfaces between Cisco ACI and orchestration tools (specifically, OpenStack), system management tools (including HP OpenView and SolarWinds products), and virtualization hypervisors from VMware, Microsoft, and Red Hat are all in the works. Some of this work is being done by Cisco engineers, and other parts by the third-party software vendors. In our first look, we only saw Cisco products being managed by ACI, including Nexus 9000 switches and ASA firewalls.
In addition to fabric configuration, the APIC has a host of analytic tools, providing roll-up views of applications, groups of applications (“tenants” in ACI-speak), and the overall network. Cisco told us that it is trying to integrate the network information with workload management, possibly triggering virtual machine migration, for example, when a port becomes overloaded.
Not just another configuration tool
The APIC is not just an automation engine that simply creates long configuration files. The physical (and virtual) switches that are ACI-aware are a critical and valuable part of the system. ACI makes heavy use of VXLAN (Virtual Extensible LAN) technology, an IETF specification that scales layer 2 VLANs across a layer 3 (IP) infrastructure.
Many network managers are familiar with IEEE 802.1ad, also known as “QinQ,” a way of encapsulating a VLAN-tagged frame inside of another VLAN-tagged frame. QinQ increases the number of VLANs from about 4,000 to about 16 million, but still uses a basic layer 2 infrastructure.
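The “4,000” and “16 million” figures fall straight out of the 12-bit VLAN ID field in an Ethernet VLAN tag. A quick Python check of the arithmetic (the variable names are ours):

```python
VLAN_ID_BITS = 12  # width of the VLAN ID field in an 802.1Q tag

single_tag_ids = 2 ** VLAN_ID_BITS          # 4096 possible IDs
usable_single_tag_ids = single_tag_ids - 2  # IDs 0 and 4095 are reserved
double_tag_ids = single_tag_ids ** 2        # outer tag x inner tag

print(f"single tag: {single_tag_ids:,} IDs ({usable_single_tag_ids:,} usable)")
print(f"QinQ:       {double_tag_ids:,} outer/inner combinations")
```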
VXLAN is similar to QinQ, except that the original layer 2 Ethernet frames are encapsulated into UDP packets which can take advantage of layer 3 services: IP multicast, routing, load balancing using multi-path communications, and so on. VXLAN is a way to scale up inside of data centers and solves problems that very large networks have with multi-tenant deployments, multicast traffic, spanning tree protocol, and VLAN limits.
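For readers who want to see what the encapsulation looks like on the wire, the following Python sketch builds and parses the 8-byte VXLAN header as laid out in the IETF specification (published as RFC 7348): one flags byte, a 24-bit VXLAN Network Identifier (VNI), and reserved bits. The function names are ours, the inner frame is a dummy placeholder, and this shows only standard VXLAN. We make no claim that ACI's fabric uses exactly this header internally.

```python
import struct

VXLAN_UDP_PORT = 4789        # destination UDP port assigned for VXLAN in RFC 7348
VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag: the VNI field is valid


def build_vxlan_header(vni):
    """Return the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits).
    # Word 2: VNI (24 bits) + reserved (8 bits).
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header):
    """Extract the VNI from the first 8 bytes of a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8


def encapsulate(inner_frame, vni):
    """Return the UDP payload: VXLAN header followed by the original Ethernet frame."""
    return build_vxlan_header(vni) + inner_frame


header = build_vxlan_header(5000)
payload = encapsulate(b"\x00" * 64, 5000)  # 64 dummy bytes stand in for a real frame
print(header.hex(), parse_vni(payload), len(payload))
```

The 24-bit VNI gives the same roughly 16 million segments as QinQ, but because the result is an ordinary UDP packet, it can be routed across any IP network.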
What does all this VXLAN mean? ACI builds the data center fabric on top of VXLAN. This allows any-to-any layer 2 connectivity. In traditional server environments, that’s not such a big deal because each server stays connected to the same set of Ethernet ports all the time. But when virtualization hits the data center, a single physical server may have dozens of virtual servers, each with its own MAC address and its own layer 2 connectivity requirements. More importantly, as those virtual servers migrate between physical servers, there’s a requirement for the network to keep everything straight so that each virtual server is properly connected to its VLAN and subnet.
The ACI-supported VXLAN fabric in Nexus 9000 and Nexus 1000V virtual switches, backed by the IS-IS dynamic routing protocol, makes this all hang together properly. The dynamic nature of virtualized environments is a huge advantage to application owners, but only if the network doesn’t restrict things. The idea behind VXLAN-based data center fabrics is to make the network flexible enough to handle very large scaled-up and scaled-out virtualization environments.
Cisco told us that an APIC cluster can handle up to 1 million IPv4/IPv6 endpoints and 200,000 network ports, an assertion we couldn’t begin to test. Fundamentally, the APIC is a configuration and management tool, and doesn’t play any part in the network (other than capturing statistics) once the configuration is defined and applied to the network.
If there was any question about whether ACI competes with a classic SDN approach like OpenFlow, this settles it: APIC has nothing to do with data forwarding other than presenting the configuration to the devices in the network. ACI solves different, more interesting and more urgent problems than SDNs do.
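The bookkeeping problem can be pictured as a table that maps each virtual server's MAC address to its network segment and to the switch it currently sits behind. The Python sketch below is purely illustrative: the EndpointTable class, the MAC address, the segment number, and the switch names are hypothetical, and we do not know how ACI stores this information internally.

```python
class EndpointTable:
    """Maps each virtual server's MAC address to (segment, hosting switch)."""

    def __init__(self):
        self._entries = {}

    def learn(self, mac, segment, switch):
        """Record where a MAC currently lives; return the previous switch, if any."""
        previous = self._entries.get(mac)
        self._entries[mac] = (segment, switch)
        return previous[1] if previous else None

    def lookup(self, mac):
        """Return (segment, switch) for a MAC, or None if it is unknown."""
        return self._entries.get(mac)


table = EndpointTable()
table.learn("00:50:56:aa:bb:01", segment=5000, switch="leaf1")

# The virtual server migrates to a host behind a different switch.
moved_from = table.learn("00:50:56:aa:bb:01", segment=5000, switch="leaf4")
print(moved_from, table.lookup("00:50:56:aa:bb:01"))
```

After the move, the location changes but the segment does not, which is what lets the virtual server keep its VLAN and subnet.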
ACI’s VXLAN fabric has some big benefits. IP addresses become completely portable, and access control policies and forwarding rules are completely decoupled from physical network ports. By acting as a distributed “default gateway,” the spine-and-leaf ACI fabric quickly routes packets along an optimal shortest-path from server to server without bottlenecks caused by inter-switch trunk ports or routing engines.
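A small Python sketch shows why a spine-and-leaf layout avoids trunk bottlenecks. It assumes the usual design in which every leaf switch connects to every spine switch (our assumption, not something Cisco detailed for us), so any two leaves are exactly two hops apart, with one equal-cost path per spine. The switch names, flow identifier, and hash choice are illustrative only.

```python
import zlib


def equal_cost_paths(src_leaf, dst_leaf, spines):
    """List every shortest leaf-to-leaf path in a full-mesh spine-and-leaf fabric."""
    if src_leaf == dst_leaf:
        return [[src_leaf]]  # both servers hang off the same leaf switch
    return [[src_leaf, spine, dst_leaf] for spine in spines]


def pick_path(paths, flow_id):
    """Pick one path per flow with a stable hash so a flow's packets stay in order."""
    index = zlib.crc32(flow_id.encode("utf-8")) % len(paths)
    return paths[index]


spines = ["spine1", "spine2", "spine3", "spine4"]
paths = equal_cost_paths("leaf1", "leaf7", spines)
chosen = pick_path(paths, "10.0.1.10:49152->10.0.2.20:3306/tcp")
print(len(paths), "equal-cost paths; this flow uses", chosen)
```

Adding a spine adds another equal-cost path between every pair of leaves, which is how this kind of fabric scales bandwidth without a single inter-switch trunk becoming the choke point.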
But ACI does come with a range of restrictions. Although the APIC can configure other devices, there’s a difference between being an integral part of the VXLAN-enabled data center fabric and acting as a pass-through middlebox, like a firewall or load balancer. Other hardware in the Nexus family without the same support for VXLAN, such as the Nexus 5000 and 7000, can’t participate fully in the spine-and-leaf topology of a Nexus 9000/Nexus 1000V ACI network. When we asked Cisco how existing customers could take advantage of ACI, the company told us: "While it's too early for Cisco to publicly share details on the technology, we plan on presenting an open solution that would allow existing environments to participate in the policy-based automation aspects of ACI."
A more problematic question is how ACI will integrate with the many middleboxes, such as firewalls and load balancers, that are so common in networks. When a middlebox is “outside” the application, ACI doesn’t really care. But as network managers try to push these services into applications, both within a tier (often called “east-west” communications) and between application tiers (often called “north-south” communications), ACI’s “transparent layer 2 over layer 3” model begins to show some weakness. For example, in the version of ACI that we looked at, only transparent firewalls are supported.
The result of all this is a fascinating paradox. If ACI were just “more of the same,” it would be easy for Cisco to get it right the first time because the company would be building on established practices and systems. But ACI is so different that the final state may be quite different from the product family we saw. Cisco is going to have to roll this technology out into a lot of large data centers to understand where it works, where it doesn’t work, and what adjustments it needs to make. The software and hardware we tested were still very much in beta test.
It’s a big gamble, but it’s worth it. ACI is going to be most interesting in very large environments, data centers where 4,000 VLANs haven’t been enough for a long time and where virtualization consumes not just racks, but rows of servers. That doesn’t mean that smaller networks won’t be able to use it, but the architecture today requires a certain amount of scale and size to be worth the investment and complexity.
Network managers have a lot of problems scaling up highly virtualized large data centers with the tools they have. If ACI can meet its goals and enable application managers to push policies into the network by simply describing how their applications work -- well, that’ll be something that no one has done on this scale before.
Snyder, a Network World Test Alliance partner, is a senior partner at Opus One in Tucson, Ariz. He can be reached at Joel.Snyder@opus1.com.