Integrated server blades, networking and management make UCS a strong contender for fast-growing data centers in this exclusive Network World test
If you're tempted to think of Cisco's Unified Computing System (UCS) as just another blade server — don't. In fact, if you just want a bunch of blades for your computer room, don't call Cisco — Dell, HP, and IBM all offer simpler and more cost-effective options.
But, if you want an integrated compute farm consisting of blade servers and chassis, Ethernet and Fibre Channel interconnects, and a sophisticated management system, then UCS might be for you.
When Cisco introduced UCS in 2009, based on a 2006 investment in Nuova Systems, everyone had an opinion about Cisco entering the server business. Now that they've had a couple of years to prove their case, we wanted to take a closer look and see whether UCS had lived up to the initial excitement.
We found that for some environments, Cisco has brought a compelling and valuable technology to market. Cisco UCS offers enterprises greater agility and lower deployment and maintenance costs, and is especially attractive in virtualization environments.
While UCS won't be attractive in some data centers, and won't be cost effective in others, it does have the potential to make life, and computing, easier for data center managers.
Under the hood
Cisco UCS has three main components: blade server chassis and blades, a fabric interconnect, which is networking based on Cisco Nexus 5000 switch hardware and software, and a management system resident within the fabric interconnect that controls it all.
The blade server chassis is fairly simple, and there's a competitive selection of blade CPU and memory options. Networking is integrated, not just within the chassis, but between multiple chassis (up to about 20 within a single management domain today).
But what really makes Cisco UCS worth considering is the integrated hardware and configuration management. In UCS, a single Java application (or CLI) is used to manage the hardware and network configuration for up to 176 blades today, with a doubling of that expected to hit the streets soon.
The management system runs as a software process inside of the (mandatory) UCS 6100- or 6200-series fabric interconnect hardware, and is responsible for configuration of the chassis, the blades, and all networking components.
If you follow Cisco's advice and use two fabric interconnects, you'll have high availability for networking, and high availability for UCS management. The management system automatically clusters and runs in an active/passive high-availability mode spread across the two fabric interconnects.
The management interface actually takes the form of a documented XML-based API, accessible either via Cisco-provided CLI or GUI tools, or, if you want to write your own tools or buy third-party ones, directly via the API. We used the Java-based UCS Manager software, which is what anyone with a single UCS domain would want to use, in most of our testing.
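To give a sense of what working directly against the XML API involves, here is a minimal Python sketch that only builds two request documents and never contacts a system. The method names aaaLogin and configResolveClass, the computeBlade class ID, and the attribute names are as we recall them from Cisco's UCS Manager XML API documentation and should be verified there before use. The helper function names and the placeholder credentials are our own.

```python
import xml.etree.ElementTree as ET


def build_login_request(username: str, password: str) -> str:
    """Return the XML body for a session login (assumed method: aaaLogin)."""
    elem = ET.Element("aaaLogin", {"inName": username, "inPassword": password})
    return ET.tostring(elem, encoding="unicode")


def build_blade_query(cookie: str) -> str:
    """Return the XML body that asks for every blade object in the domain.

    Assumes the configResolveClass method and the computeBlade class ID.
    """
    elem = ET.Element(
        "configResolveClass",
        {"cookie": cookie, "classId": "computeBlade", "inHierarchical": "false"},
    )
    return ET.tostring(elem, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values only; a real session cookie would come from the login reply.
    print(build_login_request("admin", "example-password"))
    print(build_blade_query("example-cookie"))
```

A home-grown tool would send these bodies to the management address over HTTP or HTTPS and parse the XML that comes back, which is roughly what the Cisco-provided GUI and CLI do on your behalf.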
Because a UCS domain is limited in size today to about 176 servers connected to a single pair of UCS fabric interconnects, it's likely that many customers will have at least two domains for two data centers. In that case, you can manage the two domains separately or buy a third-party "orchestrator" package that lets you work across domains. Cisco also offers a free open source tool called "UCS Dashboard" that lets you roll up two or more UCS domains into a single read-only view.
There are some limitations to the reach of the management system. For example, if you provision a new server with SAN connections, there's no way for the management interface to reach over to the SAN to make the linkage and match up Fibre Channel names. The same is true for networking: just because you create a new VLAN using the UCS management system doesn't mean that the rest of your network will know about it.
UCS management at your service
UCS management is based largely on the concept of "service profiles," a series of parameters that define every aspect of a single blade server, from BIOS versions, power and disk settings, to network interface card configurations including media access control addresses and storage-area network identifiers.
Once you have created a service profile for a type of server, you use it whenever you want to add servers to your mix. Install the blades, and then apply the service profiles in an "association cycle." Within a few minutes, a server can be provisioned that matches your requirements.
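One way to picture the service profile idea is as a record that carries a server's identity and is attached to whichever blade happens to be available. The Python sketch below is a toy model under that assumption, not Cisco's data model: the class names, field names, slot labels, and address values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServiceProfile:
    """Hypothetical stand-in for the settings a service profile carries."""
    name: str
    bios_version: str
    mac_addresses: List[str]
    wwpns: List[str]  # Fibre Channel port names seen by the SAN


@dataclass
class Domain:
    """Toy UCS domain: records which profile is applied to which blade slot."""
    associations: Dict[str, ServiceProfile] = field(default_factory=dict)

    def associate(self, slot: str, profile: ServiceProfile) -> None:
        """Apply a profile to an empty slot; a profile lives on one blade at a time."""
        if slot in self.associations:
            raise ValueError(f"{slot} already has a profile")
        if any(p.name == profile.name for p in self.associations.values()):
            raise ValueError(f"{profile.name} is already associated")
        self.associations[slot] = profile

    def move(self, old_slot: str, new_slot: str) -> None:
        """Re-home a profile; the MACs and WWPNs travel with it."""
        if new_slot in self.associations:
            raise ValueError(f"{new_slot} already has a profile")
        self.associations[new_slot] = self.associations.pop(old_slot)


if __name__ == "__main__":
    web01 = ServiceProfile(
        name="web01",
        bios_version="example-bios",
        mac_addresses=["02:00:00:00:00:01", "02:00:00:00:00:02"],
        wwpns=["20:00:00:00:00:00:00:01"],
    )
    domain = Domain()
    domain.associate("chassis-1/blade-3", web01)
    domain.move("chassis-1/blade-3", "chassis-2/blade-5")
    print(sorted(domain.associations))
```

The point of the model is that the network and storage identifiers belong to the profile rather than to the hardware, so a replacement blade can take over without the LAN or SAN seeing a new server.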
We can say one thing: you don't know what you're missing until you've seen UCS management in action. Getting a server from out-of-the-box to ready-to-use is reduced to a bare minimum of effort. This makes UCS ideal for enterprise environments where the number of servers is sizeable and growing continuously.
If you're not constantly adding new servers, and incurring the pain of configuration and deployment, then an investment in UCS is less compelling.
Digging deeper into UCS servers
Cisco UCS may be all about management, but if the servers that make up UCS don't make the grade, then there's no point. We found a solid core of full-featured blades, but also a lot of obsolete and niche UCS products on the web site and price list that had to be cut away to understand what was really important. In both servers, and in networking options, Cisco has a lot of parts that confuse the issue, making things more complicated than they need to be.
Cisco currently offers B-series (blade) and C-series (rack-mount server) options for UCS, although the B-series are all that matters. The B-series are blades that go into an eight-slot chassis (the UCS 5108), and the C-series are standard 1U to 4U rack-mount servers.
The B-series blades have changed over time. Cisco started with an "M1" series of blades, some of which are still on its price list, and has since gone through an upgrade cycle, offering B200, B230, B250, and B440 M2 blades. Today, the "M2" series includes two-socket and four-socket offerings based on Intel 5600 and E7 series processors with four to 10 cores per socket, CPU speeds up to 3.46 GHz, and with up to 512GB memory.
Blades come in both single slot and double slot configurations, depending on the number of disk drives and the amount of memory you want. (Cisco confusingly calls these half-slot and full-slot, which means they should have called the 5108 chassis a 5104 chassis, since it really only has four "full slots.") Most environments will be based on the single slot configuration, giving eight blades per chassis.
Compared to existing 1U servers from traditional vendors, the B-series blades stand up as very competitive offerings from a technology point of view. In fact, with Cisco's Extended Memory Technology, B200 two-socket servers can have as much as 384GB of memory, beating out traditional rack-mounted Intel Xeon 5500/5600-based servers that top out at 144GB or 288GB (using very expensive and not-very-available 16GB DIMMs). Even if you don't want that much memory, Cisco's higher DIMM slot count lets you use less expensive (per gigabyte) DIMMs to achieve the same memory capacity.
As with any blade server, the focus is on network-based storage via SAN rather than local storage. The B-series blades all have the capability to handle two or, in the case of the B440, four drives, but local storage is extremely limited. If more than four drives of local storage on a single system are important, then blade servers are probably not right for you.
The C-series includes six standalone devices, from 1U to 4U and with a storage capacity of between eight and 16 drives. Anyone looking at UCS should focus exclusively on the B-series, for two reasons. First, while the C-series have most of the capabilities of the B-series blades, they aren't managed and controlled in the same way, although Cisco told us they are working to smooth out the differences.
Secondly, and more importantly, there's just not a lot of point in buying standalone servers from Cisco. All of the advantages of UCS disappear when you're talking big servers with lots of local disks. Once you put a lot of disks on something, it's no good for hypervisor virtualization, and it's no longer a cog in the machine of the data center.
If you had a big Cisco blade server farm and wanted to throw one or two rack-mount standalone servers in, you could do that for a special purpose, but there's no good reason to build UCS in your machine room based on rack-mount servers.
Cisco's blade chassis, the UCS 5108, is also very competitive with other blade chassis on the market. The 6U unit has four power supplies and eight fan trays and is designed for easy maintenance both of the chassis and the blades inside of it. Features such as front-to-back airflow and cabling are all set up for modern data center environments. If you put the UCS 5108 in your data center, you're not going to be surprised by any poor design choices.
On the other hand, the raw blade servers you get in UCS are not going to stun you with their brilliance either. Now that most servers are being treated as commodity systems using the same chipset, there's not a lot of room for computing innovation while maintaining compatibility.
If you've been buying servers by the dozen from Dell, IBM, and HP, Cisco's blade server specifications and capabilities aren't going to be very far afield from what you're used to.
UCS is primarily a server product designed to be sold to data center managers, not network managers, but, as you would expect from Cisco, there's a very strong awareness of the problems of networking in the data center.
For example, a fully configured UCS chassis with eight servers inside will usually only require four power cables, and four data cables to connect to the enterprise network: two 10Gbps ports out of the interconnect card on one side of the chassis and two out of the card on the other side.
That's not bad for eight servers, which would traditionally require eight times as many patch cords for both storage and networking, and four (or more) times as many power cables.
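The cabling comparison is simple arithmetic, and a short Python sketch makes it easy to scale up. It uses only the figures above: four data and four power cables per UCS chassis, with traditional servers needing eight times the patch cords and four times the power cables. The function and constant names are our own, and the 20-chassis case is just an example size.

```python
from typing import NamedTuple


class CableCount(NamedTuple):
    data: int
    power: int


# Per-chassis figures and multipliers taken from the comparison above.
UCS_DATA_PER_CHASSIS = 4
UCS_POWER_PER_CHASSIS = 4
TRADITIONAL_DATA_MULTIPLIER = 8
TRADITIONAL_POWER_MULTIPLIER = 4


def ucs_cables(chassis: int) -> CableCount:
    """Cables needed for the given number of eight-blade UCS chassis."""
    return CableCount(UCS_DATA_PER_CHASSIS * chassis, UCS_POWER_PER_CHASSIS * chassis)


def traditional_cables(chassis: int) -> CableCount:
    """Cables for the same number of servers built from standalone boxes."""
    ucs = ucs_cables(chassis)
    return CableCount(
        ucs.data * TRADITIONAL_DATA_MULTIPLIER,
        ucs.power * TRADITIONAL_POWER_MULTIPLIER,
    )


if __name__ == "__main__":
    for chassis in (1, 20):
        print(chassis, ucs_cables(chassis), traditional_cables(chassis))
```

For a single chassis that works out to 4 data cables against 32, and 4 power cables against 16.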
To understand the networking, you have to see that Cisco has created a distributed switch, extending all the way from the traditional distribution-layer switch down to the NIC in the blade server, and even to virtual NICs in virtual machines running on a blade server.
Cisco UCS includes two critical pieces that make this large-scale distributed switch possible. The first piece is the Fabric Extender, the UCS 2104XP. This card — and you need two of them per blade chassis, unless you are simply building a test system — sits in the UCS 5108 chassis, and aggregates the traffic inside the blade server, including both Ethernet and Fibre Channel, from all eight blades over internal 10Gbps interconnects. These fabric extenders shoot the traffic up to the second critical piece, the Fabric Interconnects, (based on Cisco Nexus 5000 switch hardware) over multiple 10Gbps connections.
The benefit of UCS to the network manager is that everything, from the fabric interconnects down to the Ethernet cards in the blades, is managed as a single entity. There's no difference between the management of the core switches, the top-of-rack switch configuration, which wires go to what ports, or how VMware networking is configured — it's all done by one person, the UCS chassis manager, using Cisco's UCS management tools.
The networking handoff between a Cisco UCS domain of a hundred or more servers and the rest of the data center occurs at the fabric interconnect, where a few Ethernet and Fibre Channel connections link UCS to the core LAN and Fibre Channel switches. It's sophisticated networking, but the details are hidden. Remember that UCS is managed by server managers with minimal need for networking expertise. To set your expectations properly, pretend that it doesn't say "Cisco" on that nameplate: this is not a network product, but a server product.
Once the configuration is loaded into a blade, the blade's networking configuration is done and isolated from other devices. That means that when you start to load an operating system on a configured blade, all you see are the Ethernet and Fibre Channel ports configured by the UCS manager.
In a VMware environment, the UCS manager brings virtual ports to each virtual machine. The VMware vSwitch is gone (if you want), because the vSwitch has been replaced by the UCS fabric extenders and fabric interconnect, a true physical switch. There's no need for the VMware manager to understand VLANs, vSwitches, or anything other than normal LAN and storage interconnections.
These configured ports on blades show up as virtual ports on the fabric interconnect. Every virtual NIC on every VLAN (and every Fibre Channel adapter) available to every blade has become a port on the fabric interconnect, literally thousands of them in some situations.
While the fabric interconnects are based on the same hardware as Cisco's Nexus 5000-series switches, you don't get the full NX-OS configuration capability you might have expected on the Nexus switch. The fabric interconnect switches traffic, but that's about it, meaning that more powerful Layer 3 switch features, such as routing and access control lists, are not available at this level.
If you've gotten used to advanced security features of Cisco's Nexus 1000V virtual switch in your VMware environment, you won't find them in Cisco UCS, and you'd have to combine UCS capabilities and the 1000V, losing some of the benefits of UCS.
Cisco goes even further and strongly suggests you run the fabric interconnect in "End Host" mode which disables spanning tree, making the UCS domain connect up to your network as if it were a really, really, big host. UCS then can spread the load of different VLANs across all uplinks from the fabric interconnect to the rest of the network. This advice makes it clear who UCS is designed for: not the network manager, but the server hardware manager.
Strict configuration makes for simplified networking
Networking flow in Cisco UCS is very hierarchical and very constrained. Every blade connects Ethernet data, Fibre Channel data, and some out-of-band management traffic, over two private 10Gbps connections. These two connections are internal within the chassis, one from each blade to the two fabric extenders also within the chassis (in the normal case). The fabric extenders connect upwards, out of the chassis, to the fabric interconnects, typically using two ports per fabric extender for a total of four ports per chassis going to two fabric interconnects.
From the fabric interconnects, Cisco UCS connects to the rest of your Ethernet and Fibre Channel network via separate Fibre Channel and 10Gbps Ethernet connections.
Some variation in networking is possible, but not a lot. Cisco has multiple Ethernet cards available for the blades, but most network managers will use the M81KR adapter, code-named "Palo," which presents itself as Fibre Channel and Ethernet NICs to the blade, and has two 10Gbps internal uplink ports.
There's also an Ethernet-only card if you don't want Fibre Channel, which will save you $300 a blade. However, if you're not heavily into Fibre Channel storage, all of the networking integration and many of the provisioning advantages of UCS won't mean anything to you — which suggests that UCS works best in a Fibre Channel environment.
In other words, if you're using iSCSI or local storage, you're not a great candidate for seeing the advantages of UCS.
When we looked at UCS last month, the fabric extender was limited to the 2104XP, which has eight internal ports (one for each blade) and four uplink ports to the fabric interconnect, all at 10Gbps. A 2208 model has been announced (along with a matching high-density Ethernet card), with 32 internal ports and eight uplink ports, for the rare environment where 10Gbps is just not enough for a single blade.
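Those port counts translate directly into an oversubscription ratio between blade-facing bandwidth and uplink bandwidth on each fabric extender. The Python sketch below works it out, assuming every port on a given fabric extender runs at the same speed (stated for the 2104XP, assumed here for the 2208). The function name is our own.

```python
def oversubscription_ratio(internal_ports: int, uplinks_in_use: int) -> float:
    """Blade-facing bandwidth divided by uplink bandwidth for one fabric extender.

    Assumes all ports run at the same speed, so the speed cancels out.
    """
    if internal_ports <= 0 or uplinks_in_use <= 0:
        raise ValueError("port counts must be positive")
    return internal_ports / uplinks_in_use


if __name__ == "__main__":
    cases = [
        ("2104XP, all four uplinks", 8, 4),
        ("2104XP, two uplinks cabled", 8, 2),
        ("2208, all eight uplinks", 32, 8),
    ]
    for label, internal, uplinks in cases:
        print(f"{label}: {oversubscription_ratio(internal, uplinks):.0f}:1")
```

With the typical two uplinks per 2104XP, eight blades share 20Gbps toward each fabric interconnect, a 4:1 ratio. Cabling all four uplinks halves that to 2:1.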
The fabric interconnects have also been revised. Cisco originally released the UCS 6120XP and UCS 6140XP, able to handle 20 and 40 chassis ports plus uplink capacity. The current replacement for both is the UCS 6248UP, with a total of 48 ports. Depending on how the rest of your network looks, that would leave you room for 20 to 22 chassis per switch. The unannounced-but-nearly-ready UCS 6296UP would double those numbers, allowing up to 44 chassis, or 352 blades, per UCS domain.
Those maxima are pretty important, because you can't grow UCS domains (that's the word Cisco uses for a combination of fabric interconnects and chassis) beyond two peer-connected fabric interconnects.
If you follow best practice recommendations for redundancy, that means you start with two fabric interconnects (which are clustered into a single management unit), and can have up to about 22 chassis, or 176 blade servers, per UCS domain using released hardware. (Double that if you're willing to wait for the UCS 6296UP to ship.)
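The domain limits follow from port counts, and a few lines of Python reproduce them. The sketch assumes the typical wiring described earlier, in which each chassis consumes two ports on each fabric interconnect, and it treats the number of ports held back for LAN and SAN uplinks as a free parameter. The reserved-port values shown are illustrative choices that happen to match the figures quoted above, not Cisco recommendations.

```python
BLADES_PER_CHASSIS = 8
# Each fabric extender typically uses two uplinks, and each chassis has one
# fabric extender per fabric interconnect.
PORTS_PER_CHASSIS_PER_INTERCONNECT = 2


def max_chassis(total_ports: int, reserved_uplink_ports: int) -> int:
    """Chassis that fit on one fabric interconnect after reserving uplink ports."""
    usable = total_ports - reserved_uplink_ports
    if usable < 0:
        raise ValueError("cannot reserve more ports than the switch has")
    return usable // PORTS_PER_CHASSIS_PER_INTERCONNECT


def max_blades(total_ports: int, reserved_uplink_ports: int) -> int:
    """Blades per domain, assuming single-slot blades fill every chassis."""
    return max_chassis(total_ports, reserved_uplink_ports) * BLADES_PER_CHASSIS


if __name__ == "__main__":
    print("48 ports, 8 reserved:", max_chassis(48, 8), "chassis")
    print("48 ports, 4 reserved:", max_chassis(48, 4), "chassis,", max_blades(48, 4), "blades")
    print("96 ports, 8 reserved:", max_chassis(96, 8), "chassis,", max_blades(96, 8), "blades")
```

Because the second fabric interconnect mirrors the first for redundancy, it adds availability rather than capacity, which is why the per-switch chassis count is also the per-domain count.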
All of these configuration guidelines and capabilities make UCS networking a great fit in some environments, but not in others.
If you've had networking configuration and management problems with large virtualization environments or even physical environments with lots of servers, Cisco UCS provides a dramatic simplification by creating a flat distributed switch that reaches all the way down to each guest virtual machine.
If you've been burned by cable management problems, or if the idea of bundling more than 150 servers or 1,500 virtual systems into four racks with 80 internal patch cables and fewer than 10 external patches seems like a good one, then the network density and rollup of UCS will definitely drop your blood pressure, and reduce the likelihood of patching and configuration errors.
Is UCS right for you?
After spending a week looking in-depth at Cisco UCS, as we did, it's easy to come away excited about the product. The engineering is solid, the software isn't buggy, and UCS clearly has something to offer to the data center manager.
On the other hand, UCS is not for everyone. If you've only got 100 servers in your data center, or if you're not growing racks full of servers every few months, you won't enjoy the management interface, because you're not feeling the pain of deploying servers.
If you're worried about single vendor lock-in for hardware and networking, if you run the same application on 10,000 servers, or if capital costs for servers are a major concern, Cisco UCS won't be very attractive to you.
Cisco UCS is thoroughly modern hardware. The performance (running industry standard benchmarks) in both virtualization and non-virtualization environments is outstanding. Features such as power management, hardware accessibility, and high-speed networking are what you'd want from a server vendor. Although there will always be a lingering concern whether Cisco will stay in the server business, they've shown evidence of continuing innovation and development, and solid commitment from customers up to this point.
The use case for UCS boils down to two advantages: agility, and shrinking provisioning and maintenance time.
Agility because UCS treats server blades the way that SANs treat disk drives, as anonymous elements that are brought into play as needed by the load. Whether you're layering a virtualization workload on top or running non-virtualized servers, UCS offers some of the benefits of virtualization at the server hardware layer.
One Cisco staffer called it "VMotion for bare metal." It's not exactly that, of course, but the idea is the same: virtual or non-virtual workloads can be moved around computing elements. This makes it easy to upgrade servers, to manage power, to balance loads around data centers, and to maintain hardware in a high-availability world.
The shrinking of provisioning and maintenance time comes from the management interface. All of the little details of bringing a new rack of servers online, from handling Fibre Channel addressing to virtual or physical NICs, to cabling, to power management, to making sure that every little setting is correct, are taken care of by the UCS management layer, either using Cisco's applications, a multi-domain orchestrator from some third party, or even home-grown tools.
If virtualization is one of the first steps you take to gain a competitive advantage in enterprise computing, then the agility and flexibility that UCS delivers are good second steps.
Snyder, a Network World Test Alliance partner, is a senior partner at Opus One in Tucson, Ariz. He can be reached at Joel.Snyder@opus1.com.