Citrix VMs are tops in transaction processing, Novell's in I/O speed
When we declared VMware's ESX virtual machine platform to be the performance winner against Microsoft's Hyper-V, readers asked, "How could you not test Xen from either Novell or Citrix?"
The short answer then was that neither vendor was ready to enter its Xen hypervisor derivative when testing was conducted last summer. However, in the second round of identical testing done late last fall, we tested Citrix XenServer 5.0, Novell's Xen 3.2 and Virtual Iron 4.4. Two other vendors, Sun and Red Hat, were invited to participate but declined because of timing problems.
Our testing confirmed some readers' assertions that open source Xen is a formidable challenger to the closed code VMware and Microsoft hypervisors. When we measured the performance of business transactions running atop the hypervisors, Citrix's XenServer 5.0 was the top finisher in nine out of 12 test runs.
The disk I/O battle was won by Novell's SUSE Xen, which killed all competition in every contest. That achievement boils down to the fact that within the default installation we tested, Novell's SUSE Xen caches writes when using the default, file-backed disk configuration. This caching gives Novell unprecedented speed. But for some, caching disk writes bucks a longstanding practice of passing disk writes immediately to media to maintain transactional integrity. The counterargument is that should a failure occur while a disk write is in cache storage (before being written to disk), the problem can be easily trapped and dealt with by transaction-oriented applications like databases.
When you pull in the numbers recorded by Microsoft and VMware in the last round of testing, you can see that in terms of performance, the Brothers Xen provide new and formidable competition for both hypervisor market leader VMware ESX and its more recent competitor, Microsoft's Hyper-V.
Para vs. full virtualization
Novell SUSE Xen and Citrix XenServer (along with Hyper-V) are capable of bringing into play a process called paravirtualization that can, where supported in both the host hypervisor and in the virtualized guest operating systems, enable a greater bonding between a guest VM and the resources of the physical server. With this bond in place, the guest operating system is supposed to be able to access the resources of the host machine more efficiently.
Virtual Iron doesn't support paravirtualization. VMware supports paravirtualization for some Linux versions through a VMI-enabled kernel, but the SLES 10 SP2 64-bit distribution we used in our test bed does not have that kernel at this juncture.
We conducted all tests with SLES VMs running on Novell SUSE Xen and Citrix XenServer hypervisors in both para- and full-virtualization modes. We took these extra steps to discern whether paravirtualization offers an advantage, and our analysis says that while paravirtualization helps some of the incremental load profiles we tested, the benefit isn't consistent overall. We printed the best numbers achieved for each hypervisor.
Transaction benchmarks summary
We developed several test profiles that mimic common use cases for virtualized guest operating systems. Each product was tested in this round on an HP 580 G5, four-socket, 16-core server, a test bed and process identical to the ones used to test VMware ESX and Hyper-V (see How we did it).
We used SPEC's SPECjbb2005, a Java-based business transaction benchmark, to first compare native operating system performance to basic hypervisor load profiles. We then measured performance as we added guest virtual machines to each hypervisor platform until we hit the final profile that oversubscribes system resources.
The fastest overall performance of a guest VM in our transactional benchmark testing was achieved by XenServer in most cases.
Across the six tests in which each hypervisor was hosting Windows 2008 Server virtual machines, the only case in which XenServer earned the silver was when we ran six Windows 2008 Server guest VMs, all of which had access to a single virtual CPU. Microsoft's Hyper-V achieved the high-water mark in that test run (measured in our first round of testing) with 14,531 bops, compared with XenServer's 14,128 bops.
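For readers who want to put that gap in perspective, here is a minimal Python sketch of the arithmetic. The two scores come from the test run described above; the function name and the choice to express the lead relative to the runner-up's score are our own illustrative assumptions.

```python
def relative_margin(winner_bops: float, runner_up_bops: float) -> float:
    """Return the winner's lead as a percentage of the runner-up's score."""
    return (winner_bops - runner_up_bops) * 100.0 / runner_up_bops


# Scores reported for six Windows 2008 Server guests, one vCPU each.
hyperv_bops = 14_531
xenserver_bops = 14_128

lead = hyperv_bops - xenserver_bops
margin = relative_margin(hyperv_bops, xenserver_bops)
print(f"Hyper-V lead: {lead} bops ({margin:.2f}%)")
# prints: Hyper-V lead: 403 bops (2.85%)
```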
We can speculate that XenServer gives more resources to a single vCPU than other hypervisors we've tested, which enhances results in situations where the vCPUs are undersubscribed, that is, where the ratio is one VM to one vCPU or less.
In the six tests where the hypervisors were hosting SUSE Linux virtual machines, Novell's own Xen implementation was able to best Citrix's XenServer in the test where there was one Linux VM running on a single vCPU. Of course, this win was achieved by a very slight margin, only 25 bops. VMware's ESX beat XenServer in our test where six Linux VMs had access to four vCPUs by a wider margin of 314 bops.
The performance price for virtualization
Virtualizing a guest operating system adds work for the server to handle. Because more VM guests means more sharing of server resources, performance will at some point degrade under the extra work each VM guest imposes on the finite server resources. We measured performance of both Windows 2008 Enterprise Server and Novell's SLES 10.2 natively on the server to establish a baseline of performance expectation. Those numbers came in at 18,153 bops for Windows Server 2008 and 22,240 bops for Novell SLES.
It's possible for a hypervisor to allocate even more resources than a native OS implementation because a hypervisor is able to capture all the resources of a server, where a native installation might not be able to use those resources because of limits on its kernel's ability to use all the resources of a four-core or 16-core server. I/O drivers included with hypervisors may also manage server resources more productively.
We divided our testing into two rounds: one with the server confined to one socket of four cores, and a second where we re-installed the remaining three sockets, providing 16 cores and four vCPUs for each guest instance. In each test, we progressively added VM guests, and compared the results with the native operating system results on the same hardware.
The results showed a clear winner. XenServer was very efficient at finding resources and offering them up to a guest VM. In the first test where we used one VM guest with a single vCPU, XenServer offered sufficient additional resources from the remaining cores to permit Windows to perform faster than its native performance. It's a bit of a smoke-and-mirrors trick (as XenServer allocates a larger common denominator of resources than the other competitors), but interesting -- and certainly faster than the competition.
Where three VMs shared the four cores with one vCPU allocated to each, XenServer repeated as the performance leader, going just a tiny bit slower, but still faster than native performance. It was only when we started to oversubscribe the four cores with six VM guests that XenServer started to slow down -- but it still exceeded the performance of all four other competitors.
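The load profiles in this first round can be summarized as a ratio of allocated vCPUs to physical cores. The Python sketch below is a hedged illustration only: the VM counts, the one-vCPU allocation and the four-core socket come from the tests described above, while the function name and the "greater than 1.0 means oversubscribed" threshold are our own assumptions about how to express it.

```python
def subscription_ratio(vm_count: int, vcpus_per_vm: int, physical_cores: int) -> float:
    """Allocated vCPUs divided by physical cores; above 1.0 means oversubscribed."""
    return vm_count * vcpus_per_vm / physical_cores


PHYSICAL_CORES = 4  # server confined to one four-core socket
VCPUS_PER_VM = 1    # each guest gets a single vCPU in this round

for vm_count in (1, 3, 6):
    ratio = subscription_ratio(vm_count, VCPUS_PER_VM, PHYSICAL_CORES)
    state = "oversubscribed" if ratio > 1.0 else "not oversubscribed"
    print(f"{vm_count} VM(s): ratio {ratio:.2f} ({state})")
```

Only the six-guest profile pushes the ratio past 1.0, which matches the point at which the results begin to slow.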
Where we tested Novell's SLES 10.2 Linux as a VM, Novell's SLES Xen bested all (although the results were very close) where we had a single SLES 10.2 VM running on a single vCPU. But Novell’s SLES Xen was bested by XenServer when we increased the number of SLES VMs to three and six, each with access to its own vCPU. In no case was SLES VM performance faster than native performance as it had been with Windows 2008 Server Edition testing.
When we gave the XenServer hypervisor guest VM instances lots of vCPUs in our second test round, XenServer did well supporting Windows 2008 VMs, pulling down 98.5% of the Windows Server 2008 native numbers when each VM had access to four vCPUs. It then zoomed to an astounding 108% of native when we added three more VMs to the four vCPUs (remember, it's finding additional resources). XenServer then slowed to just less than 60% of native as we oversubscribed the 16-core system with six guests, four vCPUs each.
XenServer continued its winning streak when running Linux VMs with multiple vCPUs available to the VMs except in the toughest test, where VMware's ESX still tromps all when we over-allocate resources by chaining six SLES VM guests with four vCPUs allocated to each guest.
One performance parameter to note regarding XenServer is that there was a consistency issue in the test where we had six VMs running on one vCPU. While the charted performance numbers show the average speed of the VMs, we kept detailed records on each VM’s individual performance. With XenServer, the differences between the slowest and fastest VMs were as much as 41% across 10 test runs of this test scenario. No other hypervisor's guests showed this variation in any of the test scenarios. We also found that we couldn't predict which guest VM would be fastest or slowest through these test runs.
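A spread like this can be computed from per-VM scores in a few lines. The Python sketch below is illustrative only: the per-VM numbers are hypothetical placeholders rather than our recorded results, and measuring the gap relative to the fastest VM is an assumption about how such a percentage might be derived.

```python
from statistics import mean


def spread_percent(scores: list[float]) -> float:
    """Gap between fastest and slowest VM, as a percentage of the fastest."""
    fastest, slowest = max(scores), min(scores)
    return (fastest - slowest) * 100.0 / fastest


# Hypothetical per-VM scores (bops) for one run with six guests.
# These values are made up for illustration; they are not test data.
run_scores = [15_000, 14_200, 13_100, 12_400, 11_000, 9_500]

print(f"average: {mean(run_scores):.0f} bops")
print(f"spread:  {spread_percent(run_scores):.1f}%")
```

The average alone hides the gap, which is why the per-VM records matter.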
We asked a Citrix spokesperson to comment on the variance in this single test, and Bill Carovano, director of technical product management for XenServer, says the variations were likely caused by the cron jobs that the guests can trigger. Without tweaking, Carovano says these can occur somewhat randomly and may lead to performance variances. In internal testing, Citrix tries to suppress cron jobs to remove fluctuations in its results.
Virtual Iron's performance put it in the overall bottom slot, but it's important to note that the results didn't lag far behind others in all cases. And Virtual Iron did place second in the test hosting a single Windows 2008 Server guest across four vCPUs, a test that gave a single virtual machine a playground of four CPU cores and 2GB of memory -- and all disk resources. That's a pretty wide-open field to run in.
I/O results favor Novell
We tested I/O performance using Intel's IOMeter to assess the number of I/Os per second that each virtual machine could deliver in both under- and over-subscribed conditions.
In the first of our two I/O test scenarios, we used six guest VMs that were assigned one vCPU each, emulating a typical non-oversubscribed server consolidation scenario. The second test made use of six virtual machines, each with four vCPUs and an SMP kernel.
Across every IOMeter test, Novell's SLES Xen blew away the competition. The results were so startling (in some cases there was a tenfold advantage in performance for VMs running on Novell's Xen hypervisor) that we retested Novell's SLES Xen across all scenarios. During these retests we carefully watched the disk I/O channel. Our tests use a 70% write to 30% read ratio in order to put heavy pressure on the disk channel and emulate virtualization in stressful, high-I/O environments. Servers don't typically see this ratio in many applications, but certain applications such as data warehousing, business analysis, database maintenance and the batch processing typical in research applications favor writes over reads, so we test heavily.
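To make the 70/30 mix concrete, here is a minimal Python sketch that builds a randomized sequence of operations with roughly that ratio. It is not IOMeter and does no disk I/O; the function name, the seed and the operation count are assumptions chosen for illustration.

```python
import random


def make_io_mix(total_ops: int, write_fraction: float = 0.70, seed: int = 0) -> list[str]:
    """Return a list of 'write'/'read' labels with about write_fraction writes."""
    rng = random.Random(seed)  # fixed seed so the sequence is repeatable
    return ["write" if rng.random() < write_fraction else "read"
            for _ in range(total_ops)]


ops = make_io_mix(10_000)
write_share = ops.count("write") / len(ops)
print(f"writes: {write_share:.1%}, reads: {1 - write_share:.1%}")
```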
In Novell's case we saw that the read/write transactions to disk seemed to come in large cycles, rather than the steady waves that normally typified disk activity while we were testing other hypervisors. From this evidence, we suspected the Novell system was using write caching.
When we asked Novell to comment on this situation, Santanu Bagchi, Novell's senior product manager for virtualization, confirmed our suspicions and told us that write caching is Novell's default when the virtual disk is configured as a file-backed disk as was the case in our test bed.
Write caching prevents bottlenecks when the channel is busy, but it can, in some cases, cause transactional integrity issues. You can also argue that in many server configurations, the write cache can be battery-backed. Battery backing staves off the transactional integrity issues by temporarily housing data to be written to disk for the life of the battery or until the transaction is written to media and verified.
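The trade-off has an application-level analogue that is easy to see in code. The Python sketch below is a hedged illustration, not a description of how Novell's file-backed disks work: it writes the same records once with buffering left to the operating system and once forcing each record out with `os.fsync`. The file names and record format are hypothetical.

```python
import os
import tempfile


def write_records(path: str, records: list[bytes], sync_each: bool) -> None:
    """Write records to path, optionally forcing each one to storage."""
    with open(path, "wb") as f:
        for record in records:
            f.write(record)
            if sync_each:
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # ask the OS to commit to the device


def read_all(path: str) -> bytes:
    with open(path, "rb") as f:
        return f.read()


records = [b"txn-%d\n" % i for i in range(100)]

with tempfile.TemporaryDirectory() as tmp:
    cached_path = os.path.join(tmp, "cached.log")  # hypothetical file names
    synced_path = os.path.join(tmp, "synced.log")
    write_records(cached_path, records, sync_each=False)
    write_records(synced_path, records, sync_each=True)
    same_bytes = read_all(cached_path) == read_all(synced_path)

print("identical contents:", same_bytes)
```

Both files end up identical when nothing fails. The difference is that the buffered version may still have data in memory if power is lost mid-run, which is the integrity concern raised above.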
In modern data centers, servers are often highly protected with availability features that prevent power outages and other conditions that can corrupt cache and render server data into garbage. It is for these reasons we let the Novell SLES Xen scores stand, realizing that systems purists will likely object to this default installation method and its potential for systems failures.
Citrix XenServer pulled down numbers low enough across most tests for us to query Citrix as to why that was the case. We were told to change the scheduler setting to use the NOOP scheduler, which should have been selected by default but, because of a bug in the installer, wasn't set correctly on our hardware. This change actually resulted in slightly worse numbers for Windows VMs but resulted in significant improvement with the SLES VMs. Our reported numbers reflect the NOOP scheduler being in place.
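On Linux systems of this era, the active I/O scheduler for a block device is exposed in the sysfs file `/sys/block/<device>/queue/scheduler`, with the scheduler in use shown in square brackets. The Python sketch below parses that format from sample strings rather than reading a real device, so it runs anywhere. It is an illustration under assumptions: we are not documenting the exact procedure Citrix gave us, and the sample strings are examples of the file format, not captures from our test bed.

```python
import re


def active_scheduler(sysfs_text: str) -> str:
    """Return the bracketed (active) scheduler name from a sysfs scheduler line."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    if match is None:
        raise ValueError("no active scheduler marked in: " + repr(sysfs_text))
    return match.group(1)


# Example contents of the scheduler file before and after a change.
sample_before = "noop anticipatory deadline [cfq]"
sample_after = "[noop] anticipatory deadline cfq"

print(active_scheduler(sample_before))  # prints: cfq
print(active_scheduler(sample_after))   # prints: noop
```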
In terms of performance, the Brothers Xen provide some stiff competition. The question is, which is more important to your VM scheme: transactional performance (XenServer is tops there) or I/O performance (Novell’s SUSE Xen screams if you can stand the caching component)? The answer could sway your decision as to which Xen hypervisor might be more suitable for your environment.
Henderson and Allen are researchers for ExtremeLabs, of Indianapolis. Contact them at email@example.com.