How we tested the iSCSI SAN servers

We tested iSCSI servers by installing and configuring them, putting them through a series of typical tasks, and running performance tests.

Our test bed was built around a dedicated data network provided by an Enterasys C2G124-48 Gigabit Ethernet switch, along with a dedicated control network on a second Enterasys C2G124-48 switch. The two networks were connected to each other, and to our production lab network, through a Nokia firewall.

Each iSCSI SAN server was connected directly to the Enterasys data network using as many Ethernet ports as the vendor provided with the server. Where it was possible to separate control and data traffic, we did so by connecting the control ports to our control network; where it wasn't, we ran both control and data over the same network.

Many storage servers support "jumbo" frames, a non-standard extension to Ethernet that allows larger frames (up to 9,000 octets is common). Depending on the device, this can increase total system throughput by reducing the overhead spent on packet headers and per-packet processing. The Enterasys switch supports jumbo frames, so we enabled them on every server that supported them as well.
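To see why larger frames help, consider the rough arithmetic below. It is a back-of-the-envelope sketch of Ethernet framing overhead, assuming minimal TCP/IP headers, not a measurement from our tests.

    # Back-of-the-envelope comparison of framing overhead for a standard
    # 1500-byte MTU versus a 9000-byte jumbo MTU. Illustrative only; it
    # ignores TCP/IP options, iSCSI PDU framing and other real-world detail.

    ETH_OVERHEAD = 18 + 20     # Ethernet header/FCS plus preamble and inter-frame gap (bytes on the wire)
    IP_TCP_HEADERS = 20 + 20   # minimal IPv4 + TCP headers inside each frame

    def efficiency(mtu: int) -> float:
        """Fraction of each frame's wire bytes that carry application payload."""
        payload = mtu - IP_TCP_HEADERS
        wire_bytes = mtu + ETH_OVERHEAD
        return payload / wire_bytes

    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {efficiency(mtu):.1%} payload efficiency")

    # Prints roughly 94.9% for 1500 and 99.1% for 9000: a modest gain in raw
    # throughput, with the bigger win coming from far fewer frames to process
    # per second.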

Another common performance enhancement is link aggregation (sometimes called "bonding" or "teaming"), which combines multiple Ethernet ports into a single virtual port that carries the combined throughput of the component ports and provides high availability. Where link aggregation was supported, we used it.

In some cases, we had to configure link aggregation manually on the Enterasys switch; in others, the storage server supported the Link Aggregation Control Protocol (LACP), which negotiated the aggregation automatically. In those cases, we simply verified that the Enterasys switch and the device agreed on the link-aggregation configuration.
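One caveat worth noting: link aggregation does not make any single connection faster, because the switch hashes each flow onto one member port; the benefit shows up when several initiators or sessions run at once. The sketch below illustrates that behavior with a simplified, made-up hash and hypothetical port names; real switches hash on MAC, IP or TCP fields according to their configuration.

    # Simplified illustration of how link aggregation distributes traffic:
    # each flow is hashed onto one member port, so a single flow never exceeds
    # one link's speed, but many flows spread across the group.

    import hashlib

    MEMBER_PORTS = ["ge.1.1", "ge.1.2", "ge.1.3", "ge.1.4"]  # hypothetical port names

    def member_for_flow(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> str:
        # Stand-in hash; not the algorithm any particular switch uses.
        key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
        digest = hashlib.md5(key).digest()
        return MEMBER_PORTS[digest[0] % len(MEMBER_PORTS)]

    # Four iSCSI initiators talking to one target usually land on different links:
    for host in ("10.0.0.11", "10.0.0.12", "10.0.0.13", "10.0.0.14"):
        print(host, "->", member_for_flow(host, "10.0.0.100", 51000, 3260))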

We attached four identical servers to the data network, each with a single dedicated 1Gbps connection. Each server also had a 100Mbps connection to the control network. We used an Avocent AMX5100 KVM to control the servers. The servers were all dual-Xeon systems with 3.0GHz CPUs and at least 2GB of memory. Each booted from a locally attached SCSI drive.

Two of the servers used dedicated QLogic QLA4050C iSCSI host-bus adapters. These adapters connect to the data network over TCP/IP and handle all of the iSCSI protocol processing on the adapter itself, which reduces the load on the host and makes certain operations, such as booting directly from an iSCSI virtual disk, much simpler. The other two servers connected to the data network with Intel Pro/1000MT adapters, provided by Intel. Both the QLA4050C and the Pro/1000MT are 64-bit, 133MHz PCI-X adapters with a single Gigabit Ethernet connection.

Originally, we configured two servers with Windows Server 2008 (one each with the QLogic and Intel adapters) and two servers with CentOS 5 Linux. With the Intel adapters we used the native iSCSI initiator: Microsoft's own on Windows Server 2008 and Open-iSCSI on CentOS 5. However, the Linux servers proved extremely problematic when we came to performance testing. Despite heroic assistance from the technical support team at Reldata, the results from our benchmark tool, Iometer, were not credible on CentOS 5 using either the QLogic or the Intel adapter, which pointed to some interaction between Iometer and the operating system. Two of the iSCSI server vendors confirmed known problems between CentOS 5 and Iometer, so we dropped CentOS 5 from the performance tests and replaced those systems with additional Windows Server 2008 servers. We did, however, continue to use CentOS 5 and the Open-iSCSI initiator as part of our interoperability and usability tests.

For the iSCSI servers, we configured the physical disks provided into arrays using RAID6 where it was available and RAID5 where it was not. On each iSCSI disk server, we configured a total of eight virtual disks out of the RAID6/RAID5 arrays, two for each of the servers being tested. Where possible, we balanced the eight virtual disks across multiple controllers within the iSCSI disk server.
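For readers unfamiliar with the trade-off, the sketch below shows the usable-capacity arithmetic behind that choice. The disk count and drive size are hypothetical examples, not the configuration of any unit we tested.

    # Usable-capacity arithmetic for RAID5 vs RAID6, and carving the array
    # into eight virtual disks (two per test server). Disk count and size
    # are hypothetical examples.

    DISKS = 12            # physical drives in the array (example)
    DISK_SIZE_GB = 750    # raw capacity per drive (example)

    raid5_usable = (DISKS - 1) * DISK_SIZE_GB   # one drive's worth of parity
    raid6_usable = (DISKS - 2) * DISK_SIZE_GB   # two drives' worth of parity

    for name, usable in (("RAID5", raid5_usable), ("RAID6", raid6_usable)):
        per_virtual_disk = usable / 8           # eight virtual disks per array
        print(f"{name}: {usable} GB usable, ~{per_virtual_disk:.0f} GB per virtual disk")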

Our performance tests were based on simulated workloads designed to imitate four applications: Exchange 2003, Exchange 2007, a Windows file server and a Linux web server. For Exchange, we used common workload metrics found in other benchmark tests. For the file server and web server, we characterized the workload from existing servers running multiple operating systems at Opus One.

We ran the popular Iometer disk benchmark tool on all four servers, linked to a single Iometer console that started and stopped each test on all servers and aggregated the results. These results were obtained using Iometer version 2006.07.27, Copyright 1996-1999 by Intel Corporation. Intel does not endorse any Iometer results.

Our Iometer workloads are summarized below, by application and virtual disk, and restated as data after the table. Each worker process had an I/O queue of 32 entries, ensuring that the Iometer systems kept plenty of I/O outstanding. We believe that this workload, at the level we configured it, far exceeds what a typical server would present.

Iometer workload configurations, by application and virtual disk:

Exchange 2003
Virtual disk 1: 3 workers, 4K I/Os, 67%/33% read/write, 95%/5% random/sequential
Virtual disk 2: 1 worker, 64K I/Os, 5%/95% read/write, 5%/95% random/sequential

Exchange 2007
Virtual disk 1: 3 workers, 8K I/Os, 50%/50% read/write, 95%/5% random/sequential
Virtual disk 2: 1 worker, 64K I/Os, 5%/95% read/write, 5%/95% random/sequential

File Server
Virtual disk 1: 2 workers, 10% 4K, 10% 8K, 40% 32K, 40% 64K I/Os, 80%/20% read/write, 5%/95% random/sequential
Virtual disk 2: 2 workers, 10% 4K, 10% 8K, 40% 32K, 40% 64K I/Os, 80%/20% read/write, 5%/95% random/sequential

Web Server
Virtual disk 1: 2 workers, 10% 8K, 20% 32K, 20% 64K, 20% 256K, 30% 512K I/Os, 95%/5% read/write, 5%/95% random/sequential
Virtual disk 2: 2 workers, 10% 8K, 20% 32K, 20% 64K, 20% 256K, 30% 512K I/Os, 95%/5% read/write, 5%/95% random/sequential
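For readers who want to build something similar, the same workloads are restated below as data, roughly in the shape of Iometer access specifications. This is a paraphrase of our configuration, not an export of our actual Iometer configuration files.

    # The table above restated as data, roughly in the shape of Iometer access
    # specifications. A paraphrase of the test configuration, not an export of
    # the actual Iometer .icf files.

    OUTSTANDING_IOS = 32   # I/O queue depth per worker

    WORKLOADS = {
        "Exchange 2003": {
            "disk1": {"workers": 3, "sizes": {"4K": 100}, "read_pct": 67, "random_pct": 95},
            "disk2": {"workers": 1, "sizes": {"64K": 100}, "read_pct": 5, "random_pct": 5},
        },
        "Exchange 2007": {
            "disk1": {"workers": 3, "sizes": {"8K": 100}, "read_pct": 50, "random_pct": 95},
            "disk2": {"workers": 1, "sizes": {"64K": 100}, "read_pct": 5, "random_pct": 5},
        },
        "File Server": {
            "disk1": {"workers": 2, "sizes": {"4K": 10, "8K": 10, "32K": 40, "64K": 40},
                      "read_pct": 80, "random_pct": 5},
            "disk2": {"workers": 2, "sizes": {"4K": 10, "8K": 10, "32K": 40, "64K": 40},
                      "read_pct": 80, "random_pct": 5},
        },
        "Web Server": {
            "disk1": {"workers": 2, "sizes": {"8K": 10, "32K": 20, "64K": 20, "256K": 20, "512K": 30},
                      "read_pct": 95, "random_pct": 5},
            "disk2": {"workers": 2, "sizes": {"8K": 10, "32K": 20, "64K": 20, "256K": 20, "512K": 30},
                      "read_pct": 95, "random_pct": 5},
        },
    }

    # Quick sanity check of the configuration.
    for workload, disks in WORKLOADS.items():
        total_workers = sum(spec["workers"] for spec in disks.values())
        print(f"{workload}: {total_workers} workers, queue depth {OUTSTANDING_IOS} each")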

We ran each of the workloads for 15 minutes, and repeated each workload test three times, averaging the results to get our performance numbers. In the few cases where a system failed to complete the workload, we discarded those results and re-ran the test.

Because we were not specifically comparing dedicated iSCSI host-bus adapters (such as the QLogic QLA4050C) with standard Ethernet adapters (such as the Intel Pro/1000MT), we did not break out the performance results separately for each of the servers we tested.

