King of the NOS hill: How we did it
We tested each NOS on Compaq ProLiant 1600 servers with dual 650MHz Pentium III CPUs, 512K-byte L2 cache and 640M bytes of RAM. The data partition consisted of 14 9.1G-byte drives loaded in a Compaq RAID Array 4214 drive array connected to an on-board Ultra2 SCSI controller. The NOSes were loaded on a 9.1G-byte drive connected to a second on-board Ultra2 SCSI controller. Putting the operating system and the data on separate controllers increased the bandwidth available to the drive subsystem and eased the bottleneck to the data drives.
The client hardware consisted of four ProLiant 1600 machines with dual 600MHz Pentium III CPUs and 640M bytes of RAM. Two additional ProLiant 1600 machines had dual 400MHz Pentium II chips with 256M bytes of RAM. Each client system ran Windows NT Server 4.0.
Servers and clients were connected to the network with Intel Pro 100+ network interface cards (NICs). A Cisco Catalyst 2900 switch with 24 10/100M bit/sec Ethernet ports completed the network configuration. All the NICs and switch ports were configured for 100M bit/sec full-duplex operation. We used an additional Compaq 1600 with four Ethernet NICs as the control machine.
For our NOS benchmarking, we focused on file service and network performance. To test file service performance, we used Client/Server Solutions' Benchmark Factory tool, which let us create tests that would stress each operating system's file subsystem. We configured the server to provide a Windows network file share for all clients using the Windows Server Message Block protocol over IP. We installed a benchmark agent on each client and modified the clients' LMHOSTS files to evenly distribute file transaction requests to the server.
We divided the tests into two categories: small and large file transfers.
For the small file transfer tests, we used a 3-D test matrix of transfer direction (read/write), block size (1K and 8K bytes) and transaction type (random/sequential), which resulted in eight individual tests. The small file transfer tests used a mix of 80% 1K-byte files, 10% 10K-byte files and 10% 50K-byte files.
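As an illustration of how the three two-valued dimensions above yield eight tests, the matrix can be enumerated in a few lines of Python. This is only a sketch for the reader; the variable names are ours and it is not part of the Benchmark Factory configuration.

```python
from itertools import product

# The three dimensions of the small file test matrix, as described above.
directions = ["read", "write"]
block_sizes_kb = [1, 8]
transaction_types = ["random", "sequential"]

# Every combination of one value per dimension: 2 x 2 x 2 = 8 tests.
small_file_tests = list(product(directions, block_sizes_kb, transaction_types))

for direction, block_kb, transaction in small_file_tests:
    print(f"{transaction} {direction}, {block_kb}K-byte blocks")
```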
All of these write transaction tests were conducted with a write-through flag set in the Benchmark Factory software. This flag simulates an application forcing each write through the operating system's cache to disk.
Because many applications do not force a write to disk, we asked the vendor to recompile Benchmark Factory with the write-through flag turned off, and we reran the tests with the new benchmark software build.
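The difference between the two modes can be approximated in Python. The sketch below is not how Benchmark Factory implements its flag, which we have not seen; it substitutes a portable technique, flushing and calling `os.fsync` after each write, for a true write-through open mode. The function name `write_block` is hypothetical.

```python
import os

def write_block(path, data, write_through):
    """Append data to a file, optionally forcing it past the OS cache.

    Assumption: flush + os.fsync stands in for a write-through flag.
    """
    with open(path, "ab") as f:
        f.write(data)
        if write_through:
            f.flush()              # push Python's own buffer to the OS
            os.fsync(f.fileno())   # ask the OS to commit its cache to disk
        # Without write_through, the OS may hold the data in cache and
        # write it to disk later.
```

With `write_through=True` the call does not return until the operating system reports the data committed, which is why results with the flag set tend to reflect disk speed rather than cache speed.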
For the large file transfer tests, we combined reads and writes in the same tests to emulate the behavior of large file service operations. We used a mix of 90% reads and 10% writes. We then created a set of four tests to cover all combinations of transaction type (random/sequential) and block size (1K and 8K bytes). The large file transfer tests used a mix of 80% 500K-byte files and 20% 1M-byte files.
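To make the two mixes concrete, here is a minimal sketch that draws a sequence of operations with the stated proportions. It assumes the read/write choice and the file size are drawn independently, which the tool may or may not do; the function name and the seed are hypothetical.

```python
import random

def large_file_workload(n_ops, seed=0):
    """Return n_ops (operation, file_size) pairs using the stated mixes."""
    rng = random.Random(seed)  # fixed seed so the sequence is repeatable
    # 90% reads, 10% writes.
    ops = rng.choices(["read", "write"], weights=[90, 10], k=n_ops)
    # 80% 500K-byte files, 20% 1M-byte files (kept as labels, not byte counts).
    sizes = rng.choices(["500K", "1M"], weights=[80, 20], k=n_ops)
    return list(zip(ops, sizes))

workload = large_file_workload(10000)
reads = sum(1 for op, _ in workload if op == "read")
print(f"read share: {reads / len(workload):.2%}")
```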
All of these write transaction tests were conducted with the write-through flag set in the Benchmark Factory software. We reran the tests with the write-through flag turned off.
The benchmarking agent created the files each virtual user needed at the beginning of each test. We ran five iterations of each test with an increasing load of virtual users, starting at one and increasing to 200 in 50-user increments for the majority of our tests. However, for our sequential read/write tests we started at one and increased to 40 in 10-user increments. We did preliminary testing to establish the test parameters, then ran those parameters against each NOS.
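Read literally, five iterations starting at one user and ending at 200 (or 40) give the load steps below. The exact intermediate values are our assumption; the helper name `load_steps` is hypothetical.

```python
def load_steps(max_users, increment):
    """Start at one virtual user, then step up in fixed increments."""
    return [1] + list(range(increment, max_users + 1, increment))

print(load_steps(200, 50))  # [1, 50, 100, 150, 200]
print(load_steps(40, 10))   # [1, 10, 20, 30, 40]
```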
We graphed the results of each file test on a curve with five data points. The curves have a knee followed by a plateau. We averaged the data points in the plateau to yield the score for the test.
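The plateau average can be sketched as follows. How the plateau was identified is not stated above, so this example assumes it is the trailing run of points within 10% of the final point; that rule, the function name and the sample numbers are all hypothetical.

```python
def plateau_score(throughputs, tolerance=0.10):
    """Average the trailing points that lie within `tolerance` of the last one."""
    last = throughputs[-1]
    plateau = []
    # Walk backward from the heaviest load until a point falls off the plateau.
    for value in reversed(throughputs):
        if abs(value - last) <= tolerance * last:
            plateau.append(value)
        else:
            break
    return sum(plateau) / len(plateau)

# Five made-up data points: a steep rise, a knee, then a plateau.
print(plateau_score([10, 60, 100, 102, 98]))
```

In the sample curve, the first two points sit on the rising part of the curve and are excluded; only the last three contribute to the score.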
We normalized the raw scores for each of the 20 file tests and then factored those normalized scores together to obtain a file benchmark score.
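The normalization and combining steps are not spelled out above, so the following is only one plausible reading: each raw score is divided by the best score any NOS achieved on that test, and the normalized values are then averaged across tests. The function names and sample figures are hypothetical.

```python
def normalize(raw_by_nos):
    """Scale one test's raw scores so the best NOS scores 1.0 (assumed scheme)."""
    best = max(raw_by_nos.values())
    return {nos: score / best for nos, score in raw_by_nos.items()}

def file_benchmark_score(tests):
    """Average normalized scores over a list of {nos: raw_score} dicts."""
    totals = {}
    for raw_by_nos in tests:
        for nos, value in normalize(raw_by_nos).items():
            totals[nos] = totals.get(nos, 0.0) + value
    return {nos: total / len(tests) for nos, total in totals.items()}

# Two made-up tests and two made-up products.
sample = [{"nos_a": 100, "nos_b": 50}, {"nos_a": 30, "nos_b": 60}]
print(file_benchmark_score(sample))
```

Normalizing first keeps a test with large raw numbers from dominating the combined score.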
To test the network performance of each NOS, we used Ganymede Software's Chariot software, which differs from the Benchmark Factory software in that all file transactions occur in memory, so the disk subsystem is not used. We used Chariot for two purposes: to compare the efficiency of the NIC drivers and TCP/IP stacks, as measured by the number of operations each NOS could perform before the processor became the bottleneck, and to compare the baseline throughput of the servers.
For the TCP stack test, we used only two NICs on the server and disabled the other two. We set the IP subnet mask on all the machines to 255.255.252.0, which put 100.0.1.x and 100.0.2.x in the same subnet.
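The effect of the mask can be checked with Python's standard `ipaddress` module. The host addresses ending in .10 are hypothetical examples; only the mask and the two address ranges come from the test setup.

```python
import ipaddress

MASK = "255.255.252.0"

def subnet_of(address, mask=MASK):
    """Return the network that `address` belongs to under `mask`."""
    return ipaddress.ip_network(f"{address}/{mask}", strict=False)

a = subnet_of("100.0.1.10")
b = subnet_of("100.0.2.10")
print(a, b, a == b)  # both fall in 100.0.0.0/22

# With an ordinary 255.255.255.0 mask the two ranges would be separate subnets.
print(subnet_of("100.0.1.10", "255.255.255.0") == subnet_of("100.0.2.10", "255.255.255.0"))
```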
We set up several bidirectional streams of short TCP file transfers from each of the clients to the server. A TCP session was built and torn down for each 3K-byte file transferred. This put a heavy load on the processors. Because the processors are the bottleneck, this test indicates the efficiency of the TCP stack and NIC driver for each NOS. We ran this test for 10 minutes and recorded the aggregate throughput value for all the streams.
We ran a second Chariot test to measure the average aggregate throughput of all four NICs on the server. The bidirectional streams between the Chariot endpoints were configured as a long TCP session with a large file size of 10M bytes. The tests opened a TCP session once when they began and then sent files for the duration of the test. The session was not closed until the end of the test. We ran this test for 10 minutes to get an average aggregate throughput measurement. We averaged the short and long TCP file transaction results to get one number measured in megabits per second. This number was normalized to obtain the score for the network test.
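The arithmetic behind the combined network figure is simple enough to sketch. The per-stream values below are invented for illustration, the function name is hypothetical, and the final normalization step is omitted because its method is not described.

```python
def network_result_mbps(short_streams, long_streams):
    """Average the aggregate throughput of the short- and long-session tests."""
    short_aggregate = sum(short_streams)  # total across all short-session streams
    long_aggregate = sum(long_streams)    # total across all long-session streams
    return (short_aggregate + long_aggregate) / 2

# Made-up per-stream throughputs in megabits per second.
print(network_result_mbps([10, 12, 8], [90, 95, 85]))
```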
We also took a qualitative look at each NOS's management tools, security measures, stability and fault-tolerance features, installation process and documentation.
We evaluated the usability of the overall management interface and how each product handled server monitoring, client administration, and file, print and storage management. We evaluated the scalability of each NOS based on its symmetric multiprocessor ability, failover clustering support and load-balancing clustering ability. For our security evaluation, we examined password file encryption, password and user ID encryption over the network, and any advanced security features offered. For stability and fault tolerance, we looked at each product's software RAID capabilities, backup and restore utilities, and memory protection.
Bass is the technical director and Robinson is a senior technical staff member at Centennial Networking Labs (CNL) at North Carolina State University in Raleigh. CNL focuses on performance, capacity and features of networking and server technologies and equipment. They can be reached at john_bass@ncsu.edu and james_robinson@ncsu.edu.
