Multicast performance differentiates access switches

Once upon a time, layer-2 unicast performance tests would have produced by far the most important results, but that's changed. Measuring unicast throughput on all ports, once considered the acid test for access switches, is no longer a major differentiator.

Even in the most stressful test case – with a Spirent TestCenter traffic generator blasting minimum-length 64-byte frames at all switch ports – throughput was at or very close to line rate for all switches except D-Link's 3650.

We observed similar results when measuring throughput for 256- and 1,518-byte frames as well, both in layer-2 (switched) and layer-3 (IP forwarded) configurations. Throughput just isn't the differentiator it once was.
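For readers who want to see what "line rate" means in frames per second, the arithmetic can be sketched as below. This is a minimal illustration, not the test tool's own calculation: it assumes gigabit access ports (the port speed is a parameter) and the standard 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and minimum interframe gap). The function names are hypothetical.

```python
# Bytes on the wire per frame that are not counted in the frame length:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte minimum interframe gap.
WIRE_OVERHEAD_BYTES = 20


def line_rate_fps(link_bps: float, frame_bytes: int) -> float:
    """Theoretical maximum frames per second on one Ethernet port."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_per_frame


def percent_of_line_rate(measured_fps: float, link_bps: float, frame_bytes: int) -> float:
    """Express a measured frame rate as a percentage of the theoretical maximum."""
    return 100.0 * measured_fps / line_rate_fps(link_bps, frame_bytes)


if __name__ == "__main__":
    GIGABIT = 1_000_000_000  # assumed port speed, in bits per second
    for size in (64, 256, 1518):  # the three frame sizes used in the tests
        print(f"{size:>5}-byte frames: {line_rate_fps(GIGABIT, size):,.0f} frames/s per port")
```

Under these assumptions a gigabit port tops out at roughly 1.49 million 64-byte frames per second, which is why minimum-length frames are the most stressful case: the switch must make the most forwarding decisions per second.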

After we completed testing, D-Link objected to our methodology, complaining that it isn't indicative of real-world conditions. We take D-Link's point, and hope no network manager would consider running a production network at 99% utilization or above. But we've heard this complaint many times before, and believe it misses the point. No one ever claimed that industry-standard throughput testing practices use "real-world" traffic patterns (never mind that "reality" differs vastly from network to network). Rather, the goal is to determine the limits of switch performance.

Multicast group capacity

If unicast performance didn't differentiate products, multicast performance certainly did. We assessed multicast by measuring group capacity, and layer-2 and layer-3 multicast throughput and latency. Multicast group counts turned out to be major differentiators, not just in the capacity tests but also in the throughput and latency tests.

The goal of the group capacity tests was to determine the maximum number of IGMPv3 multicast groups each switch could handle. This is a key measure of multicast scalability: The more groups a switch can track, the more users can do with multicast.

Since this is an access switch test, we configured each device in layer 2-only mode and enabled IGMP snooping. Then we configured the Spirent TestCenter traffic generator/analyzer to join some number of groups, and measured whether the switch would forward traffic to all groups without flooding (see "Breaking the standards").
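One way to home in on a maximum group count is a binary search over trial sizes. The sketch below is an assumption about how such a search could be organized, not a description of the procedure actually used in the lab. The `forwards_cleanly` callback is hypothetical: in a real test it would drive the traffic generator to join that many groups and report whether every group received traffic with no flooding. Here a simulated switch with a fixed table size stands in for it.

```python
from typing import Callable


def max_group_capacity(forwards_cleanly: Callable[[int], bool], upper_bound: int) -> int:
    """Return the largest group count in [0, upper_bound] that passes the trial.

    Assumes the behavior is monotonic: a switch that handles n groups
    without flooding also handles any smaller number.
    """
    lo, hi = 0, upper_bound
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the search always makes progress
        if forwards_cleanly(mid):
            lo = mid       # mid groups worked; the capacity is at least mid
        else:
            hi = mid - 1   # mid groups failed; the capacity is below mid
    return lo


def make_simulated_switch(table_size: int) -> Callable[[int], bool]:
    """Hypothetical stand-in for a real trial: passes while groups fit in the table."""
    def trial(groups: int) -> bool:
        return groups <= table_size
    return trial


if __name__ == "__main__":
    # The table size of 250 is an arbitrary illustrative value, not a measured result.
    print(max_group_capacity(make_simulated_switch(250), upper_bound=2000))
```

A binary search keeps the number of trials small (about 11 for an upper bound of 2,000), which matters when each trial involves joining groups and running traffic long enough to detect flooding.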

The results reveal lots of variation among products, with group capacity ranging from nearly 1,500 for HP's ProCurve to fewer than 70 for Dell's PowerConnect. For enterprises that will need 70 or fewer multicast groups for the life of the switch, this isn't an important distinction; for everyone else – and that certainly will cover most midsize and large enterprises, and many small ones as well – group counts do matter.

Tracking multicast group capacity

The capacity test focused only on maximum group count. When it came to measuring throughput and latency, the group counts supported by each switch were lower in some cases than in others.

In part the difference is explained by switch configurations. We measured layer-2 throughput and latency using more or less the same topology as in the group capacity tests. In the layer-3 tests we enabled protocol-independent multicast (PIM), a multicast routing protocol, essentially putting a router on every port. Judging from the supported group counts, where fewer than half the switches hit the 500-group mark, this is far more stressful on the device under test.

Not shown in the chart is the fact that it took multiple software builds for some vendors to obtain these results. Our initial multicast tests of the Alcatel-Lucent, Dell, D-Link and Foundry switches with 500 groups led to lockups or reboots. All these vendors supplied software updates that led to more stable switches. However, as the results show, not all could be tested with 500 groups. If a switch could not hit the 500-group mark we'd set for throughput and latency testing, we tested layer-2 and layer-3 multicast throughput and latency at the switch's maximum group capacity.

HP's ProCurve did support 500 groups, but with a twist: In L3 testing, it could use only two virtual LANs, IP subnets and PIM router instances, compared with 49 on all other devices. Clearly this limitation would rule out the use of this ProCurve switch in situations where more than two subnets and multicast routing instances are needed.

Several vendors observed that few customers today support 500 multicast groups at the edges of their networks. But we can argue that conditions may be changing. In some industries, notably financial services, it's already common to support dozens to hundreds of multicast group subscriptions for stock-quote applications.

Multicast scalability may not be a top priority in choosing network devices yet, but it's likely to become more important.

Switch jitters

Latency, or the length of time a switch buffers each frame, is also a key switch metric, more important than throughput for real-time applications such as voice and video. In fact, multicast throughput turned out to be a nonissue in our tests, with all products moving packets within 0.5% of line rate.

For unicast traffic, differences between products handling midsize frames were relatively minor, but average and maximum unicast latencies differed widely when switches handled minimum- and maximum-length frames (see Unicast Latency chart). In particular, Foundry's X448 exhibited unusually high average and maximum delays when handling large frames. The vendor says it hasn't seen this result in other tests, but it was very repeatable in our lab.

Multicast latencies varied much more, with a 500-fold difference between the lowest result and the highest – both from HP's ProCurve switch (see Multicast Latency chart). A big delta between average and maximum latency may indicate an issue with jitter, or latency variation, which can have an adverse effect on delay-sensitive applications such as voice and video. The HP and Alcatel-Lucent switches exhibited much greater variation than other switches between average and maximum multicast latency, with spreads of hundreds or thousands of microseconds. In contrast, all other switches held up traffic for at most 1 to 4 microseconds.
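The gap between average and maximum latency is simple to compute from per-frame measurements. The sketch below shows one way to do it. The function name and the sample values are hypothetical and are not taken from the test results.

```python
from statistics import mean
from typing import Dict, Sequence


def latency_summary(samples_usec: Sequence[float]) -> Dict[str, float]:
    """Summarize per-frame latencies (in microseconds).

    'spread' is the maximum minus the average, used here as a rough
    indicator of jitter, in the same sense as the delta discussed above.
    """
    avg = mean(samples_usec)
    worst = max(samples_usec)
    return {"avg": avg, "max": worst, "spread": worst - avg}


if __name__ == "__main__":
    # Hypothetical samples: one switch with steady delay, one with occasional spikes.
    steady = [10.0, 11.0, 10.5, 12.0]
    spiky = [10.0, 11.0, 10.5, 900.0]
    for name, samples in (("steady", steady), ("spiky", spiky)):
        s = latency_summary(samples)
        print(f"{name}: avg={s['avg']:.1f} max={s['max']:.1f} spread={s['spread']:.1f} usec")
```

A single large outlier raises the average only modestly but dominates the spread, which is why a big delta between the two numbers hints at latency variation rather than uniformly slow forwarding.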

The Alcatel-Lucent and HP switches also exhibited much higher latency for multicast than for unicast. Conversely, Foundry's X448 did far better with large-frame latency when handling multicast traffic. The traffic topologies differed in the unicast and multicast tests, making the comparison a bit unfair, but given that switches move unicast and multicast alike in silicon, we were surprised to see any differences.
