by David Newman, Network World Global Test Alliance

Filters on routers: The price of performance

Jul 14, 2003 | 13 mins
Cisco Systems, Computers and Peripherals, Networking

Access control doesn't have to be a throughput killer.

Setting filters on routers might be mandatory for access control and usage tracking, but suffering a performance hit is strictly optional.

We took six access routers from five vendors and loaded the devices with progressively larger numbers of filters and routes. Routers from ImageStream, Lucent, Riverstone and Tasman didn’t break a sweat, delivering essentially the same latency and throughput with hundreds of filters and large routing tables as they did with bare-bones configurations.

At the other end of the spectrum is Cisco’s 2651 router. It put up respectable baseline numbers, but performance plummeted once we added filtering. This is hardly a surprise: The 2651 is based on a single CPU and a scant 64M bytes of memory. Although we upgraded the 2651 to its maximum of 128M bytes of RAM, its aging design is no match for other routers in this test. All others use 256M bytes of RAM, and most use custom silicon such as network processors or field-programmable gate arrays to boot.

Cisco declined to participate in this review, saying users are interested in issues other than performance. Given Cisco’s dominant market share, we purchased Cisco 2651 routers for inclusion in this review. We also shared our methodology with Cisco, notified the company of our plans, upgraded the routers’ memory to be able to complete some tests and, as with all other test participants, informed Cisco of its product’s results before publication.

Companies use filters on access routers for all sorts of reasons: To keep unauthorized users or applications out, to track usage of authorized applications and to restrict access to the router. (See Internet Engineering Task Force Guidelines, and “Filtering Dos and Don’ts”.)

Getting the numbers

We measured the performance effect of filtering with three metrics: throughput, average latency and maximum latency (see How we did it). To determine routers’ ability to recover from failure, we also measured reboot times under load for each device.

Our test setup consisted of a pair of identical routers connected by two T-1 interfaces using crossover cables (see test diagram). The product configurations – routers with two T-1s and two Ethernet interfaces – are arguably the most commonly found devices in any corporation’s routing setup.

To determine the performance impact of filtering on this class of device, we began with a baseline case of no filters and no routing, and then added ever-larger numbers of filtering and routing conditions.

In the filtering cases, we asked vendors to configure one router with filters covering multiple conditions: source and destination IP address; protocol number; and TCP or User Datagram Protocol (UDP) port number. We asked vendors to set their last filter as the one we’d use for test traffic, forcing the routers to cycle through their entire filter list. Vendors also enabled logging, so we’d know how many packets “hit” each filter. Tests were run with eight, 16, 64 and 256 unique filters applied.
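Why the position of the matching filter matters can be shown with a minimal sketch. It assumes a router that walks its filter list in order and stops at the first match; the article does not describe any vendor's actual lookup method, and hardware-assisted designs may not work this way at all. The `Rule` type, its field names and the addresses are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Rule:
    # Hypothetical filter: a field set to None acts as a wildcard.
    src: Optional[str]
    dst: Optional[str]
    proto: Optional[int]   # IP protocol number, e.g. 6 = TCP, 17 = UDP
    port: Optional[int]    # TCP or UDP destination port
    action: str            # "permit" or "deny"

def matches(rule, pkt):
    """True if every non-wildcard field of the rule equals the packet's field."""
    return all(
        want is None or want == pkt[field]
        for field, want in (("src", rule.src), ("dst", rule.dst),
                            ("proto", rule.proto), ("port", rule.port))
    )

def evaluate(rules, pkt):
    """Linear first-match lookup; returns (action, number of rules examined)."""
    for examined, rule in enumerate(rules, start=1):
        if matches(rule, pkt):
            return rule.action, examined
    return "deny", len(rules)  # assumed implicit deny when nothing matches

def build_list(n):
    """n - 1 non-matching filters followed by the one that matches test traffic."""
    decoys = [Rule(f"10.0.{i}.1", "10.1.0.1", 6, 1024 + i, "deny")
              for i in range(n - 1)]
    return decoys + [Rule("192.0.2.1", "198.51.100.1", 17, 5000, "permit")]

test_pkt = {"src": "192.0.2.1", "dst": "198.51.100.1", "proto": 17, "port": 5000}
for n in (8, 16, 64, 256):
    action, examined = evaluate(build_list(n), test_pkt)
    print(n, action, examined)
```

With the matching filter in last place, every test packet costs as many comparisons as there are filters. That is the worst case for a sequential lookup.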

In the routing test cases, we asked vendors not only to apply various numbers of filters but also to enable two routing protocols – Border Gateway Protocol (BGP) and Open Shortest Path First (OSPF).

We ran through the various numbers of filters with two routing scenarios, dubbed “small tables” and “big tables.” In the small-table case, we advertised reachability information for 64 networks each over BGP and OSPF. That’s the sort of table size a small or midsize business might run.

In the big-table case, we advertised 125,000 routes using BGP and 4,096 using OSPF. The first number represents the current size of the Internet “full table” – the total number of networks visible in the global Internet. The second number represents about 10% of the size of a Tier-1 ISP’s OSPF Area 0 network – the core of any OSPF network.

Holding the full Internet table might seem like a lot to ask of an access router. However, a growing number of corporations use multi-homed connections – BGP connections to different ISPs for redundancy – and their actual table size might be at least twice as large as the one we used.

First things fast

When it comes to throughput, there’s only one right result: line rate. Five of the six products achieved that result, no matter what the frame size, filter count or number of routes involved (see graphic).

This isn’t terribly surprising, considering the bottleneck in our test bed was two T-1s connecting the routers. In this era, when $400 desktop machines can push 100M bit/sec or more, the ability to move data at 3M bit/sec should be a given – even when the load consists entirely of minimum-sized packets, and the router carries a large number of filters and large routing tables.

The exception in this test was Cisco’s 2651. Like the other routers, it moved short packets at line rate, but only in the baseline case with no filters or dynamic routing applied. Once those features were turned on, throughput really tumbled. At best, throughput was less than 30% of the theoretical maximum once filtering was activated. With routing and filtering set, throughput in some cases dropped to just 7% of the two T-1s’ capacity.

Cisco’s documentation warns that access lists can have an adverse effect on performance. These results put numbers to that warning.

The 2651’s troubles in this test merit two other comments. First, in fairness to Cisco, it’s unlikely the company’s network design consultants would recommend the 2651 for any customer with requirements like ours. Then again, Cisco’s public relations recommended we use the lower-performing 1710 after reviewing our test methodology.

Second, the 2651 is supplied with 64M bytes of RAM, compared with 256M bytes for all the other routers in this review. With just 64M bytes of memory we couldn’t complete the big-routing test cases with the 2651, so we upgraded its RAM to 128M bytes – the most it will hold.

Of course, no production network carries a load consisting exclusively of 64-byte frames, so we also conducted tests with medium- and maximum-length frames. While some applications, such as voice over IP, consist mainly of short frames, the average frame length on many IP networks is somewhere in the 300-byte range, and many applications use maximum-length frames to move data. We tested with 256- and 1,518-byte frames.

When handling 256-byte frames, the Cisco 2651’s throughput again fell off from line rate, but not as badly as with 64-byte frames; this time the worst-case rate was 29% of the maximum, compared with 7% for short frames. The 2651 moved data at line rate with up to 64 filters configured, both without routing and with small routing tables. However, rates dropped to 29% of the theoretical limit when we loaded large routing tables on the 2651. All other devices moved traffic at line rate.

The 2651’s throughput with 1,518-byte frames was line rate in all cases, as it was for all other devices. This is hardly surprising given that frame rates for long frames are a tiny fraction of those for short frames.
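The gap in frame rates can be estimated with a rough calculation. The sketch below assumes 1.536M bit/sec of usable capacity per T-1 (24 channels of 64K bit/sec) and ignores WAN encapsulation overhead, so the figures are approximations and not the rates measured in the test. The constant and function names are illustrative.

```python
T1_PAYLOAD_BPS = 1_536_000   # assumed usable rate: 24 channels x 64 kbit/s
LINKS = 2                    # the test bed used two T-1s between the routers

def frames_per_second(frame_bytes, links=LINKS):
    """Approximate line-rate frame rate, ignoring WAN encapsulation overhead."""
    return links * T1_PAYLOAD_BPS / (frame_bytes * 8)

for size in (64, 256, 1518):
    fps = frames_per_second(size)
    print(f"{size:>5}-byte frames: {fps:8.1f} frames/sec "
          f"({fps / frames_per_second(64):.1%} of the 64-byte rate)")
```

Under these assumptions a router handling maximum-length frames makes only a few hundred forwarding decisions per second, compared with several thousand for minimum-length frames.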

The delay game

There might not have been major differences among products in throughput (Cisco excepted), but there certainly were when it came to latency.

In many ways, latency – the delay added by a device as a packet travels through it – is an even more important metric than throughput. Latency affects every packet, no matter how busy the network. Low and consistent latency is critical not just for voice and video applications, but for any application where response time matters. That definitely includes the roughly 90% of all Internet traffic that uses TCP. Because TCP requires timely acknowledgement of data, delays can lead to retransmissions or session loss.

Even at relatively slow T-1 rates, latency for 64-byte packets can theoretically be 500 microsec or less. The Riverstone 3000 and the Tasman 1004 stuck pretty close to that theoretical mark when we looked at average latency (see graphic), but that’s not indicative of what we saw in other cases. Going across two routers, Cisco’s 2651, ImageStream’s Rebel and the Tasman 1400 registered average latency scores that were nearly double that.
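One way to arrive at a theoretical floor is the serialization delay: the time needed to clock a frame's bits onto a single T-1. The sketch below assumes 1.536M bit/sec of usable T-1 capacity and ignores encapsulation overhead, queuing and processing time. The article does not say how its own figure was derived, so this is only a plausibility check with illustrative names.

```python
T1_PAYLOAD_BPS = 1_536_000  # assumed usable T-1 rate: 24 channels x 64 kbit/s

def serialization_delay_us(frame_bytes, link_bps=T1_PAYLOAD_BPS):
    """Microseconds needed to put one frame's bits onto the link."""
    return frame_bytes * 8 / link_bps * 1_000_000

for size in (64, 256, 1518):
    print(f"{size:>5}-byte frame: {serialization_delay_us(size):8.1f} microsec")
```

For a 64-byte frame this gives roughly 333 microsec, comfortably under the 500-microsec mark. Under the same assumptions a 1,518-byte frame needs just under 8,000 microsec.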

There’s no one good answer as to how much latency is acceptable. Humans perceive degraded video quality with delays of as little as 10,000 microsec, and degraded audio quality with delays of 50,000 to 200,000 microsec. For data applications, the threshold might be higher (sometimes much higher).

But this doesn’t excuse a router that adds, say, 100,000 microsec of latency. Keep in mind that these numbers are delays perceived by end users – and there are many other components, such as other routers, attached computers and software stacks, in play, each adding delay of their own. Any router that adds enough delay to mess with application performance is a problem.

In the worst case, with a 256-rule access control list and large routing tables, maximum latency for the Cisco 2651 shot up to just over 261,000 microsec. Even at T-1 rates, that’s a huge delay. To put that number in perspective, consider that a beam of light could circle the Earth nearly twice in the time it took for a pair of 19-inch-wide Cisco boxes to forward one packet under our applied load.
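The comparison is easy to check. The sketch below uses the speed of light in a vacuum and an approximate equatorial circumference for the Earth; the function name is illustrative.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in a vacuum
EARTH_CIRCUMFERENCE_KM = 40_075.0   # equatorial circumference, approximate

def laps_around_earth(delay_microsec):
    """How many times light could circle the Earth during the given delay."""
    distance_km = delay_microsec / 1_000_000 * SPEED_OF_LIGHT_KM_S
    return distance_km / EARTH_CIRCUMFERENCE_KM

print(f"{laps_around_earth(261_000):.2f} laps")
```

At 261,000 microsec the result is a little under two laps.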

Cisco’s high maximum-latency numbers led us to take two steps. First, the charts’ Y axis had to be broken so we could show differences among other routers. Second, additional tests of the 2651 were run to determine if the maximum numbers are outliers.

In absolute terms, they are outliers: More than 99% of all frames have latencies of 10,000 microsec or less. However, there are significant numbers of frames with latencies of many thousands of microseconds. These numbers can result in degraded performance, especially for delay-sensitive voice and video applications. Jitter – variation in delay among packets – is even more harmful to voice and video applications than high latency itself.
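A small sketch shows how a handful of slow frames can coexist with a low average. The sample values are invented purely for illustration and are not measurements from this test; the `summarize` helper is hypothetical.

```python
from statistics import mean

def summarize(latencies_us, threshold_us=10_000):
    """Average, maximum and share of frames at or under the threshold."""
    within = sum(1 for x in latencies_us if x <= threshold_us)
    return {
        "average": mean(latencies_us),
        "maximum": max(latencies_us),
        "share_within_threshold": within / len(latencies_us),
    }

# Hypothetical run: 995 fast frames, 4 slow ones and a single extreme outlier.
samples = [1_000] * 995 + [50_000] * 4 + [261_000]
stats = summarize(samples)
print(stats)
```

Here 99.5% of frames fall under the threshold and the average stays low, yet the maximum is more than two orders of magnitude higher. An average alone can hide this kind of jitter.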

Not all vendors exhibited big differences between average and maximum latency. ImageStream’s Rebel fared the best in this regard; across all tests, it had the smallest difference between average and maximum latency.

Riverstone’s 3000 also had little difference between average and maximum. It also posted the lowest average latency across all tests with short frames and the least amount of variation among different test cases. Tasman’s 1004 posted the next-lowest average latency, but its maximum latency was substantially higher than the Riverstone 3000’s in most test cases. In fairness, the Riverstone device is really a metropolitan edge device with considerably more processing power than the Tasman 1004 access device. That difference also shows up in pricing – Riverstone’s box costs at least twice as much as the Tasman device.

Lucent’s Access Point 1500 presented a special case. According to the router testing RFC, latency is supposed to be measured at the throughput level. However, the Lucent Access Point 1500’s buffers are so large that we exceeded line rate when measuring throughput – so much so that the routers continued to forward packets for up to 17 seconds after the test was stopped. This led to absurdly high latency readings. Lucent reduced the size of its buffers for this test but even so recorded higher maximum-latency readings compared with most other routers. Big buffers can be helpful, especially if most traffic is bulk data transfer, but users looking to deploy the Lucent routers for delay-sensitive applications might want to reduce buffer size. It definitely helped in our test case: Average latency measurements hovered around 2,000 microsec, third lowest in our 256-byte test case.

With 256-byte frames, average latency differences among products were somewhat less pronounced (see graphic). Riverstone’s 3000 and Tasman’s 1004 again had the lowest average latencies across all test cases. The ImageStream Rebel and Riverstone 3000 exhibited the least variation among different test cases, suggesting that latency won’t be affected no matter how many filters or routes are involved.

While average latencies improved with medium-size frames, Cisco’s 2651 again struggled with high maximum latencies. The worst-case measurement of just over 253,000 microsec was basically the same as the 64-byte case. The 2651 also registered the largest difference between average and maximum latency in all cases. Here again, the Cisco router’s latency is high enough to disrupt virtually any delay-sensitive application – and that’s without factoring in all the other elements that must process the traffic stream.

At the other end of the scale, Riverstone’s 3000 and ImageStream’s Rebel registered relatively little difference between average and maximum delay – an indication of these devices’ suitability for delay-sensitive applications. However, the Riverstone router’s maximum delays did spike upward when running large routing tables.

With maximum-length frames – the sort used in Web and FTP applications – most routers registered average latencies close to the theoretical minimum of 8,000 microsec (see graphic). Cisco’s 2651 was again the exception, with latencies that averaged 16,000 microsec across all tests. The lowest overall latency belongs to Tasman’s 1004, which delayed maximum-length frames across a pair of routers by an average of 9,400 microsec across all test cases.

When it comes to maximum latency, Cisco’s 2651 again posted by far the highest delays. Its numbers were more than 200,000 microsec in the test cases with 16 or more filters and large routing tables. Maximum delays for ImageStream’s Rebel, while an order of magnitude smaller than Cisco’s, were relatively high compared with the Rebel’s average delays. The vendor attributes the difference to a polling process used with an Intel chipset in its devices. We don’t consider the Rebel’s maximum delays high enough to have a significant negative effect on application performance.

Lucent’s Access Point 1500 also showed relatively high maximum latencies in a couple of the large-routing-table test cases. As a rule, however, most routers delayed packets by the same amount regardless of the amount of filtering or routing going on with the large frames. Lucent’s routers might have struggled a bit with our huge routing tables, but, in general, the delay they introduced was not enough to degrade performance of any application.

Reboot time

Availability is always a concern for any type of network device. While all the routers in this test offer features such as support for failover protocols and/or redundant components, it’s still possible that other factors can take a router out of circulation. Total loss of power is an obvious example – and one that all the redundancy in the world won’t help.

To assess how quickly devices recover from a loss of power, we loaded each router with the same line-rate stream of 64-byte traffic we used in the filtering tests. We did not set any filters or use dynamic routing, so we knew all routers were capable of forwarding traffic at line rate without loss.

Once a router began forwarding traffic, we used an automated power switch to power it off, wait 5 seconds and then power it up again. Because we offered traffic at a constant rate throughout the test, we were able to derive recovery time from frame-loss counts.
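The derivation is simple division: at a constant offered rate, every lost frame stands for a fixed slice of time. A minimal sketch follows, with a hypothetical function name and invented example numbers that are not taken from the test results.

```python
def recovery_seconds(frames_lost, offered_fps):
    """Downtime implied by a frame-loss count at a constant offered rate."""
    if offered_fps <= 0:
        raise ValueError("offered rate must be positive")
    return frames_lost / offered_fps

# Hypothetical example: 150,000 frames lost at 5,000 frames/sec offered.
print(recovery_seconds(150_000, 5_000))
```

Because frames are also lost while the power is off, a figure derived this way covers the whole outage, including the 5-second wait.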

Lucent’s Access Point 1500 recovered the fastest, at just 23.6 seconds. The closest competitor, ImageStream’s Rebel, came back to life after 64.5 seconds of downtime. The Tasman 1004 and the Tasman 1400 took 83 and 109.4 seconds, respectively, to come back. Riverstone’s box recovered in 123.7 seconds, while Cisco’s 2651’s recovery time was 146.7 seconds.

The simple and not terribly surprising conclusion from these tests is that Cisco’s aging design (a single CPU and limited memory) doesn’t perform at anywhere near the same level as more modern designs with more memory and hardware acceleration. Even at T-1 rates, there are performance differences among products – and fortunately for end users, numerous options to choose from as well.