Breaking the standards

By David Newman, Network World
November 06, 2006 12:00 AM ET
We are used to breaking products in lab testing, but this time we broke 802.11 itself. Our tests uncovered a design flaw in the Wi-Fi protocol that affects performance testing, not just for current 802.11a/g products, but possibly in upcoming 802.11n gear as well. As a result of our tests, an IEEE committee heard a proposal to recognize and fix the design flaw.

It's a common misperception that Wi-Fi is an inherently "lossy" medium. Wi-Fi is highly vulnerable to signal errors, but it compensates with built-in error checking and retransmission mechanisms. Even a huge error rate (say, 10% of all packets, the maximum allowed in 802.11) should still result in zero loss, because corrupted packets are retransmitted.

That's the theory. In practice, we found a deficiency in an 802.11 packet header that can lead to packet loss.

The physical layer convergence procedure (PLCP) header carries key information about each packet, such as its length and transmission rate. While the rest of an 802.11 packet has excellent error protection because of a 32-bit CRC field, the PLCP header has only a single bit for error checking, and that is nowhere near enough to protect against corruption.
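To see why one parity bit is so weak, consider a minimal Python sketch. It assumes the protected part of the header is 17 bits (rate, a reserved bit and length) covered by even parity, as in the OFDM PLCP header. The bit pattern and function names are made up for illustration; they do not come from our test setup.

```python
def parity_bit(bits: int) -> int:
    """Even-parity bit: 1 if `bits` has an odd number of 1s, else 0."""
    return bin(bits).count("1") % 2


def passes_parity_check(bits: int, parity: int) -> bool:
    """True when the received parity bit matches the recomputed one."""
    return parity_bit(bits) == parity


# Hypothetical 17-bit protected field (arbitrary illustrative value).
sent_bits = 0b10110001110000101
sent_parity = parity_bit(sent_bits)

one_flip = sent_bits ^ 0b1    # channel noise flips one bit
two_flips = sent_bits ^ 0b11  # channel noise flips two bits

# A single flipped bit changes the parity, so the receiver notices.
single_error_caught = not passes_parity_check(one_flip, sent_parity)
# Two flipped bits cancel out, so the corrupted header looks valid.
double_error_missed = passes_parity_check(two_flips, sent_parity)

print(single_error_caught, double_error_missed)
```

Any even number of flipped bits slips through the same way, which is why a corrupted length or rate value can be accepted as genuine.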

Weak PLCP error checking can fool an 802.11 receiver into believing that it never received packets, even after a transmitter goes through multiple retry attempts.

For example, suppose an 802.11g transmitter sends a 100-byte packet at 54Mbps, and that channel noise corrupts the PLCP header. The corrupted header can convey bogus values, such as telling the receiver the packet is 4,095 bytes long and is being sent at 6Mbps.

An uncorrupted packet would take just 36 microseconds to transmit, but in this case the corrupted PLCP header will cause the receiver to keep listening for the packet for 5,484 microseconds. The receiver is literally off the air for that long period, causing it to miss multiple retry attempts and give up on the packet as lost.
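Both figures can be reproduced with a short calculation. The sketch below assumes standard OFDM timing (a 16-microsecond preamble, a 4-microsecond header symbol and 4-microsecond data symbols), with 16 service bits and 6 tail bits added to the payload. It treats the packet length as the full length carried in the header and ignores the extra signal extension that 802.11g adds after each frame. The function name is illustrative.

```python
def ofdm_airtime_us(length_bytes: int, rate_mbps: int) -> int:
    """Approximate OFDM frame duration in microseconds (assumed timing)."""
    PREAMBLE_US = 16   # training sequence
    HEADER_US = 4      # one symbol carrying rate, length and parity
    SYMBOL_US = 4      # duration of each data symbol
    SERVICE_BITS = 16  # prepended to the payload
    TAIL_BITS = 6      # appended to the payload

    bits_per_symbol = rate_mbps * SYMBOL_US
    data_bits = SERVICE_BITS + 8 * length_bytes + TAIL_BITS
    symbols = -(-data_bits // bits_per_symbol)  # ceiling division
    return PREAMBLE_US + HEADER_US + symbols * SYMBOL_US


actual_us = ofdm_airtime_us(100, 54)  # what the transmitter really sent
bogus_us = ofdm_airtime_us(4095, 6)   # what the corrupted header claims

print(actual_us, bogus_us)  # 36 5484
```

Under these assumptions the receiver stays deaf for roughly 150 times the real packet's airtime.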

This perceived loss makes it harder to get an accurate read on device performance. It's standard practice in throughput and latency tests to tolerate zero dropped packets. Because weak error checking in the PLCP header introduces packet loss, lower throughput rates are a likely result.

Weak error handling also can affect roaming tests. If a receiver misses an Extensible Authentication Protocol handshake packet during a roaming event, it can take 30 seconds before the RADIUS handshake begins again. We saw some 30-second roaming times in our tests because of this issue.

The probability of PLCP corruption with short packets and high rates is around one in 1,000. Because performance tests inevitably involve far more than 1,000 packets, results easily can be skewed downward by corrupted PLCP headers.
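A rough calculation shows how quickly that one-in-1,000 figure catches up with a test run. The sketch assumes header corruption strikes each packet independently, which is a simplification, and it counts corrupted headers rather than lost packets, since not every corrupted header ends in a loss. The function name is illustrative.

```python
def prob_at_least_one_corruption(packets: int,
                                 per_packet_prob: float = 1 / 1000) -> float:
    """Chance that at least one of `packets` headers is corrupted,
    assuming each packet is hit independently with `per_packet_prob`."""
    return 1.0 - (1.0 - per_packet_prob) ** packets


for packets in (1_000, 10_000, 100_000):
    chance = prob_at_least_one_corruption(packets)
    expected = packets / 1000  # average number of corrupted headers
    print(f"{packets:>7} packets: {chance:.4f} chance, ~{expected:.0f} expected")
```

With these assumptions, a run of only 1,000 packets already has about a 63% chance of seeing at least one corrupted header, and at 10,000 packets it is a near certainty.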

We compensated for this issue by setting an acceptable loss threshold of 0.1% in our throughput tests. We're not crazy about allowing loss in throughput tests. It's a violation of RFCs 1242 and 2544, and it's a common dodge used by vendors of poorly performing products.
