Are your pipes too big?

The problem with Long Fat Networks


Solutions

Do you see the problem? In some cases, TCP sacrifices performance for the sake of reliability, particularly when latency and/or bandwidth is relatively high. But is it possible to achieve both performance and reliability? Can we have our cake and eat it too? 


Yes, and we’ll look at the options in a moment. But first, let’s look at the two obvious but unrealistic solutions:

1) Decrease latency. If we could decrease the amount of time it takes for a bit to make it from one side of the network to the other, computers wouldn’t have to go get a cup of coffee every time they send a TCP window’s worth of data. But until someone comes out with the Quantum Wormhole router, or figures out how to increase the speed of light or bend space, you’re probably stuck with the latency you’ve got.

2) Decrease bandwidth. If we turn a Fat network into a Skinny network without changing the latency or the TCP window size, a single session moves the same amount of data per round trip over a smaller pipe, so it stands to reason that link utilization would go up. But I recommend thinking twice before bringing this option up with your colleagues (“What, you want less bandwidth?”).
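To put rough numbers on these two non-solutions, here is a minimal Python sketch. It assumes the idealized model in which a single TCP session sends at most one window per round trip. The 80 ms round-trip time and the three link speeds are made-up values for illustration, not figures from the earlier example.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Idealized ceiling for one TCP session: one full window per round trip."""
    return window_bytes * 8 / rtt_seconds


def link_utilization(window_bytes: int, rtt_seconds: float, link_bps: float) -> float:
    """Fraction of the link a single session can fill, capped at 100%."""
    return min(1.0, max_tcp_throughput_bps(window_bytes, rtt_seconds) / link_bps)


WINDOW_BYTES = 65_535   # largest window without window scaling
RTT_SECONDS = 0.080     # hypothetical round-trip time

# Same window, same latency, progressively skinnier pipes.
for link_bps in (1_000_000_000, 100_000_000, 10_000_000):
    utilization = link_utilization(WINDOW_BYTES, RTT_SECONDS, link_bps)
    print(f"{link_bps / 1e6:>6.0f} Mbps link: {utilization:.2%} utilized")
```

The session's ceiling never changes in this model. Only the denominator does, which is why utilization climbs as the link shrinks, and why halving the round-trip time would double the ceiling.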

OK, now that we’ve got that out of the way, let’s look at the real solutions:

1) TCP window scaling. One might wonder why the TCP window size field is only 16 bits long, allowing a maximum window of 65,535 bytes. But remember that TCP’s reliability mechanism was designed in a day when data link bandwidth was measured in kilobits per second. Today, 10 Gbps links are common (and 40 Gbps and 100 Gbps links are becoming more common).

RFC 1323, titled “TCP Extensions for High Performance,” was published in 1992 to address some of the performance limitations of the original TCP specification in a world of ever increasing bandwidth. In particular, TCP option 3, the “Window Scale” option, addressed the 65,535 byte window size limitation. Rather than widening the window size field in the TCP header beyond 16 bits (and thus rendering it incompatible with existing implementations), option 3 carries a shift count, exchanged when the connection is set up, by which the TCP window size is bitwise shifted to the left. A value of 1 shifts the 16 bits to the left by 1 bit, doubling the window size. A value of 2 shifts the 16 bits to the left by 2 bits, quadrupling the window size. The maximum value of 14 shifts the 16 bits to the left by 14 bits, multiplying the window size by 2^14 (16,384) for a maximum window of roughly 1 GB.

Increasing the window size has the obvious benefit of allowing TCP to send more segments before pausing to wait for a response. However, this performance benefit comes with some risk, such as buffer issues and larger retransmits when segments are lost. Virtually every modern operating system in use today uses TCP window scaling by default, so if you’re seeing small window sizes on the network, you may need to do some troubleshooting. Are there any firewalls or IPS devices on the network stripping TCP options? Are hosts scaling back the window size due to buffers filling up or excessive packet loss?

2) Multiple TCP sessions. The problem described here applies to a single TCP session only. In the earlier example, Computer A’s TCP session was only utilizing 3.9% of the link’s bandwidth. If 25 computers were transmitting, each using a single TCP session, a link utilization of 97.5% (25 × 3.9%) could be achieved. Or, if Computer A were able to open 25 TCP sessions simultaneously, the same utilization could be achieved. This will almost never be a good solution to the problem at hand, but is included here for completeness.

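3) Different transport layer protocol. TCP isn’t the only transport layer protocol available. TCP’s unreliable cousin, the User Datagram Protocol, provides no guarantee of delivery and has no window to wait on, so a sender is free to transmit as fast as the link allows. The catch is that any reliability, ordering, or congestion control the application needs has to be built on top of it.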

4) Caching. Content caching utilizes proxies to store data closer to the client. The first client to access the data “primes” the cache, while subsequent requests for the same data are served from the local proxy. Content caching is a band-aid solution that is becoming increasingly obsolete in an age of constantly changing and dynamically created content, but it is still worth mentioning.

5) Edge computing. Paid services like Akamai decentralize content and push it to the edges of the network, as close to the clients as possible. One of the results is lower latency between clients and servers.

6) WAN optimization and acceleration. Products from companies like Riverbed and Silver Peak, or open source alternatives like TrafficSqueezer, employ various techniques such as data deduplication, compression, and dictionary templating to increase the perceived performance of a WAN link.
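The arithmetic behind the first two options, window scaling and multiple sessions, can be sketched in a few lines of Python. The function names are invented for this illustration, the 10 Gbps link and 80 ms round-trip time are hypothetical, and the session math assumes the ideal case in which parallel sessions share the link without getting in each other's way.

```python
MAX_WINDOW_FIELD = 0xFFFF   # 16-bit window size field: 65,535
MAX_SHIFT = 14              # largest shift count the window scale option allows


def effective_window(field_value: int, shift: int) -> int:
    """Window in bytes after left-shifting the 16-bit field by the shift count."""
    if not 0 <= field_value <= MAX_WINDOW_FIELD:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return field_value << shift


def min_shift_for(target_window_bytes: int) -> int:
    """Smallest shift count whose maximum window covers the target."""
    for shift in range(MAX_SHIFT + 1):
        if effective_window(MAX_WINDOW_FIELD, shift) >= target_window_bytes:
            return shift
    raise ValueError("target exceeds the largest scaled window")


def aggregate_utilization(sessions: int, per_session: float) -> float:
    """Ideal-case utilization when identical sessions share one link."""
    return min(1.0, sessions * per_session)


print(effective_window(MAX_WINDOW_FIELD, 1))    # 131070
print(effective_window(MAX_WINDOW_FIELD, 14))   # 1073725440

# Bytes in flight needed to fill a hypothetical 10 Gbps link at 80 ms RTT.
link_bps, rtt_ms = 10_000_000_000, 80
bytes_in_flight = link_bps * rtt_ms // 8000
print(bytes_in_flight, min_shift_for(bytes_in_flight))   # 100000000 11

# 25 sessions at 3.9% each, as in the multiple-sessions example.
print(f"{aggregate_utilization(25, 0.039):.1%}")
```

Real stacks choose the shift count from their buffer limits rather than from a target like this, so treat `min_shift_for` as a way to sanity-check what you see in a packet capture, not as a description of what any operating system does.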

Conclusion

Earlier, I said the company in question here had too much bandwidth. Well, that wasn’t really the case. Their real problem was the behavior of the protocol they were using. One of their system admins, at the direction of a software developer, had monkeyed with the configuration of their servers in an attempt to tune application performance. The application wasn’t able to pick up packets from the buffer fast enough (due to a software bug), causing TCP to scale back the window size, but not before buffers had filled, packets had been dropped, and retransmits had begun. In an attempt to eliminate the retransmits, this admin had statically set the TCP window size on the servers to a relatively low value. Since TCP was being artificially constrained, it was never able to scale up to fill the available bandwidth of the data link.
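How the window was pinned on those servers isn't described here, so the following is only a hypothetical illustration of the same kind of constraint. An application can request a fixed receive buffer with the standard `SO_RCVBUF` socket option, and the window a socket advertises can't exceed the buffer behind it. The 16 KB figure is an arbitrary example value, and the size the operating system actually grants varies by platform.

```python
import socket


def pinned_receive_buffer(requested_bytes: int) -> int:
    """Request a fixed receive buffer on a new TCP socket and report what the OS granted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Set before connect() or listen() so the size applies to the connection.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


requested = 16 * 1024   # arbitrary small value for illustration
granted = pinned_receive_buffer(requested)
print(f"requested {requested} bytes, OS reports {granted} bytes")
```

On Linux, setting this option explicitly is generally understood to turn off receive buffer autotuning for that socket, which is the same trade made here: fewer surprises, but a ceiling on throughput that no amount of bandwidth will lift.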

Today, barring misconfiguration, most networks won’t run into LFN problems because TCP window scaling is widely used by default. However, with bandwidth ever on the rise, performance will most certainly become more and more of an issue because latency is fixed (unless you figure out how to bend space-time), and we are likely to see similar symptoms. 

Heder, CCIE No. 24788, is a network architect with NES Associates in Alexandria, Va., specializing in large-scale network design. Heder holds a master's degree with a concentration in network architecture and design, and has a patent filed for an IPv6 technology. He can be reached at brian.heder@gmail.com.


Copyright © 2014 IDG Communications, Inc.
