Fast relief for slow Web sites
Tool kit for improving Web site performance includes packet shapers, caching appliances and load balancers.
There's much more to a successful corporate electronic commerce site than meets the eye.
While captivating content and flashy graphics may grab the attention of Web site visitors, it's the underlying network services that keep buyers and visitors coming back. The hallmarks of any successful e-commerce Web site are crisp response times, reliable service and consistency for the user experience.
A plethora of products is available today to help you move traffic effectively in and out of your site. The products fall into three basic categories: those that shape Web traffic; those that cache repetitive Web-based data; and those that balance the load across your Web servers.

We tested products from each of these categories. Our intent was not to compare them on a head-to-head basis but to examine how the functionality provided by each type of Web traffic management tool could best be used to offer customers the quickest, most consistent and most reliable interaction with an e-commerce site.
We focused on how the products can be applied to serve companies that use a single T-1 line to support incoming traffic to an e-commerce Web site. A T-1 is the typical bandwidth increment that organizations buy. Unless you represent one of the rare network shops that can afford to spring for a pricey T-3 circuit, chances are you need to care about how bandwidth is carved up for Web-based traffic.
What emerged from these technology evaluations, as chronicled in the sections that follow, is proof that each of these deployment choices can improve response times and assure greater service availability for site users. Traffic shaping, for instance, alleviates bottlenecks on LAN links that can be overwhelmed by noncritical traffic such as a large e-mail blast. With traffic shaping in place, you can deploy caching products so you don't waste bandwidth by sending repetitive requests over the Internet when they can be serviced locally. Finally, a load-balancing tool can help provide consistent response times, fault tolerance and high availability of services.
But while we agree that each of these options offers some advantages by itself, we recommend that you use a combination of all three so your customers get the best bang for their buck and keep coming back.
Traffic shapers
Of the building blocks we evaluated, traffic shapers stand out as a crucial first step in managing bandwidth so noncritical applications don't choke the pipes leading to Web servers and thwart session setup for e-commerce transactions.
Organizations committed to e-commerce often deploy powerful multiprocessor servers and heavy-duty switches to ensure the availability of services offered across the Internet. Yet all this back-end computing power is often at the mercy of the connection that serves as the front door to a corporate Web site. Whether that front door is a single T-1, a series of aggregated T-1s or even a T-3, if the pipe becomes overwhelmed by the sheer volume of traffic hitting it, all your fancy switching and server farms will be for naught.
That means you need to take proactive control of your bandwidth in order to provide some level of bandwidth guarantees for high-priority traffic. Traffic shapers can govern the use of that link and avoid a spike, even if it comes just once a month.
An organization with a bandwidth-limited WAN link to its primary Web site, for instance, can decide how much bandwidth each application or user receives. Without such controls in place, router or switch buffers overflow and arbitrarily discard packets until the session protocols detect this and back off. In cases in which the most demanding traffic winds up being the least mission-critical (such as a stream of FTP downloads), the bandwidth-intensive traffic prevents more business-critical information, such as order entry data, from getting through.
The players: We evaluated two types of packet-shaping implementations. The so-called buffer managers (a.k.a. bit dribblers) were represented in our tests by NetReality's WiseWan product. Packet shapers that support TCP/IP session rate control were represented in our evaluation by Packeteer's PacketShaper.
NetReality's WiseWan 200 - which supports leased-line T-1 and frame relay connections - sits on the outside of the router and lets network managers build traffic queues that are serviced by the packet-shaping device. WiseWan 200 services traffic queues while endstations send data that fills the queues. We found that WiseWan 200 can be configured to apportion bandwidth to traffic based upon protocol type. WiseWan 200 doesn't completely starve out low-priority traffic; instead, it scales back the bandwidth available to it in favor of higher-priority streams. In effect, WiseWan 200 lets data from each queue dribble out onto the network at a different rate.
By residing outside the internal network, WiseWan 200 can groom traffic streams entering the network. The downside to this deployment option is that such a device offers little control over internal network traffic.
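The queue servicing that buffer managers perform can be illustrated with a minimal Python sketch. It is not NetReality's implementation; the function name, the per-protocol queue names and the weights are assumptions chosen for illustration. Each service interval releases a nominal packet budget, split across queues in proportion to weight, and every non-empty queue gets at least one packet so low-priority traffic is scaled back rather than starved.

```python
from collections import deque


def service_interval(queues, weights, budget):
    """Release packets for one service interval.

    queues  -- dict of protocol name -> deque of waiting packets (hypothetical layout)
    weights -- dict of protocol name -> relative priority weight (assumed values)
    budget  -- nominal number of packets the link can carry in this interval
    """
    total_weight = sum(weights[name] for name in queues)
    released = []
    for name, queue in queues.items():
        # Proportional share, with a floor of one packet so no queue is starved.
        # The floor means the budget is nominal and can be slightly exceeded.
        share = max(1, budget * weights[name] // total_weight)
        for _ in range(min(share, len(queue))):
            released.append(queue.popleft())
    return released


# Two backlogged queues; HTTP is weighted three times as heavily as FTP.
queues = {
    "http": deque(("http", i) for i in range(20)),
    "ftp": deque(("ftp", i) for i in range(20)),
}
weights = {"http": 3, "ftp": 1}
sent = service_interval(queues, weights, budget=8)
http_sent = sum(1 for proto, _ in sent if proto == "http")
ftp_sent = sum(1 for proto, _ in sent if proto == "ftp")
```

With both queues backlogged, the FTP queue still drains, only more slowly than the HTTP queue.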
The class of traffic shapers represented in our tests by PacketShaper sits inside the private network just before the router. PacketShaper traps the TCP acknowledgements for incoming datastreams and decides how long to hold each one, delaying its receipt by the sender. It does this by digging into TCP/IP packets, identifying application traffic types, IP sockets and specific IP address pairs, and then distributing bandwidth according to the different parameters.
This process acts as a governor of sorts, so the transmitting device throttles back traffic, opening bandwidth for high-priority traffic types. Because PacketShaper doesn't buffer traffic like WiseWan 200, if a traffic burst arrives, the Packeteer product calculates the delay until another burst hits and apportions bandwidth accordingly. One benefit of a packet shaper that sits on an internal network is that it can apply policy decisions to local TCP session traffic.
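A rough model shows why holding acknowledgements slows a sender. A TCP sender can have at most one window of data outstanding per round trip, so its throughput is roughly the window size divided by the round-trip time; stretching the round trip lowers the rate. The sketch below computes the extra acknowledgement delay needed to hold a session to a target rate. It is a simplified model under that assumption, not Packeteer's algorithm, and the function and parameter names are illustrative.

```python
def ack_delay_for_target_rate(window_bytes, rtt_seconds, target_bps):
    """Extra delay (seconds) to add to ACKs so throughput is about target_bps.

    Simplified model: throughput = window_bits / (rtt + added_delay).
    Returns 0.0 when the session is already at or below the target rate.
    """
    window_bits = window_bytes * 8
    required_rtt = window_bits / target_bps
    return max(0.0, required_rtt - rtt_seconds)


# An 8,192-byte window over a 50-millisecond round trip, capped at 256K bit/sec.
delay = ack_delay_for_target_rate(8192, 0.050, 256_000)

# The same session with a cap above its natural rate needs no added delay.
no_delay = ack_delay_for_target_rate(8192, 0.050, 1_544_000)
```

In this model, a session that would otherwise run at about 1.3M bit/sec needs roughly 206 milliseconds of added acknowledgement delay to hold it near 256K bit/sec.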
Our recommendations: Although WiseWan 200 and PacketShaper belong to the same device category, they are distinctly different products.
You should use WiseWan 200 if you want to manage your Web site's WAN access links between your router and your ISP. For instance, our hands-on evaluation found that you can manage frame relay links down to Forward Explicit Congestion Notifications and Backward Explicit Congestion Notifications, functionality that is otherwise invisible to a PacketShaper that sits inside of the Web site's router. Moreover, if data compression is important to you, then WiseWan 200 is a likely choice because it makes more sense to perform compression on a serial WAN link than inside a campus network.
By contrast, a product such as PacketShaper that sits inside the router and ships traffic to the router doesn't know whether that traffic is destined for the wide area or if it will remain local. Therefore, it cannot make an effective decision about whether to use compression.
But we do recommend that you use a product like PacketShaper when you need to apportion bandwidth to various outgoing traffic types because it can tell you what type of traffic your users are requesting and let you split off bandwidth just for those applications. For instance, if your site handles large amounts of FTP traffic, that traffic could choke other traffic types by consuming the entire T-1 as users download files. PacketShaper lets you cap the outgoing rate of traffic so e-mail, RealAudio or other traffic types get the bandwidth they need, too.
Caching
Once you deploy packet shapers to better manage the bandwidth coming into your Web site, there's one more step you need to take to cut down on bandwidth waste - the deployment of caching products.
Web caching speeds Web content delivery. Caching takes advantage of the fact that a group of users may want the same information repeatedly from the Web. Because so much content is static, it is pointless to retrieve a fresh copy across the slow multihop Internet. It's more efficient to store copies of Web data close to the requesting users to improve response time and limit bandwidth waste.
Internal user requests for Web data pass through the cache first. Requests are forwarded across the Internet to the Web servers only when the cache does not contain information to satisfy the client request. Under this scenario, you trade freshness of the data for better response time.
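The request path just described can be sketched in a few lines of Python. This is a toy model rather than any vendor's design: the class name, the counters and the `fake_origin` stand-in for a real HTTP fetch are all assumptions made for illustration.

```python
class ForwardCache:
    """Toy forward cache: serve from the local store, go to the origin only on a miss."""

    def __init__(self, fetch_from_origin):
        # callable(url) -> body; stands in for a real fetch across the Internet
        self._fetch = fetch_from_origin
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            self.hits += 1
            return self._store[url]
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body  # keep a copy for later requesters
        return body


origin_requests = []


def fake_origin(url):
    """Hypothetical origin server that records each request it receives."""
    origin_requests.append(url)
    return "content of " + url


cache = ForwardCache(fake_origin)
for _ in range(3):
    cache.get("http://www.example.com/catalog.html")
```

Three client requests produce a single trip to the origin; the other two are answered locally, which is where both the bandwidth saving and the risk of stale data come from.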
Alternatively, caching devices may be positioned close to your Web servers as a front-end processor, off-loading requests for frequently requested pages from the servers. This deployment option is sometimes referred to as "reverse caching." Because the pages in demand are cached, the caching engine simply returns the page to the requesting device, instead of dumping the page request to the Web server.
The players: Only two Web cache companies responded to our invitation - CacheFlow and Cobalt Networks.
Our testing turned up only subtle differences in caching products. One difference arises in how each product processes a Web page request.
CacheFlow's CacheOS hides the IP address of client stations requesting data from a Web server, revealing only the IP address of the caching device. By contrast, Cobalt's CacheRaQ stores the IP address of a requesting client and reveals it to the Web server in the "get" request. The upshot is that CacheRaQ can inadvertently reveal the IP addresses of internal clients to the outside world. While neither approach is right or wrong, exposing the internal IP address structure may raise some security issues.
Another notable difference between caching products is how they can handle an HTTP process referred to as serial retrieval. When a user requests Web content via a browser, literally dozens of round trips occur between the browser and the Web server. That occurs because each Web page consists of scores of objects, and each time a user requests a Web page, a TCP session is established for each object on the page, followed by an HTTP "Get" request. HTTP 1.1 improves upon this serial retrieval of Web objects by adopting a form of pipelining that retrieves objects in groups. However, HTTP 1.0 is still prevalent among Web sites.
In our evaluation of caching appliances, we were impressed with CacheFlow's attempt to bring the pipelining benefits of HTTP 1.1 to older HTTP environments.
CacheFlow's CacheFlow Series 500 uses a technique called Pipeline Retrieval to circumvent any such serial delay. This proprietary algorithm opens as many simultaneous TCP connections as permitted by the source server and retrieves objects in parallel. The objects are delivered to the client's desktop as fast as the browser can request them. In effect, Pipeline Retrieval looks ahead and downloads objects before the client's browser asks for them in order to provide faster access.
Timing is everything
Accessing cached Web pages to improve response time is one benefit of caching appliances, but such products also must guarantee the freshness of Web site data. A number of elements on a Web page may be time-sensitive or date-sensitive, such as stock prices, consumer pricing information or news content. That's why caching engines must contact the original server to determine if certain Web page elements have changed since they were last downloaded to cache.
The problem with this approach is that the request for a freshness check incurs latency - exactly how much depends on environmental factors such as prevailing Internet conditions or the load on the target server. For small objects, a freshness check is almost worthless because the number of frames and bytes transferred is roughly the same as for transferring the element itself. It's not unusual for a freshness check to add hundreds of milliseconds of latency because the caching engine must wait for the Web server to reply.
The caching industry has been unable to convince Web site developers to embrace what is known as "explicit expiration," which would effectively address this caching latency issue. The technique requires that developers tag Web elements with a freshness date.
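The difference between a tagged and an untagged object can be sketched as follows. The sketch assumes the freshness check is an HTTP conditional request (If-Modified-Since, answered with status 304 when nothing has changed), and it replaces the network with stub functions; `CachedObject`, `serve` and the two stub origins are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedObject:
    body: str
    last_modified: float              # origin's modification time when stored
    expires: Optional[float] = None   # explicit expiration, if the developer set one


def serve(entry, now, revalidate):
    """Return (body, contacted_origin) for one client request.

    revalidate -- callable(last_modified) -> (status, body, last_modified);
                  a stand-in for a conditional GET that answers 304 when the
                  object is unchanged and 200 with new content otherwise.
    """
    if entry.expires is not None and now < entry.expires:
        return entry.body, False      # tagged and still fresh: no round trip
    status, body, last_modified = revalidate(entry.last_modified)
    if status == 200:                 # object changed: replace the cached copy
        entry.body, entry.last_modified = body, last_modified
    return entry.body, True


def unchanged_origin(since):
    return 304, None, since


def changed_origin(since):
    return 200, "new price list", since + 60


tagged = CachedObject("price list", last_modified=1000.0, expires=2000.0)
untagged = CachedObject("price list", last_modified=1000.0)

tagged_result = serve(tagged, 1500.0, changed_origin)
untagged_same = serve(untagged, 1500.0, unchanged_origin)
untagged_new = serve(untagged, 1500.0, changed_origin)
```

Only the untagged object forces a round trip to the origin on every request, which is the latency cost described above.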
Without a global approach to time-stamping Web objects, caching vendors rely upon proprietary algorithms that examine cached content for freshness. CacheFlow uses a technique called Adaptive Refresh. The algorithm selectively refreshes Web objects based on their need to be refreshed. Object updates occur at a frequency dictated by the caching engine's capability to formulate a "model of use" and a "model of change" for any given object. Those pieces of information combine to develop a refresh pattern for the Web page. This approach is a proactive technique for determining freshness. One of the benefits of the proactive approach to Web content freshness is that it can be scheduled to take place during off hours, which helps reduce loading on WAN links during peak usage periods.
Other vendors' products, including Cobalt's CacheRaQ, employ a reactive algorithm for freshness checking. That is, the products field a request from a client, check locally cached objects and make a decision to retrieve a fresh object.
Our recommendations: Freshness issues aside, Web caching should be a standard capability built into any e-commerce site. Web caching is particularly important because in addition to saving bandwidth, it reduces latency. If there is any caveat with Web caching, it is to make sure that your cache of choice has a strong mechanism for ensuring that it's serving up fresh data. When our engineers requested page refreshes from our prototype network with the CacheFlow and Cobalt caching appliances, both devices intercepted the requests effectively and provided fresh content wherever necessary.
Load-balancing devices
Traffic shaping and Web caching help network managers gain better control over available bandwidth and reduce latency, but Web site managers also need to take steps to ensure Web server availability.
That's where load balancing enters the picture. At the back end of the e-commerce connection stands the Web server. A popular Web site can overwhelm a single Web server's processing capabilities. Load balancers distribute traffic among Web servers with identical or overlapping content. This approach reduces or eliminates server overload as the culprit of poor e-commerce response time.
Load balancers, in their simplest form, are traffic cops that police HTTP, FTP or other incoming traffic destined for Web servers. Load balancers intercept Web traffic before it reaches Web servers and determine which back-end server is best suited to provide optimal performance and the fastest response time to requesting users.
The players: We examined load balancers from Alteon WebSystems, Arrow Point Communications, Coyote Point Systems, F5 Networks, Foundry Networks, HydraWeb Technologies and RADWare.
Unlike caching appliances that are separated by subtle nuances in how they process page requests, load balancers vary greatly. For starters, they come in all shapes and sizes. Some vendors - including Alteon, ArrowPoint and Foundry - implement load balancers in a switch, which sits between the servers and the Internet connection. Other vendors - Coyote Point, F5 Networks and HydraWeb - implement the functionality in a PC with dual network adapters, one that connects to a hub that front ends the Web servers and the other that links to the Internet feed in a router or other device.
At first glance, PC-based load balancers may seem out of place in enterprise networks. However, our tests showed that is not the case. Because PCs can handle Fast Ethernet traffic on the I/O side, they can safely handle a T-3 worth of traffic.
While an Application Specific Integrated Circuit (ASIC)-based load-balancing switch will offer greater raw performance than a PC-based load balancer, the PC-based device can offer more functionality because the general operating system kernel offers standard programming interfaces to which third parties can write management utilities. So there is a basic trade-off in terms of functionality of PC-based load balancers vs. the raw processing power of an ASIC-based load-balancing switch.
Load balancers track a variety of health statistics on connected Web servers to determine which device carries the smallest load and offers the optimal response time to handle the transaction at stake. Some load balancers, including those from Alteon, Arrow Point, HydraWeb, F5 and RADWare, rely on relatively simplistic ping commands to measure server response time. Other load balancers, including those from Arrow Point and RADWare, can also request entire Web pages via telnet sessions or HTTP requests from servers and measure response times.
The downside of load balancers that support pings to conduct server health checks is that a server's HTTP daemon may fail while the server's network connection may still be alive and respond to a ping. This situation leads the load balancer to believe that the server can handle HTTP requests. This is why some load balancers favor approaches that gather more granular data.
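A health check that exercises the HTTP daemon itself, rather than just the network stack, might look like the sketch below. It uses Python's standard `http.client` module; the function name, the default path and the rule that only a 200 status counts as healthy are assumptions, not a description of any tested product.

```python
import http.client


def http_health_check(host, port, path="/", timeout=2.0):
    """Return True only if the Web server answers an HTTP GET with status 200.

    A server whose network stack is up but whose HTTP daemon has failed
    would still answer a ping, yet it fails this check.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()  # drain the body so the exchange completes cleanly
        return response.status == 200
    except (OSError, http.client.HTTPException):
        # Refused connection, timeout or malformed reply: treat as unhealthy.
        return False
    finally:
        conn.close()
```

A load balancer built this way would call the check periodically for each server and stop handing requests to any server that fails it.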
The HydraWeb 5000, for example, issues database queries, measures the response time of each query and uses that response time as one of five measured variables for each server to produce a "balanceability index." The product then uses the index to determine which server is best suited to handle certain requests.
Additionally, some products employ server agents that collect server statistics and feed them back to the load balancer. Agents provide much more granular information than what can be produced from a simple ping request. HydraWeb's agents, for instance, can sense server CPU utilization, which helps determine the number of requests each server can handle at any given time. The Arrow Point CS-100's agents can bore into transactional data to guarantee that session-oriented e-commerce transactions hit the right type of high-end server. For example, if the CS-100 detects that a request is for an active server page, the request is funneled to a server that can handle such a request.
While server agents provide detailed data, they also consume precious server CPU cycles. Moreover, server agent detractors point out that some intelligent agents are not yet available on a diverse set of server platforms. The case for agents comes down to how much depth of knowledge about your servers you need.
Once a load-balancing device has determined the load carried by each Web server it manages, it will distribute traffic according to algorithms. In our evaluation, we found that Arrow Point's CS-100, F5's Big/ip and Coyote Point's Equalizer all support round-robin balancing. In that approach, the load balancer hands off an equal number of requests to all available servers in a sequential order. Alteon's AceSwitch 180, Coyote Point's Equalizer, F5 Networks' Big/ip and RADWare's Web Server Director support a balancing approach that directs requests to the server with the fewest TCP connections. Arrow Point's CS-100 also supports a static weight balancing option that distributes requests to servers based upon an assigned weight - a Pentium 300 may field more requests than a Pentium 200, for instance.
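The three distribution approaches named above can be sketched in Python. These are textbook versions under simple assumptions, not the vendors' implementations; the function names, server names, connection counts and weights are all made up for illustration.

```python
import itertools


def round_robin(servers):
    """Hand requests to each server in turn, in a fixed sequential order."""
    return itertools.cycle(servers)


def least_connections(active_connections):
    """Pick the server that currently holds the fewest TCP connections.

    active_connections -- dict of server name -> open connection count
    """
    return min(active_connections, key=active_connections.get)


def static_weight(weights):
    """Repeat each server in the rotation as many times as its assigned weight."""
    rotation = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(rotation)


rr = round_robin(["web1", "web2", "web3"])
rr_picks = [next(rr) for _ in range(4)]

lc_pick = least_connections({"web1": 12, "web2": 3, "web3": 7})

# A faster server is assigned a higher weight than a slower one.
sw = static_weight({"pentium300": 3, "pentium200": 2})
sw_picks = [next(sw) for _ in range(5)]
```

Round-robin and static weighting ignore what the servers are doing at the moment; least-connections reacts to current load, which is why it depends on the balancer tracking open sessions.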
Any Web traffic management algorithm needs to be able to spot a faulty server. But "down" is a relative term in a server outage. While a load balancer may detect that a server is dead, can it detect a connection in which the TCP/IP stack is up, but the Web server software has crashed? Our evaluation of switches found that aside from an extreme scenario in which the server is totally CPU-bound, there is little consensus among vendors about other impairment conditions. Clearly, vendors will have to improve the capability for their load balancers to differentiate between total server failures and partial service failures.
Our recommendations: If your load balancing needs are simple, which is to say that you want an approximate level of sharing across all your servers and your environment consists of a low-speed LAN, then it is likely that most of the products we looked at would do the job. Likewise, if you have static content, a load balancer that offers round-robin distribution should work just fine.
But if your load-balancing needs require more granularity, you should look to products that provide a little extra intelligence by way of server agents. Additionally, if you want to support dynamic Web content, such as pages that are built on the fly, you need a load balancer that supports sticky HTTP connections.
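Sticky connections can be sketched as a lookup table in front of an ordinary balancing algorithm. In this minimal Python example the table is keyed on the client's IP address and new clients are assigned round-robin; both choices are assumptions for illustration, and the class and server names are hypothetical.

```python
import itertools


class StickyBalancer:
    """Send every request from a given client to the server it was first given."""

    def __init__(self, servers):
        self._next_server = itertools.cycle(servers)  # used for new clients only
        self._assignments = {}                        # client id -> server

    def route(self, client_id):
        if client_id not in self._assignments:
            self._assignments[client_id] = next(self._next_server)
        return self._assignments[client_id]


lb = StickyBalancer(["web1", "web2"])
first = lb.route("10.0.0.5")
second = lb.route("10.0.0.9")
again = lb.route("10.0.0.5")  # returns to the same server as the first request
```

Keeping a client on one server matters for pages built on the fly because session state, such as a shopping cart in progress, typically lives on the server that started the session.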
Changes to come
Traffic shapers, caching engines and load balancers can provide much-needed relief to improve Web site response times for end users. But we contend that you may be able to further improve performance by using more than one kind of device in the same deployment.
Caching engines placed in front of load balancers, for instance, can further reduce the demands on back-end servers by caching commonly requested pages or Web page elements. Also, the use of packet shapers can ensure that mission-critical traffic gets through the front door so load balancers can intercept the streams and direct them to the server best suited to handle the requests. Caching engines sitting just outside the load balancers will help off-load repetitive page requests from the servers, thus freeing them up for order entries and other mission-critical transactions.
Over the next few months, expect to see products emerge that offer integrated Web traffic acceleration services. Switch makers, such as Alteon, will fold caching or traffic-shaping features into their products. Cisco has already folded some load-balancing capabilities into its LocalDirector product. Even load-balancing vendors such as Arrow Point are beginning to offer policy shaping and traffic control capabilities on top of their other features.
We contend that you need to blend traffic shapers, caching appliances and load balancers in order to groom response times, improve the reliability of service and offer buyers a uniform experience time after time. Leave out any one of these emerging technologies from the equation and you will jeopardize your company's ability to deliver an effective e-commerce service.
