Don’t blame the network

News Analysis
May 24, 2004
Enterprise Applications

Testing software before a rollout helps curtail finger-pointing about poor performance.

It’s all too easy for users to blame the infrastructure when their access to corporate servers is slow. This is such old hat that at a recent company meeting, the network professionals at apparel retailer Gap showed off their own tongue-in-cheek creation – a song called “Blame the darn network.” In turn, the network folks often blame the application developers for building code that hogs bandwidth and makes everything else run like a dog.

“The complexity of applications is growing, and developers tend [not to] understand the nuances of application behavior in the real world. Now so many applications are mission-critical and IT operations are having to do more with fewer people – this is putting tremendous pressure on the [network operations center] guys,” says Dave Danielson, CEO of performance management vendor Altaworks.

Too often, software is tested within the confines of the lab but not in the real world, where multiple networks, servers and clients can introduce conditions the application was never built to handle.

The problem also could be cultural, Danielson says. Developers strive to write software quickly and efficiently, whereas the operations folks ensure the infrastructure is well managed and performs well. If their motives are different, how can the two groups provide services that are useful to their constituents?

Danielson says organizations need to foster a closer understanding of the focus and challenges of both teams. What’s more, he recommends testing the application before unleashing it to end users. This process helps the network specialists ensure there’s enough bandwidth for the software, and helps developers identify and fix weak spots in the software that could create problems on the live network.

Gap, for example, uses Opnet’s IT Guru and Application Characterization Environment (ACE) network modeling tools to test new software before it’s rolled out to production, says Jerry White, senior network engineer at Gap in San Francisco. IT Guru models the network, including routers, switches, protocols, servers and individual applications, while ACE provides detailed analysis of application packet traces and quick diagnosis of problems.

IT Guru is used as part of Gap’s Network Application Deployability Assessment program for testing software on a model network before deployment. Software managers and the network staff who will conduct the tests meet to discuss the application’s demographics – the number of users and their locations. “Once that has been determined we talk to the users to see what they want to get out of the study – is it response-time estimates? Mainly, it is ‘Will network ops sign off on our application?'” White says.

White recalls a time when Opnet showed Gap business users that throwing more bandwidth at a problem might not always be the right answer. The retailer rolled out a pilot at 19 branches in which sales associates could place customer orders from Gap’s online store using the browser on a cash register. Because the response time after a mouse click could be as slow as 45 seconds, the team responsible for store systems wanted to upgrade the 56K bit/sec connection between the store and the site.

A quick ACE model of a typical transaction at 128K and 256K bit/sec found that the faster links would shave only 5 seconds off the response time. Using the model, White and his colleagues traced the delay to large transaction sizes and slow processing on the cash register and the online store’s server. The company left the setup unchanged because traffic volumes were low and the performance gain wouldn’t justify the cost.
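The arithmetic behind that call is simple enough to sketch. The short Python model below is a back-of-the-envelope illustration, not Opnet’s calculation; the transaction size and processing time are assumed figures, chosen only to be roughly consistent with the response times White cites.

```python
# Back-of-the-envelope response-time model: serialization delay over the
# WAN link plus a bandwidth-independent processing delay. The transaction
# size and processing time below are assumptions, back-solved to roughly
# match the ~45-second response time described in the article.

TRANSACTION_BYTES = 45_000   # assumed size of one order transaction
PROCESSING_SECONDS = 38.5    # assumed cash-register + server processing time

def response_time(link_bps: int) -> float:
    """Estimated response time on a link of the given speed (bit/sec)."""
    transfer = TRANSACTION_BYTES * 8 / link_bps  # serialization delay
    return transfer + PROCESSING_SECONDS

for speed in (56_000, 128_000, 256_000):
    print(f"{speed // 1000}K bit/sec: ~{response_time(speed):.0f} sec")

# Prints roughly 45, 41 and 40 seconds: quadrupling the bandwidth shaves
# only about 5 seconds, because most of the delay is processing time.
```

Because the processing term dominates, a faster link barely moves the total – exactly the trade-off the ACE reports put in front of the business users.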

“Our data center is in Rocklin, Calif., and we have users in Asia and Europe. We can’t throw a ton of bandwidth at problems,” White says. “The ACE reports give users choices – this is what would happen if we upgraded the bandwidth. The reports help them set their own performance expectations. Could they put up with low response times, or do they want to spend the money and upgrade?”

White says the entire testing process – from first meetings to the issue of a report – takes about 40 man-hours, far less than the 200 to 400 man-hours that Altaworks’ Danielson says some companies spend resolving brownouts. Danielson knows of a Midwest insurance firm that keeps a SWAT team of seven kinds of IT specialists to trace performance problems on the live network instead of testing the network and applications at the pre-production stage.

But sometimes it’s still necessary to troubleshoot during production. White recounts an occasion when users at Gap headquarters suffered intermittent delays when accessing an application service provider (ASP)-hosted CRM application. The users connected to the ASP’s data center after passing through a proxy server at Gap’s data center.

“We used IT Guru to capture the application stream, and we found some interesting things. The application was one giant JavaScript program, and the clients were short on processing power,” White says. “Users wanted to blame the proxy server, but once we ran tests we found that it was not inducing delays.”

Gap fixed the problem by upgrading the slower client machines to 2GHz, setting users’ browsers to use HTTP Version 1.1 and adding bandwidth.
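The HTTP Version 1.1 change is the easiest of the three fixes to illustrate. The Python sketch below is a rough model of connection-setup overhead, not a measurement from Gap’s environment; the per-page object count and round-trip time are assumed values.

```python
# Rough illustration of why persistent HTTP 1.1 connections help over a
# long-haul link: each new TCP connection costs at least one handshake
# round trip before any data flows. The object count and round-trip time
# below are assumed values, not measurements from Gap's network.

OBJECTS_PER_PAGE = 40   # assumed scripts, images and frames per page load
RTT_SECONDS = 0.25      # assumed round trip to the ASP's data center

def handshake_overhead(persistent: bool) -> float:
    """Connection-setup time for one page load, counting handshake RTTs only."""
    connections = 1 if persistent else OBJECTS_PER_PAGE
    return connections * RTT_SECONDS

print(f"New connection per object (HTTP 1.0 style): "
      f"{handshake_overhead(False):.1f} sec of handshakes")
print(f"One persistent connection (HTTP 1.1):       "
      f"{handshake_overhead(True):.2f} sec of handshakes")
```

With a persistent connection the per-object handshake cost largely disappears; the faster clients and extra bandwidth then cover the JavaScript processing and transfer time.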

He says the testing tool has helped prevent finger-pointing among users, application developers and network operations specialists. But he thinks it would be even better if network professionals had more application knowledge. “If I knew what the application was doing, I could make recommendations, rather than just give them raw data. We could then be far more effective,” he says.