The benefits of converged network and application performance management

This vendor-written tech primer has been edited by Network World to eliminate product promotion, but readers should note it will likely favor the submitter’s approach.

A converged Application Performance Management (APM) and Network Performance Management (NPM) solution gives organizations actionable information to resolve the most challenging performance concerns in minutes — it doesn’t matter if the slowness originates in the network, infrastructure, application logic, servers, database or other areas that compromise performance.

The benefits include seeing how the system and network resources are serving all applications, and deep visibility at the code-level into how critical applications work. No longer will organizations need to juggle between multiple interfaces and different vendor tools to get to the root cause of an issue that is impacting users’ performance.

Today, most companies have a large collection of monitoring tools that were originally purchased to address specific point challenges or provide visibility based on their needs at the time. Point tools that look at one area as separate entities, such as applications, networks, databases or servers, were fine in the past, but in today’s environment they can be doing a disservice.

These days, the elements in an application delivery environment are so intertwined that it can be difficult to find what’s causing a performance problem without a converged view. It gets more complicated when you add in cloud technologies or third-party services, virtualization, and active optimization technologies.

A typical example: the IT department at Manhattan’s Bellevue Hospital Center is responsible for a network with more than 5,000 users and a mix of LAN and inter-hospital WAN links, as well as hundreds of different applications ranging from patient admission and other database-oriented systems to a PACS (Picture Archiving and Communication System) that can deliver multi-hundred-megabyte radiological images to workstations throughout the network.


IT depends on device-oriented network management systems such as HP OpenView for fault management, and CiscoWorks for utilization information. Unfortunately, they lacked a management solution that would give them an end-to-end view of network and application activity. “After an experience with the PACS system, where it took three different teams of outside consultants to pinpoint a server problem that had been blamed on the network, we realized we needed something that would give us better insight into the conversations taking place across the network, and what the various computers were doing,” says Ben Aheto, network manager for Bellevue. “Without that information, we were spending far too much time defending the network from the typical ‘the network is slow’ complaints.”

By design, siloed or point monitoring tools only look at one piece of the overall picture and do not understand the interdependencies between elements, which are crucial to solving performance problems. This is why, according to Forrester, 31% of performance issues take more than a month to resolve or are never resolved.

Point solutions for monitoring network and application performance may seem relatively accurate in isolation, but long problem-resolution times tell a different story. Take a look around and see if you have senior engineers from the application, network, infrastructure and database teams troubleshooting the same problem in parallel, wasting significant time, and not always finding the true causes of complicated issues. With a collection of siloed tools, many visibility gaps exist in your application delivery environment: outside the narrow scope of each tool, and also between the wide range of disparate components, especially when it comes to understanding their interdependencies, connections and conversations.

In short, point monitoring tools have proven to be ineffective early warning systems for application degradation, because they are neither comprehensive nor integrated, nor do they provide any cross-domain intelligence – often showing everything is “OK” in spite of user complaints.

“Sometimes, a customer would report that an application wasn’t running properly, or that a server wasn’t responding to a request,” says Stefan Thoma, senior network engineer at Flughafen Zürich AG, a private company that operates Zurich Airport. Companies based at Zurich Airport include airlines, retailers, hotels, and restaurants, all of which have their own WANs that connect to and rely on the airport LAN for accessing mission-critical applications.

Airlines, for example, require specialized applications for operations, such as bookings, reservations, check-in and electronic ticketing. As a result, the airport LAN connects to roughly 150 national and international networks and has more than 14,000 network connection points (access ports). “With such a complex network infrastructure, issues invariably arise, and we were spending a lot of time and manpower trying to resolve them. It was very hard to find the root of the problem. Many of our customers’ data centers, for example, are based on another continent, so it was difficult to know whether the problem was due to our network or theirs.”

In fact, according to Gartner, 70% of the time IT organizations learn about performance problems from end users. With organizations relying on applications to perform almost every task critical to the business, they can no longer wait for the phone to ring or for user complaints to arrive before taking action on business-disrupting application failures.

The converged answer

The convergence of network and application performance management is a reality, and the holistic understanding it provides carries a real return on investment (ROI), especially when performance problems no longer slip through the cracks between siloed monitoring tools.

What is needed is a solution that can make sense of it all and provide actionable information to resolve the most pressing performance issues. It’s a requirement to have a detailed, quantitative understanding of whether the applications effectively meet business objectives, and to do so, an accurate representation of everything that can compromise application performance is needed — from poorly executing application code to an overloaded server or load balancer.

Most companies will require two or more teams to troubleshoot the many issues affecting end-user experience, but those teams should be working collaboratively from the same data. This allows for targeted troubleshooting and the ability to isolate the fault domain, whether it’s a code-level defect, an infrastructure problem or a network issue. Let’s look at how this helps each of the key groups who are responsible for network and application performance.

* Network Operations / Support. According to Gartner, “Network teams are often the initial starting point to triage application, server, security and storage issues.” NPM provides visibility and reporting on applications from a network perspective, but specific application behaviors aren’t typically detected, resulting in overlooked performance bottlenecks. As the first line of defense, it’s critical to understand all performance problems impacting an application, whether they are network problems such as bad QoS marking or unexpected bandwidth congestion, or application problems such as slow web service calls, poorly performing code or too many database calls, which are typically uncovered by APM data.

* Application Operations / Support. Remember, we are going for more than “visibility and reporting”; we are looking for actionable information to resolve any problem in minutes. Neglecting to understand how the underlying network and infrastructure impact the application is a big no-no! It is important to understand why a particular transaction may stall within a particular tier of the network or infrastructure, which is typically exposed by NPM metrics. Lacking granularity and insight into the network and infrastructure itself can often cause a performance bottleneck to be misinterpreted as a network issue. Avoiding that requires an accurate understanding of how applications are consuming system and network resources, with the ability to distinguish between bandwidth contention and latency, in addition to common server response time problems such as code exceptions, undetected memory leaks or slow SQL queries.
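
To make the idea of isolating a fault domain concrete, here is a minimal Python sketch. It is not taken from any product: the field names, the retransmission threshold and the three domain labels are all assumptions chosen for illustration. It flags the network when retransmissions are high and otherwise blames whichever tier accounts for the most time in a slow transaction.

```python
from dataclasses import dataclass


@dataclass
class TransactionSample:
    """One slow transaction; all fields are hypothetical, for illustration."""
    network_ms: float      # time spent on the wire (an NPM-style metric)
    server_ms: float       # time spent in application code (an APM-style metric)
    database_ms: float     # time spent waiting on database calls (an APM-style metric)
    retransmit_pct: float  # share of TCP segments retransmitted (an NPM-style metric)


def fault_domain(sample: TransactionSample, retransmit_threshold: float = 2.0) -> str:
    """Return a likely fault domain for a slow transaction.

    High retransmissions point at the network; otherwise the tier with
    the largest share of the time is blamed. The threshold is an
    illustrative assumption, not a recommended value.
    """
    if sample.retransmit_pct >= retransmit_threshold:
        return "network"
    timings = {
        "network": sample.network_ms,
        "application": sample.server_ms,
        "database": sample.database_ms,
    }
    # Pick the tier that contributed the most time.
    return max(timings, key=timings.get)
```

A real converged tool would weigh many more signals, but even this toy rule shows why both data sets are needed: without the APM timings, the slow SQL case and the congested link case look the same from the network side.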

The “Ah-Hah” moment

Only a converged network and application performance management solution can provide an accurate picture of end-user experience and performance across the entire application delivery environment by carefully correlating metrics at the network level with rich application performance management data in real time.
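
As a rough illustration of what correlating network-level metrics with application data can mean in practice, the Python sketch below lines up samples from two sources by time window. The sample format (timestamp, metric name, value), the metric names and the 60-second window are assumptions made for this example, not a description of how any particular product works.

```python
from collections import defaultdict


def correlate(npm_samples, apm_samples, bucket_seconds=60):
    """Group NPM and APM samples that fall into the same time window.

    Each sample is a (unix_timestamp, metric_name, value) tuple. Only
    windows that contain data from both sources are returned. If a
    metric appears more than once in a window, the last value wins.
    """
    buckets = defaultdict(lambda: {"npm": {}, "apm": {}})
    for source, samples in (("npm", npm_samples), ("apm", apm_samples)):
        for timestamp, name, value in samples:
            # Round the timestamp down to the start of its window.
            start = int(timestamp // bucket_seconds) * bucket_seconds
            buckets[start][source][name] = value
    return {
        start: views
        for start, views in sorted(buckets.items())
        if views["npm"] and views["apm"]
    }
```

With network latency and SQL time side by side for the same minute, a team can see at a glance whether a slowdown coincides with a network event, an application event, or both.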

When performance is viewed from a user’s perspective with a clear picture of network and application performance, it creates transparency, and a common understanding about what is happening across the entire application delivery environment – making it easy to find what’s causing a performance problem and who’s responsible for fixing it.

This approach helps speed up the troubleshooting process for all performance problems, and ensures that organizations can get to the root cause in minutes, not weeks or months. Additionally, a collaborative approach to addressing issues quickly eliminates the finger-pointing between different groups, not to mention those dreaded war rooms. But, most importantly, it identifies and prioritizes the biggest performance opportunities across domains to help make more effective decisions regarding IT investments.
