Have you ever been pulled into a troubleshooting call where every support team says its monitoring systems show a ‘green’ status, but the users are reporting horrendous performance? Unfortunately, this happens more frequently than most of us would care to admit. When I get pulled into one of these situations, I avoid all the recriminations of the “could have” and “should have” variety, which do not amount to much while the customer is in the background pulling their hair out in frustration.
When my phone rings, the debugging process is usually stalled, and all the basic and traditional debugging approaches have already failed the team. Essentially, everyone is looking at each other, shrugging and not knowing what to do next.
From experience with past events, I know that often the debugging process can be sidetracked by ‘apparent cause’ rather than ‘root cause’, where people get stuck chasing symptoms. Particularly tricky is when the root cause is a combination of small issues happening at multiple layers in the IT stack. These small issues can quickly combine to significantly impact overall performance. The challenge is that through the lens of each support team, their system is performing adequately, if not ideally.
In a high-stress situation such as this, it’s important to get the team to focus on empirical facts. Facts take the emotion out of the process.
Years of experience have taught me that before I do anything else, I need to know these three things:
1) What I have;
2) What it’s connected to; and
3) What has changed.
At Adaptivity, we have further refined these steps into a forensic methodology, which systematically walks through each tier of the n-tier system and, for each tier, its IT stack. For example, the typical web system has a web server, application server, and database server. Each of these tiers is made up of the following IT stack: Application, Coordination, Infrastructure Services, Components, Operating Environment, Connectivity, Hardware, and Physical Environment. By systematically walking through these two dimensions and noting any abnormalities, it’s possible to get a holistic view of what’s going on, which makes it easier to identify and eliminate the smaller issues that are conspiring together to cause the problem.
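To make the two-dimensional walk concrete, here is a minimal sketch of a findings grid in Python. It is only an illustration, not Adaptivity's actual tooling: the tier and layer names come from the paragraph above, while the function names and the sample findings are hypothetical.

```python
from itertools import product

# Tiers and IT stack layers as named in the article.
TIERS = ["web server", "application server", "database server"]
STACK_LAYERS = [
    "Application", "Coordination", "Infrastructure Services", "Components",
    "Operating Environment", "Connectivity", "Hardware", "Physical Environment",
]


def build_checklist(tiers=TIERS, layers=STACK_LAYERS):
    """Return one empty findings list per (tier, layer) cell."""
    return {(tier, layer): [] for tier, layer in product(tiers, layers)}


def note_abnormality(checklist, tier, layer, finding):
    """Record a finding against a cell; reject cells outside the grid."""
    if (tier, layer) not in checklist:
        raise KeyError(f"unknown cell: {tier!r} / {layer!r}")
    checklist[(tier, layer)].append(finding)


def summarize(checklist):
    """Return only the cells that have findings, in walk order."""
    return [
        (tier, layer, findings)
        for (tier, layer), findings in checklist.items()
        if findings
    ]


if __name__ == "__main__":
    grid = build_checklist()
    # Hypothetical findings, for illustration only.
    note_abnormality(grid, "application server", "Connectivity",
                     "intermittent packet loss to database tier")
    note_abnormality(grid, "database server", "Operating Environment",
                     "swap usage climbing under load")
    for tier, layer, findings in summarize(grid):
        print(f"{tier} / {layer}: {'; '.join(findings)}")
```

Keeping every cell in the grid, even the empty ones, is deliberate: it shows at a glance which tier and layer combinations have been reviewed and found clean, and which small issues may be combining across layers.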
In trying to identify what problems applications are experiencing (the forensic process), it is critical to identify the parts of the puzzle. Commonly overlooked is access from other systems that either should not be permitted or is unknown. Is it acceptable that a production server is pointing to a test database? How do you know? Typically the data source is defined in a property file, and can easily be in error. Similarly, are you running clustering for an application server, such as JBoss or WebSphere? Are the members of the cluster all configured the same? Or maybe the clustering has a dependent library.
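Questions like these can often be answered mechanically once the property files have been collected from each host. The following Python sketch assumes simple key=value property files with no line continuations or escapes. The function names, host names, property keys, and the "test"/"dev"/"qa" markers are all hypothetical, and the marker check is a crude substring heuristic, not a definitive test.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments.

    Simplification: line continuations and escape sequences are not handled.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def config_drift(members):
    """Given {host: props}, return {key: {host: value}} where values differ.

    A key that is missing on a host shows up with the value None.
    """
    all_keys = set().union(*(props.keys() for props in members.values()))
    drift = {}
    for key in sorted(all_keys):
        values = {host: props.get(key) for host, props in members.items()}
        if len(set(values.values())) > 1:
            drift[key] = values
    return drift


def suspicious_values(props, markers=("test", "dev", "qa")):
    """Flag values that mention a non-production marker (crude heuristic)."""
    return {
        key: value
        for key, value in props.items()
        if any(marker in value.lower() for marker in markers)
    }


if __name__ == "__main__":
    # Hypothetical property files from two members of one production cluster.
    node1 = parse_properties("db.url=jdbc:oracle:thin:@prod-db:1521/APP\npool.size=20\n")
    node2 = parse_properties("db.url=jdbc:oracle:thin:@test-db:1521/APP\npool.size=20\n")
    print(config_drift({"node1": node1, "node2": node2}))
    print(suspicious_values(node2))
```

Any key reported by config_drift is a place where cluster members disagree, and any value flagged by suspicious_values deserves a second look before anyone assumes production is pointing where it should.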
In creating this holistic view, the first challenge almost always encountered is the lack of up-to-date documentation of the assets that make up the system. It’s surprising how often system managers don’t have a clear picture of everything their system comprises, but it is understandable, as the team that originally deployed the environment is probably long gone. Figuring this out can be a very time-consuming endeavor.
Tideway has recently made a community version of its application dependency mapping software, Tideway Foundation, available (http://www.tideway.com/downloads/foundation/). The free download allows for deep scans of up to 30 hosts, which is plenty for most forensic purposes. All you need to know is one of the resources involved in the system you’re trying to discover and some basic credentials that are allowed to access that resource. From there, the Tideway tool will facilitate the discovery of the relationships between the different resources that comprise the system of forensic interest. A tool that can quickly discover these relationships is an invaluable asset - and all the better when free!
On an enterprise level, Tideway can provide this unifying information to help ‘enlighten’ the development and deployment teams. Traditionally, acquiring this information has been daunting, but with Tideway, any SA with VMware installed on their desktop can quickly gather deep technical statistics. Having deployed the full Tideway platform in a major financial environment, I know how useful it is to have this information at my fingertips. Kudos to Tideway for making this community version available.
Once you get through the trauma of triage via the forensic process, it’s a good time to bring up the need for an end-to-end transactional monitoring discipline, which would help avoid these situations to begin with. Stay tuned.
Tony Bishop is CEO of Adaptivity. He previously served as SVP and chief architect of Wachovia's Corporate Investment Banking Technology Group, where his team earned numerous awards for its SOA and utility computing infrastructure. Tony has 19 years' experience and is a recipient of 40 under 40 Most Innovative IT Leaders and Premier 100 IT Leaders (as selected by ComputerWorld in 2007), and a member of Wall Street Gold Book 2007.
Sheppard Narkier is chief scientist and co-founder of Adaptivity. Prior to that, he was head of software portfolio management and IT governance for the Wachovia Corporate Investment Banking Technology Group. Sheppard has more than 29 years of experience in the IT industry. He focuses on cost-effective IT systems and is an acknowledged expert in reusable components (frameworks, programs, architecture), the realtime enterprise, SOAs, messaging and legacy system integration.
Jim Houghton is the Chief Technology Officer and co-Founder of Adaptivity. Jim was the SVP of Architecture & Strategy for the infrastructure organization at Bank of America, where he drove legacy infrastructure transformation initiatives across 40+ data centers. Prior to that he was the Head of Wachovia’s Utility Product Management, where he drove the design, services, and offerings for SOA and Utility Computing for the technology division of Wachovia’s Corporate & Investment Bank. Jim has also led leading-edge consulting practices at IBM Global Technology Services and Deloitte Consulting.