Why monitoring needs to be reinvented

We monitor things to make sure that they work well all of the time. But nothing works as it is supposed to all of the time. Maybe we need to reinvent monitoring.

Why do we monitor things (user experience, transactions, applications, virtual servers, physical servers, datastores, etc.)? Because we want them to work well all of the time.

Does everything in your environment work well as you expect it to all of the time?

Well, of course not. What if we reinvent monitoring so that we can help make that happen?

Consider the following analogy: An airplane can fly itself from its origin to its destination and land itself without human intervention. If we flew commercial airliners the way we run enterprise IT, planes would be dropping out of the sky left and right, and they would be only 30 percent full (maybe that part would be a good thing).

The business imperative: Compete online or die

Retail and financial services were the first businesses to embrace online business models, and traditional (brick-and-mortar) companies in these markets now face the mandate to compete effectively online or see their market share eroded over time by online vendors who have no physical presence. Macy's and Target have to compete with Amazon, and your local bank has to compete online with the pure online banking alternatives.


But this is not limited to companies that sell products and services directly to consumers over the Internet. Now every business is finding it is more effective to engage with customers, partners, suppliers, prospects and employees online. The pressure to implement business services online is unrelenting and ever increasing. So, almost every company is becoming, at least in part, a software company. Every company increasingly relies upon business functionality implemented in software to compete, acquire new customers, service existing customers or just run the business. And the pressure is not just to implement those services in software quickly; it is also to evolve those services frequently so as to stay ahead of the competition.

The pressure is not just to keep the online operations functioning and working well for the users. Things also need to be fast. Slow is the new down. Studies abound on the cost of latency—with Amazon having reported that an extra 100 ms of latency would cost it 1 percent of its online sales revenue for the year.
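The latency-to-revenue relationship cited above is simple to work through. As a minimal sketch (the revenue figure is an assumption for illustration, not Amazon's actual number):

```python
# Hypothetical illustration of the "100 ms costs 1 percent of sales" math.
# The annual revenue figure is an assumption, not a reported number.

annual_revenue = 100_000_000   # assumed $100M in online sales
cost_per_100ms = 0.01          # 1% of sales lost per extra 100 ms

def latency_cost(extra_latency_ms: float) -> float:
    """Estimated annual revenue lost to a given amount of added latency."""
    return annual_revenue * cost_per_100ms * (extra_latency_ms / 100)

print(f"${latency_cost(250):,.0f}")  # cost of 250 ms of added latency
```

Even at modest scale, a few hundred milliseconds of added latency translates into real money, which is why "slow is the new down."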

The monumental technical challenge

So, competing online with software that works and performs well is a business imperative today, and it is an imperative that grows more important over time. It has also become a more challenging problem to solve, for the reasons listed below:

  • New software is being put into production more quickly and then enhanced more frequently.
  • The software stack is becoming more diverse. New languages, such as Node.js, and new runtime environments, such as Pivotal Cloud Foundry, are commonplace.
  • The application is ever more abstracted from the actual hardware. The Java Virtual Machine, Docker, compute virtualization, network virtualization and storage virtualization make it more difficult to know what is actually going on in the environment.
  • The infrastructure is virtualized, dynamic and often automated.

In summary, you have rapidly changing applications running on abstracted, dynamic and automated infrastructures as shown below.

[Figure: increasing innovation on dynamic infrastructure (credit: Bernd Harzog)]


End-to-end monitoring is a problem that no single vendor can address in its entirety. IBM, BMC, HP and CA all tried to address it with Business Service Management 20 years ago, and all failed, even at the much slower pace of innovation of that era. The current pace of innovation and change is so high that new vendors will invariably have to address some of the new challenges. Those new challenges include network virtualization, storage virtualization, containers and serverless architectures (such as AWS Lambda).

Taking advantage of a multivendor best-of-breed approach

So, here is the key insight: Multiple tools from multiple vendors is not a problem. Done correctly, a multiple-tool strategy is a best-of-breed strategy. As long as you are careful to select tools that are, in fact, leaders in their spaces and to avoid overlaps as much as possible, a best-of-breed strategy is the way to go.

What would a best-of-breed strategy look like? It would need to have tools at the following layers:

  • End user experience
  • Application and transaction performance
  • Container (think of both the JVM and Docker as containers)
  • Operating system
  • Compute virtualization
  • Network virtualization
  • Storage virtualization
  • Compute hardware
  • Network hardware
  • Storage hardware

Turning best-of-breed into end-to-end

Unfortunately, if all you do is implement a best-of-breed strategy, you will end up with a best-of-breed Franken-Monitor (shown below).

[Figure: the Franken-Monitor (credit: Bernd Harzog)]

Franken-Monitors lead to war-room meetings where everyone brings a laptop into a room with their product's console on the screen, and everyone walks around trying to visually relate the disparate streams of data and figure out what is causing the problem.

The solution to the Franken-Monitor problem is to put all of the streams of metrics from all of the valuable tools into one common low-latency, high-performance, big data back end. This leads to data-driven IT operations, which means that IT uses data to run its operations just as the business uses data to make business decisions. More on that in the next post.
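Getting every tool's metrics into one back end implies normalizing them into a common schema first. As a minimal sketch, assuming an invented record format and field names for the source tool:

```python
import time
from dataclasses import dataclass

# Minimal sketch of normalizing metric streams from different tools into
# one common schema before writing them to a shared back end. The record
# layout and field names below are assumptions for illustration.

@dataclass
class Metric:
    source: str       # which tool emitted the metric
    layer: str        # e.g. "application", "network_hardware"
    name: str         # metric name, e.g. "txn_latency_ms"
    value: float
    timestamp: float  # Unix epoch seconds

def normalize(raw: dict, source: str, layer: str) -> Metric:
    """Map one tool's native record into the common schema."""
    return Metric(
        source=source,
        layer=layer,
        name=raw["metric"],
        value=float(raw["val"]),
        timestamp=raw.get("ts", time.time()),
    )

# A hypothetical record from an APM tool:
apm_record = {"metric": "txn_latency_ms", "val": "182.4", "ts": 1700000000.0}
m = normalize(apm_record, source="apm-tool", layer="application")
print(m.name, m.value)  # txn_latency_ms 182.4
```

Once every stream lands in the same schema, cross-layer correlation becomes a query against one store instead of a war-room exercise in eyeballing consoles.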


Copyright © 2016 IDG Communications, Inc.