Use both preventive and reactive operations

Opinion
Jan 03, 2006 | 2 mins
Data Center

* Tips for managing data centers both preventively and reactively

Network operations centers focus on reactive management of problems. By continuously monitoring the environment, generating alerts and tracking trouble tickets, operations staff can fix problems as they occur.

Errors inevitably occur as new applications are deployed and servers are tweaked. Building sound processes around change- and configuration-management tools can often prevent these errors. To get the best results, preventive management and reactive management must work in tandem to minimize faults and downtime.

Unfortunately, most network-monitoring and systems-monitoring tools work in a vacuum – they provide no information on recent changes or the dependencies between systems. To make things worse, many network-management tools treat the rest of the infrastructure as if it doesn’t exist – the network is there, but the servers and other resources it connects are “somebody else’s problem.” Companies must try to break these silos so that they can manage their infrastructures as a whole.

When a problem occurs on the network, the monitoring tools should have some context. How does this affect the business? What systems are down? Were any changes made recently? Operations staff can use asset-management tools to map out their infrastructure. In addition, if the tools support some form of dependency mapping, then operations staff can use them to see the broader impact of a problem. By linking into change-management systems, operations staff can also see if any recent changes may have caused a problem. Beyond finger-pointing, this helps solve problems faster and offers the opportunity for process improvements.
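The kind of context described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names, the `DEPENDS_ON` map, the `CHANGES` log and the 48-hour window are all hypothetical stand-ins for data that would come from real asset-management and change-management systems.

```python
from collections import deque
from datetime import datetime, timedelta

# Hypothetical asset inventory: each element lists what it depends on.
DEPENDS_ON = {
    "order-entry": ["app-server-1"],
    "app-server-1": ["db-server-1", "core-switch-1"],
    "db-server-1": ["core-switch-1"],
    "core-switch-1": [],
}

# Hypothetical set of elements that represent business processes.
BUSINESS_PROCESSES = {"order-entry"}

# Hypothetical change log: (element, time of change, description).
CHANGES = [
    ("db-server-1", datetime(2006, 1, 2, 22, 0), "kernel patch applied"),
    ("core-switch-1", datetime(2005, 12, 1, 9, 0), "firmware upgrade"),
]


def reachable(start, edges):
    """Breadth-first walk: every node reachable from start, excluding start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def invert(edges):
    """Turn 'X depends on Y' edges into 'Y is depended on by X' edges."""
    reverse = {node: [] for node in edges}
    for node, deps in edges.items():
        for dep in deps:
            reverse.setdefault(dep, []).append(node)
    return reverse


def add_context(element, now, window=timedelta(hours=48)):
    """Attach impact and recent-change context to an alert on element."""
    # Everything that directly or indirectly depends on the failed element.
    downstream = reachable(element, invert(DEPENDS_ON))
    # A change to the element, or to anything it depends on, is a suspect.
    suspects = reachable(element, DEPENDS_ON) | {element}
    recent = [
        change for change in CHANGES
        if change[0] in suspects and timedelta(0) <= now - change[1] <= window
    ]
    return {
        "element": element,
        "impacted": sorted(downstream),
        "business_processes": sorted(downstream & BUSINESS_PROCESSES),
        "recent_changes": recent,
    }


if __name__ == "__main__":
    print(add_context("db-server-1", datetime(2006, 1, 3, 8, 0)))
```

With this sample data, an alert on `db-server-1` would be reported together with the application server and business process that depend on it, plus the kernel patch applied the night before. A flat alert from an isolated monitoring tool carries none of that.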

Managing the enterprise infrastructure both reactively and preventively requires:

* Integration of asset management, change management and monitoring/alerting systems.

* Relating each event to the overall business and evaluating the impact of the event.

* Looking at the infrastructure as a system of dependencies, not just a set of elements.

* Evaluating the dependencies between each high-level business process and the infrastructure elements that support it.
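The last two points can be illustrated from the top down: start at each business process and walk its dependencies to see which infrastructure elements support it. The sketch below is a hypothetical example, not a description of any particular product; the process names, element names and `DEPENDS_ON` map are invented for illustration.

```python
from collections import Counter

# Hypothetical dependency map: each item lists what it depends on.
DEPENDS_ON = {
    "order-entry": ["app-server-1"],
    "payroll": ["app-server-2"],
    "app-server-1": ["db-server-1"],
    "app-server-2": ["db-server-1"],
    "db-server-1": ["san-array-1", "core-switch-1"],
}

# Hypothetical list of high-level business processes.
BUSINESS_PROCESSES = ["order-entry", "payroll"]


def supporting_elements(process):
    """Return every element the process depends on, directly or indirectly."""
    seen = set()
    stack = [process]
    while stack:
        node = stack.pop()
        for dep in DEPENDS_ON.get(node, []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def shared_elements():
    """Return elements that support more than one business process."""
    counts = Counter()
    for process in BUSINESS_PROCESSES:
        counts.update(supporting_elements(process))
    return sorted(element for element, n in counts.items() if n > 1)


if __name__ == "__main__":
    for process in BUSINESS_PROCESSES:
        print(process, "->", sorted(supporting_elements(process)))
    print("shared:", shared_elements())
```

Elements that show up under more than one process, such as the database server, storage array and switch in this sample data, are the ones where a single fault or a single poorly planned change has the widest business impact.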

Because each technology domain (network, computing, storage) has developed independently, few management tools look at the big picture, and narrowly focused products are difficult to integrate. IT executives should evaluate their management and operations tools from a holistic perspective, looking for integration opportunities rather than just selecting best-of-breed point products.