Web 2.0 tools can help with data center management

* How collaborative tools can help data center managers

Our vision of the next-generation data center may be that of a highly integrated and automated environment, but Nemertes’ research shows that most companies are not even close to that vision. One way to narrow the gap is to introduce collaborative tools, copying some of the more successful aspects of “Web 2.0” such as wikis, RSS feeds and collaborative communities of experts.

A few exceptions aside, most data center operations use a smattering of monitoring tools and a largely manual troubleshooting and problem-resolution process. In many companies, servers are not provisioned or configured automatically. When faults occur, the troubleshooting process is very labor-intensive and represents a large part of the operational cost of a data center. Increasing the efficiency of these “Level 2” engineering operations can bring significant savings and service-level improvements.

Our research in collaboration technologies has shown that the ability to bring the right expert into a discussion at the right time can both decrease the time to resolve a problem and increase the likelihood of success. So when an engineer discovers something strange in a log file while troubleshooting a problem, collaborative tools could help that engineer reach a colleague who recognizes it and resolve the problem much faster.

Which tools could help with troubleshooting tasks? Here are a few:

* Instant Messaging - In Nemertes’ “Just-In-Time Fetch-The-Expert” model, we have shown that rapid access to specialized expertise or skills can greatly reduce the time to complete a transaction (troubleshooting, in this case) and increase the chance of a successful resolution.

* Search Technology - Being able to search log files for events of interest can make it easier to correctly identify the root cause of a problem, especially if multiple systems are involved. Adding collaboration features, such as the ability to save and share “searches,” can improve troubleshooting further.

* Wikis and shared knowledge bases - Wiki technology has been shown to enable greater collaboration by freeing groups from hierarchical and formal communication channels. The ability to openly share information, whether in an internal wiki or a public one, has already created a wealth of troubleshooting information in technology wikis.

* RSS feeds - By aggregating the latest information from a number of different sources, RSS feeds help engineers stay abreast of changes in the environment. Potential applications include notification of the latest changes in a change management system, or of the latest trouble tickets. Greater awareness of the “big picture” can in turn assist in troubleshooting.
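
To make the idea of saved and shared searches from the list above more concrete, here is a minimal sketch in Python. It is an illustration only, not how Splunk or any other product works: the `SavedSearch` class, the `run_search` function, the host names and the sample log lines are all hypothetical. It assumes each system's log is available as a list of text lines and that a "search" is simply a named regular expression that one engineer can serialize and hand to another.

```python
import json
import re
from dataclasses import dataclass, asdict


@dataclass
class SavedSearch:
    """A named, shareable log search (hypothetical structure)."""
    name: str
    pattern: str  # regular expression applied to each log line
    author: str

    def to_json(self) -> str:
        # Serialize so the search can be posted to a wiki or sent by IM.
        return json.dumps(asdict(self))

    @staticmethod
    def from_json(text: str) -> "SavedSearch":
        return SavedSearch(**json.loads(text))


def run_search(search, logs):
    """Run one saved search against the logs of several systems.

    `logs` maps a system name to a list of log lines. Returns a list of
    (system, line_number, line) tuples for every matching line.
    """
    regex = re.compile(search.pattern)
    hits = []
    for system, lines in logs.items():
        for number, line in enumerate(lines, start=1):
            if regex.search(line):
                hits.append((system, number, line.rstrip("\n")))
    return hits


# Made-up sample data standing in for real log files.
SAMPLE_LOGS = {
    "web01": [
        "10:02:11 INFO request served",
        "10:02:15 ERROR upstream timeout db01",
    ],
    "db01": [
        "10:02:14 WARN connection pool exhausted",
        "10:02:16 INFO checkpoint complete",
    ],
}

# One engineer saves a search; a colleague restores and reruns it.
search = SavedSearch(name="db-trouble",
                     pattern=r"timeout|pool exhausted",
                     author="alice")
shared_text = search.to_json()
restored = SavedSearch.from_json(shared_text)

for system, number, line in run_search(restored, SAMPLE_LOGS):
    print(f"{system}:{number}: {line}")
```

Because the saved search is plain text once serialized, it could be pasted into a wiki page or an instant message, which is where the collaborative benefit would come from: the second engineer reruns exactly the same query rather than reconstructing it from memory.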

Many open source projects implement the tools above and can be downloaded for free. Of course, the cost of deployment, customization and maintenance should not be underestimated.

One notable commercial/open source hybrid is Splunk. Splunk tools combine collaborative search technology, RSS feeds and a shared wiki knowledge base.

Automation will never negate the need for intelligence, experience and judgment in the troubleshooting of data center problems. Collaboration can enhance the human skills that lie beyond the reach of automation.


Copyright © 2006 IDG Communications, Inc.