Change management, human error, and network outages

* What are your best practices for avoiding network outages caused by human error?

Many years ago, when Steve was in network operations for the University of North Carolina system, an amazing phenomenon occurred.  Whenever Steve went on vacation, network uptime skyrocketed.  Of course, Steve has his excuses: it was the early 1980s, when equipment management was more difficult, and we were aggressively testing bleeding-edge equipment.

But we have an inkling that this hasn't changed significantly.  As networks become more complex, so does the task of change management.  In fact, some estimates state that up to 60% of both enterprise and service-provider network outages involve human error of some sort.  On the more complex side, the error might be misconfigured equipment or incompatible software releases.  On the simpler side, a craftsperson installing a new line in a wiring closet could inadvertently bump a live connection loose.  Further, poor documentation could cause live circuits to be marked as inactive.

But what's the solution?  Is automated change management any better?  On the one hand, a single mistake programmed into the automated system could be propagated throughout the entire network.  On the other hand, these systems can usually back out a change more readily.  An automated system can also reduce the chance of an isolated typo creeping into a repetitive process.
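To make that trade-off concrete, here is a minimal sketch of how an automated change might be staged and backed out.  Everything in it is an assumption for illustration: the function name staged_rollout, the health-check callback, and the use of in-memory dictionaries in place of real devices are all hypothetical, and it does not represent any particular vendor's tool.

```python
from typing import Callable, Dict, List

Config = Dict[str, str]


def staged_rollout(
    devices: Dict[str, Config],
    change: Config,
    is_healthy: Callable[[str, Config], bool],
) -> List[str]:
    """Apply `change` to one device at a time (hypothetical sketch).

    On the first failed health check, every device touched so far is
    restored from its snapshot and the rollout stops.  Returns the names
    of the devices left running the new settings, or an empty list if
    the change was backed out.
    """
    snapshots: Dict[str, Config] = {}
    for name in sorted(devices):
        # Save the pre-change state so the change can be backed out.
        snapshots[name] = dict(devices[name])
        # The same change is applied everywhere: no per-device typos,
        # but also no protection if the change itself is wrong.
        devices[name].update(change)
        if not is_healthy(name, devices[name]):
            # Back out: restore every device touched so far.
            for touched, old in snapshots.items():
                devices[touched] = old
            return []
    return list(snapshots)
```

Applying the change one device at a time, with a check after each step, limits how far a bad change can spread before it is noticed.  The snapshot taken before each step is what makes the back-out straightforward.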

We'd like to hear your stories and to share them with the readers of this newsletter.  Of course, we don't necessarily need you to share the occasions when you personally brought your net down with a fubar of your own.  Nevertheless, we'd love to hear your best practices for ensuring that this doesn't happen.



Copyright © 2005 IDG Communications, Inc.
