Network World

Michael Morris

A Zero Defect World?

By michaeljmorris on Wed, 10/03/07 - 8:44pm.

When I was in the Army, we were constantly reminded that the Army was not a "zero-defect" organization. Leaders expected mistakes because they were inevitable (especially when the "leaders" were making them). Mistakes were how you learned and got better. The Army's National Training Center in the California desert - a truly awful place - was built on that principle. The opposing force, the bad guys, played that role constantly. They knew all the terrain and regularly beat the crap out of the Army unit that was there for two weeks of training. My unit was one of those subjected to a two-week beating. But we learned a lot.

In July, the keynote speaker at Cisco's Networkers, comedian John Cleese, made mistakes the theme of his address - "Mistakes are a good thing because you learn from them and get better. So, don't be afraid of making mistakes, because when you start being afraid of mistakes, you stop innovating and taking risks. When that happens your business will be left behind."

However, does this rule apply to networking these days, or for that matter, to all of IT? Let's be honest: one line missing or changed in a Cisco IOS configuration can make a world of difference. For example, if you're trying to give VoIP the highest level of QoS in your network, you would want an ACL like this:

ip access-list extended Tag_EF
 permit udp any range 16384 32767 any range 16384 32767

However, let's say you actually configure this:

ip access-list extended Tag_EF
 permit tcp any range 16384 32767 any range 16384 32767

That can make a BIG difference for your company. Now, conference calls with the global sales team are failing, the CEO's call with the President is garbled, and calls with your customers keep dropping. One little mistake. Are we still a zero-defect organization?
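A one-word slip like this is the kind of thing a small pre-deployment check can catch. Here is a minimal sketch in Python, assuming voice (RTP) traffic always rides on UDP in the 16384-32767 port range used above. The function name `find_suspect_entries` and the rule it applies are illustrative assumptions, not part of any Cisco tool.

```python
import re

# Assumption for this sketch: voice (RTP) traffic uses UDP ports 16384-32767,
# the same range as in the ACL above.
RTP_RANGE = ("16384", "32767")


def find_suspect_entries(config_text):
    """Return (acl_name, entry) pairs where a permit entry covers the RTP
    port range but names a protocol other than udp."""
    suspects = []
    acl_name = None
    for raw in config_text.splitlines():
        line = raw.strip()
        header = re.match(r"ip access-list extended (\S+)", line)
        if header:
            acl_name = header.group(1)
            continue
        if not raw.startswith(" "):
            # An unindented line ends the current ACL block.
            acl_name = None
            continue
        entry = re.match(r"permit (\S+) .*range (\d+) (\d+)", line)
        if acl_name and entry:
            protocol, low, high = entry.groups()
            if (low, high) == RTP_RANGE and protocol != "udp":
                suspects.append((acl_name, line))
    return suspects


typo = """ip access-list extended Tag_EF
 permit tcp any range 16384 32767 any range 16384 32767
"""
for name, entry in find_suspect_entries(typo):
    print(f"{name}: check protocol in '{entry}'")
```

A check like this only knows the one rule it was given. It would flag the tcp typo, but it does nothing for mistakes nobody thought to write a rule for.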

Millions of dollars in productivity or actual revenue could be on the line. I doubt the CIO would've called the engineer in question and said "That's ok, I just want to make sure you learned from your mistake". Sure! I bet that's what the CEO, the President, and the VP of the call center told the CIO. As technology gets more and more complicated and integrated, mistakes are bound to happen. But businesses have no tolerance for errors or downtime.

I find myself struggling with this problem day-to-day. As I move further in my career, my focus is becoming more about "what" technology can do instead of "how" it does it. But I find myself searching for the smallest details to verify there are no mistakes, making sure the "how" lines up with the "what". The last thing you want as a senior architect is to recommend a technology only to find out at implementation that it doesn't work. Again, millions of dollars are at risk.

I do love networking and IT and look forward to a long career. But honestly, the future is going to be zero defects. And the future is now.

Not realistic

Zero defects is not reasonable. Even Six Sigma allows roughly 3.4 defects per million opportunities (about 0.00034%). Everyone screws up, and that's just life. Any manager worth their salt knows this and can tell whether a problem is just an individual mistake or part of a more systemic issue.

Unfortunately in IT, we often are in the position where our mistakes can have very large implications. Part of the experience that you gain as you grow in your position, and make some of those mistakes, is to understand how to avoid the things that can cause problems. Or at least know how to approach them in a safe manner.

Any organization that has a low tolerance for mistakes will force its people to have a low tolerance for risk. This in turn (as John Cleese said) stifles innovation and change. Companies that cannot change/improve will eventually fall to the companies that can.

Zero Defects - a Goal

I echo the other comment: zero defects is a goal, but it's not very likely. Instead, having an effective mitigation strategy to control or minimize the impact of a defect is a better strategy. For instance, use a dual-device architecture with a configuration that isolates each side, so changes can be applied and tested on one side before upgrading the other. Have an adequate test environment that closely mimics production. Use a configuration management application that points out configuration changes so they can be verified prior to implementation. Use templates, and have staff peer-review and QA changes. These are the things that will allow us to approach zero defects.
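As a rough illustration of the "point out configuration changes" idea, here is a minimal Python sketch that uses the standard library's `difflib` to list only the lines that differ between the current configuration and a proposed one, so a peer reviewer sees exactly what is about to change. The function name `config_changes` and the sample configs are assumptions made for the example; a real configuration management application would do far more.

```python
import difflib


def config_changes(current, proposed):
    """Return only the removed ('- ') and added ('+ ') lines between two
    configuration texts, dropping unchanged lines and diff hint lines."""
    diff = difflib.ndiff(current.splitlines(), proposed.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Sample configs for illustration only.
current = """ip access-list extended Tag_EF
 permit udp any range 16384 32767 any range 16384 32767
"""
proposed = """ip access-list extended Tag_EF
 permit tcp any range 16384 32767 any range 16384 32767
"""

for change in config_changes(current, proposed):
    print(change)
```

Even a plain diff like this puts the udp-to-tcp swap in front of a second pair of eyes before it reaches production, which is really the point of the review step.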

Humans cause errors... they always have, they always will. It is much better to figure out ways to accommodate that than to create a system that assumes none.

Entropy makes "zero-defect" meaningless

Anything that can be made entirely free of defects (which is a nonsense objective really) could not possibly remain that way, unless somehow cooled to near 0 kelvin. Is that a silly physics factoid? Sort of.

The real point is that entropy, or roughly the tendency to disorganization, is present in all systems (except those frozen completely solid), particularly those that are dynamic (such as networks). Every time someone puts their hands to the infrastructure, every bump in the server room, every tick of the clock, means that something can potentially fall out-of-spec and result in a degradation that was not present moments before.

Establishing a zero-defect network may be spectacularly difficult. Maintaining it certainly will be impossible. For a start, there needs to be a self-organizing process that compensates for entropy - and today that is constant supervision by people (which ironically also introduces problems). That self-organization principle needs to become part of the infrastructure itself.

The long term goal underlying the autonomics concept is an optimally balanced set of processes for self-awareness, self-regulation (healing and provisioning), and adaptation. And of course, introducing those processes will simply open the door for more..... defects!

As previously noted by someone else, it's just life in all its glory. If you can't stand the heat..... try the Arctic while there is still some ice left.

About From the Field

Michael Morris is a communications engineering manager at a $3-billion high-tech company. His background is in enterprise WANs working with telcos and developing large-scale routing designs. He has worked on networks at government and corporate organizations, including networks at two Fortune 10 companies. In his current role, he leads a team of 10 engineers responsible for large-scale IT networking projects and architectural standards for data networks, storage area networks, IP telephony, contact centers, and security. Michael is CCIE #11733 and recently became one of the first three Cisco Certified Design Experts (CCDE) ever (#20080002). He has 11 years of experience in networking and communications, including four years as a paratrooper in the U.S. Army. He has a bachelor's degree in MIS from the University at Buffalo and is working on his MBA from NC State University. In 2008, he was awarded the Network Professional Association (NPA) Professional Excellence and Innovation Award for his work on network architecture, templates and enterprise MPLS design.

Contact him.