It's been more than a year since Wayne Harris and his IT comrades at a Canadian healthcare organization exorcised the little demons, but the memories still haunt them.
"We spent an average of 40 hours of overtime a week banging our heads against walls trying to figure out what the heck was going wrong with our servers," says the manager of technical services for Baycrest Centre for Geriatric Care. "We wondered if we were being sabotaged."
In a sense they were, but by the most unlikely of suspects: microscopic metal strands called zinc whiskers that were growing on the bottom of the data center's raised-floor tiles.
It all started in 2002, shortly after the Toronto company had an outfit in to clean up its data center.
"A couple of weeks later our servers started failing. Motherboards, hard drives, you name it," Harris says. "We put in new boxes and sure enough, they failed too."
Harris says the IT team, which at the time was overseeing a collection of about 50 servers, exhausted all avenues trying to solve the mystery. Finally, the mention at a conference of a similar problem led Baycrest to find that when the cleaning crew raised the data center floor tiles, the conductive zinc filaments - just a few millimeters long and a few microns in diameter - went airborne, short-circuiting the servers.
"We try to spread the word now," says Harris, who estimates Baycrest spent at least $100,000 replacing floor and ceiling tiles and giving the data center a deep cleansing. "We don't want others to go through what we did."
While metal whiskers were new to Baycrest, they've actually been known since the 1940s when Bell Labs discovered them in telecom environments. Zinc whiskers are thought to "grow" as a result of molecular stress, whereby the zinc used to keep steel on the bottom of the tile from rusting tries to separate itself from the steel. Whiskers have been found to form in a vacuum, but heat, humidity and other environmental factors also have been suggested as triggers. The metal filaments have been discovered growing in cabinets and other data center spaces.
Like many IT problems, zinc whiskers aren't something that companies victimized by them often discuss openly, perhaps for fear of making the IT infrastructure appear vulnerable or the IT management team seem negligent. As a result, many IT shops don't even think to look for whiskers when data center equipment goes on the blink. In fact, looking for them is pretty tough in the first place because, to the naked eye, they are barely distinguishable from dust. However, shining a light parallel to the bottom surface of a zinc whisker-covered floor tile will let the viewer see the whiskers, or more precisely, reflections of them.
David Loman, a power and environmental specialist for HP, speculates that if he asked 10 IT managers about zinc whiskers only two would know what they were. "When I tell people they've got zinc whiskers they look at me like I've grown antennas out of my head," he says.
Loman says one way that zinc whiskers are identified is through a distinctive popping sound that power supplies emit as they are snuffed out by the whiskers. He recalls one customer whose data center lost dozens of power supplies after an old upflow air conditioning system and a new downflow one were turned on at the same time, scattering zinc whiskers everywhere. "It sounded like popcorn," he says.
Those familiar with zinc whiskers say it would behoove IT shops to study up on the contaminant. While the sort of electroplated wood-core floor tiles thought to have spawned most zinc whiskers are for the most part no longer being made or installed, plenty of older tiles remain in data centers. What's more, new compact data center gear, such as blade servers that squeeze components into smaller spaces, is thought to be more susceptible to whiskers.
"I thought the problem would have peaked once manufacturers ran out of the old tiles, but over the last couple of years I haven't seen the problem abate. I think it's grown," says Rich Hill, who heads up a data center cleaning company called Data Clean that comes across a zinc whisker problem about every two weeks.
"People had mainframes for years without any problems from whiskers," Hill says. "Invariably, the newer equipment is what has the problems."
Data center consultant Bob Sullivan says whisker problems grew as computer systems were built with more-powerful cooling fans. "They sucked the whiskers right in," says Sullivan, dubbed by some the "Father of Zinc Whiskers." While he was at IBM in the early 1990s, his team discovered that metal whiskers caused problems with certain of the company's storage devices. He also spearheaded development of remediation processes.
HP's Loman says it isn't so much the way the guts of new equipment are being designed - with components closer to one another - that makes the equipment more susceptible to whiskers. Rather, he says, it is that more of the systems now can be packed into a rack, making it more likely that if zinc whiskers are in the air, more equipment will be affected. Whiskers rarely get more than about three feet off the floor, but they do tend to congregate, he says.
Loman says manufacturers have taken steps to prevent companies from having their data centers devastated by zinc whiskers. For example, he says power supplies in data center gear are now usually protected with a plastic coating that keeps contaminants at bay. Also, because most power supplies are now disposable, if they get zapped by zinc whiskers, he says they can easily be swapped out for new power supplies. HP and other vendors also make mention of zinc whiskers in data center site planning materials.
Metal whiskers are still not well understood, though, says Jay Brusse, a component engineer for NASA's Goddard Space Flight Center who collects information on whiskers at a Web site. NASA stepped up research into tin whiskers in the late 1990s after hearing about a non-NASA commercial satellite whose failure was attributed to tin whiskers, but has expanded the site's focus to cover zinc and other metal whiskers, he says.
"High-end computing companies I've talked to tell me that things could actually get a little worse before they get better since new equipment still is being installed in archaic rooms that have had plenty of time to grow crops of whiskers," he says.
Another lingering issue with zinc whiskers is whether they could have any effect on the health of data center employees, though experts say research has been limited and that no evidence has shown a link between the whiskers and health problems. IBM employees used to joke that zinc whiskers might even improve their libidos, Sullivan says.
"The joke was that you should stick your head under the floor before heading home for the weekend," he says.
To put whiskers in perspective, Loman notes that other problems his group investigates include data center damage in the wake of disasters, such as the World Trade Center attacks and volcanic eruptions. He says his group comes across maybe six zinc whisker instances per year, but the cases tend to be significant in terms of the remediation required, which he describes as costly and time-consuming.
Those experienced with zinc whiskers say there is only one way to get rid of them.
"The tiles have to be replaced," Loman says, noting that the whiskers can fairly easily be blown out of equipment. He says experiments have been done to cover tiles growing zinc whiskers with epoxy, but that whiskers have grown through the coating.
Baycrest's Harris says that, among other things, his organization moved its air conditioning system from the floor into the ceiling. The organization also now greatly limits the number of people in its data center, figuring that less foot traffic means fewer contaminants have a chance to be shaken into action.
"When we first heard about zinc whiskers we said, 'You must be joking with us,'" Harris says. "But it's no joke."