
The storm in your equipment

Jan 08, 2004 | 3 mins
Data Center

* Why we need error detecting code now more than ever

A standard joke in our industry is that it is very “incestuous,” by which I take it people don’t mean that we date our sisters, but rather that we keep seeing the same folks every few years in new positions at new companies.  Old friends hire old friends in time-honored fashion, and are in their turn hired by them when circumstances warrant.

The downside of such relationships is that keeping the same people circulating from company to company has a way of propagating stale ideas and, as a frequent consequence, of keeping out the new ones.  On the other hand, part of the upside is that every few years in this business you get to see old friends again and renew old acquaintances.

As an analyst I spend a large part of my time meeting with people or talking with them on the phone, so this sort of thing happens to me quite often.  Last week, I had occasion to meet an old acquaintance whom I hadn’t seen for several years.  Erik is now at Agilent, a leading provider of Fibre Channel silicon.  Just like when I last saw him four or five years ago, he still has lots of useful information to share.

All of which brings us to today’s question of why, in this era of sophisticated computing and storage systems, we still need such things as an error detecting code (EDC), which originated with the early computers as a check for signal integrity back when computer wiring wasn’t nearly so clean as it is now. 

This topic actually came up in our conversation (which is probably of itself a sad comment on the sort of life I lead) and Erik pointed out that these days EDCs are actually becoming more important than ever because of the speed at which our storage infrastructure is moving bits around. 

The increasing speed of the storage infrastructure means there are increasing opportunities for errors to creep in as all those wires, cables and printed circuit boards start acting more like antennas.  The new speeds mean that the tolerance for declaring an electrical impulse to be an “official” bit becomes tighter.  As speeds increase to the 10G-bit level (and beyond?), additional ways will be required to preserve data integrity amid the growing electrical storm in your equipment.

Most computing equipment today has been engineered to compensate for the little blips of power that might cause a stray electrical signal to be misinterpreted as a bit.  However, as speeds increase, larger electrical anomalies (double-bit errors) can arise and corrupt data as it moves inside your system.  Worse, even if there are no issues with single-bit or double-bit errors, you may not get the data you asked for.

Could some stray signal have inadvertently given you some other data?  EDC was created to solve these problems.  Some systems handle these problems in higher-level functions with hardware and software working together.  EDC allows this function to be moved into the chip, making it faster and, in theory, less expensive to perform.
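For readers who want to see the idea in miniature, here is a rough software sketch of the problems described above.  It is not how any particular Fibre Channel chip does it; the helper names (`parity_bit`, `flip_bits`, `tag`) and the 8-byte block address are assumptions made up for illustration, and CRC-32 simply stands in for whatever code the silicon actually uses.  The sketch shows a single parity bit catching a one-bit error but missing a double-bit error, a CRC catching the double-bit error, and a CRC that also covers the block address catching the case where intact data arrives from the wrong place.

```python
import zlib


def parity_bit(data: bytes) -> int:
    """Even parity: 1 if the number of set bits in data is odd, else 0."""
    return sum(bin(b).count("1") for b in data) % 2


def flip_bits(data: bytes, positions) -> bytes:
    """Return a copy of data with the given bit positions inverted."""
    buf = bytearray(data)
    for pos in positions:
        buf[pos // 8] ^= 1 << (pos % 8)
    return bytes(buf)


def tag(address: int, payload: bytes) -> int:
    """CRC-32 over a hypothetical 8-byte block address plus the payload."""
    return zlib.crc32(address.to_bytes(8, "big") + payload)


payload = b"storage block contents"
sent_parity = parity_bit(payload)
sent_crc = zlib.crc32(payload)

# Simulate line noise: one flipped bit, then two flipped bits.
one_flip = flip_bits(payload, [3])
two_flips = flip_bits(payload, [3, 12])

# A single parity bit notices an odd number of flips but not an even number.
parity_catches_one = parity_bit(one_flip) != sent_parity
parity_catches_two = parity_bit(two_flips) != sent_parity

# The CRC notices the double-bit error that parity missed.
crc_catches_two = zlib.crc32(two_flips) != sent_crc

# Misdirected data: we asked for block 7 but received block 8, undamaged.
# The tag stored with the data covers address 8, so checking it against
# the address we asked for exposes the mix-up.
stored_tag = tag(8, payload)
wrong_block_caught = tag(7, payload) != stored_tag

print(parity_catches_one, parity_catches_two, crc_catches_two, wrong_block_caught)
```

The design point is the last check: the data itself is undamaged, so only a code that covers where the data came from, and not just what it contains, can tell you that you got something other than what you asked for.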

The result is that when the higher data rates in tomorrow’s systems begin to appear, the systems will be able to move the data without corrupting it.