Data deluge

Specialized storage systems help life sciences firms manage fixed content.

By Salvatore Salamone, Network World
February 24, 2003 12:10 AM ET

Storage management isn't easy for any industry, but biotech firms face some particularly vexing challenges. Research and diagnostic tools routinely generate huge amounts of data. Complicating matters is the need to store much of this data in a way that meets a range of regulatory requirements. What's more, some of this information needs to be kept for 35 years or more.

"We have eight mass spectrometer machines that produce 60 gigabytes of data per hour, per machine running around the clock," says Lloyd Segal, president and CEO of Caprion Pharmaceuticals in Montreal. The company uses a mix of Sun StorEdge T3 disk arrays and StorEdge L700 tape library systems. Online data is kept on the StorEdge T3 systems, which account for about 5 terabytes of capacity.


Industrywide, biotech companies must deal with raw data that doubles about every six to 12 months, according to experts. Much of this data never changes. Most biotech research and development experiments generate lab results that, once produced, are simply kept on file somewhere. And data collected in drug clinical trials - including X-rays, medical history and patient reactions to drugs - is collected once and never modified.

All this data often must be retained for more than a decade if it is to be used as part of a Food and Drug Administration new drug submission. Keeping data for that long is a storage management challenge in its own right.

There have been no specific studies to determine what percentage of biotech data does not change - so-called fixed content data. However, across all markets in general, 75% of all new digital data is fixed content, according to Hal Varian, dean of the School of Information Management and Systems at the University of California, Berkeley.

For such long-term storage "there are lots of problems with tape and optical," Varian says. "The [data storage medium] formats keep changing. And whenever you have a change in format, you have a big problem with data migration. It's easier to have the data available on hard drives because migrating becomes a much smaller problem."

A number of storage vendors recently have launched products that try to deal with this issue.

In December, IBM Storage Systems Group released a hardware and software bundle for sharing, managing and securing clinical trial patient information such as magnetic resonance imaging, electrocardiograms and other digital images. The bundle includes IBM TotalStorage hardware, Tivoli Storage Manager software and hierarchical storage management software to manage data migration from network-attached storage and storage-area network devices to tape libraries. And several third-party document management vendors have built links to EMC's Centera storage systems to simplify the way data is retrieved.

Storage management problems were one reason sister companies Celera Genomics and Applied Biosystems overhauled their computing and storage infrastructure last year. The firms replaced a 100-terabyte storage system from HP and an HP AlphaServer data center with EMC Centera systems and IBM eServer p60s.
