Clustered storage technologies catch on
By Deni Connor, Network World, 03/22/2004
As files and data sets grow into the terabyte and petabyte range, users are looking for a way to store, access
and share those files among different hosts.
That's where clustered and storage-area network (SAN) file systems come in.
Vendors have created software and hardware appliances that combine disparate file systems into one file system with one name
space. These appliances and software improve users' ability to access data and share the data with others irrespective of
the media or host computer on which it sits.
The technology these appliances and software use is known as clustered and SAN file systems. File systems of these types have
several advantages over distributed file systems:
- Clustering systems and sharing applications and data lets tasks finish much more quickly than they could on individual
machines, because data doesn't need to be copied or replicated from one file system to another.
- Clustering provides more space for files and file systems.
- Management is easier because only one file system is being managed, not a file system for each storage device or host computer.
- Failover is available because one server in the cluster can take over the responsibilities of another if it fails.
- Users have concurrent access to all files located on the storage devices on their network.
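The failover advantage in the list above can be illustrated with a minimal Python sketch. It is not taken from any vendor's product: the `Cluster` class, the node names and the "least-loaded survivor" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical cluster model: each server name maps to the shares it serves."""
    assignments: dict = field(default_factory=dict)

    def add_server(self, name, shares=()):
        self.assignments[name] = set(shares)

    def owner(self, share):
        """Return the server currently responsible for a share."""
        for server, shares in self.assignments.items():
            if share in shares:
                return server
        raise KeyError(share)

    def fail(self, name):
        """Hand a failed server's shares to the least-loaded surviving server."""
        survivors = [s for s in self.assignments if s != name]
        if not survivors:
            raise RuntimeError("no surviving server to take over")
        orphaned = self.assignments.pop(name)
        # Assumed policy: pick the survivor with the fewest shares (ties by name).
        survivor = min(survivors, key=lambda s: (len(self.assignments[s]), s))
        self.assignments[survivor] |= orphaned
        return survivor
```

Because every node sees the same file system, the takeover in this sketch moves only responsibility for the shares, not the data itself.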
Working as one
In a cluster, a group of independent nodes, or host computers, works together as one system. The nodes may share a common storage
array or SAN and have a common file system that has one name space. A traditional example is HP's Tru64 cluster file system used in TruCluster systems.
More recent implementations are from Cluster File Systems, Oracle, Red Hat, start-ups Panasas and Spinnaker Networks, and others. Red Hat, which acquired Sistina last year, released its clustered Global File System as open source. Network
Appliance, which acquired Spinnaker Networks, is using its SpinCluster software to improve its grid strategy, which clusters
network-attached storage (NAS) and SAN storage. Oracle uses its Cluster File System with the company's Real Application Clusters (Oracle 9i RAC). Cluster
File Systems uses its Lustre File System to build high-performance compute clusters.
In the Lustre File System, Panasas and Permabit implementations, individual servers are connected to storage by a metadata
server or device, which categorizes each bit of data so it can be found easily.
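The role of the metadata server can be pictured with a small Python sketch. It is a simplified illustration, not how Lustre, Panasas or Permabit actually implement it: the `MetadataServer` and `StorageDevice` classes, the `read_file` helper and the path-to-block mapping are all hypothetical.

```python
class MetadataServer:
    """Hypothetical metadata service: records where each file's data lives."""

    def __init__(self):
        self._locations = {}  # path -> (device name, block id)

    def register(self, path, device, block):
        self._locations[path] = (device, block)

    def lookup(self, path):
        return self._locations[path]


class StorageDevice:
    """Hypothetical storage device holding raw blocks keyed by block id."""

    def __init__(self):
        self._blocks = {}

    def write(self, block, data):
        self._blocks[block] = data

    def read(self, block):
        return self._blocks[block]


def read_file(path, metadata, devices):
    """Ask the metadata server where the data is, then read it from that device."""
    device_name, block = metadata.lookup(path)
    return devices[device_name].read(block)
```

In this sketch any server in the cluster can call `read_file` with the same path and get the same data, because the location lives in one shared catalog and not in a per-host file system.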
Mark Seager of Lawrence Livermore National Laboratory and Scott Studham of the Pacific Northwest National Laboratory are using
the Lustre File System.
"Before we had a file system on every cluster and had to FTP the data between file systems," says Seager, systems department
head. Seager has two 1,000-node clusters in production today.
"Better performance is a key criteria for using a clustered file system," Seager says. "The other issue for us is to not have
to replicate the data [to the other cluster] when someone needs it."
Seager's group does scientific simulation and modeling with its clusters. Seager says it's important to be able to read data
off the file system and concurrently see the results while the simulation is still going on.