As file sizes and data sets grow into the terabyte and petabyte range, users are looking for a method for storing, accessing and sharing the files among different hosts. That’s where clustered and storage-area network (SAN) file systems come in.

Vendors have created software and hardware appliances that combine disparate file systems into one file system with one name space. These products improve users’ ability to access data and share it with others, irrespective of the media or host computer on which it sits. The technology they use is known as clustered and SAN file systems. File systems of these types have several advantages over distributed file systems:

Clustered systems that share applications and data can perform tasks much more quickly than individual machines because data doesn’t need to be copied or replicated from one file system to another.
Clustering provides more space for files and file systems.
Management is easier because only one file system is being managed, not a file system for each storage device or host computer.
Failover is available because one server in the cluster can take over the responsibilities of another that fails.
Users have concurrent access to all files located on the storage devices on their network.

Working as one

In a cluster, a group of independent nodes or host computers works together as one system. The nodes may share a common storage array or SAN and have a common file system that has one name space. A traditional example is HP’s Tru64 cluster file system used in TruCluster systems.

More recent implementations are from Cluster File Systems, Oracle, Red Hat, start-ups Panasas and Spinnaker Networks, and others.
Red Hat, which acquired Sistina last year, released its clustered Global File System as open source; Network Appliance, which acquired Spinnaker Networks, is using its SpinCluster software to improve its grid strategy, which clusters network-attached storage (NAS) and SAN storage. Oracle uses its Cluster File System on the company’s Real Application Clusters (Oracle 9i RAC); Cluster File Systems uses its Lustre File System to build high-performance compute clusters.

In the Lustre File System, Panasas and Permabit implementations, individual servers are connected to storage by a metadata server or device, which categorizes each bit of data so it can be found easily.

Mark Seager of Lawrence Livermore National Laboratory and Scott Studham of the Pacific Northwest National Laboratory are using the Lustre File System.

“Before, we had a file system on every cluster and had to FTP the data between file systems,” says Seager, systems department head. Seager has two 1,000-node clusters in production today.

“Better performance is a key criteria for using a clustered file system,” Seager says. “The other issue for us is to not have to replicate the data [to the other cluster] when someone needs it.”

Seager’s group does scientific simulation and modeling with its clusters. Seager says it’s important to be able to read data off the file system and see the results while the simulation is still running.

Scott Studham, associate director for advanced computing for the Pacific Northwest National Laboratory in Richland, Wash., also is using Lustre for its performance characteristics and size.

“We have a 53T-byte file system that sustains performance of a little over 3G byte/sec,” Studham says.
“We also have our high-capacity storage that has a half-petabyte file system that will grow to a little over a petabyte.” Studham uses the larger file system for scratch space for computational chemistry, biology or subsurface analysis.

“Sometimes the files are too big to put into the file system on one of the nodes or the nodes need to share random locations in a very large file and you need to see the whole thing,” Studham says.

Cluster caveats

Despite the advantages that clustered file systems have, there are caveats.

“Clustered file systems are complicated animals, and they need good support and a knowledgeable set of technical people to architect them and get them working,” Seager says. “We do have a very close relationship with [Cluster File Systems] for debugging.”

“Clustered global file systems are still very hard,” Studham says. “If you have Ph.D. computer scientists and people that are brilliant with file systems, Lustre is the best-performing one, and it is the future. If I were a bank where my employees are the systems-administrator-type, I would rather go with a vendor-provided solution.”

On the other hand, SAN file systems connect servers with storage and “virtualize” the file system environment.

Terry Duncan, chief of engineering and a test branch for the Beam Control division at Kirtland Air Force Base in Albuquerque, N.M., has 50T bytes of data managed by ADIC’s StorNext FS. He uses it for scientific image-data collection.

“We take relatively large files of up to a gigabyte in size and use a hierarchical data format to save our data,” Duncan says. “We have a few million files in our large system.”

“We want multiple systems to be able to see the same data simultaneously at a very high rate,” Duncan says. Some of his files exceed a gigabyte in size.

Duncan says management is easier with a SAN file system.

“For us, it is definitely easier to manage [our data],” Duncan says.
“If we didn’t have the ability to write files to a common space so they can be accessed by a number of systems together, it would be very hard to handle the data rates that we require. We’d spend a lot of time moving a half-terabyte of data around and doing analysis on it if we didn’t have a single name space.”

“The StorNext is for when we need much higher performance,” says Paul Chapman, senior vice president of technology at FotoKem, a video post-production firm in Burbank, Calif. “The Isilon is used where we don’t need the same level of performance.”

Chapman says deploying an appliance is much easier than working to implement a file system.

“The Isilon was much easier to implement,” Chapman says. “You need to look at the nature of your data and what your needs are with it. There’s no one solution that does everything, which is why we have both the Isilon IQ and ADIC’s StorNext. Our intention is to put in a Linux cluster connected to the Isilon IQ.”

Slicing up storage

A sampling of products that allow concurrent access to files distributed among many servers and storage devices.

| Company name | Product | Type of file system | Operating systems supported |
|---|---|---|---|
| ADIC | StorNext FS | SAN | Windows, Linux, Unix |
| Cluster File Systems | Lustre File System | Clustered | Linux |
| IBM | General Parallel File System | Clustered, SAN | Linux, AIX |
| IBM | TotalStorage SAN File System | SAN | Windows, Unix |
| Panasas | ActiveScale File System | NAS | Linux |
| Red Hat | Sistina Global File System | Clustered, SAN | Linux |