What the scientists' Lustre could do for you
Lustre high-performance storage and file system now and in the future
Storage Alert
By Mike Karp, Network World, 07/29/2004
Last time, we introduced Lustre, a new clustered storage architecture that offers an open parallel file system with extreme
scalability. Today, we'll talk about how it is being used in the scientific world and how it could eventually be used in the
commercial sector.
Lustre is object-oriented, based on Linux, and provides storage to high-performance computing (HPC) environments that require
ultra-fast I/O. The plan is for Lustre to scale to the point where it can incorporate tens of thousands of nodes to provide
parallel I/O to a grid of application servers. As such, it plays a key role in the ongoing development of data grids.
Lustre is still in the relatively early stages of development, but it is already real and in use. Ultimately, Lustre will have
automatic failover and server reboot, will have no single point of failure, will work with tens of thousands of clients and
thousands of application servers, and will be application-transparent. But what is its present status?
Release 1, in production for about a year now, is up and running at a number of HPC sites, including Pacific Northwest National
Laboratory (53 terabytes of storage serving 1,280 dual-CPU Intel Xeon clients) and the National Center for Supercomputing Applications
(150 terabytes of storage serving 1,280 dual-CPU Xeon clients and 104 server nodes, all on a Gigabit Ethernet backbone).
Lustre now runs on i386, ia64, and x86-64 platforms, and it is being tested on the PowerPC processor. A Mac OS X client version
is under development.
Lustre is aimed at serving large computer clusters but, with minor variations in the implementation, it will work in smaller
commercial environments as well. Key points regarding the file system topology are:
* Data and metadata are stored separately. Data resides on an "object storage target" (OST), which includes both "object
storage servers" and storage devices. Data is addressed using metadata services that reside on "metadata servers."
* Clients - ultimately, in the tens of thousands - can reside on any of several types of LANs (Gigabit Ethernet and InfiniBand,
for example) simultaneously.
* Data is accessed by first getting a file's metadata from an active-passive pair of metadata servers (MDS). These use a
journal file system and front-end a dedicated metadata database. All file system namespace operations, such as file lookups,
file creation, and file and directory attribute manipulation, take place here. This is a high-availability solution: should
the active MDS fail, the standby server takes over immediately.
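The access path described in the points above can be illustrated with a minimal Python sketch. It is a toy model, not Lustre code: the class names, the `lookup` and `read` methods, the in-memory dictionaries, and the example path are all hypothetical, and the shared `namespace` dictionary merely stands in for the dedicated metadata database that both metadata servers front-end. The sketch shows a client asking the active MDS for a file's layout, falling back to the standby if the active server is down, and then reading the data objects from the OSTs.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectStorageTarget:
    """Toy OST: holds file data as objects keyed by an object id."""
    name: str
    objects: dict = field(default_factory=dict)

    def read(self, object_id):
        return self.objects[object_id]


@dataclass
class MetadataServer:
    """Toy MDS: maps a path to the (OST name, object id) pairs holding its data."""
    name: str
    namespace: dict  # stand-in for the shared metadata database (assumption)
    alive: bool = True

    def lookup(self, path):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.namespace[path]


class MetadataPair:
    """Active-passive pair: the active server answers; the standby takes over on failure."""

    def __init__(self, active, standby):
        self.active = active
        self.standby = standby

    def lookup(self, path):
        try:
            return self.active.lookup(path)
        except ConnectionError:
            # Failover: promote the standby and retry once.
            self.active, self.standby = self.standby, self.active
            return self.active.lookup(path)


def read_file(path, mds_pair, osts):
    """Step 1: fetch the layout from the MDS pair. Step 2: read the objects from the OSTs."""
    layout = mds_pair.lookup(path)
    return b"".join(osts[ost_name].read(obj_id) for ost_name, obj_id in layout)


# Hypothetical two-OST setup with one file split across both targets.
osts = {
    "ost0": ObjectStorageTarget("ost0", {1: b"hello "}),
    "ost1": ObjectStorageTarget("ost1", {7: b"world"}),
}
namespace = {"/data/run1": [("ost0", 1), ("ost1", 7)]}
pair = MetadataPair(MetadataServer("mds-a", namespace), MetadataServer("mds-b", namespace))

print(read_file("/data/run1", pair, osts))  # served by mds-a
pair.active.alive = False                   # simulate failure of the active MDS
print(read_file("/data/run1", pair, osts))  # served by mds-b after failover
```

In this sketch the metadata servers never touch file contents; they only return where the data lives, which is the separation of data and metadata described above. A real deployment would also have to handle the case where both metadata servers are unavailable, which this sketch simply lets surface as an error.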
Deni Connor is principal analyst for Storage Strategies NOW.