Storage analyst Deni Connor focuses on storage, application and infrastructure management in this twice-weekly newsletter.
Last time, we introduced Lustre, a new clustered storage architecture that offers an open parallel file system with extreme scalability. Today, we'll talk about how it is being used in the scientific world and how it could eventually be used in the commercial sector.
Lustre is object-based, runs on Linux, and provides storage to high-performance computing (HPC) environments that require ultrafast I/O. The plan is for Lustre to scale to the point where it can incorporate tens of thousands of nodes to provide parallel I/O to a grid of application servers. As such, it plays a key role in the ongoing development of data grids.
Lustre is still in the relatively early stages of development, but it is already in production use. Ultimately, Lustre will have automatic failover and server reboot, will have no single point of failure, will work with tens of thousands of clients and thousands of application servers, and will be application-transparent. But what is its present status?
Release 1, in production for about a year now, is up and running at a number of HPC sites, including Pacific Northwest National Laboratory (53 terabytes of storage serving 1,280 dual-CPU Intel Xeon clients) and the National Center for Supercomputing Applications (150 terabytes of storage serving 1,280 dual-CPU Xeon clients and 104 server nodes, all on a Gigabit Ethernet backbone). Lustre now runs on i386, ia64, and x86-64 platforms, and it is being tested on the PowerPC processor. A Mac OS X client version is under development.
Lustre is aimed at serving large computer clusters but, with minor variations in the implementation, it will work in smaller commercial environments as well. Key points regarding the file system topology are:
* Data and metadata are stored separately. Data resides on an "object storage target" (OST), which includes both "object storage servers" and storage devices. Data is addressed using metadata services that reside on "metadata servers."
* Clients, which will ultimately number in the tens of thousands, can reside on any of several types of LANs (Gigabit Ethernet and InfiniBand, for example) simultaneously.
* Data is accessed by first getting a file's metadata from an active-passive pair of metadata servers (MDSs). These use a journaling file system and front-end a dedicated metadata database. All file system namespace operations, such as file lookups, file creation, and file and directory attribute manipulation, take place here. This is a high-availability solution: should the active MDS fail, the standby server takes over immediately.
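The topology described in these points can be illustrated with a small toy model. The Python sketch below is not Lustre code and does not use any Lustre API; every class and name in it (`ObjectStorageTarget`, `MetadataServer`, `Client`, the round-robin placement, the shared `namespace` dictionary standing in for the metadata database) is a hypothetical stand-in chosen for illustration. It shows only two ideas from the list: file names live on the metadata servers while file data lives on the OSTs, and a client falls back to the standby metadata server when the active one fails.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectStorageTarget:
    """Stores file data as objects keyed by id; it has no notion of file names."""
    objects: dict = field(default_factory=dict)


@dataclass
class MetadataServer:
    """Answers namespace operations; both servers in a pair share one namespace."""
    name: str
    namespace: dict  # path -> (OST index, object id); stands in for the metadata database
    alive: bool = True

    def create(self, path, ost_count):
        # Round-robin placement is an assumption made for this sketch only.
        location = (len(self.namespace) % ost_count, f"obj-{len(self.namespace)}")
        self.namespace[path] = location
        return location

    def lookup(self, path):
        return self.namespace[path]


class Client:
    def __init__(self, active, standby, osts):
        self.mds_pair = [active, standby]
        self.osts = osts

    def _metadata_server(self):
        # Prefer the active server; use the standby if the active one has failed.
        for mds in self.mds_pair:
            if mds.alive:
                return mds
        raise ConnectionError("no metadata server available")

    def write(self, path, data):
        # Step 1: namespace operation on the metadata server.
        ost_index, object_id = self._metadata_server().create(path, len(self.osts))
        # Step 2: the data itself goes to the storage target, not the MDS.
        self.osts[ost_index].objects[object_id] = data

    def read(self, path):
        ost_index, object_id = self._metadata_server().lookup(path)
        return self.osts[ost_index].objects[object_id]


shared_namespace = {}
active = MetadataServer("mds-active", shared_namespace)
standby = MetadataServer("mds-standby", shared_namespace)
client = Client(active, standby, [ObjectStorageTarget(), ObjectStorageTarget()])

client.write("/results/run1.dat", b"simulation output")
active.alive = False  # simulate failure of the active metadata server
data_after_failover = client.read("/results/run1.dat")
print(data_after_failover)
```

In this sketch the read after the simulated failure still succeeds because the standby sees the same namespace as the failed server. How a real deployment shares metadata state between the pair is outside what this toy model attempts to show.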
Deni Connor is principal analyst for Storage Strategies NOW.