Last time, we introduced Lustre, a new clustered storage architecture that offers an open parallel file system with extreme scalability. Today, we'll talk about how it is being used in the scientific world and how it could eventually be used in the commercial sector.

Lustre is object-oriented, based on Linux, and provides storage to high-performance computing (HPC) environments that require ultra-fast I/O. The plan is for Lustre to scale to the point where it can incorporate tens of thousands of nodes providing parallel I/O to a grid of application servers. As such, it plays a key role in the ongoing development of data grids.

Lustre is still in the relatively early stages of development, but it is already real and in use. Ultimately, Lustre will have automatic failover and server reboot, no single point of failure, support for tens of thousands of clients and thousands of application servers, and application transparency. But what is its present status?

Release 1, in production for about a year now, is up and running at a number of HPC sites, including Pacific Northwest National Labs (53 terabytes of storage serving 1,280 dual-CPU Intel Xeon clients) and the National Center for Supercomputing Applications (150 terabytes of storage serving 1,280 dual-CPU Xeon clients and 104 server nodes, all on a Gigabit Ethernet backbone). Lustre now runs on i386, ia64, and x86-64 platforms, and it is being tested on the PowerPC processor.
An OS X client version is under development.

Lustre is aimed at serving large compute clusters, but with minor variations in implementation it will work in smaller commercial environments as well. Key points regarding the file system topology are:

* Data and metadata are stored separately. Data resides on an "object storage target" (OST), which includes both "object storage servers" and storage devices. Data is addressed using metadata services that reside on "metadata servers."
* Clients (ultimately, in the tens of thousands) can reside on any of several types of LANs simultaneously, such as Gigabit Ethernet and InfiniBand.
* Data is accessed by first getting a file's metadata from an active-passive pair of metadata servers (MDS). These use a journaling file system and front a dedicated metadata database. All file system namespace operations, such as file lookups, file creation, and file and directory attribute manipulation, take place here. This is a high-availability design: should the active MDS fail, the standby server takes over immediately.
* Sitting between the data and the clients is a series of object storage servers (OSSes), which manage the storage located on the storage devices.
* The data itself resides on storage devices behind the OSSes. These devices are treated as object-based storage and may be of any sort: RAID, JBOD, or individual disks. They may be connected to the OSSes directly or over a network.

All files fall within a global namespace, meaning that the file system presents the many directories from multiple file servers as a single unified directory tree. The value of this is that the single directory tree is valid from every workstation and remains valid when configurations are updated.

Why should commercial users be interested in a technology that is clearly designed for high-performance technical environments? Because the
benefits of high-performance computing often find their way to commercial applications sooner than we might think. Even today, there are reports (unconfirmed, but from a very credible source) that one commercial site is putting together a large-scale Lustre implementation.

And besides, although Lustre is intended as a high-performance computing file system, it runs on commodity hardware, which will make things easier when the time comes for it to move over to the commercial world. Until then, look for Lustre to appear on a data grid near you.
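The access path described earlier (fetch a file's layout from the active-passive MDS pair, then read the object data through the OSSes) can be sketched in miniature. This is an illustrative model only, not the real Lustre client API; every class and method name here is a hypothetical stand-in:

```python
# Toy model of Lustre's metadata-then-data access path.
# All names are illustrative assumptions, not actual Lustre interfaces.
from dataclasses import dataclass, field


@dataclass
class MetadataServer:
    """One member of the active-passive MDS pair."""
    name: str
    alive: bool = True
    # file path -> list of (oss_name, object_id) stripes
    layout: dict = field(default_factory=dict)

    def lookup(self, path):
        """Namespace operation: resolve a path to its object layout."""
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.layout[path]


@dataclass
class ObjectStorageServer:
    """An OSS fronting object-based storage devices."""
    name: str
    objects: dict = field(default_factory=dict)  # object_id -> bytes

    def read(self, object_id):
        return self.objects[object_id]


class LustreClient:
    """Consults the MDS pair for metadata, then reads stripes from OSSes."""

    def __init__(self, mds_pair, osses):
        self.mds_pair = mds_pair                    # (active, standby)
        self.osses = {o.name: o for o in osses}

    def read_file(self, path):
        # Step 1: ask the active MDS; if it is down, fail over to standby.
        for mds in self.mds_pair:
            try:
                stripes = mds.lookup(path)
                break
            except ConnectionError:
                continue
        else:
            raise RuntimeError("both metadata servers unavailable")
        # Step 2: fetch each stripe from its OSS and reassemble the file.
        return b"".join(self.osses[oss].read(oid) for oss, oid in stripes)


if __name__ == "__main__":
    layout = {"/scratch/run1.dat": [("oss1", 1), ("oss2", 2)]}
    active = MetadataServer("mds-a", layout=layout)
    standby = MetadataServer("mds-b", layout=layout)
    client = LustreClient(
        (active, standby),
        [ObjectStorageServer("oss1", objects={1: b"hello "}),
         ObjectStorageServer("oss2", objects={2: b"world"})],
    )
    print(client.read_file("/scratch/run1.dat"))  # b'hello world'
    active.alive = False                          # active MDS fails...
    print(client.read_file("/scratch/run1.dat"))  # ...standby takes over
```

The sketch captures the two properties the article emphasizes: metadata and data travel entirely separate paths, and the standby MDS lets reads continue when the active one fails.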