More work needs to be done on Linux at the operating system level, grids will have limited appeal, and there will be a mass movement to embrace clustering software among organizations large and small. These were some of the conclusions and predictions of Don Becker, co-founder of the Beowulf clustering project and a significant contributor to the Linux kernel, as he attended the LinuxWorld show in San Francisco this week.
Becker is the founder and chief scientist of Linux clustering vendor Scyld Software, a subsidiary of Linux workstation and server vendor Penguin Computing. Privately held Penguin acquired Scyld in June 2003. Becker founded Scyld (pronounced "scaled" or "skilled") back in 1998, building on work he did while at NASA (the U.S. National Aeronautics and Space Administration), where he started the Beowulf Parallel Workstation high-performance cluster computing project. NASA was interested in the project as an aid to modeling climate data.
IDG News Service caught up with Becker by phone Wednesday as he took a quick break from demonstrating Scyld clustering software at the show. What follows is an edited transcript of the interview.
What are your thoughts on how Linux has developed? Symantec and other vendors at the show have been talking about Linux entering a golden age in terms of adoption by enterprises. Do you agree?
Linux has evolved tremendously. When I started with Linux as an end user, there were probably a few hundred users, and that's probably an overestimation. I quickly became a developer because Linux didn't have reasonable networking and soon afterwards it needed networking. Linux has fulfilled the promise of Unix, going from [running on] a wristwatch to the fastest [high-end] machine.
Linux is state of the art in most cases, but [being] state of the art isn't good enough. There are many holes in what Linux does; there are still many opportunities [for developers]. Not everything has been done. Five years ago, Linus [Torvalds, the creator of Linux] said the basics had been done and all the interesting things [to be done] would be at the application level. That has turned out not to be the case.
[There's work to be done in] the storage and file systems areas, which are changing very quickly. Linux clearly needs a general-purpose and easily usable network-attached storage. iSCSI is a rapidly evolving technology and within the next year or two, we should see some products or implementations you wouldn't call 'clusters,' but enterprisewide storage shared across multiple systems. There's not a name for it yet.
There's also evolution going on inside the Linux kernel, evolving the VFS (virtual file system) layer. Linux isn't behind any [other operating] systems, it's just that there's a lot of development going on there.
What about grids? They seem to be a major focus at LinuxWorld, although often in terms of vendors trying to target customers leery of the associated complexity they perceive around grids.
Grid tools have been primarily developed on Linux, so that's their platform of choice. A grid implementation is much more difficult than clustering. You need to develop a whole new infrastructure, and then deploy and update it in ways that are usable. It's at least as difficult as deploying a network protocol. Look at IPv6: we're not even halfway there to deploying it.