Clemson IT team embraces call to be entrepreneurial

By , Network World
August 15, 2011 06:07 AM ET

In the Clemson cluster, OrangeFS is used to virtualize 32 commodity Dell storage servers while providing a single name space for the cluster nodes, Wilson says. Directory and file metadata are distributed on 1.6TB of solid state drives across the 32 storage nodes and there is a total of 256TB of raw rotational disk storage.

Unlike other high-performance file systems such as Lustre, which can only have a single metadata server, OrangeFS' distributed metadata approach and unified name space enable the file system to scale nicely while also simplifying operations, Wilson says.

These capabilities may ultimately benefit enterprise computing environments. "With a unified name space across potentially hundreds of storage nodes, you can add and remove nodes as needed and customers won't notice their files moving or ever have to be pointed to a new storage location," Wilson says. "Your unstructured data stores can grow and resize and be redundant and you won't have all of these different little silos of data. So it holds some potential to become an enterprise computing solution a couple of years down the road."

One Clemson researcher, Sebastien Goasguen, is using OrangeFS to develop a cloud-based infrastructure that can launch and work with tens of thousands of cluster-based virtual machines at once. "It leverages OrangeFS by enabling you to have a shared high-performing file system between all cluster nodes," Wilson says.

Goasguen is collaborating with KC (Kuang-Ching) Wang to build software-defined networks between VMs and client machines using OpenFlow, "which represents a nice convergence point with the university's work on OpenFlow," he says.

Clemson is one of seven collaborators with Stanford on the initial OpenFlow deployment. OpenFlow started out as a tool to facilitate network research by adding an open, centralized, software-defined layer of network routing, and it now promises to "change the whole way we think about networking," Wilson says. "A lot of people are realizing they would like more software-based control over their network infrastructure. ... You can do some really neat stuff."

For example, it isn't too painful for Clemson to shift IP addresses from its main data center to a smaller center on campus because they share subnets, but doing that over long distances and across multiple locations becomes extremely difficult, Wilson says. OpenFlow should vastly simplify the task by allowing dynamic networks to be created and changed not only at the infrastructure level but also at the application level, opening up significant opportunities for improvement in network flexibility and security.

While it is unclear whether or when Clemson will be able to profit from work on OpenFlow, it is already profiting from OrangeFS and other software that is licensed through Omnibond Systems, Wilson says. For example, companies interested in OrangeFS can purchase a 10-server bundle from Omnibond with support for $45,000.

Other Clemson work that Omnibond licenses includes identity management tools (including drivers for Novell's Identity Manager) and even traffic vision technology that state transportation departments can use to help turn roadside video feeds into sensors.

While the license fees help offset Clemson IT costs, the work also helps attract and keep really good people, Wilson says.

Enterprise IT

As important as the HPC cluster is, if it goes down, "researchers understand that's the way life goes," says CTO Pepin. "If the enterprise side goes down, we get fired. It's a smaller portion of the computer electrical power but 90% of the pain, so we care deeply about it."

The enterprise side of the data center includes a mainframe that supports two major systems, the main Medicaid system for the state and the university's student information system, which includes financial aid and registration. "We're on the front end of a transition to a new Medicaid system based on MITA (the Medicaid Information Technology Architecture) and a student information system replacement project, so the mainframe will be gone in about five years," CIO Bottum says. The new systems will be based on redundant commodity hardware and virtual machines.

The rest of the enterprise infrastructure -- some 700 x86 boxes, mostly Dell and Sun with a little bit of IBM mixed in -- supports about 155 applications, including everything from email and payroll to the school's Blackboard course management system. Most of the machines are running Linux but there is a modest amount of specific-purpose Windows and some Unix. "Our direction is to move toward Linux," Pepin says.

Enterprise computing row (Photo by Zac Wilson)

"This is where we're looking at doing some cloudy things in the Joni Mitchell model," he says. "It will be more of what you traditionally think of as a cloud because we probably will go down the virtualization path for a large portion of it."
