Google, IBM and NSF offer up $5M for large-scale computing research

The National Science Foundation in cooperation with Google and IBM today said it was seeking proposals for the group's Cluster Exploratory (CluE) initiative to explore innovative research ideas in data-intensive computing.

The NSF said it expects to award up to $5 million, spread across 10 to 15 awards, depending on the availability of funds. Selected projects will be funded at up to $500,000 each, for durations of up to two years.

CluE will provide NSF-funded researchers access to software and services running on a Google-IBM cluster. NSF will also support the researchers in conducting their work, while Google and IBM will cover the costs of operating the cluster and provide other support to the researchers. According to the NSF, the system will be configured with open source software, including Linux and Apache Hadoop, a large-scale distributed computing platform inspired by Google's MapReduce and the Google File System. IBM's Tivoli software will also be used for management, monitoring and dynamic resource provisioning of the cluster.

The initiative is looking for proposals that focus on data-intensive applications and "not cluster computing per se. We are not looking for scientific applications that are based primarily on solving massive numbers of partial differential equations, since high-end computing resources are already available for such research," said Jeannette Wing, assistant director for the Computer and Information Science and Engineering (CISE) directorate at the NSF.

In data-intensive computing, the sheer volume of data is the dominant performance parameter. Storage and computation are co-located, enabling large-scale parallelism over terabytes of data. This scale of computing supports applications specified in high-level programming primitives, where the run-time system manages parallelism and data access. Supporting architectures must be extremely fault-tolerant and exhibit high degrees of reliability and availability, the NSF said.
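The "high-level programming primitives" the NSF describes are exemplified by the MapReduce model that Hadoop implements: the programmer supplies only a map and a reduce function, and the runtime handles parallelism and data placement. As an illustrative sketch only (a simplified, single-machine analogue, not actual Hadoop API code, and the word-count task is a hypothetical example), the model looks like this:

```python
# Minimal single-process sketch of the MapReduce programming model.
# In Hadoop, the runtime would shard the input across the cluster,
# run many mappers and reducers in parallel, and shuffle intermediate
# (key, value) pairs between them; here both phases run sequentially.
from collections import defaultdict

def map_phase(records):
    """Map step: emit (word, 1) for every word in every input record."""
    for record in records:
        for word in record.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    """Reduce step: sum the emitted counts for each distinct key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

lines = ["the quick brown fox", "the lazy dog", "The fox"]
word_counts = reduce_phase(map_phase(lines))
print(word_counts)  # e.g. {'the': 3, 'quick': 1, ..., 'fox': 2, ...}
```

The appeal for data-intensive research is that this same two-function program scales from a laptop to thousands of nodes: the developer never writes code for scheduling, fault tolerance or data movement.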

According to the NSF, over the last five years private sector companies have launched a number of Internet-scale applications powered by massively scaled, highly distributed data clusters. These clusters contain as many as 90,000 servers, each co-located with hundreds of gigabytes of data. Such increases in computing and network capacity, together with fundamental changes in computer architecture, are encouraging software developers to take new approaches to solving computer-science problems.

Until now, such resources have not been easily available or affordable for academic researchers. In October 2007, Google and IBM created a large-scale computer cluster of approximately 1,600 processors to give the academic community access to otherwise prohibitively expensive resources. Earlier this year, NSF joined with the two companies to assist with this effort, and the CluE initiative was born, the NSF said.

Google and IBM announced last fall that they were using the cluster as part of a joint initiative to help computer science students gain more knowledge of highly parallel-computing practices.



Copyright © 2008 IDG Communications, Inc.
