DOE taps 6 firms for exascale computing research

The U.S. Department of Energy will give a total of $258 million for research and development into exaFLOP computing

The Department of Energy has awarded six tech firms a total of $258 million in funding for research and development into exascale computing. The move comes as the U.S. is falling behind in the world of top supercomputers.

Energy Secretary Rick Perry announced that AMD, Cray, Hewlett Packard Enterprise, IBM, Intel and Nvidia will receive financial support from the Department of Energy over the course of a three-year period. The funding will finance research and development in three main areas: hardware technology, software technology and application development.

Each company will provide 40 percent of the overall project cost in addition to the government funding. The plan is for one of those companies to be able to deliver an exascale-capable supercomputer by 2021. The awards are made through the PathForward program, a part of DOE’s Exascale Computing Project (ECP) that is designed to accelerate the research necessary to deploy the nation’s first exascale supercomputers.

“Continued U.S. leadership in high-performance computing is essential to our security, prosperity and economic competitiveness as a nation,” Perry said in a statement. “These awards will enable leading U.S. technology firms to marshal their formidable skills, expertise and resources in the global race for the next stage in supercomputing—exascale-capable systems.” 

The term exascale means a system capable of performance measured in exaFLOPs, where one exaFLOP is a billion billion calculations per second. Current supercomputers are measured in petaFLOPs; one petaFLOP is one-thousandth of an exaFLOP.

And the U.S., which for so long dominated supercomputing, is losing its edge. For some time, China’s Tianhe-2 supercomputer dominated the top of the Top 500 list of global supercomputers. Now Chinese systems hold the top two positions, and a Swiss supercomputer is in the number three slot. The best showing from the U.S. is Titan, a giant system at Oak Ridge National Laboratory in Oak Ridge, Tennessee.

Irony of ironies, Titan is powered by ageing AMD Opteron processors, which have long been viewed as not competitive with Intel’s Xeon server processor. All told, 441 of the top 500 are Xeon-powered, while only six use Opteron. 

The race to exascale

ECP Director Paul Messina said the six companies are expected to work on solving four key challenges: parallelism, memory and storage, reliability, and energy consumption. Their work will include the development of innovative memory architectures, higher-speed interconnects and more computing power without a matching rise in energy consumption.

The race to exascale will require breakthroughs in compute technology to shrink the physical size of these systems. One of the requests from the DOE is that the exascale system be so energy efficient that it requires only 20 to 30 megawatts. The current systems are power-sucking monstrosities that can fill a basketball court. The top two Chinese systems consume 15 and 17 megawatts, respectively, and neither delivers even a tenth of the performance of an exascale system. The Swiss supercomputer is far more efficient, consuming only 2.2 megawatts.

So, if the best system, at 93 petaflops, consumes 15 megawatts of power, then scaled up linearly to exascale it would consume 150-plus megawatts, which is more than a city. That’s clearly not acceptable in today’s power-sensitive climate. There needs to be greater density in compute power and faster interconnects because the current situation is not scalable without tremendous and unappealing cost.
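For anyone who wants to check that back-of-the-envelope math, here is a minimal Python sketch. It assumes, as the estimate above does, that power draw grows in direct proportion to performance, which real systems will not follow exactly. The function names are my own, chosen for illustration, and the inputs are the rounded figures quoted in this article.

```python
# Back-of-the-envelope power scaling, using the rounded figures from the article.
# Assumption: power grows in direct proportion to performance (linear scaling).

PETAFLOPS_PER_EXAFLOP = 1_000  # one petaFLOP is one-thousandth of an exaFLOP


def scaled_power_mw(perf_pflops, power_mw, target_pflops=PETAFLOPS_PER_EXAFLOP):
    """Megawatts needed to reach target_pflops under linear scaling."""
    return power_mw * (target_pflops / perf_pflops)


def efficiency_gain_needed(perf_pflops, power_mw, budget_mw):
    """Factor by which performance per watt must improve to fit the budget."""
    return scaled_power_mw(perf_pflops, power_mw) / budget_mw


if __name__ == "__main__":
    # Top-ranked system as quoted above: 93 petaflops at 15 megawatts.
    naive = scaled_power_mw(93, 15)
    print(f"Linear scaling to one exaFLOP: {naive:.0f} MW")

    # DOE's requested power envelope: 20 to 30 megawatts.
    for budget in (20, 30):
        gain = efficiency_gain_needed(93, 15, budget)
        print(f"To fit {budget} MW, efficiency must improve about {gain:.1f}x")
```

Under those assumptions the linear estimate lands at about 161 megawatts, so fitting into the DOE's 20-to-30-megawatt envelope would take roughly a five- to eightfold improvement in performance per watt.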

It’s encouraging to see AMD on this list. Oak Ridge aside, AMD has fallen so far in recent years as to be a non-player in the server space. Mercury Research, which follows CPU sales, puts its market share at less than 1 percent. That, of course, could change with the new Epyc chips, which are clearly competitive with the best of Intel’s Xeon line.

I’m curious as to where the money will go, as all six of these firms were already heavily invested in high-performance computing and there is quite an arms race in the CPU and GPU worlds. They’d be making the push anyway. And this is one case where I’m not worried about government money being wasted. They will make good use of it.
