$68 million, 200-year, 156,000-core analytics job run on Amazon's cloud in 18 hours for $33,000

That's a lot of computing in the cloud

When Schrodinger wanted to use its Materials Science tools to test 200,000 different organic compounds to see which ones could be a good fit for photovoltaic electricity generation, the amount of computation required was an inhibiting factor, to say the least.

The company wanted to design, synthesize and test various combinations to find just the right fit. The job would have required about $68 million worth of infrastructure, or almost 200 years if run on a single machine. Instead, Schrodinger hired Cycle Computing, which specializes in large-scale distributed high-performance computing, to do it all in Amazon's public cloud. The job ran across 156,000 virtual cores and exceeded 1.21 petaflops of computing capacity, using a distributed system of virtual machines spread across eight Amazon Web Services regions around the world for a total of 18 hours.
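For a sense of scale, here is a rough back-of-the-envelope comparison of the single-machine estimate against the actual cloud run, using the figures quoted above (the arithmetic is illustrative only):

```python
# Figures quoted in the article.
serial_years = 200          # estimated time on a single machine
cloud_hours = 18            # actual wall-clock time on AWS
serial_cost = 68_000_000    # estimated infrastructure cost ($)
cloud_cost = 33_000         # actual AWS bill ($)

serial_hours = serial_years * 365 * 24   # ignore leap years for a rough figure
speedup = serial_hours / cloud_hours
cost_ratio = serial_cost / cloud_cost

print(f"speedup: ~{speedup:,.0f}x")            # roughly 97,000x faster
print(f"cost: ~{cost_ratio:,.0f}x cheaper")    # roughly 2,000x cheaper
```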

Cycle says it was a record-breaking, petaflop-scale analytics job, so big that the company has dubbed it the "MegaRun."

Cycle Computing has a software management platform that controls the hundreds of thousands of virtual machines needed to run these types of jobs. Life science testing is a perfect fit for this software because of the massive number of options available to scientists testing a broad range of theories.

Cycle uses its software to make each job as inexpensive as possible. Using cloud-based resources that are spun up and then deprovisioned as soon as the job is finished, the total cost came to just $33,000. Cycle used more than 16,700 AWS Spot Instances, virtual machines that are not reserved or dedicated resources but are instead offered to customers out of spare capacity at a discount. The Cycle software also schedules data movement, encrypts the data, and automatically detects and troubleshoots some errors, such as machine, zone or region failures.
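Cycle's own orchestration layer isn't public, but the underlying AWS mechanism it leans on is the EC2 Spot Instance request. A minimal sketch of how the parameters for such a request might be assembled (the AMI ID, instance type, and bid price here are hypothetical placeholders; an actual request also needs AWS credentials and would be submitted through an AWS SDK such as boto3):

```python
# Sketch of building an EC2 Spot Instance request, the AWS mechanism
# that lets Cycle bid on spare capacity. All concrete values below are
# hypothetical examples, not from the article.

def build_spot_request(ami_id, instance_type, max_price, count):
    """Build keyword arguments for EC2's request_spot_instances call."""
    return {
        "SpotPrice": str(max_price),   # max $/hour we are willing to pay
        "InstanceCount": count,        # how many instances to request
        "Type": "one-time",            # capacity is released when the job ends
        "LaunchSpecification": {
            "ImageId": ami_id,
            "InstanceType": instance_type,
        },
    }

# With boto3, this dict would be passed as:
#   boto3.client("ec2").request_spot_instances(**params)
params = build_spot_request("ami-12345678", "c3.8xlarge", 0.50, 100)
print(params["InstanceCount"])  # 100
```

Because spot capacity can be reclaimed by AWS at any time, a scheduler like Cycle's has to detect interrupted machines and reassign their work, which is the error handling described above.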

While the 156,000-core run is an impressive accomplishment for Cycle, the company has been building up to jobs of this size for years. Between 2010 and 2013, it ran analytics jobs of 2,000; 4,000; 10,000; 30,000 and 50,000 cores. In addition to using its own software, named Jupiter, Cycle also used Chef automated configuration tools.

Senior Writer Brandon Butler covers cloud computing for Network World and NetworkWorld.com. He can be reached at BButler@nww.com and found on Twitter at @BButlerNWW.
