Windows or Linux? Tokyo HPC chief lets users decide

A few days ago, we told you about Microsoft’s surprising bid to join the petascale computing age. Windows HPC Server, it seems, was able to hit petaflop speeds on Japan’s largest supercomputer, but the achievement was not recognized by the twice-yearly Top 500 list because Linux performed better on the same machine.

All we knew at the time was that the operators of Tsubame 2.0, the HPC cluster at the Tokyo Institute of Technology, had tested the machine’s speed with both Windows and Linux, with Linux coming out ahead because the Linux run was performed on a slightly larger number of nodes.

One reader who commented on the blog post joked that Tokyo officials “didn’t have enough licenses to run [Windows] on that many.”

But it turns out a software bug prevented the Windows HPC Server run from matching Linux’s speed and ability to run across more nodes. The bug was not in Windows HPC Server itself but rather in a software package Microsoft designed to run the Top 500 benchmarking test. Satoshi Matsuoka, professor at the Tokyo Institute of Technology, explained it to me today at the SC10 supercomputing conference in New Orleans, saying Linux’s victory “was purely by chance.”

Here’s what happened. To submit scores to the Top 500 supercomputers list, cluster operators have to run the Linpack Benchmark, a software library designed to test a cluster’s speed under extreme conditions.

It’s like driving a Ferrari and “hitting the gas flat out for four hours,” Matsuoka said.

Because Tsubame uses both Intel CPUs and Nvidia graphics processing units, Tokyo officials needed to run a custom implementation of the High-Performance Linpack Benchmark to take full advantage of the heterogeneity of the system.
The Tokyo computer scientists wrote code for the Linux run themselves, and for the Windows run used Linpack code written by Microsoft employees.

While a full Linpack run takes a few hours, Tsubame’s creators actually spent more than a week preparing and conducting the tests. The strategy is to start with small tests, and gradually ramp up, identifying problems that slow performance down as you go along.

“In actuality, it’s an enormous effort,” Matsuoka said. “Things break down. There’s such a huge stress on the system. It’s the sort of stress that this machine will never see in real production.”

Ultimately, the Linux run was performed over 1,357 nodes, achieving speeds of 1.192 petaflops (one petaflop is equal to one thousand trillion calculations per second). This speed gave Tsubame the title of the world’s fourth-fastest supercomputer.

Windows was outperforming Linux at small workloads, and eventually hit 1.118 petaflops across just under 1,300 nodes, according to Matsuoka. But when a Windows run across 1,360 nodes was attempted, the Linpack software designed for the Windows run failed due to a memory initialization bug.

Microsoft has since fixed the bug, but it was enough to derail the Windows bid to top Linux.

“There was a small bug in the Windows code that basically did not let them complete their final run,” Matsuoka said. “And we ran out of time. We had to use their second best number, which turned out to be slightly lower than Linux.”

Whether Windows would have beaten Linux if not for the software bug is “a mystery that’s engulfed in history, because they failed at the very last moment,” he says.

Matsuoka is interested in why Windows was able to outperform Linux in running smaller problems. Since the hardware was the same for both runs, it must come down to either the operating system or differences between the customized Linpack software packages.

“We haven’t had the time to do the side-by-side comparison,” Matsuoka says.
“We’ll probably do that and publish a paper.”

Tsubame is a remarkably energy-efficient, general-purpose supercomputer with about 2,000 users in academic and industry research circles. Because Tsubame uses a KVM hypervisor and various cloud-like provisioning tools, it can run both Windows and Linux at the same time on different nodes, and offer users various types of processing configurations.

“We’re very flexible,” Matsuoka says. “We can switch certain subsets of nodes to Windows from Linux and vice versa.” Running both operating systems at the same time is possible “because we run virtual machines on some of the nodes.”

Naturally, Matsuoka’s user base demands Linux more often than Windows. A little more than 80% of the machine’s time is devoted to Linux, specifically Novell SUSE Linux 11, he says, and under 20% to Windows.

“Of course, we get more demand for Linux,” Matsuoka says. “But we do get Windows demand too. Because we can do dynamic provisioning we will size our Linux vs. Windows accordingly to demand and load.”

“This might be the first time this has been done at this scale,” he adds, referring to the Windows/Linux flexibility.

Although most people in the supercomputing crowd might scoff at Windows, which accounts for only five of the Top 500 HPC clusters, Matsuoka says there seems to be little difference in performance. It should be noted that Microsoft has helped fund the Tokyo Institute of Technology’s supercomputing programs.

“I was very curious to see which one would be superior, both in terms of the [Linpack] algorithm, and the underlying operating system,” Matsuoka said. “It was very surprising, because they were very similar in performance.”

Follow Jon Brodkin on Twitter.