Why Amazon is rebooting 10% of its cloud servers

Amazon says a patch is needed to fix Xen hypervisor issue


Amazon Web Services issued a blog post on Thursday providing more details on why the company needs to reboot up to 10% of its cloud servers in the coming days, and the reason has nothing to do with the so-called Shellshock vulnerability.


Amazon says Elastic Compute Cloud (EC2) servers from across the globe will be impacted by what it calls a “timely security and operational update” related to its open source Xen hypervisor. The blog post reads:

“As we explained in emails to the small percentage of our customers who are affected and on our forums, the instances that need the update require a system restart of the underlying hardware and will be unavailable for a few minutes while the patches are being applied and the host is being rebooted.”

The full blog post can be read here.

It appears to be just a coincidence that an update to the open source Xen hypervisor is happening at the same time that security experts have identified a major vulnerability in the Bash shell, which is widely used on Linux systems. The flaw is known as the Bash Bug, and some are dubbing it Shellshock. AWS officials say the two events are unrelated.


Amazon likely deals with many vulnerabilities each day and week, but Jesse Proudman, founder and CTO of cloud provider Blue Box, says this Xen bug is different because it affects the hypervisor that creates virtual machines. The only way to appropriately patch the system is to reboot it.

AWS goes on:

“While most software updates are applied without a reboot, certain limited types of updates require a restart. Instances requiring a reboot will be staggered so that no two regions or availability zones are impacted at the same time and they will restart with all saved data and all automated configuration intact. Most customers should experience no significant issues with the reboots. We understand that for a small subset of customers the reboot will be more inconvenient; we wouldn’t inconvenience our customers if it wasn’t important and time-critical to apply this update.”

Amazon says that the updates must be done before October 1, when details of the Xen flaw are made public as part of the Xen update XSA-108 release. At that time, expect AWS and the Xen community to share more details about the specific security flaw being patched.

Proudman suspects the issue is related to flaw CVE-2014-7155 in the Xen code, which was first announced on Wednesday. An attacker can exploit the bug to escalate privileges, potentially gaining access to other virtual machines. In contrast, an issue like Shellshock is something that can be patched in the Linux code and does not require a reboot of the machine.

Proudman says CVE-2014-7155 has been in the Xen code since the 3.2 release, which was in 2008. Still, he says that customers should not be too worried about the situation since Amazon will be updating all of its impacted machines before more details about the security vulnerability are publicly released on October 1. Proudman says AWS is absolutely doing the right thing by updating its systems and rebooting customer machines, even if that may cause some stress in the coming days.

The big takeaway for customers is that a subset of AWS instances will be rebooted at some point in the next five days. Cloud consultancy RightScale expects the reboots to begin at 10 PM ET on Thursday and run through Sept. 30 at 7:59 PM ET. Customers don’t necessarily have to do anything, but they should be prepared for their EC2 instances to go down for a few minutes if they’ve been notified by AWS. RightScale advises AWS users to test how their systems handle a reboot. “It’s going to test the operational prowess of a lot of systems,” Proudman says.
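For readers who want to check which of their instances are scheduled for a reboot, the following is a minimal sketch rather than anything AWS or RightScale prescribes. It assumes the boto3 Python SDK and configured AWS credentials, neither of which is mentioned in the AWS post. The helper names `pending_reboots` and `fetch_status` are hypothetical. The sketch reads the scheduled events that EC2 reports through its `DescribeInstanceStatus` API and keeps only the reboot events.

```python
# Sketch only: list EC2 instances that have a scheduled reboot event.
# Assumption: boto3 is installed and AWS credentials are configured.

REBOOT_CODES = ("instance-reboot", "system-reboot")


def pending_reboots(status_response):
    """Return (instance_id, event_code, not_before) for each reboot event.

    `status_response` is a dict shaped like the result of the EC2
    DescribeInstanceStatus call.
    """
    found = []
    for status in status_response.get("InstanceStatuses", []):
        for event in status.get("Events", []):
            if event.get("Code") in REBOOT_CODES:
                found.append(
                    (status["InstanceId"], event["Code"], event.get("NotBefore"))
                )
    return found


def fetch_status(region):
    """Hypothetical helper: fetch one page of instance status for a region."""
    import boto3  # third-party SDK, imported here so the parser works without it

    client = boto3.client("ec2", region_name=region)
    # IncludeAllInstances also returns instances that are not running.
    # Large fleets would need to follow NextToken for further pages.
    return client.describe_instance_status(IncludeAllInstances=True)
```

A call such as `pending_reboots(fetch_status("us-east-1"))` would return one tuple per scheduled reboot; the region name here is only an example. Keeping the parsing separate from the API call means the logic can be exercised against a saved response without touching AWS.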
