An Uptime Institute survey finds the power usage effectiveness of data centers is better than ever. However, power outages have increased significantly.

A survey from the Uptime Institute found that while data centers are managing power better than ever before, the rate of failures has also increased, raising the question of whether the two trends are connected.

The Global Data Center Survey report from Uptime Institute gathered responses from nearly 900 data center operators and IT practitioners, both from major data center providers and from private, company-owned data centers. It found that the power usage effectiveness (PUE) of data centers has hit an all-time low of 1.58. By way of contrast, the average PUE in 2007 was 2.5; it dropped to 1.98 in 2011 and to 1.65 in the 2013 survey.

PUE is a measure of the power needed to operate and cool a data center: total facility power divided by the power delivered to the IT equipment. A PUE of 2 means that for every watt of power used to run the IT systems, another watt is needed to cool them. A PUE of 1.5 means that for every watt into the IT systems, half a watt is needed for cooling. Lowering PUE is therefore something of an obsession among data center operators.

However, Uptime also found a negative trend: the number of infrastructure outages and “severe service degradation” incidents rose to 31 percent of those surveyed, up 6 percentage points from last year’s 25 percent. Over the past three years, nearly half of respondents had experienced an outage at their own site or at a service provider’s site.

This raises the question: Is one causing the other? Is the obsession with lower PUE somehow causing more and bigger outages? Rhonda Ascierto, vice president of research with the Uptime Institute, says no. “We can’t determine that,” she told me. “Some in the media have made that connection, but correlation is not causation. It’s certainly possible they are linked and some findings around efficiency are related, but we did not link those together.”

Most downtime incidents lasted one to four hours.
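The PUE arithmetic is simple enough to sketch in a few lines of Python. The wattage figures below are illustrative, not from the survey; only the ratios (1.58, 2.0, 1.5) come from the article:

```python
def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power usage effectiveness: total facility power / IT equipment power."""
    if it_load_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_load_kw

def overhead_per_it_watt(pue_value: float) -> float:
    """Watts of cooling and other overhead consumed per watt delivered to IT gear."""
    return pue_value - 1.0

# A facility drawing 1,580 kW in total to support a 1,000 kW IT load
# matches the survey's record-low average PUE of 1.58.
print(pue(1580, 1000))            # 1.58
print(overhead_per_it_watt(2.0))  # 1.0 -- one watt of overhead per IT watt
print(overhead_per_it_watt(1.5))  # 0.5 -- half a watt of overhead per IT watt
```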
Uptime asked those who suffered an outage to estimate its cost, but 43 percent didn’t calculate one, in many cases because too many of the factors involved fell outside the respondent’s area of expertise. Half of those who did make an estimate put the cost at less than $100,000, but 3 percent said costs were over $10 million.

What causes data center outages?

The leading causes of data center outages are power outages (33 percent), network failures (30 percent), IT staff or software errors (28 percent), on-premises non-power failures (12 percent), and third-party service provider outages (31 percent). The figures sum to more than 100 percent, as an outage can have more than one cause.

To err is human, and this survey showed it: nearly 80 percent said their most recent outage could have been prevented. And that human error extends to management decisions, Ascierto said. “Oftentimes, people talk about human error being the cause of outages, but it can include management errors, like poorly maintained or derated equipment that may not match runtime requirements,” she said. “The human error comes down to management responsibility.”

She added that another cause of failures is the trend toward data center consolidation, with firms moving workloads from secondary data centers to primary ones. This takes time, and because the secondary site is being decommissioned, the owner stops investing in it. Wear and neglect creep into a doomed data center, making it more likely to fail.

Another source of problems is the cascading effect of one data center outage taking down others. That could involve two private data centers, or a hybrid setup where an on-premises center is connected to a third-party provider such as Amazon or Microsoft. If one goes down, it has a greater chance of taking down the other(s). Uptime found that 24 percent of those surveyed said they were affected by outages across multiple data centers.
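The cascading risk described above has a simple probabilistic intuition: when sites depend on each other in series, availabilities multiply, so adding a dependency lowers end-to-end uptime; redundancy only helps when either site can serve alone. A minimal sketch, with illustrative availability figures that are assumptions rather than survey data:

```python
def chain_availability(availabilities):
    """End-to-end availability when every component must be up (serial dependency)."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result

def redundant_availability(availabilities):
    """Availability when any one component being up suffices (true redundancy)."""
    unavailability = 1.0
    for a in availabilities:
        unavailability *= (1.0 - a)
    return 1.0 - unavailability

# An on-premises site and a third-party cloud provider, each 99.9% available:
on_prem, cloud = 0.999, 0.999
print(round(chain_availability([on_prem, cloud]), 6))      # 0.998001 -- worse than either alone
print(round(redundant_availability([on_prem, cloud]), 6))  # 0.999999 -- only if either can serve alone
```

The gap between the two numbers is the point Ascierto makes: a hybrid architecture is only more resilient if it is genuinely redundant, not merely interdependent.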
“Five years ago it would be a much lower number,” said Ascierto, who added that she expects outages caused by cascading failures between sites to increase, since more and more companies are adopting multi-cloud strategies and IT services are growing more interdependent. “There is this belief that having a hybrid architecture makes you more resilient, but visibility and accountability is more difficult and the rate of outage is high,” she said.