Survey: Outages, staffing challenge data centers

Data center operators are working to increase IT infrastructure reliability, keep key talent from being poached, and stay ahead of environmental regulations, Uptime Institute reports.

Data centers are working to improve the resiliency of their physical infrastructure, avoid increasingly expensive outages, and recruit skilled staff in a competitive labor market. Meanwhile, many aren’t tracking critical environmental metrics, even as they face looming sustainability requirements.

These are some of the highlights of Uptime Institute's 12th annual Global Data Center Survey, which tracks trends in capacity, tech adoption, and staffing.

Server refresh cycles getting longer

The lifespan of servers is increasing, according to Uptime, and it often exceeds the vendor-recommended three to five years. In Uptime’s 2015 survey, 34% of respondents said they kept their servers in operation for five years or longer. In 2022, that climbed to 52%.

There are multiple reasons for the increase, according to Uptime. Semiconductor availability is one factor. Component shortages resulted in higher prices and increased delivery times, and “smaller organizations with less buying power were often required to hold off on nonessential upgrades,” the research firm reported.

The trend may also reflect a slowdown in server power-efficiency gains. New IT hardware usually improves data-center efficiency, but Uptime suggests that those efficiency gains are slowing. “Generational changes, particularly in Intel-powered servers, which make up most of the market, are delivering much lower performance and energy improvements than before,” Uptime wrote. “Supply of more efficient servers using alternative (AMD and ARM-based) processors is still limited.”

Cost of data-center outages climbing

There are some encouraging metrics around data center outages, but Uptime cautions that they can be misinterpreted.

In the big picture, Uptime has tracked a steady improvement in the number of outages per site. In 2022, 60% of operators surveyed said they had an outage in the past three years, down from 69% in 2021 and 78% in 2020.

Another positive: Fewer managers reported serious or severe data-center outages. Historically, outages deemed serious/severe account for about 20% of all outages, according to Uptime. In 2022, that fell to 14%.

Despite lower numbers of outages per site and less frequent severe outages, the overall number of outages globally grew year-over-year. On the plus side, the frequency of outages didn’t grow as fast as the global data-center footprint grew.

While metrics around outages can be tricky to interpret, one trend is clear, according to Uptime: Outages are becoming more expensive. In particular, the number of outages costing more than $1 million is on the rise.

When asked about the cost of their most recent outage, 25% of respondents said the outage cost more than $1 million in both direct and indirect costs, a significant increase from 2021, when 15% reported million-dollar outages. Another 45% of 2022 respondents said their most recent outage cost between $100,000 and $1 million, compared to 47% in 2021. From the report:

“Why is the cost of outages increasing? This can be attributed to a variety of factors, ranging from inflation, fines, service level agreement breaches and the cost of labor, call outs and replacement parts – but the biggest single reason is the growing dependency of corporate economic activity on digital services and on the data center. The loss of a critical IT service often translates directly and immediately into disrupted business and lost revenue.”

Power issues still main cause of outages

On-site power problems remain the single biggest cause of significant site outages by a large margin, according to Uptime. In 2022, 44% of respondents said power was the primary cause of their organization’s most recent impactful incident or outage.

The next most common cause was network issues, cited by 14%. Other notable causes include cooling failures (13%), IT systems problems (13%), and problems at third-party providers such as SaaS, hosting, and cloud providers (8%).

Failure to back up apps in multiple cloud zones

There are mixed messages on the cloud front, too. On the one hand, enterprises are becoming more confident about using the cloud for mission-critical workloads. In 2019, 74% of respondents said they wouldn’t place mission-critical workloads in a public cloud. In 2022, that fell to 63%. At the same time, the percentage of respondents who said they have adequate visibility into the resiliency of the service provided by a public cloud rose from 14% to 21%, Uptime reports.

“Organizations are becoming more confident in using the cloud for mission-critical workloads, partly due to a perception of improved visibility into operational resiliency,” Uptime wrote. “However, other data suggests cloud users’ confidence may be misplaced.”

The issue is availability zones. An availability zone typically has redundant power and networking, and cloud providers recommend that users distribute their workloads across multiple availability zones in case one zone suffers an outage, according to Uptime. The data suggests enterprises aren't doing that as diligently as they ought to.

When asked about the potential impact if a primary cloud provider were to experience an outage across a single availability zone, 35% of respondents said that it would result in significant performance issues or downtime, and another 49% said minor performance issues or downtime would be expected.

“This presents a clear contradiction. Users appear more confident that the cloud can handle mission-critical workloads, yet over a third of users are architecting applications vulnerable to relatively common availability zone outages,” Uptime wrote.
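
To make the availability-zone recommendation concrete, here is a minimal sketch, not taken from the Uptime report, of why spreading replicas across zones limits the impact of a single-zone outage. The zone names, the replica count, and both helper functions are hypothetical and chosen only for illustration.

```python
from itertools import cycle


def place_replicas(replica_count, zones):
    """Assign replicas to zones round-robin so each zone holds a roughly equal share."""
    if not zones:
        raise ValueError("at least one zone is required")
    zone_cycle = cycle(zones)
    return [next(zone_cycle) for _ in range(replica_count)]


def surviving_fraction(placement, failed_zone):
    """Return the fraction of replicas still running if one zone goes down."""
    if not placement:
        return 0.0
    survivors = sum(1 for zone in placement if zone != failed_zone)
    return survivors / len(placement)


# Hypothetical zone names, for illustration only.
single_zone = place_replicas(6, ["zone-a"])
multi_zone = place_replicas(6, ["zone-a", "zone-b", "zone-c"])

# All six replicas are lost when the only zone fails.
print(surviving_fraction(single_zone, "zone-a"))
# Four of six replicas keep running when one of three zones fails.
print(surviving_fraction(multi_zone, "zone-a"))
```

In this toy model, an application confined to one zone loses everything when that zone fails, while the three-zone layout keeps two-thirds of its replicas. Real deployments also have to account for data replication, load balancing, and capacity headroom, which the sketch leaves out.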

Data-center staffing problems worsening

As the number and size of data centers worldwide continue to grow, the number of job openings also grows and is outpacing recruiting efforts, according to Uptime. It estimates staff requirements will grow globally from about 2.0 million full-time employee equivalents in 2019 to nearly 2.3 million in 2025. Some of those data-center jobs are in new categories and require specialized skills.

From the report: “The staff shortage affects almost all data-center job roles globally. In mature data center markets, such as North America and Western Europe, much of the existing workforce is aging and many professionals expect to retire around the same time, leaving data centers with a shortfall on both headcount and experience. Hiring efforts are often offset by jobseekers’ poor visibility of the sector. Efforts to bolster talent pipelines by attracting career-changers to the data-center industry are still nascent.”

In the 2022 survey, 53% of data center operators reported difficulty finding qualified employees, up from 47% in 2021 and 38% in 2018. In addition, 42% reported issues with staff being hired away, in most cases to competitors. That’s a significant jump from just 17% in 2018.

Failure to track environmental data

Most respondents say they report on overall data-center power use and power usage effectiveness (PUE), but many still are not tracking critical environmental metrics, according to Uptime. Most data-center operators expect to soon be required to report on carbon emissions, for example, yet many are unprepared to comply.
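
For readers unfamiliar with PUE, it is commonly defined as total facility energy divided by the energy delivered to IT equipment, so a value closer to 1.0 means less overhead from cooling, power distribution, and other non-IT loads. The sketch below shows the calculation; the function name and the meter readings are hypothetical and do not come from the survey.

```python
def power_usage_effectiveness(total_facility_kwh, it_equipment_kwh):
    """Compute PUE as total facility energy divided by IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be less than IT energy")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical annual meter readings, for illustration only.
total_kwh = 1_500_000  # everything the site draws, including cooling and lighting
it_kwh = 1_000_000     # energy delivered to servers, storage, and network gear

print(power_usage_effectiveness(total_kwh, it_kwh))  # 1.5
```

A PUE of 1.5 in this example means the site uses half a kilowatt-hour of overhead for every kilowatt-hour consumed by IT equipment. PUE says nothing about carbon emissions or water use, which is why those metrics have to be tracked separately.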

Among survey respondents, 63% said they believe authorities in their region will require them to publicly report environmental data in the next five years. However, just 37% collect and report carbon emissions data (up from 33% in 2021), and only 39% currently report their water use (down from 51% in 2021). New laws, standards, and requirements will force operators to address these gaps and establish more stringent sustainability tracking and reporting practices in the coming years, Uptime reports.

This year’s Global Data Center Survey includes responses from 800 data-center owners and operators and input from 700 data-center suppliers, designers, and advisors worldwide.

Copyright © 2022 IDG Communications, Inc.