Multiple Oracle Cloud Infrastructure (OCI) outages have hit users around the world this week and, coming after interruptions in Microsoft’s cloud services last month, are a reminder of the importance of site reliability engineering for systems administrators whose businesses rely on cloud-based, mission-critical applications.

The biggest OCI outage this week began at 17:30 GMT Monday and stretched until 22:30 GMT Wednesday, affecting customers across North and South America, Australia, Asia Pacific, the Middle East, Europe and Africa.

“Oracle engineers identified a performance issue within the back-end infrastructure supporting the OCI Public DNS API, which prevented some incoming service requests from being processed as expected during the impact window,” the company said on its cloud infrastructure website.

In an update, the company said it implemented “an adaptive mitigation approach using real-time backend optimizations and fine-tuning of DNS Load Management to handle current requests.”

Oracle outages affect multiple cloud services

Oracle said that the outage caused a variety of problems for customers. OCI customers using OCI Vault, API Gateway, Oracle Digital Assistant, and OCI Search with OpenSearch, for example, may have received 5xx-type errors or failures (which are associated with server problems), Oracle said. Identity customers may have experienced issues when creating and modifying new domains.

In addition, Oracle Management Cloud customers may have been unable to create new instances or delete existing instances, Oracle said. Oracle Analytics Cloud, Oracle Integration Cloud, Oracle Visual Builder Studio, and Oracle Content Management customers may have encountered failures when creating new instances.

In an apparently unrelated incident, Oracle’s NetSuite ERP suite suffered an outage at its data center in Boston on Tuesday, leading to downtime that stretched from 12:15 p.m. ET Tuesday until services were restored around 11:46 a.m.
ET Wednesday.

Oracle did not detail reasons for the Boston data center outage, but The Register reported in a tweet that “smoke was reported at a data center site used by Oracle NetSuite, coming from electrical equipment in a power room.” Firefighters turned off power to the site and evacuated it, The Register reported.

NetSuite users report unrecovered data

Customers reported on Reddit that they were unable to recover data that had been recorded in the half hour before the outage began, with one user posting a statement said to have been sent by NetSuite, confirming that the “restoration point was about 30 minutes prior to the outage.” The statement noted that in such cases, NetSuite typically provides users with a report or list of the transactions created during the period for which data could not be retrieved.

The user who posted the NetSuite statement said that “based on this, we’re assuming we’ll have to manually slog through the missing data and then selectively import it into our ‘new’ NetSuite instance (which is now hosted in Santa Clara, not Boston).”

In yet another separate incident, on Monday, Oracle’s US Ashburn 2 data center experienced an outage for about an hour.

Oracle claims that NetSuite had 99.96% availability over the past 12 months. The outages this week come just months after Oracle chairman and CTO Larry Ellison, on the company’s second-quarter earnings call in December, took an indirect dig at Amazon Web Services, which suffered a major outage that month. Ellison said that a major telecom company told him that Oracle is different from other clouds as it “never ever goes down,” CNBC reported.

Microsoft outages affect users globally

Over the last few months there have been other major cloud outages. Most recently, on February 7, Microsoft Outlook and Teams suffered a global outage.
That outage came two weeks after a Microsoft outage in January that affected not only Outlook and Teams, but also services including Exchange Online, SharePoint Online and OneDrive for Business. The outages affected users around the world.

Although the cloud giants have redundant data centers and servers in almost every region, data loss has been common in many outages.

Cloud system architecture is key

“Cloud based solutions, like their on-premise equivalents, need to be architected for true high availability and continuity,” said Sam Higgins, an analyst at market research firm Forrester. “Having a cloud foundation and a global footprint does not immediately give you 100% uptime for an application. Especially for applications with a long on-premise history and heritage.”

Higgins added that other factors that lead to outages include client choices, such as data residency configurations that may constrain how much data replication and backup a cloud provider can implement on its data center network.

“Add this to increasingly global network complexity, the risk of multiple factors, some human error, and you have a perfect storm in terms of an outage with real data loss potential. It’s this risk that has driven uptake of site reliability engineering,” Higgins said.