Weekly internet health check, US and worldwide

ThousandEyes, which tracks internet and cloud traffic, provides Network World with weekly updates on the performance of three categories of service provider: ISP, cloud provider, UCaaS.


CenturyLink suffered a major outage just after 6 a.m. EDT Aug. 30 that hit a broad range of providers and businesses including Twitter, Microsoft (Xbox Live), Discord, Reddit, Cloudflare, OpenDNS, and Hulu. Shortly after the outage began, providers started rerouting traffic from CenturyLink to alternate providers in an effort to alleviate the impact. However, given the size and distribution of CenturyLink’s network, many services were still unreachable, ThousandEyes said. At 8:13 a.m. EDT, CenturyLink announced it was investigating issues affecting some services within its Mississauga, Ontario, Canada data center. Having identified the cause as an incorrect flowspec announcement from the Mississauga data center, CenturyLink requested that its Tier 1 internet provider partners de-peer and ignore any traffic coming from its network. (BGP flow specification (flowspec) is a feature that allows you to rapidly deploy and propagate filter policies among a large number of BGP peer routers.) To resolve the issue, CenturyLink reset all the equipment and started with clean BGP routing tables, a process that took almost five hours to complete. Just before 3:00 p.m. EDT, CenturyLink announced that the issue had been resolved and all services had been restored.

Update Aug. 24

Globally, the total number of outages observed across all three categories during the week of Aug. 17-23 increased by 21% compared to the week prior, rising from 245 to 296. In the U.S., outages rose from 90 to 106, an increase of 18% from the week prior.
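The week-over-week percentages quoted in these updates can be reproduced from the raw counts. The sketch below is only an illustration of that arithmetic: the function name pct_change is hypothetical, and rounding to the nearest whole percent is an assumption about how the figures were reported.

```python
def pct_change(previous: int, current: int) -> int:
    """Week-over-week change as a whole-number percentage (rounding is assumed)."""
    if previous == 0:
        # A rise from zero outages has no meaningful percentage, which is
        # why such weeks are better described with the raw counts alone.
        raise ValueError("percentage change is undefined when the previous count is zero")
    return round((current - previous) / previous * 100)


print(pct_change(245, 296))  # global total, Aug. 17-23 -> 21
print(pct_change(90, 106))   # U.S. total, Aug. 17-23 -> 18
```

A negative result indicates a decline, so a drop from 294 to 245 comes out as -17.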

ISP outages worldwide rose from 166 to 214 and from 72 to 80 in the U.S.

Public cloud network outages dropped worldwide from 28 to 27, and stayed the same in the U.S. at four.

Collaboration app network outages rose from zero to two globally, but remained at zero in the U.S.

ThousandEyes flagged three notable outages during the week.

Just after 8 a.m. EDT on Aug. 18, Spotify suffered an outage that prevented users from streaming songs from the service. The outage lasted just over an hour, during which the service would play songs for a few seconds, then pause and return an error. The outage is believed to be associated with an expired TLS certificate. Click here for an explanation on the impact of certificate expiration.
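Certificate expirations of this kind can usually be caught ahead of time by checking how many days remain on the certificate a server presents. The following is a minimal sketch using only Python's standard library; the function names are hypothetical, and it is not a description of how Spotify or ThousandEyes monitor certificates.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days left before a certificate's notAfter timestamp (negative if expired)."""
    # notAfter strings look like "Jan 15 00:00:00 2021 GMT".
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).total_seconds() / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the notAfter field of the certificate a server presents."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Example use (requires network access; the hostname is a placeholder):
# remaining = days_until_expiry(fetch_not_after("example.com"), datetime.now(timezone.utc))
```

With the default context the TLS handshake itself fails once a certificate has expired, so a check like this is only useful when run regularly before the expiration date.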

About 11:30 p.m. EDT on Aug. 17, Equinix suffered a power outage to a colocation center in Docklands, London. About 2 a.m. the failure of an output static switch from a UPS system triggered a fire alarm, resulting in loss of power for multiple customers. At 3:50 a.m. services started to be restored and were fully restored by 4:50 p.m. EDT. Affected customers included BT, Sky, Virgin Media, Giganet, Epsilon, SiPalto, EX Networks, Fast2Host, ICUK.net, and Evoke Telecom.

About 10:50 p.m. PDT on Aug. 19 Cogent Networks suffered a 36-minute outage affecting U.S. users’ access to Microsoft networks and associated services, as well as CDN content for services such as TikTok and ESPN. The outage affected nodes across the U.S. and apparently resulted from a configuration adjustment. A second outage two hours later at 11:26 p.m. PDT lasted 24 minutes and likely was connected to the first outage’s configuration adjustment. It affected users in the U.S., Asia-Pacific and Europe, Mid-East and Africa. Click here for an interactive view of the outages.

Update Aug. 17

Global outages across all three categories fell between the weeks of Aug. 3-9 and Aug. 10-16 from 294 to 245 (-17%) and in the U.S. from 123 to 90 (-27%).

ISP outages dropped worldwide from 227 to 166 and from 109 to 72 in the U.S.

Public cloud outages worldwide fell from 30 to 28 and from five to four in the U.S.

Collaboration app network outages worldwide remained at 0 for the second week in a row.

Cogent Networks suffered a notable outage at about 10:30 p.m. EDT on Aug. 13 that lasted about 40 minutes and affected its Atlanta, Ga., network. It affected access to Microsoft networks and associated services, such as SharePoint, Office, Azure services and hosting, and appeared to be located in the Cogent data center in Atlanta. Based on the affected interfaces and nodes, it appears to have been the result of configuration adjustments rather than a control-plane issue.

Separately, BT incurred an outage on its European backbone about 7:30 p.m. EDT affecting customers and partners in the U.K., U.S., Sweden, and Germany. The outage came in three four-minute intervals spanning 25 minutes, indicating an automated restoration process, and likely was for maintenance. The outage cleared at 7:55 p.m. EDT.

Update Aug. 10

Globally, there were no collaboration app network provider outages observed this week. In the U.S., this is the second consecutive week of zero outages.

Overall, the number of outages in all three categories increased from 248 to 294, the highest tally since late April. In the U.S., the total was up from 99 to 123.

ISP outages globally increased from 181 to 227. In the U.S., the increase was from 88 to 109.

Cloud-provider outages worldwide rose from 18 to 30, and in the U.S. increased from three to five.

Collaboration app network outages dropped from 1 to 0. U.S. outages remained at 0.

About 8:25 p.m. PDT on Aug. 4, Cogent Networks experienced a 15-minute network disruption affecting parts of its San Francisco network and its infrastructure in the U.K., Germany and the Netherlands. It affected nearly 70 network interfaces. The scope and timing of the disruption indicate the provider was making service adjustments or performing maintenance. An interactive visualization of the outage is here.

About 3:25 a.m. CDT on Aug. 5, GTT had a 10-minute network disruption affecting parts of its infrastructure in Dallas, Chicago, Los Angeles, and London. The timing and scope of the disruption are consistent with service-adjustment activity. An interactive visualization of the outage is here.

Update Aug. 3

During the week of July 27-August 2, the number of outages globally in all three categories decreased by 6% from the week prior, from 263 to 248. In the U.S., outages rose from 90 to 99, a 10% increase from the week prior.

The number of ISP outages globally decreased by 1%, dropping from 183 to 181. In the U.S., ISP outages rose from 73 to 88, a 21% increase compared to the week prior.

Worldwide cloud provider outages decreased by 38% when compared to the week prior. In the U.S., there were three public cloud network outages for the third consecutive week.

Globally, collaboration app network provider outages decreased from 3 to 1, a drop of 66% when compared to the week prior. In the U.S., no collaboration app network outages were recorded this week.

There were two noteworthy outages during the period:

Verizon Business suffered an outage within its network that impacted users accessing services such as Zoom, Bloomberg Professional and Flagstar Bank. The outage centered on former UUNET nodes located in San Jose, Calif., and Seattle. The outage occurred just before 11 a.m. PDT on July 27 and lasted a total of 27 minutes over a 55-minute period. The outage cleared around 11:55 a.m. PDT.

Reddit users began to experience errors when accessing Reddit's site around 10:30 a.m. EDT on July 29. During the incident, the Reddit site was reachable, but many of the page components produced errors, either failing to load or simply not responding to requests, all of which is indicative of an application issue as opposed to a network disruption. A fix was implemented by Reddit at 1:32 p.m. EDT, and Reddit announced that the issue had been resolved at 3:24 p.m. EDT.

Update July 27

During the week July 20-26, the number of outages globally in all three categories increased by 14% from the week prior, from 231 to 263. In the U.S., outages rose from 70 to 90, a 29% increase from the week prior.

The number of ISP outages globally increased by 5%, from 175 to 183. In the U.S., ISP outages rose from 60 to 73, a 22% increase and a return to late June levels.

Cloud-provider outages worldwide almost doubled, increasing 93%, from 15 to 29. In the U.S., there were three public cloud network outages for the second consecutive week.

Globally, collaboration-app network provider outages increased from 1 to 3, a rise of 200%, with all outages attributed to a single provider in the U.S. These were the first collaboration outages seen domestically since mid-June.

The most noteworthy outage of the week occurred just after 3:15 a.m. EDT on July 23 when services on Garmin.com and Garmin Connect became interrupted. The outage, which at the time of this writing is ongoing, also affects Garmin call centers, which are unable to receive calls and emails or participate in online chats. The network connectivity to Garmin services remains active, but syncing data and accessing functions on Garmin Connect remain down. Since Thursday, users attempting to access these functions have been met with a “Server Maintenance” message. In a press release on the 27th, Garmin confirmed it suffered a cyber attack that encrypted some of its systems, resulting in many of its online services being interrupted.

Update July 20

During the week of July 13-19, global outages of all three kinds dropped 19% from the week before, from 285 to 231. The drop in U.S. outages was even greater, 28%, from 97 to 70.

ISP outages dropped globally from 215 to 175, or 19%. In the U.S., they dropped 34%, from 91 to 60.

Cloud provider outages dropped 58%, from 36 to 15, and most of those occurred in South America. U.S. outages rose from two to three, or 50%.

Globally, collaboration-app network outages decreased from four to one, a drop of 75%, with the outage attributed to a single provider in the U.K. There were no outages in the U.S. for the fifth week in a row.

GitHub suffered an outage just after 2:30 a.m. EDT July 13 that lasted until 4:31 a.m. EDT. Users were affected worldwide. GitHub hasn’t provided details about what caused the outage, but ThousandEyes said there are indications that the source was within GitHub services.

WhatsApp suffered an outage for about an hour on July 14 starting about 6:45 p.m. EDT that prevented users globally from sending and receiving messages on the service. During the outage, users could connect to the service, but once it loaded they were unable to execute any functions. WhatsApp confirmed to ThousandEyes that the cause was an internal update to servers.

Update July 6

For the week June 29-July 5, the number of global outages across all three categories increased from 199 to 208, a 5% increase. In the U.S., however, outages dropped from 83 to 63, a 24% decrease from the week prior.

Globally, the number of ISP outages decreased 5%, from 160 to 152. The number of U.S. ISP outages decreased as well, from 77 to 55 outages. Both drops represent the lowest numbers of ISP outages since February.

Worldwide, cloud-provider outages decreased by 11%, from 28 to 25. The lone cloud-provider outage recorded in the U.S. this week was a decrease of 80% from five outages the week before.

Globally, collaboration-app network provider outages increased from 0 to 2, the first outages recorded since early June. The U.S. had zero collaboration app outages this week, recording just two outages in all of June.

There were two noteworthy outages during the period:

On June 29 at 8:15 a.m. PDT, a power failure affected the Google Compute Engine in service zones us-east1-c and us-east1-d. Customers experiencing the service interruption would not have been able to reach existing virtual machines or create new ones. Other zones in the region were not impacted, so a redundant architecture, where workloads are hosted in multiple zones within a region, would have mitigated user impact. Google announced that all services had been restored and the issues resolved at 1:06 p.m. PDT.
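The value of spreading workloads across zones can be shown with a toy model. The sketch below is an assumption-laden illustration, not Google's tooling: the function name is hypothetical, and us-east1-b stands in for any zone in the region that was not hit by the power failure.

```python
def service_available(replica_zones: set[str], failed_zones: set[str]) -> bool:
    """A service stays reachable if at least one replica runs in a healthy zone."""
    return bool(replica_zones - failed_zones)


# Zones named in the incident report.
failed = {"us-east1-c", "us-east1-d"}

# Single-zone deployment: every replica sits in a failed zone.
print(service_available({"us-east1-c"}, failed))                # False

# Multi-zone deployment: one replica is in an unaffected zone (illustrative).
print(service_available({"us-east1-b", "us-east1-c"}, failed))  # True
```

The model ignores capacity and failover time, which a real multi-zone design would also need to account for.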

On July 4 about 5 p.m. PDT, Comcast suffered a 33-minute outage affecting U.S. users and those in multiple other countries trying to access services using the Comcast network. The outage was caused by two events over a 40-minute period and affected Comcast nodes on the U.S. east and west coasts and in the central region. The outage was cleared at 5:45 p.m. PDT.

Update June 29

The total number of global outages for the week of June 22-28 decreased by 29% from the week prior, reaching the lowest number of outages observed since early April. In the U.S., the number of outages was down by 20%.

ISP outages were also down to the lowest levels recorded in the past eight weeks. Globally, the number was down by 26% this week, dropping from 216 to 160. In the U.S., ISP outages were down by 20% compared to last week, from 96 to 77.

Globally, cloud-provider outages decreased by 39% this week from 46 to 28, with the bulk being attributed to South America. In the U.S. cloud provider outages were down by 55% from 11 to five compared to the week prior.

Globally, last week saw zero collaboration app network provider outages for the second week in a row.

Comcast Cable Communications suffered a 24-minute outage affecting users across the U.S. accessing services including Zoom, Visa and Bank of America. The outage was focused on Comcast infrastructure located in Seattle, Wash., and was cleared just after 2:30 a.m. PDT.

Update June 22

Cloud provider outages spiked to new record-level highs for the week of June 15-21. Globally, the number of cloud provider outages increased from 20 to 46, a 130% increase. In the U.S., the number of outages increased 175%, from 4 to 11.

Last week also saw record-level lows. For the first time since the week of February 24, there were zero collaboration app network provider outages both globally and in the U.S.

Globally, the number of ISP outages decreased marginally last week, dropping from 221 to 216. In the U.S., however, the number of ISP outages increased by 22% compared to the week prior.

From a total outage perspective, the number decreased marginally globally, from 287 to 282. The U.S., however, saw a 14% increase in outages relative to the week prior, from 99 to 113.

An outage of note occurred June 18 at 2:45 p.m. PDT and lasted 23 minutes, affecting multiple countries including Australia, France, Germany and the U.K. The outage affected access to Microsoft services including some identity systems and appeared to originate in Microsoft nodes in Des Moines, Iowa. The outage came in two episodes over two hours, concluding just after 5 p.m. PDT. Click here for an interactive view of the outage.

Update June 15

During the week June 8-14, worldwide the number of total outages in all three categories rose 35%, and jumped 34% in the U.S.

ISP outages globally increased 32% from 168 to 221 and rose 14% in the U.S. from 68 to 79.

Cloud-provider outages worldwide decreased from 23 to 20 (-13%) but doubled from two to four in the U.S.
