
Cisco Webex outage: Collaboration service finding a lumpy recovery

News Analysis
Oct 04, 2018 · 3 mins
Cisco Systems, Cloud Computing, Networking

Cisco Webex says some latency, connectivity problems persist a week after the outage

Some Cisco Webex users are still having problems with the collaboration system more than a week after the service went dark.

According to the company’s website, a major outage began on September 25 and shut down all Webex services, including Calling, Meetings, Control Hub, Hybrid Services and Teams.

At the time, the company stated that “Webex Teams services are currently impacted by an ongoing service outage. Engineering resources are online and working to restore services. We apologize for the impact and all hands are on deck to restore Teams, Meetings, Calling, Care and Context services.”

During the week that followed, most services did return to normal, albeit slowly and with some problems. The Webex team wrote on October 2:

“The Webex Teams service continues to experience high service latency. The engineers continue to complete the tasks on our service remediation list, and these efforts are taking additional time to complete than we would like. The latency is due to the overloading of one service as part of the recovery process from last week’s incident. The team is working on some additional service deployments, in conjunction with adding some additional network controls that will allow the connections to stay connected for a longer period of time, which should address some of the connection retries. Some Webex devices are now online and able to make calls, however, incoming calls and other pairing functions will not be available. The Webex Teams and Calendar services continue to experience latency and errors when connecting to the service.”

A Webex post today, October 4, shows some problems such as poor performance are ongoing:

“Engineering continues to work on the engineering tasks to address the remaining open items. Some Webex Teams users are still experiencing inconsistent results with some user spaces. While much of the space data has been restored and is fully operational, there remain some user spaces that either do not appear, or appear but messages and calls are not working.

These user experiences are caused by a condition where the roster or space data is incorrect. Engineering continues to work on code solutions. The engineering team is looking to provide a workaround for some of the affected 1:1 spaces first, followed by a broader code solution. The workaround is undergoing testing, and the broader code solution is being developed. It’s taking longer than we’d like, but we’re diligently working to address the issues.”

Other Webex postings detail further problems the company has had in fully restoring services. On October 3, Webex reported:

“Metrics and monitors confirm that the network connectivity across the Teams infrastructure is operating as expected. The latency issue experienced with the Webex Teams client has also been addressed for our consumer customers.

Engineering continues to work on the remaining items to restore the Cisco Webex Teams services. The ongoing data restoration tasks did complete on time, and all data restoration activities have completed. The remaining issues affect user access to spaces and One Button to Push capabilities in some Space Meetings. Some users continue to experience issues accessing and sending messages to specific spaces. The engineering teams are continuing to work on a code solution to address those impaired spaces, and an ETA will be provided as soon as one is available.”

So far there has been little detail about what actually caused the outage, but a root cause analysis is under way and a report of the results could be forthcoming, observers say.