AT&T today said a procedural error in installing a software upgrade, coupled with a bug in a switch circuit, caused the recent two-day crash of its frame relay network.
AT&T said an attempt to upgrade the software running one of the network's 145 Cisco Systems, Inc. switches sparked a cascading flood of administrative messages that effectively took down the entire network last Monday afternoon. The failure left more than 6,000 customers, including banks, large department stores and Internet service providers, without transaction- and order-processing capabilities, some until Wednesday.
"The problem began when a computer command was issued to upgrade software code in one of the network switch's circuit cards," according to an AT&T statement. "This created a faulty communications path, which generated a large volume of administrative messages to other network switches."
The other switches were unable to route traffic because of the overload, according to AT&T.
AT&T said it has changed its software upgrade procedures and will put in place safeguards to prevent such outages from occurring again.
But Howard Anderson, president of the Yankee Group research firm, said this type of outage could happen again. "We've built essentially self-healing networks," he said. "But no network is completely self-healing." Anderson said there is a 50% chance an outage of this magnitude could recur.
"This disruption certainly did not meet our customers' expectations for service reliability, or our own, and for that AT&T apologizes," AT&T CEO C. Michael Armstrong said in the statement.
Although Cisco declined to comment on specifics of the announcement, CEO John Chambers said in the statement that "any service that disrupts customers' businesses is unacceptable."
The fee waiver for the frame relay service that Armstrong promised is still in effect but is due to end once the investigation is complete.
RELATED LINKS
AT&T network goes down for the count
Includes coverage of the outage and links to AT&T information. Network World, 4/20/98
More details on the outage from AT&T.
