To: C.K. Houston who wrote (315) 4/15/1998 9:02:00 AM From: R. Bond
The problem detailed in the following article from today's WSJ is not Y2K related. However, it does give us a glimpse into a 'what if' scenario while clearly demonstrating the rapid spread of a systemic problem due to embedded systems. The repairs were accomplished promptly and things moved on. Consider such events occurring in succession, say, throughout S.E. Asia. Could repairs take a bit longer? The recent loss of electricity to the financial district of Auckland took weeks to restore and prompted the evacuation of the area.
"Business won't tolerate networks that don't work." Read on:
The Wall Street Journal Interactive Edition, April 15, 1998
AT&T Moves to Stem Damage From Outage in Data Network
By STEPHANIE N. MEHTA, Staff Reporter of THE WALL STREET JOURNAL
NEW YORK -- AT&T Corp., facing its first major crisis since the arrival of its new chairman, C. Michael Armstrong, moved to stem damage from a major outage in a data network used by big customers.
The breakdown in AT&T's vaunted "frame-relay" network, used exclusively for high-speed transmission of data between computers, affected thousands of corporate customers nationwide, including giant retailer Wal-Mart Stores Inc. and Wall Street firm Salomon Smith Barney, a unit of Travelers Group. Traditional voice and wireless networks weren't affected, AT&T said.
The outage underscores the fallibility of data networks at a time when corporations and consumers have come to rely heavily on them. Despite spending billions of dollars to modernize and bulletproof its networks -- AT&T often boasts its networks are "self-healing" -- the breakdown shows that such networks remain vulnerable.
Disruptions Began Monday
AT&T said it hadn't determined the exact source of the outage, but some outsiders speculated it stemmed from faulty software in the highly complex switches that direct traffic on the network.
The company said the problem began in two of these computerized switches, then spread to more than 100 switches nationwide. Business customers began experiencing service disruptions Monday at about 3 p.m. EDT. By Tuesday evening, AT&T said, service was 100% restored.
But the damage to the company's image may take longer to repair. Mr. Armstrong, in a conference call with reporters, issued an apology to customers and promised to not charge for frame-relay services until AT&T "has defined the problem and defined the solution."
That's little consolation to corporations thrown into chaos by the outage. A spokesman for Wal-Mart said about half of the company's 2,400 U.S. stores experienced difficulties processing credit-card purchases or electronically updating their inventories. "It was a significant problem for us," the spokesman said. He added that service was fully restored to the chain by about 2:30 p.m. EDT Tuesday.
A spokeswoman for Salomon Smith Barney said the company used "technological and manual backups" to ensure that its customers weren't affected by the service disruptions. She declined to comment on AT&T's handling of the outage. Other large corporate customers also were able to move to backup systems, masking the outage to clients in the outside world.
For some, the system failure was simply a nuisance. Southwest Airlines Co., for example, said the outage didn't affect "day-to-day operations," but made it temporarily difficult to perform simple tasks, such as tracking cargo packages that the airline ships.
Countless smaller companies, however, had little or no backup, forcing them to wait as long as 24 hours while AT&T restored service. At least one big customer that didn't wish to be identified said the carrier failed to keep customers informed about when service would be fully restored. AT&T said it updated some of its biggest accounts Tuesday in 15-minute intervals.
While consultants believe such outages could happen to any company, AT&T's network failure certainly is a black mark for the company. Many customers still remember a series of AT&T voice and data network outages in the early 1990s that crippled air-traffic control towers and big banks.
'A Humbling Experience'
"It's a humbling experience" for AT&T, said Ken McGee, an analyst with Gartner Group Inc. in Stamford, Conn. Still, Mr. McGee and others gave Mr. Armstrong high marks for his handling of the snafu. In his conference call, Mr. Armstrong was quick to accept responsibility and said he talked to a few customers directly. "We will apply all the resources of AT&T to ensure that we have identified and isolated the root cause of this outage," he said.
In the case of this outage, a self-healing network is practically worthless. That's because such networks simply reroute data traffic around a cut fiber or broken switch onto working parts of the network. If the entire system is down, there's nowhere to route that traffic.
AT&T declined to disclose the locations of the two switches at the heart of the breakdown. The company said the switches were made by Cisco Systems Inc.'s StrataCom unit. Cisco, which makes the gear that runs most of the Internet, issued a statement from its San Jose, Calif., headquarters that said it worked closely with AT&T to restore service and to prevent recurring problems with the network.
Consultants said the breakdown provides companies with a valuable lesson in the need to use more than one networking vendor in case their primary network provider fails. "Customers need to make sure they have alternative solutions," Mr. McGee of Gartner Group said. "Business won't tolerate networks that don't work."
AT&T shares rose 56.25 cents to close at $64.875 in composite trading on the New York Stock Exchange Tuesday.
Copyright © 1998 Dow Jones & Company, Inc. All Rights Reserved.