Westergaard 2000 Site - y2ktimebomb.com
Dancing On the Rim of the Canyon
Here's a riddle: Can the power grid be both robust and fragile, both resilient and brittle? The answer is yes. Today's column tries to explain how this can be true, and how it relates to Y2K.
It sounds like double talk to say that the grid is robust, and also that, "a tree branch caused the 1996 blackouts." If you think we speak in double talk regarding past events, you'll be less likely to believe what we say about the future. Therefore, it's important to give a proper explanation.
1994 - Grand Canyon, Arizona: A few years ago I visited Grand Canyon with my Dad. At sunset we were standing on one of the lookout points with lots of other tourists enjoying the scenery. Two men appeared. Both were stinking, falling-down drunk. One of them jumped over the railing, fell, and rolled toward the rim of the canyon. Everyone gasped. At the last moment, he stopped right on the rim. What happened next was astounding. He stood and started dancing on the rim of the canyon. I didn't have the stomach to watch a man die so I left.
That night I dreamt that the man slipped on a pebble and fell, and that the Flagstaff newspaper reported the headline as "Loose Pebble Kills Local Man."
1996, California: In 1996, two regional blackouts troubled 12 western states and provinces. It was widely reported that it was only a tree branch growing too close to a wire that caused one of the blackouts. Today, that fact is used as evidence that the power system is actually very brittle, and that any trivial Y2K failure could cause a big blackout.
It was a period of power shortages in southern California, and power surpluses in the Northwest. The price differential made it very profitable, and very humane, to transmit every watt possible north to south. The utilities operated as close to the rim as they prudently could. Unfortunately, they estimated incorrectly where the rim really was so they weren't as prudent as they thought. They fell off the rim.
The rim of this electrical canyon isn't something you can see. It's just a metaphor for an abstract limit. Its shape is irregular, with lots of ins and outs. Its true position and shape can only be predicted by mathematical modeling.
Here's the real story about the Western States Blackout. Over a period of years the utilities got sloppy with the data used in their mathematical models. The model's accuracy deteriorated, and the rim wasn't where they thought it was. Given that state of affairs, all that was needed to trigger a collapse was a branch, or a squirrel, a raindrop, or a sneeze.
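To put rough numbers on the rim idea, here is a minimal Python sketch. Every figure in it is invented for illustration and none comes from the 1996 event. The names `operating_flow` and `survives_disturbance` are hypothetical, and treating the rim as a single megawatt limit is a big simplification of what the real models do. It shows the same trivial disturbance being harmless when the model is accurate and fatal when the model overstates the limit.

```python
def operating_flow(modeled_limit_mw: float, margin_fraction: float) -> float:
    """Flow the operators choose: the modeled limit minus a safety margin."""
    return modeled_limit_mw * (1.0 - margin_fraction)


def survives_disturbance(flow_mw: float, disturbance_mw: float,
                         true_limit_mw: float) -> bool:
    """True if the flow stays within the real limit after a small extra load."""
    return flow_mw + disturbance_mw <= true_limit_mw


# All numbers below are made up for illustration.
MODELED_LIMIT_MW = 1000.0  # where the model says the rim is
MARGIN = 0.02              # operators stay 2% back from the modeled rim
BRANCH_MW = 5.0            # a trivial disturbance: the "tree branch"

flow = operating_flow(MODELED_LIMIT_MW, MARGIN)  # about 980 MW

# Accurate model: the real rim is where the model says it is.
accurate = survives_disturbance(flow, BRANCH_MW, true_limit_mw=1000.0)

# Stale data: the real rim is 1.8% closer than the model claims.
stale = survives_disturbance(flow, BRANCH_MW, true_limit_mw=982.0)

print("accurate model survives:", accurate)
print("stale model survives:", stale)
```

In the second case the system was still inside the real limit before the branch touched the wire. The branch was the trigger, but the bad estimate of the rim was the cause.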
Now you understand how some of the headlines which said, "Tree Branch Causes 12 State Blackout", can be absolutely true, yet terribly misleading.
Back to Y2K: There are two ways to cause catastrophic failures in highly redundant systems. Rim dancing is one, and common mode failure is the other. This statement applies to many redundant, highly reliable systems, not just the power grid.
What is a common mode failure? Many airliners have four engines because the chance of independent failures in all four at the same time is vanishingly small. However, if there's a common mode such as running out of fuel, then all engines could stop at once for the same reason. Redundancy is no defense against common mode failures.
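The arithmetic behind the airliner example can be sketched in a few lines of Python. The probabilities are assumptions picked only to show the effect, not real engine statistics, and the function names are hypothetical. The sketch compares the chance of losing all four engines when failures are independent with the chance when one shared cause can stop them all.

```python
def all_fail_independent(p_single: float, n: int) -> float:
    """Chance that n redundant units all fail, if failures are independent."""
    return p_single ** n


def all_fail_with_common_mode(p_single: float, n: int, p_common: float) -> float:
    """Chance that all n units fail when a shared cause can stop every one.

    Either the common cause strikes (probability p_common), or it does not
    and the units must all fail independently.
    """
    return p_common + (1.0 - p_common) * all_fail_independent(p_single, n)


# Illustrative numbers only, not real engine statistics.
P_ENGINE = 1e-3   # assumed chance that one engine fails on a flight
P_NO_FUEL = 1e-5  # assumed chance of a shared cause such as no fuel

independent = all_fail_independent(P_ENGINE, 4)
combined = all_fail_with_common_mode(P_ENGINE, 4, P_NO_FUEL)

print(f"independent failures only: {independent:.1e}")
print(f"with a common mode:        {combined:.1e}")
```

With these assumed numbers, the common mode dominates the result by a factor of millions. Adding a fifth or sixth engine would shrink the first figure further and leave the second almost unchanged.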
There have been four really important blackouts in the 110-year history of the electric utility industry. Dancing too close to the rim caused two of them, the 1996 Western Blackout just discussed, and The Great Northeast Blackout of November 1965. In 1965, we knew much less than today about the canyon or where the rim might be, or how steep the slope.
The other two, caused by common mode failures, were the 1977 New York City Blackout and the 1998 blackout in Auckland, New Zealand. Weather was the primary common mode in both.
In 1977, thunderstorms knocked out one heavily loaded transmission line importing power from outside the city. Then came another lightning strike, then another. After that, things started cascading on their own. The common mode was weather.
The Auckland blackout was similar in some ways. Four big underground cables supplied the city's electric system. One cable failed, then another, then things started cascading. The common mode was hot dry weather plus the age of the underground cables.
In both the NYC and Auckland cases, the utilities could have been more cautious and may have been closer to the rim than they should have been. However, I think it is fairer to say that their real error was to incorrectly decide that the chances of certain events were negligibly small.
It used to be hard to think of real life examples of common mode failures because they were so rare. Well, we don't have that problem any more because there presently exists one that is common to every power plant. You guessed it ... Y2K. Y2K is the mother of all common mode failures. All those devices from all those years by all those vendors and what do they have in common? Time. Common mode failures used to be an advanced concept that many engineers didn't understand. After 2000-01-01 it will be ordinary talk in the local coffee shop.
Without the common mode, what would the chances be of 100 million computer failures occurring independently on the same day? Unthinkable; right? Right, but only unthinkable if we assume that no common modes exist.
Designers detest the common mode failure problem. Not only can it spoil the seemingly great reliability of the redundant systems they design, but they can also never be certain that a common mode problem was not overlooked. It's one of the classical impossible problems ... proving a negative. All the king's horses and all the king's men can never be sure that no common failure modes exist.
Explanations are nice, but what can we learn from this about how to act in the near future? Well, in addition to fixing the bugs, we make sure that on 2000-01-01 we're not dancing anywhere near the rim.
Operating conservatively, well away from the rim, will have costs we'll have to pay. If there's a shortage in Southern California, for example, we might decline to ship them power needed to help. We can bring mothballed power plants out of retirement. That would be expensive, but it will increase generating margins. We can cancel vacations, and call in all the extra help we can. We can make sure that every power plant and every transmission line and every other critical component is operating at considerably less than 100% of its maximum capacity. We can pre-start all the backup auxiliary generators.
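Here is a rough Python sketch of why running well below 100% matters. It assumes, unrealistically, that load spreads across the surviving lines in proportion to their capacity; a real grid needs a proper load flow model. The function name `worst_case_loading` and all the numbers are hypothetical. The question it answers: if the biggest line trips, how heavily loaded are the ones left?

```python
from typing import Sequence


def worst_case_loading(capacities_mw: Sequence[float], load_mw: float) -> float:
    """Loading fraction of the surviving lines after the largest one is lost.

    Assumes the load is shared in proportion to capacity, which is a
    simplification. A result above 1.0 means the survivors are overloaded.
    """
    remaining_mw = sum(capacities_mw) - max(capacities_mw)
    return load_mw / remaining_mw


# Made-up example: four identical 100 MW lines feeding one city.
LINES_MW = [100.0, 100.0, 100.0, 100.0]

near_the_rim = worst_case_loading(LINES_MW, load_mw=320.0)   # running at 80%
well_back = worst_case_loading(LINES_MW, load_mw=200.0)      # running at 50%

print(f"running at 80%, after one trip: {near_the_rim:.0%} of remaining capacity")
print(f"running at 50%, after one trip: {well_back:.0%} of remaining capacity")
```

In the first case, one failure leaves the other three lines overloaded, and that is how a cascade starts. In the second case, the same failure leaves room to spare.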
We can warn people to stay away from elevators and amusement park rides or other optional activities, where safety could be compromised by a power interruption. We can inform people in cold parts of the country where they can go to find shelter and heat if their home loses power. We can advise against large-scale millennium celebrations in urban centers, or in penthouse suites.
If we can get far enough back from the rim, and don't dance, then our chances for avoiding or reducing the negative consequences of Y2K can be improved a lot.
By the way, I don't mean to excuse the corporate responsibility failures that led to the four blackouts mentioned. However, I want you to understand that human screw-ups can never be totally eliminated. That's why I wrote in an earlier column that prudent power customers should be prepared for a regional scale blackout, of duration 24-72 hours, Y2K or not. I wouldn't be surprised if that advice is still valid in the year 3000.
Past & Future Columns
I get a lot of mail saying, "Yes, but did you consider ..." There are far too many things to consider for a single week's column. If you're new here, I recommend catching up on the previous columns in the series that started 1998-06-26. See the PP archive. Also, here's an advance peek at some subjects for future columns.
EMS/SCADA
nuclear
backing away from the rim
interconnection, islanding and security
power quality
customer contributions to Y2K problems
triage
testing
information to the public
Y2K problems in 1999
the Infocast Y2K Conference
product review - Plan Ahead
some positive benefits of Y2K
the ripple effect
why do we have these computers if we can get along without them
CORRECTION
A couple of readers tell me I was wrong about all nuclear plants having black start capabilities. Some plants, notably Pressurized Water Reactors, have such big pumps that the diesel generators that are there aren't big enough to start them without outside help. I don't have an actual count of how many can do black start or not. Nevertheless, other points in the column remain valid, especially the title. We CAN restart the grid after complete blackouts.