A CALL TO ACTION: THE NATIONAL AND GLOBAL IMPLICATIONS OF THE YEAR 2000 EMBEDDED SYSTEMS CRISIS
WHITE PAPER SUMMARY:
The Year 2000 technology crisis involves computer software and hardware on the one hand and date sensitive embedded systems on the other. Those addressing Year 2000 challenges have typically focused on the former and have too often failed to fully understand and acknowledge the challenges and threats posed by date sensitive embedded systems. Efforts to address the technology crisis have also tended to be based on a limited awareness and understanding of the interconnected nature of the crisis, and the potential for the cascading of failures and problems. Efforts to understand and address the threats have too often overlooked the implications of the crisis for domestic stability and national security, not to mention global stability and security.
While information technology problems have the potential for extraordinary impacts on all aspects of the nation's and the world's infrastructure, date sensitive embedded systems failures can pose even greater challenges, particularly if they occur at the same time as other infrastructure disruptions. Embedded systems failures can pose additional threats to health and safety, including additional threats to social stability and environmental sustainability.
Embedded systems failures can trigger technological disasters which can impede mobilization efforts to deal with infrastructure disruptions. Infrastructure disruptions could in and of themselves be expected to tax emergency response capabilities to the limit.
What are embedded systems?
The following definition is taken from the United Kingdom's Action 2000 website: www.open.gov.uk/bg2000/whattodo/embsys2.html:
"Embedded systems contain 'programmed instructions running via processor chips....They perform control, protection, and monitoring tasks....In broad terms embedded systems are programmable devices or systems which are generally used to control or monitor things like processes, machinery, environments, equipment, and communications."
It is estimated that there may be from 10 to 25 billion embedded systems in existence. It is known that some small percentage of these are date sensitive. Of these a small, but significant percentage are not Year 2000 compliant. Estimates range from 0.2% to over 1%. That would mean that from 20 million to 250 million embedded systems failures could occur owing to the Year 2000-related non-compliance problems. (Source: The Gartner Group).
These include small failures that can have major impacts. Malfunctions could occur in all manner of equipment, devices, appliances, and systems found in homes, hospitals, buildings, plants, and facilities. Malfunctions could occur as well in everything from rail and subway systems to water purification plants, wastewater disposal plants, oil and gas pipelines, oil refineries, oil tankers, off shore oil platforms, chemical plants, manufacturing plants, coal-fired plants, nuclear power plants, nuclear and other hazardous waste facilities and laboratories, biological and chemical warfare storage facilities, and weapons systems of all kinds.
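The range of 20 million to 250 million potential failures quoted above follows from pairing the low and high ends of the two estimates. A minimal sketch of that arithmetic, assuming (as the figures imply) that the non-compliance percentages are applied to the total number of embedded systems; the function and variable names are illustrative only and do not come from the Gartner Group:

```python
def failure_range(total_low, total_high, per_thousand_low, per_thousand_high):
    """Return (low, high) counts of potentially non-compliant systems.

    Rates are given per thousand systems so the arithmetic stays in
    whole numbers: 0.2% is 2 per thousand, 1% is 10 per thousand.
    """
    low = total_low * per_thousand_low // 1000
    high = total_high * per_thousand_high // 1000
    return low, high


# 10 to 25 billion embedded systems, 0.2% to 1% non-compliant
low, high = failure_range(10_000_000_000, 25_000_000_000, 2, 10)
print(f"{low:,} to {high:,} potential failures")
```

Because the estimates speak of "over 1%", the upper figure should be read as a rough bound rather than a ceiling.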
There is simply not sufficient time and manpower to identify, assess, repair, replace, or "work around" all of the date sensitive embedded systems prior to January 1, 2000. (Indeed, some malfunctions could be triggered well in advance of that date.) Efforts are destined to be far less than 100% successful in making necessary repairs or taking other preventive or mitigative actions. In many cases, shut downs will be the only viable alternative.
The failures that are bound to occur may be expected to have an impact on the health and safety of nearby populations, on social cohesion and civility, on food and water supplies, on the economy, on foreign relations, and on the sustainability of the environment. Such impacts could affect small areas, as well as large regions all over the world. Common sense dictates that greatly expanded efforts be made by the public and private sectors, nationally and globally, to identify, prioritize, and minimize the risks posed by those date sensitive embedded systems posing the greatest threats.
Current efforts to address Year 2000 computer software and hardware problems and embedded systems problems are grossly inadequate nationally and globally. In addition, efforts to address these problems tend to be based on a limited awareness and understanding of the nature and scope of the crisis. The problems are being poorly and unrealistically defined. Even the efforts to address the problems as presently understood are falling far short of the mark.
Indeed, efforts to address the problems have begun and are beginning much too late. The problems are widely understood as primarily involving computer technology, information systems, data processing systems, and communications technology. Resolving these problems involves making needed diagnoses and taking corrective action. Those who define the problem in this narrow way are greatly underestimating its nature and scope. An increasing chorus of others sees the problem as much broader, the potential impacts as much farther reaching, and the societal infrastructure as significantly affected. (For an array of viewpoints concerning the possible impacts of Y2K technology problems, see wdcy2k.org for the results of a survey conducted in March 1998 of the members of the group known as Washington DC Year 2000. The results of a second survey, conducted in May 1998 of the members of that same group, can be found at www.csis.org/html/y2kpress.html#3.)
When embedded systems fail, they can fail in a variety of unpredictable ways. Small, seemingly insignificant failures can trigger other system failures.
A few illustrations of systems at risk, and of what can go wrong when an embedded system fails, are as follows:
1) The absence of adequate attention to the embedded systems in nuclear power plants was cited by GAO in a document dated March 18, 1998 (GAO/AIMD-98-90R, March 6, 1998).
2) The computers which control nuclear weapon systems may malfunction. (See appendices.)
3) The vulnerability of a generator temperature control system at a power plant in the United Kingdom was reported in the Electric Power Research Institute (EPRI) Proceedings from the EPRI Embedded Systems Workshop, dated 10/4/1997. A compliance test was conducted in which "(t)he (control) valve (for generator cooling) closed (fail safe), tripping the unit on high generator temperatures." It was concluded that "(l)oss of numerous generating units simultaneously in the United Kingdom could be devastating to the country." (See appendices for a fuller account of this incident.)
4) The vulnerability of an Energy Management System (EMS), along with its redundant system, has been described in a case involving the Hawaiian Electric Company. This vulnerability first came to light in 1996. "The EMS is a redundant system, meaning that there are two computer systems: one that serves as the primary controller which scans remote points in the field, stores information in our database, and displays the information to load dispatchers in graphic display format and the other computer system acting as a hot standby system. These two systems pass information between them via a Ethernet cable and when the primary soft or hard "fails", the standby system assumes control.
In the year 2000 (Y2K) case, a failure would not fix the problem. Since both computer systems run the same software, if the primary had a problem with the date it would most likely have the same problem on the standby as well. This would result in both systems "failing" over to the other system (a kind of "thrashing"). While this would be a considerable headache for our team, it would most likely be very expensive for our company." (For a fuller overview of the scenarios that could occur given such a problem, see the appendices.)
5) "(A) city discovered that their wastewater treatment plant would have dumped raw sewage into their bay on January 1, 2000 if they had not replaced the 286 chips that controlled the valves." (5-3-97 NTIS Symposium Presentation by Lt. Col. David C. Hall (USAFR) Wright-Patterson AFB, Ohio, "Year 2000 Problem: Infrastructure Aspects").
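The failover "thrashing" described in item 4 above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a description of the Hawaiian Electric Company's actual software: the two-digit-year health check, the `Controller` class, and the `run_failover` function are all hypothetical, chosen only to show why a standby running the same code cannot rescue a primary that has a date defect.

```python
from datetime import date


def two_digit_year_ok(today, last_scan):
    """Hypothetical health check that compares two-digit years.

    With two-digit years, 00 (for 2000) looks earlier than 99 (for 1999),
    so the check wrongly reports a fault after the rollover.
    """
    return (today.year % 100) >= (last_scan.year % 100)


class Controller:
    """Hypothetical EMS controller; both units run identical software."""

    def __init__(self, name):
        self.name = name

    def healthy(self, today, last_scan):
        return two_digit_year_ok(today, last_scan)


def run_failover(today, last_scan, max_switches=6):
    """Return the sequence of controllers that held control."""
    active, standby = Controller("A"), Controller("B")
    history = [active.name]
    for _ in range(max_switches):
        if active.healthy(today, last_scan):
            break  # the active unit is fine, so no failover is needed
        # The active unit "fails" and control passes to the other unit.
        active, standby = standby, active
        history.append(active.name)
    return history


print(run_failover(date(1999, 12, 31), date(1999, 12, 30)))  # ['A']
print(run_failover(date(2000, 1, 1), date(1999, 12, 31)))
# ['A', 'B', 'A', 'B', 'A', 'B', 'A']
```

Before the rollover the primary keeps control. After it, each unit fails the same check in turn and control bounces back and forth until the simulation's switch limit is reached, which is the behavior the quoted account calls "thrashing".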
Why has so little attention been given to problems relating to date sensitive embedded systems?
The reasons can be simply stated:
In our highly specialized world, relatively few people even know about the existence of date sensitive embedded systems. Of those who do, fewer still understand the complex technology. Those who understand the technology best are software, firmware, and hardware engineers and programmers who specialize in embedded systems.
Certainly, political leaders, policymakers, and others in roles of public responsibility cannot be expected to readily understand the technical intricacies of software, firmware, and hardware engineering and programming as these relate to date sensitive embedded chips. In addition, they do not always have individuals on their staffs who have such technical expertise.
For all these reasons, very few public officials in any branch or at any level of government have readily grasped the significance that date sensitive embedded systems have in the context of the Year 2000 technology crisis.
Persons in key policymaking roles in emergency management may also lack the kind of technical background that would allow them to recognize the nature of the threats posed by the failure of date sensitive embedded systems. They may therefore fail to see the potential for technological disasters and may consequently fail to undertake necessary preparedness and mitigation measures. There in fact has been an apparent absence of sensitivity of the emergency management community to Year 2000 technology problems. An individual from the education sector who responded to the May WDCY2K survey noted that Y2K was not mentioned on the program of The International Emergency Management Society which met in Washington DC in June 1998. In the words of the respondent: "These are the people who help us recover from earthquakes, floods hurricanes, etc. I was told that a spokesperson from FEMA mentioned Y2K as a problem that had to be solved. She said FEMA was not compliant now but had plans to become compliant. She did not mention any possible disruptions or disasters that might result from firms or systems not being compliant. Hence, Y2K was described as an IT problem with no connection to the subject matter of the conference....Judging from the informal conversations during the conference, my impression is that emergency management professionals from around the world do not yet regard Y2K as a matter of professional interest to them." (www.csis.org/html/y2kpress.html#3)
Those who tend to grasp the significance of malfunctioning embedded systems are often people who are used to thinking "systemically", people who are used to thinking in terms of the interconnections within a system, within a set of systems, and amongst systems and sets of systems.
Owing to the failure to understand the problems associated with the Year 2000, there has been a failure on the part of the public and private sectors to assign or assume responsibility for addressing the problems posed by Year 2000.
Some of the other reasons why embedded systems have not been adequately addressed are as follows:
There are a relatively small number of persons who understand how embedded systems failures can be forestalled.
It can be extraordinarily difficult to access the embedded systems that need to be assessed in order to ascertain whether or not they are date sensitive and, if they are, whether or not they are Year 2000 compliant.
It can be extraordinarily difficult to assess the internal logic of the embedded system.
The scale of the problems is so great that there are not enough trained technicians who are capable of working on them.
The actual testing of an embedded system can damage the system and cause a malfunction.
Because of the multitude of models and versions of embedded systems, it is not possible to extrapolate from one system to another based on the testing of one. In a real sense, each system has to be treated as if it were unique, because it may well be.
Identical chips may act differently in different systems.
In cases where a replacement chip is required, it may not be possible to identify the vendor or the vendor may be out of business. It may not be feasible to manufacture a replacement chip.
Malfunction of an embedded system may trigger other failures and the source of those failures may not necessarily be detectable.
Even if efforts were to bring the nation close to 100% success in addressing computer software and hardware problems, the threats posed by date sensitive embedded systems could bring those efforts to naught. Some date sensitive embedded systems are simply bound to fail. Even one accidental nuclear weapons launch or in-place accident is one too many. Join that possibility with a nuclear power plant failure like Chernobyl, a chemical plant disaster similar in magnitude to the disasters in Bhopal or Seveso, a release of toxic emissions from a chemical or biological weapons facility, and perhaps multiple such events happening at once or in quick succession throughout the world in the middle of our winter months, and there would be national as well as global impacts on an unprecedented scale.
At the same time there could also be other problems whose duration would not necessarily be known at the time. These problems could involve a lack of electricity, a working phone system, radio, drinkable water, food, and fuel for heating and cars and all other forms of transportation.
Technological disasters combined with infrastructure disruptions such as these could make the difficulties of recovery formidable.
No one in the world will be immune from harm if the present level of understanding and the present level of effort are not exponentially increased as rapidly as humanly possible. This calls for leadership of a type that is rare, because one of the gravest concerns in this crisis is the possible dissolution of the social fabric, which must be kept intact if we are to work through the crisis.
Possible approaches are many. Common sense dictates that there needs to be an immediate prioritization of what needs to be done to minimize our risks to the extent humanly possible.
Assigning or selecting people to do what needs to be done is next.
Priority List
A possible list of highest priorities for mitigating and preventing threats and exposures would include a focus on the following:
Nuclear Weapons Systems, including dismantled nuclear weapons - Domestic and Foreign
Biological and Chemical Warfare Plants - Domestic and Foreign
Nuclear and other Hazardous Stockpiles or Storage Facilities - Domestic and Foreign
Nuclear Power Plants - Domestic and Foreign
Chemical Manufacturing Plants - Domestic and Foreign
[G8 leadership and UN cooperation will be required in all of these top five priority areas. Non-aggression pacts will be in order, as significant disarming may well be required.]
Oil and Gas Pipelines
Refineries
Power Generation Plants
Power Distribution Systems
Telecommunications and radio
Water Purification Plants
Water and Sewage Treatment Plants
Off Shore Oil Rigs
Tankers
Food Availability and Distribution
Shelter
Rail System
Priority Transportation
Fuel Supply and Distribution, including stockpiling of meals-ready-to-eat
Emergency and Disaster Preparedness and Management Systems, including Readiness of Essential Workers to Serve
Maintenance of Health Care
Civil Order - Military, National Guard, Police, and Fire and the preparedness and readiness of these and other essential workers to serve
Security of penal system and mental institutions
By focusing efforts on the Priority List as proposed here, there would be the best chance of safeguarding all of the following:
Public Health and Safety
Social Stability
Global Stability
Environmental Sustainability
The top five concerns are assigned highest priority because of the great and possibly overwhelming threats and challenges they pose, which could result in a breakdown of the social fabric. If we are successful in minimizing these threats, we will have made our recovery efforts far easier. Were we to fail to address these threats, and were any or many of them to materialize, the consequences could be beyond imagination, rendering crisis management, restoration, and recovery efforts extraordinarily difficult.
This is indeed a case where an ounce of prevention could be worth tons of cure. In our case, billions of dollars invested in prevention, mitigation, and preparedness, along with a willful slowing of our economy to accomplish that, could result in the husbanding of trillions of dollars' worth of resources in the long run and could make the period of recovery significantly smoother.
What kind of organization is needed to accomplish these Herculean efforts?
A state of emergency could be declared and an organization could be put in place in the Executive Office of the President to orchestrate and carry out the entire range of tasks needed to address the crisis.
The mission, functions, and scope of purpose of such an organization are detailed in the White Paper. Other measures that would need to be taken or promoted nationally and globally are also detailed.
*******
Slide Presentation to Accompany Paper
Bruce Webster and Paula D. Gordon, Slide Presentation for a White Paper, "A Call to Action: The National and Global Implications of the Year 2000 Embedded System Crisis", June 1998.
Websites of Interest and References
y2ktimebomb.com
year2000.com (Search on "Embedded Systems" to find articles)
weyrich.com
References:
Roleigh Martin's embedded systems site: http://ourworld.compuserve.com/homepages/roleigh_martin
Mark A. Frautschi, "Embedded Systems and the Year 2000 Problem (The OTHER Year 2000 Problem)" (Draft), JULY 1, 1998. tmn.com
Gene Bylinsky, "Industry Wakes Up to the Year 2000 Menace", Fortune, April 27, 1998
Dick Lefkon & Bill Payne, "The Practical Engineer ~ Making Embedded Systems Year 2000 Compliant", IEEE Spectrum, June 1998
The Gartner Group Source Documents on Year 2000 gartner12.gartnerweb.com
GAO Exposure Draft: "Business Continuity and Contingency Planning" gao.gov
GAO Report "The Computing Crisis, an Assessment Guide" http://www.gao.gov/special.pubs/y2kguide.pdf ...