Databases for actual control system cyber incidents exist – and they are important for many reasons

Nov. 18, 2019
Obtaining control system cyber incident case histories is possible (my database has more than 1,200 actual cases), but it needs to be done with trusted individuals working with industry experts. There is also a need for “whistle blower protection” for individuals and companies that report these incidents. It is important because these incidents often are generic and can affect, or have already affected, multiple organizations.

One of the major reasons that control system cyber security has not made more progress is the belief that control system cyber incidents are not a real problem because there are so few known cases. Recently, several members of ISA99 asked where they can find a website listing the latest ICS cyber security incidents that affected plant/facility operations. Eric Cosman, co-chairman of ISA99, responded by stating: “There have been several attempts to create such a repository over the years. Some have been made public and others have been kept very private. To different degrees they have run into several difficulties, including

- Reluctance on the part of asset owners to share what they consider sensitive and/or proprietary information

- Difficulties in vetting the information

- Lack of consistent views as to what constitutes an “incident”

- Inadequate transparency about what collected information will be used for

- Lack of a clear business case for collecting, vetting and providing such information

There are no doubt other mitigating factors, but the result is that such information is only sporadically available from a variety of sources. Those looking for it have to do a bit of digging and keep the idea of “caveat emptor” in mind.”

I agree with Eric. However, I thought it might be of interest as to what I have found.

The first issue is the definition of a cyber incident. The DHS CISA and National Institute of Standards and Technology (NIST) definition of a cyber incident is: “An incident is the act of violating an explicit or implied security policy according to NIST Special Publication 800-61. Of course, this definition relies on the existence of a security policy that, while generally understood, varies among organizations.

These include but are not limited to

- attempts (either failed or successful) to gain unauthorized access to a system or its data

- unwanted disruption or denial of service

- the unauthorized use of a system for the processing or storage of data

- changes to system hardware, firmware, or software characteristics without the owner's knowledge, instruction, or consent”

Note that the definition does not use the words “malicious” or “vulnerability”, as the focus is impact. As an engineer, I care about impact, which is the focus of my database. I also worry about both unintentional and malicious incidents, as sophisticated attackers can make cyber attacks look like malfunctions.
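To make the impact-first framing concrete, here is a minimal sketch of how a single incident record could be structured. It is not the layout of my database or of any other; the names `IncidentType`, `IncidentRecord`, `qualifies`, and `count_by_type` are assumptions made up for illustration, and the two sample entries are hypothetical placeholders rather than real cases. The point it shows is that a record qualifies on the basis of its type and impact, while intent is optional and may stay unknown.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class IncidentType(Enum):
    """The four example categories listed in the quoted definition."""
    UNAUTHORIZED_ACCESS_ATTEMPT = "unauthorized access attempt"
    DISRUPTION_OR_DENIAL = "unwanted disruption or denial of service"
    UNAUTHORIZED_USE = "unauthorized use for processing or storage"
    UNAUTHORIZED_CHANGE = "unauthorized hardware/firmware/software change"


@dataclass(frozen=True)
class IncidentRecord:
    sector: str                       # e.g. "electric", "water/wastewater"
    incident_type: IncidentType
    impact: str                       # what happened to the process or equipment
    # Intent is deliberately optional: None means "unknown", because a
    # malicious event can look like a malfunction (hypothetical field).
    malicious: Optional[bool] = None


def qualifies(record: IncidentRecord) -> bool:
    """A record qualifies if it documents an impact; intent is not consulted."""
    return bool(record.impact.strip())


def count_by_type(records: Iterable[IncidentRecord]) -> Counter:
    """Tally qualifying records by incident type."""
    return Counter(r.incident_type for r in records if qualifies(r))


# Hypothetical placeholder entries, not real cases.
examples = [
    IncidentRecord("electric", IncidentType.DISRUPTION_OR_DENIAL,
                   "loss of view and control"),
    IncidentRecord("chemicals", IncidentType.UNAUTHORIZED_CHANGE,
                   "controller logic altered", malicious=False),
]
```

Calling `count_by_type(examples)` tallies both placeholder entries, including the one whose intent is unknown, which is the behavior an impact-focused database needs.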

There have been a number of “databases” addressing control systems cyber security. Arguably, the first was the Repository of Industrial Security Incidents (RISI), but it hasn't been updated since early 2015. CSIS has a cyber security database that is a set of brief descriptions, not specific to control systems, and with minimal detail. MITRE has a database of attack types as opposed to control system cyber incidents. NERC has Lessons Learned but refuses to identify incidents as being cyber-related. Control system vendors have internal and customer case histories but can’t share because of confidentiality agreements.

To date, my control system cyber incident database has more than 1,200 cases with more than 1,500 deaths and more than $70 billion in direct damage. These incidents are international in scope and include electric (fossil, nuclear, hydro, renewables, transmission and distribution), water/wastewater, oil/gas, chemicals, manufacturing, transportation (land, water, and air), and defense. All cases are real incidents: some were identified by what did happen, others by what should have happened but didn’t. My database has not double-counted any of the RISI incidents. I have no security clearances, by choice, so I am not obligated to report these cases. However, I have not made my database public, as many of the cases were provided to me in confidence.

My database concept started in the 1989 time frame, when I began amassing case histories of instrumentation failures while I was at EPRI. This occurred when an engineer from the Millstone Unit 2 Nuclear Plant and I found the oil loss failure in Rosemount 1151 pressure and differential pressure transmitters. Because the Rosemount 1151 was arguably the most popular pressure and differential pressure transmitter in nuclear plant safety applications, I wanted to understand the magnitude of the potential impact. Even though I was looking for nuclear safety-related failures and the nuclear industry requires disclosure, there was almost no discussion of the specific failure mode I was trying to find. I eventually found over 200 of these incidents in nuclear safety applications, including at least one that impacted the Three Mile Island core melt. However, I had to "read between the lines" to find them.

Fast forward to 2000, when I helped start the control system cyber security program for the electric utilities. Because of my "grapevine" of contacts, I was getting calls from people asking about incidents that were occurring that they couldn't explain. That was the genesis of a database of control system cyber incidents. In the 2001 time frame, I provided Eric Byres my electric industry cases, which went into the RISI database as a cooperative effort to understand the potential magnitude of control system cyber incidents. My database was supplemented by presentations from end-users on actual control system cyber incidents at the ICS Cyber Security Conference I started in 2002. It was also supplemented by presentations at conferences where I would give real examples and attendees would say they had experienced similar events (these were the engineers, not IT or OT personnel). My book, Protecting Industrial Control Systems from Electronic Threats, published in 2010, has about 20 actual cases.

Eric Cosman is right in that there are very few referenceable cases. I will provide two examples of why traditional information sharing doesn’t work but sharing with trusted individuals does. The first was a large international power plant that lost all control system logic in every one of its more than 200 plant distributed control system (DCS) processors. That left the plant with no view or control. I found the disclosure on LinkedIn, as the Plant Technical Services Manager wanted to know if any other plant had experienced such a problem since he could not get an answer from his DCS vendor or system integrator. Within several days, the LinkedIn notification was gone. Because of trust, I was able to have him give a presentation at the ACS ICS Cyber Security Conference on what he found. Unfortunately, the DCS vendor did not attend or issue any response. This became more important when I found several other similar cases in coal and gas-fired plants, a nuclear plant, and a chemical plant.

The second case was a major international oil/gas company that had experienced a major internal data breach that affected their operational assets. In preparation for a control system cyber security seminar, I met with one of their control system cyber experts. He was aware of this issue from his previous job, but the oil/gas company was not aware, as it had been intentionally buried. It was fascinating to see the attendees' faces when they were told of this case for the first time by their own engineer.

The need to understand impacts became very clear in the 2008 time frame when, while supporting NIST to extend NIST SP800-53 to address control systems, Marshall Abrams from MITRE and I did detailed analyses of three real cases (they are on the NIST website and in my book). One of those cases was the 1999 Bellingham, WA Olympic Pipeline rupture. That case remains relevant because the Olympic Pipeline rupture and the 2010 PG&E San Bruno natural gas pipeline rupture were remarkably similar from a control system cyber security perspective. I did a similar study for the International Atomic Energy Agency on three nuclear plant control system cyber incidents.

Currently, there is very little training for engineers to recognize that an upset situation might be cyber-related. There is also almost no training for network security personnel (be it IT or OT) to look for events that aren't IP network-related.

Unfortunately, there are very few places to get referenceable cases with independent attribution. This has also made writing peer-reviewed articles difficult when there isn’t independent verification. Much of what I see published on "actual" control system cyber incidents is questionable at best.

Some of the benefits of having a database of control system incidents include:

- Evaluating the adequacy of control system cyber security mitigation and training. What I have found is that many of the most impactful control system cyber incidents would not have been identified or prevented by existing control system cyber security technologies. Moreover, some control system cyber security technologies and testing have themselves impacted control systems.

- Providing a basis for a playbook on control system cyber incident investigations (https://www.controlglobal.com/blogs/unfettered/control-system-cyber-incident-hunting-input-for-a-playbook-on-control-system-cyber-incident-investigations)

- Providing real data for more informed insurance and credit rating evaluations.

- Providing a basis for including control system and safety engineers as part of the cyber security team.

- Providing a basis for cyber security policy.

Obtaining control system cyber incident case histories is possible, but it needs to be done with trusted individuals working with industry experts. There is also a need for “whistle blower protection” for individuals and companies that report these incidents so they are not punished. It is important because these incidents often are generic and can affect, or have already affected, multiple organizations.

Joe Weiss