A retrospective on the first two decades of control system cyber security – culture issues still prevent successfully securing control systems

Jan. 6, 2020
It is 2020, yet there has been little movement on, or understanding of, engineering issues and field devices. The irony is that cyber security exists at the network layer where you have short-term glitches, but there is no cyber security where you go “boom in the night” and can have months to years of downtime and deaths. Smart Grid, IoT, IIoT, Industry 4.0, and digital transformation are all about “lots of sensors” and big data analytics. However, if you can’t trust your measurements, it is “garbage in, garbage out”. As the “bad guys”, including Russia and Iran, understand these vulnerabilities, we can no longer overlook control system engineering issues. For the good of society and the safety of our infrastructure, engineering and security need to work together.

Control system cyber security was, and should be, about protecting the control system process. That is, keeping lights on, water flowing, pipelines from rupturing, etc. We’re now at the end of the second decade of control system cyber security, and the focus has changed from protecting the process to protecting the networks. They are not the same. Because there are so few control system suppliers and they supply essentially all industries globally, control system cyber security applies to electric, water/wastewater, oil/gas, manufacturing, transportation, medical devices, facilities (buildings), etc. Given how much misinformation about control system cyber security is floating around the Internet, I thought it might be a good time for a retrospective that could provide a basis for assessing how valid the experts’ prognostications for 2020 might turn out to be. As a reference, please check out www.controlglobal.com/unfettered.

There continues to be a dearth of documented control system cyber incidents that have actually impacted control systems and processes, which is why so little attention is being paid to securing the control system devices. Control system cyber incidents are real. Unintentional control system cyber incidents date back to December 1972, and malicious cases to 2000 or earlier. I have been able to amass a database of more than 1,200 actual control system cyber incidents resulting in more than 1,500 deaths and more than $70 billion in direct damage.

I found my article for the November 2000 issue of Power Engineering, “Electronic Security Technology Roadmap.” Despite the passage of time, the control system cyber security community still seems to be missing some of the essentials, and this is because domain engineering experts have been overlooked in a general fixation on network security.

Let me say at the outset that the only sensible answer to the question, “Is it important to monitor the network?” is “Yes, absolutely.” Network security is indispensable to any sound approach to control system security and safety, particularly when it can provide context.  But we should also remember that what makes control systems unique are the field devices with their associated safety implications. Unfortunately, after 20 years, these devices continue to be taken for granted by the cyber security community.

Consider the problem of counterfeits and other sensor issues

A good example of this kind of oversight is the problem of counterfeit transmitters (see https://www.controlglobal.com/blogs/unfettered/the-ultimate-control-system-cyber-security-nightmare-using-process-transmitters-as-trojan-horses/). Transmitters sit behind every firewall, deliver the 100% trusted inputs to OT networks, and are supported by organizations that are not part of most cyber security teams. Counterfeit parts are a supply chain issue, and supply chains are notoriously difficult to secure. Furthermore, supply chains are attractive to nation-states interested in staging attacks on critical infrastructure as they can affect multiple organizations in multiple industries.

I had a chance to revisit a Control Engineering April 2012 article entitled “When can the Process Control Systems and the Safety Systems Share Field Devices?” It provides two real cases that demonstrate the lack of information sharing across industries. The first was an offshore production platform in the Middle East in the 1990s, where an undetected stuck-open relief valve caused an explosion and losses exceeding $1 million. An undetected stuck-open relief valve was also the issue in the Three Mile Island core melt. The second case was a process heater at a refinery in the Northeast US with a failed transmitter that was not detected. This directly resulted in the heater being a total loss, other equipment being damaged, and production being shut down.

Process sensor issues also directly contributed to the Texas City refinery explosion, the Buncefield tank farm explosion, … and there are still minimal process sensor cyber forensics. In fact, I am working on a paper with two others for a safety magazine on the lack of cyber security of process sensors. Process sensors have back doors that CANNOT be closed or the sensors won’t work! Moreover, the portable process sensor calibrators have no cyber security, are remotely accessible, and are often connected to the Internet (Control October 2019 – The Instrument Shop).

To demonstrate the breadth of the sensor issue, in the Control Design article on calibrating 4-20 milliamp (mA) sensors, a reader writes: “We have several temperature, pressure and flow sensors on a new medical-device cleaning skid that we are developing. These instruments are connected to a PLC as 4-20 mA inputs, and there is also a 4-20 mA output used to control a pump motor speed. A recent failure of a flow sensor brought the process skid instrumentation to my company’s quality manager’s attention. He asked how we know that the temperatures, pressure and flow are accurate, and how do we know that we are cleaning properly.”
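To make the reader’s question concrete, here is a minimal sketch of how a 4-20 mA loop current is typically scaled to engineering units. The function name, the range arguments, and the fault thresholds (3.6 mA and 21.0 mA) are illustrative assumptions, not values taken from the article or from any particular PLC.

```python
def ma_to_engineering(current_ma, lo, hi, fault_low=3.6, fault_high=21.0):
    """Linearly scale a 4-20 mA loop current to engineering units.

    lo and hi are the engineering values at 4 mA and 20 mA.
    fault_low and fault_high are assumed thresholds for a broken
    or shorted loop; they are illustrative only.
    Returns a (value, status) tuple.
    """
    if current_ma < fault_low:
        return None, "fault_low"    # e.g. open loop or dead transmitter
    if current_ma > fault_high:
        return None, "fault_high"   # e.g. short circuit or saturated output
    # 4 mA maps to lo, 20 mA maps to hi, 16 mA of span in between
    value = lo + (current_ma - 4.0) * (hi - lo) / 16.0
    return value, "ok"


# Usage: a flow transmitter ranged 0-100 (units are hypothetical)
reading, status = ma_to_engineering(12.0, 0.0, 100.0)
```

Note what this check does not do: any current inside the valid band is accepted as the true process value. A drifting, miscalibrated, or counterfeit transmitter that stays in range passes without complaint, which is exactly the quality manager’s concern.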

Yet there are still voices in the OT community pushing to ignore process sensors even though both Russia and Iran are aware of these shortcomings.

OT and control system engineering: they’re not the same

Nation-state cyber attacks are capable of neutralizing even the best technology if it hasn’t considered all of the possibilities. This was true for Stuxnet, even though the Idaho National Laboratory gave a detailed presentation of the Siemens PCS vulnerabilities in 2008. It was also true for the Russian hack of the Ukrainian grid in December 2015, which followed a step-by-step scenario for hacking the grid that DHS gave in May 2015. When Triton occurred in 2017, it was new to many but that should not have been the case. Specifically for the Triton attack, the August 2010 issue of AutomationWorld, Protect Your Control Network, stated: “This is why Invensys (now Schneider-Electric) worked with Byres Security to harden its Triconex industrial safety systems with an OPC firewall. Using specific signatures for the systems Modbus transmission-control protocol (TCP), this inline firewall adds a layer of protection between safety and control systems. Only traffic that is appropriate for the safety system can get into it and only at rates it can handle.” Triton, like Stuxnet, proved that without context, you wouldn’t know if malware was being introduced in a “correct” session. Now superimpose the control system field device issues that haven’t been addressed, and the result is the culture clash now being laid bare between the network security and engineering organizations.
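The idea in the quoted passage, “only traffic that is appropriate” and “only at rates it can handle”, can be sketched as an allowlist combined with a rate limit. This is a hypothetical illustration only, not how the product described in the quote is implemented. The class name, the choice of allowed Modbus function codes (3 and 4, the standard read-register codes), and the one-second window are all assumptions.

```python
class SafetySystemFilter:
    """Toy inline filter: allowlist of function codes plus a per-second cap.

    Hypothetical sketch; the names and limits are assumptions.
    """

    def __init__(self, allowed_function_codes, max_per_second):
        self.allowed = set(allowed_function_codes)
        self.max_per_second = max_per_second
        self.window_start = None   # start time of the current 1 s window
        self.count = 0             # messages admitted in the current window

    def permit(self, function_code, now):
        """Return True if a message may pass at time `now` (seconds)."""
        if function_code not in self.allowed:
            return False           # not appropriate for the safety system
        if self.window_start is None or now - self.window_start >= 1.0:
            self.window_start = now
            self.count = 0         # a new window begins
        if self.count >= self.max_per_second:
            return False           # faster than the system can handle
        self.count += 1
        return True


# Usage: allow only register reads, at most two per second
flt = SafetySystemFilter({3, 4}, max_per_second=2)
decisions = [flt.permit(3, 0.0), flt.permit(4, 0.1), flt.permit(3, 0.2)]
```

A filter like this judges only the message type and the rate. Anything carried inside an allowed message passes, which is the point made above: without context, a “correct” session can still deliver malware.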

The culture gap between engineering and security is real and growing. The control system engineers have made their own contribution to the gap, as an email I received shows: “I've been working in this area for a couple of years. Lots of companies don't have security specialists with expertise in industrial automation. Moreover, most of the industrial automation specialists don't tolerate IT security specialists and try to keep a distance and what is the worst even impede the implementation of security controls.” On the other hand, most control system personnel have been excluded from cyber security teams or cyber security leadership and this doesn’t help either. This is the real disconnect, between OT and engineering, not the IT/OT convergence.

One of the most important disconnects between engineering and OT is that engineering focuses on real-time devices and control/safety while OT focuses on the HMIs and OT networks. However, the HMIs and OT networks are not directly involved with real-time control or safety. In fact, there are many closed loop control systems that don’t even have HMIs. This misunderstanding continues to be a major source of friction between OT security and engineering: they tend to talk past one another. Currently, process sensors are being addressed at the Ethernet IP layer, under the governing assumption that the sensor signal is authenticated, correct, and uncompromised. They are not being addressed at the raw signal level, which is the only way to determine if those assumptions are valid. As field devices are often analog systems, use proprietary rather than commercial-off-the-shelf operating systems, and utilize device level networks, they are foreign to most OT network personnel. Additionally, the device-level networks are often serial, with minimal cyber security capabilities. An additional problem is that the only way to determine if the sensor and process are working properly is to monitor the fluctuations in the signal, which unfortunately have been filtered out by the digital transformation process. From a productivity and “stability” perspective, we have moved forward. From a process forensics perspective, we are now more blind than we were in 2000 and that directly affects process safety. This means the sensor packet data that is input to OT networks is “mistrusted”. That is, trusted when it should not be. Moreover, cyber security requirements in the electric industry (the NERC CIP standards) don’t even consider field devices and lower level networks as these are considered out-of-scope. This is one of the reasons Moody’s is concerned about utility industry cyber security in their November 8, 2019 Sector In-depth report, “Grid modernization heightens vulnerability of utilities to cyber attacks.”
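As a rough illustration of why the raw fluctuations matter, the sketch below flags windows of samples whose variation is suspiciously low, as might happen with a frozen, replayed, or heavily filtered signal. It is a minimal sketch under stated assumptions: the function name, the window size, and the threshold are hypothetical and would have to come from the physics of the actual process.

```python
from statistics import pstdev


def flag_quiet_windows(samples, window, min_std):
    """Return start indices of non-overlapping windows with too little noise.

    samples : raw sensor readings, oldest first
    window  : number of samples per window (assumed value)
    min_std : smallest standard deviation expected from a live
              sensor on this process (assumed value)
    """
    flagged = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        if pstdev(chunk) < min_std:
            flagged.append(start)   # signal is flatter than a live process should be
    return flagged


# Usage: the second half of this trace has lost its natural noise
trace = [10.0, 10.2, 9.9, 10.1, 10.0, 10.0, 10.0, 10.0]
quiet = flag_quiet_windows(trace, window=4, min_std=0.05)
```

This kind of check is only possible on the unfiltered signal. Once the fluctuations have been smoothed away upstream, both halves of the trace above look the same to the OT network.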

Another important issue is that information sharing on control system cyber incidents, and dealing realistically with them, doesn’t work. In doing my annual end-of-the-year cleaning, I found an old article everyone should read from the July 2014 issue of Control, “Silence = Insecurity.” You can also read the cyber war games blog on the disconnect with reality: https://www.controlglobal.com/blogs/unfettered/the-gap-between-war-games-and-reality-observations-from-the-2019-naval-war-college-cyber-war-game/

Until the gulf between OT and engineering is addressed it will not be possible to secure any industrial or manufacturing organization!

Cyber security involves people, process, and technology.

People – There is still a lack of control system device cyber security training and cyber security training for the engineers. Without appropriate training, process and technology can, and will be, bypassed or compromised. Because of the lack of control system cyber forensics and training, it may not be possible to know whether an incident is a malicious cyber attack or not, which is why process anomaly detection is so important. The people issues also include coordination (or lack thereof) between engineering and network (IT/OT) organizations as well as cyber security and process safety organizations. An example of the gulf between security and engineering is the sector coordinating councils, which are dominated by CIOs and CISOs with almost no Operations participation. If you are trying to secure an industrial or manufacturing organization, how can you ignore Operations? This is important because the network organizations understand network considerations but generally not physics or process safety issues which can lead to long-term physical damage. Stuxnet, Aurora (Power September 2013, What you need to know and don’t about the Aurora vulnerability), and the data center damage by compromising power supplies are examples of using cyber to cause physics issues without using malware. Yet these engineering issues were missed, or not understood, by network cyber security experts. It’s not that they’re wrong to look at OT network security, but rather that the picture they develop when they do so to the exclusion of other safety and security issues is incomplete.

Process – There is a lack of control system cyber security policies based on actual incidents. There is also a need to assure that only appropriate security technology and testing methodologies are being used, as inappropriate IT/OT cyber security testing and technologies have shut down or damaged control system equipment. Process issues include information sharing, training, supply chain management, patching, and procurement. The use of inappropriate patches and patch management is why ISA99 initiated 62443-2-3 – Patch Management for Control Systems. Most of the training being offered is for OT personnel to monitor OT network anomalies, not system impacts (I am not aware of training being offered at the raw sensor layer). The Triton malware shut down the plant before it was recognized that this may have been a cyber attack, despite the plant having segmentation and firewalls. There also continues to be a focus on network vulnerabilities rather than process impacts.

Technology – Technology includes network monitoring - network anomaly detection - and control system device level monitoring - process anomaly detection. The need for monitoring is to “keep lights on and water flowing” whether the networks are available or not. Malicious and unintentional control system cyber incidents continue to occur, though they are often not made public. The vast majority of control system incidents will not be malicious but they are still critical to monitor for reliability and safety. As a result of the reticence to disclose cyber incidents in the electric industry, the Federal Energy Regulatory Commission (FERC) has initiated a “name and shame” approach. The reticence to call control system incidents malicious, whether cyber or not, manifested itself in a recent case where senior utility management called the forced shutdown of a power plant sabotage while the union called it shoddy maintenance. The incident was real as the plant was shut down. This case is symptomatic of why the security world is unaware of most actual cases.

A look back at 2019

Specific to 2019,

- Operational Technology (OT) cyber security and threat hunting vendors continued to flourish. Several OT vendors were gobbled up by IT firms to add OT network expertise to their offerings. However, the domain expertise of the systems being secured wasn’t transferred: they weren’t part of the network monitoring technologies.

- The insurance industry has been getting more involved in offering cyber insurance to industrial facilities without understanding either the issues specific to control system cyber security or the lack of adequate control system cyber security metrics.

- Credit rating agencies are also getting involved. The credit rating agencies are recognizing that the enterprise risk for an industrial or manufacturing organization is the control systems, not IT. Unfortunately, other organizations (like NERC) have not yet made that connection.

- ISA99 has issued several new ISA62443 standards and many more are in progress.

- Process safety standards (ISA84) are working with security standards (ISA99) to better understand the safety and security issues – particularly at the field device level. This is a very big issue as process transmitters come with back doors that cannot be disabled.

- Physics issues such as Aurora are also not well understood. In fact, new sensor designs include built-in web servers, analog valves are being upgraded to communicate with the web, etc.

Some of the malicious control system cyber attacks in 2019 included:

- A US utility cyber attack (reports of which NERC initially kept quiet)

- Multiple ransomware attacks including manufacturing and ports

- Bitcoin mining in a nuclear plant (no impact on control systems)

- Indian nuclear plant cyber attack (no impact on control systems)

- RavnAir Dash 8 maintenance system hack

Some of the unintentional control system cyber incidents in 2019 included:

- Boeing 737 MAX Ethiopian Airlines crash - 157 died (the first 737 MAX crash was Lion Air in 2018)

- Multiple unintentional control system cyber incidents with multiple US utilities resulting in loss of view and control (no mention of the term “cyber”)

- Shutdown of a major US data center

- Stem cell freezer failures

- Valdosta water treatment plant spill

- Boeing Starliner failure to dock

In 2019, field device cyber security issues were presented at the following conferences with the following observations:

- Texas A&M Instrumentation & Automation Symposium (this was mostly engineers who did not understand cyber security)

- S4 (primarily OT security participants where control system cyber issues with field devices were not understood)

- Atlantic Council International Conference on Cyber Engagement (this was mostly senior policy attendees where control system issues were not understood)

- ICSJWG (primarily OT security who did not understand control system devices)

- National Association of Water Companies Cyber Security Conference (control system cyber issues with field devices not understood)

- American Society of Civil Engineers (control system cyber issues with devices not understood)

- National Renewable Energy Laboratory Wind Farm Cyber Security Conference (control system cyber issues with field devices not understood)

- Naval War College Cyber War Games (primarily OT security participants where control system cyber issues with field devices were not understood)

- GE Edge Control System Conference (primarily OT security participants in our session where control system cyber issues with field devices were not understood)

- EnergyTech 2019 (primarily OT security participants where control system cyber issues with field devices were not understood)

- 2nd Medical Device Cyber Security Summit (primarily OT security participants where control system cyber issues with field devices were not understood)


Joe Weiss