Summary of INCOSE control system cyber security presentation – IT/OT is almost as much of a threat to infrastructures as the hackers

April 13, 2020
Thursday, April 9th, 2020, I gave a presentation to the International Council on Systems Engineering’s (INCOSE) Critical Infrastructure Protection and Recovery (CIPR) working group monthly call (go here to learn more about INCOSE and the CIPR working group). The goal of the presentation was to present a control system engineer’s view of cyber security and to discuss cyber vulnerabilities in “smart” systems that relate to the INCOSE Smart Cities Initiative. Participants spanned many organizations within and outside INCOSE. INCOSE working group participation included the Transportation Working Group, the Power & Energy Working Group, and the Object-Oriented Software Engineering Methodology Working Group, as well as the Smart Cities and Resilient Hospital Model initiatives, among others. External collaborating participants included members of the ISA (International Society of Automation), IEEE, SAE (previously the Society of Automotive Engineers), InfraGard’s National Disaster Resilience Council (NDRC), domestic and international utilities, manufacturing companies, universities, and other public and private organizations. It is always difficult to get a credible count of dial-in participants; in this case, there appear to have been 114 connections at the peak, and over the course of the call 229 people joined. By any measure, this was a very successful call.

It is clear the gap between engineering and network security, whether IT or OT, is growing. This gap was extremely evident in discussions within ISA99, INCOSE, and elsewhere. To engineers, safety is sacrosanct and must be protected at all costs. That was not the view of some of the IT/OT cyber security practitioners. On 4/7/20, in a control system cyber security discussion about safety and security, a very smart IT security expert now working in OT for a major control system supplier stated: “The definition difference is probably a matter of our different backgrounds in security…I just don't believe safety belongs as a top-level addition to the CIA triad - if anything it's already included in Integrity if you assert that OT by default includes people as an 'asset' of an IACS.” As a nuclear engineer who grew up with nuclear safety, I find this cavalier attitude toward safety completely unacceptable, and it should be unacceptable to anyone working in control system cyber security.

The gap between IT/OT and engineering can be seen in the impacts, and even physical damage, that network security practices have caused to control systems and plant equipment. Impacts have ranged from denial-of-service to actual equipment damage. One of my slides was drawn from the NASA Inspector General’s report on NASA cyber security and how IT/OT practices caused real physical issues:

- “A large-scale engineering oven lost ability to monitor and regulate temperature when a connected computer was rebooted after application of a security patch update intended for standard IT systems.  The reboot caused the control software to stop running, which resulted in the oven temperature rising and a fire that destroyed spacecraft hardware inside the oven.  The reboot also impeded alarm activation, leaving the fire undetected for 3.5 hours before it was discovered by an employee.” 

- “Vulnerability scanning used to identify software flaws caused equipment to fail and loss of communication with an earth science spacecraft during an orbital pass.  As a result, the pass was rendered unusable and data could not be collected until the next orbital pass.”

These are just two examples of the danger of having people without adequate knowledge of the systems, or of system interactions, involved with these critical systems. I can cite many more incidents that either caused denial-of-service or even damaged control systems.

What follows is a small sample of the responses I received to the presentation:

The lack of coordination between IT/OT and engineering can be seen from a comment by one of the attendees. In his note asking for the slides, a representative from a major manufacturer stated: “Your discussion was something that I’d like to be able to bring both the Plant Engineering and IT folks in the same auditorium and lock the doors for a while.  Having just spent my Saturday recovering plant systems from a switch security firmware update change underneath our plant network just indicated again how far we are away from where we need to be.”

One of the other attendees, who until recently was his country’s delegated representative to the NATO Energy Security Center of Excellence, wrote on LinkedIn: “An excellent presentation. You made very good and understandable points. My takeaways (short list) are:

- IT and OT are closer in terms of what they do vs what is going on in engineering (closest to the physical process)

- In ICS, it is less about “data security” and more about “system security”

- For IT, endpoints are laptops; for ICS, endpoints are Level 0,1 devices (very important distinction)

- For control systems, “Safety is what is expected” while “security is what is unexpected”

- OT is about control system networks, not control systems and equipment

- Measurements start out as ANALOG then connected to digital (WOW probably the most surprising takeaway for me)

Also learned about a new organization INCOSE. Where I would split a hair or two would be in my placing more emphasis on the dangers to ICS from APT. Do not think the majority of individual attackers (hackers) have the required knowledge and skill sets. Not to mention the resources and intel needed to craft an effective attack. Fully agree about the need to be aware of the unintentional events as well as the intentional. Glad I was able to watch your presentation.”
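The takeaway that measurements start out as analog can be illustrated with a minimal sketch (the loop range, ADC resolution, and engineering-unit scaling below are hypothetical, not from the presentation): a 4-20 mA transmitter signal is quantized by an ADC before any network ever sees it, so the value a monitoring system receives is already a digital interpretation of the physical process.

```python
# Minimal sketch: how an analog 4-20 mA sensor signal becomes a digital value.
# All ranges and resolutions here are illustrative assumptions.

def ma_to_counts(current_ma, adc_bits=12, ma_min=4.0, ma_max=20.0):
    """Quantize a 4-20 mA loop current into raw ADC counts."""
    fraction = (current_ma - ma_min) / (ma_max - ma_min)  # 0.0 .. 1.0
    return round(fraction * (2 ** adc_bits - 1))          # 0 .. 4095

def counts_to_engineering(counts, adc_bits=12, eu_min=0.0, eu_max=150.0):
    """Convert raw ADC counts into engineering units (e.g., degrees C)."""
    fraction = counts / (2 ** adc_bits - 1)
    return eu_min + fraction * (eu_max - eu_min)

# A 12 mA signal is halfway up the loop range...
counts = ma_to_counts(12.0)
temperature = counts_to_engineering(counts)
# ...so the network-visible value is roughly 75 degrees, but nothing
# downstream can verify that 12 mA actually flowed in the loop.
# That gap between the physics and the digital value is the Level 0,1 issue.
```

Everything above the ADC is invisible to network monitoring, which is why the takeaway singles out Level 0,1 devices as the real ICS endpoints.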

One other stated: “Fantastic presentation!  BRAVO!  You provided such critical and insightful information throughout every minute of your talk.  Typically, one is overly thankful to get one decent take-away. I’m a founding member of InfraGard’s NDRC (National Disaster Resilience Council, former EMP SIG) and a long-time global telecom/Cloud executive and advisor.  I’ve been working the IIoT space for a few years now, mainly indirectly via NIST Cloud working groups, SMART City programs and via start-up technology (e.g., sensors for smart city, agriculture, transportation, etc.) companies that have solutions but face the challenging challenges to educate companies the realities required to create smart AND secure systems. You pretty much nailed the situation.  That was great!  Thank you!  You nailed it.  The full requirements to undertake and maintain simple-to-complex systems are not comprehensibly understood by most, probably all, in different ways and on various levels.  My colleague and I have been fighting this battle for some time, only to achieve even more frustration. Please never stop evangelizing your mission!  It’s more than critical, it’s essential.”

Questions from the Q&A session that I was not able to respond to during the presentation included:

  1. What can be done to provide sensor integrity authentication?
  2. Can AI models of hazard impacts at sensor locus reduce risks of shutdown due to adversary spoofing?
  3. Are the more than 1,200 incidents all cyber-related?
  4. If a cyber incident happens to an ICS, which is almost unavoidable, what can and should the engineer do? Will it be too late if the engineer only observes an abnormality in the physical system?
  5. Monitoring the real-time analog sensor values is reactive rather than proactive; malicious software may reside on the controller for months or years before being activated, making it too late to detect the physical anomaly (e.g., TRITON). What would be the best way to address this attack vector so that we can prevent execution of an attack ahead of time instead of responding to it?
  6. What questions do engineers ask to begin to address layer 0,1 vulnerabilities that impact the security of the whole system?  i.e., what is the right language to get the attention of owners and cyber teams in order to collaborate?
  7. The VW diesel was an interesting case of the COMPANY "doing the hack itself" - to deceive regulators.  Are there other such cases?
  8. Do you recommend that we stop using the term OT and instead call it ICS cyber security, getting engineers more involved to bridge the existing gaps?
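One way to think about questions 1 and 5 is cross-checking redundant measurements of the same physical process, so that a single spoofed sensor stands out against its peers. The sketch below is illustrative only (the tag names, readings, and deviation threshold are hypothetical, and this is not a method presented in the talk):

```python
# Illustrative sketch: a crude plausibility check that compares redundant
# sensor readings so a single spoofed or failed value stands out.
# Tag names, readings, and the threshold are hypothetical assumptions.

def flag_outliers(readings, max_deviation=2.0):
    """Return sensor names whose reading deviates from the median of all
    readings by more than max_deviation engineering units."""
    values = sorted(readings.values())
    n = len(values)
    median = (values[n // 2] + values[(n - 1) // 2]) / 2
    return [name for name, value in readings.items()
            if abs(value - median) > max_deviation]

# Three redundant temperature transmitters on the same vessel:
suspect = flag_outliers({"TT-101": 74.8, "TT-102": 75.1, "TT-103": 92.3})
# TT-103 disagrees with its peers and should be investigated.
```

Note that a check like this only works where raw measurements enter the system, which is exactly the Level 0,1 visibility the presentation argued is missing; it does not, by itself, authenticate any sensor.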

With the large attendance, it was evident there was an interest in learning about the critical, but generally not addressed, issues of the engineering aspects of control system cyber security. There was also a common thread that control system cyber security issues are more than just IT/OT convergence. From the broad participation in the call, it was also evident there is a need for these different standards and engineering organizations to collaborate and the need for coordination among these groups. To date, it is unclear who will step up to do the coordination.

Joe Weiss