On June 29th, I wrote the following blog:
In the control systems community, the primary focus is on safety and reliability, and the most frequent cyber risks are unintentional. As Walt Boyes phrases it, the control systems community needs to focus on functional security. Functional security addresses the ability of systems to perform their functions in the face of intentional or unintentional cyber threats while assuring fail-safe operation. Functional security requires not just control systems domain expertise but also looking at system design and policies from a different perspective. The lack of functional security has led to control system cyber incidents in electric power, water, oil/gas, chemicals, and transportation, including several with fatalities. The Air France (aircraft) and Washington DC Metro (rail rapid transit) accidents apparently involved cyber control system failures; the Olympic Pipeline Company – Bellingham (gasoline pipeline) incident did involve cyber control system failures.
Common issues in Air France, DC Metro, and Olympic were:
- Reliance on remote (automated) system control
- Previous failures with control systems and components
- Logic that did not provide for “fail-safe” conditions
- Violation of the NIST Confidentiality/Integrity/Availability criteria
Modern communications and control system technologies are making systems more productive but less robust. Many control system cyber incidents did not violate IT security policies because they were control system design or operation issues. And, yes, they could have been intentionally caused. The Smart Grid will further blur the lines between IT and control systems, making functional security even more important. However, control system domain expertise is lacking. It’s time to address functional security of control systems before more people die.
On August 25, I had an opportunity to speak with the NTSB. After I explained the NIST definition of a cyber incident and what we have been finding to date, the NTSB engineer concurred that the DC Metro accident was a non-malicious control system cyber incident. Moreover, not only was there a lack of communications on the position of the train in the block, there was also a lack of appropriate alarms and of actions taken on the alarms that did arise. This is very similar to other control system cyber incidents in electric power, water, pipelines, etc. That means there is at least one other commonality between Bellingham and DC Metro (and, for that matter, the Northeast Outage, the Florida Outage, etc.): operator confusion due to “inaccurate” operator displays.
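To make the “fail-safe” point concrete, here is a minimal sketch of how block-occupancy logic can default to the safe state when position data is missing or stale. It is an illustration only, not the Metro's actual signaling design; the names `BlockReport`, `block_state`, and the `STALE_AFTER_S` threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical freshness limit: older readings are not trusted.
STALE_AFTER_S = 2.0


@dataclass
class BlockReport:
    occupied: Optional[bool]  # None means the sensor gave no valid reading
    age_s: float              # seconds since the reading was taken


def block_state(report: Optional[BlockReport]) -> Tuple[bool, bool]:
    """Return (treat_as_occupied, raise_alarm) for one track block."""
    # Fail-safe rule: no report, no valid reading, or a stale reading
    # all count as "occupied", and the operator is alarmed.
    if report is None or report.occupied is None or report.age_s > STALE_AFTER_S:
        return True, True
    # Fresh, valid reading: trust it and raise no alarm.
    return report.occupied, False
```

The design choice is that the absence of information is never read as “clear”: a lost or outdated signal stops traffic and alerts the operator instead of leaving the display showing the last known good state.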
The issue of functional security and fail-safes not being fail-safe will be a topic of discussion at the October Control System Cyber Security Conference.
Joe Weiss
Further reading:
Metro Control System Fails Test Technology Should Have Averted Crash, http://www.washingtonpost.com/wp-dyn/content/article/2009/06/25/AR2009062501073.html
NTSB: Metro signal system didn't detect test train, http://www.washingtonpost.com/wp-dyn/content/article/2009/06/25/AR2009062500652.html
Metrorail Crash May Exemplify Automation Paradox, http://www.washingtonpost.com/wp-dyn/content/article/2009/06/28/AR2009062802481.html
Bellingham Control System Cyber Security Case Study, http://csrc.ncsl.nist.gov/groups/SMA/fisma/ics/documents/Bellingham_Case_Study_report%2020Sep071.pdf