What would a cyber attack mean to control system recovery – is extended manual operation possible?

May 2, 2016

Control system cyber attacks can cause extended outages even with no equipment damage. Are companies ready to operate in manual for extended periods of time?

The prevailing view of SCADA/control system recovery following a cyber event or attack is that having a valid stored image of the HMI will assure system integrity and allow a fairly quick turnaround (a few days at most). However, reality is entirely different. At a recent DHS briefing, DHS stated it could take 6 months to fully recover from a cyber attack, assuming no major equipment damage. If major equipment with long-lead delivery times is damaged, it could take 9-18 months just to replace the equipment. It is now more than 4 months after the Ukrainian cyber attack, and the grid is still being operated in manual even though there was no major equipment damage. After a cyber attack disabled a US SCADA system, the utility took approximately 4 months to recover even though there was no equipment damage. During that period, the utility was forced into manual operation, with substations having to be manned. An equipment manufacturer took 3 months just to “clean the system” following a virus that affected its manufacturing systems, even though there was no equipment damage.

Control systems are systems of systems, not just Windows-based HMIs. The control system loop consists of data acquisition, field controllers, field sensors, field actuators, etc., not to mention all of the switches, routers, and firewalls. The field sensors, actuators, etc. are generally not Windows-based. Many control system components are custom-built and cannot be replaced with off-the-shelf components, which could extend outages or manual operation even longer.

US electric utilities and nuclear plants have no requirement to remove malware. DHS has acknowledged that BlackEnergy malware is in the US electric grids and possibly other critical infrastructures. Consequently, once malware is identified, how will system integrity be assured to the degree needed to allow manual operation? Manual operation may be the only prudent approach until system integrity can be assured. However, that may not be as simple as it sounds. As Ray Parks, who worked for Sandia National Laboratories for many years, noted in an April 29, 2016 SACASEC posting (https://us-mg6.mail.yahoo.com/neo/launch?.rand=502ma5rapkdk4#9781096986): “Every one of the power generation and refineries I've visited for assessment has claimed to have a backup plan using manual methods in case the I&C system goes down. In reality, when you ask a few penetrating questions like "Who will be at each of these locations in the plant?" or "When did you last test the manual backup plan?" the truth comes out - their backup plan is a sham. This architecture adds an additional layer - the primary is the cloud solution, the secondary is the local control system solution, and the tertiary is the manual solution. When did they last test their local control system solution for backup? Who knows how to operate the local control system?” My experience mirrors Ray’s comments.

With the “graying” workforce retiring and the Internet of Things (IoT) becoming more prevalent, extended manual operation becomes even more problematic. Compounding the issue, the move to more automation makes it less likely that facilities can be operated in manual mode. This also affects the move toward more big data analytics, because it is unclear whether the data can be trusted following a cyber attack.

Are the US critical infrastructures ready to operate in manual mode for an extended period of time? There may be no other choice.

Joe Weiss