A recent event demonstrates the lack of grid resiliency and the inability to clearly identify cyber attacks

Unfortunately, the responses to my blog “An almost catastrophic failure of the transmission grid” (https://www.controlglobal.com/blogs/unfettered/are-the-good-guys-as-dangerous-as-the-bad-guys-an-almost-catastrophic-failure-of-the-transmission-grid/) focused on the IT/OT disconnect. To me, that misses the technical concerns, which are:

- The inability to unambiguously identify an event as being a cyber attack. As seen in the following discussion, the impacts from this incident strongly resemble the impacts from the Industroyer malware.

- The widespread impacts from incidents that may or may not be malicious in nature. In this case, the loss of communication and the inability to switch hundreds of high voltage relays demonstrate a significant lack of grid resiliency.

- The lack of indication of trouble with the relays. In this case, only a fraction of the relays showed a trouble light even though all relays were impacted and had to be manually rebooted.

The impact of the incident looks similar to the 2016 attack on a Ukrainian transmission facility. The details of the attack were provided in the ESET and Dragos reports. IEC 61850 was one of the protocols targeted. Additionally, a unique port scanner was used, which had the effect of a DoS disruption of relays that then had to be manually reset.

According to the ESET report on Industroyer, the attackers’ arsenal included a port scanner that could be used to map the network and to find computers relevant to their attack. Instead of using existing software, the attackers built their own custom-made port scanner. The attackers could define a range of IP addresses and a range of network ports to be scanned by this tool. Another tool from the attackers’ arsenal was a Denial-of-Service (DoS) tool that leveraged the CVE-2015-5374 vulnerability to render a device unresponsive (this vulnerability disclosure was for one specific vendor’s relays, and the question is how vulnerable other vendors’ relays would be). Once this vulnerability was successfully exploited, the target device would stop responding to any commands until it was rebooted manually. To exploit this vulnerability, the attackers hardcoded the device IP addresses into this tool. Once executed, the tool would send specifically crafted packets to port 50,000 of the target IP addresses using UDP.

Because the impacts at this utility were so similar to those of the Industroyer malware, I was asked if the case could be a “test” run of the Industroyer malware. I was told by the utility there was no malware.
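The traffic patterns described in the ESET report (one source probing many ports on relay addresses, and UDP packets sent to port 50,000 of a relay) can at least be flagged for review. The following is a minimal defensive sketch, not a description of any utility's or vendor's actual monitoring. The `Flow` record, the `flag_suspicious` function, and the `scan_threshold` value are all hypothetical names and numbers chosen for illustration, and the sketch assumes flow records have already been collected from the substation network.

```python
from collections import defaultdict
from typing import NamedTuple


class Flow(NamedTuple):
    """Hypothetical flow record; field names are assumptions for this sketch."""
    src: str    # source IP address
    dst: str    # destination IP address
    proto: str  # "udp" or "tcp"
    dport: int  # destination port


# UDP port named in the ESET description of the DoS tool.
DOS_PORT = 50000


def flag_suspicious(flows, relay_ips, scan_threshold=20):
    """Return (dos_candidates, scanners) for traffic aimed at relays.

    dos_candidates: flows sent to a relay on UDP port 50000.
    scanners: sorted source IPs that touched at least `scan_threshold`
              distinct (relay, port) pairs. The threshold is an assumed value.
    """
    dos_candidates = []
    targets_by_src = defaultdict(set)
    for f in flows:
        if f.dst not in relay_ips:
            continue  # only traffic toward protective relays is of interest
        targets_by_src[f.src].add((f.dst, f.dport))
        if f.proto == "udp" and f.dport == DOS_PORT:
            dos_candidates.append(f)
    scanners = sorted(
        src for src, targets in targets_by_src.items()
        if len(targets) >= scan_threshold
    )
    return dos_candidates, scanners
```

Note what this sketch cannot do: it flags a pen tester's scanner and an attacker's scanner identically. It reports that relay-affecting traffic occurred, not whether the intent was malicious, which is exactly the ambiguity at issue here.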

Consequently, because of the similarities between the impacts at this utility and those described in the Industroyer report, there are three questions:

- Was the resemblance between this event and the impacts of Industroyer totally coincidental? If so, what other unintentional incidents can cause equipment problems and be indistinguishable from cyber attacks?

- Was the Industroyer malware somehow loaded onto the pen tester’s software because the attackers knew the utility’s substation configuration? If so, why didn’t the utility’s cyber security program detect the malware, particularly after being informed of Industroyer? Is the malware still resident? How many other utilities would be incapable of detecting this malware?

- Did the developers of Industroyer know that “innocent” pen testing software could cause these impacts? If so, was the Industroyer malware developed to mimic the unintentional impacts, making detection of the malware very difficult at best?

I’m not sure which answer is more problematic. The first answer would demonstrate a lack of control system expertise, which is completely unacceptable. The second answer would lead to the question of how many other utility networks have been compromised, particularly as BlackEnergy 2 has been in some US grids since at least October 2014. The third answer means the attackers were knowledgeable enough to know the weakness of the pen testing software and its impact on critical grid equipment. If this is the case, how many other software products that could cause grid disruptions have been mimicked? All of these cases point to the inability to clearly distinguish between unintentional impacts and cyber attacks, making it difficult at best to meet NERC CIP and NEI 08-09 malware identification requirements.

Another issue is the use of ring bus technology, which is used by many utilities. The ring bus effectively “transmitted” the communication failure from one relay to another, resulting in the loss of communication to hundreds of high voltage relays.

This event clearly demonstrates the lack of grid resiliency and the inability to identify certain events as being cyber attacks.

Joe Weiss