Live from Yokogawa: "The Silence of the Lambs" Nate Kube on Cyber Security

April 9, 2008
The Silence of the Cyber Threat

Dr. Nate Kube, co-founder and CTO of Wurldtech Security, described the control systems of yesteryear as he began his talk, "Answering the Silent Threats to Automation and Process Control." Yesterday, he began, process control networks were isolated, or "air-gapped," from enterprise networks; they used proprietary communications and products ("security through obscurity"). Differing technologies prevented cross-pollination, and stakeholders perceived zero risk of outside intrusion.

Today we see rapid adoption of industrial Ethernet and of distributed control systems based on open networks. Interoperability and real-time data access are key functionalities, and there is an increase in the frequency, severity, and complexity of cyber threats. Security in the process industries is not well understood, and ownership of security issues is unclear.

In the future, Kube opined, we'll see industrial organizations outsourcing security solutions (and their ownership). We will see the emergence of sophisticated, multi-vendor "blended threats." And without ICS-specific data, threats will pass through legitimate openings in perimeter devices like firewalls and IPS/IDS. "Security," he said, "will be a major product differentiator."

"We must first understand the risk, and then implement and maintain solutions," Kube went on. "This is a simple, yet misunderstood view. Security assessments can help identify risk, but they can also create a huge challenge. Inadequate techniques do not uncover enough risk, and inappropriate techniques generate too much FUD."

There are some serious challenges for industrial cyber risk management. "Traditional IT governance and assessment methodologies do not expose shop-floor risk," he said. "Despite years of 'preaching,' we are still no further along toward meaningful risk profiles.
In the absence of clear risk data, companies are often making critical mistakes, either doing too much or doing too little, and current vulnerability test methodologies generate too many false positives and negatives."

Industrial cyber security, like safety, is best quantified by impact. The impact (financial, safety, etc.) of losing control of a given process should dictate the security considerations given to the components of that process. Three independent components dictate the overall security of a process:

- Technology quality: resilience of communications, stack implementation, failure modes
- Process topology: component devices, mitigation functionality, configuration schemes
- Personnel conduct: guidelines for interaction with the system, update policies, monitoring policies

"It has always struck me," Kube said, "that we've built a culture of FUD. Vulnerability discovery and disclosure has always had an element of cloak and dagger, and 0-days have lots of appeal... to hackers. They contribute to a 'hair on fire' syndrome: every time a new vulnerability comes out, a new patch is required, generating an alarmist response which is nothing more than chasing your tail."

Under current trends, every time a vulnerability is found, something must be done. But this does not adequately address new threats, and the device or program may not be patchable because of device age or firmware issues (requiring a "forklift" upgrade). This does not actually increase security, and it contributes to the frustrating view of security as a cost center and of security researchers as fear mongers.

Kube went on, "Each electronic component has its own 'failure signatures,' or limits and tolerances, and the key is to understand the resilience of the device and make it more so.
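To make the impact-driven view above concrete, here is a minimal sketch of ranking processes by the weighted impact of losing control of them. The weights, category names, and process names are hypothetical illustrations, not anything Kube or Wurldtech published.

```python
# Hypothetical impact weights -- illustrative only, not from the talk
IMPACT_WEIGHTS = {"safety": 5, "financial": 3, "quality": 2, "efficiency": 1}

def impact_score(impacts):
    """Sum the weighted impacts of losing control of a process."""
    return sum(IMPACT_WEIGHTS[i] for i in impacts)

# Hypothetical processes and the impact categories a loss of control would hit
processes = {
    "boiler_control": {"safety", "financial"},
    "batch_reporting": {"quality"},
}

# Rank processes so the highest-impact ones get the most security attention
ranked = sorted(processes, key=lambda p: impact_score(processes[p]), reverse=True)
print(ranked)  # ['boiler_control', 'batch_reporting']
```

The point of the sketch is only the ordering: security effort follows impact, not the other way around.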
Device testing and vulnerability analysis can complement each other to build an understanding of the exploitable threats against the device." Understanding process risk requires understanding control-device failure modes:

- Loss of efficiency
- Loss of control
- Loss of quality
- Safety failure
- Regulatory compliance issues

For industrial users, this involves device faults and erratic behavior at the I/O level; network stack faults alone are not sufficient. You have to tie the device to the real world.

Communication protocols are complex, and their specifications may contain areas of ambiguity. Incorrect assumptions and insecure development are common sources of vulnerabilities. Proprietary protocols typically have a small user base. Disparate technologies interface in unexpected ways. Resource-intensive communication functionalities inhibit other key processes.

Few thorough, repeatable, extendable testing tools are available to vendors. The ones that exist are imprecise, making it difficult to accurately and consistently reproduce errors. Frameworks do not support rapid extension to proprietary protocols. Each tool has limited observability into the effect of a test (network impact only). Signature-centric assessment tools are based on CERT and BugTraq vulnerability lists, which are IT-focused and aren't always congruent with control-system needs. Control devices are not the usual Unix- or Windows-based platforms. The result: meaningful resilience or robustness testing is often neglected in QA cycles.

The most common objections: "That can't happen because of X," where X = firewall, safety system, etc. "Who would do that to us?" This is a challenge of "positive design" versus "negative analysis." Engineers design with the intended purpose of the system in mind; they consider failure in terms of entropic events, such as safety or machine failure, where other compensating controls maintain a safe state.
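The "negative analysis" Kube contrasts with positive design is exactly what protocol fuzzing does: probing a device with inputs its designers never intended. As a rough sketch of the idea (not Wurldtech's tooling), here is a toy mutator for protocol frames; the seed is a well-formed Modbus/TCP read request, chosen only as a familiar example of an industrial protocol.

```python
import random
import struct

def mutate(frame: bytes, rng: random.Random) -> bytes:
    """Return a randomly corrupted copy of a protocol frame."""
    data = bytearray(frame)
    choice = rng.randrange(3)
    if choice == 0:
        # Flip all bits in one random byte
        data[rng.randrange(len(data))] ^= 0xFF
    elif choice == 1:
        # Truncate the frame at a random point
        data = data[: rng.randrange(len(data) + 1)]
    else:
        # Oversize the length field (bytes 4-5 of the Modbus/TCP MBAP header)
        data[4:6] = struct.pack(">H", 0xFFFF)
    return bytes(data)

# Well-formed Modbus/TCP "read holding registers" request (12 bytes)
seed_frame = bytes.fromhex("000100000006010300000001")

rng = random.Random(42)  # fixed seed so a failing case can be reproduced
test_cases = [mutate(seed_frame, rng) for _ in range(5)]
for case in test_cases:
    print(case.hex())
```

A real fuzzer would send each mutated frame to the device under test and, as Kube stresses, observe the effect at the I/O level rather than on the network alone.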
As engineers, we assume that the process will "fail safe." Hackers, on the other hand, don't just send one bad packet; they research for weeks or months. Intentional attackers will bypass safety systems and other defenses. It's not a matter of "who" but "how"; the "who" can be created in a single day given enough motivation (e.g., someone threatened with a layoff, or mad at a co-worker).

Kube went through a litany of what he called "common misbeliefs":

- "Our control protocols have CRCs or other message checksums, so we are protected." Incorrect. The checksums protect against transmission errors. Attackers can recalculate the checksums to match the payload, or someone may read the spec and implement it differently while the checksums still validate.
- "We use cryptography (or IPSec), so we are completely protected (and our key exchange is ultra-secure)." Incorrect. Encryption protects against eavesdropping and alteration of message data while in transit. An attacker can establish the cryptographic tunnel and run attacks through the tunnel. Moreover, the cryptographic protocol and key exchange may themselves be attacked.
- "We have AV, firewalls, and IPS/IDS, so we are protected." Incorrect. Targeted attacks based on 0-day (fuzzing) exploits are practically impossible for any existing technology to detect. These tools protect you against script kiddies and after the fact (OK, to be fair, there are some interesting technologies here too). In fact, AV, firewalls, and IPS/IDS add a layer of components that may become security threats themselves; as protocol implementations, they need to "peek" inside data that is sent and received, making them possible weakest links.
- "Our control networks are not accessible to outsiders (or connected to IT), so we are protected." Never the case, but suppose it is true! Have you thought of evil insiders or disgruntled employees? The Australian sewage spill at Maroochy Shire is a case in point.
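The first misbelief is easy to demonstrate: a checksum only detects accidental corruption, because anyone who can modify the payload can recompute it. A minimal sketch, using CRC-32 as a stand-in for whatever checksum a given control protocol specifies:

```python
import zlib

def frame(payload: bytes) -> bytes:
    # Append a CRC-32 over the payload, as many protocols do for transmission errors
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def valid(msg: bytes) -> bool:
    # Receiver-side check: recompute the CRC and compare
    payload, crc = msg[:-4], int.from_bytes(msg[-4:], "big")
    return zlib.crc32(payload) == crc

original = frame(b"SET_VALVE=25")
tampered = frame(b"SET_VALVE=99")  # attacker simply recomputes the CRC

print(valid(original))  # True
print(valid(tampered))  # True -- the forged command validates just as well
```

The receiver cannot distinguish the forged command from a legitimate one; integrity against an active attacker requires a keyed construction (e.g., a MAC), not a checksum.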
What about viruses on maintenance personnel's laptops? And even if a security breach is not a concern, how about system stability? Ever had a malfunctioning switch bring a control or safety network to its knees?

Kube discussed a case study in which the CIO of a power plant asked Wurldtech to attack it, even though it had just passed several NERC CIP audits. In four hours, Kube said, Wurldtech's attack team:

- Disabled all generation systems
- Caused erratic behavior on sensor networks
- Interrupted communications to historians and the shift office, which is hooked directly to real-time traders ($$$$)
- Demonstrated a theoretical attack against dam spillway management
- Disabled the firewall and demonstrated bypass techniques
- Demonstrated numerous possible malware- and virus-based attacks

"What we need to do," Kube said, "is to develop models that are based on threat vectors, not on threat agents. In other words, we need to figure out WHAT can happen and prevent that, rather than try to figure out WHO might attack." We need vulnerability analysis, robustness and resilience testing, and solid vulnerability taxonomies: security failure-mode profiles for industrial environments. Then we can develop and implement mitigation strategies against these broad classes of threats.

The benefits of doing it this way are large. It creates a greater up-front design cost, but fewer touch points later on. It gives the owner greater confidence in the installed architecture. It improves the performance of the overall system, not just from a security standpoint, and sets the stage for benefit-driven security. Finally, it resembles safety-driven models, which the controls industry is quite familiar with. Unfortunately, this model requires a level of device testing that is not commonly in use yet, so operators and owners can either "roll their own" or wait on ISA99.04 and the Industrial Security Compliance Institute (ISCI).
The model does not address every vulnerability, nor could it, because of the fundamental limitations of device testing. The technique should address security in terms of design constraints and of how to design a resilient process. It is not a guarantee of "no faults," but it does provide a robust, fault-tolerant system. The model still requires business continuity and incident response plans.

Kube went on to summarize where the ISA99 standards stand and how close they are to being finished. He noted that several users are already using the zone-and-conduit model and other parts of the as-yet-unfinished standard. "Overall," he said, "the feedback is quite good!" He then went on to talk about the Security Compliance Institute that ISA has started, of which Yokogawa is a founding member.

"In summary," Kube said, "you are not secure because you did vulnerability analysis; you are not secure because of the latest patch. Understanding your risk level requires a greater level of understanding than is often available. The emerging trend in security is toward system resilience over patching. Resilience testing is essential to identifying risk, and it can be implemented by anyone today. Standards requirements are emerging, and so is standards conformance."