KYFHO: Why IT needs to keep its distance from control systems or learn how to do it right

March 21, 2008
Several actual events and tests have shed new light on why IT needs to understand the issues with control systems before things go uncontrollably wrong. That is, control systems (Operations) coordination and leadership are absolutely required before those networks are touched in any direct or indirect manner. It is also important for IT to recognize that control system cyber vulnerabilities can be different from IT cyber vulnerabilities.

I have been working with MITRE and NIST to demonstrate the benefits of utilizing NIST SP800-53 rather than the NERC CIPs, NEI-0404, etc. We are doing this by taking actual control system cyber incidents, identifying the NIST SP800-53 controls whose violation allowed the events to occur, and showing how applying the appropriate NIST SP800-53 controls could have prevented them. One of the most interesting parts of this effort is the selection of the cyber incidents we are analyzing: all caused substantial equipment, environmental, and/or financial impacts (including deaths in one case) and were NOT caused by either Microsoft or the Internet.

Specific issues with IT recommendations for control systems:

1) Anti-Virus
- Various control system workstations and PCs have been shut down by newer versions of anti-virus software because those versions require more computing resources than many older control system workstations have available.
- NIST performed testing on the impact of anti-virus definition updates on typical control system processors. The testing demonstrated that, depending on the speed and loading of the processors, virus definition updates would create a 2-6 minute denial of service.

2) Patch Management for control system workstations
- Control systems often use commercial operating systems such as Windows.
However, the versions of Windows in use are often customized by the control system supplier, so that the version actually running is a “Honeywell”, “ABB”, “GE”, “Siemens”, “Areva”, etc. version of Windows. This makes the control system's version of Windows different from the standard Microsoft release. Consequently, when the Slammer worm hit several years ago, one of the major control system suppliers had to issue a notice to its customers that the Slammer worm may impact the system, but the Microsoft patch WOULD compromise the system.
- There was an article in the March 20th edition of Information Week with the headline: “Excel 2003 Patch Causes Calculation Errors”. Excel is embedded in many control system applications. The potential impact of Excel calculation errors in control system applications is appalling to consider, especially since the errors occur when real-time data is entered into Excel. “The error shows up if a patched version of Excel is linked to a real-time data source through macros built with Visual Basic for Applications, according to Microsoft.” Does this application of Excel sound familiar?
- There has been at least one case where a patch upgrade led to system downtime despite testing. The patch was tested in the lab, but without checking for impacts on software licensing keys. In the field, the fix erased the licensing keys, resulting in system downtime until the keys could be reset.
- Recently, a nuclear plant scrammed (automatically shut down) from full-power operation as a result of a workstation reboot. Patch management causes workstation reboots. The unintended consequences of workstation or PC reboots have not been adequately accounted for in control system applications.

3) Vulnerability assessments
Often, there is an erroneous assumption that vulnerability assessments mean network scans (either passive or active).
There are two problems with this assumption:
- Without “walking down a system”, there is no accurate knowledge of what is actually installed in the field. For example, I have yet to meet an end user who knew what modems were actually installed and connected in the field.
- Active and even passive scans have shut down many control system networks or, worse, actually damaged control system hardware (hackers should be that good). There are VERY FEW people who understand how far to go in scanning control system networks. This could be a fruitful area for the national labs: teaching people how to perform this very delicate operation.

Applying SANS, NERC CIP, NEI-0404, ISO-17799, ISO-27001, and other IT-related guidance to control systems without an understanding of the impacts on system performance is not only imprudent, it can be dangerous to the systems and potentially to the public!

Joe Weiss