"In many languages, there is only one word for safety and security. In German, for example, the word is 'Sicherheit;' in Spanish it is 'seguridad;' in French it is 'sécurité;' and in Italian it is 'sicurezza.' " That's the start of the 2010 article by John Cusimano Director of exida's security services division and Eric Byres, CTO of Byres Security. Both Cusimano and Byres have significant expertise in both safety and security in process plants. Their article was titled, "Safety and Security: Two Sides of the Same Coin." They were introducing a relatively new concept that grew out of the similarity between layers of protection analysis (LOPA) for safety instrumented systems (SIS) and the defense in depth (DID) strategy for cybersecurity in industrial control systems.
That was three years ago. Something has changed in the process industries since then, but not every senior executive, plant manager or plant or corporate IT executive has grasped its ramifications.
That "something" was, of course, the discovery of the infamous Stuxnet malware, which infected an Iranian uranium enrichment plant and damaged or destroyed over 100 special-purpose centrifuges. Morteza Rezaei, an Iranian automation professional, says, "Main affected country in the early days of the infection was Iran, so I could find many infected projects easily."
It's Not Just the Network Anymore
Unless you've spent the past two years asleep, not paying attention, with your head in the sand or with your fingers in your ears singing "La la la la, I can't hear you!", you will know something about Stuxnet. Here's a quick reminder of what it did and why it is important.
In Nancy Bartels' cover story in October 2010 ("Worst Fears Realized"), Nicolas Falliere of security vendor Symantec says, "Stuxnet can steal code and design projects, and also hide itself using a classic Windows rootkit, but unfortunately it can also do much more. It has the ability to take advantage of the programming software to also upload its own code to a PLC typically monitored by SCADA systems. Stuxnet then hides these code blocks, so when programmers using an infected machine try to view all of the code blocks on a PLC, they will not see the code injected by Stuxnet. Thus, Stuxnet isn't just a rootkit that hides itself on Windows, but is the first publicly known rootkit that is able to hide injected code located on a PLC."
Falliere adds, "In particular, Stuxnet hooks the programming software, which means that when someone uses the software to view code blocks on the PLC, the injected blocks are nowhere to be found. This is done by hooking enumeration, read-and-write functions, so that you can't accidentally overwrite the hidden blocks as well. Stuxnet contains 70 encrypted code blocks that appear to replace some 'foundation routines' that take care of simple, yet very common tasks, such as comparing file times, and others that are custom code and data blocks. By writing code to the PLC, Stuxnet can potentially control or alter how the system operates."
The two fundamental takeaways from this for managers, directors and all IT people working in manufacturing enterprises are, first, that network-centric cybersecurity planning works just as well as the Maginot Line, and second, that any control system in any industry is vulnerable to a Stuxnet-type attack, whether or not it is connected to an IT-serviced network.
There is a third fundamental point that must be made. Stuxnet used cyber means to attack a plant's operating control system and make it fail in a dangerous and unsafe way.
Safety and Security Intertwined
ISA84 (now IEC61511 and 61508) recognized the need for active functional safety programs in process plants. Many plants now have formal functional safety programs. They have re-evaluated their alarm management systems and have brought their SIS into compliance with the IEC standards. Luis Duran, a safety expert with ABB, puts it this way: "I see that plants and companies with a strong safety culture see safety as a core value positively affecting their economic performance."
Some companies are very far down the road to functional safety. The Dow Chemical Co., as Eric Cosman, co-chair of ISA99 and a security expert for Dow, points out, has been working on functional safety since the early 1960s. What a lot of people don't know, he notes, is that Dow has had an active cybersecurity program since the 1990s.
"From my perspective," says Walter Sikora, vice president of security solutions of Industrial Defender, "safety is taken seriously, openly communicated and a high priority. Most utilities and plants have a 'safety moment' before every meeting to stress the point. Even before someone is allowed to visit a plant, they usually go through a safety training video. Very few, if any, companies do the same with cybersecurity. Have you ever visited a process plant and seen a big sign showing how many days since their last cyber incident?"
"It depends on the industry," says Joe Weiss, principal of Applied Control Solutions and chief blogger of Control's "Unfettered" cybersecurity blog. "The electric industry treats security as a compliance, not a reliability or safety issue. Other industries, such as chemical and petroleum, treat security as an important reliability and safety consideration. For example, consider the membership of the ISA99 Leadership Committee. The end users on the committee are primarily from oil/gas and chemicals, with no representation from electric utilities."
But it's not just electric companies that don't get the security/safety nexus. Many companies see cybersecurity as solely an IT problem.
Weiss points out, "I have found that senior management are keenly aware of IT security issues because of Sarbanes-Oxley, but they are only aware of control system security when something bad happens. An ongoing dilemma is how to educate senior management to be as interested in protecting their operational assets as they are in having their ERP installed on time and within budget."
Echoing Cosman, Weiss goes on, "Plant management often feels that cybersecurity is an IT issue—protecting emails and the data in enterprise servers. Another ongoing dilemma is how to educate plant management to understand how cybersecurity can affect reliability and safety."
Sikora adds, "There is too much focus on the technical aspects of security and less on the business aspects. We need to educate our engineers to speak a business language and present to management metrics around security. Our security risk is X. We need to invest Y to reduce our risk by Z."
Al Fung, director of marketing for safety and critical control solutions for Invensys Operations Management (IOM), says, however, "We are seeing a shift in attitude towards raising the importance of security."
Managers are reaching out to the cyber and safety communities to learn and understand the impact and consequence of an attack, Fung says, and they are engaging control system vendors to help prevent and mitigate potential risks.
That's good, because just like safety, security is not fundamentally a vendor issue. Fung's IOM colleague, program manager-cybersecurity, Ernie Rakaczky, has pointed out that the responsibility of the vendor to produce secure systems solves about 25% of the problem, while the end users are responsible for the remaining 75% in the way they implement security policies, procedures, training and enforcement. Just as companies need to create and maintain a safety culture, they also need to create and maintain a security culture—and recognize that the two cultures are the same.
Fung goes on, "Defense-in-depth security is similar in approach to layers of protection for safety in the context of risk reduction and mitigation. Any plant design, safety risk assessment and hazard analysis that involves the use of any industrial control equipment needs to include a security assessment as part of the design for plant safety. If it isn't an integral part, it should be."
And here's where plant security and safety collide. "Plant management and executive management may be concerned about, but do not understand industrial control security," John Cusimano says. Most often they look to the IT department, but "the challenge is that while IT knows how to secure networks, it does not know how to properly apply security control in an industrial control system (ICS) environment. Security assessments tend to focus on fundamentals, such as strengths of passwords, and definitely do not address how to secure ICS protocols."
Safety, Security and Compliance
In the United States, when the Occupational Safety and Health Administration (OSHA) was established in 1971 under the Nixon administration, the new agency focused on workplace safety, not functional safety in the process industries. It also focused on "compliance" with the Occupational Safety and Health Act of 1970.
Since that time, the number of usually avoidable accidents involving property damage, injuries and some fatalities has remained high. This writer estimates that between 100 and 200 process plant workers are killed annually, and thousands are injured.
And for readers with a financial bent, BP's Texas City refinery has not yet recovered full production since the accident in 2005 that took 15 lives and damaged a significant part of the plant. These accidents carry a huge financial cost in plant downtime and lost revenue, and some have closed plants for good. Add to that the payouts to victims and their families, government fines, and the softer costs to corporate reputations, along with damage to communities and significant ecological costs.
But savvy end users understand that a good safety culture is about making things safer, not about being compliant with regulations. Safety systems are designed to increase plant safety, without regard to basic compliance. Compliance is assumed. Compliance comes from having a good safety culture, with a good safety system and ongoing safety process.
Unfortunately, in the security area, compliance rules. The North American Electric Reliability Corporation (NERC) serves as the power industry's self-policing agency. NERC's Critical Infrastructure Protection (CIP) standards rely on enforcing compliance, without necessarily insisting on increased security. In fact, NERC has consistently tried to avoid security issues by gerrymandering which installations are to be considered "critical." Joe Weiss tells of a utility that was actually fined by NERC for violating the CIPs when it chose to go after increased security and assumed compliance would follow.
Often, the same attitude applies in the other process industries. Cusimano notes, "Many unscrupulous vendors will sell them [end users] anything, and claim it will deliver compliance. It is still not recognized that ICS insecurity can lead to safety incidents."
Weiss points out, "The NERC CIPs would not have prevented a Stuxnet-style attack on our power industry critical infrastructure."
So What Do We Do?
Dow's Cosman says the best way to incorporate security into functional safety is to adopt the ISA99 standard. Cusimano agrees. "We recommend starting with a control system security assessment or gap analysis," at the same time you update your safety HAZOP. "Your analysis should be based on relevant standards and best practices such as ISA99. The next step is to perform detailed risk analysis or threat modeling to understand the highest risk vulnerabilities."
Here is where the two analyses should merge. In an operating process plant, the highest-risk vulnerabilities have little to do with the data in the servers or the plant manager's emails. These vulnerabilities are in the control system and the safety instrumented system, which, if compromised, could shut down the plant, or worse. As the Stuxnet malware proved, this kind of attack doesn't have to come through the network. There were apparently several attack vectors, but the most significant one for Stuxnet was a targeted "candy drop" of infected USB memory sticks in the parking lot of the plant. Users bypassed any network security measures and plugged the USB sticks directly into the engineering workstations and the process control computers.
A Functional Security Analysis
Weiss talks about how to do a functional security analysis. "A functional security analysis requires senior management buy-in to be successful."
Then, he says, do a detailed assessment of the ICS security measures that have been documented as such. That means finding out what measures and systems you think you have in place.
Next, determine what you actually have, not what you think is there. When such assessments have been done, hidden, forgotten modem connections into the control system often turn up. The growing use of smartphones and tablets needs to be considered as both a safety and a security issue, too.
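The comparison between what is documented and what is actually there can be sketched as a simple set difference. This is only an illustration, not a method Weiss prescribes: the function name `inventory_gaps` and every asset name below are hypothetical, and a real survey would draw its lists from site walkdowns and network scans.

```python
def inventory_gaps(documented, discovered):
    """Compare the asset list on paper with what a site survey found.

    Both arguments are iterables of asset names (hypothetical examples here).
    """
    documented = set(documented)
    discovered = set(discovered)
    return {
        # Found in the plant but absent from the documentation,
        # e.g. a forgotten modem or a personal tablet.
        "undocumented": sorted(discovered - documented),
        # Listed in the documentation but not found in the plant.
        "missing": sorted(documented - discovered),
    }


# Hypothetical data for illustration only.
documented = ["firewall-dmz", "hmi-01", "plc-reactor-a"]
discovered = ["hmi-01", "plc-reactor-a", "dialup-modem-7", "operator-tablet"]

gaps = inventory_gaps(documented, discovered)
print(gaps["undocumented"])
print(gaps["missing"])
```

Anything in the "undocumented" list is a candidate attack path nobody is watching. Anything in the "missing" list is a protection you only think you have.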
Next, Weiss says, you have to determine what is connected to what and how. Only then will you be clear what the potential cyber issues are.
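One way to picture "what is connected to what" is as a graph of links, then asking what can be reached from a given entry point. The sketch below is a plain breadth-first search under stated assumptions: the device names and links are invented, and every link is treated as two-way, which real firewalls and data diodes may not allow.

```python
from collections import deque


def reachable_from(links, start):
    """Return every node reachable from `start`, treating links as two-way."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)

    seen.discard(start)  # report only the other nodes
    return sorted(seen)


# Hypothetical plant connectivity for illustration only.
links = [
    ("corporate-lan", "historian"),
    ("historian", "control-network"),
    ("control-network", "plc-reactor-a"),
    ("dialup-modem-7", "plc-reactor-a"),
    ("sis-logic-solver", "sis-eng-station"),
]

exposed = reachable_from(links, "dialup-modem-7")
print(exposed)
```

In this made-up plant, the forgotten modem reaches the PLC, the control network, the historian and the corporate LAN, while the safety system stays isolated because nothing links it to the rest.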
Then you need to determine how secure you really are. That means finding out if known security issues, such as patching, have been addressed within the context of equipment availability. If plant conditions do not allow for security implementation, determine what work-arounds have been implemented, so you can continue to operate with insecure equipment.
Weiss says that you have to recognize that plants will be hit. You have to develop a recovery plan. This is where ISA106, Procedural Automation for Continuous Process Operations, intersects ISA84 and ISA99.
Finally, ask the many-billion-dollar question: What probability would you assign to a successful cyber attack on a process plant? In other words, how worried should you be, and how much money and manpower should you throw at this problem?
Weiss succinctly responds, "There have been more than 200 actual control system cyber incidents to date (malicious or unintentional). There have been successful cyber attacks on process facilities (more than just Stuxnet). Risk is frequency times consequence. Since there are minimal control system cyber forensics and minimal information sharing, the probability is difficult to estimate, but since you can expect to have cyber-related issues eventually, the probability should be 1!"
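Weiss's "risk is frequency times consequence," combined with Sikora's "Our security risk is X. We need to invest Y to reduce our risk by Z," can be turned into simple arithmetic. The sketch below is illustrative only: the function names and every figure are hypothetical, and, as Weiss notes, real incident frequencies are hard to estimate.

```python
def annual_risk(frequency_per_year, consequence_cost):
    """Risk as expected loss per year: frequency times consequence."""
    return frequency_per_year * consequence_cost


def risk_reduction(freq_before, freq_after, consequence_cost):
    """How much annual risk a mitigation removes by lowering frequency."""
    return (annual_risk(freq_before, consequence_cost)
            - annual_risk(freq_after, consequence_cost))


# Hypothetical figures for illustration only.
consequence = 4_000_000   # cost of one incident, in dollars
freq_before = 0.5         # assumed: one incident every two years
freq_after = 0.125        # assumed: one incident every eight years
investment = 300_000      # assumed annual cost of the mitigation (Y)

risk_x = annual_risk(freq_before, consequence)                       # X
reduction_z = risk_reduction(freq_before, freq_after, consequence)   # Z
net_benefit = reduction_z - investment

print(risk_x, reduction_z, net_benefit)
```

With these made-up numbers, the statement to management becomes: our security risk is $2 million a year, and we need to invest $300,000 a year to reduce it by $1.5 million.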