Log In Register

Safety and Security: Two Sides of the Same Coin

A Weakness in Security Creates Increased Risk, Which in Turn Creates a Decrease in Safety, so Safety and Security Are Directly Proportional, but Are Both Inversely Proportional to Risk

03/25/2010

1 vote
Text size: - +

The Safety Analysis and the Security Assess phases have the most similarity by far because, in both cases, the purpose of this phase is to determine the amount of risk present and decide if it is within tolerable limits for the facility. Determining the amount of risk involves identifying the consequences (what could happen and how bad would it be?) and the likelihood of it occurring (how it could happen and how likely it is to happen?).

A typical first step in this process is the hazard and operability analysis or HAZOP. A HAZOP, the most widely used method of hazard analysis in the process industries, is a methodology for identifying and dealing with potential problems in processes, particularly those which would create a hazardous situation or a severe impairment of the process. A HAZOP team, consisting of specialists in the design, operation and maintenance of the process, analyzes the process and determines possible deviations, feasible causes and likely consequences. It is important that the industrial automation and control system (IACS) be listed as a cause if failure of the IACS or unauthorized access could initiate a deviation.

{Sidebar1}Figure 1 shows the piping and instrumentation diagram (P&ID) for a simple chemical reactor. A portion of an example HAZOP for the process is shown in Figure 2. The text in red highlights the IACS-related causes.

Unfortunately, other than identifying the IACS as a potential cause, a HAZOP doesn't study the details of IACS deviations, which is an important step, especially given the size and complexity of modern control systems. An increasingly popular solution to this problem is a special version of a HAZOP called a control hazards and operability analysis (CHAZOP), which takes the next step in understanding the details of IACS hazards. Another technique is a failure modes and effect analysis (FMEA). Both techniques identify causes and consequences of control system failures. The CHAZOP technique extends the concept of deviations and guidewords from HAZOP, extending the list of guidewords for IACS specific types of deviations. The FMEA process takes a more hardware-centric approach by systematically studying the failure modes of each component and the effects on the system. Either technique is acceptable. However, regardless of the technique selected, it is important to include security deviations or failure modes in the analysis.

Figure 3 shows a portion of an example CHAZOP that looks deeper into possible causes of the IACS deviations. In this example, the red text highlights the security-related deviations.

Upon completion of the HAZOP and CHAZOP or FMEA, we should have identified all of the causes of IACS failure, including security failures, and the consequences of those failures. However, we still have not determined the likelihood of these events occurring, which is a necessary step in quantifying the risk, as risk is the product of likelihood and consequence.

Estimating likelihood, particularly for security, can be a difficult task because it can be very difficult to estimate the skill and determination of an attacker. We can simplify the task by filtering the list to include only those with intolerable consequences (e.g., have the potential to cause injury, death, significant downtime and environmental or major equipment damage). There are a variety of techniques available, including risk matrices and risk graphs. However, in safety, one of the most popular techniques for estimating likelihood is layer-of-protection analysis (LOPA). Part of the reason for its popularity is that it gives the user credit for employing a layer- of-protection strategy to mitigating risk. The security field uses the term defense-in-depth to describe a very similar concept. Yet, even though the terminology is different, the LOPA technique can definitely apply to security threats.

Having completed the LOPA or other method of estimating the likelihood, the final step in the front-end of the combined safety-and-security life cycle is to compare the results with facility or corporate tolerable risk guidelines and to document the results. Whenever the estimated risk exceeds the tolerable risk, there is work to be done. That work is covered in the subsequent phases of the combined life cycle and will be discussed in the subsequent articles. However the following case study from a major U.S. refinery illustrates the analysis phase and subsequent develop/implement phase.

This particular refinery was concerned about the safety, security and reliability of its safety systems (and thus the safety, security and reliability of the entire facility). To start, a team conducted a security risk analysis of all control systems, expanding on existing safety HAZOP studies. The information on this was used to drive a FMEA that clearly showed possible common mode failures of the control and safety systems due to either accidental network traffic storms or deliberate denial-of-service (DoS) attacks. While the safety systems would fail in a "safe" manner, even under attack, it was clear that the consequences would be a significant (and thus expensive) plant outage. As well, the probability of occurrence was estimated to be high since the skill level needed to drive such an attack was close to zero (and in fact could be caused accidentally), the attractiveness of the site to possible criminal or terrorist groups was high, and the layers of protection were limited.

The lack of clearly defined security layers drove the decision to adopt a zone-and-conduit model as defined in ANSI/ISA99.02.01 as the solution to the realization/addressing phase. The plant business network, the management network, the basic control system, the supervisory (operator) system and the safety system were each defined as a separate security zone (See Figure 4). Between these, approved "conduits" were defined for inter-zone communications. For example, only the Supervisory Zone was permitted to have direct connection to the rest of the Business Network Zone, while the Safety Zone had only a single approved conduit to the Supervisory Zone (for additional information on the use zone and conduit models see http://www.tofinosecurity.com/ansi-isa99).

Once the zone and conduits were defined, the technical aspects of the solution began. The engineering team defined appropriate safety/security controls on each of the conduits to regulate inter-zone traffic. For example, to manage all network traffic into the Safety Zone from the Supervisory Zone, Tofino industrial firewalls were placed between the two zones. These specially designed security devices are hardened for industrial environments and engineered specifically to manage MODBUS/TCP traffic, the particular network protocol used to communicate to the SIS. They were also designed to be transparent to the network on start-up so they would not disrupt refinery operations while being commissioned. Similar firewalls were also placed between the Basic Control Zone and the Supervisory Zone.

It is interesting that, while the firewalls were being installed, the engineering team noted that Windows PCs on the Supervisory network were generating a significant number of "multicast" messages on the safety and control network, even though they were of no use to the SIS controllers. Messages like these were a significant factor in another famous nuclear safety incident that occurred at Alabama's Brown's Ferry reactor in August 2003, where network traffic caused a redundant cooling water system to fail. This time the firewalls were configured to have these nuisance messages silently dropped from the network as they entered the Safety Zone. Two years later, the safety system has operated without incident, despite numerous changes to the networks around it.

While it is easy to get caught up in IT view of security and think only of hackers and viruses, for the operator of a hydrocarbon processing facility, security is much more. Security is about maintaining the reliability and safety of the entire system. As a result, the security of process systems can be significantly improved with a coordinated approach to both safety and security, starting with the initial analysis. By following a well-defined process, this analysis phase can be both cost-effective and significantly improve the reliability of the entire processing facility, providing an excellent return on investment.


John Cusimano is director of exida's security services division.
Eric Byres is the Chief Technology Officer of Byres Security Inc.

1 vote

Read more about

Related Content You May Like If you enjoy this article, our editors think you may enjoy this related reading.