Secure the SIS

Understand the strength of your last line of cyber defense

By William L. Mostia, Jr., P.E.

Cybersecurity is a complex issue, made more complicated by having four technical domains (IT, enterprise, process control and safety systems) with different purposes, goals and potential hazards and risks, and often with different personnel involved. Here are some of the considerations in performing the security assessment required in Clause 8.2.4 of the new Second Edition of the IEC 61511-1:2016 standard for safety instrumented systems (SIS). Some of the considerations also apply to non-SIS safety systems and to the broader industrial control and automation systems (ICAS).

The purpose of the SIS is to ensure that we make product safely. For the most part, it is not involved in how the process is controlled or in how process and production information is normally collected or manipulated outside of the SIS; it acts only if the process exceeds safe limits. We have some secondary concerns that the safety system does not affect production through excessive spurious trips or through high maintenance rates.

The broad cyber domains are conceptually illustrated in Figure 1, where it can be seen that the SIS is essentially embedded in the ICAS system and generally is considered part of it. It is important to understand the role the SIS plays in the ICAS to understand how a cyber attack might occur in a SIS, and what cyber consequences could be physically realized and lead to a hazardous condition.

Introduction to IEC 61511 First Edition

When the First Edition of IEC 61511-1 came out some 14 years ago (and ANSI/ISA S84: 1996 before it), cybersecurity concerns and cyber threats were not very well recognized in the process industries. The 61511-1 2003 1st Edition standard did recognize potential security threats to SIS. These threats primarily revolved around inadvertent or unauthorized access and changes that affected the safety integrity of the SIS. Unauthorized or mismanaged changes have long been recognized as a safety hazard (e.g. Flixborough, 1974), which led to the current Management of Change (MOC) regulations.

The security concerns at that time (and they still are concerns today) were primarily unauthorized physical access (e.g. locked cabinets, building access control, etc.), programming access (e.g. keylocks, programming panels and computers, dongles, etc.), and password controls. Most security threats were considered internal, e.g. inadvertent, unauthorized, and/or undocumented changes, problems caused by disgruntled employees, etc.

Recent rapid advancement of digital control system technology, advances in computing power and interconnection of the world via the Internet, intranets, wireless and now, the Internet of Things (IoT) and Industrial IoT (IIoT) have opened a Pandora’s box of opportunities, but also unleashed the big, bad wolf in the form of increasing cybersecurity threats.

A key difference between today’s security threats and what was in the 61511-1 1st Edition is that we have now moved from a small arena (your company) that included mostly internal, physical threats to a much larger threat arena (the world) that can include both internal physical threats and external, largely invisible cybersecurity threats potentially from all over the world.

Second edition security assessment requirements

IEC 61511-1 2nd Edition recognized the increasing threat of cyber attacks to SIS, and added requirements to help reduce the risk. Compliance with the new IEC 61511-1 Second Edition 2016 will require that the SIS have a security assessment (Clause 8.2.4) and that the design of the SIS shall be such that it provides the necessary resilience against the identified security risks (Clause 11.2.12). Clause 8.2.4 details a general outline of what is to be accomplished in a security assessment, but not much on the nitty gritty of performing the assessment. The basic methodology is one of reductionism: breaking the SIS domain down into smaller equipment pieces, analyzing the equipment’s vulnerabilities, evaluating the existing protections that limit exposure of the vulnerabilities to a physical security breach or cyber attack, providing additional protections to reduce the risk to an acceptable level based on the corporate risk criteria, and documenting the risk assessment. When using a reductionist methodology, care must be taken to not miss system-level threats.

The overall risk assessment process is similar to a process Hazards and Risk Assessment (H&RA). In some cases, a cyber attack in the ICAS could initiate a cause similar in effect to an equipment failure or human error in a process H&RA, where other layers of protection (assuming some of those have not been defeated in the process of the cyber attack) will come into play to bring the process to a safe state. The protection layers credited in a standard layer of protection analysis (LOPA) would then prevent the physical realization of this cyber threat into a hazard. In this case, the ability of the SIS to resist defeat of the SIS safety function can be paramount to achieving a safe state of the plant.

SIS cyber domain

Philosophically, the SIS is best understood in the context of layers of protection, where the SIS is equivalent to one to three independent protection layers (IPL) that protect against identified hazards determined by a process risk analysis. Figure 2 illustrates a cyber attack on the SIS where a cyber attack on the basic process control system (BPCS) could initiate a cause equivalent to a normal equipment failure or human error. This same attack could lead to defeat of the alarm IPL, if it is in the BPCS and the attacker has sufficient skill in manipulating the BPCS.

If the SIS is designed to protect against that particular initiating cause, it should serve its purpose as an IPL and bring the process to a safe state. But if the attack is also directed at the SIS, defeat of the safety instrumented function (SIF) protecting against the BPCS initiating cause could also defeat the IPLs for that hazard, leaving protection to the mechanical IPLs.

If the initial cyber attack was directed strictly at the SIS and not the BPCS, the SIS safety function could be defeated, leaving a latent dangerous failure in the SIS. Cyber attack of the SIS might also initiate a spurious trip (one or many), causing a safety incident or disrupting production. If software resets are used in the SIS (versus field resets), the cyber attack might auto-reset the SIF and trip again later, leading to a hazardous condition.

These SIS cyber attacks can be overt, e.g. in conjunction with a simultaneous cyber attack on the BPCS, or covert, defeating the safety function of the SIS while waiting for a normal safety demand or a later cyber attack to occur on the BPCS. It is important to realize that the SIS is typically not the only IPL protecting against hazards, which should be considered in any risk assessment, and that the primary risk we are concerned with is a physical realization of a hazard.

It is also important to understand how a SIS can be defeated in order to provide protection against those events. For example, some of the ways a SIS/SIF can be defeated are placing the safety functions in bypass without the operator being aware, placing the logic solver in an infinite loop, changing the trip and alarm setpoints, disconnecting the output from the logic, spoofing the inputs, forcing the outputs, etc. This understanding can lead to SIS designs that, in addition to the Zones/Conduits protection scheme, can help to protect against specific failure modes resulting from a cyber attack.

IEC 61511-1 security assessment steps

To perform a security assessment under the new IEC 61511-1 standard, one of the first things to do is to establish the outer boundary of the SIS under assessment. Where the zones/conduits protection concept in ISA/IEC 62443:2010 and ISA TR84.00.09 is used (Figures 3 and 4), the zone boundaries of the SIS are the boundaries for the security assessment. This is somewhat similar in concept to nodes in a HAZOP. An alternative to the top-down HAZOP approach would be to use a bottom-up failure mode and effect analysis (FMEA) approach, and look at how the system could fail to accomplish its purpose given a security event. Or use a combination of both methods to cover all your bases.

The security assessment requires a description of all the covered devices. These device descriptions should include a listing of all the hardware and software versions, including device sub-modules, for change management. There is a famous old saying that the author made up this morning, and that is, “If you don’t know what you have, you cannot protect what you have.”
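
As a minimal sketch of what such a device description might look like in machine-readable form, the Python records below capture hardware and software versions down to the sub-module level and report which version entries differ between two snapshots. The class names, field names and example tags are hypothetical; they are not taken from IEC 61511-1 or from any vendor tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Module:
    """One sub-module of a SIS device (hypothetical fields)."""
    name: str
    hardware_version: str
    firmware_version: str


@dataclass
class SisDevice:
    """Inventory entry for one device inside the SIS boundary."""
    tag: str
    model: str
    hardware_version: str
    software_version: str
    modules: List[Module] = field(default_factory=list)

    def version_fingerprint(self) -> List[str]:
        """Flat, sorted list of every version string, for change management."""
        entries = [
            f"{self.tag}:hw={self.hardware_version}",
            f"{self.tag}:sw={self.software_version}",
        ]
        for m in self.modules:
            entries.append(f"{self.tag}/{m.name}:hw={m.hardware_version}")
            entries.append(f"{self.tag}/{m.name}:fw={m.firmware_version}")
        return sorted(entries)


def version_changes(old: SisDevice, new: SisDevice) -> List[str]:
    """Version entries present in the new record but not in the old one."""
    before = set(old.version_fingerprint())
    return [e for e in new.version_fingerprint() if e not in before]
```

Comparing a stored snapshot of a device against a freshly collected one with `version_changes` gives a simple list of version differences that can be checked against approved change records.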

Security-critical information (e.g. alarm and trip setpoints, field device parameters, communication parameters, etc.) within the devices and system should be identified to allow detection of unauthorized changes in the parameters critical to safety. The description should include all connections to other devices within the SIS, to devices outside the SIS boundary, and to all devices used for non-operational purposes (programming terminals, update connections, field device communicators, calibration equipment, asset management systems (AMS), any connection to the outside world whether they are active or not, etc.). This should include hardwired connections that can be influenced by a cyber attack and any connections that help provide protection against cyber attacks (e.g. hardwired keyed bypass enable switch).
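
One way to detect unauthorized changes in security-critical parameters is to compare the current values against an approved baseline. The sketch below is an illustration only: it assumes the parameters have already been read out of the device into a plain dictionary, and the function names and parameter names are hypothetical. It uses a SHA-256 digest for a quick equality check and a field-by-field comparison to name what changed.

```python
import hashlib
import json
from typing import Dict, List


def fingerprint(parameters: Dict[str, object]) -> str:
    """SHA-256 digest over a canonical JSON encoding of the parameter set."""
    canonical = json.dumps(parameters, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_parameters(baseline: Dict[str, object],
                       current: Dict[str, object]) -> List[str]:
    """Names of parameters added, removed or altered since the baseline."""
    missing = object()  # sentinel so that a removed key counts as a change
    names = set(baseline) | set(current)
    return sorted(
        name for name in names
        if baseline.get(name, missing) != current.get(name, missing)
    )
```

The digest can be stored with the approved configuration record; a mismatch on a later readout is the cue to run `changed_parameters` and check each difference against an approved change.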

Known cyber vulnerabilities for each device should be listed. As part of generating this list, known cyber vulnerabilities should be discussed with the device manufacturer and researched within the industry. Physical security vulnerabilities for each device and for the system as a whole should also be listed. There is commercial software becoming available to make this inventory effort easier and to help automate monitoring of unauthorized changes and potential security issues. An example of this type of software is Cyber Integrity by PAS.

Develop a description of identified security threats that could exploit listed equipment vulnerabilities and result in security events (including intentional attacks on the hardware, application programs and related operating system software, as well as unintended events resulting from human error). It is important in this type of risk assessment to understand the security threat vectors (e.g. how the threat gets into the system (access points), how the point is accessed, the nature of the attack, any enabling conditions that facilitate the threat vector, the devices or path the threat vector takes to propagate through the system to reach a point where a physical realization into a hazardous or undesirable condition can occur, and what must happen to result in a safety incident).

The methods to prevent the threat vector from reaching the device’s vulnerability and methods to mitigate the cyber attack, or in the worst case to recover from the attack, should be listed. It is also important to note how such cyber intrusions/threats could be detected in the system, even if the system successfully repelled the attack, as these may be probes of your system.

Successful cyber attacks that lead to a safety consequence should be investigated just as an accident would be. Unsuccessful cyber attacks or intrusions should be treated as near misses and investigated appropriately to ensure that the system cyber protection worked properly and the attack failure was by design rather than a matter of good fortune.

A description of the potential consequences of security events and the likelihood of such events should be determined. Note that IEC 61511-1 does not define what a “security event” is, and therefore the definition of the consequence is a bit nebulous. A successful cyber attack could be seen as a consequence by the cybersecurity personnel, but to create a safety hazard, there must be a physical realization of the security event that leads to a propagation of the effect into an incident.

A successful cyber attack on a device or system is an intermediate consequence (an initiating cause or the defeat of a safety system) in a risk chain that may include other IPL that could protect against the developing process hazard. This perspective connects many of the effects of a cyber attack on our process control and safety systems into our normal process risk assessment methodology (Figure 2).
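
The risk-chain view above can be sketched with ordinary LOPA arithmetic, in which the mitigated event frequency is the initiating cause frequency multiplied by the probability of failure on demand (PFD) of each IPL. In the sketch below, treating a cyber-defeated IPL as having a PFD of 1 is an assumption made for illustration, and the function name, layer names and all numbers are hypothetical.

```python
from math import prod
from typing import Dict, Iterable


def mitigated_frequency(initiating_per_year: float,
                        ipl_pfds: Dict[str, float],
                        defeated: Iterable[str] = ()) -> float:
    """Event frequency per year after crediting each IPL.

    An IPL named in `defeated` is assumed to give no risk reduction
    (PFD = 1), e.g. a SIF placed in bypass by a cyber attack.
    """
    defeated = set(defeated)
    unknown = defeated - set(ipl_pfds)
    if unknown:
        raise KeyError(f"unknown IPL(s): {sorted(unknown)}")
    return initiating_per_year * prod(
        1.0 if name in defeated else pfd
        for name, pfd in ipl_pfds.items()
    )
```

With illustrative PFDs of 0.1 for an alarm, 0.01 for the SIF and 0.01 for a relief valve, and an initiating frequency of 0.1 per year, the sketch gives roughly 1e-6 per year with all layers intact and roughly 1e-4 per year if the SIF is defeated.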

A determination of the likelihood of the security event is required, which leads us potentially into the same morass that we have in the HAZOP/LOPA initiating cause frequencies and the SIS equipment failure rates. We simply do not have sufficient information to determine the likelihood of a specific kind of cyber attack (not random) or a security breach (systematic error) occurring on a particular system, or the failure rates of the protection equipment to repel the cyber attack, or the protection equipment’s hardware failure rates. It seems likely that the uncertainty in determining the likelihood will be high, leading to conservative design, or worse, there will be no solid engineering basis to accept such a probability calculation. ISA TR84.00.09:2017 Appendix D provides some example calculations.

Various lifecycle phases, such as design, implementation, commissioning, operation and maintenance, are required to be considered in the security assessment. Like its process safety counterparts, HAZOP/LOPA, a SIS security assessment should occur during the various phases of the safety lifecycle based on an overall risk management plan to mitigate potential security risks.

Once the risks have been determined, the requirements for additional risk reduction measures should be identified to reduce the risk below the corporate risk criteria. A description of how the risk reduction will be accomplished should be included in the security assessment.

There are several notes of importance in Clause 8.2.4. Note 1 points to SIS security guidance provided in the technical report and standards ISA TR84.00.09, ISO/IEC 27001:2013 and IEC 62443:2010. ISO/IEC 27001:2013 is a general-purpose IT cybersecurity standard, while IEC 62443:2010 is a standard that covers cybersecurity for industrial control and automation systems. The ISA technical report TR84.00.09 covers cybersecurity and the functional safety lifecycle. It should be noted that this technical report is substantially longer than the IEC 61511-1 standard itself.

The security guidance provided in the standard ISA/IEC 62443:2010 and the ISA technical report TR84.00.09 advocates dividing the ICAS into logical hardware groupings called zones, and the communication paths or interfaces between zones, called conduits, to control the interaction between the various zones. This has the benefit of being able to control the paths through which a cyber threat vector can propagate in a system and to be able to logically isolate the system should it come under attack, allowing the system to potentially mitigate the attack or to continue to function in part, or to function until the system can be safely shut down.
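
The zones-and-conduits idea can be sketched as a directed graph, with zones as nodes and conduits as edges. The Python below is an illustration under that assumption, not a method prescribed by either document: it uses a breadth-first search to find the shortest chain of zones a threat vector could follow from an entry point to a target, and shows the effect of closing every conduit to a zone. The function names and zone names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Set

# Maps each zone to the set of zones it can reach through a conduit.
Conduits = Dict[str, Set[str]]


def attack_path(conduits: Conduits, entry: str, target: str) -> Optional[List[str]]:
    """Shortest chain of zones from entry to target, or None if unreachable."""
    if entry == target:
        return [entry]
    previous: Dict[str, Optional[str]] = {entry: None}
    queue = deque([entry])
    while queue:
        zone = queue.popleft()
        for neighbour in sorted(conduits.get(zone, ())):
            if neighbour in previous:
                continue  # already reached by an equal or shorter path
            previous[neighbour] = zone
            if neighbour == target:
                path = [target]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


def isolate(conduits: Conduits, zone: str) -> Conduits:
    """Copy of the conduit map with every conduit into or out of `zone` closed."""
    return {
        src: set() if src == zone else {dst for dst in dsts if dst != zone}
        for src, dsts in conduits.items()
    }
```

Running `attack_path` for each external access point against the SIS zone, before and after a proposed isolation, is one simple way to check that the intended conduits are the only routes into the SIS.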

Dividing the ICAS into logical zones also provides a basis for selecting SIS security assessment boundaries. This arrangement of zones and conduits can be easily documented conceptually, and Figure 3 illustrates a conduit/zone conceptual drawing (it is a bit busy and shows more than the simple zones and conduits). Note that the control system and SIS can be broken down into multiple zones for larger systems. Figure 3 also shows an outlined SIS zone.

Clause 8.2.4, Note 2 says that information and control of boundary conditions needed for the security risk assessment typically resides with the owner/operating company of a facility, not with the supplier. Like all risk assessments, the burden is on the owner/operator to perform the risk assessment and implement the results of the assessment. This does not relieve the equipment or software supplier from an obligation to provide a secure cybersecurity environment for their equipment, nor from an obligation to promptly notify the owner/operator of the equipment when a cybersecurity vulnerability has been identified or a cyber breach of their equipment has been detected. The owner/operator has the reverse obligation.

Note 3 states that the SIS security risk assessment can be included in an overall process automation security risk assessment. This is a reasonable consideration, but there are philosophical differences in the purposes of the different systems, their vulnerabilities, and the access points. The control system purpose is to keep the process within safe operating boundaries while making quality product efficiently. The SIS system purpose is to not allow the process to exceed the safe operating limit, with a minimum of spurious trips and maintenance. There is certainly an overlap between the systems that needs to be addressed in the overall risk plan.

Note 4 is important because it states that the SIS security risk assessment can range in focus from an individual SIF to all SIFs in a SIS or within a company, new and existing. A company should have a risk assessment plan that covers assessing SIS cyber threats, and how that fits into the process safety risk assessment plan. Obviously, an initial SIS security assessment must be performed to comply with IEC 61511 2nd Edition; however, while not required, performing a security assessment on existing IEC 61511 1st Edition SIS would be considered good engineering practice.

Once the initial security assessment has been accomplished and the SIS has been brought up to date from a security protection perspective, subsequent security assessments should be done under Management of Change (MOC) for new installations, for SIF additions to existing equipment, and for SIS design or configuration changes. The security assessment should also be revisited when new security vulnerabilities that can affect the SIS have been identified in the industry or by the equipment manufacturer; after a security audit if deficiencies have been found; or when the security assessment is revalidated (nominally every three years or less).

SIS security documentation

Additional documentation types will have to be developed to allow the security assessment team to perform a thorough security assessment, to document the current security state of the system, to maintain the security-critical information database, and to be able to perform continuous and periodic audits against unauthorized changes.

Many companies have high-level system architecture drawings of their control and safety systems that could be adapted into zones and conduits drawings. It is likely that the higher-level drawing will have to be broken down into middle- and lower-level drawings, such as individual zones and equipment drawings, to show all the access points, equipment and protective measures, plus to have room for the necessary information for a security risk assessment. Figure 4 illustrates a breakdown of the high-level ICAS zone/conduit drawing into just a SIS system-level zone drawing. This figure also illustrates secondary access points (e.g. calibration trip point databases, AMS, Windows, etc.) at the SIS system level.

A further drilldown of the SIS zone/conduit drawing will typically have to be done for equipment security drawings, which would illustrate an individual piece of SIS equipment or a logical grouping of equipment broken down into individual modules if modular in design, documenting model numbers, hardware and software serial and version numbers, the equipment access points, identified protections, and other information to perform a security assessment and to help maintain the security information database. Much of this effort should be done at the engineering level by people who are familiar with the individual pieces of equipment and their security concerns, and then reviewed by the security assessment team (otherwise this assessment will be as long and as boring as many a HAZOP has become). The results of the security assessment should be reviewed by an independent ICAS security specialist and the final results integrated into the overall ICAS security assessment and risk management plan.

While not always mentioned, successful cyber attacks other than random disruptions depend on knowledge of the system to be attacked, its protections, and the process under control. To find out about the target system typically requires a reconnaissance of that system, maybe more than one. This can occur by cyber means and may be signified by minor successful, weak and/or failed attacks on the system to gather process information from the enterprise and ICAS systems and learn what cyber protections exist on the system.

Indirect efforts to collect process information may also include cyber reconnaissance on the plant’s potentially less secure computer systems, such as drawing databases, process descriptions, process narratives, standard operating procedures, HAZOP/LOPA databases, calibration database, asset management system (AMS) database, PSV database, etc.

There even has to be concern for paper information (we generate a lot of uncontrolled paper about the process for our HAZOPs and LOPAs). It is obvious that there are many doors (access points) into our system that have to be analyzed, controlled or closed to ensure that we are covering all of our security bases.

Conclusions

It is good engineering practice in this day and age of cyber and other security threats to perform a security assessment on the ICAS system, including non-SIS and SIS. To conform to IEC 61511-1 2nd Edition Clause 8.2.4, a security assessment must be performed on the SIS and any new or revised SIS designs must have the necessary resilience against the identified security risks (Clause 11.2.12).

Most major SIS consultants, such as Kenexis, aeSolutions, exida, etc., offer competent cybersecurity services to help you out, but it behooves the owner/operator to have the in-house expertise to be able to do this assessment, or to at least be fully involved in this effort to understand what SIS system components are at risk; what the security threat vectors are; what the system vulnerabilities are; how to limit system access, internal pathway flows, and system and component vulnerabilities to threat vectors; how to minimize the security threats; and how to recover safety, if need be. The better your understanding of your SIS and ICAS systems from a security perspective, the better the chance that you will be prepared to repel any security attack. A comprehensive SIS security assessment is a good step forward.