By Dave Harrold
Does this describe you?
Your job is to make informed business decisions. You understand the role of process control systems and the role of safety instrumented systems, but you want a better understanding of safety instrumented system jargon and requirements, and you need to be able to explain safety instrumented systems to your boss.
If so, keep reading because this article is for you.
Does this sound familiar?
The following conversation is taking place at the Springfield Snacks and Pesticides Plant between plant manager Mr. Montgomery Barns and supervisor of safety, Gomer Himpson. Let's listen in.
Gomer: Good morning, Mr. Barns; how is your day going so far?
Mr. Barns: It's not a good morning when I see you, Himpson.
Gomer: You recall a few months ago that I told you our safety instrumented systems were due for a review?
Mr. Barns: I know that you annoy me about something every month.
Gomer: Well, I got with Carl, Charlie and Lenny, and we went through our SIS with a fine-tooth comb. We did a HAZOP and a LoPA and found several of our SIFs were improperly assigned to SIL 1 when they should be SIL 2. We also found a few SIL 2 SIFs that could qualify as SIL 1s, but since we've always maintained diversity among our SIS logic solvers, we can't just reclassify these SIFs. We did a preliminary look at the PFD of a few of these SIFs, and we think there is something we can do in the BPCS. Also Mr. Barns, ever since corporate insisted on extending the time between scheduled shutdowns, it is playing havoc with our full- and partial-stroke testing periods. Mr. B., I know I don't have to tell you how OSHA feels about IEC 61511 and IEC 61508. What would really be helpful is if we replaced our old SIS with one from the same vendor as our BPCS; that way everything would be "smart." Of course, that means we really need to install exida- and/or TÜV-certified sensors and final elements.
Mr. Barns: STOP! Have you lost your wits, Himpson? You're talking gibberish, and everyone knows that the word "smart" should never be uttered by your lips.
So what do you think? Did Mr. Barns have a clue what Gomer was talking about? Would you? As ridiculous as it probably sounds right now, everything Gomer said made sense, but only if you have some knowledge of the jargon.
During the next few minutes we are going to dissect what Gomer was saying, and we'll also take a look at what it means in terms of safer processes.
". . .went through our SIS"
"I got with Carl, Charlie and Lenny, and we went through our SIS with a fine-tooth comb."
SIS is an acronym for Safety Instrumented System and is defined in safety standards as, "instrumented system used to implement one or more safety instrumented functions. An SIS is composed of any combination of sensor(s), logic solver(s) and final elements."
That definition provides some help, but let's look a bit closer at this SIS thing.
SIS logic solvers use the input provided by the sensors and execute the program logic that eventually results in an automated emergency shut-down. For example, a 1oo2 (read as one-out-of-two) logic design monitors two inputs. If either of those inputs changes states, the logic solver executes the shut-down sequence. A 2oo2 logic design uses two inputs, and both must agree in order for the logic solver to initiate the shut-down. There are a number of different design types, 1oo1, 1oo2, 2oo2, 2oo3, etc., and it is quite common to mix analog and discrete inputs. This is called using diverse technologies, and it is very useful in eliminating "false or nuisance" shut-downs resulting from common-cause failures.
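If the voting shorthand is easier to follow in code, here is a minimal sketch in Python. The function names (votes_to_trip, one_oo_two and so on) are made up for illustration and do not come from any vendor's logic solver; a real SIS implements this logic in certified hardware and software, not in a script.

```python
def votes_to_trip(trip_signals, m):
    """MooN voting: trip when at least m of the N inputs demand a shut-down."""
    return sum(bool(signal) for signal in trip_signals) >= m


def one_oo_two(a, b):
    """1oo2: either input alone is enough to initiate the shut-down."""
    return votes_to_trip([a, b], 1)


def two_oo_two(a, b):
    """2oo2: both inputs must agree before the shut-down is initiated."""
    return votes_to_trip([a, b], 2)


def two_oo_three(a, b, c):
    """2oo3: any two of the three inputs must agree."""
    return votes_to_trip([a, b, c], 2)


if __name__ == "__main__":
    # Hypothetical case: only the first of two high-level sensors has tripped.
    print("1oo2 trips:", one_oo_two(True, False))
    print("2oo2 trips:", two_oo_two(True, False))
```

The trade-off shows up right away: with one sensor tripped, 1oo2 shuts the process down while 2oo2 waits for the second sensor to agree, which is why 2oo2 and 2oo3 designs are the usual tools for cutting nuisance shut-downs.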
SIS final elements are the pumps, motors, on/off valves, sometimes throttling valves, etc., that actually stop, close, open, or take whatever action is necessary to bring the process to a safe shut-down state.
What Gomer was explaining to Mr. Barns was that he, Carl, Charlie and Lenny had conducted a thorough audit of the entire SIS (every sensor, logic solver and final element) in order to compare what was installed, and how it was installed, with what had originally been designed and specified.
Note: A communications network may or may not be a part of the SIS. If the logic solver uses some form of a communications network to connect with the inputs and/or outputs, then that communication network is part of the SIS, and thus becomes subject to all the design and testing requirements of the SIS. However, if the logic solver uses a communication network to connect to a non-safety controller and/or an operator interface, then that communication network is not part of the SIS.
Do not think that every safety application requires a microprocessor-based logic solver. It is still perfectly legitimate to use relays and stand-alone technologies. In fact, there may be situations that justify using a relay or stand-alone system alongside a microprocessor-based solution. Just be sure they remain separate and independent of one another.
". . .did a HAZOP and a LoPA"
"We did a HAZOP and a LoPA and found several of our SIFs were improperly assigned to SIL 1 when they should be SIL 2."
The definitions of HAZOP and LoPA have been around a long time and are not confined to SIS discussions. Most companies have been conducting HAZOPs and LoPAs for several decades.
HAZOP is short for HAZard and OPerability study (analysis). HAZOPs are systematic methods used to examine complex facilities or processes in order to identify potentially hazardous situations and find ways to improve operational performance. It's noteworthy that about 60% to 65% of a typical HAZOP study focuses on operability-related opportunities to improve product quality and overall operational performance.
LoPA is short for Layer of Protection Analysis. It is a simplified form of risk assessment that is often used as an extension to or in conjunction with other process hazard analysis methodologies. LoPA considers the different and diverse hazard mitigation "layers" available. For example, two layers of protection typically provided for holding tanks are 1) an automated overfill protection system and 2) locating the tank inside an enclosed dike or berm area.
SIF (Safety Instrumented Function) is defined in safety standards as a "safety function with a specified safety integrity level which is necessary to achieve functional safety and which can be either a safety instrumented protection function or a safety instrumented control function." That's really not much help.
To really understand and appreciate the nuances of the SIF will require some additional homework on your part, but for now accept that the SIF is the safety loop equivalent of a single process control loop, something akin to a flow, pressure, level or temperature control loop. But instead of controlling level to an operator-determined setpoint, the SIF (safety loop) monitors for a pre-determined unsafe condition (e.g., high level) and automatically initiates the appropriate action necessary to mitigate the unsafe condition.
SIL (Safety Integrity Level; see table) is defined in safety standards as a "discrete level (1-4) for specifying the safety integrity requirements of the safety instrumented functions to be allocated to the safety instrumented systems. Safety integrity level 4 has the highest level of safety integrity; safety integrity level 1 has the lowest."
SIL values are assigned to each SIF using a systematic methodology, such as LoPA.
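To make the LoPA-to-SIL step a little more concrete, here is a rough sketch of the arithmetic. It is not a substitute for a real study. Every number in it (initiating event frequency, credit for the existing layer, tolerable frequency) is hypothetical, the function names are invented for this example, and the layers are assumed to be fully independent of one another. The risk-reduction bands are the commonly published ones (SIL 1 is 10 to 100, SIL 2 is 100 to 1,000, and so on).

```python
def required_risk_reduction(initiating_freq_per_yr, layer_pfds, tolerable_freq_per_yr):
    """Risk reduction factor a SIF must still supply after crediting other layers.

    Assumes every protection layer is independent of the others.
    """
    mitigated_freq = initiating_freq_per_yr
    for pfd in layer_pfds:
        mitigated_freq *= pfd
    return mitigated_freq / tolerable_freq_per_yr


def sil_for_rrf(rrf):
    """Map a required risk reduction factor to a SIL (0 means less than SIL 1)."""
    if rrf <= 10:
        return 0
    for sil, upper_rrf in ((1, 100), (2, 1_000), (3, 10_000), (4, 100_000)):
        if rrf <= upper_rrf:
            return sil
    raise ValueError("Required risk reduction is beyond SIL 4; rethink the process design")


if __name__ == "__main__":
    # Hypothetical tank overfill scenario: all three inputs are illustrative only.
    rrf = required_risk_reduction(
        initiating_freq_per_yr=0.5,   # e.g., a level control failure every two years
        layer_pfds=[0.1],             # credit taken for the dike around the tank
        tolerable_freq_per_yr=1e-4,   # company risk target for this consequence
    )
    print(f"Required risk reduction: {rrf:.0f}, so the SIF needs SIL {sil_for_rrf(rrf)}")
```

Change any one input by a factor of ten and the SIL assignment moves a full level, which is one way a periodic review like Gomer's can turn up SIFs sitting in the wrong band.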
Note: When the SIS logic solver is microprocessor-based (e.g., DeltaV SIS, Triconex, GE Fanuc, Allen-Bradley GuardLogix, Honeywell Fail Safe, etc.) it will very likely host multiple safety loops (SIFs) with different SIL value assignments.
Even among experienced safety system practitioners, SIL assignments are often misunderstood. Concern often arises when an identified risk is assigned a SIL 3 value. What some people have overlooked is a small footnote in the safety standard that permits dividing the mitigation of a risk across multiple SIL values. For example, rather than incur the life-cycle costs associated with a single SIF (safety loop) classified as SIL 3, you are permitted to split the SIF into "mini-functions" meaning you may, for example, implement both a SIL 1 and a SIL 2 solution for that particular risk and still meet the SIL 3 requirements.
What Gomer was explaining to Mr. Barns was that he and his buddies had systematically examined the hazards using two independent methodologies (HAZOP and LoPA) and had found that several of the safety loops (SIFs) were not assigned the correct risk reduction values (SILs). As a result, the plant's safety instrumented system (SIS) wasn't providing the level of protection expected or needed.
Good news & bad news
"We also found a few SIL 2 SIFs that could qualify as SIL 1s, but since we've always maintained diversity among our SIS logic solvers, we can't just reclassify these SIFs."
It sounds like Gomer is talking in circles, but he really isn't.
What Gomer is telling Mr. B. is that a few of the safety loops (SIFs) were originally designed, and are now being operated and maintained, as SIL 2 (risk reduction 100 to 1000) when in fact the risk is not very severe, so those SIFs could be designed, operated, and maintained as SIL 1 (risk reduction 10 to 100).
Gomer's good news is that the company is spending more money than necessary to operate and maintain these improperly assigned safety loops, so there are savings to be had. Gomer's bad news is that the plant has a separate safety instrumented system (SIS) logic solver for each SIL value (the "diversity" in "…we've always maintained diversity among our SIS logic solvers..."), meaning there is one logic solver that hosts all the SIL 1 safety loops, another that hosts all the SIL 2 safety loops, etc. For the Springfield Snacks and Pesticides Plant to maintain its logic solver diversity practice, each reclassified SIF (wiring, program logic, documentation, etc.) must be physically moved from one logic solver to another.
Gomer's implication is that this requires some serious change control procedures and follow-up testing of both logic solvers to ensure each is working properly after all the changes have been completed.
Note: Safety standards do permit different SIL-valued safety loops to reside in the same logic solver. However, doing so means that the logic solver must always be designed, operated and maintained per the requirements of the highest-valued SIL safety loop residing in that logic solver.
A frequently overlooked footnote in the IEC 61511-1 safety standard is Note 2 of paragraph 3.2.74 which says, "It is possible to use several lower safety integrity level systems to satisfy the need for a higher level function (for example, using a SIL 2 and a SIL 1 system together to satisfy the need for a SIL 3 function)."
A common application of this concept often occurs in reactor designs where there is a need to prevent reactor run-away. You can design a single SIF that will likely be classified as SIL 3; OR you can design two SIFs, each with lower SIL values.
In choosing the second option, the first SIF is designed to close/open valves, stop/start pumps, etc., bringing the process to a safe state. The second SIF is designed to inject a "kill" solution (chain-stopper) into the feed stream. If one of these SIFs achieves a SIL 1 risk-reduction value, which is easy to achieve, and the other a SIL 2 risk-reduction value, which is not very difficult to achieve, then the two designs combined provide the same risk reduction as a single SIL 3 design. However, and here's the best part, both SIFs can likely be cost-effectively designed to achieve a SIL 2 risk-reduction value, thus producing an overall more reliable solution without incurring the life-cycle cost typically associated with a single SIL 3 design.
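The arithmetic behind that claim is simple multiplication, provided the two SIFs are truly independent (no shared sensors, valves, logic solver or other common-cause failures). That independence is an assumption of this sketch, not something the numbers prove. The function name is hypothetical, and the values are the minimum risk reduction of each band.

```python
def combined_risk_reduction(*risk_reduction_factors):
    """Overall risk reduction of independent SIFs acting on the same hazard."""
    total = 1
    for rrf in risk_reduction_factors:
        total *= rrf
    return total


SIL1_MIN_RRF = 10    # low end of the SIL 1 band (risk reduction 10 to 100)
SIL2_MIN_RRF = 100   # low end of the SIL 2 band (risk reduction 100 to 1000)

if __name__ == "__main__":
    # Shut-down SIF at SIL 1 plus "kill" injection SIF at SIL 2.
    print("SIL 1 + SIL 2:", combined_risk_reduction(SIL1_MIN_RRF, SIL2_MIN_RRF))
    # Both SIFs designed to SIL 2.
    print("SIL 2 + SIL 2:", combined_risk_reduction(SIL2_MIN_RRF, SIL2_MIN_RRF))
```

A combined factor of 1,000 is the entry point of the SIL 3 band, while two SIL 2 loops reach at least 10,000, which is the extra margin the paragraph above is pointing to.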
Looked where at the what?
"We did a preliminary look at the PFD of a few of these SIFs, and we think there is something we can do in the BPCS."
PFD is another acronym, and it means Probability of Failure on Demand. The PFD can be calculated using the dangerous failure rate (λD) and the testing interval (TI). The mathematical relationship, assuming that systematic failures are minimized through design practice, is as follows: PFD = λD × TI / 2. The equation shows that the relationship between PFD and TI is linear; thus, longer times between tests result in larger PFD values.
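Here is the same relationship as a few lines of Python, for anyone who wants to try their own numbers. The failure rate used is purely hypothetical, and the formula is the simplified single-device approximation given above (strictly speaking it yields the average PFD over the test interval, and it holds only while λD × TI is small).

```python
HOURS_PER_YEAR = 8760


def pfd_avg(lambda_d_per_hr, test_interval_hr):
    """Simplified average probability of failure on demand: lambda_D * TI / 2."""
    return lambda_d_per_hr * test_interval_hr / 2


if __name__ == "__main__":
    lambda_d = 2e-6  # hypothetical dangerous undetected failure rate, per hour
    for years in (1, 2, 4):
        pfd = pfd_avg(lambda_d, years * HOURS_PER_YEAR)
        print(f"Test every {years} year(s): PFD = {pfd:.5f}")
```

Doubling the test interval doubles the PFD, which is the linear relationship the equation describes.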
PFD means that when a demand (in this case an unsafe condition) occurs, there is a possibility that an undetected failure in some element of the safety loop (SIF) will prevent the SIS from performing the necessary shut-down action.
In terms of automobiles, PFD is the likelihood that the air-bag will not deploy in an accident.
Adding PFDavg to the SIL table.
BPCS is defined in safety standards as, "system which responds to input signals from the process, its associated equipment, other programmable systems and/or an operator and generates output signals causing the process and its associated equipment to operate in the desired manner, but which does not perform any safety instrumented functions with a claimed SIL ≥1."
The short definition is that the BPCS is what for decades has been called the process control system. The BPCS can be a single-loop, panel-based instrument, a distributed control system (DCS) or a programmable logic controller (PLC) with some type of operator interface.
What Gomer is telling Mr. B. is that he and his buddies took a cursory look at the probability that, at the moment a demand occurs (an unsafe condition exists), one of the devices (sensor, logic solver, etc.) making up those few safety loops (SIFs) would fail and prevent the automated shut-down logic from being executed. Gomer was also suggesting that the basic process control system (BPCS) might offer a way to detect these sorts of failures preemptively, though his explanation was far too simplified.
What Gomer had in mind was to access the HART diagnostic information that often goes unmonitored in safety system devices. Since the Springfield Snacks and Pesticides plant was already using HART multiplexers to access the diagnostic information inside most of the process system transmitters, Gomer's thinking was that by adding another HART multiplexer just for safety system devices, they would be able to detect device failures before a demand, thereby improving the SIS's reliability.
While I applaud Gomer's forward-thinking use of HART diagnostics to improve SIS reliability, that solution may not be appropriate for every company.
Reaping the benefits of HART diagnostics requires that the organization embrace a proactive maintenance culture accompanied by an investment in HART diagnostic utilization training.
When installed properly, HART multiplexers can extract diagnostic information from safety system devices without influencing the reliability of the SIS.
We'll talk a bit more about the relationship between the BPCS and the SIS later. In the meantime, keep in mind that 21st-century operator interface terminals provide a window to a variety of systems and applications, including the BPCS, HART multiplexers, safety system logic solvers, data historians, online modeling and optimization applications, etc. The days of "different tubes for different views" are long gone.
"Also Mr. Barns, ever since corporate insisted on extending the time between scheduled shutdowns, it is playing havoc with our full- and partial-stroke testing periods."
Full-stroke testing is the term used to define when each SIF (safety loop) is fully tested, meaning each discrete sensor is forced to its action state; each analog transmitter is forced to its action value; the logic solver is permitted to execute its programmed logic; and final elements are permitted to change to whatever state they've been instructed (e.g., on/off valves fully open or close).
Because full-stroke testing is a complete test of the automated shut-down system, it is usually only conducted when the process is shut down for scheduled maintenance (i.e., scheduled turnaround).
Note: Remain aware that testing the SIS when the process is shut down means final elements are not working against "true" process conditions (e.g., flow, pressure). The most serious risk is that a valve might have undetected leak-through.
Partial-stroke testing is almost always a test of final elements (e.g., on/off valves). It is the term used to define how parts of an SIF (safety loop) are tested without actually permitting devices to go all the way open or closed. When properly performed, partial-stroke testing should never interrupt production. Note: Partial-stroke testing is conducted under actual process conditions.
Note: Clamping mechanical travel stops to the valve's shaft permits manual partial-stroke testing, but often eliminates other final device elements from the test. You may only take credit for the SIS elements you actually test.
During the design of a safety instrumented system (SIS), several assumptions are used in calculations, including those that establish the safety integrity level (SIL) value assigned to each safety loop (SIF). One of those assumptions is the amount of time that will elapse between full-stroke tests of each safety loop (SIF). The longer the time between full-stroke tests, the greater the PFD, and thus the more robust the safety loop design must be to achieve its assigned SIL value. Note: PFD (probability of failure on demand), MTTD (mean time to detect), and MTTR (mean time to repair) are some of the assumed values used in determining an SIS's design requirements.
If full- and partial-stroke testing have been accounted for from the very beginning, then the period between tests has also been included in establishing the SIL value for each safety loop (SIF). When production runs are extended without considering the impact on full- and partial-stroke testing, the safety system is very likely operating in a degraded, less capable state.
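A quick back-of-the-envelope sketch shows how a longer run between turnarounds can quietly knock a loop out of its SIL band. It uses the simplified single-device relationship PFD = λD × TI / 2 and the commonly published low-demand PFD bands. The failure rate and the function names are hypothetical, and a real verification would model every element of the loop, not just one device.

```python
HOURS_PER_YEAR = 8760

# Upper PFD limit of each SIL band in low-demand mode (commonly published values).
SIL_PFD_UPPER_LIMIT = {1: 1e-1, 2: 1e-2, 3: 1e-3, 4: 1e-4}


def pfd_avg(lambda_d_per_hr, test_interval_hr):
    """Simplified average PFD for a single device: lambda_D * TI / 2."""
    return lambda_d_per_hr * test_interval_hr / 2


def achieved_sil(pfd):
    """Highest SIL band whose upper PFD limit the loop still beats (0 = none)."""
    achieved = 0
    for sil, limit in SIL_PFD_UPPER_LIMIT.items():
        if pfd < limit:
            achieved = sil
    return achieved


def max_test_interval_hr(lambda_d_per_hr, target_sil):
    """Test interval at which PFD reaches the target band's upper limit.

    The actual interval must stay below this value to remain inside the band.
    """
    return 2 * SIL_PFD_UPPER_LIMIT[target_sil] / lambda_d_per_hr


if __name__ == "__main__":
    lambda_d = 2e-6  # hypothetical dangerous undetected failure rate, per hour
    limit_years = max_test_interval_hr(lambda_d, target_sil=2) / HOURS_PER_YEAR
    print(f"SIL 2 requires a full test more often than every {limit_years:.2f} years")
    for years in (1, 2):
        pfd = pfd_avg(lambda_d, years * HOURS_PER_YEAR)
        print(f"Turnaround every {years} year(s): PFD = {pfd:.5f}, SIL {achieved_sil(pfd)}")
```

With these illustrative numbers, a loop that met SIL 2 on a one-year turnaround only delivers SIL 1 performance once the turnaround slips to two years, even though nothing in the field has changed.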
Partial-stroke testing is recognized within safety standards as a permissible way of extending the period between full-stroke testing; but the standards also caution that you can't use partial-stroke testing as a substitute for full-stroke testing. (See graphic, "Usefulness of Partial-Stroke Testing.")
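One widely used simplification (an assumption of this sketch, not a formula taken from the standards quoted in this article) splits a valve's dangerous failures into a fraction that a partial-stroke test can reveal and a remainder that only a full-stroke test will find. The coverage figure, failure rate, test intervals and function name below are all hypothetical.

```python
def pfd_avg_with_pst(lambda_d_per_hr, pst_coverage, pst_interval_hr, full_interval_hr):
    """Simplified average PFD of a valve with partial- and full-stroke testing.

    pst_coverage is the assumed fraction (0 to 1) of dangerous failures that a
    partial-stroke test is able to reveal.
    """
    revealed_by_pst = pst_coverage * lambda_d_per_hr * pst_interval_hr / 2
    found_only_by_full_test = (1 - pst_coverage) * lambda_d_per_hr * full_interval_hr / 2
    return revealed_by_pst + found_only_by_full_test


if __name__ == "__main__":
    lambda_d = 2e-6       # hypothetical dangerous failure rate of the valve, per hour
    two_years = 2 * 8760  # full-stroke test at each turnaround
    monthly = 730         # partial-stroke test roughly once a month

    without_pst = pfd_avg_with_pst(lambda_d, 0.0, monthly, two_years)
    with_pst = pfd_avg_with_pst(lambda_d, 0.6, monthly, two_years)
    print(f"Full-stroke test only: PFD = {without_pst:.6f}")
    print(f"With monthly PST:      PFD = {with_pst:.6f}")
```

The second term is the reason partial-stroke testing cannot replace full-stroke testing: whatever the partial stroke cannot reveal keeps accumulating until the next full test, no matter how often the valve is nudged.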
What Gomer is telling Mr. B. is that the corporate mandate to extend plant operating schedules has, by default, also extended the time between full-stroke tests of the safety loops, and that is a problem.
Note: There are a number of process plants worldwide that are running much longer between planned shut-downs than they did just a decade ago. Unless those plant owner/operators have reviewed, recalculated and possibly redesigned (upgraded) each of their safety loops to accommodate the longer times between full-stroke testing, there is a very good possibility that they are not in compliance with industry safety standards, at least part of the time.
"Mr. B., I know I don't have to tell you how OSHA feels about IEC 61511 and IEC 61508."
Like many of us, Mr. Barns probably didn't have a clue about OSHA's take on IEC 61511 and IEC 61508, and so he likely nodded his head knowingly.
Well sometimes that's okay, but when most people hear the term OSHA (the U.S. Occupational Safety and Health Administration), or the name of some similar regulatory agency, they immediately go into "pay attention" mode.
The two IEC documents have been internationally recognized as consensus standards written specifically for electrical/electronic/programmable electronic safety-related systems. That means that these standards represent "good engineering practice."
IEC 61508 came first, and it is quite detailed. It was developed to cover a variety of industries. IEC 61511 is a process-industry interpretation of IEC 61508, thus in many places, IEC 61511 references back to IEC 61508.
For the purists among us, IEC 61511 began life as ISA S84. S84 was harmonized with IEC 61511 in 2004. At the time of harmonization, S84 retained a "grandfather" clause. The concept of the "grandfather clause" in ISA-84.01-2004-1 originated with OSHA 1910.119.
The grandfather clause's intent is to recognize prior good engineering practices (e.g., ANSI/ISA-84.01-1996) and to allow their continued use with regard to existing SIS. The grandfather clause (ISA-84.01-2004-1 Clause 1.0 y) states: "For existing SIS designed and constructed in accordance with codes, standards, or practices prior to the issuance of this standard (e.g., ANSI/ISA-84.01-1996), the owner/operator shall determine that the equipment is designed, maintained, inspected, tested and operating in a safe manner."
The grandfather clause establishes that the owner/operator of an SIS designed and constructed prior to the issuance of the standard should demonstrate that the "equipment is designed, maintained, inspected, tested and operating in a safe manner." There are two essential steps:
The ALARP (as low as reasonably practicable) concept requires that the risk be driven lower when the costs are practical. New practices sometimes include practical things, such as very affordable SIS solutions. The civil court and regulatory systems also seem to want them. So, there are cost and moral arguments for moving forward with partial upgrades as they become practical and feasible.
Technically, the S84 committee documented in TR84.00.04 that the determination had to be at least based on a risk assessment of the current design and management system to determine the risk reduction required and verify that the installed systems are capable of achieving it.
Practically, the equipment performance is estimated for the purposes of the design calculation. Then the performance is monitored in the field and when the performance does not match expectations, the assumptions have been invalidated and the risk gap must be addressed. This involves root cause analysis to understand whether the frequency of failure can be reduced. In some cases, this will result in the replacement of the existing equipment with better-performing models.
Ultimately, each SIS solution is likely to organically evolve as problems are found or when better technology becomes available that has advantages that outweigh its costs.
The key principles of both IEC standards are the safety life cycle and safety integrity levels (SILs).
The safety life cycle is just what you imagine: a continuous review and improvement cycle that has been designed specifically to address the safety system from its initial design to its eventual retirement.
We've already discussed SIL (safety integrity level) so we won't rehash it here.
To understand Gomer's comment "…how OSHA feels about IEC…" we need to look at two items.
The first is the U.S. National Technology Transfer and Advancement Act of 1995. This act requires that all federal agencies (e.g., EPA, FDA, OSHA, etc.) recognize existing consensus standards, such as IEC 61511 and IEC 61508. That means that all government agencies have been instructed to accept the premise of consensus standards and abide by the standards' requirements.
Second, in 2000, OSHA sent a letter to ISA. In that letter OSHA acknowledged that S84 (now IEC 61511) had been officially recognized and generally accepted as good engineering practices for SIS.
Additionally, though OSHA's 1910.119 (PSM – Process Safety Management) regulation does not include specific information on the requirements for safety systems, it does require that facilities perform a process hazard analysis (PHA) and take measures to mitigate identified risks. OSHA's mention of safety systems is simply, "The employer shall document that equipment complies with recognized and generally accepted good engineering practices." When we consider that simple statement alongside the National Technology Transfer and Advancement Act of 1995, we can only conclude that IEC 61508 and IEC 61511, or something very similar, must be followed.
What Gomer was subtly reminding Mr. Barns was that if the plant had an incident that resulted in an OSHA investigation, the investigators would quickly realize that the plant was not conforming to IEC safety standards, and fines would most certainly be levied and someone might even end up going to jail.
AND HERE IT COMES
ADDITIONAL SIS RESOURCES
The Internet is awash with SIS-related information. Here are a few resources that I believe represent the best of the best:
Emerson's PlantWeb University SIS Course – 11 different courses. Each will take about 15 minutes to complete and you can earn points for some fun gifts. http://plantweb.emersonprocess.com/university/engSch_SIS_XML.asp
Emerson's DeltaV Book Store includes process control-related books from a lot of different authors and sources. It's really a good single-stop shop. http://easydeltav.com/bookstore/
SIS-TECH Technical Information – SIS-TECH is an independent company that specializes in SIS consulting, services and training. The SIS-TECH web site has a really good collection of SIS-related material, and it's all free. http://www.sis-tech.com/technical_resources.html
exida is an independent company that specializes in SIS consulting, training, competency certification of individuals and third-party device certification. exida also hosts a regularly updated SIS device certification list. http://www.exida.com/
TÜV Rheinland Group is an independent company that specializes in SIS-related services, including competency certification of individuals and third-party device certification. http://www.tuvasi.com/