The ins and outs of safety instrumented systems—part 2

About the authors

Greg McMillan and Stan Weiner bring their wits and more than 80 years of process control experience to bear on your questions, comments and problems. Write to them at [email protected], and you can also follow McMillan's Control Talk blog.


This is Part 2 of our Control Talk series on safety instrumented systems. You can read Part 1 here.

Greg: Last month, we got an overview of the magnitude of the effort to properly design, implement and test a safety instrumented system (SIS) with an emphasis on measurement and valve redundancy and performance requirements.

Stan: Here, we continue our discussion with fellow Monsanto retiree Len Laskowski on important aspects of SIS that aren't discussed enough, remembering that the design and implementation of an SIS requires considerable expertise and the involvement of key plant personnel and SIS specialists.

Greg: Len, what are some of the top reasons an SIS fails?

Len: Systematic failures can occur during any phase of the SIS lifecycle. First of all, many don’t realize that standards have evolved over time, yet they must ensure the latest standards are understood and followed. For example, on a burner management system, when the flame goes out, you should close the gas valves to stop gas from accumulating in the firebox. Do you also need to make sure you don’t have flame before you start? Yes, because in one incident a flame detector was “stuck on,” so when the flame went out, the burner didn’t trip, and gas accumulated and caused an explosion. The standards now require you to verify no flame is present before starting.
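
As a rough illustration of the two checks Len describes, here is a minimal Python sketch. It is not a real burner management design and is not taken from any standard; the `BurnerState` fields and both function names are hypothetical, chosen only to show why a "no flame before start" permissive catches a stuck-on detector.

```python
from dataclasses import dataclass


@dataclass
class BurnerState:
    flame_detected: bool   # what the flame detector reports (hypothetical signal)
    burner_running: bool   # True once the burner has been lit
    gas_valves_open: bool  # state of the fuel gas shutoff valves


def start_permitted(state: BurnerState) -> bool:
    """Allow a start only if the burner is off and the detector shows no flame.

    A detector that reports flame before ignition is treated as "stuck on",
    so the start is refused instead of trusting the detector later.
    """
    return not state.burner_running and not state.flame_detected


def flame_out_trip(state: BurnerState) -> BurnerState:
    """Close the gas valves when a running burner loses its flame."""
    if state.burner_running and not state.flame_detected:
        return BurnerState(flame_detected=False,
                           burner_running=False,
                           gas_valves_open=False)
    return state


# A stuck-on detector shows flame while the burner is still off.
stuck_detector = BurnerState(flame_detected=True,
                             burner_running=False,
                             gas_valves_open=False)
print("Start permitted with stuck detector:", start_permitted(stuck_detector))
```

The flame-out trip alone depends on the detector working; the pre-start check is what exposes a detector that can never report "no flame."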

Too often, not every member of the team pays enough attention to the application. Details matter from start to present. I purposely did not say “finish” because ongoing requirements mean you're never finished until you decommission the SIS. If you're the plant manager, king for a day, can you spot the issue? You attend a thorough process hazard and operability analysis (HAZOP); verify the layers of protection analysis (LOPA) evaluation; have a complete safety requirement specification (SRS); install new functioning hardware; install new tested software; do regular proof testing; train world-class operators; and use a randomly selected trip set point or process delay time.

Not all the elements are addressed in the safety lifecycle's three main phases:

Analysis: Hazard analysis, risk assessment, SIL calculations, allocation of safety functions to protection layers, and SRS
Realization: Design and engineering, build, integration and factory acceptance testing, installation and commissioning and safety validation
Operation: Operation and maintenance including diagnostics and testing, modification and decommissioning

In the analysis phase, some examples of unaddressed elements are: not identifying all of the initiating causes; not addressing effects and frequencies of problems from utilities; under- or overestimating consequences; not recognizing the interdependence of layers; and not completing the interaction matrix. As a result, common-cause failures aren't eliminated; safety functions aren't clearly defined and don't completely mitigate the hazard; process safety time for alarms/analyzers/instrumentation is too short; relief valves aren't properly sized; losses of feeds from failure of turbines or prime movers (e.g., pumps, compressors) aren't considered; design basis for process safety time/response time/trip set point is incomplete; bypass and reset conditions aren't clearly identified; and environmental conditions are not specified (e.g., electrostatic protection, corrosion, coating, cleaning).

In the realization phase some unaddressed elements are: sensor doesn't detect hazardous condition; improper sensing technology is used; sensors aren't suitable for application; final control elements don't mitigate hazard completely; instrument failure modes aren't identified; response to instrument and/or communication failure isn't configured; fail position is incorrectly configured; sensors are installed improperly; valve shutoff is incorrect or in wrong direction; testing is incomplete; no test procedure; all failure modes aren't tested (communication, power, utility, instrument); and integrated implementation and testing is not done, particularly in third-party systems (note that this can also occur during design).

In the operation phase, some unaddressed elements are: no bypass philosophy (forces vs. engineered bypasses and bypass valves not monitored); SIS demands aren't tracked or investigated; change management isn't enforced (trip points changed without risk assessment); maintenance isn't sufficient; safety instrumented functions (SIF) aren't tested or testing is inadequate (particularly of failure modes); SIS preventive maintenance isn't done; failures aren't tracked or investigated; instruments aren't restored within mean time to repair (MTTR); partial decommissioning of SIFs and decommissioning impact on other SIFs isn't evaluated; and residual risk mitigation is inadequate.

Stan: Wow! We started with potential mistakes because people tend to want to cut to the chase, and your list of failures is an attention-getter and eye opener, but let’s step back here. Give us your take on how to do SIL calculations to deliver a successful project.

Len: Since 2003, it's been a requirement of IEC 61511 to do SIL verification calculations. The primary reason is to ensure SIF designs are robust enough to meet the required SIL. Companies are adhering to this requirement, but many aren't recognizing the value that can be gained by applying them early in the design.

SIL verification calculations can be used early in the design process to deliver successful projects. They can also be used as a design optimization tool with lifecycle costs in mind.

To do preliminary SIL calculations, you need analysis and documents, including process hazard analysis (PHA) and the SIL determination process, such as a LOPA, engineered solutions (SIF list), preliminary P&IDs and preliminary causes and effects (C&E). You also need extensive information, including instrument types (e.g., DP cell, butterfly valve), number of instruments, voting architecture (e.g., sensor and final elements), safety-critical devices, demand rates, SIL and risk reduction factors (RRF), valve fail positions, SIF tags, maximum spurious trip rate, proof test interval, established common causes, established mean time to repair (MTTR), proof test coverage factors, mission time, startup time and architectural constraints.

In the first pass at SIL calculations, use generic data for instruments. If you know specific models of instruments that will be used, then use that data to give you approximate numbers. Look at architectural constraints and systematic capability. Try to include as many of the following items as possible to refine the calculation. Expect numerous iterations before you're done.
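
A first pass of this kind can be sketched in a few lines of Python. The sketch assumes the common simplified 1oo1 approximation, PFDavg ≈ λDU × TI / 2, applied to each subsystem and summed, and it ignores common cause, diagnostics, test coverage and MTTR. The failure rates are placeholders for illustration only, not values from any vendor or database, and the names are hypothetical.

```python
HOURS_PER_YEAR = 8760

# Placeholder dangerous-undetected failure rates (per hour).
# Illustrative numbers only, NOT taken from any vendor or generic database.
GENERIC_LAMBDA_DU = {
    "sensor": 1.0e-6,
    "logic_solver": 1.0e-7,
    "final_element": 3.0e-6,
}


def pfd_avg_1oo1(lambda_du_per_hr: float, proof_test_interval_hr: float) -> float:
    """Simplified average probability of failure on demand for one device."""
    return lambda_du_per_hr * proof_test_interval_hr / 2


def sif_pfd_avg(lambda_du_by_subsystem: dict, proof_test_interval_hr: float) -> float:
    """Sum the subsystem PFDavg values (sensor + logic solver + final element)."""
    return sum(pfd_avg_1oo1(rate, proof_test_interval_hr)
               for rate in lambda_du_by_subsystem.values())


# First pass with a one-year proof test interval (an assumed value).
first_pass_pfd = sif_pfd_avg(GENERIC_LAMBDA_DU, 1 * HOURS_PER_YEAR)
first_pass_rrf = 1 / first_pass_pfd
print(f"PFDavg = {first_pass_pfd:.4f}, RRF = {first_pass_rrf:.0f}")
```

Swapping the placeholder rates for model-specific data, then changing the interval or the architecture, is the kind of iteration the text describes.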

For the final SIL calculations, you need information on final-element TSO (tight shutoff), severe service, partial-stroke testing, device degradation, advanced diagnostics in logic solver, ability in logic solver to detect over and under range, available external comparison, service likely to plug, sensor fail positions, “proven in use” information, volume boosters, and maintenance capability. Some wild cards are “energize to trip” applications, availability of utilities (e.g., air, nitrogen, steam, cooling water), sequences, sharing of instruments (splitters), multiple logic solvers, meeting process safety time, mitigation systems such as fire and gas, and spare equipment (e.g., pumps).

Here are some SIL verification best practices:

Start as early as practical in a project, and
Design to RRF and SIL, not just SIL.
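
To see why the second point matters, consider a hypothetical SIF whose LOPA calls for an RRF of 500. A design with an RRF of 120 still falls in the SIL 2 band, so it passes a SIL-only check while falling short of the risk reduction actually needed. The Python sketch below uses the standard low-demand PFDavg bands; the function names and the example numbers are assumptions for illustration.

```python
def sil_band(pfd_avg: float) -> int:
    """Return the low-demand SIL band for a PFDavg (0 means below SIL 1)."""
    if pfd_avg >= 1e-1:
        return 0
    if pfd_avg >= 1e-2:
        return 1
    if pfd_avg >= 1e-3:
        return 2
    if pfd_avg >= 1e-4:
        return 3
    return 4


def meets_target(pfd_avg: float, required_sil: int, required_rrf: float):
    """Return (meets SIL band, meets required RRF) for a calculated PFDavg."""
    achieved_rrf = 1 / pfd_avg
    return sil_band(pfd_avg) >= required_sil, achieved_rrf >= required_rrf


# Hypothetical case: SIL 2 with a required RRF of 500, design achieves RRF 120.
sil_ok, rrf_ok = meets_target(pfd_avg=1 / 120, required_sil=2, required_rrf=500)
print("Meets SIL band:", sil_ok, "| Meets required RRF:", rrf_ok)
```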

The proof test interval target should be the turnaround interval plus six months to allow for production schedule changes. A typical practical limit for the proof test interval is 60 months. Beyond that, with a LOPA demand rate of once per 10 years, you're moving into high-demand mode, so run backup calculations to see if the design will pass in high demand. Define safety-critical inputs/outputs (IO) early.
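
The arithmetic is simple enough to write down. This small Python sketch only restates the numbers given above (six months of margin, 60-month practical limit); the function name and the 48-month turnaround in the example are hypothetical.

```python
PRACTICAL_LIMIT_MONTHS = 60  # typical practical limit quoted in the text


def proof_test_interval_target(turnaround_interval_months: float,
                               margin_months: float = 6):
    """Return (target interval in months, whether a high-demand check is needed)."""
    target = turnaround_interval_months + margin_months
    needs_high_demand_check = target > PRACTICAL_LIMIT_MONTHS
    return target, needs_high_demand_check


# Example: a unit on an assumed 48-month turnaround cycle.
target, check_high_demand = proof_test_interval_target(48)
print(f"Target interval: {target} months, high-demand check: {check_high_demand}")
```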

To achieve a SIL, there are more requirements than just PFD average. You need architectural constraints and systematic capability. SIL verification calculations are required and they help define project requirements.

There are a number of major degrees of freedom in doing SIS design. The biggest factor is type of redundancy, realizing that one-out-of-nine (1oo9) voting, where any one of nine devices trips, may not be a valid SIL calculation architecture. For example, a reactor may have nine temperature measurements in place around it. This is done because there can be localized hot spots. The voting as configured would be 1oo9, but for SIL calculations, you actually have nine sensors measuring different temperatures and the voting is 1oo1, which will make a big difference in the results.
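
The size of that difference can be sketched with the simplified expression for 1ooN voting of identical devices, PFDavg ≈ (λDU × TI)^N / (N + 1). The Python sketch below ignores common cause, which would dominate a real 1oo9 result, and uses a placeholder failure rate, so the numbers show only the direction and scale of the error, not a real design value.

```python
HOURS_PER_YEAR = 8760


def pfd_avg_1oon(lambda_du_per_hr: float, proof_test_interval_hr: float, n: int) -> float:
    """Simplified PFDavg for 1ooN identical devices, ignoring common cause."""
    return (lambda_du_per_hr * proof_test_interval_hr) ** n / (n + 1)


LAMBDA_DU = 2.0e-6          # placeholder failure rate per hour, not real data
TI = 1 * HOURS_PER_YEAR     # assumed one-year proof test interval

# What a 1oo9 model would claim if all nine sensors saw the same hazard.
pfd_as_1oo9 = pfd_avg_1oon(LAMBDA_DU, TI, 9)
# What applies when each sensor is the only one watching its own hot spot.
pfd_as_1oo1 = pfd_avg_1oon(LAMBDA_DU, TI, 1)

print(f"Modeled as 1oo9: {pfd_as_1oo9:.2e}")
print(f"Modeled as 1oo1: {pfd_as_1oo1:.2e}")
```

Crediting redundancy that does not exist for a given hot spot makes the sensor subsystem look many orders of magnitude better than it is.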

For brownfield modernization/expansion projects, start with front-end engineering. For new grassroots facilities, information may not be available from the bulk of the studies until early detail design.

High-risk areas should be evaluated as soon as possible, as early SIL calculations help ensure and deliver a successful project with proper instrumentation and facilities included in the design, cost estimate and schedule; proper scope for the project including potential impact on other proposed IPLs; lowest overall cost; good proof test intervals; reduced maintenance (valve testing and bypasses); and minimization of personnel exposure. Run the SIL calculations over and over to meet objectives. The result can be a timely, successful project and lower overall cost.

Doing SIL verification calculations early proves the design is feasible; proves the required SIL can be achieved; gives the project the proper scope early enough that the required capital, schedule and quality can be provided; allows a proper design to be implemented rather than rushing and squeezing in a substandard design; positively impacts lifecycle costs by improved constructability, maintainability and testability; and ensures proper access to locations, platforms, bypasses and devices.
Doing SIL verification calculations late may result in finding that designs aren't feasible; redesigns to attain the required SIL; greater potential for significant change orders for SIS electrical and instrument (E&I) work and piping, plus schedule delays; increased possibility of substandard design due to schedule squeeze; and lifecycle costs potentially negatively impacted by requiring increased testing, downtime for testing and reduced maintainability.
Doing SIL verification calculations just prior to startup can result in really ugly stuff. It's possible that the SIL isn't achieved; the design is inadequate; installed risk reduction is inadequate and additional mitigating measures are required by the unit (negative cost and resource impact); or the project needs to redesign the SIS and demolish inadequate instrumentation and piping. The result can be significant project delays and overruns, and/or lifecycle costs negatively impacted because of cost and schedule pressure.
