I was in a joking mood when, in my November column on SIS, I referred to the camel as a horse designed by committee. I apologize for that; I did not intend to offend the committee. I know that the members of the IEC 61508 and other committees have worked very hard (for decades, since 1998) and deserve our thanks. My goal with the November column was to move our safety standards away from equations filled with acronyms and towards quantifiable and practical improvements, and not be turned off by anything or anybody.
Let me start this second column on safety standards by restating that they must be made simpler, more practical, more quantifiable and focused on the whole loop, not only on its equipment components. Because we are interested in the SIL of the complete loop, that SIL must also reflect the reliability of the communication links, the power supplies and, most importantly, the automatic protection against human error. I will show that what we need is not more onion diagrams (Figure 1), but a clear understanding that a loop will only be safe if all of its components are safe. We must understand that it does little good if some electronic components are "suited for" a certain SIL when the loop itself is not protected from operator errors, cyber attacks, or communication or power supply failures.
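To make the whole-loop argument concrete, here is a minimal sketch in Python. It assumes that every element of the loop fails independently and that the loop fails on demand if any one element does, and it uses the low-demand SIL bands of IEC 61508. The function names and all of the failure probabilities are hypothetical, chosen only for illustration; they are not taken from any real installation.

```python
import math

# Low-demand SIL bands from IEC 61508, expressed as average probability of
# failure on demand (PFDavg): (SIL, lower bound inclusive, upper bound exclusive).
SIL_BANDS = [(4, 1e-5, 1e-4), (3, 1e-4, 1e-3), (2, 1e-3, 1e-2), (1, 1e-2, 1e-1)]


def loop_pfd(element_pfds):
    """PFD of a series loop: it fails if any one element fails.

    Assumes the elements fail independently (an illustrative simplification).
    """
    return 1.0 - math.prod(1.0 - p for p in element_pfds.values())


def sil_for_pfd(pfd):
    """Return the SIL band that a PFDavg falls into (0 means below SIL 1)."""
    if pfd < 1e-5:
        return 4  # better than the SIL 4 band; capped at 4
    for sil, low, high in SIL_BANDS:
        if low <= pfd < high:
            return sil
    return 0


# Hypothetical numbers for illustration only.
equipment = {"sensor": 2e-4, "logic_solver": 1e-4, "valve": 5e-4}
rest_of_loop = {"comm_link": 3e-3, "power_supply": 2e-3, "operator_bypass": 2e-2}
whole_loop = {**equipment, **rest_of_loop}

print("equipment only:", sil_for_pfd(loop_pfd(equipment)))
print("whole loop:    ", sil_for_pfd(loop_pfd(whole_loop)))
```

With these made-up numbers, the three equipment items alone fall in the SIL 3 band, but once the communication link, the power supply and an unprotected operator bypass are counted, the same loop falls to the SIL 1 band. The weakest contributors dominate the result, not the certified electronics.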
The Cultural Divide
We live at a time when cultural attitudes concerning automation are changing and there is a debate about who should have the "last word" on safety: the machine or the operator? Standards such as those covering SIS reflect the old culture, which trusts the operator more. Let us review whether this "manual mentality" is still valid in the 21st century and, if not, why not.
Safety statistics tell us that the number one cause of all industrial accidents is human error. One could refer to Three Mile Island, where the operators poured water into the instrument air supply; to BP, where there was no automation to keep the drilling pipe straight; to the ferry accident in Korea, where safety overrides were not provided to prevent the captain from turning too sharply into a fast ocean current; or to airplane accidents where the pilots were not prevented from landing at the wrong speeds. The list goes on...
This is occurring in an age when we land robots on Mars, target enemy tanks with drones and are getting ready for the driverless car, in which we can play bridge on our mobile telephones while the smart vehicle parks itself. Why this contradiction? Is it just tradition? Is it that the older generations still do not realize that the choice is not between humans and machines, but between trusting the judgment of panicked rookie operators running around in the dark at 2 a.m. and trusting the judgment of professional control engineers, who spent months analyzing all potential "what if" combinations before deciding what emergency actions should be triggered under a particular set of emergency conditions?
So What Do We Need?
What we need is a change in safety philosophy, a change in our attitude towards the role of the operator and a change in our fragmented approach to safety. I would, for example, combine all those layers in the traditional onion diagrams (Figure 1) into a simple three-layer one:
The 1st Layer would be the core, containing the Basic Process Control System (BPCS). This innermost core and its operation would be identical to those of the old "onion diagrams". In this region the operator is in charge. Here he/she is free to change set-points, modify control/logic algorithms, retune controllers, add or change sensors, final control elements, etc. While the plant conditions are within this region, the goal is to obtain optimized plant operation and maximized production.
The 2nd Layer in this onion diagram of mine is the safety instrumented system (SIS) layer. When the plant conditions enter this layer, the safety actions are triggered and these instruments completely overrule the 1st (BPCS) layer. This 2nd layer uses its own dedicated (when necessary, redundant or voting) sensors and/or final control elements and has no interconnection whatsoever with the 1st layer. In this 2nd layer, the operator can still make changes, but only with the formal (written) approval of plant engineering.
The 3rd Layer is the Override Safety Control (OSC) layer, which cannot be turned off or overruled by anything or anybody. When the plant conditions enter this highly accident-prone region, safe shut-down of the plant is automatically triggered, no matter what. This layer depends on its own sensors, overrules any and all actions of the inner layers and has absolutely no connection to the Internet. In other words, within this layer the operator is out of the picture (operator actions are blocked) and the plant shuts down under preplanned, totally automatic control.
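The three layers and their permission rules can be sketched in a few lines of Python. This is only an illustration of the logic described above, not a real safety implementation: the names (Layer, Limits, active_layer, operator_change_allowed), the single high-limit process variable and the trip values are all assumptions made for the example. The one structural point it tries to respect is that each safety layer decides from its own sensor reading only.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    BPCS = 1  # basic process control: operator in charge
    SIS = 2   # safety instrumented system: overrules the BPCS
    OSC = 3   # override safety control: overrules everything


@dataclass(frozen=True)
class Limits:
    """Hypothetical high limits for one process variable (e.g. pressure)."""
    sis_trip: float  # plant enters the SIS region at or above this value
    osc_trip: float  # plant enters the OSC region at or above this value


def active_layer(sis_reading, osc_reading, limits):
    """Decide which layer is in command.

    Each safety layer looks only at its own dedicated sensor, and the
    outermost layer that has tripped always wins.
    """
    if osc_reading >= limits.osc_trip:
        return Layer.OSC
    if sis_reading >= limits.sis_trip:
        return Layer.SIS
    return Layer.BPCS


def operator_change_allowed(layer, has_written_approval=False):
    """Permission rule for operator changes in each layer."""
    if layer is Layer.BPCS:
        return True                  # operator is free to make changes
    if layer is Layer.SIS:
        return has_written_approval  # only with plant engineering approval
    return False                     # OSC: operator actions are blocked


limits = Limits(sis_trip=80.0, osc_trip=95.0)  # illustrative values
layer = active_layer(sis_reading=85.0, osc_reading=84.0, limits=limits)
print(layer.name, operator_change_allowed(layer))
```

Note that nothing passed to operator_change_allowed can unlock the OSC layer: there is no argument that returns True for it, which mirrors the requirement that this layer cannot be overruled by anything or anybody.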