
Six Sigma Alarm Management

Feb. 5, 2008
Six Sigma Methods Bring Order to Alarm Chaos at Monsanto’s Soda Springs, Id., Phosphorus Plant

By Brent J. Thomas

Monsanto’s plant in Soda Springs, Idaho, produces elemental phosphorus, or P4. Most of the P4 is sent to other Monsanto facilities in Luling, La., and Camacari, Brazil, to make PCl3, the primary ingredient for Roundup herbicide. The remainder ends up in a variety of products including foods, cleaners, water treatment, flat-panel televisions, oral care products, paints and coatings, and pharmaceuticals. Specific applications of Soda Springs elemental phosphorus include the manufacture of Skydrol hydraulic fluids for aircraft, Phos-Chek fire retardant for fighting forest fires and Dequest water purifier.

The Process

The process of manufacturing elemental phosphorus begins with mining the phosphate ore from the Rasmussen Valley mine thirty miles north of Soda Springs. The ore is trucked to the site on Monsanto’s private highway and stockpiled.

The ore is fed to a rotary kiln, much like a cement kiln, 325 ft long and 16 ft in diameter, rotating at about 3.5 rpm.  Organic material is burned out of the ore, and nodules are formed.  The nodules are then mixed in batches with quartzite and coke and fed to one of three electric arc furnaces.  The three furnaces combined consume approximately 150 megawatts of electricity. The phosphorus leaves the furnaces as an off-gas and is condensed for shipment.

The Situation in Soda Springs

The furnace process at Soda Springs is a dynamic one that requires sophisticated controls. Monsanto began using Fisher’s PROVOX distributed control system (DCS) in the early 1980s. By about 1996, Fisher had become Emerson, and development of the DeltaV control system had begun.

Figure 1. A PB4Y fire bomber drops Phos-Chek on a brush fire. The bright red retardant fades in the sun and eventually decays into fertilizer.

Soda Springs is nearing the end of an eight-year project to convert its PROVOX DCS to DeltaV. Both control systems provide sophisticated alarming capabilities, which in many ways are far too easy to implement. Back when an alarm meant installing a sensor and cable, which cost significant dollars, we carefully evaluated the alarms according to need, and the number was kept to a minimum. With the advent of digital control systems, alarms became easy and cheap (almost free). The result was gross overuse of the alarming capability, which caused as many problems as it solved.

Monsanto began putting an emphasis on Six Sigma practices in 2001 and an emphasis on alarm management in 2002. Both corporate and local teams were formed to assess the state of the alarm systems within Monsanto and define a path forward to develop an effective, efficient alarm methodology and system. The local team decided to apply the new methodology to one furnace first to get a feel for what we were up against. This furnace was due for conversion to DeltaV and presented some opportunities for improvement.

Where Do We Go From Here?

How to analyze the present state and how to improve the situation were the biggest questions. The answer to both was the DMAIC (Define, Measure, Analyze, Improve, Control) methodology model of the Six Sigma process. After two years of work, Monsanto arrived at a firm alarm philosophy and had corporate-wide standards for analysis, improvement and key performance indicators (KPIs).

The first step in the analysis was to ask, just what is an alarm system and what should it look like?

Figure 2. Typical control room alarm boundaries and appropriate operator responses.

The purpose of an alarm system is to alert the operator to conditions that require attention. The ideal, of course, is to stay in the green target operation area, but this would require an unrealistically stable process, one that wouldn’t need a DCS. The next best thing is to stay within the yellow area, which leaves the operator free to optimize operations. This also requires a mostly stable process that can be completely controlled by the DCS alone. Unfortunately, our furnace processes are extremely dynamic, and operator intervention is a fact of life. However, we can minimize operator workload by reducing the total number of alarms, eliminating those that are not necessary.

Problem Statement(s)

Our alarm problems came down to these:

  1. Too many alarms. Alarms that don’t mean anything desensitize operators to the ones that do.
  2. Too many alarms at once. Alarm floods allow for no reaction time.
  3. Too many alarms had no defined response. Operators sometimes had no way to react to an alarm.
  4. Poor alarming practices can cause incidents (see Three Mile Island and Chernobyl).

We used the Six Sigma cause-and-effect fishbone diagram to investigate the possible causes of our alarm problems. We found that while we did a good job of selecting and maintaining instrumentation, our methods were the root cause of the alarm system state—we had no philosophy or guidelines for configuring alarms.

Solution

At this point, we applied the Six Sigma DMAIC system to the problem. First we developed an alarm management philosophy and selected alarm metrics (the Define phase). Then we assessed the present alarm system (Measure/Analyze phases). We reduced the number of nuisance alarms (Analyze/Improve phases). We rationalized alarms by need and priority (Improve phase). Then we developed an alarm configuration database (Improve/Control phases). Finally, we will implement knowledge-based alarming where appropriate.

DEFINE

Steps in an Alarm Management Project
Our alarm management philosophy has five points.

  • Every alarm will be rationalized by need and priority.
  • Every audible alarm will have a defined response.
  • Each response will have an appropriate response time.
  • Alarm system metrics will be defined, measured and reported.
  • A continuous improvement process will monitor alarm system performance, identify meaningful opportunities for improvement and implement them.

The alarm philosophy decided on by the team was used in every step of the process from there on. Basically, it says that an alarm should be informative and help the operator do his/her job. It should not distract the operator’s attention from the alarms and conditions that do require action. In general, although there are some exceptions, if there is not a defined response for the operator that will help correct the situation, it shouldn’t be an alarm.

From this point on, if I use the word “alarm,” I mean an audible alarm—one that sounds a horn that an operator has to acknowledge to shut off. “Advisory” parameters in the control system cover those issues for which there is no defined response, and changing alarms to advisory or log events is one of the options for reducing alarms.
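The rule of thumb above (no defined response, no audible alarm) can be expressed as a simple check against an alarm configuration record. The following is only a minimal sketch: the `AlarmPoint` record, its field names, the `classify` helper and the tag names are hypothetical and are not taken from Monsanto's alarm database, and the exceptions mentioned in the text are not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlarmPoint:
    """Hypothetical configuration record for one candidate alarm."""
    tag: str
    description: str
    response: Optional[str] = None              # defined operator response, if any
    response_time_min: Optional[float] = None   # time allowed for that response


def classify(point: AlarmPoint) -> str:
    """Return 'alarm' only when a response and a response time are both defined.

    Anything else is demoted to 'advisory', following the philosophy that
    an audible alarm must have a defined operator response.
    """
    if point.response and point.response_time_min is not None:
        return "alarm"
    return "advisory"


# Illustrative tags only
high_pressure = AlarmPoint("PT-101", "Furnace pressure high",
                           response="Reduce furnace power",
                           response_time_min=5)
info_only = AlarmPoint("TI-202", "Cooling water temperature drifting")

print(classify(high_pressure))  # alarm
print(classify(info_only))      # advisory
```

A check like this could be run over a whole configuration export to list the points that are candidates for demotion to advisory or log events.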

MEASURE

After we firmed up our alarm philosophy, we selected alarm metrics. Monsanto established a corporate-wide initiative to improve alarm system performance at all locations after an incident occurred that was partly due to a poor alarm system design.  A corporate alarm team was chartered which agreed on a set of metrics. The average alarm rate metric was the one most used in the Soda Springs project. 

Figure 3. The Six Sigma cause-and-effect fishbone diagram of Monsanto’s alarm issues.

Here are the metrics we used.

Average Alarm Rate: Average number of alarms per hour per day

Operator Loading: Number of 10-minute periods per day with 10 or more alarms

We also report on:

  • Standing alarms: alarms lasting longer than 24 hours
  • Alarms per console point count
  • Alarms per plant process area
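As a rough illustration of two of these secondary reports, the sketch below flags standing alarms and counts alarms by process area. The input shapes (a dictionary of active alarms keyed by tag, and a list of tag/area pairs), the function names and the tag names are all assumptions made for the example; the article does not describe how the data were actually stored.

```python
from collections import Counter
from datetime import datetime, timedelta

STANDING_LIMIT = timedelta(hours=24)  # "lasting longer than 24 hours"


def standing_alarms(active, now):
    """Return tags of alarms that have been active for more than 24 hours.

    `active` maps a tag to the time the alarm became active (assumed format).
    """
    return sorted(tag for tag, since in active.items()
                  if now - since > STANDING_LIMIT)


def alarms_per_area(events):
    """Count alarm events by plant process area.

    `events` is an iterable of (tag, area) pairs (assumed format).
    """
    return Counter(area for _tag, area in events)


# Illustrative data only
now = datetime(2008, 2, 5, 12, 0)
active = {
    "FIC-101": datetime(2008, 2, 4, 11, 0),  # active for 25 hours
    "TI-202": datetime(2008, 2, 5, 9, 0),    # active for 3 hours
}
print(standing_alarms(active, now))
print(alarms_per_area([("FIC-101", "furnace"), ("TI-202", "kiln"),
                       ("PT-101", "furnace")]))
```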

Initially the metrics decided upon by the Soda Springs team were the same as those defined by the corporate team.  For our primary metric, we use ten-minute periods during the day and compare our performance to the EEMUA (Engineering Equipment & Materials Users Association) guideline: no more than ten alarms in a ten-minute period.

The metrics were calculated by dividing each day into 144 ten-minute periods and counting the alarms that occurred in each period. Those counts are used to calculate the Average Alarm Rate; for this project we used four-hour averages. For the Operator Loading metric, the threshold value was set at 10, so only those periods with 10 or more alarms are counted. These data were used for the initial analysis. Later in the project, the average alarm rate was used to evaluate improvements, since it is the best indicator of the overall state of the alarm system. Its advantage was that it included all time periods during the day regardless of the number of alarms, which made analysis much simpler. We assumed, and later found we were correct, that reducing the average alarm rate would reduce all other metrics as well.
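The calculation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tooling used at Soda Springs: it assumes the alarm log for one day is available as a list of timestamps, it reads "average alarm rate" as alarms per hour, and it reads "four-hour averages" as alarms per hour within each four-hour block. All function names are hypothetical.

```python
from datetime import datetime

PERIOD_MINUTES = 10
PERIODS_PER_DAY = 24 * 60 // PERIOD_MINUTES   # 144 ten-minute periods
LOADING_THRESHOLD = 10                        # 10 or more alarms in a period


def period_counts(alarm_times):
    """Count alarms in each of the 144 ten-minute periods of one day.

    `alarm_times` is assumed to hold datetimes that all fall on the same day.
    """
    counts = [0] * PERIODS_PER_DAY
    for t in alarm_times:
        counts[(t.hour * 60 + t.minute) // PERIOD_MINUTES] += 1
    return counts


def average_alarm_rate(counts):
    """Average alarms per hour over the whole day (assumed interpretation)."""
    return sum(counts) / 24


def operator_loading(counts, threshold=LOADING_THRESHOLD):
    """Number of ten-minute periods with `threshold` or more alarms."""
    return sum(1 for c in counts if c >= threshold)


def four_hour_averages(counts):
    """Alarms per hour for each of the six four-hour blocks in the day."""
    per_block = 4 * 60 // PERIOD_MINUTES      # 24 periods per block
    return [sum(counts[i:i + per_block]) / 4
            for i in range(0, PERIODS_PER_DAY, per_block)]


# Illustrative day: a burst of 12 alarms, then two smaller groups
day = ([datetime(2008, 2, 5, 0, 5)] * 12
       + [datetime(2008, 2, 5, 5, 30)] * 3
       + [datetime(2008, 2, 5, 13, 0)] * 9)
counts = period_counts(day)
print(average_alarm_rate(counts))   # 1.0
print(operator_loading(counts))     # 1
print(four_hour_averages(counts))   # [3.0, 0.75, 0.0, 2.25, 0.0, 0.0]
```

In this made-up day, only the first burst reaches the threshold of 10, so Operator Loading is 1 even though 24 alarms occurred in total, while the average alarm rate reflects every period.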

In next month’s issue, we will cover the measurement, analysis and improvement phases of the Soda Springs alarm management project.

Brent J. Thomas is a manufacturing technologist at Monsanto.

Go to ControlGlobal.com/alarms.html for some of the best articles, podcasts and white papers in our archives. Also read our article Six Sigma Alarm Management Part 2
