Knowing what you're looking at is the most important thing, but a sharp focus can help you get there. Defining, organizing, prioritizing and understanding alerts and alarms is one of the most critical jobs in process control. However, perhaps because so many applications are isolated, inadequate and dangerous alarm practices can persist for years. Sometimes the problem is too few useful alarms, but more often it's floods of duplicate and unnecessary alerts that were added for the sake of compliance and convenience rather than to ensure safe operations.
The solution begins with an inventory of processes and alerts, but it continues with assessing and prioritizing alarms, incorporating more helpful database software and HMIs, and training staff to follow these improved procedures.
For instance, Vale's Copper Cliff nickel smelter in Sudbury, Ontario, runs two flash furnaces, which take in 4200 tonnes of dry solid charge (DSC) ore powder per day, and flash-smelt it into nickel matte product, sulphur gas for fuel and iron slag. However, several years ago, engineers in Vale's Ontario division started an alarm rationalization project because the two furnaces were generating more than 17,000 alarms per day on average, most of which operators couldn't understand and which didn't lead to useful action.
"Previous rationalization projects tried to reduce alarms, but always ended up adding more," says Gerry Seguin, Vale Ontario's senior automation specialist. "Then a consultant from the Engineering Equipment Materials Users Association came in for a seminar; gave a report on consequences that really hit home; and showed the alarms weren't the operators' faults because they hadn't been given the tools for their job. After that, we were able to get financing, resources and people, including bringing back two pensioned operators full-time to go over all alarm interactions."
So Vale Ontario implemented EEMUA's Publication 191, "Alarm Systems--A Guide to Design, Management and Procurement," and ISA's ISA 18 alarm management standard; evaluated and reduced alarm system loading on the two flash furnaces; and redesigned its alarm systems to assist operators. Vale also partnered with Invensys, and adopted its PAS Alarm Management Software database, which works with its existing Foxboro I/A Series process control software to download approved alarms to the application's field controllers.
As a result, alarms on Copper Cliff's two flash furnaces dropped to a daily average of just 66, which is better than EEMUA's average alarm rate benchmark for steady operation. "We also researched to ensure that operators had enough time between alarms and trips to respond effectively," adds Seguin. "Since the initial project in 2007-08, all new alarms go through an alarm manager to make sure they meet the criteria we've laid out, which is like a risk assessment chart that evaluates likelihood, criticality and potential damage, and sets priority levels."
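To put those figures in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function and constant names are hypothetical, and the benchmark of one alarm per 10 minutes is an assumption based on commonly cited EEMUA 191 guidance for steady operation, not a number given by Vale. That guidance is usually stated per operator, and the article doesn't say how many operators share the two furnaces' alarms, so the comparison is rough.

```python
MINUTES_PER_DAY = 24 * 60

# ASSUMPTION: commonly cited EEMUA 191 steady-state target, not a figure from Vale.
ASSUMED_BENCHMARK_PER_10_MIN = 1.0


def alarms_per_10_min(alarms_per_day: float) -> float:
    """Convert a daily alarm count into an average rate per 10-minute window."""
    return alarms_per_day / (MINUTES_PER_DAY / 10)


def minutes_between_alarms(alarms_per_day: float) -> float:
    """Average spacing between alarms, in minutes."""
    return MINUTES_PER_DAY / alarms_per_day


# Daily averages reported for the two flash furnaces.
before, after = 17_000, 66

reduction_pct = 100 * (before - after) / before

print(f"Before: {alarms_per_10_min(before):.1f} alarms per 10 min")
print(f"After:  {alarms_per_10_min(after):.2f} alarms per 10 min")
print(f"Average gap after: {minutes_between_alarms(after):.1f} min")
print(f"Reduction: {reduction_pct:.1f}%")
print("Within assumed benchmark:",
      alarms_per_10_min(after) <= ASSUMED_BENCHMARK_PER_10_MIN)
```

Under these assumptions, 66 alarms per day works out to roughly one alarm every 22 minutes, compared with about 118 alarms every 10 minutes before the project.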
Seguin reports that Copper Cliff is continuing to improve its alarm and HMI philosophy and rationalization process and is applying it to other areas of the smelter, including a new replacement nickel converting unit it will commission this September.
Pull Pointer Out of Panel
Similarly, DTE Energy's Greenwood Energy Center (GWEC) in Avoca, Mich., has been upgrading its DCS from a 1990s-era Westinghouse Distributed Processing Facility (WDPF) system and lightbox displays to ABB's Process Portal A (PPA) 800xA DCS, but first decided to rationalize and reorganize several thousand alarms and related devices and software.
"Previously, many of our DCS alarms weren't rationalized. They were just characterized as high or high worse, and so we had a lot of noise and chattering alarms that were only fixed occasionally," says Kip Dobel, senior engineer in GWEC's Engineering Support Organization. "In fact, operators determined unit status by the different volumes of alarms and not actual alarms. Some also found interesting ways to silence nuisance alarms, such as jamming a pointer in their board. Pages of alarms would scroll by on a unit trip, making it very easy for an important alarm to get lost in all the noise."
Figure 2: Graphite operator interface panels have cast-aluminum construction and full-color touchscreens, and combine a range of plug-in modules with protocol conversion, data logging and web-based monitoring and control.
Photo courtesy of Red Lion
Located 60 miles north of Detroit, GWEC is an 800-megawatt "peaker" oil and gas plant, which helps power the grid when demand is high. This means it ramps its electrical production up and down more than other plants.
To focus on important alarms and eliminate chatter, Dobel reports that GWEC began migrating from WDPF to PPA 800xA in 2010, and installed Matrikon's Process Guard software to help with post-event unit trip analyses. PPA 800xA included customer libraries, seven operator consoles, three engineering work packages (EWPs), domain and 800xA controllers, and AlarmInsight operator assistance software for 800xA, which grew out of collaboration by ABB and Matrikon. GWEC also runs OSIsoft's PI historian software to document high-priority alarms and check on operating devices.
Document and Bring in Veterans
"We had about 6000 analog I/O points, and this meant dealing with about 10,000 decisions just to rationalize alarms and alerts from our analog signals," explains Dobel. "We wanted to give our operators an alarm system that would provide timely, accurate information to assist in operating the powerhouse in a controlled manner; employ Matrikon's Alarm Manager management of change (MOC) software to handle the rationalization; set up and execute an alarm rationalization scheme following EEMUA 191's principles; and provide rationalization data to the operators' consoles."
Dage adds, "Previously, we were putting Band-Aids on the bleeders in our alarm system, but we weren't completing the documentation needed. This was the first time we did full documentation."
Dobel adds that GWEC also hired a senior software engineer from Matrikon to help get its three-month, $250,000 alarm rationalization project started; hired some retired operators to help; and set up a dedicated Alarm Rationalization Room to present component data, trace alarm profiles and hash out the most logical and efficient ways to reorganize and reassign them.
"If you haven't done a rationalization project before, we recommend you hire an external expert to help, but make sure you check their credentials and bring them in for a trial week to see if they can do what you need to get done," adds Dobel. "Also, using retired operators was helpful because they walked down the system and traced many devices and alarms, but it was also a mistake because we needed to get more of our current operators involved too. A good rationalization team should have a panel board operator, a DCS expert that knows all the logic blocks in the systems and how they relate to the alarms, instrument technicians, and a scribe to keep everyone on track.
"The rationalization room got everyone on the same page. A lot of tribal knowledge had built up in our processes, and we needed to eliminate it by standardizing on common best practices. So, it helped to talk about what was bringing us to certain alarm situations."
The rationalization team started with GWEC's I/O tag list and the plant's piping and instrumentation diagrams (P&IDs). "We found we could sort the tag database however we wanted, but we learned it was better to take the P&IDs and rationalize the whole system," says Dobel. "You have to ask questions like, 'What flow do I need here?' or 'What level do I need here?' The aim is to avoid unnecessary double alarms, but it can take a long time to do them--sometimes three or four hours to reach consensus on one alarm. You have to get your subject matter experts (SMEs) for the process on speed dial."
Help Operators Do Their Jobs
GWEC also used AlarmInsight software to present alarm profiles and operating data to its operators in a more concise and less text-heavy format. "Besides delivering important alarms, we tried to give our operators assistance beyond the routine and obvious tasks, and help them with things they might not think about at 3 a.m. So we focused more on rationalizing some of these unusual events," says Dage. "For instance, we found we rarely used Level 2 and Level 3 alarms, so we began to discuss the reasons why, and document our alarm philosophy."
Likewise, to optimize its own alarm system, Compañía Mega recently held joint workshops with Honeywell Process Solutions, and implemented its Alarm Configuration Manager (ACM) software. ACM was integrated with the plant's existing process control system, Honeywell's TotalPlant Solution (TPS), and implemented recommendations from EEMUA 191. Mega is a joint venture between the Dow Chemical Co. and Brasoil Alliance Co., which provides hydrocarbon feedstocks to Dow's Bahia Blanca site in Argentina and has two gas plants linked by a 600-km pipeline.
In short, Compañía Mega and Honeywell used ACM and EEMUA's guidelines to:
- Standardize the alarm process by defining a plant alarm policy, so all staff operate with the same quality of alarms.
- Dramatically reduce the number of alarm activations requiring operator intervention.
- Provide peace of mind to operators by not overwhelming them with unnecessary alarms.
- Improve the response time of operators to verify incoming alarms and make decisions when an alarm is activated.
- Reduce human error in the management of alarms, avoiding unnecessary production stops, equipment failure, vents, etc.
This concern about alarm rationalization is gathering steam. Jason Wright, market manager for PlantPAx software at Rockwell Automation, says, "We're seeing a mind shift, especially in the process industries, about alarms and HMIs. Historically, alarms were driven just by knowing the process and following the critical elements. HMI layouts followed P&IDs. Today, HMIs are driven by a greater appreciation for human-factors engineering and the best ways to convey information to the operators, so they can respond appropriately.
"Using ISA 18.2 as a guide, we've expanded the Alarm State Model in our PlantPAx 3.0 software to include three distinct alarm suppression states," adds Wright. "The previous two were 'suppression by design' and 'disabled or out of service,' and now we've added a third for granularity called 'shelving,' which allows alarm suppression with an automatic timeout."
Seeing More Clearly
HMIs also can give operators a better understanding of alarms and performance changes, and a better chance to respond to them.
For example, system integrator One-Step Automation in Niverville, Manitoba, builds automation systems for grain handling and processing, and its users want HMIs they can use anywhere to get real-time feedback on bin levels, motor failures and alarms, and surge hopper levels, and to control shutdown processes.
"Much of the equipment in seed processing facilities is driven by variable-frequency drives (VFDs)," says Arlin Friesen, One-Step's automation specialist. "Clients want to adjust motor speeds based on the quality of product they see coming off the processing equipment, and they want to monitor product quality with live camera feeds."
Consequently, Friesen used Opto 22's groov platform, which works in a web browser without plug-ins, to build One-Step's own operator interface that can be used on PCs, tablet PCs and smartphones. "This allows our users to control VFD speeds using groov's adjustable buttons or sliders, while the interface also displays live product flows via IP cameras on their equipment," adds Friesen.
While groov doesn't have direct alarm capabilities yet, users can add whatever alerts or alarms they want when building their interfaces. Also, groov and its optimized displays (Figure 1) are based on HMI best practices for building screens with prioritized data, minimal graphics and muted colors.
On the HMI hardware side, Red Lion Controls reports its new Graphite operator interface panels include cast-aluminum construction and full-color touchscreens, and combine a range of plug-in modules with protocol conversion, data logging and web-based monitoring and control. The plug-ins reduce development and commissioning time compared to traditional systems, which typically use an HMI paired with separate I/O, PLCs and other controllers, and require more programming and configuration (Figure 2).
Management and More Documentation
Back at Greenwood Energy Center, Dobel explains that management buy-in and commitment were crucial to GWEC's alarm rationalization project, not just for funding and resources, but to give the team the authority to make rationalization decisions and require operators to stick to them--even though there are some exceptions. "For example, if your facility had a historical event, has to meet an EPA requirement or must carry out a particular management requirement, then these just have to be done," adds Dobel.
In all, Dobel, Dage and their colleagues spent about eight hours a day for three months hashing out alarms. "That was too much, so we'd recommend a schedule of doing rationalization in the morning and then gathering information in the afternoon," says Dobel. "So far, we're done with rationalizing alarms for 80% to 85% of our I/O components, and we're also still working on the logic for our smart alarms and maintaining the existing alarm system. And we're still meeting once a week to do more rationalization. In fact, on his own, John triaged that last 20% of our alarms and made them Priority 4, so operators can assign priority levels later."
"Alarm rationalization includes many different devices, but the basic questions for each are always the same: 'Do the operators need to know about this alarm?' and 'What are the consequences?' adds Dage.
Dobel adds, "After documenting your alarm rationalizations, it's also important to be consistent with the rationalization rules you come up with, and as you build those rules, you need to document them too. And get started doing alarm rationalization now. Don't wait for an incident or accident."
Besides continuing its rationalization efforts, GWEC and the team are doing more continuous improvement, and have set up another whiteboard to aid communications between operators, IT and other players on items to work on. For example, it lists the Top 20 alarms each week and addresses the bad actors behind them, which has already reduced the number and severity of these alarms.
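A weekly Top 20 list like this can be produced from an alarm journal with very little code. The following Python sketch assumes the journal has already been exported as (tag, timestamp) pairs; the function name, tag names and sample data are made up for illustration, and it isn't GWEC's actual tooling.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_bad_actors(
    events: Iterable[Tuple[str, float]], n: int = 20
) -> List[Tuple[str, int, float]]:
    """Return the n most frequent alarm tags as (tag, count, share_of_total)."""
    counts = Counter(tag for tag, _timestamp in events)
    total = sum(counts.values())
    return [(tag, count, count / total) for tag, count in counts.most_common(n)]


# Made-up one-week journal: (tag, seconds since start of week).
sample_events = [
    ("LT-101-HI", 10.0), ("LT-101-HI", 75.0), ("LT-101-HI", 140.0),
    ("PT-7-HI", 200.0), ("PT-7-HI", 3600.0),
    ("FT-3-LO", 7200.0),
]

for tag, count, share in top_bad_actors(sample_events, n=2):
    print(f"{tag}: {count} activations ({share:.0%} of all alarms)")
```

Reporting each tag's share of the total alongside its count makes it easier to see when a handful of bad actors account for most of the alarm load, which is usually where the weekly effort pays off first.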
Figure 1: The vertical, analog-style indicators and sliders in Opto 22's groov platform allow users to create interfaces that their operators can understand and respond to more quickly.
Photo courtesy of Opto 22