
DTE Energy Tackles Unruly Alarms

Rationalization Process Restores Meaning--One Alarm at a Time

By Jim Montague

ABB Automation & Power World 2013

In the course of an ongoing control system upgrade at its Greenwood Energy Center (GWEC) in Avoca, Mich., DTE Energy engineers also undertook to rationalize and reorganize a decades-old, dysfunctional approach to process alarms. "Previously, many of our DCS alarms were not rationalized," explained Kip Dobel, senior engineer in GWEC's Engineering Support Organization. "They were just characterized as high or high worse, and so we had a lot of noise and chattering alarms that were only fixed occasionally. In fact, operators determined unit status by the volume of alarms and not the actual alarms. Pages of alarms would scroll by on a unit trip, making it very easy for an important alarm to get lost in all the noise."

The company has been working to upgrade its distributed control system (DCS) from a 1990s-era Westinghouse WDPF system and lightbox displays to ABB's Process Portal A (PPA) 800xA DCS, but first decided it was essential to rationalize and reorganize the unit's approach to alarms. Dobel and colleague John Dage, DTE's principal engineer, presented "Alarm Rationalization Process at Greenwood Energy Center" this week at ABB Automation and Power World in Orlando, Fla. Located about 60 miles north of Detroit, GWEC is an 800-MW oil and gas plant, which serves as a "peaker" to deliver power to the grid when demand is especially high. This means the facility ramps its electrical production up and down more than other plants.

To focus on important alarms and eliminate the chatter, Dobel reported that GWEC began efforts to migrate to PPA 800xA in 2010 and also installed Matrikon's Process Guard software to help with post-event trip analysis. PPA 800xA included customer libraries, seven operator consoles, three engineering work packages (EWPs), domain and 800xA controllers, and AlarmInsight operator assistance software for 800xA, which grew out of collaboration by ABB and Matrikon. GWEC also runs OSIsoft's PI historian software to further document high-priority alarms and check on operating devices.

"We had about 6,000 analog I/O points, and so this meant dealing with about 10,000 decisions just to rationalize alarms and alerts from our analog signals," explained Dobel. "We wanted to give our operators an alarm system that would provide timely, accurate information to assist in operating the powerhouse in a controlled manner; employ Matrikon's Alarm Manager management of change (MOC) software to handle the rationalization; set up and execute an alarm rationalization scheme following EEMUA 191's principles; and provide the rationalization data to the operators' consoles."

Dage added that, "Previously, we were putting Band-Aids on the bleeders in our alarm system, but we weren't completing the documentation needed. This was the first time we did full documentation."

GWEC also hired a senior software engineer from Matrikon to help get its three-month, $250,000 alarm rationalization project started; hired some retired operators from outside the plant to help; and even set up a dedicated Alarm Rationalization Room with whiteboard and projector to present component data, trace alarm profiles, and facilitate discussing and hashing out the most logical and efficient ways to reorganize and reassign them.

"If you haven't done a rationalization project before, we recommend that you hire an external expert to help, but make sure you check their credentials and bring them in for a trial week to see if they can do what you need to get done," said Dobel. "Also, using retired operators from outside was helpful because they walked down the system and traced many devices and alarms, but it was also a mistake because we really needed to get more of our current operators involved too. A good rationalization team should have a panel board operator, a DCS expert who knows all the logic blocks in the systems and how they relate to the alarm system, instrument technicians and a scribe to keep everyone on track."

"The rationalization room was needed to help get everyone on the same page. A lot of individualized tribal knowledge has built up in our plant and processes, and we needed to standardize on some common best practices. So it really helped to talk about what was bringing us to certain alarm situations."

The alarm rationalization team started with GWEC's I/O tag list and the plant's piping and instrumentation diagrams (P&IDs). "We found that we could sort the tag database however we wanted, but we learned it was better to take the P&IDs and rationalize the whole system. You have to ask questions like, 'What flow do I need here?' or 'What level do I need here?' The aim is to avoid unnecessary double alarms, but it can take a long time, sometimes three or four hours to reach consensus on one alarm. You have to get your subject matter experts for the process on speed dial."

Once specific alarms, their I/O points and other components, as well as their operating profiles and information were gathered, each was documented and added to an overall matrix that organized them according to type and style of alarm, severity and consequences of each.
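A matrix of this kind can be pictured as a small data structure. The Python sketch below is illustrative only and is not GWEC's or Matrikon's implementation: the `AlarmRecord` fields, the tag names, the consequence text and the 1-to-4 severity scale are all hypothetical, chosen to mirror the attributes the article names (type of alarm, severity and consequences).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmRecord:
    """One rationalized alarm. All field names are hypothetical."""
    tag: str          # I/O point the alarm is attached to
    alarm_type: str   # e.g. "flow", "level"
    severity: int     # assumed scale: 1 = most severe, 4 = least
    consequence: str  # what happens if the operator does nothing


def build_matrix(records):
    """Group records by alarm type, most severe first within each type."""
    matrix = defaultdict(list)
    for record in records:
        matrix[record.alarm_type].append(record)
    for rows in matrix.values():
        rows.sort(key=lambda r: (r.severity, r.tag))
    return dict(matrix)


# Made-up sample data for illustration.
records = [
    AlarmRecord("FT-101", "flow", 2, "Pump may run dry"),
    AlarmRecord("LT-205", "level", 1, "Unit trip on low drum level"),
    AlarmRecord("FT-102", "flow", 1, "Loss of cooling flow"),
]

matrix = build_matrix(records)
for alarm_type in sorted(matrix):
    for r in matrix[alarm_type]:
        print(f"{alarm_type:6} sev={r.severity} {r.tag}: {r.consequence}")
```

Keeping each alarm as one record makes it straightforward to re-sort the same data by type, severity or tag without re-entering anything.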

GWEC also used AlarmInsight software to present alarm profiles and operating data to its operators in a more concise and less text-heavy format. "Besides delivering important alarms, we tried to give our operators assistance beyond the routine and obvious tasks, and help them with things they might not think about at 3 a.m. So, we spent more time rationalizing some of these unusual events." 

Dage added, "For instance, we found that we rarely used Level 2 and Level 3 alarms, and so we began to discuss the reasons why and document our alarm philosophy."

Dobel explained that management buy-in and commitment were also crucial to GWEC's alarm rationalization project, not just for funding and resources, but to give the team the authority to make rationalization decisions and require that the plant and its operators stick to them, even though there are always some exceptions. "For example, if your facility had a historical event, has to meet a specific EPA requirement or must carry out a particular management requirement, then these just have to be done," added Dobel.

In all, Dobel, Dage and their colleagues spent about eight hours a day for three months working out of the rationalization room and hashing out alarms. "That was too much. We'd recommend a schedule of doing rationalization in the morning and then gathering information in the afternoon," said Dobel. "So far, we're done with rationalizing alarms for 80% to 85% of our hard I/O components and now are working on the logic for our smart alarms and the existing alarm system. We're still meeting once a week for two to four hours to do more rationalization. In fact, on his own, John triaged that last 20% of our alarms, and made them Priority 4, so the operators can assign them priority levels later."

"Alarm rationalization includes many different devices, but the basic questions for each are always the same: 'Do the operators need to know about this alarm?' and 'What are the consequences?'" added Dage.

Dobel added, "After documenting your alarm rationalizations, it's also important to be consistent with the rationalization rules you come up with, and as you build those rules, you need to document them too. Get started doing alarm rationalization now. Don't wait for an incident or accident."

Besides continuing its rationalization efforts, GWEC and the team are doing more continuous improvement and have set up another whiteboard to aid communications among operators, IT and other players. For example, it lists the top 20 alarms each week and the bad actors behind them, which has already reduced the number and severity of those alarms.
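A weekly "top 20" list like this is essentially a frequency count over the alarm log. The Python sketch below shows one way to compute it, assuming a log of `(tag, timestamp)` pairs; the function name `top_bad_actors`, the log format and the sample tags are hypothetical and do not reflect how GWEC's tools actually produce the list.

```python
from collections import Counter
from datetime import datetime, timedelta


def top_bad_actors(events, week_start, n=20):
    """Return the n most frequent alarm tags in the 7 days from week_start.

    events is an iterable of (tag, timestamp) pairs (assumed log format).
    Ties are broken alphabetically by tag so the result is deterministic.
    """
    week_end = week_start + timedelta(days=7)
    counts = Counter(tag for tag, ts in events if week_start <= ts < week_end)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Made-up sample log for illustration.
events = [
    ("LT-205", datetime(2013, 3, 4, 2, 15)),
    ("FT-101", datetime(2013, 3, 4, 2, 16)),
    ("LT-205", datetime(2013, 3, 5, 14, 0)),
    ("LT-205", datetime(2013, 3, 6, 9, 30)),
    ("FT-101", datetime(2013, 3, 8, 11, 45)),
    ("PT-330", datetime(2013, 3, 12, 8, 0)),  # falls in the following week
]

top = top_bad_actors(events, datetime(2013, 3, 4), n=2)
for tag, count in top:
    print(f"{tag}: {count} alarms")
```

Comparing the list from one week to the next shows whether a fix to a bad actor actually reduced its alarm count.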