To Hell and Back

How to Rebuild a Control Room in Five Days. Lessons in Disaster Recovery and Emergency Response


By Jon DiPietro

Between February 7 and March 14, 2009, more than 400 bushfires across the state of Victoria, Australia, scorched over a million acres of land, killing 173 people and injuring 414. Engineers at Goulburn Valley Water (GVW), provider of urban water and wastewater services to 54 towns and cities on the outskirts of Melbourne, watched as their telemetry system from the Kilmore Dissolved Air Filtration plant reported an ambient control room temperature of 142 °F before going silent on the afternoon of Saturday, February 7. A site visit the following day revealed that while the treatment plant had survived the fire, its control room was completely incinerated, destroying the electrical switchgear, plant HMI, laboratory, instrumentation and chemical dosing systems. With only five days' worth of water stored, an emergency response plan to rebuild the control room and recommission the plant went into action.

What followed over the next five days is a case study in how proper backup and change management procedures, strong vendor relationships and dedicated, cross-trained employees can pull off near-miracles. Armed with a full set of accurate as-built drawings, up-to-date PLC programs and HMI computer backups, GVW was able to assemble a portable control room in the parking lot of its operations center, deliver it to the site, connect thousands of control points, commission the plant and resume delivery of 10 ML per day of water to the towns of Kilmore, Wandong and Heathcote Junction, all within those five days.


"If you want to make an apple pie from scratch, you must first create the universe."
- Carl Sagan

In the months and, indeed, years leading up to the fire in Kilmore, GVW had built a foundation that would give it this chance at success.  There were three key areas of preparation – documentation, human resources, and relationships – and lacking any one of them would have spelled failure.

Documentation. One of the most obvious and crucial requirements is documentation, and GVW was very well prepared in this respect. Because it employed strict change management controls, the IT department had current backups of the PLC programs and HMI systems that were running at the plant at the time of the fire. Just as important, it had accurate "as built" drawings of the plant stored off site.
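One way to confirm that a backup really is current is to compare file digests between the last stored backup and a fresh export. The following is a minimal sketch of that idea, not a description of GVW's actual tooling; the function names and the directory layout are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def build_manifest(backup_dir):
    """Map each file under backup_dir to the SHA-256 digest of its contents."""
    root = Path(backup_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            manifest[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def changed_files(previous, current):
    """Return the names added, removed, or modified between two manifests."""
    names = set(previous) | set(current)
    return sorted(n for n in names if previous.get(n) != current.get(n))
```

An empty result from `changed_files` suggests the stored backup still matches what was exported; any names it returns are candidates for a new backup and a change-management record.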

However, one aspect in which it was not as well prepared was the settings on components such as variable-speed drives, loop controllers and flow valves.  As the saying goes, sometimes it's better to be lucky than good, and GVW had its fair share of luck.  In one case, the settings for a critical flow meter had not been stored, but after extricating it from the debris and powering it up, engineers breathed a tremendous sigh of relief to discover that it was still in working condition, and they were able to record all the settings.
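The gap described here, field devices whose settings were never recorded, is the kind of thing a simple periodic audit can surface. The sketch below assumes a hypothetical inventory of device tags and a dictionary of recorded settings per tag; the tags and parameter names are invented for illustration and do not come from the Kilmore plant.

```python
def devices_missing_settings(inventory, snapshots):
    """Return the device tags that have no recorded settings snapshot.

    inventory: iterable of device tags expected at the site.
    snapshots: dict mapping a device tag to a dict of its recorded settings.
    A device with an empty snapshot is treated as missing.
    """
    return sorted(tag for tag in inventory if not snapshots.get(tag))


# Hypothetical tags for a variable-speed drive, a loop controller and a flow meter.
inventory = ["VSD-01", "LC-07", "FT-101"]
snapshots = {
    "VSD-01": {"ramp_time_s": 10, "max_hz": 50},
    "LC-07": {},  # entry exists, but no parameters were ever recorded
}

missing = devices_missing_settings(inventory, snapshots)
```

With the sample data above, `missing` lists the loop controller and the flow meter, the two devices whose settings would have to be recovered from the hardware itself after a loss.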

Human Resources. During an emergency response, documents, plans and procedures will only get you so far.  Humans must do the work.  GVW had a good internal team with a wide range of skills, including an electrician and an instrumentation technician with over 30 years of combined experience at the company.  The IT department had made the decision years ago to "embed" some of its personnel with computer and networking experience in the operations group.  The result was a diverse team with an important balance of skill and experience that proved crucial in executing the response plan. 

As GVW's IT Manager Noel Squires describes, "A number of years back, we had a couple of near-misses with losing ladder logic, which was at a time when looking after PLCs and process controls was really considered operations and had nothing to do with IT. But we pretty quickly realized that it's got a lot to do with IT, and that disciplines we use in IT routinely, such as change management and backups, applied equally to process control. So we formed the Operations IT group, and that was a little different in that we were taking these former operations people and putting them into the IT section."

Relationships. It's a cliché to say that teamwork is critical in these situations, but ideas become clichés for a reason: they are usually true. Having said that, there were three specific relationships that had to hold for this emergency response to succeed as it did. The first was the horizontal relationship between disciplines. The team members with electrical, SCADA, PLC, construction and IT skills had already known and worked with one another for some time. The second was the vertical relationship between the response team and management. While the response team was busy planning and executing, management was fetching coffee, arranging meals, filling out paperwork, providing accommodation and otherwise lending support and blocking distractions as much as possible. Finally, the external relationships with vendors were equally vital. The control room fire occurred on a Saturday night, which necessitated a Sunday response. Many of the vendors abandoned family barbecues in order to open warehouses and drive to Melbourne so that parts would be on GVW's doorstep first thing Monday morning. In some instances, parts were delivered without purchase orders under a "gentleman's agreement" to simply settle up accounts later. That kind of response is possible only where solid customer-vendor relationships already exist.
