Drowning in Data, Starving for Information, 4

May 5, 2010
Concluding This Four-Part Series on Data Analytics with Special Guests Brian Hrankowsky, Modeling and Control Specialist, and Randy Reiss, PAT Expert. How Important Is a Better Use of Data?
By Greg McMillan and Stan Weiner

Greg McMillan and Stan Weiner bring their wits and more than 66 years of process control experience to bear on your questions, comments and problems. Write to them at [email protected].

Stan: We conclude this series by talking with Brian Hrankowsky, a specialist in modeling and control at a major pharmaceutical company, and hearing more remarks from PAT expert Randy Reiss, to get a better perspective on the opportunities and problems that come with so much available data. Brian, how important is a better use of data?

Brian: The dollars per deviation are huge in the pharmaceutical industry. We are often in a reactive mode because of lags in the availability and review of analytical data, and we often don't know much about cycle times, yields, problems or the impact of process improvements until well after a batch has been completed.

Greg: What are the problems with current tools?

Brian: The commercially available tools offered for batch were originally for continuous processes, and batch features were tacked on later. In some cases, the only feature provided is the use of a "batch running" or "look at the loop now" flag. This makes batch and batch-to-batch analysis very hard. Batch systems have many more dimensions than just batch on and batch off. Data collection systems can't handle the query loads of routine analysis, as they are optimized for storage, not retrieval. The event record formats are not easy to query, as they were designed for operators, printers and loggers, not detailed metrics analysis. Why do users need to combine several records to find out how long an alarm went unacknowledged? Analysis tools assume all necessary events are already available and can be detected in real time.
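Brian's question about unacknowledged alarms can be made concrete with a minimal sketch. It assumes a hypothetical event journal in which activation and acknowledgment are logged as separate records; the tag names, the record layout and the `unacknowledged_seconds` function are illustrative and not taken from any particular system.

```python
from datetime import datetime

# Hypothetical event records: (timestamp, alarm tag, event type).
# Real event journals differ by vendor; this layout is an assumption.
EVENTS = [
    ("2010-05-05 08:00:00", "TIC-101_HI", "ACTIVE"),
    ("2010-05-05 08:01:30", "PIC-205_LO", "ACTIVE"),
    ("2010-05-05 08:04:00", "TIC-101_HI", "ACK"),
    ("2010-05-05 08:06:00", "TIC-101_HI", "CLEAR"),
    ("2010-05-05 08:11:30", "PIC-205_LO", "ACK"),
]


def unacknowledged_seconds(events):
    """Pair each ACTIVE record with the next ACK for the same alarm."""
    pending = {}    # alarm tag -> time it went active
    durations = []  # (alarm tag, seconds unacknowledged)
    # ISO-style timestamps sort chronologically as plain strings.
    for stamp, tag, kind in sorted(events):
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "ACTIVE":
            pending.setdefault(tag, when)
        elif kind == "ACK" and tag in pending:
            elapsed = (when - pending.pop(tag)).total_seconds()
            durations.append((tag, elapsed))
    return durations


RESULT = unacknowledged_seconds(EVENTS)
for tag, seconds in RESULT:
    print(f"{tag}: unacknowledged for {seconds:.0f} s")
```

Even in this toy case, the answer exists only after two records are matched by tag and ordered in time, which is the extra work the record formats push onto the user.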

Users always come up with triggers after implementation of the analysis system, for example, determining when the loop was in control. This trigger is defined as the start of a window where the process remained within certain limits, so the trigger has to be backward time-stamped to the start of the window. Statistical tools require manual exclusion of data before analysis, so there is a lot of spreadsheet work outside the tool. Batch operations require an exceptional rangeability of utility systems, as batch volume and reaction or crystallization rates go from zero to a maximum. This throws off loop-analysis tools because the process is never stationary, and tends to trick the tools into thinking the problem is with the equipment.
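The backward time-stamping of an "in control" trigger can be sketched as follows. This is only one possible reading of the definition above: the limits, the window length, the sample data and the `in_control_trigger` function are all assumptions made for illustration.

```python
def in_control_trigger(samples, low, high, window):
    """Return the time at which the PV entered the first window of at
    least `window` time units during which it stayed within [low, high].

    samples: list of (time, value) pairs sorted by time.
    The window can only be confirmed after it has elapsed, so the
    returned time is earlier than the time of detection.
    """
    start = None
    for t, pv in samples:
        if low <= pv <= high:
            if start is None:
                start = t       # candidate start of the window
            if t - start >= window:
                return start    # confirmed now, stamped back to start
        else:
            start = None        # excursion: discard the candidate
    return None


# Hypothetical temperature samples (minutes, degC) settling at 80 degC.
SAMPLES = [(0, 70.0), (1, 78.5), (2, 81.2), (3, 79.6), (4, 80.3),
           (5, 80.1), (6, 79.9), (7, 80.0), (8, 80.1)]
TRIGGER = in_control_trigger(SAMPLES, low=79.5, high=80.5, window=5)
print(TRIGGER)
```

With these samples the window is confirmed at minute 8, but the trigger is stamped at minute 3, when the temperature first entered the limits and stayed there. A tool that can only stamp events at the moment of detection cannot record this.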

A batch has many dimensions. It's amazing how a simple change in tank level can make a loop-analysis tool useless. The workaround is treating the loop like multiple loops, which works, but costs a lot more in licensing and effort. Uncompressed data is not going to happen. The data we have is what we have, and switching to uncompressed data would push users to historize less to make it affordable. The compression requirements change with process variables and their importance, but the tools don't allow the compression to be adjusted dynamically. We don't want to have to write custom code, but the answer to most batch process-related implementation issues is to do so.

Stan: What are users trying to do?

Brian: We are working on role-based dashboards, complete data integration, a simple query analysis environment, exception-based, end-of-batch reports, automation of routine analysis tasks and exception-based notifications. We want to know at a glance how the process and equipment are performing relative to normal. We want metrics on cycle time, alarm rates, quality and assays, PV maximums and minimums, and raw material and energy use. The ultimate goal is reporting and visualization of everything that could have a significant causal relationship to, or be the root cause of, a specific deviation, yield change, impact on cycle time or change in efficiency.

Greg: What questions do various users want dashboards to answer?

Brian: For operators: "Is there something wrong that needs my attention now? What actions are coming up for me to plan for?" For production and technical support: "How are the process and equipment performing compared to normal? Where should we focus to optimize?" For quality assurance: "Were there any exceptional events on the current batch? Did we stay within spec and execute correctly? Have any deviations been filed and, if so, what is their status?" For business leaders: "Why are we down? What is our production? What did we spend? Where did the money go?"

Stan: What are the requirements for complete database integration?

Brian: Presently we have islands of data. Current analysis requires lots of data: tickets, external and internal lab results, maintenance and calibration data, alarm data, batch historian data, continuous historian data and regulatory limits data. Troubleshooting and investigation also require maintenance data, equipment specifications and materials tracking information. Users should not need to be database experts or learn different interfaces to access data. They also should be able to access the data electronically all the time. Essentially, the tools need to get to the point where users can extract information as easily as they can describe what they want in words.

Greg: What are the biggest challenges for database integration?

Brian: Maintenance is not integrated with process data. When there is a problem, the first thing a process engineer will ask is what maintenance was done on the equipment or automation system. Potentially, there are several times the number of variables to review on a daily basis with a fully integrated data system. Operations, maintenance and engineering do not have time to check every potential trend or measurement every day. To make use of the data efficiently, the exceptions have to "bubble up." Better yet, notify only when the data must be reviewed.
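One way to picture exceptions "bubbling up" is the sketch below. It assumes that "normal" for each metric is the mean of previous batches plus or minus three standard deviations; that rule, the metric names and the numbers are illustrative assumptions, not anything Brian describes using.

```python
from statistics import mean, stdev


def exceptions(history, current, n_sigma=3.0):
    """Return only the metrics whose current value falls outside
    mean +/- n_sigma standard deviations of their own history."""
    flagged = {}
    for name, past in history.items():
        mu, sigma = mean(past), stdev(past)
        value = current[name]
        if abs(value - mu) > n_sigma * sigma:
            flagged[name] = value
    return flagged


# Hypothetical end-of-batch metrics from five previous batches.
HISTORY = {
    "cycle_time_h": [10.1, 9.9, 10.0, 10.2, 9.8],
    "alarm_count": [12, 15, 11, 14, 13],
    "yield_pct": [92.0, 91.5, 92.5, 92.2, 91.8],
}
CURRENT = {"cycle_time_h": 11.5, "alarm_count": 14, "yield_pct": 91.9}

FLAGGED = exceptions(HISTORY, CURRENT)
print(FLAGGED)
```

Only the cycle time is reported here; the alarm count and yield are within their usual spread and never reach the user. The same filter can sit in front of a notification so that people are contacted only when something needs review.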

Stan: Why is there so much more data with batch operations?

Brian: We are not just interested in the "make medicine" portion of the process. For example, there are clean-in-place and sterilization operations and the need to detect leakages and blockages. Multiple transfer operations for one unit operation mean the number of events to look for varies from run to run. We use events to mark transitions in process and control windows, because S88 recipe object boundaries aren't granular enough. Different product grades and formulations are different enough to require recipe-driven tuning parameters. In essence, every phase is a process unto itself that tends to require complete analysis.

Randy: Much of what Brian is talking about with data mining is critical to a successful implementation of analytics. Misalignment of data is a common pitfall because the process data is usually in one historian and the lab analysis data is in another. Yet another database may be used to store feedstock quality that can be used as initial conditions for the analysis. Start and stop times for each stage of a batch must be defined, and the data used for modeling must be extracted from the historian with the same criteria as are used on-line. All this must be consistent across all sources of data. Any discrepancies will compromise the model and/or the on-line analysis. Proper data mining is the most underestimated part of an analytics project.
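The alignment problem Randy describes can be sketched in a few lines. The example assumes process values and lab results arrive as separate time-stamped lists, and that a lab sample should be matched to the most recent process value within a stage window; the `align_lab_to_process` function, the `max_gap` tolerance and the data are hypothetical.

```python
from bisect import bisect_right


def align_lab_to_process(process, lab, stage_start, stage_end, max_gap):
    """For each lab sample taken within [stage_start, stage_end], attach
    the most recent process value at or before the sample time, provided
    that value is no older than max_gap. Unmatched samples are dropped.

    process: list of (time, value) pairs sorted by time.
    lab:     list of (time, assay) pairs.
    """
    times = [t for t, _ in process]
    aligned = []
    for t_lab, assay in lab:
        if not (stage_start <= t_lab <= stage_end):
            continue  # sample belongs to another stage
        i = bisect_right(times, t_lab) - 1
        if i >= 0 and t_lab - times[i] <= max_gap:
            aligned.append((t_lab, process[i][1], assay))
    return aligned


# Hypothetical data, times in minutes from batch start.
PROCESS = [(0, 30.0), (10, 32.5), (20, 35.1), (30, 36.0), (40, 36.2)]
LAB = [(5, 1.2), (22, 2.8), (38, 3.5), (41, 3.9), (55, 4.1)]

ALIGNED = align_lab_to_process(PROCESS, LAB, stage_start=10,
                               stage_end=45, max_gap=5)
print(ALIGNED)
```

Three of the five lab samples are rejected: two fall outside the stage window and one has no process value recent enough. Using one function with the same window and tolerance for both model building and on-line use is one way to keep the extraction criteria identical, as Randy recommends.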

Greg: We conclude this informative series with another memorable Top 10 List from Randy.

Top 10 Auxiliary Quantitative Data Sources of "Value Add" to the Process

10. Shoe-sole height of the guy who puts the stick in tank 42 to read the level.
9. Number of clouds in the sky during the feedstock delivery.
8. Clicks it takes the ignition to fire up the burner for the boiler.
7. Diet Cokes consumed during the second break of the AM shift.
6. Questions asked at the start of shift meeting.
5. Questions asked after a shutdown.
4. Gallons of coffee consumed during a production run.
3. Dollar amount of the maintenance staff cell phone bill.
2. Remaining shopping days until Christmas.
1. Weeks since last tuning seminar.