Voices: McMillan & Weiner

Drowning in Data, Starving for Information-2

McMillan and Weiner Tackle a Big Question: What Is Data Analytics?


By Greg McMillan and Stan Weiner

Greg McMillan and Stan Weiner bring their wits and more than 66 years of process control experience to bear on your questions, comments, and problems. Write to them at controltalk@putman.net.

Stan: We are fortunate to be able to interview Randy Reiss, who helped develop a data analytics system for batch processes for the U.S. Food and Drug Administration's (FDA) Process Analytical Technology (PAT) initiative.

Greg: What is data analytics?

Randy: It is the use of multivariate statistics for monitoring an industrial process. The most commonly used methods are principal component analysis (PCA) and partial least squares (PLS). Both reduce the dimension of the data by abstracting it based on predominant correlations. In doing so, the major correlations of the data are highlighted, and the effect of data redundancy is reduced. PCA is used for early fault detection and employs two statistics to indicate whether the process is within control limits: Hotelling's T^2 and an error statistic called Q, SPE or DMod-X. T^2 is a comparison of the relationships of the measurements. For example, the model may show that pressure and temperature are highly correlated; thus, the T^2 statistic will indicate how well the on-line batch matches that modeled relationship. The error statistic is a reading of how far the measurement is from the model. The two statistics are complementary in that, if a deviation does not show up in the model (T^2), then it shows up in the error. In use, the operator would monitor the PCA statistics to make sure the process is within a control limit. If the control limit is exceeded, then a drill-down contribution can be used to investigate which process tags are causing the statistics to report a deviation.
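To make the two PCA statistics concrete, here is a minimal sketch using NumPy on synthetic temperature and pressure data. It is not the system Randy describes: the function names `fit_pca` and `monitor`, the data and the single-component model are all assumptions chosen for illustration, and control limits are left out.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA on training data X (rows = observations, columns = process tags)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std                      # autoscale each tag
    # SVD of the scaled data: rows of Vt are the loading vectors
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T                   # loadings, shape (tags, components)
    score_var = s[:n_components] ** 2 / (X.shape[0] - 1)  # variance of each score
    return {"mean": mean, "std": std, "P": P, "score_var": score_var}

def monitor(model, x):
    """Return T^2, the error statistic (SPE) and per-tag SPE contributions."""
    z = (x - model["mean"]) / model["std"]
    t = z @ model["P"]                        # scores: position inside the model
    t2 = float(np.sum(t ** 2 / model["score_var"]))
    residual = z - t @ model["P"].T           # part of z the model cannot explain
    spe = float(residual @ residual)
    return t2, spe, residual ** 2

# Synthetic training data (hypothetical): pressure tracks temperature closely
rng = np.random.default_rng(0)
temperature = rng.normal(100.0, 5.0, size=200)
pressure = 2.0 * temperature + rng.normal(0.0, 1.0, size=200)
model = fit_pca(np.column_stack([temperature, pressure]), n_components=1)

normal = np.array([105.0, 210.0])   # follows the modeled relationship
broken = np.array([105.0, 190.0])   # pressure no longer tracks temperature
t2_n, spe_n, contrib_n = monitor(model, normal)
t2_b, spe_b, contrib_b = monitor(model, broken)
print(f"normal: T2={t2_n:.3f} SPE={spe_n:.5f}")
print(f"broken: T2={t2_b:.3f} SPE={spe_b:.5f}")
```

In this synthetic case the second observation breaks the temperature-pressure relationship, so the deviation should appear mainly in the error statistic rather than in T^2. The squared residuals returned per tag are one simple form of the drill-down contribution.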

PLS is used to predict end-of-batch quality. The statistic employed is the prediction and an associated confidence interval. The prediction is the calculated value of a real quality parameter at the end of the batch. For example, a complicated lab analysis that could take hours may be required to determine the acid content of a product. PLS can be used to model that QA parameter and allow the operator to move the batch to the next processing stage based on the predicted value. Likewise, the confidence interval is useful for identifying critical periods of the process and their effect on product quality.
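A PLS prediction of a single quality value can be sketched with the textbook NIPALS algorithm for one response (often called PLS1). The function name `pls1_fit`, the synthetic data and the choice of three components are assumptions for illustration. The sketch covers only the point prediction, not the confidence interval.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; return regression coefficients and intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean           # work on centered data
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                         # direction of maximum covariance with y
        w = w / np.linalg.norm(w)
        t = Xk @ w                            # scores for this component
        tt = t @ t
        p = Xk.T @ t / tt                     # X loadings
        qk = yk @ t / tt                      # y loading
        Xk = Xk - np.outer(t, p)              # deflate X
        yk = yk - qk * t                      # deflate y
        W.append(w); P.append(p); q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)    # coefficients in original X space
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Synthetic example (hypothetical): 30 batches, 3 inputs, noise-free quality value
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, 2.0, 3.0]) + 5.0
coef, intercept = pls1_fit(X, y, n_components=3)
y_hat = X @ coef + intercept                  # predicted end-of-batch quality
```

With as many components as inputs and noise-free data, the fit reproduces the underlying linear relation. On real batch data, fewer components would normally be chosen by validating against batches held out of the training set.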

Stan: How do you address the 3D aspect of batch profiles?

Randy: Batch data from a set of measurements can be thought of as a stack of matrices aligned by measurement and time slice. Although 3D analysis methods are available (for example, parallel factor analysis), it is more common in industrial analytics to use 2D analysis algorithms, such as PCA and PLS.

To get the 3D data into a meaningful 2D form, the data is unfolded. Unfolding is a matter of taking slices of the 3D data and placing them side by side to form a 2D matrix. The method of unfolding affects how the analysis deciphers the relationships in the data.

Since a batch is composed of a time series of data from a set of measurements, it makes sense to analyze the data across batches, but we want to retain the information per sensor and per time slice. That method is called batch-wise unfolding and is the preferred method for process analytics.

Variable-wise unfolding retains the data per variable, but analyzes it across the batches and across the entire batch time. The result is a model that tries to represent the entire batch with a single set of linear relations. That is like using a line to characterize a curve. It may be somewhat correct part of the time, but never really fits the curve. Because batch-wise unfolding retains the information per time slice of the model, it creates a model with a finer granularity that better characterizes a non-linear process. However, the non-linear behavior of a batch-wise unfolding can be over-fitted, whereas a variable-wise unfolding model cannot.
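The two unfoldings can be shown with a small NumPy array. The axis order (batch, time slice, variable) and the array sizes are assumptions for illustration.

```python
import numpy as np

n_batches, n_times, n_vars = 3, 4, 2
# Synthetic stack: data[i, k, j] is batch i, time slice k, variable j
data = np.arange(n_batches * n_times * n_vars, dtype=float)
data = data.reshape(n_batches, n_times, n_vars)

# Batch-wise: one row per batch, one column per (time slice, variable) pair
batch_wise = data.reshape(n_batches, n_times * n_vars)

# Variable-wise: one row per (batch, time slice) pair, one column per variable
variable_wise = data.reshape(n_batches * n_times, n_vars)

print(batch_wise.shape)     # (3, 8)
print(variable_wise.shape)  # (12, 2)
```

In the batch-wise matrix every column is a specific variable at a specific time slice, so a model can weight each time slice differently. In the variable-wise matrix each variable has a single column for the whole batch, which is why one set of linear relations has to cover the entire trajectory.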

Some call hybrid unfolding a third method, but it is actually just variable-wise unfolding with local scaling; that is, instead of scaling each variable from all the time slices of the model, hybrid unfolding only scales the values from the same time slice. It's an improvement over variable-wise unfolding in that sense, but the overall structure and consequences of variable-wise unfolding still remain.
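The difference between global and local scaling can be sketched as follows. The function names, the ramp-shaped synthetic trajectory and the axis order (batch, time slice, variable) are assumptions for illustration.

```python
import numpy as np

def scale_variable_wise(data):
    """Scale each variable with one mean/std over all batches and time slices."""
    n_batches, n_times, n_vars = data.shape
    flat = data.reshape(n_batches * n_times, n_vars)
    return (flat - flat.mean(axis=0)) / flat.std(axis=0, ddof=1)

def scale_hybrid(data):
    """Scale each variable per time slice (across batches), then unfold variable-wise."""
    n_batches, n_times, n_vars = data.shape
    mean = data.mean(axis=0)                  # shape (n_times, n_vars)
    std = data.std(axis=0, ddof=1)
    return ((data - mean) / std).reshape(n_batches * n_times, n_vars)

# Synthetic data (hypothetical): 6 batches follow a common ramp over 5 time slices
rng = np.random.default_rng(2)
trajectory = np.linspace(20.0, 80.0, 5)[None, :, None]
data = trajectory + rng.normal(0.0, 1.0, size=(6, 5, 2))

global_scaled = scale_variable_wise(data)
hybrid_scaled = scale_hybrid(data)
```

Local scaling removes the average trajectory at each time slice, whereas global scaling leaves the ramp in the data. Both results have the same variable-wise shape, which is the point made above: the structure of the unfolding does not change.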

Greg: How do you deal with variable batch lengths?

Randy: The analysis requires that each batch be of the same length, which is not the reality of production. Methods are therefore required to manipulate the data to a uniform length, such as accordion stretch/shrink, simple truncation of data, the use of an indicator variable, major event synchronization or dynamic time warping (DTW).
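The simplest of these methods can be sketched in a few lines. The example below interprets accordion stretch/shrink as uniform linear resampling of each batch to a common number of samples. That interpretation, the function name `resample_to_length` and the sample data are assumptions.

```python
import numpy as np

def resample_to_length(series, target_len):
    """Linearly stretch or shrink a 1-D series to target_len samples."""
    series = np.asarray(series, dtype=float)
    old_pos = np.linspace(0.0, 1.0, num=len(series))   # original sample positions
    new_pos = np.linspace(0.0, 1.0, num=target_len)    # positions on the common grid
    return np.interp(new_pos, old_pos, series)

short_batch = [0.0, 1.0, 2.0, 3.0, 4.0]                      # 5 samples
long_batch = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]   # 9 samples
aligned = [resample_to_length(b, 7) for b in (short_batch, long_batch)]
```

Uniform resampling treats the whole batch as equally stretched. It does not line up individual events inside the batch, which is where event synchronization and DTW come in.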

DTW works very well for model building. A dynamic optimization adjusts the length of each batch to the best fit with the least change in data. The results retain the features of all the batches, and do a good job of matching them in time. The benefit to the analysis is a much better batch profile to develop statistics. However, DTW is a time-consuming algorithm, easily accounting for 40% of the time to develop a model. When performing analytics on-line, a more complex problem presents itself: properly aligning the on-line batch with the correct time slice of the model for analysis. The major difficulties of on-line synchronization are the lack of complete data because the batch is in progress, and the time constraints of real-time processing of the analysis.
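For readers unfamiliar with DTW, the following is a minimal sketch of the textbook dynamic-programming form for one measurement, run off-line on complete batches. It is not the algorithm used in any particular product, and the function names `dtw` and `warp_to_reference` are hypothetical.

```python
def dtw(a, b):
    """Return (total cost, warping path) for aligning sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end to recover which samples were matched
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path

def warp_to_reference(reference, batch):
    """Map batch onto the reference time axis, averaging samples that share a slot."""
    _, path = dtw(reference, batch)
    buckets = [[] for _ in reference]
    for i, j in path:
        buckets[i].append(batch[j])
    return [sum(v) / len(v) for v in buckets]

reference = [1.0, 2.0, 3.0]
batch = [1.0, 2.0, 2.0, 3.0]        # same shape, but lingers one sample longer
total_cost, path = dtw(reference, batch)
warped = warp_to_reference(reference, batch)
```

The nested loops visit every pair of samples, so the work grows with the product of the two batch lengths. That is consistent with the remark that DTW takes a large share of model-building time.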

Stan: What guidance can you offer on how to select inputs and batches?

Randy: A single quality assurance (QA) value (e.g., end-of-batch lab quality result) should be used to rate each batch. If you want to look at multiple QA values, then you will need to do multiple analyses. The set of batches used to generate a model is called the training set.

I get the best results when a uniform distribution of QA results is used for the training set. The training set uses the same number of batches from the full range of the QA parameter. Usually, this is done by establishing sub-ranges or bins, and then populating each bin with the same number of batches to represent the full range of the QA parameter evenly in the model. Be sure to keep a test set of batches off to the side to validate your model.
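The binning procedure can be sketched as below. Equal-width bins, random selection within each bin, the function name `select_uniform_training_set` and the QA values are all assumptions for illustration.

```python
import random

def select_uniform_training_set(qa_values, n_bins, per_bin, seed=0):
    """Pick per_bin batch indices from each of n_bins equal-width QA sub-ranges."""
    lo, hi = min(qa_values), max(qa_values)
    if hi == lo:
        raise ValueError("QA values have no spread to bin")
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for idx, qa in enumerate(qa_values):
        k = min(int((qa - lo) / width), n_bins - 1)   # maximum goes in the last bin
        bins[k].append(idx)
    rng = random.Random(seed)
    selected = []
    for members in bins:
        if len(members) < per_bin:
            raise ValueError("not enough batches in one QA bin")
        selected.extend(rng.sample(members, per_bin))
    return sorted(selected)

# Hypothetical end-of-batch QA results for 12 batches
qa = [90.1, 90.4, 90.9, 91.2, 91.5, 91.8, 92.3, 92.6, 92.9, 93.0, 91.4, 91.6]
train = select_uniform_training_set(qa, n_bins=3, per_bin=2)
test = [i for i in range(len(qa)) if i not in train]   # held back for validation
```

Batches not selected for training form the test set kept off to the side. If a bin has too few batches, the function raises an error rather than quietly unbalancing the training set.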

Selection of inputs is somewhat more difficult and subjective. Start with all the measurements for the equipment unit. Know your process and what measurements are important for that stage of processing. Strip off measurements that are obviously not applicable. Then consider each one. Avoid those that are only relevant for a short period of the stage or that are shared by other equipment units. If you're unsure, keep it in the model and try it out. Iteration is how you develop a good model. Try it and look at the PCA contributions to see what variables are causing deviations. Are they real, or is the measurement problematic? Is the deviation expressed in another measurement? Try not to restrict the data coming into the analysis. Redundant measurements are OK if each measurement is not problematic in itself, but unrelated data can cause false alarms on-line. Once a set of measurements is determined, it will usually stay the same over time. However, as a process develops, you may need to generate new models with newer batches in the training set. Be sure to maintain the uniform distribution of the QA values in the model training set when updating the model.

You need a minimum of 30 batches to stabilize the mathematics of a reasonable process. Greater variability in the training set will require more batches. Diminishing returns start when using more than 50 batches for a process. More batches are not always better. Too many batches of little variability will create a model that is very tight and may cause false alarms.

Greg: Randy's latest Top 10 list might just have an Oscar winner. 

Top 10 Movie Titles for the Data Analytics Project

10. Honey, I Shrunk the Data
9. Analytics Now
8. The Batch Hunter
7. Lord of the [Principal] Components
6. The Which Data Project?
5. 2001: An Analytics Odyssey
4. Statistic Without a Cause
3. The Empirical Strikes Back
2. Snow White and the 70 Batches
1. The Extrapolationist.


