By Greg McMillan and Stan Weiner
Greg McMillan and Stan Weiner bring their wits and more than 66 years of process control experience to bear on your questions, comments, and problems. Write to them at [email protected].
Stan: How do you quantify the effect of unmeasured disturbances?
Randy: The analysis is based on measurements. Unmeasured disturbances show up as variability in a quality assurance (QA) value that is uncorrelated with the measurements. The online principal component analysis (PCA) does not consider the QA value, so it would give no indication of the disturbance. For the projection to latent structures, also known as partial least squares (PLS), the prediction simply would be wrong. In short, there is no quantification of unmeasured disturbances.
Greg: How do you set the thresholds for deviations?
Randy: A model has an alpha level parameter that sets how sensitive the model is. If that parameter is set to 0.01, then anything outside of 99% of the bell curve is considered a deviation. A more common alpha level is 0.05, which flags anything outside of 95% of the bell curve.
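To make the alpha level concrete, here is a minimal sketch that converts an alpha value into a deviation limit. It assumes a single, normally distributed, scaled variable and a two-sided test; the function name `two_sided_limit` is hypothetical. Commercial toolsets apply the alpha level to multivariate statistics whose distributions are not normal, so treat this only as an illustration of how alpha relates to the bell curve.

```python
from statistics import NormalDist


def two_sided_limit(alpha: float) -> float:
    """Return z such that a fraction `alpha` of a standard normal
    distribution lies outside the band from -z to +z."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for alpha in (0.01, 0.05):
    z = two_sided_limit(alpha)
    inside = 100.0 * (1.0 - alpha)
    print(f"alpha={alpha}: {inside:.0f}% of the bell curve lies within +/-{z:.2f} sigma")
```

A smaller alpha widens the band, so the model flags fewer deviations and raises fewer false alarms at the cost of reacting later to real faults.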
Stan: How do you determine if a PCA or PLS result is wrong?
Randy: The PCA is wrong if it generates so many false positives or false negatives that the operations personnel ignore it; then the analysis is not doing its job. For PLS, if the predictions do not reasonably match the lab analysis QA, then PLS is not doing its job. In short, if the PCA or PLS is not adding value to your daily operation, then something is wrong.
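One way to put numbers on these checks is to tally the false alarms and missed faults of the PCA against batches whose outcome is known, and to compare PLS predictions with lab results. The sketch below is only an illustration: the batch data are made up, the function names are hypothetical, and what counts as "too many" false alarms or "reasonable" agreement remains a plant-specific judgment.

```python
from math import sqrt


def alarm_rates(alarms, faults):
    """Return (false positive rate, false negative rate).

    alarms[i] is True if PCA flagged batch i; faults[i] is True if
    batch i really had a fault (known after the fact).
    """
    false_pos = sum(a and not f for a, f in zip(alarms, faults))
    false_neg = sum(f and not a for a, f in zip(alarms, faults))
    normal = sum(not f for f in faults)
    faulty = sum(faults)
    fp_rate = false_pos / normal if normal else 0.0
    fn_rate = false_neg / faulty if faulty else 0.0
    return fp_rate, fn_rate


def rmse(predicted, lab):
    """Root-mean-square error between PLS predictions and lab QA values."""
    return sqrt(sum((p - l) ** 2 for p, l in zip(predicted, lab)) / len(lab))


# Made-up history for eight batches (illustrative only).
alarms = [True, False, True, False, False, True, False, False]
faults = [True, False, False, False, False, True, True, False]
fp_rate, fn_rate = alarm_rates(alarms, faults)

# Made-up PLS predictions and matching lab analyses for four batches.
predicted = [98.0, 97.5, 99.0, 96.0]
lab = [98.5, 97.0, 99.0, 95.0]
error = rmse(predicted, lab)

print(f"false positive rate: {fp_rate:.2f}")
print(f"false negative rate: {fn_rate:.2f}")
print(f"prediction RMSE: {error:.2f}")
```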
Greg: How do you find the most important contributions for PCA and PLS (drill down)?
Randy: PCA contributions are the amount each measurement contributes to the overall statistic, either the T^2 statistic or error statistic (Q or SPE). Thus, when a fault is detected by the statistic exceeding the upper control limit (UCL), the contributions can be used to see which tags are causing the fault. The idea of multivariate analysis is that more than one measurement may be causing the fault. Thus, when looking at contributions, the operator may not see a single measurement contribution that exceeds the UCL, but a combination of measurements that are larger than the rest, but still well below the UCL.
This is where process knowledge comes in. The operator or engineer needs to be able to consider the measurements that are identified as outliers and associate them with a process fault. One method may be to bring up a trend of the measurements for this equipment unit, focus on the tags identified by PCA, and ask, "What is going wrong?" The idea is to assess a cause from the correlations provided by the analysis. PCA is an early fault detection tool, but does not diagnose the fault for you. PLS contributions may or may not be available in some toolsets.
When a predicted product quality goes bad, it is natural to ask what caused the prediction to indicate a decrease in product quality. However, the prediction calculation only looks at the major correlations between measurements and modeled QA. Thus, the contributing tags for a prediction may not portray the overall picture of the fault, but they may be useful in pointing out a major factor. Regardless, it is always important to refer to the associated PCA charts for a larger view of the problem. In short, a PLS analysis only works on data defined in the model, and PCA reports on what is in the model and what is happening outside of the model.
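The PCA drill-down described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how any particular toolset works: the tag names and data are synthetic, only one principal component is kept (found by power iteration), only the error statistic (Q or SPE) is broken down, and no UCL is computed.

```python
import math


def autoscale_params(rows):
    """Column means and sample standard deviations of the training data."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(m)]
    return means, stds


def scale(row, means, stds):
    """Mean-center a row and scale it to unit variance."""
    return [(x - mu) / sd for x, mu, sd in zip(row, means, stds)]


def first_loading(scaled, iterations=200):
    """First principal component loading via power iteration."""
    n, m = len(scaled), len(scaled[0])
    cov = [[sum(r[a] * r[b] for r in scaled) / (n - 1) for b in range(m)]
           for a in range(m)]
    p = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        q = [sum(cov[a][b] * p[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(v * v for v in q))
        p = [v / norm for v in q]
    return p


def spe_contributions(x_scaled, p):
    """Squared residual per tag after projecting onto the one-component model.
    The contributions sum to the SPE (Q) statistic."""
    t = sum(x * w for x, w in zip(x_scaled, p))  # score on the component
    return [(x - t * w) ** 2 for x, w in zip(x_scaled, p)]


# Synthetic training data: four hypothetical tags driven by one latent factor.
tags = ["temp", "pressure", "flow", "level"]
training = []
for i in range(50):
    s = math.sin(0.3 * i)
    training.append([
        80.0 + 5.0 * s + 0.2 * math.cos(1.7 * i),
        2.0 + 0.3 * s + 0.01 * math.cos(2.3 * i),
        120.0 - 10.0 * s + 0.4 * math.cos(3.1 * i),
        50.0 + 2.0 * s + 0.1 * math.cos(4.3 * i),
    ])

means, stds = autoscale_params(training)
loading = first_loading([scale(r, means, stds) for r in training])

# New sample: flow moves the wrong way relative to the other three tags.
sample = [82.5, 2.15, 125.0, 51.0]
contrib = spe_contributions(scale(sample, means, stds), loading)
ranked = sorted(zip(tags, contrib), key=lambda tc: tc[1], reverse=True)
for tag, value in ranked:
    print(f"{tag:>8}: {value:.3f}")
```

In this synthetic case no single tag is out of its own normal range; the sample is unusual only because flow breaks its usual correlation with the other tags, which is why it should rank first in the contribution list.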
Stan: Is there anything that can be done to verify causal relationships or track down root causes?
Randy: The very nature of statistical modeling is correlations, and any causal relationship must be verified. To this extent, analytics are just indicators and should not be used to explicitly define causes for process characteristics.
There is no substitute for a process engineer with a good understanding of the process. The analytics should add to the engineer's process knowledge base and understanding, but should not be used as a de facto tool for defining causal relationships in the process. In practice, if a modeled process shows a relationship between certain measurements and product quality during a period in the batch, it's worth investigating. Periods when the PLS confidence interval significantly narrows may be a good indicator that a critical stage of processing is occurring with regard to end-of-batch product quality. Likewise, if the un-normalized upper control limit (UCL) in the PCA dips, it means the modeled batches were very similar during this period. This information can be used as a starting point for process improvements that may result in better product consistency and quality.