Data Analytics in Batch Operations

When Batch Processing Is Critical to Your Operations, Imagine the Worth of Knowing the Predicted Value of Quality Parameters


Collect Process Information. The measurements and batch operation definitions for the production of various products are unique to each installation. To apply data analytics to a batch process, the team doing the work must have a good understanding of the process, the products produced and the organization of the batch control. Existing documentation on the process and the batch control was distributed to the project team for study. We held a meeting at which operations provided input to help the team become more familiar with the process. Based on this information, we created a list of the process measurements, lab analyses and truck data for raw material shipments. This formed the basis of what Lubrizol refers to as the Inputs–Process–Outputs data matrix, which defines the data to be considered in PCA and PLS model development.
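As a minimal illustration, an Inputs–Process–Outputs organization for one batch might be sketched as a nested structure that is flattened into one observation row per batch, the form model development expects. All parameter names and values here are hypothetical, not Lubrizol's actual tag list:

```python
# Hypothetical Inputs-Process-Outputs (IPO) matrix for one batch;
# the parameter names and values are illustrative only.
ipo_matrix = {
    "inputs":  {"raw_material_viscosity": 41.2,   # truck shipment analysis
                "raw_material_moisture": 0.03},
    "process": {"reactor_temp_avg": 152.7,        # historian measurements
                "reactor_pressure_max": 4.1},
    "outputs": {"final_viscosity": 98.5,          # end-of-batch lab analysis
                "final_acid_number": 1.7},
}

def flatten_batch(matrix):
    """Flatten one batch's IPO matrix into a single observation row,
    the form used to assemble the data set for PCA/PLS modeling."""
    row = {}
    for group, params in matrix.items():
        for name, value in params.items():
            row[f"{group}.{name}"] = value
    return row

row = flatten_batch(ipo_matrix)
```

Stacking one such row per batch yields the observations-by-variables matrix that PCA and PLS operate on.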

Instrumentation and Control Survey. A basic assumption in applying analytics to a batch process is that the process operation is repeatable. Any issues with process measurement or with control tuning and setup should therefore be addressed before data is collected for model development. Thus, in parallel with the initial project meeting, an instrumentation and control survey was conducted for the two batch process areas addressed by the project. Instrumentation problems found in the survey were corrected, and loop tuning was changed to provide the best process performance. For example, the temperature and pressure loops associated with three reactors were retuned for improved, repeatable performance.

Integration of Lab Data. Key quality parameters of the batch operation at the Rouen facility are obtained by lab analysis of grab samples. Typically, the lab analysis data is entered into the company's enterprise resource planning (ERP) system (SAP in Lubrizol's case) and used for quality reporting, generating certificates of analysis and process improvement studies. The property analyses for truck shipments are also entered into the ERP. To allow this data to be used in online analytics, the team created an interface between Lubrizol's SAP system and the process control system.

The material properties associated with truck shipments are used to calculate the properties of material drawn from storage, which serve as inputs in the PCA and PLS analysis. Thus, it is important to characterize both the quality of incoming raw materials and the end-of-batch quality of the product.
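One simple way to estimate a property of material drawn from storage is a mass-weighted average of the truck loads that filled the tank. This sketch assumes a well-mixed tank; the masses and property values are illustrative, not from the Rouen plant:

```python
def storage_tank_property(shipments):
    """Estimate a property of material drawn from storage as the
    mass-weighted average of the truck shipments that filled the tank.
    Simplifying assumption: the tank is well mixed."""
    total_mass = sum(mass for mass, _ in shipments)
    return sum(mass * prop for mass, prop in shipments) / total_mass

# (mass in tonnes, measured property of each truck load) -- illustrative
shipments = [(20.0, 40.0), (10.0, 46.0)]
blended = storage_tank_property(shipments)  # -> 42.0
```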

Historian Collection. When the process control system was originally installed at the Rouen plant, all process measurements and critical operating parameters associated with the batch control were set up for historian collection at one-minute samples using data compression. However, for analytic model development it is desirable to save data in an uncompressed format. Thus, additional historian collection was defined for the measurements, lab data and batch operation data. This information is collected at 10-second samples and saved uncompressed, which allows analysis at a finer time resolution and helps define a more appropriate resolution for future implementations. Later analysis of the data will determine whether this fine resolution must be kept or can be reduced.
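The trade-off between the 10-second and one-minute resolutions can be explored by averaging the fine-grained samples down and comparing, as in this small sketch with made-up values:

```python
def downsample(samples, factor):
    """Average consecutive groups of `factor` samples; e.g. 10-second
    data reduces to one-minute resolution with factor=6."""
    return [sum(samples[i:i + factor]) / factor
            for i in range(0, len(samples) - factor + 1, factor)]

# Two minutes of hypothetical 10-second measurements:
ten_second = [1.0, 1.2, 1.1, 1.3, 1.2, 1.4,   # first minute
              2.0, 2.2, 2.1, 2.3, 2.2, 2.4]   # second minute
one_minute = downsample(ten_second, 6)        # two one-minute averages
```

Comparing model quality on the original and the downsampled series is one way to decide whether the fine resolution must be retained.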

Model Development. The model development tools are designed to let the user easily select and organize, from the historian, the subset of data associated with the parameters to be used in model development for specified operations and products. The tool organizes and sequences all the data into a predetermined file structure that permits the analysis. Once a model has been developed, it may be tested by playing back data not included in model development. Since the typical batch time is measured in days, this playback may be run faster than real time, which allows the model to be evaluated quickly over a number of batches.

Training. The statistics provided by online analytics will be used primarily by the plant operator. Thus, operator training is a vital part of commissioning this capability. Separate training classes on the use of the analytic tool will also be conducted for plant engineering and maintenance personnel.

Evaluation. During the first three months of online analytics operation, operator feedback and data collected on improvements in process operation will be used to evaluate the savings attributable to analytics. This period will also provide valuable input for improving the user interfaces, the displays and the terminology used in them, allowing the project team to further improve the analysis modules to maximize operators' and engineers' use and understanding.

As shown by these project steps, most of the time required to apply online analytics is associated with collecting process information, surveying instrumentation and control, integrating lab data, setting up historian data collection and training. When the analytic toolset is embedded in the control system, it will reduce the effort required to deploy the online analytics. A well-planned project and the use of a multi-discipline team will play a key role in the installation success.


In short, the use of statistical data analytics will likely cause people to think in entirely new ways and address process improvement and operations with a better understanding of the process. Its use will allow operations personnel to identify and make well-informed corrections before the end of batch, and it will play a major role in ensuring that batches repeatedly hit pre-defined end-of-batch targets. Use of this methodology will allow the engineers and other operations personnel to gain further insight into the relationships between process variables and their importance in affecting product quality parameters. It also will provide additional information to help guide process control engineers to pinpoint where process control needs to be improved.

Robert Wojewodka is technology manager and statistician with the corporate operations department at The Lubrizol Corporation. Terry Blevins is principal technologist at Emerson Process Management and a member of the Process Automation Hall of Fame.

Hear a podcast about this article with the authors and Control Editor in Chief Walt Boyes.

Multivariate Statistics 101

Principal component analysis (PCA) enables the identification and evaluation of product and process variables that may be critical to product quality and performance. Equally important, this tool may be used to develop an understanding of the interactive relationship of process inputs and measurements and online, inline, or at-line analysis of final product. When applied online, PCA may be used to identify potential failure modes and mechanisms and to quantify their effects on product quality.
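A minimal PCA can be computed via the singular value decomposition of mean-centered data, with rows as batches (observations) and columns as process variables. The data here is synthetic, for illustration only:

```python
import numpy as np

# Synthetic data: 50 batches x 4 process variables, with two of the
# variables deliberately correlated so one component dominates.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
X[:, 1] = 2.0 * X[:, 0] + 0.1 * X[:, 1]

Xc = X - X.mean(axis=0)                   # mean-center each variable
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T                        # projection onto principal components
explained = S**2 / np.sum(S**2)           # fraction of variance per component
```

Inspecting `explained` shows how much of the process variation each component captures; the loadings in `Vt` indicate which variables move together.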

Multiway PCA (MPCA) is an extension of PCA that provides a very effective means of accounting for and aligning batch data (i.e., batches of different lengths). Using MPCA, engineers can apply data-intensive techniques to extract the information needed to monitor conditions and to relate those conditions to faults in batch processes.
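Batch data is naturally three-way: batches by time by variables. The core MPCA step is "unfolding" this cube into a two-dimensional matrix with one row per batch so that ordinary PCA can be applied. The shapes below are illustrative; in practice the batches must first be aligned to a common length (e.g., with DTW):

```python
import numpy as np

# Hypothetical aligned batch data: 8 batches, 100 time samples, 5 variables.
n_batches, n_time, n_vars = 8, 100, 5
cube = np.random.default_rng(1).normal(size=(n_batches, n_time, n_vars))

# Batch-wise unfolding: each row concatenates one batch's full trajectory,
# giving a matrix that standard PCA can decompose.
unfolded = cube.reshape(n_batches, n_time * n_vars)
```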

Projection to Latent Structures (PLS), sometimes referred to as partial least squares, may be used to analyze the impact of processing conditions on final product quality parameters, which are often measured using online, inline and at-line analysis of the final product. When PLS is applied in an online system, it is possible to provide operators with a continuous prediction of end-of-batch quality parameters. PLS is also used effectively with PCA so that only the important process variation is identified, minimizing false alarms from variation that is present but not meaningful.
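The essence of PLS is finding the direction in the process data that covaries most with the quality parameter, then regressing along it. This one-component sketch uses synthetic data in place of real batch records:

```python
import numpy as np

# Synthetic records: 60 batches x 3 process variables and one
# quality parameter y that depends on the first two variables.
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 3))
y = 3.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.normal(size=60)

Xc, yc = X - X.mean(axis=0), y - y.mean()
w = Xc.T @ yc
w /= np.linalg.norm(w)         # weight vector: direction most covariant with y
t = Xc @ w                     # scores of each batch along that direction
b = (t @ yc) / (t @ t)         # inner regression coefficient
y_pred = y.mean() + b * ((X - X.mean(axis=0)) @ w)   # predicted quality
```

Applied online, the same projection of in-progress batch data yields a continuously updated prediction of the end-of-batch quality parameter.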

Discriminant Analysis (DA) is related to PCA and other statistical methods, such as analysis of variance, and is often used in conjunction with PLS as a powerful technique. Discriminant analysis relates process variation to categorical or classification criteria rather than to a continuous output measurement. A categorical classification may be as simple as a batch being “in specification” or “out of specification,” or as complex as many categories representing abnormal situations in the batch or other classifications of quality or economic parameters.
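The categorical setting DA addresses can be illustrated with the simplest possible classifier: label each batch by its nearest class centroid. This nearest-centroid rule is a stand-in for a full discriminant model, and the clusters are synthetic:

```python
import numpy as np

# Synthetic training batches in a 2-variable score space:
# in-spec batches cluster near (0, 0), out-of-spec near (3, 3).
rng = np.random.default_rng(3)
in_spec = rng.normal(loc=0.0, size=(30, 2))
out_spec = rng.normal(loc=3.0, size=(30, 2))

centroids = {"in specification": in_spec.mean(axis=0),
             "out of specification": out_spec.mean(axis=0)}

def classify(batch):
    """Assign a batch to the class whose centroid is nearest."""
    return min(centroids, key=lambda c: np.linalg.norm(batch - centroids[c]))

label = classify(np.array([0.2, -0.1]))   # lies near the in-spec cluster
```

A real PLS-DA model would first project the full process data onto latent variables and discriminate in that reduced space.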

Dynamic time warping (DTW) algorithms effectively measure the similarities between two sequences that may vary in time and/or speed. The method then adjusts for these differences so that multiple batches may be combined together into one data set so that the batches may be analyzed.
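The classic DTW formulation is a dynamic program: the cumulative cost of the cheapest warping path aligning two sequences that differ in length or speed. The example sequences below are illustrative:

```python
def dtw_distance(a, b):
    """Cumulative cost of the optimal warping path between sequences
    a and b, using absolute difference as the local cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch a
                                 cost[i][j - 1],      # stretch b
                                 cost[i - 1][j - 1])  # match
    return cost[len(a)][len(b)]

# A profile and a copy run at half speed align perfectly (distance 0):
fast = [0.0, 1.0, 2.0, 1.0, 0.0]
slow = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 0.0, 0.0]
```

After alignment, batches of differing duration can be stacked into one data set for MPCA or PLS analysis.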
