Self-service analytics for non-data-scientists

Though plant operations typically generate mind-boggling quantities of data, both structured and unstructured, plant engineers and operations personnel can leverage only a small percentage of it to make better process decisions. This is because most process data from plant systems is stored in online and offline process historian archives, and process historian tools, although relatively simple to use, aren’t ideal for analyzing data or running search queries. The archives are “write”-optimized rather than “read/analytics”-optimized, so finding related historical events and building process context can be time-consuming and laborious.

Improving process performance and overall efficiency requires operational intelligence and an understanding of data. One of the basic elements is that process engineers and other stakeholders must be able to search time-series data over a specific timeline, and visualize all related plant events quickly and efficiently. This includes the time-series data generated by the process control, lab and other plant systems, as well as the usual annotations and observations made by operators and engineers.

The challenges typically presented by historian time-series data are the lack of a search mechanism and the inability to annotate effectively. By combining search on structured time-series process data with the annotations made by operators and other subject matter experts (SMEs), users can understand more precisely what’s occurring and predict what will likely occur in their continuous and batch processes.

ARC Advisory Group has identified a handful of solution suppliers that are taking a different approach to providing industrial process data analytics, and also leveraging multidimensional search capabilities for stakeholders. The approach combines the necessary elements to visualize a process historian’s time-series data, overlay similar matched historical patterns, and enrich with data captured by engineers and operators. A form of “process fingerprinting” then provides operations and engineering with greater process insights to optimize the process and/or predict unfavorable process conditions. Furthermore, unlike traditional approaches, performing this analysis doesn’t require the skill set of a data scientist.

Key elements of the approach include:

A system that brings together deep knowledge of process operations and data analytics techniques to minimize the need for specialized data scientists or complex, engineering-intensive data modeling, and that turns human intelligence into machine intelligence to gain value from operational data already collected.
A model-free process-analytics tool (discovery, diagnostic and predictive) that complements and augments, rather than replaces, existing historian data architectures.
A system that supports cost-efficient virtualized deployment and is “plug-and-play” within the available infrastructure, yet has the ability to evolve into a fully scalable component of corporate Big Data initiatives and environments.

Data scientists not required

Process engineers and operators need to accurately predict process performance or the outcome of a batch process, while eliminating false-positive diagnoses. Traditional, backward-looking, “describe and discover” analytics solutions are little help here. Accurately predicting process events that will likely happen in a facility requires accurate process historian or time-series search tools and the ability to apply meaning to patterns identified in process data.

Process analytics solutions in one form or another have existed in the industrial software market for some time. They’re all after the same use cases. These largely historian-based software tools often require much interpretation and manipulation by the user. Predictive analytics, a relatively new dimension to analytics tools, can provide users with valuable insights about what will happen in the future based on historical data, both structured and unstructured. However, many of these advanced tools tend to be perceived as engineering-intensive “black boxes” targeted toward power users who can make the needed interpretations. For many operational and asset-related issues, this approach is not economically practical (negative ROI). That’s the reason many vendors are targeting only that 1% of critical assets.

Other predictive analytics tools start by using a more enterprise-based approach, and require more sophisticated, distributed computing platforms such as Hadoop or Cloudera. These are powerful and useful for many analytics applications, but represent a more complex approach to managing plant and enterprise data. Companies that use this enterprise data management approach often employ specialized data scientists to help organize and cleanse data.

Better approaches come as on-premises, packaged, virtual server deployments that integrate easily with the local copy of the plant historian database archives and can evolve over time toward a scalable architecture that blends in with available enterprise distributed computing platforms. The newer approach uses “pattern search-based discovery and predictive-style process analytics” targeting the average user. It’s relatively easy to deploy and use, giving organizations the potential to gain immediate value without a big data or modeling solution, and no data scientist is required.

Multidimensional search solutions

Newer tools connect to existing historian databases and build their index in a column store database layer, creating a front end that supports multidimensional, search-based analytics queries. Pattern recognition and machine learning algorithms let users search process trends for specific events or detect process anomalies, which distinguishes systems that use this approach from traditional historian desktop tools.
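To make the idea of searching a trend for a specific event concrete, here is a minimal sketch in Python. It is not any supplier’s algorithm: it assumes a simple sliding-window comparison using z-normalized Euclidean distance, and the function names and the temperature values are hypothetical.

```python
import math


def z_normalize(values):
    """Scale a window to zero mean and unit variance so shape, not level, is compared."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0] * len(values)  # a flat window has no shape to compare
    return [(v - mean) / std for v in values]


def find_similar_windows(series, pattern, top_k=3):
    """Return up to top_k (start_index, distance) pairs, closest match first."""
    m = len(pattern)
    if m == 0 or m > len(series):
        return []
    target = z_normalize(pattern)
    scored = []
    for start in range(len(series) - m + 1):
        window = z_normalize(series[start:start + m])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, target)))
        scored.append((start, dist))
    scored.sort(key=lambda item: item[1])
    return scored[:top_k]


# Hypothetical historian trend containing two similar temperature excursions.
history = [50, 50, 51, 50, 55, 62, 70, 62, 55, 50,
           50, 51, 50, 50, 56, 64, 72, 63, 56, 51, 50]
# The excursion a user has selected on the trend display as the search query.
query = [50, 55, 62, 70, 62, 55, 50]

matches = find_similar_windows(history, query, top_k=2)
for start, dist in matches:
    print(f"match at index {start}, distance {dist:.3f}")
```

Normalizing each window means the search finds excursions with the same shape even when they occur at a different operating level. A production tool would need a faster index than this brute-force scan to cover years of historian data.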

What is ultimately created is a “Google Search for the Process Industries.” According to Peter Reynolds, ARC senior consultant, “The new platform is built to make operator shift logs searchable in the context of historian data and process information. At a time when the process industries may face as much as a 30% decline in the skilled workforce through retiring workers, knowledge capture is a key imperative for many industrial organizations.”

These next-generation systems also work well with historians from leading suppliers, including OSIsoft, AspenTech and Yokogawa. The column store technology forms the critical base layer of the new systems’ technology stack: it reads from the existing historian databases and creates a data layer that indexes the time-series data.

Typically, it’s designed to be simple to install, connect and get up and running with the use of a virtual machine (VM), without impacting the existing historian infrastructure.
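The column store layer described above can be illustrated with a small sketch. It assumes a simplified in-memory layout in which each tag keeps its own sorted timestamp and value columns, so a time-range query touches only the tag it needs. The class name, method names and tag names are hypothetical and do not represent any historian’s API.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class ColumnIndex:
    """Toy column-oriented index: one timestamp column and one value column per tag."""

    def __init__(self):
        self._times = defaultdict(list)
        self._values = defaultdict(list)

    def load(self, rows):
        """Ingest (timestamp, tag, value) rows, as a row-oriented export might supply them."""
        for ts, tag, value in sorted(rows, key=lambda row: row[0]):
            self._times[tag].append(ts)
            self._values[tag].append(value)

    def time_range(self, tag, start, end):
        """Return (timestamp, value) pairs for one tag with start <= timestamp <= end."""
        times = self._times.get(tag, [])
        values = self._values.get(tag, [])
        lo = bisect_left(times, start)   # first sample at or after start
        hi = bisect_right(times, end)    # one past the last sample at or before end
        return list(zip(times[lo:hi], values[lo:hi]))


# Hypothetical samples: timestamps are integer seconds, tags are invented.
index = ColumnIndex()
index.load([
    (3, "TI-101", 71.0),
    (1, "TI-101", 70.0),
    (2, "FI-202", 12.5),
    (2, "TI-101", 70.5),
])
result = index.time_range("TI-101", 2, 3)
print(result)
```

Because each tag’s timestamps are kept sorted, a range lookup is a binary search rather than a scan of every row, which is the kind of read-side optimization the write-optimized archive lacks.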

A core solution should include:

Column store with in-memory indexing of historian data;
Search technology based on pattern matching and machine learning algorithms, empowering users to find historical trends that define process events and conditions;
Diagnostic capabilities to find the reasons behind detected anomalies and process situations;
Knowledge and event management and process data contextualization;
Identification, capture and sharing of important process analyses among billions of process data points;
Capture capabilities that support event frames or bookmarks manually created by users or automatically generated by third-party applications (with annotations visible in the context of specific trends); and
Monitoring capabilities that integrate predictive analytics and early-warning detection of abnormal and undesirable process events based on saved historical patterns or searches. These should give operators a live view of process data to determine whether recent process changes match expected process behavior; proactively adjust settings when they do not; calculate possible trajectories of the process; and predict process variables and behavior.
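The monitoring item in the list above, matching live data against a saved historical pattern, can be sketched as follows. This is an illustration under stated assumptions rather than a description of any product: it uses Pearson correlation between the most recent samples and a saved pattern, and the class name, threshold and data values are hypothetical.

```python
import math
from collections import deque


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is flat."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


class PatternMonitor:
    """Flags when the latest live samples resemble a saved historical pattern."""

    def __init__(self, saved_pattern, threshold=0.97):
        self.pattern = list(saved_pattern)
        self.threshold = threshold
        self.window = deque(maxlen=len(self.pattern))  # keeps only the newest samples

    def update(self, value):
        """Add one live sample; return True if the recent window matches the pattern."""
        self.window.append(value)
        if len(self.window) < len(self.pattern):
            return False  # not enough live data yet
        return pearson(list(self.window), self.pattern) >= self.threshold


# Hypothetical saved pattern: a pressure ramp that preceded a past upset.
precursor = [100, 101, 103, 106, 110]
monitor = PatternMonitor(precursor, threshold=0.97)

# Hypothetical live feed: steady operation followed by a similar ramp.
live = [100, 100, 100, 100, 100, 101, 103, 107, 111]
alerts = [i for i, value in enumerate(live) if monitor.update(value)]
print("early-warning sample indexes:", alerts)
```

Correlation compares only the shape of the recent trend, so a real monitor would likely add checks on magnitude and rate of change before alerting an operator.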

The technology playing field for manufacturers and other industrial organizations has changed. Owner-operators now have different tools for improving plant availability and asset effectiveness. “There’s an immediate need to search time-series data and analyze them in context with the annotations made by engineers and operators to make faster, higher-quality process decisions,” says Reynolds. “If users want to predict process degradation or an asset or equipment failure, they need to look beyond time-series and historian data tools, and then search, learn by experimentation, and detect patterns in the vast pool of data that already exists in their plant.”

This new process analytics model can support the necessary upgrading of traditional process historian visualization tools. Process historians have been useful for storing process data and connecting to real-time systems. However, the basic analytical tools used to plot backward-looking trends or analyze data in Microsoft Excel have been time-consuming to use and limited in function. The tools used to visualize and interpret process data are typically trending applications, reports and dashboards. These have been helpful, but not particularly good at predicting outcomes.

Today’s owner-operators are cost-constrained, must deal with erosion of knowledge and talent, and need cost-effective ways to get value out of data already generated at the plant. ARC believes this next generation of solutions with “Google-like” search capabilities will be welcomed by process industry end users.