Data is increasingly available in the industrial arena. But is it increasingly useless? Clockwork Solutions VP of Industry Solutions Serg Posadas and Director of Services Brad Young share their unique perspectives here.
Is IIoT data useless?
Serg: Data alone is not much use. To support decisions and actions, data must be analyzed and developed
into informative insights that directly address a business’ objectives. And it’s much more involved than simply plugging data into a pre-fab model to create magically insightful analytics. Lots of data preparation is required ahead of the modeling. We need to understand the desired outcomes from the data analysis. Only then can the proper modeling platform, analytics, and visualization be implemented to support business goals.
We should understand that data is an overloaded term. Across an enterprise, different centers and employees define and use data with varying practices. So when we talk about data, we’re often speaking from differing bases and sometimes focused on fundamentally dissimilar goals. So as we collectively work to apply data to a specific purpose, it’s not unusual to find people pulling on different ropes. Often, data issues are complex not because the data itself is cryptic, but more because the people managing the data are complex.
This influence of the human element is changing though. Technology is evolving to enable a growing Industrial Internet of Things (IIoT) where automated data streams are gathered continuously and transmitted from machine to machine. Capital assets are being outfitted with the ability to transmit signals about their health. This acceleration is fastest for industrial data related to enterprise assets—about twice as fast as any other type of data. M2M (machine-to-machine) systems will communicate this data directly to each other. GE reports that within the next five years, more devices will be connected to each other than there will be people on the planet: that’s over 50 billion connected machines. This transformation will shift the complexity of data issues—from dealing with the humans in the loop to handling the soaring volume of data.
To be positioned for this future, companies must capture the value from this data—not simply explore and display it for business-monitoring purposes. An analytics platform must be in place to drive the value out of these huge data volumes by transforming historical and real-time data into data about the future and insights that directly apply to today’s difficult, complex decisions.
Brad: Data, and data points, are useless by themselves. Data combined with analytics (or data analysis) is not useless, and can be very powerful. The power is realized when smart data analysis is put into the hands of those who can make decisions to effect change. You have to have all three pieces: Data, Analytics, Decision Makers.
What differentiates a piece of data from an actionable insight?
Serg: Well, a piece of data is to an actionable insight what a bolt is to an aircraft engine. We need to add lots of additional parts and processes to the bolt before it becomes part of an insight. While the bolt doesn’t make the engine, a faulty bolt can certainly break the engine. If the bolt works its way loose, the entire engine can fall apart due to the imbalance. In the same manner, analytics can be vulnerable to faulty pieces of data. It is the data scientist’s job to design models, platforms, processes and decision-support systems to withstand the problems we see with data elements every day.
The data transmission may have gone awry, a human may have injected an entry error, the data may be missing, or we may be experiencing high levels of uncertainty. Our data analytics must be able to recognize these conditions and adjust to them. We don’t want to build an elegant model with elaborate dashboards only to have one problematic input drive us towards the wrong decision. The idea is to use the data to improve our business processes, not to create new problems because we failed to design our modeling and analysis properly.
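As a concrete illustration of that defensive design, here is a minimal Python sketch of screening incoming readings before they reach a model. The rules, thresholds, and function names are hypothetical illustrations, not any vendor's implementation: each reading is checked for the failure modes above, and a problematic value falls back to the last known-good reading while a flag records what happened, so downstream analytics can track how often inputs were repaired.

```python
import math

def condition_reading(value, history, max_jump=5.0):
    """Screen one sensor reading before it enters the model.

    Hypothetical rules: reject missing or non-numeric values and
    implausible jumps, falling back to the last known-good reading.
    Returns (clean_value, flag) so the pipeline can monitor how
    often inputs had to be repaired.
    """
    last_good = history[-1] if history else None
    # Missing or NaN transmission.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return last_good, "missing"
    # Human entry error: not a number at all.
    if not isinstance(value, (int, float)):
        return last_good, "entry_error"
    # Physically implausible jump relative to the last good value.
    if last_good is not None and abs(value - last_good) > max_jump:
        return last_good, "implausible_jump"
    return value, "ok"

# Illustrative stream containing each failure mode.
stream = [10.0, 10.4, None, "N/A", 55.2, 10.9]
history, flags = [], []
for raw in stream:
    clean, flag = condition_reading(raw, history)
    if clean is not None:
        history.append(clean)
    flags.append(flag)
```

The point of returning a flag alongside the cleaned value is that the repair rate itself becomes a data-quality metric: if half the stream is being patched, the model's output should be trusted accordingly.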
One piece of data does not create an insight, so how much data is needed? Well...how much data do you have? How dirty is that data? And how much time do we have until we use that data for an important decision? We should start by clearly defining the question we’re trying to answer. Understanding the end goal comes first, before we can define the data requirements. A specific set of metrics is designed to answer each individual question. These metrics are driven by time-dependent historical observations, future states, or both. The analytics platform should then populate detailed, time-based answers to a vast set of complex questions. From these answers we construct actionable insights on business operations.
Brad: There is an increasingly overwhelming volume of data being collected today. There is little value in collected data unless you can use the data to obtain insights that turn into process-improving action. The bridge between data and actionable insight is analytics (or data analysis). Analytics allows you to discover patterns, trends, and correlations that describe how “things” are behaving. Leaders can take information gleaned from analytics and take action to affect the behavior of their system.
Why is a high-level view of data critical to making it useful?
Brad: The whole purpose in using valuable resources to collect and analyze data is to create “actionable intelligence,” which is given to leaders so they can make informed, actionable decisions to influence the behavior of the system. As mentioned above, a high-level view of the data allows one to see patterns, trends and correlations that characterize system behavior.
Serg: The modeling and analysis that transforms data into information that drives decisions and actions must fit the operational situation. Are we collecting data from a process that includes lots of natural variability? Do we need to quantify and model uncertainty? Is the data being generated in large volumes? At a high velocity? Is the data format consistent or does it require lots of conditioning? Are we experiencing data gaps? Is the data accurate or is it riddled with errors? Are we dealing with many different forms of data?
By understanding the characteristics of the data available, we are able to decide which modeling platforms and analytics are suited to answer our business questions. One model cannot fit all situations, so even highly specialized models must be customized to ingest and process the data that’s available. And that input data will need conditioning before it can feed the analytics.
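To make that first step tangible, a profiling pass over a data feed might quantify the traits listed above (gaps, errors, and variability) before any model is chosen. This is a minimal sketch with illustrative metric names, not a prescribed method:

```python
def profile_series(values):
    """Summarize gaps, malformed entries, and variability in a feed.

    A hypothetical first-look profile: the rates it reports help
    decide how much conditioning the data needs and which modeling
    approach can tolerate it.
    """
    numeric = [v for v in values if isinstance(v, (int, float))]
    n = len(values)
    missing = sum(1 for v in values if v is None)
    malformed = n - missing - len(numeric)  # present but not numeric
    mean = sum(numeric) / len(numeric)
    variance = sum((v - mean) ** 2 for v in numeric) / len(numeric)
    return {
        "missing_rate": missing / n,
        "malformed_rate": malformed / n,
        "mean": mean,
        "variance": variance,
    }

# Example feed with one gap and one entry error.
profile = profile_series([1.0, 2.0, None, "bad", 3.0])
```

High missing or malformed rates argue for heavier conditioning up front; high variance argues for models that represent uncertainty explicitly rather than point estimates.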
So having a good understanding of the data itself is the first step in delivering the decision support and information needed to improve business processes, develop sound strategies, and adapt to changing conditions.
Can an enterprise take full advantage without using historical data or is pairing old and new information critical to success?
Brad: The short answer is no—you can’t take FULL advantage without using historical data. As in all cases, you have to understand your system. Part of understanding the system is knowing how it has changed over time. This allows you to provide context when pairing historical data with new data. Further, pairing historical data with new data can help you understand how your system has changed, or confirm/deny what you thought about how your system changed.
Serg: We can classify anything that happened in the past as historical data. So if by “new information” you mean data that was very recently collected, those inputs still fall in the historical category. Then the question becomes: How far back should we go to collect meaningful data about our assets, processes, and the operating environment?
There’s no single answer or universal rule of thumb. Rely on the expertise of the data science team to determine whether data from further back in time is adding value or simply driving us further from reality. In rapidly changing environments and seemingly chaotic operations, going further back may actually introduce errors. The analysis team should carefully evaluate data sources and modeling techniques before making this decision.
So if recent information, even if it was just streamed to us seconds ago, is considered historical data, then what could possibly be viewed as data that doesn’t fit into this category? To capture the detailed future operations of complex systems, simulation-modeling platforms can encapsulate policies, business rules, and environmental conditions that have not been observed in the past. This type of analysis is essential if we are influencing a business strategy. The simulation generates data that did not previously exist and often generates output that is greater in volume than the data used to build the model.
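A hedged sketch of that idea: a small Monte Carlo simulation that generates synthetic future operating histories for a fleet under assumed failure and repair behavior. All parameters, and the exponential time-to-failure model itself, are illustrative assumptions rather than anything drawn from a real program:

```python
import random

def simulate_fleet(n_assets=20, horizon_days=365, mtbf_days=90.0,
                   repair_days=7, runs=200, seed=42):
    """Monte Carlo sketch: project fleet downtime over a future horizon.

    Assumes exponentially distributed time between failures and a
    fixed repair time (both hypothetical). Each run produces a
    synthetic operating history that was never observed in the past.
    Returns mean total downtime days per run across the fleet.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        downtime = 0
        for _ in range(n_assets):
            t = 0.0
            while t < horizon_days:
                t += rng.expovariate(1.0 / mtbf_days)  # draw next failure
                if t < horizon_days:
                    downtime += repair_days            # asset down for repair
                    t += repair_days
        totals.append(downtime)
    return sum(totals) / len(totals)

mean_downtime = simulate_fleet()
```

Note how the output side illustrates the volume point from the interview: a few input parameters fan out into runs × assets × events of generated data, and changing a policy parameter (say, repair_days) explores a future no historical record contains.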
Looking back at historical data enables us to react. Even sophisticated machine-learning algorithms are observing history, adjusting as new conditions are observed and offering insights based on what just happened. But no historical method can evaluate conditions and changes to operations that haven’t yet been captured by historical data. So if our goal is to develop a strategy that helps us respond to new conditions and practices, we need much more than these rear-facing analytics.
For accurate results in complex, dynamic systems, we need a more complete approach that is not driven by the past. We may see outcomes that decouple the future from the present and the past. Thus, any model that limits our view of operations to a rear-facing, reactive worldview is severely hampered and bound to veer off target in complex, uncertain environments.
Relying on analytics powered solely by historical data is like driving full speed ahead while staring into the rearview mirror. Your data-science team must be sophisticated enough to recognize situations that call for simulation-driven results, providing insights influenced by both the recent past and the uncertain future.