
The essential tools of industrial data science

Sept. 30, 2015
Big data's meaningless without effective analysis. A peek inside the GE toolbox.
About the author
Jim Montague is the Executive Editor at Control, Control Design and Industrial Networking magazines. Jim has spent the last 13 years as an editor and brings a wealth of automation and controls knowledge to the position. For the past eight years, Jim worked at Reed Business Information as News Editor for Control Engineering magazine. Jim has a BA in English from Carleton College in Northfield, Minnesota, and lives in Skokie, Illinois.

"The real purpose of the Internet of Things (IoT) is better outcomes for customers, such as reduced downtime and increased productivity," says Matt Denesuk, chief data science officer, GE Digital. But the path from raw data to business benefits is a long and winding one. Luckily, GE and its industrial data science teams have been wrestling with these challenges for many years, and are in fine shape to help users tame big data and take advantage of it.

Denesuk and his colleagues Mark Grabb, manager of visualization and computer vision, and Achalesh Pandey, machine learning lab manager, both of GE Global Research, presented "Optimize Assets with Industrial Big Data" today at Minds + Machines 2015 in San Francisco.

"GE's own assets—from turbines to medical equipment—generate huge amounts of information, and this only increases as we add data from their local environments and enterprises," said Denesuk. "However, though we've been doing industrial data science for a long time, it's historically been customized and not very scalable. Now, we're distributing more of our know-how, putting it on a platform, and even developing an industrial data services consulting capability. We also know that industrial data science is different from regular data science. Industrial equipment failures have huge safety and asset costs, and industrial data science requires different skills and infrastructure."

Industrial data science basics

Denesuk reported that the three main pillars of GE's industrial data science program are: physics and engineering-based models; empirical, heuristic rules and insights; and data-driven techniques, such as machine learning, statistics, optimization and advanced visualization. "Usually, the problem is that there's not enough of the right data, and so we need to combine all three of these areas to really optimize asset performance."
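How those three pillars might combine can be sketched in a few lines of code. This is a hypothetical illustration only, not GE's implementation; every function name, model, and parameter below is invented to show the pattern of a physics-based estimate adjusted by an empirical rule and corrected with a data-driven term.

```python
# Illustrative sketch of blending the three pillars Denesuk describes:
# a physics-based model, an empirical heuristic, and a data-driven
# correction. All relationships and numbers here are invented.

def physics_model(load: float) -> float:
    """First-principles estimate of wear rate from operating load (toy law)."""
    return 0.01 * load ** 1.5

def heuristic_adjustment(ambient_temp_c: float) -> float:
    """Empirical rule of thumb: hot environments accelerate wear."""
    return 1.2 if ambient_temp_c > 40 else 1.0

def data_driven_correction(residual_history: list) -> float:
    """Data-driven term: the mean residual between past predictions and
    observed wear, standing in for a trained machine-learning model."""
    if not residual_history:
        return 0.0
    return sum(residual_history) / len(residual_history)

def estimated_wear_rate(load: float, ambient_temp_c: float,
                        residual_history: list) -> float:
    """Combine all three pillars into one estimate."""
    base = physics_model(load)
    return base * heuristic_adjustment(ambient_temp_c) \
        + data_driven_correction(residual_history)

rate = estimated_wear_rate(load=50.0, ambient_temp_c=45.0,
                           residual_history=[0.02, 0.01])
print(round(rate, 4))  # → 4.2576
```

The point of the structure is the one Denesuk makes: when data alone is too sparse to fit a reliable model, the physics term carries most of the signal and the data-driven term only corrects its systematic error.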


Denesuk explained that developing an industrial data science program often begins with gathering as much data as possible, with computing power and data storage capacity to match. Then, once the data is analyzed and an application's models are built, less information and computing power is needed, but an effective deployment system is needed to get those models to individual applications and devices.

"Analytics-based maintenance (ABM) is the future of equipment health modeling and performance management, but it requires us to broaden the concept of wear; extend and integrate physics-based models; add explicit and stochastic elements to deterministic models; and put them in maintenance systems," explained Denesuk. "This is how we put asset-based performance in Predix. GE delivers practical data science quickly, whether it's tactical field investigations, data science-driven solutions, or Predix integration and deployment."
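Adding a stochastic element to a deterministic wear model, as Denesuk describes, can be sketched with a simple Monte Carlo simulation. This is an invented example, not GE's model: the wear law, noise level, and limits are all assumptions chosen to show how a single deterministic life estimate becomes a distribution that supports risk-based maintenance decisions.

```python
# Illustrative sketch: perturb a deterministic wear law with noise and
# simulate many runs, turning one predicted life into a distribution.
import random

def deterministic_wear_per_cycle(load: float) -> float:
    """Toy deterministic wear law (invented)."""
    return 1e-4 * load

def cycles_to_failure(load: float, limit: float = 1.0,
                      noise_sd: float = 0.2,
                      rng: random.Random = None) -> int:
    """Count cycles until cumulative wear exceeds `limit`, perturbing
    each cycle's deterministic wear with multiplicative Gaussian noise."""
    rng = rng or random.Random(0)
    wear, cycles = 0.0, 0
    while wear < limit:
        noise = max(0.0, 1.0 + rng.gauss(0.0, noise_sd))
        wear += deterministic_wear_per_cycle(load) * noise
        cycles += 1
    return cycles

# Monte Carlo: a distribution of lives instead of a single number.
rng = random.Random(42)
lives = sorted(cycles_to_failure(load=10.0, rng=rng) for _ in range(200))
print("median life:", lives[len(lives) // 2], "cycles")
print("conservative (5th percentile):", lives[len(lives) // 20], "cycles")
```

The conservative percentile is what a maintenance scheduler would act on: instead of servicing every unit at the deterministic average, the stochastic model lets each asset be scheduled against an explicit risk level.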

Denesuk added that the usual asset optimization journey with GE's clients moves from basic reporting to advanced reporting, then to anomaly detection and alerts, predictive analytics with actionable management information, prescriptive analytics with high-value guidance, and ultimately to operational optimization with sophisticated management of business operations. The biggest gun in GE's industrial data science arsenal is its Digital Twin program.

Digital Twin

Similar to a simulation that GE and its customers can consult at any point in an asset's lifecycle, Digital Twin combines physics-driven analyses and other data to produce the biggest possible wins in asset performance.

"We can take current data about a machine or system, and then show what will happen as it moves forward in time, and also manage maintenance and productivity together," explained Grabb. "We can use Digital Twin to collect models for each device in the field; do performance management for one asset; do operations optimization for a group of assets; help optimize an entire business; and then continuously improve the fidelity and accuracy of the models going forward."

Digital Twin's basic framework includes a data lake on a Predix cloud, performance model and lifecycle model, as well as asset performance and operations and business optimization programs. "For example, to maintain an aircraft engine, there are traditionally prioritized inspections and maintenance," explained Pandey. "Digital Twin changes the paradigm to analytics-based maintenance, and manages the fleet on a per-engine basis. Every single component goes through the cumulative distress and cumulative cycle models, so we can check the effects of different conditions and behaviors.

"We're also doing new sensing for compressor blade health monitoring with magnetic sensors on the blades. This solution can see 0.2-mm changes even at high speeds, and generates 3.2 million samples per second to help evaluate turbine health. We're developing high-fidelity Digital Twins, which can clean data and see component-level degradations, such as power loss versus operating time, and then identify anomalies due to data-flow changes. These anomalies can't be seen with machine learning, but physics-based methods can see them, and we can tell our customers."
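The physics-residual idea Pandey describes — degradations that raw machine learning misses but a physics-based comparison exposes — can be sketched as follows. The model, data, and thresholds here are invented for illustration: the point is that deviations are measured against a first-principles expectation rather than against past sensor data alone.

```python
# Illustrative sketch: compare measured output against a physics-based
# expectation and flag sustained negative residuals (e.g., power loss
# versus operating time). Model and numbers are invented.

def expected_power_mw(fuel_flow_kg_s: float, efficiency: float = 0.38) -> float:
    """Toy first-principles estimate: output power from fuel energy,
    assuming 50 MJ/kg fuel energy content (an assumption)."""
    return fuel_flow_kg_s * 50.0 * efficiency

def residuals(measured_mw: list, fuel_flows: list) -> list:
    """Measured minus physics-expected power at each reading."""
    return [m - expected_power_mw(f) for m, f in zip(measured_mw, fuel_flows)]

def flag_degradation(res: list, threshold: float = -0.5, window: int = 3):
    """Return the index where the residual has stayed below `threshold`
    for `window` consecutive readings, or None if it never does."""
    run = 0
    for i, r in enumerate(res):
        run = run + 1 if r < threshold else 0
        if run >= window:
            return i
    return None

fuel = [5.0] * 8                                   # steady fuel flow
measured = [95.0, 94.9, 95.1, 94.2, 94.0, 93.8, 93.5, 93.2]  # slow power loss
idx = flag_degradation(residuals(measured, fuel))
print("degradation flagged at reading:", idx)  # → 5
```

A purely data-driven detector trained on this same slowly drifting signal could learn the drift as "normal"; anchoring the comparison to a physics model is what makes the gradual component-level loss visible.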

Pandey added that Digital Twins are even using artificial intelligence to clean data, further improve model fidelity, and perform more automated inspections and detections. GE also has Digital Twin Modeling Framework to help build models automatically and update them continuously. "This is another part of our Predix platform," added Pandey. 
