I was researching some new applications last month, and on a couple of occasions filled out web forms requesting quotes from suppliers I'd never contacted before. I was interested in possibly making a purchase, so I was dismayed when days went by with no reply. I wonder if their responses ended up in my spam bucket.
If your employer is like mine, your IT department has deployed some sort of spam filter that automatically scans your email, seeking to keep obvious junk mail and phishing exploits out of your inbox. It used to be that I had the option of examining every message caught in the spam bucket. To this day, spam still gets through and legitimate messages still get filtered, so I'm motivated to look over the list of filtered messages when the daily reports appear. But the list has become ponderous. It takes effort and discipline just to get through one's normal inbox and respond to important messages, let alone scan a list of presumed spam that might be many times larger. Were my missing messages among the damned that someone in IT deemed I never need to see?
My robot angel of spam filtering is in fact a rather highly evolved form of artificial intelligence (AI). Spam filter robots have been "reading" billions of emails over the years, and have been "learning," to some degree, how to filter spam from us. That it remains such an imperfect solution makes me concerned about the promises of "big data" to make sense of all the jabbering we're likely to get from the coming Internet of Things (IoT), consumer, commercial and industrial alike. There are those imagining that some pure, fabulous math-magic will crunch through mounds and clouds of IoT data and text the plant manager when the process is about to break. I don't aim to discourage dreamers, but I wake up to a harsher reality where the crafters of AI still haven't perfected the everyday spam filter.
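To make the "learning from us" idea concrete, here is a minimal sketch of the word-frequency (naive Bayes) scoring that classic spam filters are built on. The training messages, function names and add-one smoothing are invented for illustration; no vendor's actual filter works from four messages.

```python
from collections import Counter
import math

# Toy training data: (message, is_spam). Purely illustrative.
TRAINING = [
    ("win free money now", True),
    ("free prize click now", True),
    ("meeting agenda for monday", False),
    ("quote request for compressor parts", False),
]

def train(messages):
    """Count word frequencies separately for spam and legitimate mail."""
    spam, ham = Counter(), Counter()
    for text, is_spam in messages:
        (spam if is_spam else ham).update(text.split())
    return spam, ham

def spam_score(text, spam, ham):
    """Log-odds that a message is spam, with add-one smoothing."""
    score = 0.0
    s_total, h_total = sum(spam.values()), sum(ham.values())
    vocab = len(set(spam) | set(ham))
    for word in text.split():
        p_spam = (spam[word] + 1) / (s_total + vocab)
        p_ham = (ham[word] + 1) / (h_total + vocab)
        score += math.log(p_spam / p_ham)
    return score  # positive means more spam-like

spam_counts, ham_counts = train(TRAINING)
print(spam_score("free money now", spam_counts, ham_counts))   # positive: spam-like
print(spam_score("quote for parts", spam_counts, ham_counts))  # negative: looks legitimate
```

The weakness the column complains about is visible even in this toy: a legitimate supplier reply that happens to use spam-flavored words ("free shipping on your quote") can score wrongly, which is exactly how real quotes end up among the damned.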
That's not to say there isn't value in the bits of data we get from our existing networks of things. You may already be using some obvious ones, like vibration monitoring on big compressors and other rotating machinery. But an alarm from a failing or misadjusted probe is "spam" we'd rather filter out. How do one's data analytics get smart enough to distinguish false alarms from real outliers worthy of further investigation? Our process historian is full of data points that represent anything from a failed sensor to a routine calibration. If it's all just numbers to the data engines, how will the robot know what correlates and what's an anomaly?
Our processes are extraordinarily diverse: nearly every refinery has a cat cracker, but no two are alike or uniform in their deployment of sensors and instrumentation. For a given process, though, it's been proven that there are indeed patterns of normalcy that can be captured and analyzed to detect excursions, and to alert us to impending failures we can avoid. This is the sort of insight we want big data to reveal; it could save us millions. To a degree, these patterns can be derived from pure mathematical analysis, but at some point a person with real knowledge of the process is needed to judge which patterns are valid and which are meaningless. If AI becomes an agent of spam, meaningless messages that require no action, the time and resources expended on data analysis will have been wasted.
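As a toy illustration of what detecting "excursions from patterns of normalcy" can mean mathematically, here is a minimal rolling z-score check on a stream of sensor readings. The window size, threshold and injected data are invented assumptions; real process analytics are far more sophisticated, and this is exactly the kind of purely statistical flag that still needs a knowledgeable person to judge.

```python
import statistics

def detect_excursions(readings, window=20, threshold=3.0):
    """Flag indices whose reading deviates more than `threshold`
    standard deviations from the mean of the preceding `window` readings."""
    flagged = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mean = statistics.fmean(baseline)
        stdev = statistics.pstdev(baseline)
        if stdev and abs(readings[i] - mean) / stdev > threshold:
            flagged.append(i)
    return flagged

# Steady process data with one injected excursion at index 30 (illustrative).
data = [100.0 + 0.1 * (i % 5) for i in range(60)]
data[30] = 115.0
print(detect_excursions(data))  # [30]
```

Note what the math alone cannot tell us: index 30 might be an impending failure, a failed sensor, or a technician bumping a probe during calibration. The statistics find the outlier; only process knowledge decides whether it's signal or spam.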
We don't want to create an internet of spam, and we have to gird ourselves, along with our experienced operators and process specialists, for a long slog through the big data with our clever robot friends. Somehow, we need to educate and alert our management, who might get the impression that all they need is a purchase order and IoT benefits will be delivered "automagically." It's going to take people. We have the ability to bring more sensors, more data and more computing power to bear to gain insight into our processes. But as with our email spam filters, human intelligence and experience are still needed to ensure valuable information gets to the right individuals.