Bringing in and using big and non-traditional data streams requires a growing array of software packages, data management and storage methods, new communication protocols, programming tools and cloud-computing services, most of them coming from the IT side. Here's an incomplete list and glossary of the primary players:
- Cassandra or Apache Cassandra is a free, open-source, distributed database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure
- Cloudera provides Apache Hadoop-based software, support, services and training
- Dynamic SQL is a programming technique that lets users build and execute SQL statements at runtime
- Hadoop is an open-source, Java-based programming framework that supports processing and storage of large data sets in a distributed computing environment. It's part of the Apache project sponsored by the Apache Software Foundation
- MongoDB is a free, open-source, cross-platform, document-oriented database program classified as a NoSQL database, which stores data as JSON-like documents with schemas
- MySQL is an open-source relational database management system, now owned and developed by Oracle
- Power BI is Microsoft's data visualization and business analytics tool
- SAP HANA is an in-memory, column-oriented, relational database management system developed and marketed by SAP SE
- Spark Streaming is an extension of the core Spark API that enables scalable, high-throughput, fault-tolerant stream processing of live data streams
- Splunk produces software for searching, monitoring and analyzing machine-generated big data using a web-based interface
- Tableau is data visualization software that connects databases and graphics
- TrendMiner is a predictive analytics tool for the process industry
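Of the entries above, dynamic SQL is a technique rather than a product, so a brief sketch may help. The example below, using Python's built-in sqlite3 module, builds a query string at runtime from whichever filters happen to be supplied, while passing the values as bound parameters; the table and column names are hypothetical, invented purely for illustration.

```python
import sqlite3

def build_query(filters):
    # Dynamic SQL: the statement text is assembled at runtime from the
    # filters actually supplied. Column names are assumed trusted;
    # values are bound as parameters ("?") to avoid SQL injection.
    sql = "SELECT name, temp FROM sensors"
    params = []
    if filters:
        clauses = ["{} = ?".format(col) for col in filters]
        sql += " WHERE " + " AND ".join(clauses)
        params = list(filters.values())
    return sql, params

# Hypothetical in-memory table of process sensor readings
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sensors (name TEXT, plant TEXT, temp REAL)")
conn.executemany("INSERT INTO sensors VALUES (?, ?, ?)",
                 [("TT-101", "A", 71.3), ("TT-202", "B", 68.9)])

# At runtime the user asked only for plant A, so only that
# clause is generated
sql, params = build_query({"plant": "A"})
rows = conn.execute(sql, params).fetchall()
print(rows)
```

With no filters the same function returns the bare SELECT; with several filters it chains them with AND, which is the kind of on-the-fly statement construction the glossary entry describes.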
For more, read Control's November 2016 cover story, or download the full issue.