Data Acquisition

Primary tools and packages for data analytics

A partial list of the best data analytics software providers

By Jim Montague

Just like in baseball, you can't tell the data analytics players without a program. Here's a partial list of the major leaguers:

  • Amazon Timestream is a time-series database service for IoT and operational applications that can store and analyze large volumes of events at lower cost than relational databases.

  • Cassandra, or Apache Cassandra, is a free, open-source distributed database management system designed to handle large amounts of data across many servers with high availability and no single point of failure.

  • Cloudera provides Apache Hadoop-based software, support, services and training.

  • Excel is Microsoft's ubiquitous spreadsheet software for major operating systems, including Microsoft Windows, Apple macOS, Android and Apple iOS, and has been widely used for more than 20 years. It includes calculation and graphing tools, pivot tables and the Visual Basic for Applications (VBA) macro-programming language, which lets users perform many different calculations and deliver results to the spreadsheet.

  • Falkonry and its Falkonry LRS machine learning (ML) software use multivariate, time-series data to find patterns hidden in operational data, and deliver actionable insights.

  • Hadoop, or Apache Hadoop, is an open-source, Java-based programming framework that supports processing and storage of large data sets in a distributed computing environment using the MapReduce programming model. It's an Apache project sponsored by the Apache Software Foundation. Hadoop divides files into large blocks, distributes them to a cluster's nodes, and then ships packaged code to those nodes so they can process the data in parallel.

  • IBM Cognos Business Intelligence software helps users understand their data, view or create business reports, analyze data, and monitor events and metrics.

  • Lumira, or SAP BusinessObjects Lumira, is business intelligence software from SAP BusinessObjects that manipulates and visualizes data. It's related to SAP HANA, an in-memory, column-oriented, relational database management system.

  • Matrix Laboratory (MATLAB) is a multi-paradigm, numerical-computing environment and proprietary programming language created by MathWorks. It enables function and data plotting, algorithm deployment, matrix manipulations, HMI construction, and interfacing with programs written in other languages, such as C, C++, C#, Java, Fortran and Python.

  • Microsoft Power BI is a business analytics service that delivers business intelligence and interactive visualizations via a simple interface that lets users create dashboards and reports. It delivers cloud-based BI services, as well as a desktop-based interface, and provides data warehouse capabilities, including data preparation, data discovery and interactive dashboards.

  • Microsoft SQL Server Analysis Services (SSAS) is an online analytical processing (OLAP) and data mining tool in Microsoft SQL Server. SSAS analyzes data spread over multiple files, tables or databases. Microsoft business intelligence and data warehousing services in SQL Server include integration, reporting, and multidimensional and tabular analysis.

  • MongoDB is a free, open-source, cross-platform, document-oriented database program that's classified as a NoSQL database, and uses JSON-like documents with optional schemas.

  • MySQL is an open-source relational database management system developed and supported by Oracle. Dynamic SQL is a related programming method that lets users build SQL statements dynamically at runtime.

  • Netilion Analytics is part of Endress+Hauser's Netilion environment, and can register assets manually, create asset lists automatically with an edge device, and provide analyses and overviews on an insights page.

  • PI System software from OSIsoft connects sensor-based data, operations and people to manage process efficiency, asset health, quality and resource management. It works through server-based technology, makes historical and real-time data immediately accessible, and gives users end-to-end enterprise visibility.

  • Predict-IT APR software from Engineering Consultants Group uses data from existing OSIsoft PI System applications, and monitors it in real time to recognize performance degradations.

  • Seeq is a software application dedicated to process data analytics that lets users search their data, add context, cleanse, model, find patterns, establish boundaries, monitor assets, collaborate in real time, and interact with time-series data. It connects to and works with all types of process data historians.

  • Spark, or Apache Spark, is an open-source, distributed, general-purpose cluster-computing framework that delivers an interface for programming clusters with implicit data parallelism and fault tolerance. Spark is based on the resilient distributed dataset (RDD), a read-only multi-set of data items distributed over a machine cluster and maintained in a fault-tolerant way. Spark and its RDDs were developed in response to limits in MapReduce cluster computing.

  • SparkCognition enables users to weave artificial intelligence (AI) into their organizations, and works with them to build human-level intelligence applied at machine scale to optimize operations and find new solutions.   

  • Splunk produces software for searching, monitoring and analyzing machine-generated big data using a web-based interface.

  • Spotfire, or Tibco Spotfire, is an analytics platform that includes an A(X) version with natural language search and AI-powered insights, user interface, automated data flows and real-time data streams.

  • Statistica is analytics software, presently owned by Tibco Data Science, which provides data analysis, management, visualization and mining functions, and enables predictive classification, modeling, clustering and exploration methods. Other capabilities are available by integrating with the free, open-source R programming language.

  • Tableau is data visualization software that connects to databases and presents their data as interactive graphics.

  • Toumetis provides predictive analytics to IIoT applications with its ML capabilities and Cascadence platform, which help users by distributing algorithms that enable asset performance data to be delivered to other stakeholders in secure, permission-based views.

  • TrendMiner is a predictive analytics tool for the process industry.

  • Trifacta reports it develops data-wrangling software for data exploration and self-service data preparation for analysis. Trifacta works with cloud and on-premises data platforms to turn raw data into clean, structured formats. Trifacta uses machine learning, data visualization, human-computer interaction and parallel processing to prepare information for business processes, including data analytics.
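
For readers who want to see the MapReduce pattern mentioned in the Hadoop and Spark entries, here's a minimal sketch in plain Python. It isn't Hadoop code and uses no Hadoop or Spark API. It only imitates the steps described above on one machine: divide records into blocks, run a map step on each block, group the results by key, and reduce them. The instrument tag names and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def split_into_blocks(records, block_size):
    """Divide records into fixed-size blocks, one per hypothetical cluster node."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def map_block(block):
    """Map step: emit a (tag, 1) pair for every reading in one block."""
    return [(tag, 1) for tag, _value in block]


def shuffle(mapped_blocks):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for pairs in mapped_blocks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_readings_per_tag(records, block_size=2):
    """Count readings per tag using split, map, shuffle and reduce steps."""
    blocks = split_into_blocks(records, block_size)
    # On a real cluster, each block would be mapped in parallel on its own node;
    # here the blocks are simply processed one after another.
    mapped = [map_block(block) for block in blocks]
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    # Hypothetical (tag, value) sensor readings, made up for this example.
    readings = [
        ("FT-101", 12.5),
        ("TT-202", 80.1),
        ("FT-101", 12.7),
        ("PT-303", 3.2),
        ("TT-202", 79.8),
    ]
    print(count_readings_per_tag(readings))
```

The map step only ever looks at its own block, which is what allows a framework like Hadoop to send the code to wherever each block is stored instead of moving the data.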
