Cool new products from big data’s Hadoop World show


Strata World/Hadoop World

There’s a big world of big data tools and services, and many of the leading ones are on display this week at Strata World/Hadoop World in San Jose. From new distributions of open source database technology to handy tools for managing them, here are some of the big data products hitting the market.


R Server and latest version of Spark for HDInsight

Pricing: R Server for HDInsight: Starting at $0.40/node/hour; Spark: Starting at $0.32/node/hour

Key features: R Server for HDInsight runs Microsoft R Server as a service on top of HDInsight. Using Hadoop and Spark, you can scale R scripts to handle data up to 1,000x larger than open source R can manage, with up to 50x faster performance through R Server’s multi-threaded math libraries and transparent parallelization. More info.


Azure Data Catalog

Key features: Azure Data Catalog is an enterprise metadata catalog that allows for self-service discovery of data sources. It allows analysts, data scientists or data developers to register, enrich, discover, understand and consume data sources. More info.


Workload Migration 1.0 & Data Blending 1.0

Key features: Impetus Workload Migration enables enterprises to quickly and easily identify, analyze and offload data and workloads from a traditional data warehouse to Hadoop. Ingest multiple sources of data, mix and match, cleanse and enrich, and create complex workflow pipelines. More info.


Looker Blocks

Key features: Looker announced it has formed an alliance with IBM Cloud Data Services to deliver a suite of Looker Blocks, new developer tools designed to simplify and customize data analysis for any business utilizing IBM’s Cloud Data Services. 

The Looker Block for IBM completes the vision of IBM’s Simple Data Pipe app, utilizing Looker to quickly transform data that has been moved into dashDB using the Simple Data Pipe app. More info.

Trifacta Enterprise with Photon Compute Framework


Key features: The Photon Compute Framework is a technology enhancement at the core of Trifacta’s interface developed to provide users with an interactive and computationally intelligent data wrangling experience on large in-memory datasets. More info.


Tamr’s data unification platform now compatible with Apache Spark

Key features: Beyond compatibility with Spark, Tamr is developing open interfaces and core components to support data curation solutions powered by Spark. This common toolset will support scalable application development for verticals from procurement to customer data integration, life sciences, and more. More info.


Ryft ONE Cluster

Key features: The Ryft ONE Cluster delivers a high-performing and efficient way to modernize data center architectures for petabyte-scale big data analytics. More info.


Adaptive Data Preparation Platform

Key features: Paxata’s Spring ’16 release of its Data Preparation Platform includes advanced filter grams, smart integration of complex nested JSON/XML data and Hadoop compressed files, and granular searching across all columns of wide datasets and in every cell value. More info.



Key features: Striim (pronounced “stream”) is an enterprise-grade streaming integration and intelligence solution. The platform enables real-time data integration and change data capture from enterprise databases into Hadoop, Kafka, the cloud and more. More info.


Elastic Integration Platform Winter Release - 2016

Key features: SnapLogic’s Elastic Integration Platform Winter Release 2016 adds a new Spark mode for data pipelines, which translates pipelines into the Spark data processing framework without scripting. More info.



Pricing: Community Edition will be open source and free to download. Enterprise Edition is not yet available, and pricing has not been set.

Key features: Build a Docker container-powered Hadoop/Spark cluster in five minutes with zero cluster, container, Hadoop or Spark experience. If you can install a smartphone app, you can install ClusterGX. More info.


Stream Processing Quick Start Solution

Key features: The Stream Processing Quick Start Solution is a combination of MapR software and services for processing high volumes of streaming data to easily create new Internet of Things applications. More info.


GraphLab Create

Key features: GraphLab Create allows users to evaluate, explore and explain machine learning models and predictions. Quantitatively measure the quality of models and predictions and compare alternative methods. New explanations shed light onto why a model makes a particular prediction, allowing developers to gain confidence that models are making decisions for the right reason. More info.


Platfora 5.2

Key features: Platfora 5.2 features native integration with Tableau, Lens-Accelerated SQL accessible through any SQL client, the option to run directly on the Hadoop cluster using YARN, and enhanced vizboards. More info.

Copyright © 2016 IDG Communications, Inc.