Pivotal is an interesting beast, a spin-out built on technology contributed by both EMC and VMware. The company is seen as the cool kid next to its relatively conservative parents. While both VMware and EMC are hampered in their ability to innovate by their customers' relative lack of appetite for it, Pivotal has no such constraints. Indeed, industrial powerhouse GE invested a nine-figure amount in the company, primarily because it is an innovation machine. The Cloud Foundry open source PaaS is a good example of a project that Pivotal is leveraging to offer innovation to its customers. The company recently boasted a $100 million run rate for its Pivotal Cloud Foundry (PCF) business unit.
But PCF is just one part of the business. Pivotal also has a big analytics play, correctly recognizing that the same organizations that want to use developer-centric tools also have a requirement to analyze the ever-increasing amounts of data they have to deal with. To this end, the company has been pushing its own HAWQ analytics engine and associated data science tools. The company has now decided to open-source these solutions and is partnering with a couple of well-respected third parties, namely Hortonworks and Altiscale, to bring them to market. It's also taking a well-aimed blow at Oracle and suggesting that this move will be a "strong knock against Oracle and the traditional database."
In terms of what the technologies actually are, HAWQ is the Hadoop-native SQL analytics database and MADlib is the parallel machine learning library; both will now be a part of the Apache Software Foundation (ASF). HAWQ was launched in 2013 but was itself a natural extension of the expertise gained through the acquisition of the Greenplum data warehousing system, which is itself based on the PostgreSQL database. HAWQ created a real-world application for Hadoop: SQL analytics. With this move, Pivotal suggests it will corner this big potential market. As for MADlib, the library is a collection of scale-out, parallel machine-learning algorithms integrated with HAWQ. MADlib was developed by Pivotal in conjunction with researchers from the University of California, Berkeley, Stanford University, the University of Florida, and Pivotal’s customers.
HAWQ is able to execute the MADlib algorithms in parallel and natively inside any Hadoop cluster based on Pivotal’s Hadoop distribution (Pivotal HD), Hortonworks’ Hadoop distribution (HDP), or the ODPi (ODPi.org) core. MADlib supports HAWQ, Pivotal Greenplum, and PostgreSQL.
Pivotal will continue to distribute and support commercial distributions of Apache HAWQ and Apache MADlib via the Pivotal Big Data Suite. Customers with enterprise support agreements will also be entitled to request priority technical assistance as well as receive patches and hotfixes.
Of course, there is an obvious question to ask: why is Pivotal choosing to open-source a solution that it claims is seeing excellent commercial success? That's a fair question, but arguably one that ignores the realities of today's market with regard to delivering technology within a broad ecosystem. The days of single-source vendors are rapidly passing, and many companies are increasingly achieving commercial success by leveraging strong partnerships of mutually interested parties. This is the model that Pivotal pushed with Cloud Foundry, an open source Platform as a Service (PaaS) that is available not only from Pivotal but also from a range of third-party vendors.
This would appear to be the thinking behind this announcement, and it's also a win for the third parties named today: Hortonworks and Altiscale get to deliver a more specific product offering alongside their broader Hadoop tools.
It will be interesting to see how much success the open-sourcing of these tools generates. As for that claim about it being a severe knock to Oracle... my advice would be to take that barb with a grain of salt.
This article is published as part of the IDG Contributor Network.