The Hadoop market is white hot right now. Everywhere we turn, it seems, a new company is throwing its hat into the Hadoop ring. Is all of the attention justified? Is this a case of overhype causing overload? How can the Hadoop world possibly make room for all of these companies to survive and thrive? I guess we are going to see how this all plays out.
Yesterday word came down that Yahoo had spun off some of its top Hadoop engineers into a new open source company, backed by Yahoo and Benchmark Capital, called Hortonworks. For those wondering, Horton is the elephant in the movie Horton Hears a Who, starring Jim Carrey. Of course, Hadoop’s logo is a little baby elephant. I guess that might be a connection?
Anyway, Yahoo has been the biggest early supporter of Hadoop and has some 40,000 servers processing 5 billion jobs a month. Hadoop is actually an Apache project, so while Yahoo is a major supporter and was the early lead, many major web companies and others have been contributing.
Cloudera is one company that has become the “Hadoop company,” supplying support and services around it. Hot on the heels of the Hortonworks announcement, Cloudera made an announcement of its own: its latest version will offer more configuration and management tools for Hadoop.
But it doesn’t end there. Platform Computing also announced yesterday that “it has signed the Apache Corporate Contributor License Agreement allowing the company to contribute to the Apache Software Foundation for developing Apache-based, open-source Hadoop Distributed File System (HDFS)”. Platform is focusing on its recently announced Platform MapReduce, its own implementation of MapReduce, the programming model that Hadoop was originally based on back when MapReduce was strictly a Google tool.
Not to be outdone, MapR, which “allows more businesses to harness the power of big data analytics. MapR's innovations make Hadoop more reliable, more affordable, more manageable and significantly easier to use,” today announced an expansion of its partner program “to enable diverse organizations within the Hadoop community expand their reach, ultimately helps customers leverage big data analytics through integrated access to MapR’s next generation distribution for Apache Hadoop”.
Wait, there is more! Actuate also announced yesterday that it is adding support for Hadoop to its BIRT framework for business intelligence. The company says:
“The combination of BIRT’s open source, flexible approach to business intelligence and Hadoop’s data scalability enables organizations to build information applications that give the full range of end users — including business analysts and non-technical users — valuable insight into data stored in Hadoop.”
One more company throwing its hat into the Hadoop ring is Pervasive, which also announced the release of “Pervasive TurboRush for Hive, new software that makes Hive queries run faster on less hardware”. In case you were wondering, “Hive is the data warehouse system built on top of Hadoop. Pervasive TurboRush for Hive accelerates Hive by using the Pervasive DataRush dataflow engine on the back end, providing faster execution of Hive programs without needing to modify any code”.
Last week I wrote about no less a company than LexisNexis offering its own Hadoop competitor, HPCC, as an open source alternative.
So what is the gold rush about? Eric Baldeschwieler, Hortonworks' chief executive and former head of software engineering for the Hadoop team at Yahoo, said, “we anticipate that within five years, more than half the world's data will be stored in Apache Hadoop”. Well, if that is anywhere near true, you can see why the land grab is on.
What a great open source story: one Apache project giving rise to all of this. Now the question is whether they can all play together, and whether the “coopetition” will make big data easier, better and faster for all of us.
In the meantime, if you ask the real Hadoop company to stand up, you could wind up with a room full of elephants!
As co-founder and Managing Partner at The CISO Group, Alan Shimel is responsible for driving the vision and mission of the company. The CISO Group offers security consulting and PCI compliance management for the payment card industry. Prior to The CISO Group, Alan was the Chief Strategy Officer at StillSecure. Shimel was the public persona of StillSecure as it grew from a startup to helping defend some of the largest and most sensitive networks in the world.
Shimel is an often-cited personality in the technology community and is a sought-after speaker at industry and government conferences and events. His commentary about the state of security, open source and life is followed closely by many industry insiders via his blog and podcast, "Ashimmy, After All These Years" (www.ashimmy.com). Alan is now also a regular contributor to The CISO Group’s security.exe blog and podcast.
Alan has helped build several successful technology companies by combining a strong business background with a deep knowledge of technology. His legal background, long experience in the field, and New York street smarts combine to form a unique personality.
Disclosure: The CISO Group sells a software-as-a-service PCI compliance application called SAQPro. The company is independent and does not represent any other vendor's products as a reseller.