Database vendors add Google's MapReduce
By Chris Kanaracus, IDG News Service
August 26, 2008 02:20 PM ET
Greenplum and Aster Data Systems, two startups involved in large-scale data analysis, announced this week that their products
will support MapReduce, a programming technique originally developed by Google for parallel processing of large data sets across commodity hardware.
Software developers tend to be more comfortable with languages such as Java and C++ than the database language SQL, said Mayank
Bawa, cofounder and CEO of Aster, maker of a cluster database system that splits workloads into multiple discrete tiers.
"Most developers struggle with the nuances of making a database dance well to their directions," he wrote in a blog post. "Indeed, a SQL maestro is required to perform interesting queries for data transformations (during ETL processing or Extract-Load-Transform
processing) or data mining (during analytics)."
Enter MapReduce, the goal of which was to provide a "trivially parallelizable framework so that even novice developers (a.k.a
interns) could write programs in a variety of languages (Java/C/C++/Perl/Python) to analyze data independent of scale," Bawa
wrote.
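To make the idea concrete, here is a minimal single-machine sketch of the map, shuffle and reduce phases, using the familiar word-count task. It is an illustration only, not Aster's, Greenplum's or Google's implementation; the function names (map_words, shuffle, reduce_counts, word_count) are hypothetical, and a real framework would run the map and reduce calls in parallel across many machines.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, the job a framework does between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, counts):
    """Reduce step: collapse all the values for one key into a single total."""
    return word, sum(counts)


def word_count(lines):
    """Run the three phases in sequence on one machine (hypothetical driver)."""
    pairs = [pair for line in lines for pair in map_words(line)]
    return dict(reduce_counts(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big clusters", "Data at scale"]))
```

Because each call to map_words sees only one line and each call to reduce_counts sees only one key, the work can be split across machines without the developer writing any coordination code, which is the property Bawa describes as "trivially parallelizable."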
Meanwhile, Greenplum, maker of a database it says can scale to a petabyte of information, said this week that a MapReduce
framework will be part of its dataflow engine as of September.
The twin announcements brought a nod of approval from one close observer of the database world.
"On its own, MapReduce can do a lot of important work in data manipulation and analysis. Integrating it with SQL should just
increase its applicability and power," wrote Curt Monash of Monash Research, on the DBMS2 blog.
"MapReduce isn't needed for tabular data management. That's been efficiently parallelized in other ways," he added. "But if
you want to build non-tabular structures such as text indexes or graphs, MapReduce turns out to be a big help."
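As a rough illustration of the text-index case Monash mentions, the sketch below builds an inverted index (term to list of document IDs) in the map-and-reduce style. It is a single-process toy under stated assumptions, not code from either vendor: the names map_document, reduce_postings and build_index are hypothetical, and documents are assumed to be plain strings keyed by an integer ID.

```python
from collections import defaultdict


def map_document(doc_id, text):
    """Map step: emit one (term, doc_id) pair per distinct term in a document."""
    return [(term, doc_id) for term in set(text.lower().split())]


def reduce_postings(term, doc_ids):
    """Reduce step: turn all document IDs seen for a term into a sorted list."""
    return term, sorted(doc_ids)


def build_index(docs):
    """Group the mapped pairs by term, then reduce each group (hypothetical driver)."""
    grouped = defaultdict(list)
    for doc_id, text in docs.items():
        for term, emitted_id in map_document(doc_id, text):
            grouped[term].append(emitted_id)
    return dict(reduce_postings(term, ids) for term, ids in grouped.items())


if __name__ == "__main__":
    index = build_index({1: "large data sets", 2: "Large clusters"})
    print(index["large"])
```

The result is a nested, variable-length structure rather than rows and columns, which is the kind of output that is awkward to express in plain SQL.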
The IDG News Service is a Network World affiliate.