Network World - The biggest news from Amazon Web Services' first user conference was the launch of the company's newest service, Redshift, a cloud-based data warehouse tool. It prompts the question: Is the cloud the right place for data warehousing?
AWS officials say that for businesses struggling to manage their data, the cloud can provide a low-cost alternative to investing in infrastructure to manage it all on their own sites. The biggest issues holding back Redshift, though, may be the same concerns that come with using the public cloud in general. Some businesses just don't feel comfortable putting sensitive financial or personally identifiable data in anyone's public cloud. And then there's the issue of how all that data is actually transferred into the cloud.
These issues -- potential benefits in cost and manageability, weighed against concerns about security and data transfer -- will likely mean that Redshift follows the path of many of AWS's other enterprise-geared services, says Jeff Kelly, a big-data researcher at The Wikibon Project. Forward-looking businesses that have already embraced Amazon's cloud may move more quickly to the cloud for services like data warehousing, whereas larger enterprises that have been slow to jump into the public cloud may test the service on a use-case basis to see if it's the right fit for them.
Data warehouses have traditionally been defined as customized data storage services that aggregate data from multiple sources and collect it in a central location so that reports and queries can be run against it. Many companies use data warehouses to compile regular financial reports or business metric analyses. Redshift is a column-oriented, SQL-based tool designed to scale from a terabyte to multiple petabytes.
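To make the idea concrete, here is a minimal Python sketch of what "aggregating data from multiple sources" into a column-oriented layout can look like. It is an illustration only and says nothing about how Redshift is actually implemented: the two source lists, the field names and the helper functions `to_columns` and `total_by` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records from two separate source systems (assumed data).
web_orders = [
    {"region": "east", "amount": 120.0},
    {"region": "west", "amount": 80.0},
]
store_orders = [
    {"region": "east", "amount": 50.0},
    {"region": "west", "amount": 30.0},
    {"region": "east", "amount": 20.0},
]


def to_columns(*sources):
    """Consolidate row records from every source into one list per column."""
    columns = defaultdict(list)
    for source in sources:
        for row in source:
            for name, value in row.items():
                columns[name].append(value)
    return dict(columns)


def total_by(columns, key, measure):
    """Sum `measure` grouped by `key`, reading only those two columns."""
    totals = defaultdict(float)
    for group, value in zip(columns[key], columns[measure]):
        totals[group] += value
    return dict(totals)


# Central store: every source merged, laid out column by column.
warehouse = to_columns(web_orders, store_orders)

# A simple "report": total order amount per region.
report = total_by(warehouse, "region", "amount")
print(report)
```

Because each column is kept as its own list, the report touches only the `region` and `amount` columns, which is the general appeal of a columnar layout for reporting-style queries.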
Along with announcing Redshift, AWS also released two new virtual machine instance types meant to work with it: an XL instance that has 2TB of local storage, and an 8XL instance type with 16TB of storage. AWS partnered with database analysis company ParAccel to architect Redshift; Amazon.com, AWS's parent company, invested in ParAccel last year. Like traditional on-premise data warehouses, Redshift can be architected to, for example, integrate data from Amazon's DynamoDB NoSQL database, Simple Storage Service (S3), or existing applications on customers' own premises. Redshift serves as the repository for that data, exposing it to business analytics tools that run reports on it.
"I think there will definitely be some interest" in Redshift, says Kelly, the Wikibon researcher. "One issue with data warehousing is many times this is highly critical, proprietary information that some may be reluctant to ship off to a cloud provider." It could still be an attractive option, though, for organizations whose data is siloed or has variable demands, or for companies that don't have the on-premise infrastructure to manage data warehousing. "If you're already doing data management in the cloud, and particularly Amazon's cloud, this seems like an opportunity to take advantage of a new service," he says.