“You can't program all those things into a data repository,” he says.
The same applies to external data sources, he adds. “Data collections at the federal level have changed dramatically over the past 50 years,” he says. “Understanding the culture and context of data collection is really a necessity for using the data well.”
MYTH 6: The more specific the prediction, the better
It's human nature to think that something more specific is more accurate: that ‘3:12 p.m.’ is more accurate than ‘sometime in the afternoon,’ or that the meteorologist who predicts that it will definitely rain on Sunday morning is more accurate than the one who predicts a “fifty percent chance of showers this weekend.”
In fact, the opposite is true. In many situations, the more exact prediction is less likely to be accurate.
Say, for example, a customer buys a very specific laptop, in a very particular configuration. And the only other customer to have bought that same product in the past also bought a pair of hot pink stilettos.
“A recommendation for hot pink stilettos may be very specific, but may be too specific – and have a high margin of error,” says Jerry Jao, CEO of Retention Science, a marketing firm in Santa Monica, Calif.
“This is actually something we see pretty commonly among business and marketing managers,” he says.
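One way to make the "high margin of error" point concrete is to compare the uncertainty around a rate estimated from a single prior buyer with one estimated from many. The sketch below uses the Wilson score interval, a standard way to put a confidence range around a proportion. The function name and the purchase counts are illustrative assumptions, not figures from Retention Science.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Return an approximate 95% confidence range for a proportion.

    successes / trials is the observed rate; z=1.96 corresponds to
    roughly 95% confidence.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    )
    return center - half_width, center + half_width


# Hypothetical data: one past buyer of the exact laptop configuration,
# who also bought the stilettos (1 of 1).
narrow_low, narrow_high = wilson_interval(1, 1)

# Hypothetical data: 1,000 past buyers of any laptop, 50 of whom
# bought a laptop bag (50 of 1,000).
broad_low, broad_high = wilson_interval(50, 1000)

print(f"exact-match segment: {narrow_low:.2f} to {narrow_high:.2f}")
print(f"broad segment:       {broad_low:.3f} to {broad_high:.3f}")
```

The very specific segment produces a range so wide that it says almost nothing about the next customer, while the broader segment produces a tight one.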
MYTH 7: Big Data equals Hadoop
Hadoop, a popular open-source framework for storing and processing large volumes of data, including unstructured data, across clusters of machines, has been getting a lot of attention lately.
But there are other options.
“There is a whole NoSQL movement,” says Irfan Khan, general manager and senior vice president at SAP Big Data. “There is MongoDB, Cassandra – a whole rack of other technologies.”
Some of those technologies may be a better fit for a particular Big Data project than others.
In particular, Hadoop works by dividing data into chunks, and working on multiple chunks simultaneously. This approach works on many Big Data problems, but not all of them.
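The divide-and-process pattern described above can be sketched in a few lines of plain Python. This is only an illustration of the general idea, not Hadoop's API: the function names are made up for the example, and a thread pool on one machine stands in for a cluster of worker nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, chunk_size):
    """Divide the input into fixed-size chunks (the last may be shorter)."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def count_words(chunk):
    """'Map' step: count words in one chunk, independently of the others."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(partials):
    """'Reduce' step: combine the per-chunk counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def chunked_word_count(records, chunk_size=2, workers=4):
    """Split the records, process the chunks concurrently, then merge."""
    chunks = split_into_chunks(records, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, chunks))
    return merge_counts(partials)


if __name__ == "__main__":
    lines = ["big data", "big deal", "data data"]
    print(chunked_word_count(lines))
```

The pattern works here because each chunk can be counted without knowing anything about the others. A job in which every step depends on the result of the previous one does not split this cleanly, which is one reason the approach does not suit every problem.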
“While YARN and Hadoop 2 address some of this, sometimes you need to deal with things in ways that Hadoop isn't ideal for,” says Grant Ingersoll, CTO at Redwood City-based LucidWorks, a Big Data consulting firm. “People need to keep a level head and decide what is best for them, not just what is the shiny object that all the cool kids are using.”
MYTH 8: End users don't need direct access to Big Data
With Big Data arriving at high speed, from a wide variety of sources and in large volumes, it might seem too complicated for regular employees to deal with.
But that's not necessarily the case.
Take, for example, all the data generated by the devices in an intensive care unit. Heart rates, respiration data, EKG readings. Too often, though, the doctors and nurses can only see a patient's current readings.