Network World - "Puppies are free, but you still have to feed them."
Virendra Vase jokingly likened this truism to Klout’s experience with big data. Vase is CTO at Klout, which uses analytics to measure people’s influence across social networks. “We basically sit above the social networks, gathering all the signals, helping folks to understand and be recognized for their influence,” he said.
It’s an effort that requires processing 12 billion data signals a day for more than 400 million people on multi-petabyte clusters. Klout’s big data environment includes the open-source platform Hadoop, along with HBase, Hive, ElasticSearch, Scala, Storm, Node.js and other tools.
Vase spoke recently about some of the big-data lessons Klout has learned since its launch. He was joined by Rachel Higham, chief enterprise architect at insurer ACE Group, and Brian Barnes, vice president of consumer applications at Tenet Healthcare.
The trio shared their big-data tips, tricks and troubles with attendees of The Big Data Conference in Chicago, led by Johna Till Johnson, president of Nemertes Research, and John Burke, principal research analyst at Nemertes. Here are some of their insights:
Data volumes complicate testing, storage, computation
The sheer volume of data makes testability a challenge, Vase said. Dealing with unstructured data formats and the limitations of social media-derived data (such as 140-character Tweets) also complicates Klout’s big data efforts. And when it comes to storage, more isn’t always better, he cautioned. “We have to be able to figure out: What sort of data do we need to keep? What do we need to archive?” he said. “At the end of the day, it’s not really about data, it’s about how we analyze the data.”
Maturity of technologies is a challenge
In support of his puppy analogy, Vase warned that many big data technologies are works in progress. While the base technologies are maturing, the tools for management and configuration are in a nascent phase, leaving IT pros to do the work of addressing the gaps. “We’ve had to develop tools from a management perspective, from a workflow perspective, from a configuration perspective,” Vase said.
Think modular, be ready to invest
A modular infrastructure has been important for Klout, since it allows the IT team to handle changes in business priorities and provides transparency into operations, Vase said. He reiterated the need to invest in management and productivity tools. “That’s what 20%, 25% of my engineering resources are focused on, productivity tools and workflow management.”
Expect to struggle to find talent
Big data tools are evolving rapidly, and vendor support isn’t where it needs to be -- which makes finding big data talent even more challenging. “It’s extremely hard to find talent, so what we’ve done is hire really good engineers and train them along the way,” Vase said. (See "So you want to be a data scientist?")
Align technologists with product and business people
It’s easy to talk about getting big data experts to work closely with product experts and business pros, but it can be difficult to follow through with the idea. “Increasingly over the last couple of years we’ve brought them together, because both sides need to understand the other side,” Vase said.
Spread the word
ACE Group was founded as a reinsurance company and has diversified its global business lines to include insurance products for businesses (such as property and casualty, group accident and health, reinsurance and risk management) as well as consumers (including personal accident, homeowners, and auto). “It’s been that shift in strategic direction that created our first recognizable big data problem,” said Higham.
ACE Group has become adept at using big-data analytics to uncover risk and fraud patterns, identify new business opportunities, and gain insights into customer sentiment -- and then share the results with the company. For example, big data showcase events for business and IT help create a buzz around the potential of big data analytics. “I think the biggest learning we have is this: Absolutely over-communicate our intent, our vision and our success stories,” Higham said.
Define a vision, educate the leadership team
“We spent a lot of time mapping where big data could be leveraged in both our underwriting and claims processes and explaining that back to the business,” Higham said. “That identified our first two proofs-of-concept for us, and now we’ve got a pipeline of over 40 areas where we’re going to further apply some of the tools and techniques.”
Establish a steering group
ACE Group’s steering committee leads the company’s big data agenda. Surprisingly, it’s not stacked with techies. “It’s hardly got any technologists on it. There are four technologists and about 20 business leaders on that team,” Higham said.
Have data & talent in order before starting projects
“One of the critical things was to invest in building out the skills and resources first before we started this journey. Without that we would have had an unacceptable lag in delivering value back to the business,” Higham said.
Embed talent within the business
ACE Group’s big data experts are scattered throughout the business. “We have fully embedded the practice in all of our underwriting and claims teams around the world. We’ve done that by establishing a core competency, collocating new skills within our business -- statisticians, data-scrubbers, data analysts, process experts -- with our claims and underwriting expertise,” Higham said. “It’s that collocation that has helped us pool knowledge, share expertise, and evolve and innovate where we’re leveraging big data.”
Don’t underestimate vendor management or systems integration
For Vanguard Health Systems (which is now part of Tenet Healthcare), a best-of-breed approach to big data analytics meant taking on eight vendors, explained Barnes. “It took us eight different companies, because no one had an out-of-the-box solution,” he said. Vendor management is a challenge, as is integrating all the different systems. “Systems integration has been a huge challenge for us,” Barnes said.