If your data lake turned into a data swamp, it might be time to take the next step

A conversation with Deloitte’s analytic practice leader about common analytic problems and new solutions


As the leader of Deloitte’s analytics practice, Paul Roma directs the company's analytics offerings across all businesses, so he sees companies struggling with a range of issues.  Network World Editor in Chief John Dix recently talked to Roma about everything from what analytics problems companies are facing (Hint: the swamp reference above), to tools that help extract more value (cognitive analytics and machine learning), and even the executive management roles that are evolving (the title doesn’t matter much, but ownership of the problem does). 


Paul Roma, Chief Analytics Officer, Deloitte Consulting LLP

What do customers typically bring you in to address?  Are they looking to solve a specific problem, or are they trying to address bigger picture, overarching analytic issues?

Most routinely we are brought in to work on a specific business outcome. A customer may be looking to, say, improve their consumer Net Promoter Score (NPS), which is an industry norm for scoring a consumer’s relation to a particular corporation and its products. It’s a heuristic that nets customer survey responses into a single score you can judge yourself against. Or a healthcare organization may come to us to help improve outcomes in certain healthcare protocols, so we’re usually talking about business outcomes.
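
For readers unfamiliar with the metric, NPS as commonly defined is the percentage of promoters (ratings of 9 or 10 on a 0-10 "how likely are you to recommend us" question) minus the percentage of detractors (ratings of 0 to 6). The sketch below is a minimal illustration of that standard calculation; the function name and sample ratings are hypothetical and do not come from the interview or from Deloitte.

```python
from typing import Iterable


def net_promoter_score(ratings: Iterable[int]) -> float:
    """Return NPS on a -100..100 scale from 0-10 survey ratings."""
    ratings = list(ratings)
    if not ratings:
        raise ValueError("need at least one rating")
    # Promoters answer 9 or 10; detractors answer 0 through 6.
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    # Passives (7 or 8) count only toward the total number of responses.
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical survey responses, for illustration only.
sample = [10, 9, 8, 7, 6, 3, 10, 9, 5, 10]
print(net_promoter_score(sample))
```

With this sample, five promoters and three detractors out of ten responses give a score of 20.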

When you arrive do you find companies have the analytic tools they need, or are they looking for input on new tech as well?

Larger customers have the analytic tools.  There isn’t a company we go into that doesn’t have one of everything.  The question is more of focused usage. It isn’t a shortage of data, either, because they’ve got tons of data.  It’s normal now to have data warehouses or data lakes that have been put together over the years.  But I’ve seen multiple millions of dollars spent on data lakes that become what I call data swamps.  They’ve spent all this money to put everything together, yet they can’t do anything with it. The major question now is how to use the data to get better outcomes.

Given there is so much data and so many different tools available to make sense of it, how do you go about helping clients move forward?

I offer up three ways to think about it.  First, if you’re grounded in an outcome it leads you to certain questions to interrogate the problem.  If I want to improve consumer relations or if I want to improve outcomes in healthcare, you’re at least grounded in what you want to do.  As you analyze the data the experience leads you to create certain domains and take what is a very unstructured data lake and start to apply structured boundaries.

Once you’ve done that you can start to use more advanced tools like cognitive analytics to apply structure to the data lakes, and natural language processing and machine learning to give you a way of letting the data give you hypotheses. 

The advanced techniques have gone beyond putting out a report and then looking at the graph to see what it says.  Now machine learning can actually create causal analysis and tell you what the hypotheses are for which variables, or which data domains are most influential to a particular outcome.  In healthcare, for instance, the machine can indicate why readmissions on a given protocol are high.  The causal analysis leads to that type of analytic.

The advanced techniques are probably where we’re called in the most to try to make sense of all the data.  Without advanced techniques there is no way to cut through it.  You have no knife to cut through the data.  Just running reports will create endless reams of paper that, frankly, you could never get anyone to interpret.

We bring custom-tuned algorithms to a lot of our engagements -- whether it’s in healthcare, supply chain or customer marketing -- and with machine learning algorithms and supervised learning cycles we can run against their data and create hypotheses you can investigate with your experience.

Interesting.  Are those algorithms particular to vertical markets or is there a common base you build from?

We have horizontals and verticals.  The verticals tune into markets like supply chain in manufacturing or supply chain in consumer products, protocols in life sciences, etc., and horizontals are used throughout.  [An example of the latter] is a sparse matrix completion algorithm we patented.  If the data lake you have for a particular problem doesn’t fill in all of the variables you need, it runs predictive algorithms to fill that in and creates hypotheses as to what the trends would be.  We just ran it against a diabetes protocol with a large healthcare company and, with 93% accuracy, we can predict who isn’t compliant with their diabetes protocol without having any of the compliance data associated with it.

Meaning you can predict who isn’t doing what they’re supposed to be doing?

Right.  They’re not doing weigh-ins, they’re not doing their exercise.  It doesn’t predict exactly what they’re not doing because we just started, but it predicts who isn’t compliant. We’re hoping to improve its accuracy into the high 90s, and then we’ll be able to scrutinize a whole hospital system because it becomes predictive at that point.  Before you have compliance problems you can show trended scores.  This person is trending towards someone that won’t be compliant.  And then you can have a nurse call and ask, “Are you having trouble taking your insulin?  Is there a reason you haven’t been doing your exercise?  Are you not getting to the doctor because you have a transportation problem?”  You can start to seek out particular issues in the protocol to try to help.
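
The general idea behind the matrix completion Roma mentions, filling in missing cells of a data table from the structure of the cells you do have, can be sketched in a few lines. The example below is not Deloitte's patented algorithm, whose details are not given in the interview. It swaps in a textbook technique, a rank-1 fit by alternating least squares, and all names and numbers in it are hypothetical.

```python
from typing import List, Optional

Matrix = List[List[Optional[float]]]


def complete_rank1(m: Matrix, iters: int = 100) -> List[List[float]]:
    """Fill None cells by fitting m[i][j] ~ u[i] * v[j] on observed cells.

    Assumes the underlying table is (close to) rank 1, which is a strong
    simplification chosen only to keep the sketch short.
    """
    rows, cols = len(m), len(m[0])
    u = [1.0] * rows
    v = [1.0] * cols
    for _ in range(iters):
        # Hold v fixed and solve a one-variable least-squares fit per row.
        for i in range(rows):
            obs = [j for j in range(cols) if m[i][j] is not None]
            den = sum(v[j] ** 2 for j in obs)
            if den > 0:
                u[i] = sum(m[i][j] * v[j] for j in obs) / den
        # Hold u fixed and do the same per column.
        for j in range(cols):
            obs = [i for i in range(rows) if m[i][j] is not None]
            den = sum(u[i] ** 2 for i in obs)
            if den > 0:
                v[j] = sum(m[i][j] * u[i] for i in obs) / den
    # Keep observed values as they are; predict only the missing ones.
    return [
        [m[i][j] if m[i][j] is not None else u[i] * v[j] for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical table with three hidden cells (None).
table: Matrix = [
    [2.0, 4.0, None],
    [4.0, None, 12.0],
    [None, 12.0, 18.0],
]
filled = complete_rank1(table)
for row in filled:
    print([round(x, 3) for x in row])
```

A production approach would use a higher rank, regularization and held-out validation to estimate accuracy. The point here is only that missing variables can be estimated from patterns in the observed ones.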

Is this something you leave behind after your engagement ends?

Deloitte has become a provider of products and software over the last four years.  That was my previous endeavor, creating the products and solutions portion of our company so I can talk fairly in depth about it.  We now offer Software-as-a-Service products and we leave behind, if you will, installed solutions.  We do both.  It’s just a matter of the problem we’re solving as to which one makes the most sense and which one is most economical. 

Where does the push for this type of analytics come from within an organization?

I would say the strongest push is from the business, not the boardroom.  We do lots of dashboards for executives, but typically you start with a business owner, and then after it’s working, the business owner is presenting it to the CEO and the board and it becomes more viral and usually works its way back down to the next business unit.

I was talking to the chief data officer of a financial firm and he told me that when they started some of their big data efforts they had to reconcile a bunch of differences in their core customer data. Is that typical of many organizations? 

Yeah.  Mastery of data seems to be a forever problem, to be honest.  The tools to master data have gotten better, but the rate of data being created is outpacing the capabilities of the tools.  It’s a very typical problem and it’s a critical path problem. It becomes central to almost every issue. 

Speaking of chief data officers, the title first cropped up in the financial sphere, but it seems to be showing up in more industries. Are you seeing new roles emerge as analytics gets higher priority?

Definitely.  In certain organizations the chief marketing officer is the chief data officer.  In other organizations, the chief digital officer is the one that owns data. One of the first things we try to understand is who owns it and at what level is the ownership, at what level of stewardship is true data ownership assigned.  We don’t necessarily encourage everybody to have a CDO.  What we encourage is proper ownership and governance of data so it can be prioritized.

Do most customers have that ownership ironed out?

Fifty-fifty.  I’d say in half the cases the companies are on a path, they’ll have a roadmap that says, we’re trying to improve our data security in these ways, we’re trying to improve our advanced analytical approaches in these ways, and they can talk about how they’re trying to improve their, let’s say, customer mastery and master data.

The other half don’t have roadmaps for all the domains, and in those cases we usually recommend pulling together a lot of these programs so you can leverage the efforts to create better business results for everything from supply chain to marketing, manufacturing, finance, etc.  Putting the programs together and lining them up usually gives much better bang for the buck.


Switching gears a bit, these new Internet of Things efforts create another big data problem, no?  What are you seeing?

We have a fairly large IoT practice now, and demand is climbing quickly. In terms of it being a data problem, usually we’re brought in because there is a question as to the strategy of a particular outcome, because Internet of Things programs typically are expensive and multi-year and we’ve seen a few of them run amok fairly quickly. In the last 3-5 years we’ve seen companies charge in and not get the payback they wanted. Now the technology is far cheaper and far better. From our perspective, we would say that it’s ready now, depending on the use case, and we’re seeing that in the demand and in implementations and return on investment.

Any other overarching things that I didn’t think to hit upon here that are important to address?

One trend we haven’t talked about is cognitive. How do you build intuitive systems that you can engage with, systems that start to think like we do and start to understand spoken words, images and pictures?

Google predicts that more than 50% of searches are going to come in as audio, images and video in the next three years.  Let’s say they’re wrong by a year.  Let’s say it’s four years.  That change is still immense and will permeate through business, permeate through our processes, permeate through apps.

The ability of our enterprise systems to interpret the spoken word and unstructured data and engage with us in those ways is quickly moving to the critical path.  We have a lot of projects that are running on that.  It is a huge area of investment in many industries.

Any in particular?

Number one is healthcare, followed by financial services. But all industries are going after it, including hospitality and leisure because of the consumer engagement aspects. The hospitality industry has been a huge user of consumer products because of the customer engagement opportunities. I would say the more active you are in terms of engagement the more these technologies are going to help you.

What’s an example in healthcare?

A great example, because we’ve employed a few of them, is using cognitive technology to build an actual case, taking electronic medical records, pharmacy script history, family history, health risk assessments, and compiling that for a doctor before a visit, and actually highlighting, “You need to look at this portion of the blood differential (which is basically a blood test) because the LDLs are out of range and the large cells are the ones at issue,” and start to actually analyze and give advice.

Then the doctor can respond back with a question:  “Can you give me a medication that you would recommend?” And it would come back with advice – “I would recommend this medication, and I wouldn’t use this one because it’s contraindicated because of a family history for X type of allergic reaction.”

The machine can give you that advice in a real-time dialog. The machine builds a cognitive chain and allows you to walk through various conversations and it learns how to follow a doctor along and figure out what they’re going to ask. The first time the doctor uses it, it’s not going to know to pull all the pharmacy scripts and do that recommendation. But when the doctor asks every time, it’s going to add it to the first thing it tells him and it will do that going forward. The system starts to get smarter without anyone writing software. You train it rather than build it. That trend is frankly a major turnover in terms of how we engage, but it’s also a turnover in how we build and how we think about systems and their usage.

Fascinating.  Sounds like you have a fun job.

Fun or consuming.  How about both?  I love it.

Copyright © 2017 IDG Communications, Inc.
