Microsoft’s Nuance deal might trigger a new IT spending wave

In Microsoft’s hands, Nuance’s Dragon speech-to-text technology plus AI could turn Cortana into an assistant that figures out what you’re up to and delivers data you need before you ask.

OK, help me understand this. Microsoft just spent almost $20 billion to buy Nuance, the company that supplies the popular Dragon speech-to-text tool. Microsoft already has speech-to-text available in Windows 10 and through Azure, and even a partnership with Nuance. Nuance’s single big jump in stock price in its history coincides with Covid and WFH, which is now (hopefully) passing. Nuance revenue boom? Apparently, ending. The Dragon product? Incremental to Microsoft’s current position. Health care vertical? Interesting, but not a cash cow.

Still, with all of this, Microsoft thinks Nuance is worth the $20 billion, which, by the way, is a third more than the rumored price and the most Microsoft has paid for anything other than LinkedIn. They’re either crazy, or there’s something else in play here. Let’s look at a much more interesting possibility: that this is really about Cortana, and what Cortana could become.

Cortana, Microsoft’s personal assistant, has died on all the platforms except Windows. It needs more than a mid-life kicker to compete with Siri or Alexa or even Google; it needs a rebirth. Microsoft has been talking about enhancing the Cortana experience for several years now, but it has never really suggested how that might be accomplished. Nuance might be the foundation, not because of Dragon as a whole, but because of what goes into it.

What sets Nuance apart is its ability to parse speech into text, taking into account sentence structure and meaning. It understands more of what’s said than Microsoft’s native tools, and understanding could be the next step for personal assistants. You’ve likely already noticed your assistant is getting...well...pushy. Instead of waiting for a question, it’s starting to butt in. Many people resent that enough to disable the assistant, but suppose its interjections were really useful? The problem isn’t just that an assertive assistant is spooky; it’s that the assistant isn’t useful enough to overcome the spookiness. Make it more useful, and its contributions could be welcome. What can add that utility is contextual services: services and information that fit into the context of our lives.

I worked with a hospital chain on next-gen technology, and one of its big problems was that the Grand Rounds at the center of the teaching experience were all videotaped, but finding anything in the recordings was almost impossible. Nuance could certainly decode enough medical-speak to do a good indexing job, to make finding the recorded discussion of specific topics easy, and to let a hospital get the most from its teaching experiences. That’s the low-apple application in health care that some believe justifies Microsoft’s acquisition, but I think that’s still not enough.

Instead, let’s suppose that during Grand Rounds, an intern asked whether a patient might have peritonitis, an often deadly infection in the abdomen. Suppose nobody picked up on it. Could Nuance, transcribing the whole event in real time, detect the reference to a medical condition the patient wasn’t being treated for, one that’s consistent with symptoms, and trigger an alert? Could Nuance check medications referenced against the chart? These are contextual services, because the information is delivered while everyone is still clustered around the patient.
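To make the idea concrete, here is a minimal Python sketch of that kind of check. It is an illustration only, not anything Nuance or Microsoft has described: the condition list, the chart format, and the function name are all hypothetical, and a real system would rely on a clinical vocabulary rather than simple word matching.

```python
import re

# Hypothetical vocabulary; a real system would draw on a clinical terminology service.
KNOWN_CONDITIONS = {"peritonitis", "sepsis", "appendicitis"}


def unlisted_conditions(utterance, chart_conditions):
    """Return known conditions mentioned in the utterance that are not on the chart."""
    # Split the transcribed utterance into lowercase words.
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    # Keep only the words that name a known condition.
    mentioned = words & KNOWN_CONDITIONS
    # Drop anything the patient is already being treated for.
    on_chart = {condition.lower() for condition in chart_conditions}
    return sorted(mentioned - on_chart)


# Assumed inputs: one transcribed sentence and the conditions listed on the chart.
alerts = unlisted_conditions("Could this patient have peritonitis?", ["Appendicitis"])
print(alerts)
```

Anything the function returns would be a candidate for an alert while everyone is still at the bedside. Checking whether the condition is also consistent with the recorded symptoms is the harder part, and this sketch leaves it out.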

Healthcare examples like this are important to understanding the value of the deal, given that Microsoft cites the industry’s opportunities explicitly in its announcement of the Nuance acquisition. They’re not the only examples, of course. Almost any professional whose business is based on multiple conversations could employ the same set of tools in a very similar way. These cases demonstrate that understanding speech is important not only in transcribing it, but also in giving IT a greater role in supporting what the conversations are about.

Here’s another example, this one from the transportation industry. A yard worker is trying to find a specific axle in a vast yard of rail junk. A phone conversation with the supervisor of the yard isn’t helping, but could Nuance, transcribing the call, recognize the term that identifies the axle and check records, or even trigger an AI search of images, to locate the missing part? Could be.

Customer support may be the iceberg whose tip my example describes. Just like a worker/supervisor conversation, customer support conversations involve gathering and presenting answers based on what a customer is asking. Having an application culling the conversation for important words and keeping a list of prioritized references and responses would not only make customer support easier and more responsive, it would make it more accurate. What industry doesn’t depend on customer support?
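A rough sketch of that culling step, in Python, might look like the following. The keyword table, priorities, and reference titles are invented for illustration, and simple word matching stands in for the much richer understanding of speech that the article is talking about.

```python
import re
from collections import Counter

# Hypothetical table mapping a keyword to (priority, reference); 1 is most urgent.
REFERENCES = {
    "refund": (1, "Refund policy"),
    "router": (2, "Router setup guide"),
    "password": (3, "Password reset steps"),
}


def prioritized_references(transcript_lines):
    """Return reference titles for the keywords heard so far, most urgent first."""
    counts = Counter()
    for line in transcript_lines:
        for word in re.findall(r"[a-z]+", line.lower()):
            if word in REFERENCES:
                counts[word] += 1
    # Order by priority, then by how often the keyword came up.
    ranked = sorted(counts, key=lambda w: (REFERENCES[w][0], -counts[w]))
    return [REFERENCES[w][1] for w in ranked]


# Assumed input: lines of a support call as they are transcribed.
call = ["My router keeps dropping the connection", "Honestly, I just want a refund"]
print(prioritized_references(call))
```

Calling the function again as each new line of the transcript arrives would keep the agent’s list of references current throughout the call.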

The beauty of contextual services is that all they lack today is the context. We have reams of corporate data, and for a decade or more, the answer to how to unlock it has been to get better analytics. Analytics means reports, but while reports may help guide practices, they’re hard to work into specific tasks. Reports and analytics are old-think; corporate IT organizations need to focus on getting that storehouse of data coupled directly to worker activity. To do that, we have to know what the worker is trying to do, and conversation analysis could be a big step in that direction.

Then, of course, there’s the consumer and the current applications of a personal digital assistant. The ability to understand speech could take those assistants beyond answering questions like “Who was the last shortstop to bat over .300?” It could make that personal digital assistant into a virtual member of every conversation, listening and analyzing and uncovering insights, and when one hits a threshold of importance, stepping in politely to help out.

If personal digital assistants evolve this way, it could have a profound impact on all of networking and IT. First, mobile devices become the paramount user conduit to information and entertainment because they’re always with us. Mobile security becomes critical, and the business case for 5G becomes a bit more credible, too. The shift to a mobile agent for everything also promotes a new vision of services, one where there’s a permanent cloud-resident piece of each of us, working to serve our interests. Imagine how much could be added to that!

Today we connect sites, which is the wrong approach if you’re trying to empower people who are moving freely. A future network to support a contextual future has to connect to people and has to be able to gather information and insights from many different sources to create a useful set of recommendations to those people. We started business connectivity talking about the company network that everyone and everything was on. Then we admitted that wasn’t the case and started thinking, with things like SD-WAN, about “getting back to” the company network. In the future, there may be no such thing as a company network, only people networks and resource access rights.

I think this is what Microsoft is planning for, in which case CIOs need to plan for it too. Start thinking about the way that stored information and legacy applications could be directly coupled to workers at the point of their activity. Start thinking about how to harness conversations, emails, location data, and other things to create that critical context. Plan now, and when somebody like Microsoft unveils a transformational component of context, you’ll be ready for it.

Copyright © 2021 IDG Communications, Inc.
