Machine learning: Are we there yet?

Machine learning can help us tackle difficult problems in networking and security


In my recent blog posts, I have written about automation that ties the network to other domains of IT, a capability that is available today and that you should start using.

Machine learning is another hot topic. While many machine learning applications in networking are still several years out, it has the potential to be one of those rare technologies that come along every few decades and fundamentally transform how networks run.


After all, leading companies such as Amazon, Apple, Facebook, Google and Baidu are already transforming their products and business processes with machine learning.

Hopefully, as the technology matures, most of its inner workings will sit deep inside the systems and clouds of your vendors. But I wouldn’t be surprised if your company soon begins looking to you to help support early machine learning applications or experiments.

Therefore, I thought it was timely to sit down for a Q&A with one of the industry’s foremost machine learning experts, David Meyer, a long-standing leader who has spent the past four years investigating how ML can help us tackle difficult problems in networking and security.

Q: What is machine learning?

In traditional programming, the programmer establishes the rules for generating the output. In machine learning, you provide data along with observed outcomes (as in supervised learning), and the role of the software is to learn the rules. The image below illustrates the contrast between the two approaches.

Machine learning software “learns” by seeking to discover the processes that generate the observed outcomes of particular inputs. The training set of input data and outcomes determines the prediction accuracy of the model. After this training phase, the trained software is ready to make inferences. Given a new piece of input data, the trained model can now infer the expected outcome. In some cases, such as online systems, the software continues its learning.

[Figure: Traditional programming vs. machine learning. Credit: Brocade]
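To make the contrast concrete, here is a minimal sketch in Python. It is an illustration only, not how any production system works: the congestion scenario, the utilization figures and every function name are assumptions made up for this example. The first function is a rule a programmer wrote by hand. The second part learns an equivalent rule from labeled examples (the training phase) and then applies it to new input (inference).

```python
# Traditional programming: a person writes the rule.
def is_congested_rule(utilization):
    return utilization > 0.80  # threshold chosen by the programmer


# Machine learning (supervised): the rule is learned from labeled examples.
def train_threshold(samples):
    """Return the utilization threshold that misclassifies the fewest samples.

    samples: list of (utilization, congested) pairs, where congested is a bool.
    """
    values = sorted({u for u, _ in samples})
    # Candidate thresholds sit midway between neighboring observed values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        # Count samples whose predicted label (u > t) differs from the true one.
        return sum((u > t) != label for u, label in samples)

    return min(candidates, key=errors)


# Hypothetical training set: observed inputs paired with observed outcomes.
training = [(0.20, False), (0.35, False), (0.50, False), (0.62, False),
            (0.71, True), (0.80, True), (0.93, True)]

threshold = train_threshold(training)  # training phase


def is_congested_learned(utilization):  # inference phase
    return utilization > threshold
```

With this made-up data, the learned threshold falls between 0.62 and 0.71, the gap that separates the two classes in the training set. A different training set would produce a different rule, which is the point: the data, not the programmer, sets the threshold.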

Q: Can you provide examples that work today?

Search engines are an example of machine learning software that we all use daily. Search engines improve their results through continuous online learning in their algorithms. This learning draws not only on past experience with clicks, but typically also on a large set of additional factors tied to the performance outcomes defined for the search engine. Ad revenue is one example of such an outcome.

Additionally, those who use Apple’s Siri or Amazon’s Alexa products may notice that personalized results improve over time as the device gains knowledge of a user’s specific preferences.

Q: Where does machine learning software run?

One option for running machine learning software is in the cloud. Google’s search engine, Apple’s Siri and Amazon’s Alexa are examples of this. Another option is to run a trained machine learning program in the traditional model—where the software no longer learns, but the user gets the advantage of all its prior learning through inference.

Optical character recognition to "read" written text and speech recognition to "listen" to spoken words are examples found in many traditional software packages today. Both approaches give users access to the intelligence without requiring, locally, the compute power that training machine learning software demands.

Q: With the mapping of inputs to outputs as a fundamental principle, how well does machine learning apply to networking?

An advantage of machine learning is that it can tackle real-world problems of high complexity. You may have read how machine learning systems are becoming increasingly accurate in diagnosing health problems from a set of symptoms. This is because we can train machines using data from extremely large groups of patients, such as those who have had strokes and those who have not, and instruct the software to learn how to tell them apart.

We have these same types of scenarios in networking, where using large amounts of observed inputs and outcomes means machine learning can help us anticipate what is happening. Machine learning will be able to analyze the structure of collected data to find patterns we didn’t know were there. This will address a far wider range of networking and security behaviors than those we are able to describe numerically in functional relationships.

Q: Where can we apply machine learning in networking?

We will apply machine learning to networks to understand the processes that generate the datasets we observe. That understanding will let us classify events, and predict and respond to varied and previously unknown types of network and security events.

Right now, the types of problems that machine learning can tackle in networks are limited by a lack of behavioral data from a large sample set of networks, especially large networks. In addition, we are limited by our ability to label the data and to describe the outcomes in a common way. Yet there are important problems we can tackle in the next two years, such as security and anomaly detection, component failure, congestion, and optimized network orchestration.

In the long term, machine learning could fundamentally change the algorithms by which we run networks, even an algorithm as foundational to the network as routing. Today, our routing rules are simple, numerical math based on limited factors for the link metrics. How much more efficient could our networks be if they could factor a much larger set of real-time states to determine the best ways to move traffic around? Over the next 10 years, machine learning could change almost everything about how we administer networks and how they run.
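To show what "simple, numerical math" on link metrics looks like, here is a minimal sketch of a shortest-path calculation over static link costs, the kind of computation that link-state protocols such as OSPF perform. The topology, node names and costs are invented for illustration, and real protocol implementations are far more involved.

```python
import heapq


def shortest_path_costs(graph, source):
    """Dijkstra's algorithm: lowest total link cost from source to each node.

    graph: dict mapping node -> {neighbor: link_cost}.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a cheaper path was already found
        for neighbor, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


# Hypothetical four-router topology with one static metric per link.
topology = {
    "A": {"B": 10, "C": 1},
    "B": {"A": 10, "D": 1},
    "C": {"A": 1, "D": 5},
    "D": {"B": 1, "C": 5},
}

costs = shortest_path_costs(topology, "A")
```

Here the best path from A to B runs through C and D at a total cost of 7, rather than over the direct link at cost 10. Note that the only input is a single number per link. A learned approach could, in principle, factor in many more real-time states than that one metric.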

The first use cases for machine learning will be in critical problems where we have observable outcomes. Advancing our existing analytic and prediction tools beyond simply processing in statistical or numerical ways to truly consider a wide range of observed network behaviors is a hot area of research for many vendors, including the following developments:

Security and anomaly detection

We have known examples of malicious network traffic on which to train machine learning systems so that they can alert network and security administrators of potential active risks. In the future, machine learning will not only aid in detection, but also in the prediction and remediation of threats. As more and more devices join the network through the Internet of Things and machine-to-machine connections, rapid-learning capabilities will be critical to a strong security stance.

As with all tools, it won’t only be those protecting networks who use these new systems; those who threaten the network will surely apply machine learning to find vulnerabilities. So security is indeed a critical area of the network in which to apply machine learning quickly.
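As a rough illustration of training on known examples of malicious traffic, here is a minimal sketch of a nearest-centroid classifier. Everything specific in it is an assumption made for this example: the two flow features (fraction of SYN packets and fraction of distinct destination ports), the sample values and the labels. Real detection systems use far richer features and models.

```python
import math


def centroid(points):
    """Average each feature across a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def train(labeled_flows):
    """Learn one centroid per label from (features, label) pairs."""
    by_label = {}
    for features, label in labeled_flows:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(points) for label, points in by_label.items()}


def classify(model, features):
    """Return the label whose centroid is closest to the given features."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical labeled flows: (syn_fraction, distinct_port_fraction), label.
known_flows = [
    ((0.05, 0.02), "benign"),
    ((0.10, 0.05), "benign"),
    ((0.08, 0.03), "benign"),
    ((0.90, 0.80), "malicious"),
    ((0.85, 0.95), "malicious"),
    ((0.95, 0.70), "malicious"),
]

model = train(known_flows)
verdict = classify(model, (0.88, 0.75))  # a new, unlabeled flow
```

The new flow resembles the scan-like examples, so it is classified as malicious and could be surfaced to an administrator as a potential active risk.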

Prediction and remediation

Machine learning systems are now learning to analyze network behavior and predict the network protocol anomalies that signal nascent problems, such as congestion. They can also learn to recognize other issues, such as component failures or performance slowdowns, at their earliest stages, so that operators can remediate them proactively and avoid user-impacting events.
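One simple way to picture catching a slowdown early is a learned baseline with a deviation test. The sketch below is a basic statistical stand-in, not a full machine learning system, and the latency figures, function names and three-sigma cutoff are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def fit_baseline(history):
    """Learn normal behavior as the mean and standard deviation of past samples."""
    return mean(history), stdev(history)


def is_anomalous(sample, baseline, k=3.0):
    """Flag a sample that lies more than k standard deviations from the mean."""
    mu, sigma = baseline
    return abs(sample - mu) > k * sigma


# Hypothetical round-trip latency history for one link, in milliseconds.
latency_history = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.0, 9.7]

baseline = fit_baseline(latency_history)

early_warning = is_anomalous(14.0, baseline)   # well outside normal variation
normal_reading = is_anomalous(10.4, baseline)  # within normal variation
```

A reading of 14.0 ms is flagged, while 10.4 ms is not. A machine learning system would go further by learning which combinations of signals tend to precede congestion or failure, rather than watching one metric against a fixed cutoff.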

Network orchestration and control

The complex dynamics of network orchestration and control are another area where we are likely to see applications of machine learning systems. Machine learning will enable these systems to adapt to evolving environments, optimizing the resources made available in virtualized networks, as well as the configuration and management of the network.

Automation tools

The automation of networks is a rich area to apply machine learning because of the vast amounts of data created and managed. For example, last year KDDI Laboratories announced the world’s first proof of concept (PoC) for an AI-assisted automated network operation system. KDDI was able to show a system capable of learning and predicting when hardware or software anomalies would lead to catastrophic network failures.

Additionally, the PoC system could initiate recovery plans through an integrated management system. Others in the space are exploring how machine learning can be used to link network automation tools to automated processes beyond networking, such as those for DevOps.

Q: What do you recommend network architects do to get ready?

If you are responsible for architecting a network of any size, keep following the machine learning space. I recommend you start with a Coursera course by Andrew Ng, associate professor at Stanford University. From there, a Google search on machine learning and networking will yield additional resources, including my recent podcast.

You will want to know enough to assess the potential value of vendors' machine learning tools for your particular network and the considerations in deploying them. Once you have the foundation, you may see specific challenges in your network that are worth experimenting on by developing a machine learning system to see how it can help you.

Q: What do you recommend network and security administrators do to get ready?

Get more comfortable with basic tools and simple programming for automation, such as those used in NetOps. This foundation will not only help you operate your network today, but also prepare you to benefit from early machine learning capabilities that can help you protect your network and operate it more efficiently. You may wish to start using machine learning tools available today for your personal life, such as Apple’s Siri or Amazon's Alexa, to experience how they learn over time about preferences and needs.

Copyright © 2017 IDG Communications, Inc.
