This column is available in a weekly newsletter called IT Best Practices.
The number of new malware variations that pop up each day runs somewhere between 390,000 (according to AV-TEST Institute) and one million (according to Symantec Corporation). These are new strains of malware that have not been seen in the wild before.
Even if we consider just the low-end figure, the situation is dire, especially when it comes to advanced persistent threats (APTs). These are the most sophisticated mutations of viruses and malware, and they are very effective at going completely undetected by many of the cybersecurity technologies in use today. Even security experts tell organizations to be prepared for "when" and not "if" an attack is successful.
Over the years we have seen the evolution of different types of detection technologies. It started out with signatures—a technique that compares an unidentified piece of code to known malware. With a million new pieces of malware hitting the Internet each day, it's clear this approach is obsolete.
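To see why signatures struggle against new strains, consider a minimal sketch of the idea. It assumes a signature is simply a SHA-256 hash of a known sample; real products use richer signature formats, and the byte strings and the `KNOWN_BAD` set here are made up for illustration.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of known-malicious samples.
KNOWN_BAD = {hashlib.sha256(b"EVIL_PAYLOAD_v1").hexdigest()}

def is_known_malware(data: bytes) -> bool:
    """Return True only if the exact bytes match a stored signature."""
    return hashlib.sha256(data).hexdigest() in KNOWN_BAD

original = b"EVIL_PAYLOAD_v1"
variant = b"EVIL_PAYLOAD_v2"  # a trivially mutated strain (one byte changed)

print(is_known_malware(original))  # True
print(is_known_malware(variant))   # False
```

A one-byte change is enough to slip past the lookup, which is the weakness the flood of daily variants exploits.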
The next evolutionary step was heuristics, which attempts to identify malware based on behavioral characteristics in the code. This evolved into looking at the behavioral characteristics of what the code does once it is executed. This led to sandboxing, in which unknown code is run in a virtual environment to observe if it is malicious or not.
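A static heuristic of the kind described above can be sketched as a weighted checklist. The markers, weights and threshold below are assumptions picked purely for illustration, not rules taken from any particular product.

```python
# Hypothetical indicators and weights, chosen only for illustration.
SUSPICIOUS_MARKERS = {
    b"CreateRemoteThread": 3,  # Windows API often associated with code injection
    b"VirtualAllocEx": 2,      # allocates memory in another process
    b"powershell -enc": 3,     # encoded PowerShell command line
}
THRESHOLD = 4  # assumed cut-off for flagging a file

def heuristic_score(data: bytes) -> int:
    """Sum the weights of every suspicious marker found in the file's bytes."""
    return sum(weight for marker, weight in SUSPICIOUS_MARKERS.items() if marker in data)

def looks_malicious(data: bytes) -> bool:
    """Flag the file when its total score reaches the threshold."""
    return heuristic_score(data) >= THRESHOLD

sample = b"...VirtualAllocEx...CreateRemoteThread..."
print(heuristic_score(sample), looks_malicious(sample))  # 5 True
print(looks_malicious(b"Quarterly sales report"))        # False
```

Unlike an exact signature, this flags code it has never seen, but only as long as the code happens to trip the rules someone wrote in advance.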
Most recently we have seen the emergence of machine learning to detect malware coming into networks. This technology uses sophisticated algorithms to classify the behavior of a file as malicious or benign according to a series of file features that are manually extracted from the file itself. The machine needs to be told by humans what parameters, variables or features to look at in order to make the decision. Often machine learning cybersecurity solutions are used to identify a suspicious situation, but the final decision as to what to do about it is left to a human analyst.
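The dependence on human-chosen features can be illustrated with a toy classifier. In this sketch the two features (byte entropy and the share of printable characters), the nearest-centroid classifier and the tiny synthetic training set are all assumptions made for the example; high entropy is not a reliable sign of malware in practice.

```python
import math
from collections import Counter

def extract_features(data: bytes):
    """Hand-picked features: Shannon entropy (bits per byte) and printable-ASCII ratio."""
    if not data:
        return (0.0, 0.0)
    n = len(data)
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(data).values())
    printable = sum(1 for b in data if 32 <= b < 127) / n
    return (entropy, printable)

def train(samples):
    """samples: list of (bytes, label). Returns a dict mapping label -> feature centroid."""
    by_label = {}
    for data, label in samples:
        by_label.setdefault(label, []).append(extract_features(data))
    return {
        label: tuple(sum(col) / len(vectors) for col in zip(*vectors))
        for label, vectors in by_label.items()
    }

def classify(model, data: bytes) -> str:
    """Pick the label whose centroid is closest to the file's feature vector."""
    feats = extract_features(data)
    return min(model, key=lambda label: math.dist(feats, model[label]))

# Synthetic stand-ins: plain text as "benign", uniformly random-looking bytes as "malicious".
training = [
    (b"hello world, this is a plain text document", "benign"),
    (b"meeting notes: budget review on friday", "benign"),
    (bytes(range(256)), "malicious"),
    (bytes((i * 7) % 256 for i in range(512)), "malicious"),
]
model = train(training)

print(classify(model, b"quarterly report for the sales team"))  # benign
print(classify(model, bytes(range(255, -1, -1)) * 2))           # malicious
```

The classifier can only ever be as good as `extract_features`: anything the humans did not think to measure is invisible to it.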
Now the next evolutionary step has come to market. Deep Instinct claims it has the first cybersecurity solution on the market today based on deep learning. Deep learning is an advanced form of artificial intelligence that uses a process close to the way human brains learn to recognize things. Deep learning could have a big impact on cybersecurity, especially in detecting zero-day malware, new malware and very sophisticated APTs.
Once a machine learns what malicious code looks like, it can identify unknown code as malicious or benign with extremely high accuracy and in real-time. Then a policy can be applied to delete or quarantine the file or perform some other specified action.
So, how does a machine learn to identify malware? It's similar to the way people learn. Suppose you take a child to the park and show him a dog and say "this is a dog." You show him various dogs and this supervised training process helps the child learn. You don't explain why this is a dog; you just tell him it is a dog. At some point the child will recognize an animal that he has never seen before as a dog, and he will do this in real-time and with the highest level of confidence. You can show him a photo of a dog and he will recognize it is a dog. You can remove 20% or more of the pixels from the picture and he will still instantly recognize it as a dog.
Deep Instinct uses this process to help its core engine learn to recognize malicious code. The company collects hundreds of millions of files of every variety—Word files, PDFs, executables, etc. The type of file is irrelevant because deep learning is agnostic to the type of data. Deep Instinct scientists run tests on these files to classify them as either malicious or legitimate. Then they feed this massive data set to their engine, or artificial brain, as training. The end result is a prediction model that the company calls the instinct. This instinct is exactly like a small child viewing a dog that he has never seen before and being able to say with assurance and in real-time that it is a dog.
Deep Instinct packages its prediction model (the instinct) into a small-footprint agent. This agent can be put on any type of device (PC, laptop, tablet, smartphone, server) running any operating system. When a file is opened or downloaded, the agent breaks the file into its smallest pieces and runs the pieces through the prediction model. The instinct then uses its training to determine whether the file is malware. This all happens in about five milliseconds. Because everything happens on the device in real-time, the enterprise can delete the malware, block it, or take whatever other action it chooses before the malware does its damage. What's more, there is no impact on the user experience.
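The flow the agent follows (split the file, score the pieces, apply a policy) can be sketched in a few lines. Nothing here reflects Deep Instinct's actual code: the chunk size, the `stub_model` scoring function, the threshold and the `Action` names are all hypothetical placeholders standing in for a trained model and an enterprise policy.

```python
from enum import Enum
from typing import Callable, Iterator

class Action(Enum):
    ALLOW = "allow"
    QUARANTINE = "quarantine"
    DELETE = "delete"

def split_into_chunks(data: bytes, size: int = 16) -> Iterator[bytes]:
    """Break the file into fixed-size pieces (the chunk size is an assumption)."""
    for i in range(0, len(data), size):
        yield data[i:i + size]

def stub_model(chunk: bytes) -> float:
    """Placeholder for a trained model: returns a made-up score in [0, 1]."""
    return 1.0 if b"\xde\xad\xbe\xef" in chunk else 0.0

def scan(
    data: bytes,
    model: Callable[[bytes], float] = stub_model,
    threshold: float = 0.5,
    on_detect: Action = Action.QUARANTINE,
) -> Action:
    """Score every piece, keep the worst score, and apply the configured policy."""
    score = max((model(chunk) for chunk in split_into_chunks(data)), default=0.0)
    return on_detect if score >= threshold else Action.ALLOW

print(scan(b"\xde\xad\xbe\xef" + b"payload"))                # Action.QUARANTINE
print(scan(b"plain text"))                                   # Action.ALLOW
print(scan(b"\xde\xad\xbe\xef", on_detect=Action.DELETE))    # Action.DELETE
```

Because the model travels with the agent, the whole decision is a local function call, which is what makes a pre-execution verdict in milliseconds plausible.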
Because the agent has everything it needs to conduct the analysis of unknown files, it is independent of the enterprise network or even the Internet. That means a device is protected whether it is online or offline. For example, a worker can be sitting on an airplane with his device in airplane mode. If he inserts an infected USB stick, the agent on the device will analyze the files on the stick in pre-execution mode and find the malware before it can infect his device.
Deep Instinct also has an agentless version of its solution, which leverages the prediction model and protection capabilities without running on the device itself. The company says it can be connected to any type of gateway via APIs or SDKs. For example, this mode of Deep Instinct is integrated with FireLayer's cloud access security broker to do malware detection and prevention for cloud-based files and applications.
Deep Instinct continuously trains its engine, the artificial brain, so it is able to recognize new malware. This improves the prediction model and gives a higher level of confidence in identifying malicious files. Despite these continual updates to the instinct, the agents on devices can go months without updates and still be highly accurate. Deep Instinct says an agent that is not updated for four months degrades by only 0.5% to 1% in its ability to detect malware.
Tests conducted by Drebin University and Siemens CERT benchmarked Deep Instinct against top defense solutions on the market. In attempting to recognize mobile malware, the top 10 security vendors had an average score of 61.5% accuracy. Deep Instinct's solution was 99.86% accurate. In another test on a dataset of 16,000 APTs, Deep Instinct recognized the malware 98.8% of the time.
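To put those percentages in concrete terms, the short calculation below works only from the figures quoted above. It treats "error rate" as one minus the reported accuracy, which is an assumption about how the scores were measured.

```python
# APT test: 16,000 samples, 98.8% recognized.
apt_samples = 16_000
apt_detection_rate = 0.988
detected = round(apt_samples * apt_detection_rate)
missed = apt_samples - detected
print(detected, missed)  # 15808 192

# Mobile malware test: average of the top 10 vendors vs. Deep Instinct.
vendor_avg_accuracy = 0.615
deep_instinct_accuracy = 0.9986
vendor_error = 1 - vendor_avg_accuracy        # about 38.5% wrong
deep_instinct_error = 1 - deep_instinct_accuracy  # about 0.14% wrong
print(round(vendor_error / deep_instinct_error))  # 275
```

In other words, a 98.8% hit rate still leaves 192 of the 16,000 APT samples undetected, and the reported gap in the mobile test amounts to roughly 275 times fewer errors.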
Implementation involves installing the agent on devices and an appliance on your network for policy administration, a dashboard, and reporting. The company says it will do proofs of concept for potential customers using dataset files so you can compare this solution to your existing cybersecurity tools.