
The WAF backed by artificial intelligence (AI)

Oct 02, 2018 | 10 mins
Artificial Intelligence, Machine Learning, Network Security

AI and machine learning solutions are beginning to show major successes against DDoS attacks, and more specifically against application-layer DDoS.


The web application firewall (WAF) issue didn’t seem like a big deal to me until I started to dig deeper into the ongoing discussion in this field. Vendors generally seem to be trying to convince customers, and themselves, that everything is going smoothly and that there is no problem.

In reality, however, customers don’t buy it anymore, and the WAF industry is under major pressure because it keeps falling short of the quality customers expect.

Red flags have also been raised over the use of runtime application self-protection (RASP) technology. The trend now is to move the mitigation and defense side into the application and compile it with the code. RASP is regarded as a shortcut to securing software, and one that is compounded by performance problems. It seems to be a desperate attempt to replace the WAF, as no one really likes to mix a “security appliance” into the application code, which is exactly what the RASP vendors are currently offering their customers. Even so, some vendors are adopting RASP technology.

Generally speaking, WAF customers are deeply disappointed by the lack of automation, scalability, and coverage of emerging threats, all of which become essential as modern botnets grow more efficient and aggressive. These botnets now add artificial intelligence (AI) functionality on top of the “old” Internet of Things (IoT) botnets, which are becoming increasingly multi-purpose in their ability to attack with different vectors. The functionality that the classic WAF offers has become a source of discontent, while next-generation WAFs, born as AI systems that may address such multi-dimensional threat complexity, are quite rare.

There are not many artificial intelligence/machine learning (AI/ML) solutions in the network and application defense segment of cyberdefense. However, more AI and ML solutions are beginning to show major success against distributed denial-of-service (DDoS) attacks, and more specifically against application DDoS, as shown by L7 Defense with its unsupervised learning approach. Such technology may also play a crucial role in WAF solutions, since they defend against the same multi-purpose botnets.

We are beginning to see movement in the use of ML for the WAF in the cloud. This is evident from the fact that this year Oracle purchased Zenedge, a provider of cloud-based, ML-driven cybersecurity solutions. Zenedge (now Dyn since Oracle’s purchase of it) offers a WAF that shows signs of the automation needed by Oracle’s cloud offering. However, it is not enough to make a huge difference from traditional WAF functionality, as it lacks a significant technological advance that would cover the essential spectrum of threats much better than existing technologies.

AI and ML are the tools used for predictive analytics. Undoubtedly, they are a must for the future and survival of cloud-based WAF environments.

Issues with the classic WAF and the cloud

The classic WAF has scalability issues. We can play around with server load balancing and elastic services, but scale was not something initially built into the WAF. Although you may spin up more instances, the fact is that it was not designed for elasticity. That means classic WAFs were not built for cloud architecture. They were later remodeled for it, but scaling and automation remained an issue. This seems to have been accepted not because it’s good, but because there is no alternative.

Moreover, WAFs were never flexible against dynamic threats from advanced botnets, as they were mostly built to protect against simple attacks such as SQL injection. They are definitely not good at protecting against attacks such as credential stuffing.

Both aspects of the problem demand manpower to customize and maintain the WAF in any environment. However, what was good enough for the on-premises environment cannot simply be carried over into the cloud’s fully automated environment.

WAF and application layer (DDoS) attacks

Application DDoS is a major special case for the WAF. Not so long ago, it was split off into a complementary solution alongside the WAF as the problem became more serious and demanding, while DDoS had always been viewed as an operational problem rather than an application problem. An application-layer DDoS attack is one where attackers target layer 7 of the Open Systems Interconnection (OSI) model.

They hunt for specific website functions and features with the intention of disabling and disrupting them. The attacks may come at a low traffic rate, usually less than 1 Gbps, blended with and mostly indistinguishable from normal traffic. They are therefore highly challenging to detect with traditional network defense tools such as WAFs, which usually operate on known signatures or by detecting bold behavior patterns, neither of which is relevant to this category of attacks.

As with application-layer DDoS, we face the same multi-purpose, AI-empowered IoT botnets, but in this case they use specific, damaging applicative vectors together with the same camouflage techniques for misleading the defender that they have been using in the DDoS world. As a result, the next logical step is for the WAF to be connected to the DDoS problem.

Now, if you don’t use AI with the WAF, remember the saying: ‘Fail to prepare, prepare to fail’. We started with static requests, then moved to dynamic requests. We migrated from scripts with loops to automated AI-based attacks. The automatic spreading of malware was a major turning point, and now we are beginning to see fully automated DDoS attacks.

There is evidence that these botnets, combined with this evolution, can be turned against the WAF. Now, however, instead of trying to take your system offline with a DDoS, bad actors are attempting to take data out of the system or damage your data in some way.

There are still two goals, which have now come together as complementary. First, operationally, your application must withstand a DDoS and remain up and running. Second, no bad actor should infiltrate the application, abuse your customers, or damage the data.

AI enhanced IoT botnet attacks

Using the same principle described in my Network World article last year, “The rise of artificial intelligence DDoS attacks”, the attack vectors themselves might be classic ones such as SQL injection or more recent ones such as credential stuffing. Standard attack tools such as w3af and Grabber can be used to perform complex, multi-vector attacks of this kind, in which you are looking to attack specific functionality within the web domain. Yet you have ‘zero knowledge’ about the defender and which way it is going to hit.

The AI-enhanced attack mechanism is there for efficiency. An attack applying specific vectors, such as credential stuffing to abuse customer accounts, will send multiple requests with usernames and passwords in the hope that some of them will succeed. It relies on the belief that people use the same passwords in multiple places.
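To make the credential stuffing pattern concrete, here is a minimal Python sketch of one naive way to spot it: count how many distinct usernames a single source fails to log in with inside a sliding time window. The class name, the window length and the limit are all assumptions chosen for illustration, not any vendor's method.

```python
from collections import defaultdict, deque


class CredentialStuffingDetector:
    """Flags a source that fails logins for many distinct usernames in a short window.

    Hypothetical helper for illustration; the thresholds are arbitrary assumptions.
    """

    def __init__(self, window_seconds=60.0, max_distinct_users=5):
        self.window_seconds = window_seconds
        self.max_distinct_users = max_distinct_users
        # source -> deque of (timestamp, username) for failed attempts
        self._failures = defaultdict(deque)

    def record_failure(self, source, username, timestamp):
        """Record a failed login; return True if the source now looks suspicious."""
        events = self._failures[source]
        events.append((timestamp, username))
        # Drop events that have fallen out of the sliding window.
        while events and timestamp - events[0][0] > self.window_seconds:
            events.popleft()
        distinct_users = {user for _, user in events}
        return len(distinct_users) > self.max_distinct_users


detector = CredentialStuffingDetector(window_seconds=60.0, max_distinct_users=3)
# One source cycling through different usernames, one attempt per second.
flags = [detector.record_failure("203.0.113.7", f"user{i}", float(i)) for i in range(5)]
```

Note that this sketch keys on the source address. A botnet that spreads its attempts across many sources would stay under the limit for each one, which is exactly the weakness of static, per-source rules discussed in this article.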

With AI, the attacker can manage the battlefield while changing tactics in response to the defender, which is even more damaging. On the attacking side, AI can optimize the attack toward a specific target and do so automatically, without human intervention. Therefore, the same advances in IoT botnets, principles, and C&C servers can be used in other applicative attacks that should be addressed by the WAF.

Something is coming at you dynamically, with the same intensity seen with the AI DDoS C&C server. It therefore has the potential to slide under any radar or threshold. AI can optimize on the fly in order to go unnoticed by any fixed pattern the defense is using. Here, a static defense against a flexible attack will not work.
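A small worked example may help show how an attack slides under a fixed threshold. The Python sketch below assumes a hypothetical per-source rate limit and a hypothetical botnet in which every bot stays well below it. All the numbers are invented for illustration.

```python
def blocked_sources(requests_per_source, per_source_limit):
    """Return the sources a static per-source rate limit would block."""
    return {src for src, count in requests_per_source.items() if count > per_source_limit}


# Hypothetical static rule: block any source above 100 requests per minute.
PER_SOURCE_LIMIT = 100

# Hypothetical botnet: 500 bots, each sending only 40 requests per minute.
botnet = {f"bot-{i}": 40 for i in range(500)}

blocked = blocked_sources(botnet, PER_SOURCE_LIMIT)
aggregate_rate = sum(botnet.values())

# No bot is blocked, yet the application still receives 20000 requests per minute.
print(len(blocked), aggregate_rate)
```

An attacker that adjusts the per-bot rate on the fly only has to find a value below whatever fixed limit the defense uses.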

Defense side

On the defense side, traditional methods use a dictionary or database of known vulnerabilities. However, this no longer works because the vectors can now be randomized. They can come from multiple sources with multiple patterns.

The vectors won’t be detected, and the defense side will not be dynamic enough to mitigate efficiently. If you are looking for specific patterns, you will fail with a hard landing. 
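The weakness of looking for specific patterns can be shown with a few lines of Python. The sketch below uses a single, deliberately simplistic signature that I made up for illustration. It is not taken from any real WAF rule set. The obfuscated payload relies on the well-known trick of replacing whitespace with an inline SQL comment.

```python
import re
from urllib.parse import unquote

# Hypothetical static signature of the kind a dictionary-based filter might hold.
STATIC_SIGNATURE = re.compile(r"union\s+select", re.IGNORECASE)


def matches_static_signature(raw_query: str) -> bool:
    """Return True if the URL-decoded query matches the fixed pattern."""
    return bool(STATIC_SIGNATURE.search(unquote(raw_query)))


plain = "id=1 UNION SELECT password FROM users"
# Same intent, but the whitespace is replaced by an inline SQL comment.
obfuscated = "id=1 UNION/**/SELECT password FROM users"

caught_plain = matches_static_signature(plain)
caught_obfuscated = matches_static_signature(obfuscated)
```

Here `caught_plain` is true while `caught_obfuscated` is false. You can patch the pattern for this one variant, but an attacker that randomizes its vectors will keep producing new ones faster than the dictionary can be updated.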

How can this be resolved?

I imagine multiple AI machines working at the resolution of the individual web page and API, which I will call WAF-AI. This way, you have a system that guards each of your web pages and APIs automatically and independently, with each AI machine guarding a specific web page or API.

Without high-resolution machine learning of the applicative baseline, it is not possible to defend efficiently against multi-vector attacks, human-mimicking attacks, or attacks that simply minimize the request rate.
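To give a feel for what a baseline at the resolution of the web page could look like, here is a minimal Python sketch. It keeps a separate running mean and variance of the request rate for every endpoint and flags values that deviate strongly from that endpoint's own history. This is a simple statistical stand-in that I am assuming for illustration. It is not the unsupervised learning algorithm of any vendor mentioned here, and the class names, threshold and traffic numbers are all hypothetical.

```python
import math
from collections import defaultdict


class EndpointBaseline:
    """Running mean and variance of one metric (Welford's algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the mean

    def update(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    def zscore(self, value):
        if self.count < 2:
            return 0.0
        std = math.sqrt(self._m2 / (self.count - 1))
        if std == 0.0:
            return 0.0 if value == self.mean else math.inf
        return (value - self.mean) / std


class PerEndpointMonitor:
    """Keeps one independent baseline per web page or API endpoint."""

    def __init__(self, threshold=3.0, min_samples=10):
        self.threshold = threshold
        self.min_samples = min_samples
        self._baselines = defaultdict(EndpointBaseline)

    def observe(self, endpoint, requests_per_minute):
        """Return True if the rate is anomalous for this endpoint's own baseline."""
        baseline = self._baselines[endpoint]
        anomalous = (
            baseline.count >= self.min_samples
            and abs(baseline.zscore(requests_per_minute)) > self.threshold
        )
        if not anomalous:
            # Learn only from traffic that looks normal.
            baseline.update(requests_per_minute)
        return anomalous


monitor = PerEndpointMonitor(threshold=3.0, min_samples=10)
for rate in [20, 22, 19, 21, 20, 23, 18, 21, 20, 22]:
    monitor.observe("/login", rate)        # quiet page
    monitor.observe("/search", rate * 50)  # busy page

login_spike = monitor.observe("/login", 60)       # anomalous for /login
search_normal = monitor.observe("/search", 1060)  # ordinary for /search
```

In this example, 60 requests per minute is flagged on the login page while 1,060 requests per minute passes as normal on the search page. A single global threshold could not get both decisions right, which is the argument for working at this resolution.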

The WAF should be able to combat a variety of multi-vector attacks, such as SQL injection, remote command execution, remote file inclusion, local file inclusion, PHP injection, LDAP injection, Memcache injection and cross-site scripting (XSS), all at once. We need the expertise to identify these types of attacks and categorize them with the utmost accuracy. The critical part is to do so from the ‘very first request’.

False positives and false negatives should be kept close to zero at the web page level. If you know what you are looking for and you are accurate, you won’t miss a beat. Remember, you are in a war zone here. DDoS is only a part of what we will see in the WAF. The WAF is the core product for many vendors, while DDoS is core to others. In the past, DDoS has been treated as something separate from the WAF.

The fact is, however, that the WAF is well connected to DDoS. They are part of the same issue. With this thinking, we can eventually make a difference.

The right way forward

We need to find a solution that uses the same AI concepts as applicative DDoS defense. On top of that, we need to add specific categorizations, along with the ability to identify all types of attacks algorithmically and dynamically, on the fly.

First, you need to identify the type of attack, for example an attack on the login page, and then take preventive measures to stop it. The WAF must also be given the capability to determine precisely whether some kind of traffic aggregation is coming toward your web interface.

What’s needed is the ability to identify what is coming in through specific fields to affect specific web pages. This is the highest possible resolution. If you work at this resolution, you can control anything that is coming to the application. This is where you actually need to be in order to stay guarded.

Companies like L7 Defense are applying the same unsupervised learning algorithm they have used with great success for the applicative DDoS challenge, with the ability to identify any WAF-related attack from the first request. They protect against threats ranging from classic ones on web systems (OWASP 10) to more sophisticated automated ones (OWASP 20), as well as attacks on APIs. From their demo, the system seems to capture very complex attack scenarios, with multiple zero-day patterns used by the attackers, while keeping the error level very close to zero for both false positives and false negatives.

From its online demo, it seems that F5 is making some progress, too. After extensive research, I have found that it most likely doesn’t support multi-vector capabilities or smart, flexible patterns. From the online demo of its DDoS Hybrid Defender, it appears to be classic behavioral analysis based on manually set or global adaptive thresholds. F5 doesn’t claim any machine learning capabilities, however, and traditionally behavioral analysis hasn’t worked for the WAF.


Matt Conran has more than 19 years of experience in the networking industry with entrepreneurial start-ups, government organizations and others. He is a lead architect and has successfully delivered major global greenfield service provider and data center networks. His core skill set includes advanced data center, service provider, security and virtualization technologies. He loves to travel and has a passion for landscape photography.

The opinions expressed in this blog are those of Matt Conran and do not necessarily represent those of IDG Communications, Inc., its parent, subsidiary or affiliated companies.