This column is available in a weekly newsletter called IT Best Practices.
Who's that coming to your website? Is it friend or foe? Is it a customer wanting to buy your products, or someone or something wanting to steal your web content? Is it a community member who wants to post a relevant comment, or a spammer intent on planting junk links and content in your open comments section? Is it a real person clicking on an ad, or a web bot driving up fraudulent clicks?
Web applications are increasingly being subjected to automated threats such as click fraud, comment spam, content scraping, abusive account creation, and more. These and other illicit or unwanted activities are described in detail in the OWASP Automated Threat Handbook for Web Applications.
This article is about one vendor's approach to defeating unwanted web traffic, whether it's automated or human-driven. I should point out that there are desirable and highly useful web bots too, such as the web crawlers that search engines use to find and index content, and chat bots that fetch information and bring it into chat rooms where humans meet. Any solution designed to defeat malicious bots has to allow the good ones through.
Using this basic interrogation approach, Distil says it can catch the proxies that hackers use to automate malicious or unwanted requests. By weeding out the requests that are lying about their true identity, Distil claims it can eliminate 70% to 80% of the threats at the outset. That's the easy part.
Then there are bots so advanced that they automate an actual browser. In essence, they are a man in the browser. Distil uses machine learning to distinguish these bots from real visitors.
Fundamentally, a bot's browsing pattern looks different from legitimate traffic. Distil profiles dozens of metrics to reveal anomalies. For example, what time of day are they coming in? Where did they come from? What was the previous site they visited? What was their entry point to your site? How did they navigate through your site? What pages did they go through? By profiling these and other bits of information, Distil has learned that bots end up being either really random or really systematic. These patterns help Distil identify traffic that is not real.
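That "really random or really systematic" observation can be sketched with a toy heuristic. The function below is purely illustrative, not Distil's actual model; it scores a session by the variability of its inter-request timing, on the assumption that scripted bots are metronomically regular while humans fall in a moderate middle band. The thresholds are invented for the example.

```python
from statistics import mean, pstdev

def score_session(request_gaps_s):
    """Classify a session from the spread of its inter-request gaps.

    Hypothetical heuristic: near-zero spread suggests a scripted,
    systematic bot; an extremely large spread relative to the mean
    suggests randomized bot behavior. Thresholds are illustrative.
    """
    avg = mean(request_gaps_s)
    spread = pstdev(request_gaps_s)
    cv = spread / avg if avg else 0.0   # coefficient of variation
    if cv < 0.05 or cv > 2.0:           # too regular or too erratic
        return "bot-like"
    return "human-like"

# A scripted bot fetching a page exactly every 2 seconds:
print(score_session([2.0, 2.0, 2.0, 2.0]))        # bot-like
# A human reading pages for varying lengths of time:
print(score_session([4.1, 12.8, 6.3, 9.9, 5.2]))  # human-like
```

A real system would combine many such features (entry point, referrer, navigation path, time of day) rather than timing alone, which is why machine learning over dozens of metrics is needed.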
Most security solutions today use an access control list (ACL) or other blocking mechanism based on a single IP address. When a bad actor is discovered, the solution blocks that IP address. But advanced persistent bots (APBs) often use multiple IP addresses, so blocking one of them does little good when the bad guy simply switches to a different IP.
In contrast, Distil blocks bad actors based on a digital fingerprint, which comprises a browser and the machine it is running on. Even if the bad actor shifts to a new IP, if the request has the same fingerprint, Distil can identify it as bad too. This increases the burden of obfuscation for the bad guys. Once Distil identifies a fingerprint as belonging to a malicious actor, that knowledge is shared across Distil's customer network to block it for all customers.
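The difference between IP blocking and fingerprint blocking can be sketched in a few lines. This is a minimal illustration, not Distil's implementation; the attributes hashed here (user agent, language, screen size, timezone) are stand-ins, and a real fingerprint draws on far more signals.

```python
import hashlib

def fingerprint(headers, machine_traits):
    """Hash stable browser + machine attributes into one identifier.

    Attribute names are illustrative stand-ins for the many signals
    a real fingerprinting system would collect.
    """
    material = "|".join([
        headers.get("User-Agent", ""),
        headers.get("Accept-Language", ""),
        machine_traits.get("screen", ""),
        machine_traits.get("timezone", ""),
    ])
    return hashlib.sha256(material.encode()).hexdigest()

blocked_fingerprints = set()

def handle_request(ip, headers, machine_traits):
    # Note: the decision ignores the IP entirely -- the same actor
    # rotating through addresses still matches on fingerprint.
    fp = fingerprint(headers, machine_traits)
    return "blocked" if fp in blocked_fingerprints else "allowed"

bad_headers = {"User-Agent": "Mozilla/5.0", "Accept-Language": "en-US"}
bad_machine = {"screen": "1920x1080", "timezone": "UTC-5"}
blocked_fingerprints.add(fingerprint(bad_headers, bad_machine))

print(handle_request("10.0.0.1", bad_headers, bad_machine))  # blocked
print(handle_request("10.0.0.2", bad_headers, bad_machine))  # blocked
```

An IP-keyed blocklist would let the second request through; keying on the fingerprint is what raises the attacker's cost of obfuscation.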
Some hackers have advanced beyond attacking web applications and have started to go after the API calls that power the web app, or the APIs that power native mobile apps. Those API calls have access to the same database, the same infrastructure and the same data that the website does, so hackers simply program against the API to get what they want. Distil Networks addresses API security from three different angles: web, server-to-server and mobile APIs. The solution acts as an automatic shield against API hijacking, scraping and abuse.
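The point about APIs exposing the same data as the website can be made concrete with a toy example. Everything here is hypothetical (the endpoint, the field names, the inventory table); it only shows why a scraper who discovers the API never needs to parse HTML at all.

```python
# One shared data store backs both the website and the mobile app's
# API -- all names and values below are invented for illustration.
INVENTORY_DB = {"widget": 70, "gadget": 140}

def render_product_page(sku):
    """What the website serves to browsers: data wrapped in markup."""
    return f"<h1>{sku}</h1><p>In stock: {INVENTORY_DB[sku]}</p>"

def api_get_product(sku):
    """What the app's API serves: the identical data, no markup."""
    return {"sku": sku, "in_stock": INVENTORY_DB[sku]}

# A scraper that finds the API skips the HTML entirely:
print(api_get_product("widget"))  # {'sku': 'widget', 'in_stock': 70}
```

Because both paths reach the same database, a defense that only watches page requests leaves the API route wide open, which is why it must be shielded separately.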
Automated web bots are used against companies in so many ways that the use cases are practically unlimited. The sophistication of some of the uses is surprising. For example, there is a financial hedge fund that regularly sends bots to the website of a publicly traded company to gather information about product inventories. By collecting the same information every week, it's possible to deduce how well the products are selling. In other words, if there were 100 widgets in inventory last week, and there are only 70 this week, then 30 widgets must have been sold in that timeframe. This provides the hedge fund with approximate sales data that it can use to make decisions about buying or selling the company's stock ahead of its earnings call.
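The hedge fund's inference is simple arithmetic over successive snapshots, as the sketch below shows. (It deliberately ignores restocks and returns, which a real model would have to account for.)

```python
def weekly_sales_estimate(inventory_by_week):
    """Estimate units sold each week from inventory snapshots.

    Mirrors the inference in the text: 100 widgets last week and
    70 this week implies roughly 30 sold. Restocks are ignored.
    """
    return [prev - cur
            for prev, cur in zip(inventory_by_week, inventory_by_week[1:])]

# Snapshots scraped on three successive weeks:
print(weekly_sales_estimate([100, 70, 55]))  # [30, 15]
```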
It's stunningly clever for the hedge fund, but it probably leaves the public company feeling like its financial data was stolen, which in a sense, it was. Bot detection and mitigation is the way to close this and many other types of web vulnerabilities.