Bot mitigation overview with Advanced WAF - Anti Bot engine

With more and more bot traffic hitting web applications, managing the bots that access them has become a necessity. To manage bot access to your web application you must first be able to detect bots, and only then allow or deny them.

Both actions can be done by F5 Advanced WAF, and this article provides an overview of its bot mitigation capabilities for versions 12.x, 13.x & 14.0.

The Advanced WAF DoS profile is a powerful bot management tool with various options for dealing with bots. We classify them into two main types:

  1. Anomaly based detection – an anomaly engine that identifies increases in RPS generated by bots
  2. Proactive bot defense – a dedicated anti-bot engine that identifies bot activity

Let’s review each one of them in more detail.

 

Anomaly detection engine

 

Bot traffic most often generates an increase in requests per second (RPS). The Advanced WAF anomaly engine has several detection mechanisms that identify an increase in traffic based on different criteria:

By source IP – detects an increase in requests per second from a single IP, the classic indication of a bot generating traffic from a given address. “By source IP” measures both a ratio increase and a fixed increase in requests per second (RPS) from any IP accessing the web application.

Ratio-based RPS anomaly detection compares the current request rate to the historical rate for the same source. Use ratio-based detection when the bot originates from an already known IP that used to send X amount of traffic and now sends two or three times as much; for example, a source that historically sends 10 RPS and suddenly sends 50 RPS trips a 500% ratio threshold. The prevention policy is activated once this ratio is reached for a given source IP.

Fixed-rate RPS anomaly detection defines a request-per-second limit; traffic above it is considered an attack and triggers the prevention policy. Use fixed-rate detection for new source IPs whose spikes pass the fixed threshold.

By Device ID – detects an increase in RPS from a given device ID. The Device ID identifies the actual HTTP agent that generates the requests, which makes source identification more accurate. Device ID exists because much of the internet sits behind gateways, also known as NATed traffic (Network Address Translation), where a single IP represents many devices behind it. Blocking that single source IP would block all the legitimate users behind it.

Similar to “by source IP”, there are two anomaly types, ratio based and fixed rate, calculated as described above.

It is recommended to use Device ID together with source IP detection to isolate the attacking device behind a source IP. Note that Device ID relies on JavaScript injection, so verify that your application tolerates it before enabling it.

By geolocation – sometimes a bot generates traffic from a specific country. This can be detected with geolocation detection, which measures the RPS arriving from a specific country.

It is recommended to use the geolocation anomaly with a low fixed RPS rate when traffic is not expected from those countries.

By URL – detects an increase in RPS on a specific URL. This helps determine whether a bot is hitting the web application even when it runs low and slow, making the source IP hard to detect.

It is recommended to use the URL anomaly when “by source IP” / “by Device ID” cannot detect the RPS increase due to a low-and-slow attack.

By site wide – detects an increase across the entire web application (FQDN). Site wide detects an RPS anomaly on the entire virtual server (most often the application FQDN) and measures both source IPs and URLs to try to determine whether there is an increase in traffic due to bot activity running “under the radar” without being detected by the other anomalies.

 

Each of the detection methods has a prevention option that can be applied to the detected source. This is where the actual mitigation occurs to stop the attack. There are three prevention options available for each of the detections introduced above:

Client Side Integrity Defense (CSID), aka browser challenge – the pioneering client-side injection by ASM (since 2008) to distinguish a browser from a bot. When a request arrives at the WAF DoS profile, the request is held and a CSID script is sent to the originating source. The script is JavaScript that checks whether the source can:

  • Support JavaScript
  • Support HTTP cookies
  • Execute a computational challenge

CSID then sends the answer back to the WAF DoS profile for evaluation; a source that qualifies as a browser is allowed, and the initial request that was held is reconstructed and sent to the web application. If the answer from the CSID does not pass the tests mentioned above, the initial request is dropped.

CAPTCHA is the second prevention option that each of the detection methods has. CAPTCHA is the ultimate human-or-bot test, and many web sites use CAPTCHA to challenge unknown sources that access their applications. The CAPTCHA challenge is presented to the unknown source in the same way as CSID, but unlike CSID, which runs under the hood with no user intervention, CAPTCHA is visible to the user. The Advanced WAF DoS profile CAPTCHA can be fine-tuned to fit the look and feel of the web site for better usability.

Request blocking – request blocking has two options:

Block all – any source that passes the detection thresholds is blocked at the TCP/IP level.

Rate limiting – any source that passes the detection thresholds is rate limited to half of its traffic or to its historical RPS.

Note that block all will end the attack, so it should be used when we are sure that the source is indeed the attacker. Rate limiting, on the other hand, slows down the attacker but still allows other users to access the app.

These three preventions use different approaches: CSID runs with no user intervention while CAPTCHA is visible. CSID and CAPTCHA try to understand who the offending source is (bot or human), whereas request blocking is indifferent to “identity” and simply limits or blocks the offending sources.

 

Reporting

DoS visibility (the AVR module) provides visibility into the traffic accessing your web application, based on the anomalies and mitigations triggered by the detections and preventions mentioned above. The application event report details the actions taken by the DoS profile, including useful information such as the time of the attack and the mitigations that were applied.

These graphs are located under Security -> Reporting -> DoS -> Dashboard.

Anomaly Summary

The anomaly engine in the Advanced WAF DoS profile is a powerful TPS-based anti-bot detection that identifies bot activity by monitoring the number of requests on various entities, such as source IP, geolocation, a specific URL, etc.

  • By source IP – detects an RPS increase per source IP – use it to detect individual bots
  • By Device ID – use it to detect bots behind NATed source IPs
  • By URL – use it to detect bots that focus on a single URL or a fixed set of URLs
  • By Geolocation – use it to detect bots when they originate from a specific country
  • By Site wide – use it when the other detections do not trigger but the site still experiences load (low-and-slow attacks)

Once the anomaly engine identifies an increase in requests, the prevention policy is applied to the source that triggered it. Client Side Integrity Defense checks that the source is a browser and blocks it if not, CAPTCHA checks whether it is a human, and rate limiting slows the source down.

  • Client Side Integrity Defense – use it when you want to allow only browsers into the site and the check should not be visible to the user.
  • CAPTCHA – use it when you want to evaluate whether a source is a human or a bot and a user-visible check is acceptable.
  • Block – rate limit – use it when you do not want to block all traffic from a specific source but you do want to slow down the attack.
  • Block – blocking – use it when you want to block the offending source and reset its connections.

 

Proactive bot defense

 

The second engine available in Advanced WAF is the anti-bot engine, which is also part of the ASM DoS profile. The anti-bot engine is a dedicated feature set for dealing with attacks originated by bots, and its mitigations focus on the legitimacy of the client side.

 

Bot signatures  

The first mitigation for bots is the bot signature mechanism, which matches user agent strings to detect known bad bots. Bot signatures include two predefined signature sets, benign and malicious, which provide a way to monitor the site’s bot traffic or to block unwanted bots.

Bots can be managed so that a specific bot is allowed to access the site with or without reporting, or is reported and blocked. The predefined bot signatures should be used to understand the bot traffic accessing your web site. During an attack, these signatures can protect your site when they are triggered by offending sources.

Custom bot signatures can be created for specific bot traffic. Custom signatures can be written in simple mode for quick usage, or in advanced mode, which allows writing more granular signatures: Manual for bot signatures.

For example, you may identify a specific user agent on an offending source that is not in the bot signature list. Adding that user agent to the bot signature pool will prevent the attack from this bot; a stopgap sketch of the same idea follows.
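
While the actual signature is configured in the DoS profile, the same idea can be sketched as a stopgap iRule while the custom signature is being written. This is an illustration only, not the bot signature mechanism itself, and "EvilScraper" is a hypothetical user agent string:

when HTTP_REQUEST {
    # "EvilScraper" stands in for the user agent observed on the offending source
    if { [HTTP::header "User-Agent"] contains "EvilScraper" } {
        # Reset the connection, similar to a blocking custom signature
        reject
    }
}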

 

Anti bot impersonation

Advanced WAF also has a powerful mechanism that validates user agent strings to prevent bad bots from impersonating good bots. Since the user agent can be easily forged, good bots include a domain name, so that they can be verified to be who they claim to be by issuing a reverse DNS lookup.

 

For example: Googlebot/2.1 (+http://www.google.com/bot.html)

Since Googlebot is a good bot, it should be allowed based on its user agent. However, only a reverse DNS check on the FQDN in the user agent can confirm that this is truly Googlebot arriving from its known IP as expected.

This configuration prevents most unwanted bots and improves application performance, as various reports claim that around 50% of application traffic is bots. This anti-impersonation bot engine can reduce the amount of bot traffic to the web application and is considered a best practice today.

To use the anti-bot impersonation engine, a DNS resolver and a DNS lookup list must be defined. A sketch of the underlying double-lookup logic is shown below.
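
For illustration, the double lookup the engine performs can be approximated in an iRule. This is a minimal sketch, assuming a reachable DNS server at 8.8.8.8 (an assumption, not part of the product configuration) and the RESOLV::lookup command; the built-in engine does this natively and should be preferred:

when HTTP_REQUEST {
    # Only verify requests that claim to be Googlebot
    if { [HTTP::header "User-Agent"] contains "Googlebot" } {
        # Reverse lookup: fetch the PTR record for the client IP
        set ptr [lindex [RESOLV::lookup @8.8.8.8 -ptr [IP::client_addr]] 0]
        # The PTR name must belong to one of Google's verified domains
        if { !([string match "*.googlebot.com" $ptr] || [string match "*.google.com" $ptr]) } {
            reject
            return
        }
        # Forward confirmation: the PTR name must resolve back to the client IP
        if { [lsearch [RESOLV::lookup @8.8.8.8 -a $ptr] [IP::client_addr]] < 0 } {
            reject
        }
    }
}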

 

Anti bot capability checks

Bots can be of various types, and sometimes the only way to detect them is by inspecting their nature, which is what Proactive Bot Defense does. The anti-bot engine is a sophisticated set of checks with the following configuration:

This configuration makes Proactive Bot Defense easy to use for filtering out the bad bots. The concept of the anti-bot engine is to inspect the source gradually:

  1. CSID – are you a browser that supports cookies and JavaScript?
  2. Capabilities script – are you who you say you are? The engine compares the browser's answers to what the anti-bot engine sees and produces a score:
    1. If the score is from 0 to 59, the source is assumed to be a browser and the request can pass through.
    2. If the score is between 60 and 99, the source is declared unknown and a CAPTCHA is sent. If the CAPTCHA challenge is solved, the client is allowed in; a failed CAPTCHA challenge results in a connection reset.
    3. If the score is 100, the request is reset.
  3. CAPTCHA – are you a human who can type characters?

The configuration reflects these options:

  • If Block Suspicious Browsers is unchecked and CAPTCHA Challenge is unchecked -> send the CSID challenge
  • If Block Suspicious Browsers is checked and CAPTCHA Challenge is checked -> send the client capabilities challenge and give it a score:
    • If the score is good, allow access
    • If the score is in doubt, send a CAPTCHA for human verification
    • If the score is bad, block it
  • If Block Suspicious Browsers is checked but CAPTCHA Challenge is unchecked -> do not send a CAPTCHA, and block only if the score indicates a bot rather than a human

 

The operation mode setting has two options:

Always – use it for an immediate response when under attack, to apply Proactive Bot Defense on the entire virtual server.

e.g. the site is under DDoS and I want to mitigate all bot traffic now.

During attack – use it when you want Proactive Bot Defense to be applied only once another detection is triggered.

e.g. the site is not under attack and I want to mitigate with Proactive Bot Defense only when any other anomaly engine (mentioned above) is triggered in transparent mode, or for any request that passes the rate limit of the anomaly engine.

The during attack option provides a very powerful mitigation scenario: only when the site experiences an increase in RPS that indicates bot activity are the sources examined, and suspicious sources are presented with a CAPTCHA challenge or blocked if the capabilities script detects them as bots.

The configuration will be as follows:

  1. Define fixed thresholds for RPS on the anomaly engine in transparent mode
  2. Set Proactive Bot Defense to during attack
    1. Enable Block Suspicious Browsers
    2. Enable CAPTCHA Challenge

 

Whitelisting

It is recommended to whitelist all known IPs that access the site and exclude them from the DoS profile checks, so that when under attack the mitigations are not applied to known good sources; a sketch follows below.
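
The DoS profile exposes this as a whitelist in its configuration. As an alternative sketch only, a similar effect can be approximated with an iRule, assuming a hypothetical address-type data group named trusted_ips and that the BOTDEFENSE::disable command is available on your version:

when BOTDEFENSE_REQUEST {
    # "trusted_ips" is a hypothetical address data group of known good sources
    if { [class match [IP::client_addr] equals trusted_ips] } {
        # Skip bot defense checks for whitelisted sources
        BOTDEFENSE::disable
    }
}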

 

Reporting

Bot defense reporting provides a full overview of the bots (good and bad) that access your web application. These graphs are critical when under attack for identifying the offending sources and easily mitigating the attacks.

 

iRule mitigations

iRules are the F5 Swiss army knife and can be used with the anti-bot engine. In the following example, any source that accesses the login.php URL gets the Proactive Bot Defense check and is allowed only if it passes. The full list of commands for using bot defense with iRules is located here: BotDefense

# EXAMPLE: enable client-side challenges on a specific URL
when BOTDEFENSE_REQUEST {
    if {[HTTP::uri] eq "/login.php"} {
        BOTDEFENSE::cs_allowed true
    }
}
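
iRules can also report the verdict the engine reached. A minimal logging sketch, assuming the BOTDEFENSE_ACTION event and its introspection commands are available on your version:

when BOTDEFENSE_ACTION {
    # Log the action bot defense decided on and the reason for it
    log local0. "bot defense action=[BOTDEFENSE::action] reason=[BOTDEFENSE::reason] from [IP::client_addr]"
}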

 

 

Proactive Bot Defense Summary

Proactive Bot Defense is a dedicated bot detection and mitigation engine that focuses on the capabilities of the attacking agent. There are several layers of protection within Proactive Bot Defense:

  • Bot signatures – is this a known bad / good bot?
  • Bot impersonation checks – is this a valid bot?
  • Browser check – is this a browser?
  • Browser capabilities – which capabilities does the browser actually have, compared to what it claims?
  • CAPTCHA – is this a human?

Proactive Bot Defense can be used with the anomaly engine, which can trigger Proactive Bot Defense once a specific threshold has been reached.

For example: only if the login URL exceeds 20 RPS, apply Proactive Bot Defense (in transparent mode).

Other combinations are also very useful when under attack.

For example: sending the client capabilities script and then a CAPTCHA to verify both that the source is a browser and that it is a human.

Proactive Bot Defense has good reporting that allows fine-tuning of the security policy to match bot traffic.

Finally, iRules can be used to leverage Proactive Bot Defense.

 

Under Attack – use F5 SIRT

 

About F5 SIRT
Published Dec 27, 2018
Version 1.0
