Jeremiah-Swanso
Sep 25, 2014

Web Scraping Configuration

We would like some clarification on F5's Application Security web scraping protection that we can't seem to find in the documentation.

 

Does this block based on session or IP? If a bot is detected during the grace interval, and say we have the unsafe interval set to 100,000, shouldn't it block that IP for 100,000 requests following the detection?

 

We are seeing that once the session is closed it allows that IP back through with another grace interval. The scraping we are receiving is intelligent enough to kill the session once it detects we blocked them, and then opens another session. So, in our event logs we see the same IPs listed multiple times back to back where they were blocked for say 11 requests, then just came right back through with another session.
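
To illustrate what we think is happening, here is a toy model of the grace and unsafe intervals. It is only a sketch under our own assumptions, not how ASM is actually implemented: `ScrapingTracker`, `key_by`, `simulate`, the IP address, and the interval values are all made up for the example, and it assumes the client is always judged to be a bot at the end of the grace interval.

```python
from dataclasses import dataclass, field


@dataclass
class ScrapingTracker:
    """Toy model of grace/unsafe intervals; NOT ASM's actual implementation."""
    grace_interval: int      # requests evaluated before a verdict
    unsafe_interval: int     # requests blocked after a bot verdict
    key_by: str = "session"  # hypothetical switch: "session" or "ip"
    _state: dict = field(default_factory=dict)

    def handle(self, ip: str, session: str, is_bot: bool = True) -> bool:
        """Return True if the request is allowed, False if it is blocked."""
        key = session if self.key_by == "session" else ip
        st = self._state.setdefault(key, {"seen": 0, "blocked_left": 0})
        if st["blocked_left"] > 0:
            st["blocked_left"] -= 1
            if st["blocked_left"] == 0:
                st["seen"] = 0  # unsafe interval over, grace interval restarts
            return False
        st["seen"] += 1
        if st["seen"] >= self.grace_interval and is_bot:
            st["blocked_left"] = self.unsafe_interval
        return True


def simulate(key_by: str, total_requests: int = 1000) -> int:
    """One scraper IP that opens a new session as soon as it gets blocked."""
    tracker = ScrapingTracker(grace_interval=10, unsafe_interval=100_000,
                              key_by=key_by)
    session_id, allowed = 0, 0
    for _ in range(total_requests):
        if tracker.handle("203.0.113.7", f"s{session_id}"):
            allowed += 1
        else:
            session_id += 1  # scraper drops its session and starts over
    return allowed


if __name__ == "__main__":
    print("keyed by session:", simulate("session"))
    print("keyed by IP:     ", simulate("ip"))
```

With these made-up numbers, the session-keyed run lets 910 of 1,000 requests through, because every new session gets a fresh grace interval. The IP-keyed run lets only the first 10 through, which is the behavior we expected.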

 

This isn't the behavior we want. We were under the impression that once an IP was detected as a bot, it would be blocked for the whole unsafe interval we have set.

 

We have tested this from an external connection by sending requests to get detected and blocked by the device, but once we opened a new session we were free and clear again.

 

Is there a setting we need to change to get the desired effect? We have looked through the documentation and don't see what would need to be changed.

 

3 Replies

  • nathe

    Jeremiah,

     

    I think you've found the answer in your tests, but I would have said it was session based, mainly because ASM uses TS cookies as part of its web scraping mitigation.

     

    You're right about the Unsafe Period: the session is blocked for this number of requests, and then the grace period kicks in again to allow ASM to re-evaluate the requests. So it looks like resetting the session is a way back in.

     

    I've just re-checked the GUI and noticed something I haven't seen before: "Persistent Client Identification". Do you have this enabled? From looking at the Help, I wonder whether this will store the client IP and then block on it for the period configured. Worth a look, I think.

     

    If not, then I can only recommend an iRule that responds to the ASM_REQUEST_VIOLATION event.

     

    Hope this helps,

     

    N

     

  • Thanks for the reply, Nathan. I noticed "Persistent Client Identification" after I made this post. We don't currently have it set, but from reading the help information it sounds like it's what we might want: "Specifies, when enabled, that you want the system to log and/or prevent attackers from circumventing web scraping protection by resetting sessions and sending requests. The default setting is disabled. This setting appears after Session Opening is set to Alarm or Alarm and Block."

     

    We tried applying this to one of our security policies, but it caused issues for legitimate clients that should not be blocked, so we had to disable it right away. We need more information on this setting and its related settings to work out how to configure it so that scrapers can't get around a block by opening new sessions, without affecting real users.

     

    • nathe
      Thanks for the update. It would be great if you could feed back what else you find in your testing.