Forum Discussion

Thrillseeker_12
Jun 26, 2016

F5 BIG-IP VE with Virtual Forwarding Server dropped some DNS requests...

Hi all,

 

Last week I had a very strange issue. We have an F5 VE LTM active/standby cluster pair. The setup has been running stably for more than 4 months now. In this environment the F5 VE acts as the default gateway towards the internet for some servers. To achieve this I configured a special forwarding virtual server with a FastL4 profile for any protocol.
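For reference, a wildcard forwarding virtual server of this kind can be created roughly like this (a minimal tmsh sketch with placeholder names; my actual config is posted further down in the thread):

    tmsh create ltm virtual vs_forward-example \
        destination 0.0.0.0:0 mask any \
        ip-forward \
        profiles add { fastL4 } \
        translate-address disabled translate-port disabled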

 

Last Sunday our operations team noticed an issue with DNS requests running through the F5 VE. Note: we do NOT use any DNS (GTM) function on the F5 VE, so DNS traffic just passes through the F5. The funny thing was that when I did a tcpdump, only 3 of 5 DNS requests were answered correctly. For the other 2 requests I never saw a request going out on the outer F5 interface (Internet-Leg). I checked several logs and also performance statistics (CPU, memory, sessions, etc.), but everything looked normal for non-business hours on a weekend.
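For anyone who wants to run the same check: I captured on both legs roughly like this (a sketch; vl-internal is from my config, the external VLAN name is a placeholder):

    # inner leg: DNS requests arriving from the servers
    tcpdump -ni vl-internal udp port 53
    # outer leg (Internet-Leg): requests that actually leave the box
    tcpdump -ni vl-external udp port 53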

 

After some further testing/debugging I decided to do a failover to the standby member. Just after the failover happened, the DNS issue was gone and all DNS requests were answered properly.
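Nothing fancy about the failover itself, just the standard forced failover (tmsh sketch, run on the active unit):

    # force the active unit to standby so the peer takes over
    tmsh run sys failover standby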

 

I already opened a case with F5 and uploaded QKVIEW files. F5 support is still analysing the data but has not been able to find the root cause so far. F5 support said one possible problem could be the Bandwidth Controller policy (rate limit 50 Mbit/s, burst = 0) I've also configured on the forwarding VIP. But when the DNS issue came up, network usage was very low, only around 10-20 Mbit/s. In the past I did a lot of testing with the Bandwidth Controller and never had such issues.

 

What is your experience with FastL4 profiles and any-protocol forwarding? Do you have an idea why DNS requests were not forwarded correctly by the F5 VE? Is there anything special to consider when using Bandwidth Controller policies on any-protocol forwarding virtual servers?

 

Thanks a lot for your feedback/ideas/suggestions,

Thrillseeker

 

9 Replies

  • Sorry, I have to correct myself: I'm not using a Bandwidth Controller policy. Instead, a Rate Class (max. 50 Mbit/s, burst rate 0) is active in both directions on that forwarding virtual server.
  • For the forwarding VS, which protocols are allowed? Maybe consider using "ALL" if that is not the current setting. This suggestion doesn't address the weirdness you have seen, i.e. the standby working fine after becoming active, but I wanted to rule it out.

     

  • Create and apply a new FastL4 profile. Enable Loose Initiation. Enable Loose Close.

     

    See if problem goes away :)
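    A quick tmsh sketch of those two steps (profile and virtual server names are placeholders):

    tmsh create ltm profile fastl4 fastL4-loose \
        defaults-from fastL4 \
        loose-initialization enabled \
        loose-close enabled
    tmsh modify ltm virtual your-forwarding-vs \
        profiles replace-all-with { fastL4-loose }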

     

  • Hi,

    Just to give you a bit more info, you'll find the fastl4 profile, rate-shaping class, and virtual server config below. And yes, "loose-close" and "loose-initialization" are already enabled in the profile.

    ltm profile fastl4 /Common/pf-fastL4_router {
        app-service none
        defaults-from /Common/fastL4
        hardware-syn-cookie enabled
        idle-timeout 300
        ip-tos-to-client pass-through
        ip-tos-to-server pass-through
        keep-alive-interval disabled
        late-binding disabled
        link-qos-to-client pass-through
        link-qos-to-server pass-through
        loose-close enabled
        loose-initialization enabled
        mss-override 0
        pva-dynamic-client-packets 1
        pva-dynamic-server-packets 0
        pva-offload-dynamic enabled
        reassemble-fragments disabled
        receive-window-size 0
        reset-on-timeout disabled
        rtt-from-client disabled
        rtt-from-server disabled
        server-sack disabled
        server-timestamp disabled
        software-syn-cookie disabled
        syn-cookie-whitelist disabled
        tcp-close-timeout 5
        tcp-generate-isn disabled
        tcp-handshake-timeout 5
        tcp-strip-sack disabled
        tcp-timestamp-mode preserve
        tcp-wscale-mode preserve
    }
    
    
    net rate-shaping class /Common/rs_50Mbps-max {
        ceiling 50mbps
        drop-policy /Common/fred
        queue /Common/pfifo
        rate 50mbps
    }
    
    
    ltm virtual /Common/vs_forward-0.0.0.0_any {
        destination /Common/0.0.0.0:0
        ip-forward
        ip-intelligence-policy /Common/ip-intelligence
        mask any
        profiles {
            /Common/pf-fastL4_router { }
        }
        rate-class /Common/rs_50Mbps-max
        source 0.0.0.0/0
        source-address-translation {
            pool /Common/sp_default
            type snat
        }
        translate-address disabled
        translate-port disabled
        vlans {
            /Common/vl-internal
        }
        vlans-enabled
    }
    
  • If the configuration is the same on both the active and standby devices and the issue is seen only on the current standby device (the previously active one), I would also check whether any intermediate devices have a different configuration that could impact traffic flow to the current standby device.

     

  • Sure, I will check the bigip.conf on both devices, but I think they are absolutely identical. The solution was running fine for more than 5 months without any issues. It's the first time that some DNS requests were not forwarded properly by the F5 LTM.

     

    Any other comments, suggestions, or feedback from the F5 community out there? Thanks!

     

  • Out of curiosity: what software version is in use?

     

    Do you have a tcpdump on the client side that shows the packets going into the F5, and a corresponding one showing nothing going out?

     

    Is there anything strange about the DNS responses going outbound? Could they have been truncated, with an upstream firewall then blocking the TCP retry?
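    One quick way to check that from a client behind the F5 (sketch; the resolver address is a placeholder):

    # a truncated answer shows "tc" in the flags line of the header
    dig @192.0.2.53 example.com mx +noall +comments
    # force TCP to verify that TCP/53 is allowed upstream
    dig @192.0.2.53 example.com mx +tcp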

     

  • Unfortunately I do not have a packet capture at the moment. Maybe I will find a service window this week to reproduce the issue.

    At the moment 11.6 HF6 is installed on both F5 BIG-IP VEs.

    The DNS requests in my tests were just normal "dig" commands with the mx option, like:

    dig www.google.com mx

    As I saw in the tcpdumps, the requests were simple UDP/53 on the inner F5 interface. Of 5 DNS requests (dig), 3 were answered correctly and 2 were not answered at all... After failover to the standby unit the DNS issue was gone...

    The F5 units are directly connected to the internet in our cloud. The only device in between is the internet gateway (a Cisco router), but there aren't any ACLs preventing DNS from switching from UDP to TCP.

  • Hi all,

     

    In the meantime we found the cause of the dropped DNS requests. After removing the rate-shaping class from the forwarding virtual server, the problem was solved. The funny thing is that this rate-shaping policy only affected the primary F5 cluster member; on the secondary one the policy could be applied without any issues. F5 support is still analysing the issue, so at the moment we do not use any rate shaping. I'll keep you posted.
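    For anyone hitting the same behaviour, detaching the rate class is a one-liner (tmsh sketch, using the virtual server name from my config above):

    tmsh modify ltm virtual vs_forward-0.0.0.0_any rate-class none
    tmsh save sys config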

     

    Cheers Thrillseeker