Forum Discussion

Jérôme_ANGLES
Nov 25, 2014

Delay between two HTTP requests in the same TCP flow (between 10 and 30 seconds)

I am seeing long waits on an application configured on a BIG-IP LTM 3900 running v10.2.2. I captured the traffic on the BIG-IP, and a few things I see seem strange...

 

  • Within the same TCP flow, there is a gap of about 10-30 seconds between two HTTP requests. The ACK that should precede the delayed HTTP request does not seem to appear in Wireshark... (the number of requests and the number of responses differ according to the Wireshark statistics)

     

  • The server doesn't send anything after its reply to the request (completed?). The response to the delayed request is fine, though.

     

Now I'm looking for help troubleshooting this problem. There is no delay if we access the application directly. Is Wireshark not the best tool to help me understand the delay? What can I do with BIG-IP profiles? (I use the standard TCP profile and the standard HTTP profile with caching disabled.)

 

Thanks in advance for your help.

 

Jérôme

 

20 Replies

  • giltjr

    To make sure I understand what you are seeing.

     

    1. HTTP request from the client to the F5 takes "no time at all."
    2. HTTP request is forwarded from the F5 to the server in "no time at all."
    3. HTTP response from the server to the F5 takes "no time at all."
    4. HTTP response from the F5 to the client takes 10-20 seconds.

    If this is correct, what I would do is examine the packet capture from the F5 showing the interaction between the F5 and the client and the packet capture from the client. Look at the time stamps on each individual packet.
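
One way to act on the advice above (looking at the timestamps on each individual packet) is to export the capture to a list of timestamps and flag the large gaps. The following is only a minimal Python sketch: the `find_gaps` helper, the 1-second threshold, and the sample packets are all made up for illustration and are not taken from the real capture.

```python
from typing import List, Tuple

# (capture timestamp in seconds, free-form description of the packet)
Packet = Tuple[float, str]


def find_gaps(packets: List[Packet], threshold: float = 1.0) -> List[Tuple[float, str, str]]:
    """Return (gap_seconds, previous_description, next_description) for every
    pair of consecutive packets separated by more than `threshold` seconds."""
    ordered = sorted(packets, key=lambda p: p[0])
    gaps = []
    for (t_prev, d_prev), (t_next, d_next) in zip(ordered, ordered[1:]):
        delta = t_next - t_prev
        if delta > threshold:
            gaps.append((round(delta, 3), d_prev, d_next))
    return gaps


# Hypothetical sample data, NOT taken from the capture discussed in this thread.
sample = [
    (0.000, "client -> F5: GET /page"),
    (0.002, "F5 -> server: GET /page"),
    (0.010, "server -> F5: 200 OK"),
    (0.011, "F5 -> client: 200 OK (first segment)"),
    (12.500, "F5 -> client: 200 OK (last segment)"),
]

for gap, before, after in find_gaps(sample):
    print(f"{gap:.3f}s between [{before}] and [{after}]")
```

Running the same function over the F5-side capture and the client-side capture should show on which leg the long gap first appears.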

     

  • I have looked many times at the captures from each side of the F5, and it seems that:

     

    • The requests are delivered in no time and look fine (from the client to the F5 and from the F5 to the server).
    • The response seems to go well from the server to the F5 (I don't see any packets on this side during the wait, neither in the F5 capture nor in the server capture).
    • There are some retransmissions from the client during the transfer, and each retransmission takes 1 more second. What can I do to explain those retransmissions and find a solution? I tried different web browsers, but the problem is still the same.
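
To illustrate how the retransmissions mentioned above could be picked out of a capture and timed, here is a small sketch in Python. It assumes the segments of one direction of the flow have been exported as (timestamp, sequence number, payload length) tuples; the `find_retransmissions` name and the sample values are hypothetical, and SYN/FIN handling is deliberately left out.

```python
from typing import Dict, List, Tuple

# (timestamp in seconds, TCP sequence number, payload length in bytes)
Segment = Tuple[float, int, int]


def find_retransmissions(segments: List[Segment]) -> List[Tuple[int, float]]:
    """Return (seq, seconds since the previous transmission of that seq) for
    each data segment whose sequence number was already seen in this direction."""
    last_seen: Dict[int, float] = {}
    found = []
    for ts, seq, length in segments:
        if length == 0:
            continue  # skip pure ACKs; this simplified sketch also ignores SYN/FIN
        if seq in last_seen:
            found.append((seq, round(ts - last_seen[seq], 3)))
        last_seen[seq] = ts
    return found


# Hypothetical client -> F5 segments, not taken from the real capture.
client_segments = [
    (0.00, 1, 400),    # original request
    (0.30, 1, 400),    # first retransmission
    (1.30, 1, 400),    # second retransmission, one second later
    (1.31, 401, 0),    # pure ACK, ignored
]

for seq, delay in find_retransmissions(client_segments):
    print(f"seq {seq} retransmitted after {delay:.3f}s")
```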
  • giltjr

    Which browser is being used should not make a difference.

     

    You would need to run a packet capture on each device (both the ingress and egress side of that device) that sits between the F5 and the client to see where the packet might be getting dropped.

     

    I would start with the device closest to the F5. By "device" I mean every router, switch, and firewall: anything that could possibly drop or delay a packet.

     

  • Because it is difficult to run a packet capture on our router, which is very critical, I continued testing elsewhere in our network. I found that, anywhere on the network, if:

     

    • my computer is connected at 100 Mb/s, the problem appears
    • my computer is connected at 1000 Mb/s, there is no problem

    So my question is: how can the speed negotiated by my computer change the behaviour of the application when it needs so little traffic...?

     

  • giltjr

    Are you doing anything on your computer to change the speed you are connecting at? What type of switch is your computer connected to?

     

    Double check your duplex. If the switch port is doing full duplex and your computer is doing half, you will get increased response time.

     

    Earlier you stated that if you bypassed the F5 you got good response time. If you connect at 100 Mbps and bypass the F5, is the response time fast or slow?

     

  • I don't do anything to change the speed and duplex on my computer. Both the switches and the computers are configured for autonegotiation. The duplex is always full on both switch and computer.

     

    I tried different switch types:

    • Avaya ERS 4550T PWR (user ports are 100 Mb/s and there are 2 x 1 Gb/s ports); I tried both
    • Avaya ERS 4850 GTS (all ports are 1 Gb/s)
    • Avaya ERS 4548 (all ports are 1 Gb/s)
    • Avaya ERS 8800 (router; I used a copper GBIC to run the test, 1 Gb/s only)

     

    The results are always the same anywhere on the network:

    • with 1 Gb/s full duplex: the application works fine
    • with 100 Mb/s full duplex: the application suffers from delays (10-30 s)

     

    Also, if I bypass the F5 and use the real IP of the server, everything works fine.

     

  • giltjr

    What I would suggest is:

     

    1. Run tcpdump to capture the output stream from the F5 to the client.
    2. Run a packet capture on the client to capture the stream from the F5.
    3. Set up a port mirror on the Avaya ERS 4550 uplink.
    4. Access the application.
    5. Look at all of the packet captures and see where the problem is.

    Since you have an ERS 4550 PWR, I am assuming that you have VoIP. Do you also have QoS set up? Is the 4550 getting flooded with too much data and dropping things?

     

  • I already ran some captures as you suggested (see the command in the second answer). What I see is some missing packets, some duplicate ACKs, and some retransmissions. At one point there is 1 second between retransmissions, and that accounts for the delay I am talking about.

     

    Also, this morning I tried the exact same configuration on a VM installation with the same BIG-IP version and the same application, and everything works fine.

     

    Is it possible that the BIG-IP LTM 3900 has some communication issue? (It is attached with 2 x 1 Gb RJ45 links with LACP.)

     

  • giltjr

    In the packet capture that you did on the F5, are you seeing duplicate ACKs or retransmission requests?

     

    You did a capture on the server, the F5, and the client. But I don't see where you did any captures on the devices between the F5 and the client.

     

    If you can see the packet leaving the F5, and NOT arriving at the client, then it is getting dropped someplace in between.

     

    You would need to do packet captures at various points to see where the packets are getting dropped.
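
The comparison described above (a packet seen leaving one capture point but not arriving at the next) can be sketched as follows. This is only an illustration in Python, assuming both captures have been reduced to (sequence number, payload length) pairs for the same direction of the same TCP connection and that both show the same segment boundaries; `lost_between` and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (TCP sequence number, payload length in bytes)
PacketKey = Tuple[int, int]


def lost_between(upstream: Iterable[PacketKey], downstream: Iterable[PacketKey]) -> Dict[PacketKey, int]:
    """Return how many copies of each segment were seen at the upstream
    capture point but never showed up at the downstream capture point."""
    # Counter subtraction keeps only the segments with a positive shortfall,
    # so a segment that was lost once and then retransmitted is still reported.
    return dict(Counter(upstream) - Counter(downstream))


# Hypothetical data: the F5 sends one segment twice, the client receives it once.
f5_side = [(1, 1460), (1461, 1460), (2921, 1460), (1461, 1460)]
client_side = [(1, 1460), (2921, 1460), (1461, 1460)]

for (seq, length), copies in lost_between(f5_side, client_side).items():
    print(f"seq {seq} ({length} bytes): {copies} copy/copies never arrived")
```

Repeating the comparison for each pair of neighbouring capture points narrows down the hop where the segments disappear.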

     

    I doubt that the LTM 3900 itself is having a problem; my guess is that something in the network between the 3900 and the clients is having a problem.

     

    You say the 3900 has two connections set up in LACP. Are both connections to the same switch, or are they to two different switches?

     

    If to the same switch, are both switch ports configured in an LACP group?

     

    If different switches, do both switches have clean paths to the client?

     

  • I posted a complete answer to the problem, but when I tried to edit it to add an "s" to a word, the entire text disappeared... So here is the solution again:

     

    The symptoms look like "bufferbloat".

     

    The solutions are:

     

    • use a "Performance L4" type of VS
    • use a lower value for the "send buffer" parameter in the BIG-IP client TCP profile.
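
As a rough way to see why a smaller send buffer might help a client connected at 100 Mb/s, the sketch below models a single burst arriving at a switch faster than the client port can drain it. It is a deliberately simplified model in Python, not a description of how the BIG-IP or the Avaya switches actually behave: the `dropped_bytes` function, the queue size, and the buffer sizes are assumptions chosen for illustration.

```python
def dropped_bytes(burst_bytes: int, ingress_bps: float, egress_bps: float, queue_bytes: int) -> int:
    """Estimate how many bytes of one burst overflow a port queue.

    The burst arrives at `ingress_bps` and drains at `egress_bps`; whatever
    backlog exceeds `queue_bytes` is counted as dropped.
    """
    if egress_bps >= ingress_bps:
        return 0  # the port drains at least as fast as the burst arrives
    # While the burst is arriving, only a fraction egress/ingress of it drains.
    peak_backlog = burst_bytes * (1 - egress_bps / ingress_bps)
    return max(0, int(peak_backlog - queue_bytes))


# All values below are assumptions, not measurements from this network.
GIGABIT = 1e9          # link between the BIG-IP and the network
FAST_ETHERNET = 1e8    # client port at 100 Mb/s
QUEUE = 32768          # hypothetical per-port queue size in bytes

for send_buffer in (65535, 16384):
    for client_speed in (FAST_ETHERNET, GIGABIT):
        lost = dropped_bytes(send_buffer, GIGABIT, client_speed, QUEUE)
        print(f"send buffer {send_buffer:>5} B, client at {client_speed / 1e6:>6.0f} Mb/s: {lost} B dropped")
```

With these assumed numbers, the larger burst overflows the queue on the 100 Mb/s port, while the smaller burst, or the same burst toward a 1 Gb/s port, does not.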