Forum Discussion

hmatt_36737
Nimbostratus
Sep 03, 2013

Forwarding Virtual Servers, SNAT Pools, and Port Collisions

So, I have a Forwarding (IP) Virtual Server set up with its own SNAT pool and Source Port set to change. The SNAT pool currently has four IPs assigned to it. The Virtual Server has a custom FastL4 Profile assigned with a 30 second TCP Close Timeout and Loose Close enabled. This forwarding server is used to hit a single external IP with which we do a rather large number of transactions, in the area of 500 connections per second at peak usage.

 

We are running into an issue with session collisions on a small number of these connections, where the remote host has not yet released the four-tuple when we attempt to reuse it. This causes the connection attempts to time out. This occurs any time we attempt to reuse a given four-tuple in less than 16 seconds; any attempts that wait at least 16 seconds succeed. With the 4 IPs and 500 connections per second we should be able to go for 524.28 seconds (65535 ports/IP * 4 IPs / 500 ports/second) before needing to reuse ports.
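As a quick check of the arithmetic above, here is a minimal Python sketch. The function name is hypothetical, and it assumes ports are handed out strictly in sequence with no pair reused early, which may not match how the LTM actually allocates them.

```python
# Back-of-the-envelope check of the port-reuse interval quoted in the post.
PORTS_PER_IP = 65535          # figure used in the post; the usable range may be smaller
SNAT_IPS = 4                  # addresses in the SNAT pool
CONNECTIONS_PER_SECOND = 500  # peak connection rate


def seconds_until_reuse(ports_per_ip, snat_ips, conn_rate):
    """Time to cycle through every source IP/port pair once.

    Assumes strictly sequential allocation (an assumption, not confirmed
    BIG-IP behaviour) and a constant connection rate.
    """
    return ports_per_ip * snat_ips / conn_rate


reuse_interval = seconds_until_reuse(PORTS_PER_IP, SNAT_IPS, CONNECTIONS_PER_SECOND)
print(f"{reuse_interval:.2f} seconds")  # 524.28 seconds
```

The 524.28 second figure only holds if allocation really does walk the whole pool before coming back to a pair.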

 

Based on my understanding of the TCP Close Timeout, this should force any connections to wait 16 seconds before attempting to reuse a given four-tuple after the connection is successfully closed; however, this does not appear to be happening. Looking at packet captures, I see the same four-tuples being reused in as little as 1.6 seconds.

 

Does anyone know of a way to force an LTM to wait for a given time period before allowing a four-tuple to be reused for connections flowing through a Forwarding (IP) Virtual Server?

 

I've put in a ticket with F5 support, but haven't been able to get anywhere following that route. Does anyone on DevCentral have any tips?

 

3 Replies

  • I assume it's only the source IP and port that's being reused; the destination IP and port don't ever change, right? I wonder if your calculation is correct. The 524s figure doesn't take into account how long the connections are open for. I assume the SNAT addresses are not used elsewhere? Also, can I assume no source NAT is occurring on the client addresses before they hit the VIP?
  • Yes, that is correct: the error occurs when the same IP/port pair is used on both ends, but we do always connect to the same remote IP and port. As for the connection calculation, the connections are all simple HTTPS connections, typically lasting in the area of 0.3 - 0.5 seconds. Your assumption on the SNAT pool is correct; it is only assigned to and used by this single forwarding virtual, which connects only to a single remote IP. There are no NATs occurring outside of this single SNAT pool prior to the connection leaving the F5, so yes, no source NAT on the client addresses prior to them hitting the forwarding virtual.
  • Based on my understanding of the TCP Close Timeout this should force any connections to wait 16 seconds before attempting to reuse a given four-tuple after the connection is successfully closed

    I do not think the SNAT port is reserved until the close timeout passes.

    TIME_WAIT and “port reuse”

    http://blog.davidvassallo.me/2010/07/13/time_wait-and-port-reuse/

    Anyway, can you try changing the tm.portfind.linear and tm.portfind.random db variables to see if that helps?

     tmsh modify sys db tm.portfind.linear value 0 
     tmsh modify sys db tm.portfind.random value 32
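To see why the allocation order might matter here, the following Python sketch estimates how often a four-tuple would be reused within 16 seconds if source IP/port pairs were picked uniformly at random rather than in sequence. This is purely an illustrative assumption about allocation, not a statement of how BIG-IP or the tm.portfind variables behave, and the function name is hypothetical.

```python
def early_reuse_probability(total_tuples, conn_rate, window_seconds):
    """Chance that a uniformly random pick matches at least one of the
    tuples chosen during the preceding window.

    Assumes independent, uniform picks and a constant connection rate;
    this is a simplification, not a model of real BIG-IP behaviour.
    """
    recent_picks = conn_rate * window_seconds
    return 1.0 - (1.0 - 1.0 / total_tuples) ** recent_picks


total = 65535 * 4  # port figure from the original post times four SNAT IPs
p = early_reuse_probability(total, conn_rate=500, window_seconds=16)
print(f"{p:.1%} of new connections would land on a recently used tuple")
```

Under those assumptions roughly 3% of connections would collide with a tuple used in the last 16 seconds, whereas strictly sequential allocation would not revisit a tuple until the whole pool had been cycled.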