Forum Discussion

rmd1023
Feb 04, 2013

Timeouts when I have more than 1500 sessions

Hi there!

I've got an LTM 3900 running in layer 3 mode, and a virtual server with a TCP profile that has the idle timeout set to 3600s on both the client and server side. With a small number of users, they can sit idle without a problem. But once about 1500 users are going through, the first 700 or so start timing out after only 20 minutes.

Looking at performance stats on the box, I'm not seeing it running out of resources. Is there some resource allocation issue I'm not seeing? Something I should be tuning?

Thanks!

--r

7 Replies

  • Hmmm, sounds like connection reaping at play. Can you check what is configured for the Reaper Low-water and High-water Marks under System > Configuration > Local Traffic > General?

    Also, what's the memory usage when you have the higher number of users?
  • Reaper High-water Mark: 95%

    Reaper Low-water Mark: 85%

    It doesn't look like the Overview > Statistics > Memory graph is showing me the actual memory utilization: the lines on the graph are flat. Is there a way to get that information? Or is the graph accurate and my memory usage is remarkably consistent?

    Thanks!
  • I'm not seeing any sweeper_update messages in /var/log/ltm, however.
  • OK, thanks. The graph is probably correct, but you could verify with [tmsh] show sys memory.

    I can't really think of anything else at this point that might shorten any configured Idle Times. What is the Idle Time set to anyway?
  • Yeah, I'm kind of stumped. Idle time is set to an hour on both sides of the connection, so it shouldn't be timing out at 20 minutes.

    Thanks.
  • Absolutely not. Perhaps it's not the F5 but a function of the real servers when they are under load? Have you checked the resource usage there when connections are high?

    To be 100% sure, I'd say a packet capture using tcpdump is in order. You can keep the captured data low by capturing only from one test client that you connect when connections are low, just before they normally ramp up. You can capture the client side and the server side at the same time. Then you'd know for sure who's actually closing the connection: the F5 or the servers.
  • It's beginning to look like it's not the load balancer closing the connections, which explains why I can't find a reason for the load balancer to be closing them.
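
For anyone following the same troubleshooting path, the checks suggested in this thread could be run from the BIG-IP command line roughly as follows. This is a minimal sketch under stated assumptions, not a verified procedure: the client address 192.0.2.10, the capture path /var/tmp/idle-test.pcap, and the profile name tcp are placeholders that do not come from the thread, and the 0.0 interface is assumed to capture across all VLANs. If SNAT is in use, the server-side flow will not carry the client address, so the host filter would need the relevant server-side addresses as well.

```shell
# Placeholders (assumptions, not from the thread): adjust to your environment.
CLIENT_IP="192.0.2.10"             # the single test client
PCAP="/var/tmp/idle-test.pcap"     # where the capture file is written

# Check memory usage and the idle timeout configured on the TCP profile.
# "tcp" is assumed to be the profile name; substitute the profile actually
# attached to the virtual server.
tmsh show sys memory
tmsh list ltm profile tcp tcp idle-timeout

# Look for reaper (sweeper) activity in the LTM log.
grep -i sweeper /var/log/ltm

# Capture only the test client's traffic on all VLANs (interface 0.0), so
# the client-side and server-side flows land in one file.
# Stop with Ctrl-C once the session has dropped.
tcpdump -ni 0.0 -s0 -w "$PCAP" host "$CLIENT_IP"

# Afterwards, list only the FIN and RST packets to see which side closed first.
tcpdump -nr "$PCAP" 'tcp[tcpflags] & (tcp-fin|tcp-rst) != 0'
```

If the first FIN or RST arrives from the server and is then relayed to the client, that would suggest the server ended the session. Resets appearing on both sides at about the same time, with no preceding close from either endpoint, would point back at the BIG-IP.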