Forum Discussion

MR_RJ
Apr 29, 2009

Draining issues with MS IIS and F5 LTM/BIG-IP 9.4.6

Hi,

Maybe this question doesn't belong in this forum, but since it involves the LTM and load balancing I'll post it anyway; there must be others out there with similar issues.

 

 

We have an environment like this:

|Webfront1 MS IIS| |Webfront2 MS IIS|
        ----LTM-HA-PAIR----
|Appserver1 MS IIS| |Appserver2 MS IIS|

 

 

HTTP traffic to the app servers is load balanced without persistence of any kind.

When I try to drain Appserver1 by setting the node to Disabled, most of the connections drain out, but there are always 7-10 connections that won't. Something on the webfronts is keeping them alive.

 

 

I can list them on the LTM from the CLI, and I can see the traffic in the IIS logs.
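
To see which peer is holding those last connections open, one option is to count established connections per remote address on the app server itself. The sketch below parses text in the shape printed by `netstat -an` on Windows; the function name, the sample addresses, and the port are illustrative assumptions, not values from this environment.

```python
from collections import Counter


def established_by_remote(netstat_output: str, local_port: int) -> Counter:
    """Count ESTABLISHED TCP connections to local_port, grouped by remote IP."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected row shape: TCP  <local addr:port>  <remote addr:port>  <state>
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        if state != "ESTABLISHED":
            continue
        if local.rsplit(":", 1)[-1] != str(local_port):
            continue
        counts[remote.rsplit(":", 1)[0]] += 1
    return counts


# Made-up sample output, only to show the expected input format.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State
  TCP    10.0.0.21:80           10.0.0.11:49152        ESTABLISHED
  TCP    10.0.0.21:80           10.0.0.11:49153        ESTABLISHED
  TCP    10.0.0.21:80           10.0.0.12:50001        TIME_WAIT
  TCP    10.0.0.21:3389         10.0.0.99:51000        ESTABLISHED
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING
"""

if __name__ == "__main__":
    for remote_ip, n in established_by_remote(SAMPLE, 80).items():
        print(remote_ip, n)
```

Depending on whether the LTM translates source addresses, the remote IP seen on the app server may be an LTM self IP or a webfront, so the result needs to be read with that in mind.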

 

The only way I can "solve" this is by restarting IIS so that the LTM marks the server as down (at least I think it does; the connection count drops to zero anyway).

 

 

Does anyone have any ideas on how to do a clean, normal drain? Is there a command I can run on the MS IIS side to clear some cache, or on the LTM to force a TCP reset of the active connections to the disabled node?

I've spent plenty of time trying to solve this: googling, opening cases with F5 support, etc.

 

 

Help? =)

 

 

//Robert

 

9 Replies

  • When you select the node, you can choose "Forced Offline (Only active connections allowed)". This should force the node to simply not allow any active connections through.

     

     

     

     

    CB
  • No, it doesn't; it seems that the webfronts are keeping them alive :/

    Trying to figure out why, and whether it's possible to clear some cache on the IIS side or something.
  • No node setting will clear existing connections. If you want to clear existing connections, you could remove them from the connection table using 'b conn'. You can check the 'b conn help' page for details.

     

     

    Disabled (Only persistent or active connections allowed) - existing TCP connections and clients who have/present a valid persistence record would be allowed

     

    Forced Offline (Only active connections allowed) - only existing TCP connections will be allowed

     

     

    It's possible the "existing" connections aren't lingering connections at all, but new ones from monitor traffic. However, why do you need to clear existing TCP connections? If you're taking the node down, the server would reset/drop the connections anyhow.

     

     

    Aaron
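
The two node states described above can be summarized as a small decision function. This is only a model of the rules as stated in this reply, not LTM code; the enum and function names are made up for illustration.

```python
from enum import Enum


class MemberState(Enum):
    ENABLED = "Enabled"
    DISABLED = "Disabled"
    FORCED_OFFLINE = "Forced Offline"


def is_allowed(state: MemberState, existing_connection: bool,
               has_persistence_record: bool) -> bool:
    """Return True if traffic may reach the member under the stated rules."""
    if existing_connection:
        # No node state clears connections that are already established.
        return True
    if state is MemberState.ENABLED:
        return True
    if state is MemberState.DISABLED:
        # New connections only for clients with a valid persistence record.
        return has_persistence_record
    # Forced Offline: nothing new is admitted.
    return False


if __name__ == "__main__":
    for state in MemberState:
        print(state.value,
              "| existing:", is_allowed(state, True, False),
              "| new + persistence:", is_allowed(state, False, True),
              "| new:", is_allowed(state, False, False))
```

With no persistence configured, as in the original post, Disabled and Forced Offline behave the same in this model: only already-established connections keep flowing, which is why long-lived keep-alive connections never drain on their own.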
  • We had a very similar issue to the OP. We ended up raising a support ticket.

     

     

    These steps solved it:

     

     

    Go to the Pool you want to make the change on.

     

    Change the Configuration to "Advanced".

     

    Select "Reject" next to "Action on Service Down".

     

     

    This seems to kill all open TCP connections when Forced Offline is selected for a node.

     

     

    Hope that helps.

     

     

    Cheers

     

    Sam
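
As a rough mental model of what "Action on Service Down" does to connections that are already open, here is a small simulation. It assumes the four options are None, Reject, Drop, and Reselect, and reflects my understanding of them rather than documented LTM internals; `Connection`, `apply_service_down`, and the member names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ServiceDownAction(Enum):
    NONE = "None"
    REJECT = "Reject"
    DROP = "Drop"
    RESELECT = "Reselect"


@dataclass
class Connection:
    client: str
    member: str


def apply_service_down(connections, down_member, action, healthy_members):
    """Return (remaining_connections, events) after down_member goes down."""
    remaining, events = [], []
    for conn in connections:
        if conn.member != down_member or action is ServiceDownAction.NONE:
            # Untouched: either on another member, or left to close/time out.
            remaining.append(conn)
        elif action is ServiceDownAction.REJECT:
            events.append(f"RST sent to {conn.client}")
        elif action is ServiceDownAction.DROP:
            events.append(f"{conn.client} dropped silently")
        elif healthy_members:
            # Reselect: the connection is pointed at another pool member.
            target = healthy_members[0]
            remaining.append(Connection(conn.client, target))
            events.append(f"{conn.client} moved to {target}")
        else:
            events.append(f"{conn.client} has no member to reselect")
    return remaining, events


if __name__ == "__main__":
    table = [Connection("web1", "app1"), Connection("web2", "app2"),
             Connection("web2", "app1")]
    left, log = apply_service_down(table, "app1",
                                   ServiceDownAction.REJECT, ["app2"])
    print(left)
    print(log)
```

In this model the setting only affects connections that already exist on the failed member; new connections are load balanced to the remaining members regardless of which action is chosen.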
  • I've also tracked down a great doc by debs on DevCentral that goes into good detail about the different Action on Service Down settings:

     

     

    http://devcentral.f5.com/Default.aspx?tabid=63&articleType=ArticleView&articleId=179

     

     

     

    Cheers

     

    Sam
  • Hi,

     

     

    Thanks for all the answers.

     

    I'm going to look through the document and see if it helps.

    About Reject on service down: doesn't that mean that if one server fails (dies), the LTM will still continue to send traffic to it, but the traffic will be rejected? Since there is also a Reselect option, I guess that one means that if one server/node dies, the traffic is sent to another server/node instead.

     

    Right or wrong?!
  • Most of the apps we use are session state aware (i.e. the session state is stored in a DB), so doing a "Reject" will simply force the client to another node within the pool and their session will continue as normal.

     

     

    We use ASP.NET with the web-heads talking to the app layer via SOAP.

     

     

    I believe a TCP RST (reset) simply closes the connection; the client then has to open a new connection and resend the request, which lands on another node, and most apps should continue as normal.

     

     

    I haven't had an opportunity to test the "Reselect" option yet. It was the "Reject" option that was recommended by F5 support.
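
For the calling side, a reset typically surfaces as a connection error that the caller has to handle by reconnecting. A minimal sketch of that retry logic follows; `send_request` is a hypothetical callable that opens a fresh connection and performs one request, and the retry counts are arbitrary.

```python
import time


def call_with_reconnect(send_request, attempts=3, delay=0.0, sleep=time.sleep):
    """Call send_request(), retrying when the peer resets the connection."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            # Each call is expected to open a new connection, so the load
            # balancer can pick a different pool member.
            return send_request()
        except ConnectionResetError as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(delay)
    raise last_error


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_request():
        # Simulates one reset followed by a successful response.
        state["calls"] += 1
        if state["calls"] == 1:
            raise ConnectionResetError("connection reset by peer")
        return "200 OK"

    print(call_with_reconnect(flaky_request))
```

Retrying is only safe if the request is idempotent or the session state lives outside the app server, as with the DB-backed sessions mentioned above.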
  • That's interesting.

     

     

    The response I received from F5 support was:

     

     

    "I think you can do it this way to achieve what you want. You can turn on "Reject" for "Action On Service Down" option under Pools. When you force the pool member to go offline, the client connection will be resetted.

     

     

    The option is under "Local Traffic" > "Pools".

     

    - Under pool Properties, select the "Configuration" from "Basic" to "Advanced".

     

    - You can find the option "Action On Service Down" there.

     

     

    Let me know if this is what you are looking for."

     

     

    I've had the developers test this and they have observed the correct behaviour and provided signoff on this method.

     

     

    I would suggest trying it and seeing if it accomplishes what you need it to.

     

     

    Cheers

     

    Sam
  • Hi,

     

     

    I've changed "Action On Service Down" to "Reject" on the pool. The behavior I've noticed is that if I set the node (pool member) to Disabled, it slowly drains the connections, which takes forever with these web servers. If I set it to Forced Offline, it immediately clears out the connections, and we haven't seen any issues in the logs.

     

     

    So at the moment, everything looks good!

     

     

    //Robert
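
When scripting a drain like this, it can help to wait for the connection count to reach zero before forcing anything. The helper below is a generic polling sketch; `get_count` is a hypothetical callable that returns the current number of connections to the member (however that is obtained), and the timeout and interval values are arbitrary.

```python
import time


def wait_for_drain(get_count, timeout=300.0, interval=5.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll get_count() until it returns 0; return False on timeout."""
    deadline = clock() + timeout
    while True:
        if get_count() == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Fake counter standing in for a real connection-count lookup.
    samples = iter([9, 7, 7, 0])
    drained = wait_for_drain(lambda: next(samples), timeout=60.0,
                             interval=0.0)
    print("drained" if drained else "timed out, consider Forced Offline")
```

A `False` result would be the point at which to fall back to Forced Offline with Reject, as described above.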