Take Pool Member offline for service upgrade

Question

Disclaimer: I may not be using the correct terminology. (Pool member environment: Windows Server 2008 R2)
We have a pool named LF_Pool with two members: 10.16.20.83:9080 and 10.16.20.84:9080.
I want to regularly upgrade the service running on port 9080 on each pool member. I don't want to haphazardly stop the service, perform the upgrade, and restart the service... I want to do it elegantly, without causing any downtime or service interruption.
So we found the iControl PowerShell cmdlets and spent the entire day testing them to see if we could achieve our goal, but we can't figure out how to get this working correctly. Sure, we were easily successful in running the PowerShell cmdlets (albeit the iControlSnapInReadme.txt file is disappointingly incorrect in some areas).
We successfully ran "Set-F5.LTMPoolMemberState -Pool LF_Pool -Member 10.16.20.83:9080 -State Offline"... and it appears to be offline... but I am guessing my understanding of Offline is not correct. When I attempt to call the service on that pool member, it readily and happily continues to work... but I don't want any new requests to be allowed into that pool member. (My service is a very simple HTTP REST service; my client application is just a web browser (Chrome) that displays the JSON result.)
How can we get the pool member to be "offline"... meaning it can still return the results of an in-progress transaction, but accepts absolutely no new incoming connections until I permit it? What's the secret?
P.S. We already tried the PsServerControl script to see if that would work... but it still doesn't do what I want. At least, we are pretty certain it doesn't work the way we want it to.
Thanks, Lee

what_lies_bene1 · Answer

Are you using any persistence? When you test a service call, are you sure a new connection is established?

If the pool is configured with the default Action on Service Down setting of None, then established connections remain (as long as data is sent and received), and new ones can possibly be established because of persistence.

If you don't use persistence, then changing the setting to Reject is your best bet, as TCP will recover quickly and connections to the offline server are removed quickly.

If you are using persistence: a) do you really need to? b) using OneConnect helps connections move more quickly, and c) Reject is still an option, as TCP recovery mechanisms may still mean zero data loss and seamless recovery.

leeg_118759 · Answer

Thanks Steve, I think those two ideas are tremendously helpful. I'm not explicitly using persistence in my real application, but when testing from a web browser, the browser itself is... so if I instead test from Fiddler, the request will be more pure and not include the "Keep-Alive" header. So that is what I'll try next. Plus, your suggestion of changing the "Action On Service Down" setting to "Reject" is also a very good idea. I am not the admin of the F5 balancer... so I have to schedule some time with the admin to try these ideas. I'll try them and let you know how things work out.
-- Lee

leeg_118759 · Answer

Well, we were mostly successful, but with two points of wonder. First, I tested my web service using Fiddler (a simple HTTP GET query) without altering the "Action On Service Down" setting (i.e. left it as "None"). When we set one of the nodes in the pool Offline, Fiddler still continued to communicate with that offline node. I don't believe Fiddler is sending any "Keep-Alive" header or any other header to keep persistence... so I don't know what is keeping the connection alive.

Then... when we changed the "Action On Service Down" in the BIG-IP settings window to "Reject"... my tests started working as expected. Steve, do you feel this is the recommended way to take a pool member down for maintenance? Do you recommend an alternative way? I'm not the admin of the BIG-IP... just a developer, so I need a little hand-holding. Thanks.
-Lee

what_lies_bene1 · Answer

OK, we're still missing a complete picture here. Is persistence configured on the F5 itself? If so, that may explain the continued use of the offline node.
Is Fiddler establishing a new TCP connection when you test after marking the node offline? It's not a valid test if it's using an existing connection. This only applies if there is no persistence in play on the F5.

I can't really determine the best approach without understanding the above two points and knowing a bit more about the application itself. The Reject setting is certainly valid if the application recovers from or deals with this well. Happy to help.
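
One way to make sure a test is not riding on an existing connection is to open a brand-new TCP connection for every request and ask for it to be closed afterwards. The following is a minimal sketch in Python using only the standard library; the function name is made up for illustration, and the host and port you pass in (your virtual server address) are not given in this thread, so treat them as placeholders.

```python
import http.client


def fetch_with_fresh_connection(host, port, path="/", timeout=5.0):
    """Send one GET over a newly opened TCP connection, then close it.

    Hypothetical helper for testing: 'host' and 'port' should point at the
    virtual server in front of the pool (values not given in this thread).
    Returns (HTTP status, local source port of the connection, body bytes).
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # Open the socket explicitly so we can record which local port it uses.
        conn.connect()
        local_port = conn.sock.getsockname()[1]
        # "Connection: close" asks the other side not to keep the connection alive.
        conn.request("GET", path, headers={"Connection": "close"})
        response = conn.getresponse()
        body = response.read()
        return response.status, local_port, body
    finally:
        # Always tear the connection down so the next call starts from scratch.
        conn.close()
```

Calling this a few times in a row after marking a member offline, and checking which member answered (for example from something in the JSON result that identifies the server), should show whether new connections are still being sent to the offline member. Persistence configured on the F5 can still influence the result, as noted above.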

brent_west_7733 · Answer

What protocol is the server using? Keep-alives are the default in HTTP/1.1. If you wish to close a connection after each response, you can change this behavior on the server or have the BIG-IP include a "Connection: close" header via an iRule.

With "Action on Service Down" left at None, the BIG-IP will continue to use a down member if the TCP connections on both sides of the proxy are still valid, i.e. a reset hasn't been issued by the server or client.
Setting the pool action to "Reject" will cause a reset packet to be sent to the client, forcing a new TCP connection to be established and a new load-balancing decision to be made (unless there is a persistence record).

Additionally, there is a difference between "Disabled" and "Offline":
Offline still serves an active connection, but will not honor a persistence record.
Disabled will honor a persistence record as well as active TCP connections.
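
The Disabled versus Offline distinction in the reply above can be written down as a small decision function. This is only a toy model in Python of the rules as stated in this thread, not BIG-IP code; the enum and function names are invented for illustration, and it does not model the "Action on Service Down" setting.

```python
from enum import Enum


class MemberState(Enum):
    ENABLED = "enabled"
    DISABLED = "disabled"
    OFFLINE = "offline"


def may_receive_traffic(state, has_active_connection, has_persistence_record):
    """Toy model: may a request still reach a member in the given state?

    Encodes only the rules described in the reply above:
    - Enabled members take any traffic.
    - Disabled members serve active connections and persistence records.
    - Offline members serve active connections only.
    """
    if state is MemberState.ENABLED:
        return True
    if has_active_connection:
        # Both Disabled and Offline keep serving established TCP connections.
        return True
    if state is MemberState.DISABLED:
        # Disabled still honors an existing persistence record.
        return has_persistence_record
    # Offline: no active connection means no traffic, persistence or not.
    return False
```

Under this model, a test that reuses a kept-alive connection reaches the member in either state, which matches the behavior described earlier in the thread when "Action On Service Down" was left at "None".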
