Health Monitor Retries?

Question

We are planning to use some of the monitors on our LTM 8400 9.4x. I noticed that only two settings can be changed: the interval and the timeout. I have two basic questions. First, what is the default retry threshold upon a failure? Second, is there a way to change the retry threshold, so that a monitor has to fail a certain number of times before a device is taken out of service? Please help, thanks!

hooleylist · Answer

The interval and timeout for a monitor are loosely defined as:
interval: how often, in seconds, to send a request
timeout: how long to wait for a successful response before marking the member down
By default, the interval is 5 seconds and the timeout is 16 seconds (timeout = 3 x interval + 1). So the monitoring daemon starts a timer equal to the value of the timeout. It will send a request every five seconds, the length of the interval. When it receives a successful response, the countdown is reset to the timeout value. In this scenario, the node basically has three chances to respond before being marked down. Requests are still sent every interval even when the node is marked down. This allows for automatic resumption of use of the node when it responds correctly to a request. If you want to keep the node marked down even after it responds again, you can enable 'manual resume'.
If you want to give the node more chances to respond before marking it down, you could extend the timeout length. Setting the interval and timeout to 5 and 31 would mean the node would get sent six requests before being marked down.
Does this make sense?
Aaron
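
The arithmetic in the answer above can be sketched in Python. This is only an illustration of the timeout = n x interval + 1 rule of thumb and of the countdown-reset behavior as described in the post; it is not how bigd is actually implemented, and the function names are hypothetical.

```python
def chances_before_down(interval: int, timeout: int) -> int:
    """Number of chances a node gets, per the timeout = n * interval + 1 rule of thumb."""
    if interval <= 0 or timeout <= 0:
        raise ValueError("interval and timeout must be positive")
    return (timeout - 1) // interval


def timeout_for_chances(interval: int, chances: int) -> int:
    """Timeout that gives the node the requested number of chances."""
    if interval <= 0 or chances <= 0:
        raise ValueError("interval and chances must be positive")
    return chances * interval + 1


def first_down_time(timeout: int, success_times: list, start: int = 0) -> int:
    """Time at which the countdown expires, assuming each successful
    response (given as a time in seconds) resets it to the full timeout."""
    deadline = start + timeout
    for t in sorted(success_times):
        if t >= deadline:
            break  # the timer already expired before this response arrived
        deadline = t + timeout  # successful response resets the countdown
    return deadline


# Interval/timeout pairs mentioned in the thread.
for interval, timeout in [(5, 16), (5, 31), (5, 11)]:
    print(interval, timeout, chances_before_down(interval, timeout))
# prints:
# 5 16 3
# 5 31 6
# 5 11 2
```

For example, under these assumptions `first_down_time(16, [2, 7])` returns 23: the last successful response at 7 seconds resets the 16-second countdown, so a node that stops responding afterwards would be marked down 16 seconds later.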

jcmattos_41723 · Answer

Awesome...Clear as can be...Thx again Hoolio!

brad_11480 · Answer

However, it doesn't appear that the timeout determines the number of polls. Or perhaps there is a set minimum of 3? If I set the interval to 5 and the timeout to 11, it still takes 16 seconds before the health status changes.

hooleylist · Answer

Neither should be the case. If you enable debug logging, you should see bigd mark the node down after the timeout expires if the pool member hasn't responded to any of the requests.

You can enable bigd debug by running 'b db bigd.debug enable' from the command line. The output is written to /var/log/bigdlog.

Aaron

dennypayne · Answer

In my experience, setting an interval/timeout shorter than the 5/16 defaults often results in a node flapping up and down. It depends on the app, but in general I don't recommend lowering these.

Denny
