LTM SSL health check fails when standby node reboots
Last night I was upgrading a BigIP LTM 11.4.1 two-node active/standby cluster from HF5 to HF7. The hardware platform is 4200v. While I was rebooting the standby node, a health check on the active node started failing. No changes had been made to the active node. This is the check that started failing:
ltm monitor https CSG_SSL {
    cipherlist DEFAULT:+SHA:+3DES:+kEDH
    compatibility enabled
    defaults-from https
    description "Generic HTTPS monitor but without a send string"
    destination *:*
    interval 30
    send "GET /\\r\\n"
    time-until-up 0
    timeout 16
}
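One way to narrow this down is to repeat roughly the same probe from a workstation that can reach a pool member, to see whether the TLS handshake or the request itself is what fails. The following is only a minimal sketch: `probe` and `build_context` are hypothetical helper names, the host has to be supplied by whoever runs it, and the local OpenSSL build will not match the BIG-IP monitor's SSL stack, so the result is indicative at best.

```python
import socket
import ssl

# Values copied from the CSG_SSL monitor definition.
CIPHERS = "DEFAULT:+SHA:+3DES:+kEDH"
SEND = b"GET /\r\n"  # send string with the \r\n escapes resolved


def build_context(ciphers=CIPHERS):
    """Build a client TLS context that skips certificate validation.

    Assumption: the probe should not validate the server certificate,
    so verification is turned off here.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    # Raises ssl.SSLError if the local OpenSSL cannot select any cipher.
    ctx.set_ciphers(ciphers)
    return ctx


def probe(host, port=443, timeout=16.0):
    """Connect, send the monitor's request, and return what came back.

    Returns (TLS version, negotiated cipher name, first bytes received).
    The 16 second timeout mirrors the monitor's timeout setting.
    """
    ctx = build_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw) as tls:
            tls.sendall(SEND)
            data = tls.recv(4096)
            return tls.version(), tls.cipher()[0], data
```

Calling `probe()` with the address of one pool member either returns the negotiated protocol and cipher or raises an exception. An `ssl.SSLError` points to the handshake, while a timeout or connection error points to the network path or the server itself.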
This monitor had been in production for well over a year and had run fine. When the failure happened, I changed the pool to use a generic TCP monitor to work around the pool outage. Also of note: this SSL health monitor was failing on both the active and standby units. The same monitor, used on a test pool containing one of the production servers, was also failing.
I had a related problem when this same cluster was upgraded to HF5: the SSL health monitor disappeared from the pool, for reasons I was never able to determine.
If it matters, here is the pool. I have removed the IP addresses as they are from a public IP block that is used privately (don't ask... not my design):
ltm pool pool_Citrix_CSG {
    load-balancing-mode least-connections-node
    members {
        (removed):https {
            address (removed)
            session monitor-enabled
            state up
        }
        (removed):https {
            address (removed)
            session monitor-enabled
            state up
        }
        (removed):https {
            address (removed)
            session monitor-enabled
            state up
        }
    }
    monitor tcp
}
I completed the HF7 install and the cluster is operating fine. The SSL health monitor is still failing, so the pool is still running on the generic TCP monitor, as you can see above. I have asked the server owners to drain one server of user connections and reboot it, so that we can test whether the SSL monitor works correctly again.
Has anyone seen this behavior before, where a health monitor fails on the active node when the standby BigIP node is rebooted?