Forum Discussion

Ken_B_50116
Cirrostratus
Jan 21, 2015

LTM SSL health check fails when standby node reboots

Last night I was upgrading a BIG-IP LTM 11.4.1 HF5 active/standby two-node cluster to HF7. The hardware platform is the 4200v. While the standby node was rebooting, a health check on the active node started failing, even though no changes had been made to the active node. This is the check that started failing:

ltm monitor https CSG_SSL {
    cipherlist DEFAULT:+SHA:+3DES:+kEDH
    compatibility enabled
    defaults-from https
    description "Generic HTTPS monitor but without a send string"
    destination *:*
    interval 30
    send "GET /\\r\\n"
    time-until-up 0
    timeout 16
}

This monitor has been in production for well over a year and has run fine. When the failure happened, I switched the pool to a generic TCP monitor to work around the pool outage. Also of note: this SSL health monitor was failing on both the active and standby units. The same monitor, applied to a test pool containing one of the production servers, was also failing.

I had a related problem when this same cluster was upgraded to HF5: The SSL health monitor disappeared from the pool for reasons I was never able to determine.

If it matters, here is the pool. I have removed the IP addresses as they are from a public IP block that is used privately (don't ask... not my design):

ltm pool pool_Citrix_CSG {
    load-balancing-mode least-connections-node
    members {
        (removed):https {
            address (removed)
            session monitor-enabled
            state up
        }
        (removed):https {
            address (removed)
            session monitor-enabled
            state up
        }
        (removed):https {
            address (removed)
            session monitor-enabled
            state up
        }
    }
    monitor tcp
}

I completed the HF7 install and the cluster operates fine. The SSL health monitor is still failing, and the pool is running on the generic TCP monitor, as you can see above. I have asked the server owners to drain one server of user connections and reboot it, so that we can test whether the SSL monitor works correctly again.

Has anyone seen this behavior before, where a health monitor fails on the active node when the standby BIG-IP node is rebooted?

5 Replies

  • NikhilB_149913
    Historic F5 Account

    I don't recall seeing a situation like this. Strange. Did you capture any output from /var/log/ltm on the active unit when you rebooted the standby?

    Have you tried a different monitor (an address check, for example)?


    • Ken_B_50116
      Cirrostratus
      Unfortunately I never did find a solution to this, and I never went back and did any more troubleshooting.
  • Have you tried a tcpdump to see why it is failing? Is it a TCP handshake failure, an SSL handshake failure, or a receive string failure? (For the last one you need the key to decrypt the traffic, with either Wireshark or ssldump.)
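
The three failure points raised above (TCP handshake, SSL handshake, response) can also be told apart without a packet capture by probing a pool member in stages from any host that can reach it. The following is only a rough sketch in Python under stated assumptions: `probe_https` is a hypothetical helper, not an F5 tool, and it does not reproduce the monitor's cipherlist or the BIG-IP's own SSL stack, so a success here does not prove the monitor would pass. The default request mirrors the monitor's send string.

```python
import socket
import ssl


def probe_https(host, port=443, timeout=5.0, request=b"GET /\r\n"):
    """Return (stage, detail): stage is the first step that failed
    ("tcp", "tls" or "http"), or "ok" if all three steps succeeded."""
    # Step 1: plain TCP connect (the three-way handshake).
    try:
        raw = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        return "tcp", str(exc)

    # Step 2: TLS handshake. Certificate checks are switched off because
    # this is a reachability probe, not a trust check (an assumption).
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        tls = context.wrap_socket(raw)
    except OSError as exc:  # ssl.SSLError and timeouts are OSError subclasses
        raw.close()
        return "tls", str(exc)

    # Step 3: send the request and wait for any bytes in reply.
    try:
        with tls:
            tls.sendall(request)
            reply = tls.recv(4096)
    except OSError as exc:
        return "http", str(exc)
    if not reply:
        return "http", "connection closed without a response"
    return "ok", reply[:80].decode("latin-1")
```

Calling `probe_https("(removed)", 443)` against each pool member in turn would show whether the failure is common to all servers or specific to one, which narrows down whether the servers or the BIG-IP changed.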