Forum Discussion

ukstin
Jan 28, 2011

Persistent Monitor

Hi all,

I'm trying to create a monitor that, once connected, stays connected and from time to time sends a string to the server and waits for a string in reply.

I wrote it as a Python script, and it works with a single persistent connection when I run it from the shell. But when I create an external monitor in BIG-IP, every time the monitor runs the script it kills all the processes and starts it again.

I tried creating another script that just calls the main script and puts it in the background, but even then the BIG-IP monitor kills all related processes every time it runs.

Below is my "monitor", which calls the main script that does all the dirty work. When I run it from the shell it works fine, but when it is called by the monitor, every time hsm_monitor finishes, hsm_client.py finishes too. Any suggestions for avoiding this behavior?

 

 

#!/usr/bin/python

import os
import sys
import subprocess

# Receive the variables as parameters
serverIP6 = sys.argv[1]
# Remove the IPv4-mapped prefix. A prefix check is used because
# lstrip("::ffff:") strips a set of characters, not a literal prefix.
serverIP = serverIP6
if serverIP.startswith("::ffff:"):
    serverIP = serverIP[len("::ffff:"):]

# The second command-line parameter defines the port
serverPort = sys.argv[2]

if os.getenv('SEND'):
    message = os.getenv('SEND')
else:
    message = sys.argv[3]

if os.getenv('RECV'):
    resposta = os.getenv('RECV')
else:
    resposta = sys.argv[4]

filename = "/config/monitors/hsm_monitor_status.txt"
pidfile = "/var/lock/hsm_client_" + serverIP + serverPort + ".pid"

program_name = "nohup"
program_2 = "/config/monitors/hsm_client.py"
arguments = (serverIP6, serverPort, message, resposta)
# List form of the command; overwritten by the shell string below
command = [program_name]
command.extend(arguments)
command = "nohup /config/monitors/hsm_client.py " + serverIP6 + " " + serverPort + " " + message + " " + resposta + " &"

# Debug output, disabled: anything written to stdout counts as a monitor response
# print program_2
# print arguments

# Check whether the pid file exists
if os.path.isfile(pidfile):
    file = open(pidfile, 'r')
    file.seek(0)
    pd = file.readline().strip()  # drop the trailing newline before building the /proc path
    file.close()
    if pd and os.path.exists("/proc/%s" % pd):
        # Program is already running, so just collect the status
        file = open(filename, 'r')
        file.seek(0)
        status = file.readline().strip()
        if status == "UP":
            print status
        file.close()
    else:
        # Pid file exists but the program is no longer running; start it
        os.unlink(pidfile)
        subprocess.Popen(command, shell = True, stdout = subprocess.PIPE, stderr = subprocess.STDOUT)
        # Alternative launch attempts:
        # subprocess.Popen(command, stderr = subprocess.STDOUT)
        # os.system(command)
        # pid = os.fork()
        # if not pid:
        #     subprocess.Popen(command, shell = True)
        # os.wait()[0]
else:
    # No pid file exists; run the program
    subprocess.Popen(command, shell = True, stdout = subprocess.PIPE, stderr = subprocess.STDOUT)
    # Alternative launch attempts:
    # subprocess.Popen(command, stderr = subprocess.STDOUT)
    # os.system(command)
    # pid = os.fork()
    # if not pid:
    #     subprocess.Popen(command, shell = True)
    # os.wait()[0]

 

Thanks

 

Ukstin
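
The behavior described in the post (one connection held open, a string sent periodically, and the reply checked) could look roughly like the following. This is a minimal Python 3 sketch, not the poster's hsm_client.py: the names `probe` and `run`, the timeout, the interval, and the reconnect-on-failure policy are all assumptions made for illustration, and the Python version on a given BIG-IP may differ.

```python
import socket
import time


def probe(sock, message, expected, timeout=5.0):
    """Send `message` (bytes) on an open socket; True if `expected` (bytes) comes back."""
    sock.settimeout(timeout)
    try:
        sock.sendall(message)
        received = b""
        # Keep reading until the expected bytes appear or the peer closes.
        while expected not in received:
            chunk = sock.recv(4096)
            if not chunk:
                return False
            received += chunk
        return True
    except OSError:  # covers timeouts and broken connections
        return False


def run(host, port, message, expected, interval=10.0, report=print):
    """Hold one connection open and probe it every `interval` seconds, forever."""
    sock = None
    while True:
        try:
            if sock is None:
                sock = socket.create_connection((host, port), timeout=5.0)
            healthy = probe(sock, message, expected)
        except OSError:
            healthy = False
        if not healthy and sock is not None:
            sock.close()  # drop the broken connection; reconnect on the next pass
            sock = None
        report("UP" if healthy else "DOWN")
        time.sleep(interval)
```

A call such as `run("192.0.2.10", 1500, b"PING\n", b"PONG")` would probe every ten seconds over the same connection; the address, port, and strings here are placeholders.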

 

3 Replies

  • Hi Ukstin,

     

     

    I believe this is by design. The monitoring daemon calls the script to check an individual pool member. Once the script returns anything to standard out, bigd marks the member up and kills the script. I don't know of any way to work around this.

     

     

    Are you concerned with the overhead on LTM or the pool member of opening and closing TCP connections? If so, you could try to back off the frequency of the polling. You could also consider using an inband monitor to check load balanced connections in addition to monitor initiated connections.

     

     

    Aaron
  • ukstin
    Hi hoolio, thanks for your help. I also think the monitors act like this by design. I've changed the script a little, and now its execution is controlled by another script that runs from cron. The main script just writes the status to a file, and another script, the external monitor, reads that file and returns the status to BIG-IP.

     

     

    I needed this not because of the overhead of opening connections as such, but because the pool members don't handle TCP connections well, and opening and closing connections frequently could impact the server.

     

     

    Anyway, now it works, thanks for your help.

     

     

    regards,

     

    Ukstin

     

  • That's a novel solution. If you have time, it would be great if you could post a simplified, anonymized copy of the scripts here or in the monitoring codeshare. I'm sure others would find this useful.

     

     

    Thanks, Aaron
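
The scripts themselves were not posted in this thread. As a rough illustration of the hand-off ukstin describes (a long-running client writes the status to a file, and the external monitor only reads it), here is a minimal Python 3 sketch. The function names, the atomic-rename write, and the `max_age` staleness check are assumptions added for the example and are not taken from the original scripts.

```python
import os
import tempfile
import time


def write_status(path, status):
    """Writer side (long-running client): replace the status file atomically."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(status + "\n")
    # Renaming within one directory is atomic on POSIX,
    # so a reader never sees a half-written file.
    os.replace(tmp, path)


def read_status(path, max_age=30.0):
    """Reader side (external monitor): True only for a recent "UP"."""
    try:
        age = time.time() - os.path.getmtime(path)
        with open(path) as handle:
            status = handle.readline().strip()
    except OSError:  # missing or unreadable file counts as down
        return False
    return status == "UP" and age <= max_age


def monitor_main(path):
    """Print "UP" only when healthy; print nothing otherwise."""
    if read_status(path):
        print("UP")
```

The staleness check is there so that a client that has died without updating the file does not leave the member marked up indefinitely; the 30-second default is arbitrary and would need to be matched to the client's probe interval.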