You Want Action on a Threshold Violation? Use iCall!

iCall has been around since the 11.4 release, yet there seems to be a prevailing gap in awareness of this amazing functionality in BIG-IP. A blog I wrote last year covers the overview of the iCall system, but in brief, it provides event-based automation. The events can be periodic (like cron functionality), perpetual (watching for something like a file to appear in a directory), or triggered by an alert (like a pool member failure).
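
As a quick orientation, each of those event types corresponds to a handler object in the configuration. The sketch below is illustrative rather than copy-paste config (the handler and script names are made up, and the exact attributes vary by version), but it shows the three flavors:

# periodic: run a script on a fixed interval, cron-style
sys icall handler periodic my_periodic_handler {
    interval 60
    script my_script
}

# perpetual: run a long-lived script that watches for a condition
sys icall handler perpetual my_perpetual_handler {
    script my_watcher_script
}

# triggered: run a script when a named iCall event is generated
sys icall handler triggered my_triggered_handler {
    script my_script
    subscriptions {
        my_subscription {
            event-name my_event
        }
    }
}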

Late last week I was at the mother ship (F5 Corporate in Seattle) and found this question in Q&A (paraphrased):

What is a good method for toggling interface 1.1 if the active pool members in a pool fall below 70%?

My mind went immediately to iCall, as this is a perfect use case. It binds an event (a pool's active members falling below a threshold) to a task (disabling an interface). I didn't have time to flesh out the solution last week, but I dropped some (errant) code in the thread to point the original poster (Lee) down the right path. Flash forward to this week: I was intrigued enough by the solution that I thought I'd take a crack at making it work.

Building Out the Solution

Given that Lee set a threshold of 70% of active pool members, I figured a test pool of four members would be a good candidate: failing one member leaves me just over the threshold at 75%, whereas failing a second takes me to 50%. I suppose a pool of three members would have worked equally well, but I like to verify that a single failure doesn't trigger the event accidentally. So I fired up my test BIG-IP device and a Linux VM with several interface aliases and built a pool with four members.

ltm pool pool4 {
    members {
        192.168.101.10:80 {
            address 192.168.101.10
            session monitor-enabled
            state up
        }
        192.168.101.20:80 {
            address 192.168.101.20
            session monitor-enabled
            state up
        }
        192.168.101.21:80 {
            address 192.168.101.21
            session monitor-enabled
            state up
        }
        192.168.101.22:80 {
            address 192.168.101.22
            session monitor-enabled
            state up
        }
    }
    monitor http
} 

Next, I needed to build the iCall script. An iCall script is just a tmsh script stored in a specific section of the configuration; it's Tcl, just like tmsh. But what does the script need to do? A few things:

  1. Define the pool of interest
  2. Set the total number of pool members
  3. Set the number of available members
  4. Do math
  5. Enable/Disable the interface based on the result of that math

Steps 1, 4, and 5 are pretty self-explanatory. In tmsh scripting, setting an interface (like most other tmsh-based commands) looks nearly identical to the shell command.

# tmsh shell
tmsh modify /net interface 1.1 disabled
# tmsh script
tmsh::modify /net interface 1.1 disabled

Where it gets tricky is figuring out how to get pool member data. This is where the tmsh::get_status and tmsh::get_field_value commands come into play. Everything is object-based in tmsh, and it can be a little overwhelming to figure out how to address the objects. If you run the commands below in a script, the resulting output (written to /var/tmp/scriptd.out) shows you the nomenclature of the addressable objects in that data.

set pn "/Common/pool4"
set pooldata [tmsh::get_status /ltm pool $pn detail]
puts $pooldata

#data set
ltm pool pool4 {
    active-member-cnt 4
    connq-all.age-edm 0
    connq-all.age-ema 0
    connq-all.age-head 0
    connq-all.age-max 0
    connq-all.depth 0
    connq-all.serviced 0
    connq.age-edm 0
    connq.age-ema 0
    connq.age-head 0
    connq.age-max 0
    connq.depth 0
    connq.serviced 0
    cur-sessions 0
    members {
        192.168.101.10:80 {
            addr 192.168.101.10
            connq.age-edm 0
            connq.age-ema 0
            connq.age-head 0
            connq.age-max 0
            connq.depth 0
            connq.serviced 0
            cur-sessions 0
            monitor-rule http (pool monitor)
            monitor-status up
            node-name 192.168.101.10
            nodes {
                192.168.101.10 {
                    addr 192.168.101.10
                    cur-sessions 0
                    monitor-rule none
                    monitor-status unchecked
...continued...

So I get to the pool member data by first getting the pool data. The data needed for pool member availability is the availability-state and the enabled-state from the pool member data (an incomplete view of the data is shown below, but the necessary information is there).

members 192.168.101.22:80 {
    addr 192.168.101.22
    monitor-rule http (pool monitor)
    monitor-status up
    node-name 192.168.101.22
    nodes {
        192.168.101.22 {
            addr 192.168.101.22
            cur-sessions 0
            monitor-rule none
            monitor-status unchecked
            name 192.168.101.22
            session-status enabled
            status.availability-state unknown
            status.enabled-state enabled
            status.status-reason
            tot-requests 0
        }
    }
    pool-name pool4
    port 80
    session-status enabled
    status.availability-state available
    status.enabled-state enabled
    status.status-reason Pool member is available
}

Now that the data set is known, the script can be completed. To pull the particular state information shown above, I just pass those attribute names (status.availability-state and status.enabled-state) to the tmsh::get_field_value commands in the script below. The math part is simple, though to force floating-point division, a .0 is appended to the $usable count variable in the expression. Logging statements and puts commands (which send data to /var/tmp/scriptd.out for debugging) are included in the script for demonstration purposes.

sys icall script poolCheck.v1.0.0 {
    app-service none
    definition {
        set pn "/Common/pool4"
        set total 0
        set usable 0
        # walk the pool status objects, then each member within
        foreach obj [tmsh::get_status /ltm pool $pn detail] {
            puts $obj
            foreach member [tmsh::get_field_value $obj members] {
                puts $member
                incr total
                # a member counts as usable only if it is both available and enabled
                if { [tmsh::get_field_value $member status.availability-state] == "available" && \
                     [tmsh::get_field_value $member status.enabled-state] == "enabled" } {
                         incr usable
                }
            }
        }
        # the .0 appended to $usable forces floating-point division
        if { [expr $usable.0 / $total] < 0.7 } {
            tmsh::log "Not enough pool members in pool $pn, interface 1.3 disabled"
            tmsh::modify /net interface 1.3 disabled
        } else {
            tmsh::log "Enough pool members in pool $pn, interface 1.3 enabled"
            tmsh::modify /net interface 1.3 enabled
        }
    }
    description none
    events none
}
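
A quick aside on that division trick: it is plain Tcl behavior, nothing BIG-IP specific, and is easy to verify in any tclsh. The expression has to be unbraced for the $usable.0 substitution to work; Tcl's double() function is the brace-friendly alternative.

set usable 3
set total 4
puts [expr {$usable / $total}]          ;# 0: both operands are integers, so division truncates
puts [expr $usable.0 / $total]          ;# 0.75: "$usable.0" substitutes to "3.0", forcing a float
puts [expr {double($usable) / $total}]  ;# 0.75: the idiomatic braced form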

Now that the script is complete, I just need to create the handler. A triggered handler could be created to run the script every time a pool member alert fires (as configured in /config/user_alert.conf), but for demo purposes I used a periodic handler with a 60-second interval. A sketch of the triggered approach follows the periodic handler below.

sys icall handler periodic poolCheck.v1.0.0 {
    first-occurrence 2014-09-16:11:00:00
    interval 60
    script poolCheck.v1.0.0
}
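
For reference, the triggered variant mentioned above would be wired up in two places. Treat the following as an untested sketch (the alert name, event name, and match string are hypothetical), but the pattern, an alert in /config/user_alert.conf that generates an iCall event plus a triggered handler subscribed to that event name, follows the documented iCall model:

# /config/user_alert.conf (sketch): generate an iCall event on a member-down log message
alert poolMemberDown "Pool /Common/pool4 member .* monitor status down" {
    exec command="tmsh generate sys icall event poolMemberDown"
}

# matching triggered handler (sketch): runs the same script on each event
sys icall handler triggered poolCheck.triggered {
    script poolCheck.v1.0.0
    subscriptions {
        memberDown {
            event-name poolMemberDown
        }
    }
}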

Configuration complete, moving on to test!
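
Before pulling members down, it's worth confirming the pieces are in place and the handler is firing. Assuming shell access, listing the objects and tailing the two files the script writes to is enough:

tmsh list sys icall script poolCheck.v1.0.0
tmsh list sys icall handler periodic poolCheck.v1.0.0
tail -f /var/tmp/scriptd.out   # puts output from the script lands here
tail -f /var/log/ltm           # tmsh::log messages land here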

Testing the Solution

To test, I activated the VM instance in my lab and validated that my BIG-IP interfaces and pool members were up. Then I shut down one Apache virtual ahead of the first period at 11:26; since I still had 75% availability, the interface remained enabled. Next, I shut down the second Apache virtual, dropping availability to 50%. At 11:27, the BIG-IP interface was deactivated. Finally, I re-enabled the Apache virtuals, and at the next period the BIG-IP interface was reactivated. Log files and a ping test to that interface are shown below.

# Log Files
Sep 16 11:25:43 Pool /Common/pool4 member /Common/192.168.101.21:80 monitor status down.
Sep 16 11:26:00 Enough pool members in pool /Common/pool4, interface 1.3 enabled
Sep 16 11:26:26 Pool /Common/pool4 member /Common/192.168.101.22:80 monitor status down.
Sep 16 11:27:00 Not enough pool members in pool /Common/pool4, interface 1.3 disabled
Sep 16 11:27:32 Pool /Common/pool4 member /Common/192.168.101.21:80 monitor status up.
Sep 16 11:27:36 Pool /Common/pool4 member /Common/192.168.101.22:80 monitor status up.
Sep 16 11:28:01 Enough pool members in pool /Common/pool4, interface 1.3 enabled

# Ping Test to Interface 1.3
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Request timed out.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.5: bytes=32 time=1000ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255

One note on this solution: don't rely on the GUI or CLI status of the interface (tested on known versions in 11.5.x+). Bug 471860 catalogs the reporting issue on BIG-IP for the interface status. At boot time, if the interface is up it reports as ENABLED, but if you disable and then re-enable it, it reports as DISABLED even though it is up and passing traffic.
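
The status can still be pulled from tmsh (the command below is standard), just keep the caveat above in mind when reading it:

tmsh show net interface 1.3
# per Bug 471860, the status column here can read DISABLED
# after a disable/enable cycle even while traffic is passing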

Dig into iCall!

iCall (and tmsh more generally) is tremendously powerful; take a look at the other use cases already in the iCall codeshare! This solution has been added to the codeshare as well.

Published Sep 17, 2014
Version 1.0

3 Comments

  • Kohlaa:

    For those who stumbled here looking for which folder/file iCall scripts are stored in (since you can't browse to /Common): they live in /config/bigip_script.conf.