Advanced iRules: An Abstract View of iRules with the Tcl Bytecode Disassembler
In case you didn't already know, I'm a child of the 70's. As such, my formative years were in the 80's, where the music and movies were synthesized and super cheesy. Short Circuit was one of those cheesy movies, featuring Ally Sheedy, Steve Guttenberg, and Johnny Five, the tank-treaded laser-wielding robot with feelings and self awareness. Oh yeah! The plot...doesn't at all matter, but Johnny's big fear once reaching self actualization was being disassembled. Well, in this article, we won't dissamble Johnny Five, but we will take a look at disassembling some Tcl code and talk about optimizations. Tcl forms the foundation of several code environments on BIG-IP: iRules, iCall, tmsh, and iApps. The latter environments don't carry the burden of performance that iRules do, so efficiency isn't as big a concern. When we speak at conferences, we often commit some time to cover code optimization techniques due to the impactful nature of applying an iRule to live traffic. This isn't to say that the system isn't highly tuned and optimized already, it's just important not to introduce any more impact than is absolutely necessary to carry out purpose. In iRules, you can turn timing on to see the impact of an iRule, and in the Tcl shell (tclsh) you can use the time command. These are ultimately the best tools to see what the impact is going to be from a performance perspective. But if you want to see what the Tcl interpreter is actually doing from an instruction standpoint, well, you will need to disassemble the code. I've looked at bytecode in some of the python scripts I've written, but I wasn't aware of a way to do that in Tcl. I found a thread on stack that indicated it was possible, and after probing a little further was given a solution. This doesn't work in Tcl 8.4, which is what the BIG-IP uses, but it does work on 8.5+ so if you have a linux box with 8.5+ you're good to go. Note that there are variances from version to version that could absolutely change the way the interpreter works, so understand that this is an just an exercise in discovery. Solution 1 Fire up tclsh and then grab a piece of code. For simplicity, I'll use two forms of a simple math problem. The first is using the expr command to evaluate 3 times 4, and the second is the same math problem, but wraps the evaluation with curly brackets. The command that will show how the interpreter works its magic is tcl::unsupported::disassemble. ## ## unwrapped expression ## ## % tcl::unsupported::disassemble script { expr 3 * 4 } ByteCode 0x0x1e1ee20, refCt 1, epoch 16, interp 0x0x1d59670 (epoch 16) Source " expr 3 * 4 " Cmds 1, src 12, inst 14, litObjs 4, aux 0, stkDepth 5, code/src 0.00 Commands 1: 1: pc 0-12, src 1-11 Command 1: "expr 3 * 4 " (0) push1 0 # "3" (2) push1 1 # " " (4) push1 2 # "*" (6) push1 1 # " " (8) push1 3 # "4" (10) concat1 5 (12) exprStk (13) done ## ## wrapped expression ## ## % tcl::unsupported::disassemble script { expr { 3 * 4 } } ByteCode 0x0x1de7a40, refCt 1, epoch 16, interp 0x0x1d59670 (epoch 16) Source " expr { 3 * 4 } " Cmds 1, src 16, inst 3, litObjs 1, aux 0, stkDepth 1, code/src 0.00 Commands 1: 1: pc 0-1, src 1-15 Command 1: "expr { 3 * 4 } " (0) push1 0 # "12" (2) done Because the first expression is unwrapped, the interpreter has to build the expression and then call the runtime expression engine, resulting in 4 objects and a stack depth of 5. With the wrapped expression, the interpreter found a compile-time constant and used that directly, resulting in 1 object and a stack depth of 1 as well. Much thanks to Donal Fellows on Stack Overflow for the details. Using the time command in the shell, you can see that wrapping the expression results in a wildly more efficient experience. % time { expr 3 * 4 } 100000 1.02325 microseconds per iteration % time { expr {3*4} } 100000 0.07945 microseconds per iteration Solution 2 I was looking in earnest for some explanatory information for the bytecode fields displayed with tcl::unsupported::disassemble, and came across a couple pages on the Tcl wiki, one building on the other. Combining the pertinent sections of code from each page results in this script you can paste into tclsh: namespace eval tcl::unsupported {namespace export assemble} namespace import tcl::unsupported::assemble rename assemble asm interp alias {} disasm {} ::tcl::unsupported::disassemble proc aproc {name argl body args} { proc $name $argl $body set res [disasm proc $name] if {"-x" in $args} { set res [list proc $name $argl [list asm [dis2asm $res]]] eval $res } return $res } proc dis2asm body { set fstart " push -1; store @p; pop " set fstep " incrImm @p +1;load @l;load @p listIndex;store @i;pop load @l;listLength;lt " set res "" set wait "" set jumptargets {} set lines [split $body \n] foreach line $lines { ;#-- pass 1: collect jump targets if [regexp {\# pc (\d+)} $line -> pc] {lappend jumptargets $pc} } set lineno 0 foreach line $lines { ;#-- pass 2: do the rest incr lineno set line [string trim $line] if {$line eq ""} continue set code "" if {[regexp {slot (\d+), (.+)} $line -> number descr]} { set slot($number) $descr } elseif {[regexp {data=.+loop=%v(\d+)} $line -> ptr]} { #got ptr, carry on } elseif {[regexp {it%v(\d+).+\[%v(\d+)\]} $line -> copy number]} { set loopvar [lindex $slot($number) end] if {$wait ne ""} { set map [list @p $ptr @i $loopvar @l $copy] set code [string map $map $fstart] append res "\n $code ;# $wait" set wait "" } } elseif {[regexp {^ *\((\d+)\) (.+)} $line -> pc instr]} { if {$pc in $jumptargets} {append res "\n label L$pc;"} if {[regexp {(.+)#(.+)} $instr -> instr comment]} { set arg [list [lindex $comment end]] if [string match jump* $instr] {set arg L$arg} } else {set arg ""} set instr0 [normalize [lindex $instr 0]] switch -- $instr0 { concat - invokeStk {set arg [lindex $instr end]} incrImm {set arg [list $arg [lindex $instr end]]} } set code "$instr0 $arg" switch -- $instr0 { done { if {$lineno < [llength $lines]-2} { set code "jump Done" } else {set code ""} } startCommand {set code ""} foreach_start {set wait $line; continue} foreach_step {set code [string map $map $fstep]} } append res "\n [format %-24s $code] ;# $line" } } append res "\n label Done;\n" return $res } proc normalize instr { regsub {\d+$} $instr "" instr ;# strip off trailing length indicator set instr [string map { loadScalar load nop "" storeScalar store incrScalar1Imm incrImm } $instr] return $instr } Now that the script source is in place, you can test the two expressions we tested in solution 1. The output is very similar, however, there is less diagnostic information to go with the bytecode instructions. Still, the instructions are consistent between the two solutions. The difference here is that after "building" the proc, you can execute it, shown below each aproc expression. % aproc f x { expr 3 * 4 } -x proc f x {asm { push 3 ;# (0) push1 0 # "3" push { } ;# (2) push1 1 # " " push * ;# (4) push1 2 # "*" push { } ;# (6) push1 1 # " " push 4 ;# (8) push1 3 # "4" concat 5 ;# (10) concat1 5 exprStk ;# (12) exprStk ;# (13) done label Done; }} % f x 12 % aproc f x { expr { 3 * 4 } } -x proc f x {asm { push 12 ;# (0) push1 0 # "12" ;# (2) done label Done; }} % f x 12 Deeper Down the Rabbit Hole Will the internet explode if I switch metaphors from bad 80's movie to literary classic? I guess we'll find out. Simple comparisons are interesting, but now that we're peeling back the layers, let's look at something a little more complicated like a for loop and a list append. % tcl::unsupported::disassemble script { for { $x } { $x < 50 } { incr x } { lappend mylist $x } } ByteCode 0x0x2479d30, refCt 1, epoch 16, interp 0x0x23ef670 (epoch 16) Source " for { $x } { $x < 50 } { incr x } { lappend mylist $x " Cmds 4, src 57, inst 43, litObjs 5, aux 0, stkDepth 3, code/src 0.00 Exception ranges 2, depth 1: 0: level 0, loop, pc 8-16, continue 18, break 40 1: level 0, loop, pc 18-30, continue -1, break 40 Commands 4: 1: pc 0-41, src 1-56 2: pc 0-4, src 7-9 3: pc 8-16, src 37-54 4: pc 18-30, src 26-32 Command 1: "for { $x } { $x < 50 } { incr x } { lappend mylist $x }" Command 2: "$x " (0) push1 0 # "x" (2) loadStk (3) invokeStk1 1 (5) pop (6) jump1 +26 # pc 32 Command 3: "lappend mylist $x " (8) push1 1 # "lappend" (10) push1 2 # "mylist" (12) push1 0 # "x" (14) loadStk (15) invokeStk1 3 (17) pop Command 4: "incr x " (18) startCommand +13 1 # next cmd at pc 31 (27) push1 0 # "x" (29) incrStkImm +1 (31) pop (32) push1 0 # "x" (34) loadStk (35) push1 3 # "50" (37) lt (38) jumpTrue1 -30 # pc 8 (40) push1 4 # "" (42) done You'll notice that there are four commands in this code. The for loop, the x variable evaluation, the lappend operations, and the loop control with the incr command. There are a lot more instructions in this code, with jump pointers from the x interpretation to the incr statement, the less than comparison, then a jump to the list append. Wrapping Up I went through an exercise years ago to see how far I could minimize the Solaris kernel before it stopped working. I personally got down into the twenties before the system was unusable, but I think the record was somewhere south of 15. So...what's the point? Minimal for minimal's sake is not the point. Meet the functional objectives, that is job one. But then start tuning. Less is more. Less objects. Less stack depth. Less instantiation. Reviewing bytecode is good for that, and is possible with the native Tcl code. However, it is still important to test the code performance, as relying on bytecode objects and stack depth alone is not a good idea. For example, if we look at the bytecode differences with matching an IP address, there is no discernable difference from Tcl's perspective between the two regexp versions, and very little difference between the two regexp versions and the scan example. % dis script { regexp {([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})} 192.168.101.20 _ a b c d } ByteCode 0x0x24cfd30, refCt 1, epoch 15, interp 0x0x2446670 (epoch 15) Source " regexp {([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-" Cmds 1, src 90, inst 19, litObjs 8, aux 0, stkDepth 8, code/src 0.00 Commands 1: 1: pc 0-17, src 1-89 Command 1: "regexp {([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9" (0) push1 0 # "regexp" (2) push1 1 # "([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})" (4) push1 2 # "192.168.101.20" (6) push1 3 # "_" (8) push1 4 # "a" (10) push1 5 # "b" (12) push1 6 # "c" (14) push1 7 # "d" (16) invokeStk1 8 (18) done % dis script { regexp {^(\d+)\.(\d+)\.(\d+)\.(\d+)$} 192.168.101.20 _ a b c d } ByteCode 0x0x24d1730, refCt 1, epoch 15, interp 0x0x2446670 (epoch 15) Source " regexp {^(\d+)\.(\d+)\.(\d+)\.(\d+)$} 192.168.101.20 _" Cmds 1, src 64, inst 19, litObjs 8, aux 0, stkDepth 8, code/src 0.00 Commands 1: 1: pc 0-17, src 1-63 Command 1: "regexp {^(\d+)\.(\d+)\.(\d+)\.(\d+)$} 192.168.101.20 _ " (0) push1 0 # "regexp" (2) push1 1 # "^(\d+)\.(\d+)\.(\d+)\.(\d+)$" (4) push1 2 # "192.168.101.20" (6) push1 3 # "_" (8) push1 4 # "a" (10) push1 5 # "b" (12) push1 6 # "c" (14) push1 7 # "d" (16) invokeStk1 8 (18) done % dis script { scan 192.168.101.20 %d.%d.%d.%d a b c d } ByteCode 0x0x24d1930, refCt 1, epoch 15, interp 0x0x2446670 (epoch 15) Source " scan 192.168.101.20 %d.%d.%d.%d a b c d " Cmds 1, src 41, inst 17, litObjs 7, aux 0, stkDepth 7, code/src 0.00 Commands 1: 1: pc 0-15, src 1-40 Command 1: "scan 192.168.101.20 %d.%d.%d.%d a b c d " (0) push1 0 # "scan" (2) push1 1 # "192.168.101.20" (4) push1 2 # "%d.%d.%d.%d" (6) push1 3 # "a" (8) push1 4 # "b" (10) push1 5 # "c" (12) push1 6 # "d" (14) invokeStk1 7 (16) done However, if you look at the time results from these examples, they are very different. % time { regexp {([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})} 192.168.101.20 matched a b c d } 100000 11.29749 microseconds per iteration % time { regexp {^(\d+)\.(\d+)\.(\d+)\.(\d+)$} 192.168.101.20 _ a b c d } 100000 7.78696 microseconds per iteration % time { scan 192.168.101.20 %d.%d.%d.%d a b c d } 100000 1.03708 microseconds per iteration Why is that? Well, bytecode is a good indicator, but it doesn't address the inherent speed of the commands being invoked. Regex is a very slow operation comparatively. And within the regex engine, the second example is a simpler regex to evaluate, so it's faster (though less accurate, so make sure you are actually passing an IP address.) Then of course, scan shows off its optimized self in grand fashion. Was this a useful exercise in understanding Tcl under the hood? Drop some feedback in the comments if you'd like more tech tips like this that aren't directly covering a product feature or solution, but reveal some utility that assists in learning how things tick.1.3KViews1like3CommentsIs it possible to use the following Irule syntac with TCL in a policy ?
Hello, I've setup some code in an Irule . This concerns a code that will take the URI, within this uri, search for first directory in the path and put it tolower before sending it to the server. set uri [HTTP::uri] set block [lindex [split $uri /] 1] if { $block ne [string tolower $block]} { set block2 [string tolower $block] HTTP::uri [string map [list $block $block2] $uri] #log local0. "Rewrited part of the URI : $block2" #log local0. "URI Send to Back-end application : https://[HTTP::host][HTTP::uri]" } Is it possible to put this code in a TCL within a policy rule ? I need to replace the first directory of the URI (ie: "/APPLICATION/dir1/DIR2/index.html") to lowercase /application/dir1/DIR2/index.html ( the rest of the URI must stay intact, only /Application/ part must be set to lowercase. Thanks in advance. Regards Frédéric529Views1like2Commentsstring trim generating strange results
Hi, I have an issue with one of my string manipulations with variables: set test [ACCESS::session data get session.http.last.response_cookie] set test [lindex [split $test "|"] 0] set test [string trim $test "Test="] So, my cookie is built up like this: example: Test=ZTR(abc1234-1);ABC(qwe12314);|version=1;... example: Test=OEZZV(abc1234);ABC(qwe12314);|version=1... In the first example, my $test ends up being correctly shown as "ZTR(abc1234-1);ABC(qwe12314);". In the second one, $test ends up being "ZZV(abc1234);ABC(qwe12314);". So the "OE" is stripped away, although it shouldn't be. I've read about some odd results with trim when using numbers and I assume that this occurs because the OE (letter O, not number 0) represents some kind of value that is handled by the F5 differently. But can anyone tell why and how to mitigate this? Is it possible to handle the variable as a string variable or similar, so that these changes of the values are not done? Thanks in advance!264Views0likes1CommentAdvanced Resource Assign Expression Failing
I have a logging agent right before the Advanced Resource Assign with session.custom.* The variable i am testing against is shown. I cannot get the right information to display. I tried ends_with "201" and tried contains "201" but the second one will NOT load and I have no idea why. I do not see those resources on the webtop. Common:bcb3012f: Logging Agent Common:bcb3012f: session.custom.rdp.destination is 172.30.7.119 Common:bcb3012f: session.custom.rdp.domain is WORKGROUP Common:bcb3012f: session.custom.rdp.password is ********** Common:bcb3012f: session.custom.rdp.username is a Common:bcb3012f: session.custom.version is 801-201585Views0likes2CommentsBIG-IP : tcl boolean logic for string comparisons
F5 BIG-IP Virtual Edition v11.4.1 (Build 635.0) LTM on ESXi What is correct syntax for boolean operations around string comparisons ? set a "a" set b "b" set result 0 if { { $a eq "a" } && { $b eq "b" } } { set result 1 } This compiles but throws a runtime exception : expected boolean value but got " $a eq "a" " while executing "if { { $a eq "a" } && { $b eq "b" } ...340Views0likes1CommentBIG-IP : TCL to match member of set
F5 BIG-IP Virtual Edition v11.4.1 (Build 635.0) LTM on ESXi There must be a better way : if { ( $segments_count == 3 ) || ( $segments_count == 4 ) || ( $segments_count == 5 ) } { if { [expr {$name_first eq "John"}] || [expr {$name_first eq "Greg"}] || [expr {$name_first eq "Brian"}] } { NOTE: I don't want to use class matches ( as these require external data-group-files )235Views0likes2CommentsBIG-IP irule : determine page-type
F5 BIG-IP Virtual Edition v11.4.1 (Build 635.0) LTM on ESXi Request urls can be in following forms : www.gofish.com www.gofish.com/barracuda www.gofish.com/sharks/blacknose www.gofish.com/species.html www.gofish.com/bait.aspx www.gofish.com/bait.aspx?color=green I need to determine if a request is for a page ( *.html , *.aspx ) vs a resource ( barracuda , /sharks/blacknose ). More specifically, I need to determine if request is for an .aspx page.192Views0likes1CommentLoading DLLs within iRules?
Hello, Is it possible to load DLLs/SOs within I rules using the TCL load command? Something like when HTTP_REQUEST { load [file join [pwd] mylib.DLL] ... call some function from mylib... } If it's possible, is there anything special that needs to be done about the functions in mylib.DLL? Would iRules be able to infer the different return types? If it's not possible to load third-party DLLs, then is there some mechanism to access custom logic outside of the iRules?Solved450Views0likes1CommentAPM :: Detecting IE 11
Ok I suck, I know. But why is this wrong? expr { [string tolower [mcget {session.user.agent}]] matches_regex {trident\/7.+rv\:11} } It never matches... even though my UA is: Mozilla/5.0 (Windows NT 10.0; Win64; x64; Trident/7.0; rv:11.0) like Gecko I'm using an empty branch rule with this as the 'advanced' script above. Always hits fallback (i.e. doesn't match).403Views1like3CommentsLocal Traffic Policy for creating Logging Profile
Hi All, I am working on creating a logging profile for HTTP/S virtual server for which I need help in tcl format for below logging parameters like tcl:[HTTP::host]: Virtual server name BIGIP HOSTNAME DATE and TIME CLIENT PORT POOL NAME SERVER IP SERVER PORT SNAT PORT HTTP STATUS CODE Thanks Ashish Solanki509Views0likes3Comments