[Richard Suchenwirth] 2013-11-30 - Another chapter in the [dis2asm] saga: The Tcl compiler converts [proc] bodies into [bytecode]. With tcl::unsupported::'''disassemble''' we can inspect the generated code in [assembler] notation "dis" . With tcl::unsupported::'''assemble''' we can convert a (somewhat different) assembler notation "[TAL]" to bytecode again. The job of [dis2asm] is to convert a string in "dis" to another string in TAL so the original proc works equally well. Looking at the [dis2asm] output, we sometimes notice that it produces code that could be better - in other words, "optimized": doing the same job in less bytecodes, and hence (at least marginally) less time. Examples: push {} pop is an absolutely redundant piece of code: first something is pushed on the stack, and then immediately popped off again. No effect, but 3 bytes in bytecode, and just a little more time needed to run. Or jump Lxx label Lyy; jump Lzz is also redundant: because the code above jumps away, Lyy can only be reached by code that explicitly jumps there - only to be redirected to Lzz again. It might as well have directly jumped to Lzz. The code shown on this page deals with such issues. It is a postprocessor which converts [dis2asm] output to another string in the same TAL language, but optimized where possible. To test it, I have extended the ''aproc'' wrapper to accept a -o flag and if present, runs the optimizer on the TAL output: ====== proc aproc {name argl body args} { proc $name $argl $body set res [disasm proc $name] if {"-x" in $args} { set res [list proc $name $argl [list asm [dis2asm $res]]] if {"-o" in $args} {set res [optimize $res]} eval $res } return $res } ====== The '''optimize''' proc is a little longer than that, even though it currently just handles the two cases discussed above. It splits the TAL input in a list of lines, so it can also operate on other than the current line. Lines considered redundant are first marked with the prefix "#o", and removed after one pass, so that indexes don't get confused. ====== proc optimize tal { set last "" set last2 "" set lines [split $tal \n] for {set i 0} {$i < [llength $lines]} {incr i} { set instr [regexp -inline {[A-Za-z0-9_]+} [lindex $lines $i]] if {"$last $instr" eq "push pop"} { lset lines $i-1 #o[lindex $lines $i-1] ;# mark for deletion lset lines $i #o[lindex $lines $i] } set last $instr } while 1 { set tmp {} ;# remove marked lines foreach line $lines {if ![string match #o* $line] {lappend tmp $line}} set lines $tmp set found 0 for {set i 0} {$i < [llength $lines]} {incr i} { set instr [regexp -inline {[A-Za-z0-9_]+} [lindex $lines $i]] if {"$last2 $last $instr" eq "jump label jump"} { set oldTrg [string trimright [lindex $lines $i-1 1] ";"] set newTrg [lindex $lines $i 1] lset lines $i-1 #o[lindex $lines $i-1] ;# mark for deletion lset lines $i #o[lindex $lines $i] set found 1 break } set last2 $last set last $instr } if $found { set tmp {} ;# remove marked lines foreach line $lines { if [regexp "jump.* $oldTrg " $line] { set line [string map [list " $oldTrg " " $newTrg "] $line] } if ![string match #o* $line] {lappend tmp $line} } set lines $tmp } else break } return [join $lines \n] } ====== Testing: first unoptimized TAL... ====== % aproc f x {foreach i {a b} {foreach j $x {puts $i,$j}}} -x proc f x {asm { push {a b} ;# (0) push1 0 # "a b" store 1 ;# (2) storeScalar1 %v1 # temp var 1 pop ;# (4) pop push -1; store 2; pop ;# (5) foreach_start4 0 label L10; incrImm 2 +1;load 1;load 2 listIndex;store i;pop load 1;listLength;lt ;# (10) foreach_step4 0 jumpFalse L63 ;# (15) jumpFalse1 +48 # pc 63 ;# (17) startCommand +43 1 # next cmd at pc 60 load x ;# (26) loadScalar1 %v0 # var "x" store 4 ;# (28) storeScalar1 %v4 # temp var 4 pop ;# (30) pop push -1; store 5; pop ;# (31) foreach_start4 1 label L36; incrImm 5 +1;load 4;load 5 listIndex;store j;pop load 4;listLength;lt ;# (36) foreach_step4 1 jumpFalse L58 ;# (41) jumpFalse1 +17 # pc 58 push puts ;# (43) push1 1 # "puts" load i ;# (45) loadScalar1 %v3 # var "i" push , ;# (47) push1 2 # "," load j ;# (49) loadScalar1 %v6 # var "j" concat 3 ;# (51) concat1 3 invokeStk 2 ;# (53) invokeStk1 2 pop ;# (55) pop jump L36 ;# (56) jump1 -20 # pc 36 label L58; push {} ;# (58) push1 3 # "" pop ;# (60) pop jump L10 ;# (61) jump1 -51 # pc 10 label L63; push {} ;# (63) push1 3 # "" ;# (65) done label Done; }} ====== and now, with the -o switch added, the optimized version: ====== % aproc f x {foreach i {a b} {foreach j $x {puts $i,$j}}} -x -o proc f x {asm { push {a b} ;# (0) push1 0 # "a b" store 1 ;# (2) storeScalar1 %v1 # temp var 1 pop ;# (4) pop push -1; store 2; pop ;# (5) foreach_start4 0 label L10; incrImm 2 +1;load 1;load 2 listIndex;store i;pop load 1;listLength;lt ;# (10) foreach_step4 0 jumpFalse L63 ;# (15) jumpFalse1 +48 # pc 63 ;# (17) startCommand +43 1 # next cmd at pc 60 load x ;# (26) loadScalar1 %v0 # var "x" store 4 ;# (28) storeScalar1 %v4 # temp var 4 pop ;# (30) pop push -1; store 5; pop ;# (31) foreach_start4 1 label L36; incrImm 5 +1;load 4;load 5 listIndex;store j;pop load 4;listLength;lt ;# (36) foreach_step4 1 jumpFalse L10 ;# (41) jumpFalse1 +17 # pc 58 push puts ;# (43) push1 1 # "puts" load i ;# (45) loadScalar1 %v3 # var "i" push , ;# (47) push1 2 # "," load j ;# (49) loadScalar1 %v6 # var "j" concat 3 ;# (51) concat1 3 invokeStk 2 ;# (53) invokeStk1 2 pop ;# (55) pop jump L36 ;# (56) jump1 -20 # pc 36 label L63; push {} ;# (63) push1 3 # "" ;# (65) done label Done; }} ====== Lines 58 to 61 are gone, line 41 now jumps directly to L10... but does it still work as before? ====== % f {0 1 2} a,0 a,1 a,2 b,0 b,1 b,2 ====== <>Example