Updated 2016-12-19 07:18:52 by wiwo

wiwo 2016-12-13

Why is Tcl 8.5 so much faster than 8.6 with this script?

github mandel.tcl

# Optimized Version by Samuel Zafrany
# Ported from C by Anders Bergh <[email protected]>

set BAILOUT 16
set MAX_ITERATIONS 1000 ;# missing from the paste; 1000 is an assumed value

# Returns 0 if the point never escapes (treated as inside the set),
# otherwise the iteration count at which it escaped.
proc mandelbrot {x y} {
    global BAILOUT
    global MAX_ITERATIONS

    set cr [expr {$y - 0.5}]
    set ci $x
    set zi 0.0
    set zr 0.0
    set i 0

    while {1} {
        incr i
        set temp [expr {$zr * $zi}]
        set zr2 [expr {$zr * $zr}]
        set zi2 [expr {$zi * $zi}]
        set zr [expr {$zr2 - $zi2 + $cr}]
        set zi [expr {$temp + $temp + $ci}]

        if {$zi2 + $zr2 > $BAILOUT} {
            return $i
        }

        if {$i > $MAX_ITERATIONS} {
            return 0
        }
    }
}

set begin [clock clicks -milliseconds]

# Draw the set as ASCII art, one character per sampled point
proc do {} {
    for {set y -39} {$y < 39} {incr y} {
        puts ""
        for {set x -39} {$x < 39} {incr x} {
            set i [mandelbrot [expr {$x / 40.0}] [expr {$y / 40.0}]]

            if {$i == 0} {
                puts -nonewline "*"
            } else {
                puts -nonewline " "
            }
            flush stdout
        }
    }
    puts ""
}

do ;# the call was missing from the paste; without it nothing is timed

set diff [expr {[clock clicks -milliseconds] - $begin}]
puts "Tcl Elapsed [expr {$diff / 1000.0}]"

8.5: Tcl Elapsed 1.08
8.6: Tcl Elapsed 1.463

This might be worth investigating.

bll 2016-12-13 Put the main loop into a procedure named main, comment out all the puts and flush calls, and use "puts [time main 10]" to time it:
bll-tecra:bll$ tclsh8.6 t.tcl
2379487.4 microseconds per iteration
bll-tecra:bll$ tclsh8.5 t.tcl
2028435.5 microseconds per iteration

So yes, it does seem a bit slower.
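
For anyone who wants to repeat this measurement, here is a minimal sketch of what such a t.tcl might look like. It is not bll's actual file. The value 1000 for MAX_ITERATIONS is an assumption (it is not in the pasted script), and the counter "inside" is added only so that the loop has a result to return once the output is removed.

```tcl
# Sketch only: MAX_ITERATIONS value and the "inside" counter are assumptions.
set BAILOUT 16
set MAX_ITERATIONS 1000

proc mandelbrot {x y} {
    global BAILOUT MAX_ITERATIONS
    set cr [expr {$y - 0.5}]
    set ci $x
    set zi 0.0
    set zr 0.0
    set i 0
    while {1} {
        incr i
        set temp [expr {$zr * $zi}]
        set zr2 [expr {$zr * $zr}]
        set zi2 [expr {$zi * $zi}]
        set zr [expr {$zr2 - $zi2 + $cr}]
        set zi [expr {$temp + $temp + $ci}]
        if {$zi2 + $zr2 > $BAILOUT} { return $i }
        if {$i > $MAX_ITERATIONS} { return 0 }
    }
}

# Same loops as "do", with every puts and flush removed.
# Returns the number of sampled points that never escaped.
proc main {} {
    set inside 0
    for {set y -39} {$y < 39} {incr y} {
        for {set x -39} {$x < 39} {incr x} {
            if {[mandelbrot [expr {$x / 40.0}] [expr {$y / 40.0}]] == 0} {
                incr inside
            }
        }
    }
    return $inside
}

# Average wall-clock time over 10 runs, reported in microseconds per iteration
puts [time main 10]
```

Run the same file with tclsh8.5 and tclsh8.6. Because nothing is written inside the loops, any remaining difference comes from the interpreter and not from the I/O layer.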

[wiwo] 2016-12-15 I pasted the wrong version of the script. This is now fixed. The timings are from the new and correct version above. Tcl 8.6 is consistently 30 to 35% slower than 8.5.

DKF 2016-12-18: I'm not certain, but I think this is the overhead due to the switch to the NRE execution engine. (Also make sure you've built both versions of Tcl yourself with the same configuration options if you want to be sure; I've had problems in the past with this on OSX where the system tclsh was built with some debugging options that had a substantial performance impact.)

Other possibilities include issues in the I/O layer; have you tried doing the comparisons with the puts and flush commands (in the loop in do) commented out?
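
On the point about build options: a quick way to see how a given tclsh was built is to query ::tcl::pkgconfig, which is available in both 8.5 and 8.6. The sketch below is only a convenience; the proc name buildFlags is hypothetical, and the choice of which keys to print is a guess at what matters for speed. Keys that a particular build does not report are skipped.

```tcl
# Hypothetical helper: collect build-configuration flags likely to affect speed.
proc buildFlags {} {
    set available [::tcl::pkgconfig list]
    set result {}
    foreach key {debug optimized threaded mem_debug compile_debug compile_stats 64bit} {
        # Only query keys this build actually reports
        if {[lsearch -exact $available $key] >= 0} {
            lappend result $key [::tcl::pkgconfig get $key]
        }
    }
    return $result
}

puts "patchlevel     [info patchlevel]"
foreach {key value} [buildFlags] {
    puts [format "%-14s %s" $key $value]
}
```

Run it under each tclsh being compared. If debug, mem_debug or compile_debug differ between the two interpreters, the timings are not comparing like with like.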

[wiwo] 2016-12-18: I optimized the compiler flags. Now I get 0.93s for Tcl 8.5 and 1.32s for Tcl 8.6 (best of 10). This is for a Xeon E5-1650v3. For an i7-2670QM I get 1.6s (8.5) and 1.5s (8.6). So with this CPU 8.6 is faster. I still don't get why 8.5 is so much faster on the Xeon, also compared to Perl and Python.
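
A "best of 10" figure can be taken along the following lines. This is a sketch, not the harness used for the numbers above: bestOf and work are hypothetical names, and work is only a stand-in to be replaced by a call to the benchmark procedure.

```tcl
# Hypothetical helper: run a script n times in the caller's scope and
# return the fastest wall-clock time in seconds.
proc bestOf {n script} {
    set best ""
    for {set k 0} {$k < $n} {incr k} {
        set t0 [clock microseconds]
        uplevel 1 $script
        set dt [expr {[clock microseconds] - $t0}]
        if {$best eq "" || $dt < $best} { set best $dt }
    }
    return [expr {$best / 1e6}]
}

# Stand-in workload; replace with the procedure being benchmarked
proc work {} {
    set s 0.0
    for {set i 0} {$i < 100000} {incr i} {
        set s [expr {$s + $i * 0.5}]
    }
    return $s
}

puts "best of 10: [bestOf 10 work] s"
```

Taking the minimum instead of the mean reduces the influence of other processes and of CPU frequency scaling, which may matter when comparing a desktop Xeon with a laptop i7.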

DKF: I'm suspicious about those i7-2670QM figures. I expect 8.6 to be slower because it has more memory allocation going on. I guess that there might have been some other confounding effect in that run.

(My quick tests indicate that I can save about 3–4% of the time by removing the flush, for no perceivable difference.)

[wiwo] I've removed the flush and the puts statements and tried yet another computer. Here I get slightly better values for 8.6 as well (1.21s vs 1.29s).