MS I ran the tclbench suite on tclsh compiled with three different compilers and several optimisation combinations. This page summarizes the results.
These tests were run on a PIII/600MHz/192MB laptop running Red Hat Linux 7.2.
The compilers were:
[L2 ] Notes on compiling tcl with icc
Results
 SPEED  SIZE  COMPILER
 1.00   1.00  gcc2.96 -O  -march=pentiumpro
 1.05   1.00  gcc2.96 -O  -march=pentiumpro -fomit-frame-pointer
 1.01   1.01  gcc2.96 -O2 -march=pentiumpro
 1.07   1.02  gcc2.96 -O2 -march=pentiumpro -fomit-frame-pointer
 1.01   0.97  gcc2.96 -Os -march=pentiumpro
 1.05   0.98  gcc2.96 -Os -march=pentiumpro -fomit-frame-pointer
 0.99   1.07  gcc3.1  -O  -march=pentium3
 1.03   1.08  gcc3.1  -O  -march=pentium3 -fomit-frame-pointer
 1.02   1.12  gcc3.1  -O2 -march=pentium3
 1.06   1.13  gcc3.1  -O2 -march=pentium3 -fomit-frame-pointer
 1.03   1.14  gcc3.1  -O3 -march=pentium3
 1.08   1.15  gcc3.1  -O3 -march=pentium3 -fomit-frame-pointer
 1.04   0.97  gcc3.1  -Os -march=pentium3
 1.06   0.97  gcc3.1  -Os -march=pentium3 -fomit-frame-pointer
 1.11   1.47  icc6.0  -O3 -xK -ip
Conclusions (?)
gcc. Whether it is worth the loss of a traceable core file is an open question - tcl shouldn't dump core
GNU compilers produce faster and smaller code with "-Os"
measured by tclbench), but a much larger image (in the only tested configuration).
Notes
HEAD
"gcc2.96 -O". This produced a 702kB tclsh which ran the tclbench suite in 00:04:35.
features ("-march" and "-x" flags).
which I suppose gives the best optimisation. I have not checked for the Intel equivalent to gcc's "-Os" flag.
non-debuggable - the stack trace in core files is not usable. This behaviour is also present (I think) in the optimised code produced by icc.
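The relative figures in the results table can be turned back into rough absolute numbers using the baseline quoted above (702kB, 00:04:35 for "gcc2.96 -O"). The following is only a sketch: it assumes that SPEED is baseline time divided by measured time (so higher is faster) and that SIZE is measured size divided by baseline size. The page does not state either definition explicitly, and the names `estimate`, `BASE_SIZE_KB` and `BASE_TIME_S` are made up for this illustration.

```python
# Baseline from the notes: "gcc2.96 -O" produced a 702kB tclsh
# that ran the tclbench suite in 00:04:35.
BASE_SIZE_KB = 702
BASE_TIME_S = 4 * 60 + 35  # 00:04:35 expressed in seconds


def estimate(speed, size):
    """Return (estimated run time in s, estimated image size in kB).

    Assumption (not confirmed by the page):
      SPEED = baseline time / measured time
      SIZE  = measured size / baseline size
    """
    return BASE_TIME_S / speed, BASE_SIZE_KB * size


# A few rows copied from the results table: (SPEED, SIZE).
rows = {
    "gcc2.96 -O2 -march=pentiumpro -fomit-frame-pointer": (1.07, 1.02),
    "gcc3.1 -Os -march=pentium3": (1.04, 0.97),
    "icc6.0 -O3 -xK -ip": (1.11, 1.47),
}

for name, (speed, size) in rows.items():
    secs, kb = estimate(speed, size)
    print(f"{name:52s} ~{secs:4.0f} s  ~{kb:5.0f} kB")
```

Under these assumptions the estimates are only as precise as the two-decimal ratios in the table, so they should be read as approximate.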