Updated 2010-11-25 23:14:03 by dkf

by Reinhard Max, August 2003

Starting with version 3.3, gcc supports profile-based optimizations, which are said to be especially beneficial for programs such as interpreters. In my tests, Tcl gained about 3 to 5% in speed when compiled with profile-based optimization, depending on the CPU architecture.

Compiling Tcl (or any other project) with GCC's profile-based optimization takes three steps:

  1. Use -fprofile-arcs as an additional compiler flag to compile the code with profiling support.
  2. Run the result on some “typical workload” to gather statistics on how often each branch is taken.
  3. Compile the code again using -fbranch-probabilities as an additional compiler flag, which uses the statistics written in step 2 to optimise the code better.

Normally one would pass these additional gcc flags to Tcl's configure script via the CFLAGS environment variable, but Tcl's build system still doesn't handle this properly (Jeff, can you hear me? ;) ). So for now the best practice is to hack the Makefile after running configure and add the options mentioned above to the CFLAGS_OPTIMIZE variable.
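The Makefile hack can be scripted instead of done by hand. The sketch below works on a two-line stand-in file rather than a real Tcl Makefile; the stand-in's contents (including the -O2 value) are an assumption, so check what the CFLAGS_OPTIMIZE line looks like in your generated Makefile before adapting it.

```shell
#!/bin/sh
set -e

# Stand-in for the Makefile that configure generates (contents are assumed).
demo_dir=$(mktemp -d)
demo_makefile="$demo_dir/Makefile"
cat > "$demo_makefile" <<'EOF'
CFLAGS_DEBUG = -g
CFLAGS_OPTIMIZE = -O2
EOF

# Append the profiling flag to the end of the CFLAGS_OPTIMIZE line only.
# For the second build, use -fbranch-probabilities here instead.
extra_flag="-fprofile-arcs"
sed "s/^\(CFLAGS_OPTIMIZE[[:space:]]*=.*\)\$/\1 $extra_flag/" \
    "$demo_makefile" > "$demo_makefile.new"
mv "$demo_makefile.new" "$demo_makefile"

cat "$demo_makefile"
```

Writing to a new file and renaming it avoids relying on sed's in-place option, which behaves differently between sed implementations.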

For more details about the -fprofile-arcs and -fbranch-probabilities flags, see the gcc manpage and/or other documentation that comes with gcc.