Updated 2015-06-04 11:00:21 by juef

AMG: Recent Tcl doesn't attempt to clean up all its memory on exit because that takes time and serves no purpose. However, this interferes with valgrind's leak detection. Setting the TCL_FINALIZE_ON_EXIT environment variable is supposed to restore full cleanup, though I couldn't get it to work for me. Compiling with -DPURIFY works much better, but it seems to have the side effect of making Tcl unload all dynamic libraries on exit, which prevents valgrind from showing the names of functions in loaded libraries.

If you're debugging a library loaded by Tcl and not Tcl itself, you can just use this suppression file:
   Tcl allocation 1
   match-leak-kinds: possible
   Tcl allocation 2
   match-leak-kinds: possible
   Tcl allocation 3
   match-leak-kinds: possible
   linker uninitialized branch at startup

That last entry isn't for Tcl, but it's useful to me on my Slackware64 system, so I'm leaving it in for now.

(author unknown, dated 2002-08-02) (MS pleads guilty)

The recent report on the valgrind memory debugger [1] prompted me to try it on the tcl sources - help out the poor Jeff, the only purify licensee among us.

This is what I did after downloading and installing valgrind on my linux/i386 laptop:

        proc exit args {}

  • ran the testsuite under valgrind:
        /opt/valgrind/bin/valgrind -v --leak-check=yes --num-callers=10 \
          --logfile-fd=9 --leak-resolution=high --show-reachable=no \
          ./tcltest ../tests/all.pfy 9>valgrind.out

The output is
 ==19434== valgrind-1.0.0, a memory error detector for x86 GNU/Linux.
 ==19434== Copyright (C) 2000-2002, and GNU GPL'd, by Julian Seward.
 ==19434== Startup, with flags:
 ==19434==    --suppressions=/opt/valgrind/lib/valgrind/default.supp
 ==19434==    -v
 ==19434==    --leak-check=yes
 ==19434==    --num-callers=10
 ==19434==    --logfile-fd=9
 ==19434==    --leak-resolution=high
 ==19434==    --show-reachable=no
 ==19434== Reading suppressions file: /opt/valgrind/lib/valgrind/default.supp
 ==19434== Reading syms from /CVS/tcl_SF_clean/unix/tcltest
 ==19434== Reading syms from /lib/ld-2.2.4.so
 ==19434== Reading syms from /opt/valgrind/lib/valgrind/valgrind.so
 ==19434== Reading syms from /lib/libdl-2.2.4.so
 ==19434== Reading syms from /lib/i686/libm-2.2.4.so
 ==19434== Reading syms from /lib/i686/libc-2.2.4.so
 ==19434== Estimated CPU clock rate is 598 MHz
 ==19434== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
 ==19434== malloc/free: in use at exit: 3930 bytes in 22 blocks.
 ==19434== malloc/free: 136637 allocs, 136615 frees, 56516658 bytes allocated.
 ==19434== searching for pointers to 22 not-freed blocks.
 ==19434== checked 4470668 bytes.
 ==19434== definitely lost: 544 bytes in 17 blocks.
 ==19434== possibly lost:   0 bytes in 0 blocks.
 ==19434== still reachable: 3386 bytes in 5 blocks.
 ==19434== 544 bytes in 17 blocks are definitely lost in loss record 4 of 6
 ==19434==    at 0x400467C4: malloc (vg_clientfuncs.c:100)
 ==19434==    by 0x80CF7AE: TclpAlloc (./../generic/tclAlloc.c:680)
 ==19434==    by 0x8066041: Tcl_Alloc (./../generic/tclCkalloc.c:1002)
 ==19434==    by 0x80BC2E9: NewVar (./../generic/tclVar.c:4266)
 ==19434==    by 0x80B9419: TclLookupArrayElement (./../generic/tclVar.c:929)
 ==19434==    by 0x80840B4: TclExecuteByteCode (./../generic/tclExecute.c:1775)
 ==19434==    by 0x8082CB9: TclCompEvalObj (./../generic/tclExecute.c:1007)
 ==19434==    by 0x80AF574: TclObjInterpProc (./../generic/tclProc.c:1074)
 ==19434==    by 0x8061BAD: TclEvalObjvInternal (./../generic/tclBasic.c:3033)
 ==19434==    by 0x808381C: TclExecuteByteCode (./../generic/tclExecute.c:1430)
 ==19434== LEAK SUMMARY:
 ==19434==    definitely lost: 544 bytes in 17 blocks.
 ==19434==    possibly lost:   0 bytes in 0 blocks.
 ==19434==    still reachable: 3386 bytes in 5 blocks.
 ==19434== Reachable blocks (those to which a pointer was found) are not shown.
 ==19434== To see them, rerun with: --show-reachable=yes
 --19434--       lru: 798 epochs, 0 clearings.
 --19434-- translate: new 14353 (218894 -> 3060822), discard 0 (0 -> 0).
 --19434--  dispatch: 39900000 basic blocks, 14078/524777 sched events, 248895 tt_fast misses.
 --19434-- reg-alloc: 5203 t-req-spill, 573378+33302 orig+spill uis, 77003 total-reg-r.
 --19434--    sanity: 1014 cheap, 41 expensive checks.


  • valgrind is amazingly fast: 'time' reports 1m18 user time for the testsuite, vs 0m45 without using it.
  • the option '--show-reachable=yes' produces also a stack trace of the creation of the reachable unfreed memory at exit. In our case, the 5 blocks totalling 3386 bytes are due to tcl not freeing encoding-related memory properly ( See Bug #543549 [2] )
  • this shows a real leak: 17 times in the testsuite, a new element is added to a local array and the corresponding variable is never freed. The hunt is on!

UPDATE: "if it is too good to be true ... it is probably not true"

When run as above, valgrind will not monitor children processes. But tests are normally run in separate processes, so that I was not monitoring the memory usage of the actual tests.

With the option --trace-children=yes, or running the tests in the same process, the slowdown is enormous - append.test takes 75x longer under valgrind.