This page collects tips and tricks that are useful when hunting memory leaks at the C level, in the Tcl core or in extensions. To track script-level leaks (like lingering keys in global arrays, objects, channels, etc.), see the sibling page "Leak Hunt (Tcl level)".
Valgrind is a very powerful tool. Its oldest component, "memcheck", is key to hunting leaks, buffer overflows, uninitialized values, etc. One nice thing is that it works on unmodified executables (no need to recompile in a dedicated mode). The basic incantation is:
valgrind executable arguments...
It will produce a summary classifying the dynamically allocated memory that is still in use when the process exits: definitely lost, indirectly lost, possibly lost, or still reachable. Additional flags can give further details; see valgrind(1).
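To get a feel for those categories, here is a minimal sketch of a toy program (not Tcl code; the file name leakdemo.c and every identifier in it are made up for illustration) that leaks in three different ways. Built without optimization, it should typically show one block as definitely lost, one as indirectly lost, and one as still reachable, though the exact report may vary with compiler and valgrind version.

```c
/* leakdemo.c: toy program, all names hypothetical.
 * Build:  cc -g -O0 -DDEMO_MAIN leakdemo.c -o leakdemo
 * Run:    valgrind --leak-check=yes ./leakdemo
 */
#include <assert.h>
#include <stdlib.h>

struct node {
    struct node *next;
    char payload[32];
};

/* Whatever this global points to at exit counts as "still reachable". */
static struct node *keeper = NULL;

/* Allocate a two-node chain and return its head. */
static struct node *make_chain(void)
{
    struct node *head = calloc(1, sizeof *head);
    struct node *tail = calloc(1, sizeof *tail);
    assert(head != NULL && tail != NULL);
    head->next = tail;
    return head;
}

/* Drop the only pointer to the chain.
 * The head becomes "definitely lost"; the tail, reachable only
 * through the lost head, becomes "indirectly lost". */
static void leak_chain(void)
{
    struct node *head = make_chain();
    head = NULL;
    (void)head;
}

/* Park a block in the global and never free it. */
static void park_block(void)
{
    keeper = calloc(1, sizeof *keeper);
    assert(keeper != NULL);
}

#ifdef DEMO_MAIN
int main(void)
{
    leak_chain();
    park_block();
    return 0;   /* exit without freeing anything */
}
#endif
```

The DEMO_MAIN guard is only there so the functions can also be compiled into another test harness; it is not something valgrind requires.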
In the context of Tcl, a few things are worth noticing to get an efficient leak hunt:
Taking these facts into account, the typical setup becomes:
env CFLAGS=-DPURIFY ./configure --enable-symbols && make clean && make
valgrind --leak-check=yes ./tclsh somescript.tcl args...
Once a leak is spotted, and valgrind has given you a detailed C-level stack trace of the point of allocation... of course it's time to switch to gdb. Different kinds of allocations call for different techniques, but a rather frequent class is that of refcounted things (like Tcl_Objs). Here gdb's watchpoint tool is very useful:
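As a minimal sketch of the watchpoint technique, the toy program below uses a stand-in struct with a refCount field, loosely mirroring the field of the same name in the real Tcl_Obj. The Obj type, the obj_* helpers and leaky() are hypothetical and only serve to give gdb something to watch; in the core you would watch the refCount of the actual leaked object instead. The gdb commands are given in the leading comment.

```c
/* refwatch.c: toy stand-in for a refcounted object, NOT the real Tcl_Obj.
 * Build:  cc -g -O0 -DDEMO_MAIN refwatch.c -o refwatch
 * Then, in gdb ./refwatch:
 *   (gdb) break leaky
 *   (gdb) run
 *   (gdb) next                    # executes  o = obj_new()
 *   (gdb) watch -l o->refCount    # watch the memory location itself
 *   (gdb) continue                # stops on every change of the count
 *   (gdb) bt                      # shows who incremented or decremented
 */
#include <assert.h>
#include <stdlib.h>

typedef struct Obj {
    int refCount;
} Obj;

/* Allocate an object with a reference count of zero. */
static Obj *obj_new(void)
{
    Obj *o = calloc(1, sizeof *o);
    assert(o != NULL);
    return o;
}

/* Take one reference. */
static void obj_incr(Obj *o)
{
    o->refCount++;
}

/* Drop one reference; free the object and return 1 when none are left. */
static int obj_decr(Obj *o)
{
    if (--o->refCount <= 0) {
        free(o);
        return 1;
    }
    return 0;
}

/* Buggy on purpose: two references are taken, only one is released.
 * Returns the object so the leftover count can be inspected. */
static Obj *leaky(void)
{
    Obj *o = obj_new();
    obj_incr(o);    /* first owner */
    obj_incr(o);    /* second owner */
    obj_decr(o);    /* first owner releases */
    /* the second owner forgets its obj_decr(): this is the leak */
    return o;
}

#ifdef DEMO_MAIN
int main(void)
{
    leaky();
    return 0;
}
#endif
```

Each stop of the watchpoint comes with the old and new value of the count, so pairing up increments and decrements by their backtraces points at the owner that never lets go.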
One Thing That Should Never Occur in Tcl is a cycle in the graph of references among Tcl_Objs. Indeed, since we depend on refcounting for memory management, a cycle is an absolute show stopper. Fortunately, by design, the language is strictly unable to produce such "reference loops": the copy-on-write principle prevents all in-place operations which would "close the loop". Any exception to this rule is a bug, not in the script, but in the core.
But here we're discussing core debugging, right? So these things may happen. It happened to me in Bug 3386417, where the newly introduced info errorstack (TIP #348) was somehow plugged back into a compiled scriptlet that it referred to. The interesting generalization is as follows:
If:
Then it is very likely that you have a Reference Loop. Of course you're on your own to actually track it down, but merely knowing that such loops can exist may save you hours (it would have, in my case :/ ).
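To see why a reference loop defeats refcounting, here is a minimal sketch with a toy refcounted node. This is not how Tcl_Obj is implemented; Node, node_new, node_link, node_decr and live_after_release are all hypothetical names. Every individual owner releases its reference correctly, yet with the loop closed both nodes stay allocated.

```c
/* refloop.c: toy refcounted nodes, all names hypothetical. */
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int refCount;
    struct Node *ref;   /* the one node this node refers to, or NULL */
} Node;

static int live = 0;    /* number of nodes currently allocated */

static Node *node_new(void)
{
    Node *n = calloc(1, sizeof *n);
    assert(n != NULL);
    live++;
    return n;
}

static void node_decr(Node *n);

/* Make `from` refer to `to`, taking a reference on `to`. */
static void node_link(Node *from, Node *to)
{
    to->refCount++;
    if (from->ref != NULL) {
        node_decr(from->ref);
    }
    from->ref = to;
}

/* Drop one reference; on the last one, release the referred node too. */
static void node_decr(Node *n)
{
    if (--n->refCount > 0) {
        return;
    }
    if (n->ref != NULL) {
        node_decr(n->ref);
    }
    free(n);
    live--;
}

/* Build a -> b, optionally close the loop with b -> a, then drop the
 * references held by the local variables.
 * Returns how many of the two nodes are still allocated afterwards. */
static int live_after_release(int close_loop)
{
    int before = live;
    Node *a = node_new();
    Node *b = node_new();
    a->refCount++;              /* reference held by variable a */
    b->refCount++;              /* reference held by variable b */
    node_link(a, b);
    if (close_loop) {
        node_link(b, a);        /* the in-place update that closes the loop */
    }
    node_decr(b);
    node_decr(a);
    return live - before;
}

#ifdef DEMO_MAIN
int main(void)
{
    /* exit status 0 means: chain fully freed, loop fully leaked */
    return (live_after_release(0) == 0 && live_after_release(1) == 2) ? 0 : 1;
}
#endif
```

Without the loop, dropping the last reference to a cascades to b and both nodes are freed. With the loop, each node keeps the other's count at 1 forever, even though no code path still holds a pointer to either of them.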