Managing the reference count of Tcl objects

The general rule for managing the reference count of Tcl objects is to increment the reference count of a Tcl_Obj before passing it to a Tcl C API function, and then in turn to decrement the reference count once finished accessing that Tcl_Obj.

See Also

Tcl_Obj refCount HOWTO
Information on the same topic.

Description

From a usenet posting by André Pönitz:

Until now I have been using Tcl_Eval as the only method to access Tcl from my C++-Code. For performance reasons I'd like to switch to some more elaborate method, but I am still confused where and when I have to call Tcl_IncrRefCount and Tcl_DecrRefCount. Specifically: Suppose I have an object that actually is a list. I have called Tcl_IncrRefCount once on every item in the list and on the list itself. How do I free the list?

MS 2005-10-24: Apart from all the comments below, something specific to lists. In general, you should not concern yourself with the refCount of list items, as the Tcl library will handle those for you, i.e., the rule is that you should not touch them (neither incr nor decr) as long as you are using the List functions. If you are doing direct surgery on the list (as opposed to doing it via Tcl_ListObjReplace() or similar), it is really more involved. Recommendation is

  1. Don't.
  2. if you still do, read the Tcl sources carefully (especially tclListObj.c).
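To make the point about list items concrete, here is a minimal sketch in plain C. It is a toy model, not the Tcl library: ToyObj, ToyList and the toy_* functions are hypothetical stand-ins, and "released" is a flag set in place of actually freeing memory. It assumes only what is stated above, namely that the list functions manage the element counts themselves.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LIST_MAX 8

/* Toy stand-in for Tcl_Obj; "released" is set instead of calling free(). */
typedef struct ToyObj {
    int refCount;
    int released;
} ToyObj;

/* Toy list value: it owns one reference to each element it stores. */
typedef struct ToyList {
    int refCount;
    int released;
    int count;
    ToyObj *elems[TOY_LIST_MAX];
} ToyList;

static void toy_obj_decr(ToyObj *o) {
    assert(!o->released);
    if (--o->refCount <= 0) {
        o->released = 1;
    }
}

/* Appending gives the list its own reference to the element. */
static int toy_list_append(ToyList *list, ToyObj *elem) {
    if (list->count == TOY_LIST_MAX) {
        return 0;
    }
    elem->refCount++;
    list->elems[list->count++] = elem;
    return 1;
}

/* Dropping the last reference to the list also drops the list's
 * references to its elements. */
static void toy_list_decr(ToyList *list) {
    assert(!list->released);
    if (--list->refCount <= 0) {
        for (int i = 0; i < list->count; i++) {
            toy_obj_decr(list->elems[i]);
        }
        list->count = 0;
        list->released = 1;
    }
}

/* Returns 1 if releasing the list alone released both elements. */
int toy_list_demo(void) {
    ToyObj a = {0, 0}, b = {0, 0};
    ToyList list = {0, 0, 0, {NULL}};

    list.refCount++;              /* the caller's only reference */
    toy_list_append(&list, &a);   /* no caller-side incr on a or b */
    toy_list_append(&list, &b);
    toy_list_decr(&list);         /* the caller's only decr */
    return list.released && a.released && b.released;
}
```

In this model the caller touches only the list's own count; had it also incremented each element, those elements would still be alive after the list was gone.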

You're making things difficult for yourself.

There's only one rule:

You need to worry about ref counts if and only if you have a Tcl_Obj* on the left hand side of an equal sign. In this case, you must

  1. Tcl_IncrRefCount() the new content of the variable.
  2. Make sure that you Tcl_DecrRefCount() that content when you lose the reference, either because you overwrote it with another assignment or because it went out of scope.

If you know exactly what you're doing, you can sometimes skip incrementing/decrementing the ref count, because you're sure that there's another reference somewhere. But the rule above always works.

The problem with this rule is performance. Tcl_IncrRefCount() often makes an unshared object appear to be shared, mandating an extra Tcl_DuplicateObj() for any modification of the value. For this reason, you can optionally add a second rule:

  • If you're absolutely sure that nobody else will decrement the ref count of an object while you're holding a reference to it, you can skip manipulating the ref count.
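The basic rule can be sketched with a toy model in plain C. Nothing here is Tcl API: ToyObj, toy_incr, toy_decr and toy_assign are hypothetical names, and the "released" flag stands in for the object being freed. The sketch only shows the bookkeeping the rule asks for on every assignment.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for Tcl_Obj: only the reference count is modelled. */
typedef struct ToyObj {
    int refCount;
    int released;   /* set when the count drops to zero */
} ToyObj;

static void toy_incr(ToyObj *o) {
    o->refCount++;
}

static void toy_decr(ToyObj *o) {
    assert(!o->released);
    if (--o->refCount <= 0) {
        o->released = 1;
    }
}

/* The rule: every store into a ToyObj* variable counts the new
 * content and releases whatever the variable held before. */
static void toy_assign(ToyObj **var, ToyObj *value) {
    if (value != NULL) {
        toy_incr(value);      /* step 1: count the new content */
    }
    if (*var != NULL) {
        toy_decr(*var);       /* step 2: the old reference is lost */
    }
    *var = value;
}

/* Returns 1 if the counts end up as the rule predicts. */
int toy_assign_demo(void) {
    ToyObj a = {0, 0}, b = {0, 0};
    ToyObj *var = NULL;
    int ok;

    toy_assign(&var, &a);     /* a: 1 */
    toy_assign(&var, &b);     /* a: 0 and released, b: 1 */
    ok = a.released && !b.released && b.refCount == 1;
    toy_assign(&var, NULL);   /* var goes out of scope: b released */
    return ok && b.released;
}
```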

Let's make an lshift command to illustrate how this works. lshift will accept the name of a variable whose value is a list, and remove its first element. A Tcl equivalent would be:

proc lshift { varName } {
    upvar 1 $varName var
    set var [lrange $var 1 end]
}

The following is a naive implementation of lshift in C. It is ultraconservative about reference counts; it always, always adjusts the count when it stores a Tcl_Obj pointer.

int
Lshift_ObjCmd( ClientData unused,
               Tcl_Interp* interp,
               int objc,
               Tcl_Obj * CONST objv[] )
{
    Tcl_Obj* listPtr;    /* Pointer to the list being shifted. */
    int status;          /* Status return from Tcl library */
    Tcl_Obj* retPtr = NULL; /* Pointer returned from Tcl library */

    /* Check arguments */

    if ( objc != 2 ) {
        Tcl_WrongNumArgs( interp, 1, objv, "varName" );
        return TCL_ERROR;
    }

    /* Get a pointer to the list */ 

    listPtr = Tcl_ObjGetVar2( interp, objv[1], (Tcl_Obj*) NULL,
                              TCL_LEAVE_ERR_MSG );
    if (listPtr == NULL ) {
        return TCL_ERROR;
    }

    /* See the discussion for comments on the following line */

    Tcl_IncrRefCount( listPtr );                        /* [A] */

    /* If the list object is shared, make a private copy. */

    if ( Tcl_IsShared( listPtr ) ) {
        Tcl_Obj* temp = listPtr;
        listPtr = Tcl_DuplicateObj( listPtr );
        Tcl_DecrRefCount( temp );
        Tcl_IncrRefCount( listPtr );
    }
    
    /**
     ** At this point, listPtr designates an unshared copy of the
     ** list.  Edit it.
     **/

    status = Tcl_ListObjReplace( interp, listPtr, 0, 1, 0, (Tcl_Obj*)NULL );

    /**
     ** Put the new copy of the list back in the variable.
     **/

    if ( status == TCL_OK ) {
        retPtr = Tcl_ObjSetVar2( interp, objv[ 1 ], (Tcl_Obj*) NULL,
                                 listPtr, TCL_LEAVE_ERR_MSG );
    }

    /**
     ** Store the new copy of the list in the interpreter result.
     **/
    if ( retPtr == NULL ) {
        status =  TCL_ERROR;
    } else {
        /* Use retPtr instead of listPtr if the result should reflect changes made by write traces */
        Tcl_SetObjResult( interp, listPtr ); 
    }

    /* Record that listPtr is going out of scope */

    Tcl_DecrRefCount( listPtr );

    /* Tell the caller whether the operation worked. */

    return status;
}

OK, now why did I say this was a naive implementation? It's simple: After the Tcl_IncrRefCount() at point [A] in the code, the object is always shared: there's at least the one reference to it in the variable table, and the one we just added. The result is that we'll always duplicate the object.
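The effect of the early increment can be checked with a few lines of plain C. This is a toy calculation under stated assumptions, not Tcl code: ToyObj, toy_is_shared and copies_needed are hypothetical, and "shared" is modelled as a count greater than one.

```c
#include <assert.h>

/* Toy stand-in for Tcl_Obj: only the reference count is modelled. */
typedef struct ToyObj {
    int refCount;
} ToyObj;

/* An object counts as shared when more than one reference exists. */
static int toy_is_shared(const ToyObj *o) {
    return o->refCount > 1;
}

/* How many copies does a shift make?  varRefs is the number of
 * references that exist before the command runs (at least the one
 * held by the variable); incrFirst says whether the command takes
 * its own reference before the shared check, as at point [A]. */
int copies_needed(int varRefs, int incrFirst) {
    ToyObj list;

    assert(varRefs >= 1);
    list.refCount = varRefs;
    if (incrFirst) {
        list.refCount++;
    }
    return toy_is_shared(&list) ? 1 : 0;
}
```

With one reference in the variable, copies_needed(1, 1) is 1 and copies_needed(1, 0) is 0: the copy is made only because of the command's own increment.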

This is strictly a performance issue. The code as written will work, but the extra steps affect its performance. Making it faster in a safe manner requires a little bit of source diving to discover that Tcl_ListObjReplace() doesn't mess with the ref count of the list. However, Tcl_ObjSetVar2() does adjust the ref count of the object stored in the variable. We therefore can let the ref count float while we're performing surgery on the list, as long as we repair it by the time we're storing it back in the variable.

Doing this kind of optimization requires knowing about routines safe for zero-ref objs.

That change leads us to the following:

int
Lshift_ObjCmd( ClientData unused,
               Tcl_Interp* interp,
               int objc,
               Tcl_Obj * CONST objv[] )
{
    Tcl_Obj* listPtr;    /* Pointer to the list being shifted. */
    int status;          /* Status return from Tcl library */
    Tcl_Obj* retPtr = NULL; /* Pointer returned from Tcl library */

    /* Check arguments */

    if ( objc != 2 ) {
        Tcl_WrongNumArgs( interp, 1, objv, "varName" );
        return TCL_ERROR;
    }

    /* Get a pointer to the list */ 

    listPtr = Tcl_ObjGetVar2( interp, objv[1], (Tcl_Obj*) NULL,
                              TCL_LEAVE_ERR_MSG );
    if (listPtr == NULL ) {
        return TCL_ERROR;
    }

    /** PERFORMANCE CHANGE:
     ** We intend to perform surgery on the list object, so
     ** avoid making adjustments to its reference count yet.
     ** Hence, its reference count is one too low.
     **/

    /* REMOVED:  Tcl_IncrRefCount( listPtr ); */

    /* If the list object is shared, make a private copy. */

    if ( Tcl_IsShared( listPtr ) ) {
        Tcl_Obj* temp = listPtr;
        listPtr = Tcl_DuplicateObj( listPtr );
        /** PERFORMANCE CHANGE: At this point, we're not yet
         **    tracking the reference count of listPtr.  Its
         **    reference count remains one too low.
         ** REMOVED:
         **   Tcl_DecrRefCount( temp );
         **   Tcl_IncrRefCount( listPtr );
         **/
    }
    
    /**
     ** At this point, listPtr designates an unshared copy of the
     ** list.  Edit it.
     **/

    status = Tcl_ListObjReplace( interp, listPtr, 0, 1, 0, (Tcl_Obj*)NULL );

    /** PERFORMANCE CHANGE: 
     ** From this point forward, we ensure
     **    that listPtr's reference count is correct.
     ** ADDED: */

    Tcl_IncrRefCount( listPtr );

    /**
     ** Put the new copy of the list back in the variable.
     **/

    if ( status == TCL_OK ) {
        retPtr = Tcl_ObjSetVar2( interp, objv[ 1 ], (Tcl_Obj*) NULL,
                                 listPtr, TCL_LEAVE_ERR_MSG );
    }

    /**
     ** Store the new copy of the list in the interpreter result.
     **/
    if ( retPtr == NULL ) {
        status =  TCL_ERROR;
    } else {
        Tcl_SetObjResult( interp, listPtr ); 
    }

    /* The reference to the list is going out of scope. */

    Tcl_DecrRefCount( listPtr );

    /* Tell the caller whether the operation worked. */

    return status;
}

DKF 2002-01-09: It turns out that you can be even more efficient than the above by taking advantage of the fact that Tcl_ObjSetVar2() does the Right Thing in the process of putting an object into a variable so that if you put an object into a variable where it already exists, its refcount can never drop to zero. This allows for the following implementation which does virtually no explicit reference count manipulation at all (just the Tcl_IsShared()/Tcl_DuplicateObj() combo in the middle):

int
Lshift_ObjCmd( ClientData unused,
               Tcl_Interp* interp,
               int objc,
               Tcl_Obj * CONST objv[] )
{
    Tcl_Obj* listPtr;    /* Pointer to the list being shifted. */
    int status;          /* Status return from Tcl library */
    Tcl_Obj* retPtr = NULL; /* Pointer returned from Tcl library */

    /* Check arguments */

    if (objc != 2) {
        Tcl_WrongNumArgs(interp, 1, objv, "varName");
        return TCL_ERROR;
    }

    /* Get a pointer to the list */ 

    listPtr = Tcl_ObjGetVar2(interp, objv[1], (Tcl_Obj*) NULL,
                             TCL_LEAVE_ERR_MSG);
    if (listPtr == NULL) {
        return TCL_ERROR;
    }

    /* If the list object is shared, make a private copy. */

    if (Tcl_IsShared(listPtr)) {
        listPtr = Tcl_DuplicateObj(listPtr);
    }
    
    /**
     ** At this point, listPtr designates an unshared copy of the
     ** list.  Edit it.
     **/

    status = Tcl_ListObjReplace(interp, listPtr, 0, 1, 0, (Tcl_Obj*)NULL);

    /** PERFORMANCE CHANGE: 
     ** At this point, listPtr's refcount is either zero or one
     ** and it will get incremented (and then decremented again
     ** if it was previously 1) in the Tcl_ObjSetVar2() call, in
     ** effect transferring ownership of the object to the
     ** variable.
     **
     ** NOTE that this may leak listPtr if Tcl_ObjSetVar2 fails. 
     ** DO NOT USE in Tcl8.4, nor in Tcl8.5 until Bug #1334947 is fixed.
     **
     ** REMOVED:
     **   Tcl_IncrRefCount(listPtr);
     **/

    /**
     ** Put the new copy of the list back in the variable.
     **/

    if (status == TCL_OK) {
        retPtr = Tcl_ObjSetVar2(interp, objv[1], (Tcl_Obj*) NULL,
                                listPtr, TCL_LEAVE_ERR_MSG);
    }

    /**
     ** Store the new copy of the list in the interpreter result.
     ** Increments the reference count of listPtr.
     **/
     if ( retPtr == NULL ) {
         status =  TCL_ERROR;
     } else {
         Tcl_SetObjResult( interp, listPtr ); 
     }
 
    /** PERFORMANCE CHANGE:
     ** On success, the refcount for listPtr is now 2 (one
     ** reference from the variable, one from the interpreter
     ** result), so do nothing here.
     ** REMOVED:
     **   Tcl_DecrRefCount(listPtr);
     **/

    /* Tell the caller whether the operation worked. */

    return status;
}

Historical note: in some previous versions of Tcl (see issue #1334947, "refCounts on PtrSetVar failure", 2005-10-22), Tcl_ObjSetVar2() didn't always do the right thing, causing the code above to leak the listPtr whenever Tcl_ObjSetVar2() failed due to e.g. objv[1] being the name of an array at the time it is called.
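The property DKF relies on, that storing an object into a variable which already holds it never lets the count reach zero, comes down to the order of operations inside the store. The following plain C sketch shows one way a store routine can have that property. It is an illustration only and makes no claim about how Tcl_ObjSetVar2() is actually written; ToyObj, store_incr_first and store_decr_first are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for Tcl_Obj; "released" is set instead of calling free(). */
typedef struct ToyObj {
    int refCount;
    int released;
} ToyObj;

static void toy_decr(ToyObj *o) {
    assert(o->refCount > 0);
    if (--o->refCount == 0) {
        o->released = 1;
    }
}

/* Safe order: count the new value before releasing the old one. */
static void store_incr_first(ToyObj **slot, ToyObj *value) {
    ToyObj *old = *slot;

    value->refCount++;
    *slot = value;
    if (old != NULL) {
        toy_decr(old);
    }
}

/* Unsafe order: if value is already in the slot, the decrement
 * releases it before the increment can protect it. */
static void store_decr_first(ToyObj **slot, ToyObj *value) {
    if (*slot != NULL) {
        toy_decr(*slot);
    }
    value->refCount++;
    *slot = value;
}

/* Store an object back into the slot that already holds it.
 * Returns 1 if the object survives the store. */
int survives_restore(int incrFirst) {
    ToyObj list = {1, 0};     /* the slot holds the only reference */
    ToyObj *slot = &list;

    if (incrFirst) {
        store_incr_first(&slot, &list);
    } else {
        store_decr_first(&slot, &list);
    }
    return !list.released;
}
```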


NEM 2003-05-10 PYK 2023-04-04: I had some trouble recently tracking down a bug which occurred due to not calling Tcl_IncrRefCount() on an object which I later call Tcl_DecrRefCount() on. The bug was very difficult to track down, as it only showed up somewhere completely unrelated. Example:

Tcl_Obj *cmd = Tcl_NewStringObj("somecmd args", -1);
// Missing Tcl_IncrRefCount here - this is the bug
int res = Tcl_EvalObjEx(interp, cmd, TCL_EVAL_GLOBAL);
Tcl_DecrRefCount(cmd);
// Check res and continue or error

The problem is that 99% of the time, the bug will not surface here but much later. For me, it surfaced when calling info commands in my code, at which point the program dumped core (segfault).

MS explained why this can happen: Basically, you should always call Tcl_IncrRefCount() on any object you send to a Tcl API call, as that call may call Tcl_IncrRefCount() and then Tcl_DecrRefCount() on it, which would result in the object being "freed". The reason why "freed" is in inverted commas is that (depending on the memory allocator in use) the object may not be freed at all, but returned to a pool for reuse later. In my example, the problem was that the object was being freed, and then immediately reused for a different purpose, before I did my Tcl_DecrRefCount(). This meant that I was basically freeing an object that I didn't own, and that causes all hell to break loose.
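The sequence of events in that explanation can be replayed in a toy model. The sketch below is plain C with hypothetical names (ToyObj, toy_eval and so on) and assumes, as described above, that the callee holds its own reference for the duration of the call. "released" is a flag standing in for the object being returned to the allocator.

```c
#include <assert.h>

/* Toy stand-in for Tcl_Obj; "released" is set instead of calling free(). */
typedef struct ToyObj {
    int refCount;
    int released;
} ToyObj;

static void toy_incr(ToyObj *o) {
    o->refCount++;
}

static void toy_decr(ToyObj *o) {
    assert(!o->released);
    if (--o->refCount <= 0) {
        o->released = 1;
    }
}

/* Hypothetical callee: holds the object while it works on it. */
static void toy_eval(ToyObj *cmd) {
    toy_incr(cmd);
    /* ... evaluate cmd ... */
    toy_decr(cmd);
}

/* Returns 1 if the object was already released by the time the
 * caller gets round to its own decrement. */
int released_before_callers_decr(int callerHoldsRef) {
    ToyObj cmd = {0, 0};      /* a fresh object starts at zero */
    int releasedEarly;

    if (callerHoldsRef) {
        toy_incr(&cmd);       /* the increment missing from the bug */
    }
    toy_eval(&cmd);
    releasedEarly = cmd.released;
    if (!releasedEarly) {
        toy_decr(&cmd);       /* only legal while the object is alive */
    }
    return releasedEarly;
}
```

Without the caller's increment the callee's matched pair takes the count from 0 to 1 and back to 0, so the object is gone before the caller's decrement runs.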

The solution, obviously, is to double check to make sure that all of your Tcl_IncrRefCount/Tcl_DecrRefCount pairs match up. It can be easy to overlook one, though, as I discovered. You can compile Tcl with -DPURIFY (or all, or debug) to get Tcl to use the standard malloc()/free() in which case the problem will be localized (i.e. freeing an object once too often won't result in a problem for Tcl), but you still need to find and fix the bug.

To help with this, I'm going to look into how hard it would be to add a debug compile flag for Tcl which allows tracking of Tcl_Obj reference counts. Should be fun!

jcw 2003-11-09: For another idea on how to simplify the programming task of cleaning up reference count, see the AutoReleasePool used in Cocoa on Mac OS X.

DKF 2003-11-10: It's not entirely clear to me from the above page how an AutoReleasePool would be done in conventional C, but then I'm a bit jet-lagged right now...

The idea seems to be that one puts a ref in a pool, when returning an object which ought to be released at some point. The pool is normally created right after every event and deleted just before returning to poll/suspend for another event. Deleting the pool means: drop all refcounts stored in it. This approach makes it possible to never have zero-ref objects: simply add them to the pool and if the object is not used anywhere else it'll go away when the system is idle. -jcw
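As a rough answer to how such a pool might look in conventional C, here is a minimal sketch. It is a toy under stated assumptions: a fixed-size pool, hypothetical names (ToyObj, ToyPool, pool_add, pool_drain), and a "released" flag in place of real deallocation. It is not modelled on Cocoa's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_MAX 16

/* Toy stand-in for Tcl_Obj; "released" is set instead of calling free(). */
typedef struct ToyObj {
    int refCount;
    int released;
} ToyObj;

/* The pool simply remembers objects it holds a reference to. */
typedef struct ToyPool {
    size_t count;
    ToyObj *items[POOL_MAX];
} ToyPool;

static void toy_decr(ToyObj *o) {
    assert(!o->released);
    if (--o->refCount <= 0) {
        o->released = 1;
    }
}

/* Give the pool its own reference, so the object is never at zero.
 * Returns the object, or NULL if the pool is full. */
static ToyObj *pool_add(ToyPool *pool, ToyObj *o) {
    if (pool->count == POOL_MAX) {
        return NULL;
    }
    o->refCount++;
    pool->items[pool->count++] = o;
    return o;
}

/* Called when the system goes idle: drop every pooled reference. */
static void pool_drain(ToyPool *pool) {
    for (size_t i = 0; i < pool->count; i++) {
        toy_decr(pool->items[i]);
    }
    pool->count = 0;
}

/* Returns 1 if draining releases only the object nobody else kept. */
int pool_demo(void) {
    ToyPool pool = {0, {NULL}};
    ToyObj temp = {0, 0}, kept = {0, 0};

    pool_add(&pool, &temp);
    pool_add(&pool, &kept);
    kept.refCount++;          /* someone stored a reference to kept */
    pool_drain(&pool);
    return temp.released && !kept.released && kept.refCount == 1;
}
```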

elfring 2003-11-11: Can the design pattern Resource Acquisition is Initialization help to find solutions? Can anything from the Smart Pointer Library be converted into C functions for Tcl?

dkf 2009-06-19 10:37:25:

The Tcl equivalent to how RAII is used in practice is to create a command that takes (among other arguments as necessary) the name of a variable and a script and which operates as follows:

  1. allocate the resource
  2. store the name of the resource in the variable (which is resolved in the caller's scope, so use upvar if doing this in a procedure)
  3. evaluate the script in the caller's context (using uplevel if necessary)
  4. free the resource when the script returns, catching and rethrowing any exceptional conditions in the process.

Yes, RAII can be used in other ways. It usually isn't though, and this way is pretty easy. It ends up leading to code like this (a simple example where the implementation is left as an exercise):

withOpenFile /some/file/name fid {
    gets $fid line
    puts "the first line of /some/file/name is $line"
}
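The same shape can be sketched in plain C, with a callback playing the role of the script. This is only an analogy to the Tcl command described above, not an implementation of it: with_open_file, FileBody, read_first_line and FIRST_LINE_CAP are hypothetical names, and since C has no exceptions, step 4 reduces to closing the file on every path out of the function.

```c
#include <assert.h>
#include <stdio.h>

#define FIRST_LINE_CAP 256

/* Hypothetical callback type: receives the open stream and a
 * caller-supplied context pointer; returns 0 on success. */
typedef int (*FileBody)(FILE *fp, void *ctx);

/* Open path, run body, and close the file whatever body returns.
 * Returns -1 if the file could not be opened, otherwise the value
 * returned by body. */
int with_open_file(const char *path, FileBody body, void *ctx) {
    FILE *fp;
    int status;

    assert(path != NULL && body != NULL);
    fp = fopen(path, "r");        /* 1. allocate the resource */
    if (fp == NULL) {
        return -1;
    }
    status = body(fp, ctx);       /* 2 and 3. hand it over and run */
    fclose(fp);                   /* 4. always release it */
    return status;
}

/* Example body: copy the first line into the buffer passed as ctx,
 * which must have room for FIRST_LINE_CAP characters. */
int read_first_line(FILE *fp, void *ctx) {
    char *buf = ctx;

    return fgets(buf, FIRST_LINE_CAP, fp) != NULL ? 0 : 1;
}
```

A caller would pass a buffer of FIRST_LINE_CAP characters as ctx and never see the FILE pointer outside the callback, which is what keeps the close from being forgotten.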

Survey of Tcl_Obj reference-counting issues

Issues related to reference-counting of Tcl_Obj can be categorized as follows:

  1. Passing a Tcl_Obj with refCount of 0 to a function which decrements the refCount, and then attempting to access the Tcl_Obj, which has now been freed: Examples include decrementing the refCount again when the function is complete. In this case, the caller should understand when the called function decrements the refCount, and not duplicate the effort. Either that or increment the refCount before passing the Tcl_Obj to the function.
  2. Passing a Tcl_Obj with refCount of 0 to a function which saves the pointer somewhere and increments the refCount, and then decrementing the refCount after the function returns, meaning that the saved pointer now points to freed memory: In this case, the caller should understand when the called function increments the refCount, and not decrement the refCount so that it falls back to zero. Either that or increment the refCount before passing the Tcl_Obj to the function.
  3. A Tcl_Obj can increment the refCount of another Tcl_Obj and store a pointer to that object in its internal representation. If the other Tcl_Obj does the same with the original Tcl object, neither Tcl_Obj ever reaches a refCount of 0, which is a memory leak.
  4. A Tcl_Obj can store a reference to itself in its own internal representation, either directly or through one or more intermediate Tcl_Obj structures, so the reference count never reaches 0, which is a memory leak.
  5. Aliasing: If a function that takes a pointer to a Tcl_Obj as an argument and returns a pointer to a Tcl_Obj returns a pointer that was passed in, a caller that doesn't expect this could then decrement the reference count of the object that was passed as an argument, causing it to be freed, and then attempt to access the same Tcl_Obj via the pointer that was returned.
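The aliasing case (category 5) is the least obvious of these, so here is a minimal sketch of it in plain C. It is a toy model with hypothetical names (ToyObj, toy_normalize, result_survives); toy_normalize stands for any function that may hand back the very pointer it was given.

```c
#include <assert.h>

/* Toy stand-in for Tcl_Obj; "released" is set instead of calling free(). */
typedef struct ToyObj {
    int refCount;
    int released;
} ToyObj;

static void toy_decr(ToyObj *o) {
    assert(!o->released);
    if (--o->refCount <= 0) {
        o->released = 1;
    }
}

/* Hypothetical function that returns its own argument. */
static ToyObj *toy_normalize(ToyObj *in) {
    return in;
}

/* Returns 1 if the result is still alive after the caller has
 * dropped its reference to the argument. */
int result_survives(int takeResultRefFirst) {
    ToyObj arg = {1, 0};      /* the caller holds one reference */
    ToyObj *result = toy_normalize(&arg);

    if (takeResultRefFirst) {
        result->refCount++;   /* claim the result before letting go */
    }
    toy_decr(&arg);           /* the caller is done with arg */
    return !result->released;
}
```

Taking a reference to the result before releasing the argument is safe whether or not the two pointers turn out to be the same object.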

A list of relevant issue reports along with their category:

memory leak: SetFsPathFromAny, assisted by the global literal table, causes a Tcl_Obj to reference itself {2023 03 26}
Category 4.
[try] interaction with local variable names produces segmentation fault {2019 04 18}
Category 1.
foreach memleak {2016 02 25}
Category 3.
Tcl_LinkVar memory corruption causing crashes {2005 04 27}
Category 1.
refCounts on PtrSetVar failure {2005 10 22}
Category 1.
Crash in Tcl_WriteChars() {2022 11 16}
Category 5.

Page Authors

PYK
Added the case study on Tcl variables.