An objectifying hook for Tcl 9

Below is a very rough idea... it might be doable in Tcl 8 (i.e. compatibly), I'm not sure... jcw


Suppose a Tcl value could be tagged by one extra bit: the tag would indicate that the value wants to behave as an object. So for example, the command:

    set v [blah ...]

would be turned into:

    [blah ...] set v

if the result of blah were tagged in this way. Otherwise, the result would simply be stored in v as usual.

That requires changes to "set" (which is the most important one for me at this stage), but perhaps also to "lindex" and others. And given that set is such a core function, the tag check must be super fast, which means either a bit in Tcl_Obj or a change to the Tcl_ObjType vector (this may be tricky, due to shimmering).
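The turn-around described above can be emulated at the script level to see what it would look like. This is only a rough sketch, not the proposed core change: the names oset, ::tagged and ::counter are made up for illustration, and the ::tagged array merely stands in for the tag bit that a real implementation would check in C.

```tcl
# Hypothetical registry standing in for the proposed tag bit.
array set ::tagged {}

# Emulates a tag-aware "set": tagged values get to handle the assignment.
proc oset {varName value} {
    if {[info exists ::tagged($value)]} {
        # Tagged: turn the call around, as in "[blah ...] set v".
        return [uplevel 1 [list $value set $varName]]
    }
    # Untagged: plain assignment in the caller's scope.
    upvar 1 $varName var
    return [set var $value]
}

# Toy object command (hypothetical). Its "set" method stores the
# command's own name in the given variable and logs that it was asked.
proc ::counter {method args} {
    switch -- $method {
        set {
            upvar 1 [lindex $args 0] var
            lappend ::log "object stored in [lindex $args 0]"
            return [set var ::counter]
        }
        default { error "unknown method \"$method\"" }
    }
}
set ::tagged(::counter) 1

oset x 42          ;# ordinary value: stored as usual
oset y ::counter   ;# tagged value: the object handles the store
```

The lookup in ::tagged is exactly the part that would be too slow if done this way in the core, hence the need for a cheap flag.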

The point of all this...

I am building a system which lets me write complex "expressions" of Tcl commands. Each returns an object (in the usual way: the name of a freshly created Tcl command), but the most important issue is cleanup. So far, I can get everything to clean up just dandy, by avoiding set. So instead of writing:

    set a [b ... [c ... [d ...] ... [e ...] ...] ... [f ...] ...]

I have to write:

    [b ... [c ... [d ...] ... [e ...] ...] ... [f ...] ...] as a

Everything works. It's in fact surprisingly simple to achieve full cleanup (through a bit of trace trickery), across proc invocations and all. But the one nastiness here is that "set a ..." is trouble.

So if all the above commands were to return a value which instructs "set" to turn around and let the command object handle the setup, then "set a ..." would work ok. So would "set a(ha) ...".

The problem now moves to lset and lreplace and lappend not working, in that they would not clean up as the rest does. But this is definitely a second-order issue. The difference is that these special objects are aggregates, and will therefore be forbidden for storage.

There is one more refinement, to really turn this into a solid solution: if the same logic were also applied when initializing proc arguments, then passing args and cleaning them up becomes effortless (without this, one needs an extra call for each arg in each proc, which is easy to forget). In other words, the tag also needs to be dealt with when "setting arg-variables" (turning each tagged value passed in into a member call).

To keep things in perspective: all of this is only about adding machinery so that automatic cleanup of (command) objects can be implemented (at the Tcl script level).

To put it another way... the above adds the ability to implement automatic object lifetimes (a form of garbage collection).


DKF: A few points...

  • Adding flags to objects - I'm keener on doing this to the Tcl_ObjType structure, because I feel that the property of not being shimmered away is properly a feature of the type, and not the instances of that type. And anything added to Tcl_Obj will have a large effect on Tcl's memory consumption.
  • The bit I don't get at this stage is what is wrong with [set]?

Agree on #1. As for "set": the auto-cleanup trick hinges on adding an unset trace to every variable when it receives the value (which is a command name). In essence, a set that properly tracks objects must do the following:

  1. set the variable to the object reference (i.e. command name)
  2. increment an internal reference count for that object (this is not the Tcl_Obj ref count!)
  3. set an unset trace on the variable which decrements the refcount again

It's like saying "I will let you be my friend and will add you to my list of friends, if you promise to let me know when you no longer want to be my friend" ... :) -jcw
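The three steps above could be sketched in plain Tcl (8.5 or later) roughly as follows. This is a minimal illustration under assumptions, not the actual code being discussed: the ::autoobj namespace and the procs track and release are hypothetical names, and deleting the command with rename is just one plausible meaning of "cleanup".

```tcl
namespace eval ::autoobj {
    variable refcount    ;# array: command name -> number of variables holding it
    array set refcount {}
}

# Store the object reference $obj in the caller's variable $varName
# and arrange for the object to be released when that variable goes away.
proc ::autoobj::track {varName obj} {
    variable refcount
    upvar 1 $varName var
    set var $obj              ;# step 1: store the command name
    incr refcount($obj)       ;# step 2: script-level count, not the Tcl_Obj one
    # Step 3: the unset trace fires on explicit unset and on proc return.
    trace add variable var unset [list ::autoobj::release $obj]
    return $obj
}

# Trace callback: the trace appends name1, name2 and op, ignored here.
proc ::autoobj::release {obj args} {
    variable refcount
    if {![info exists refcount($obj)]} return
    if {[incr refcount($obj) -1] <= 0} {
        unset refcount($obj)
        catch {rename $obj {}}   ;# last reference gone: delete the command
    }
}

# Usage: the command ::thing disappears once demo's local variable does.
proc ::thing {} { return hello }
proc demo {} {
    ::autoobj::track v ::thing
    return [::thing]
}
set result [demo]
```

Note that this sketch does not handle a tracked variable being overwritten with another value; that would need a write trace as well.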


Aren't you just trying to reinvent garbage collection and references? - davidw

28nov02 jcw - In a way that fits in Tcl, yes indeed. A command is an object, the name of that cmd can be passed around as a reference. If stored in vars or array elements the above proposes a mechanism to make things automatic.


DKF - Yes, but there are quite a few places where people would actually rather like garbage-collected named things in Tcl (and Tk too). At the moment, only unnamed things (like lists, strings and numbers) are GC'ed, but it is quite clear that we really need more than that. The question is how to do it.

One way of thinking about the problem is to say that at the moment there is one reference, in Tcl's internal name symbol table (e.g. as a command, which you can list out with [info commands]). If you could somehow have an unnamed entity which was handled the same way as a command (or whatever other resource we're interested in), then it would be very useful in that it would vanish when the last of its (unnamed) references goes away.

There is one problem with this; Tcl tends to throw away references in Tcl_Objs rather eagerly...

jcw - Here's a thought: the string representation of a Tcl_Obj could be re-used. If "length" is zero, then "bytes" points to another Tcl_Obj* ... a reference! Not sure of all the implications (a lot more tests throughout the entire Tcl core), but this could act as a true indirection / reference.

Other variations are possible: if "length" is zero, then "bytes" could also be a null-terminated string, interpreted as a fully qualified Tcl variable or array element? Hm, nah that's brittle: deleting the var/elt would break things.