Version 7 of Reference

Updated 2004-03-05 20:01:23

A reference is a generic term for a value that refers to data that is stored somewhere. Some important examples are:

  • Pointers (as in e.g. C)
  • ...

A feature often expected of references in high-level languages is that storage should disappear (be freed/deallocated) when there are no more references to it. That kind of reference does not exist in Tcl.

[Explain garbage collection and the like.]

[It might also be reasonable to explain the distinctions between value, variable, and reference. Some languages have variables whose values are either real values if those values are primitive types or references if the values are structured types. Not all languages make the same distinctions. Java, Icon, and Smalltalk all have slightly different implementations of such things.]


References touch upon one of the ways in which Tcl is different from many other languages. What could be called the "standard model" for complex data structures is that:

  • all values are scalar; complexity comes from how they are stored.

A basic problem in this model is how to represent a thing that can contain other similar things. The solution is to introduce references: scalar values that let one access a (usually composite) storage. Some old languages (FORTRAN 77?) couldn't do this in general and as a result had severe restrictions, whereas others (Pascal) just barely provide the basic functionality. C and Lisp were probably the first languages in their respective families to have all the kinks worked out. In low-level languages the references are just bare memory addresses (pointers) to a block of memory, and the structuring of storage is just a way of labeling offsets within this memory block. In higher-level languages there usually exist more flexible ways of structuring storage (variable-length lists, "hashes", etc.) but the role of the references is the same. Every data structure that is not predefined in the language has to be implemented by connecting more basic data structures using references.

Tcl does not follow this standard model, because in Tcl everything is a string. Since a list can contain arbitrary strings it follows that a list can be a list of other lists, or a list of lists of lists, or whatever, nested to arbitrary depth. The basic problem of a thing containing other similar things can be solved without resorting to references. (Although of course, since Tcl is implemented in C, there are references "under the hood". Keeping them hidden is however often a strength!)
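
As a rough illustration of the paragraph above, here is a minimal sketch of a tree kept as a plain nested list. The layout {label child child ...} and the names tree, copy, and depth are assumptions made up for this example, not an established convention. It relies on multi-index lindex and on lset, which need Tcl 8.4 or later.

```tcl
# A tree stored as a nested list: {label child child ...}.
# The variable and proc names here are illustrative only.
set tree {root {a {a1} {a2}} {b {b1}}}

# lindex accepts several indices and descends through the nesting.
puts [lindex $tree 1 0]      ;# label of the first child
puts [lindex $tree 1 2 0]    ;# label of that child's second child

# lset replaces an element deep inside the value held by a variable.
lset tree 2 1 0 b1-renamed

# A copy is an independent value; changing it leaves the original alone.
set copy $tree
lset copy 0 other-root

# Recursion over the value itself; no pointers or handles are involved.
proc depth {node} {
    set max 0
    foreach child [lrange $node 1 end] {
        set d [depth $child]
        if {$d > $max} {set max $d}
    }
    return [expr {$max + 1}]
}
puts [depth $tree]
```

Nothing has to be freed afterwards: the nested value goes away when the variables holding it are unset or go out of scope.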


Discussion moved here from Why adding Tcl calls to a C/C++ application is a bad idea:

davidw - And this is still an area where Python has us beat:

  • Python has one, reasonably good OO system
  • Python objects are GC'ed, making them easier to use in scripts. Tcl objects must be destroyed by the programmer.
  • Python has references, which are also useful for doing data structures.

Of course, it's not impossible to do this in Tcl... there are numerous testaments to the intelligence and cleverness of Tcl programmers on this wiki that put paid to that notion. However, what we need to keep in mind is the programmer examining both languages for the first time.

It is not obvious how to handle complex data structures in Tcl, as compared with Python, Ruby or other scripting languages.


?: Some arguments are not correct

  • XOTcl is a better object system than Python's. Tcl is flexible enough to write OO systems in (see snit).
  • In XOTcl, not all objects need to be explicitly destroyed (even without GC). There are subobjects and volatile objects that are destroyed automatically.

Artur Trzewik: In fact, XOTcl object names are references. Objects are referenced by their names. I do not see a big difference among:

    C
    Point *point = malloc(sizeof(Point));
    free(point);

    C++
    Point *point = new Point();
    delete point;

    C#
    Point point = new Point();
    point = null;

    Smalltalk
    | point |
    point := Point new.
    point := nil.

    XOTcl
    set point [Point new]
    $point destroy

    set point [Point new -volatile] ;# if the object should live only in the current block

In all cases, point is a kind of reference to an object (structure) of type Point.

Memento: By using OO in Tcl, we do not really need any additional references.

About GC in OO Tcl: I have programmed in this language for several years, and after a long period of Smalltalk programming I would also like GC in Tcl. But so far I have not missed GC much in XOTcl. There are techniques like volatile objects and subobjects that make destroying many objects very simple. The C++ problems with forgotten objects (dangling objects) do not occur in XOTcl. On the other hand, releasing (destroying) objects is a very important part of the object lifetime that should be controlled by the program. I have seen many problems with Smalltalk memory handling because the programmer forgot to do this magic (point := nil.) for some global references. Even with GC you have to care about releasing objects, and you do not have full control over it.

It is also interesting to see some C++ programmers' arguments against GC.


davidw - XOTcl looks ok, but it's not available in ActiveTcl or the default starkit setup and there are no books about it. Even the .deb's have been pulled (I might be convinced to work on fixing at least that...). This is the problem with not having one standard thing.

Also, the 'volatile' objects, from what I see in the docs, depend on living in a particular Tcl procedure. I don't want that; I want something that goes away when nothing else references it.

The object system may be 'better' than python's or ruby's or something else, but for the new user, the situation looks like a mess. It's something I think we'll have to face up to if we want to program in this "Tcl is the controlling application" style, and promote it to the world at large.


Lars H: IMO, references are evil. I cannot recall a single case where using references would have simplified a programming task or would have led to a better solution. Indeed, I do not know of any problem where the asymptotic complexity is smaller in a computing model with references than it is in current Tcl (i.e., without references). But feel free to enlighten me if you have an example.

davidw: The classic example of having references/OO/GC make life easier is the http package. It requires you to do a 'cleanup', because otherwise it will never be GC'ed.

jcw - Another example is when you want to create Tcl commands and nest them yet clean up automatically. The Vkit is a vector engine page goes into an example and tries to find a solution, but it really applies to a lot of problem domains, let's say matrices:

   set a [$matrix invert]
   set b [$a transpose]
   set c [$b invert]
   $c print

I'd like to write:

   [[[$matrix invert] transpose] invert] print

And have everything clean up after use, i.e. the Tcl commands. That's where Python's "matrix.invert().transpose().invert().print()" makes life a lot easier. Even C++'s constructors/destructors with temporaries support such an idiom, with automatic cleanup.

Lars H: The catch in both examples is that they are idioms that require references rather than actual problems (i.e., something which is solved with an algorithm) that need them. HTTP communication is, as I understand it, completely stateless -- hence there should be no need to create anything that needs to be GC'ed. I find the matrix example almost preposterous. Matrices are data, so why should they ever be made commands? What is wrong with

  print [invert [transpose [invert $matrix]]]

?
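
A minimal sketch of the value-based style asked about above, assuming a matrix is represented as a list of rows. The proc names transpose and format_matrix are made up for this illustration and are not commands from any library. Inversion is left out to keep the sketch short.

```tcl
# Matrices as plain values: a list of rows, each row a list of numbers.
# "transpose" and "format_matrix" are illustrative names, not library commands.
proc transpose {m} {
    set result {}
    set ncols [llength [lindex $m 0]]
    for {set j 0} {$j < $ncols} {incr j} {
        set col {}
        foreach row $m {
            lappend col [lindex $row $j]
        }
        lappend result $col
    }
    return $result
}

# Render a matrix as text, one row per line.
proc format_matrix {m} {
    set lines {}
    foreach row $m {
        lappend lines [join $row " "]
    }
    return [join $lines \n]
}

set matrix {{1 2 3} {4 5 6}}
puts [format_matrix [transpose $matrix]]

# Intermediate results are ordinary values, so nothing has to be destroyed.
puts [format_matrix [transpose [transpose $matrix]]]
```

Each nested call returns a new value and leaves its argument untouched, so the question of cleaning up temporaries does not arise in this style.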

RS: The former form can be read from left to right to indicate the temporal order of the nested methods, while the latter requires the reader to read from right to left, as usual with nested functions (and APL), but less in Western cultures...

jcw - Lars, you say "matrices are data" - I see them as objects. Instances of a class, in OO terms (as in Python, C++, Smalltalk). An example which interests me more than matrices btw, is relational algebra [L1 ]. If you brush aside examples where Tcl commands are used as objects, and with it the issue of object cleanup, so be it. Your example uses a namespace to identify operators and apply them to data, the OO model takes objects and makes them respond to methods/messages. Polymorphism, encapsulation - you're free to not care about that, of course.

Lars H: Well, I suppose the OO hype must have missed me, because I tend not to see things as objects, and since this furthermore seems to save me a lot of trouble I'm only glad for it. Polymorphism does not require referenced objects.

I can agree that the "object on which a method is acting" view is sometimes appropriate, and I have on some occasions suggested using it, but matrix arithmetic hardly lends itself to that view. When all operations create new things from old then it isn't objects you're working with, but values! The Ratcl you refer to looks like a very striking example of this -- is there anything (apart from your preferences with respect to syntax) that prevents you from implementing views as Tcl values? (And you can of course have any syntax you like, if you just bite the bullet and create a proper little language.)
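
One possible reading of the "any syntax you like" remark, as a rough sketch only: a helper that feeds a value through a sequence of commands in left-to-right order while still working on plain values. The names chain and double are hypothetical and invented for this example. The sketch assumes Tcl 8.5 or later, both for the {*} expansion syntax and for lreverse.

```tcl
# "chain" is a hypothetical helper: it passes a value through a sequence of
# command prefixes, left to right, appending the value as the last argument.
proc chain {value args} {
    foreach cmd $args {
        set value [{*}$cmd $value]
    }
    return $value
}

# An ordinary command working on an ordinary list value.
proc double {xs} {
    set out {}
    foreach x $xs {
        lappend out [expr {$x * 2}]
    }
    return $out
}

# Reads in temporal order: sort, then double, then reverse.
set result [chain {3 1 2} {lsort -integer} double lreverse]
puts $result
```

The steps read in the order they run, yet every intermediate result is still just a value that needs no explicit cleanup.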


[ Category Concept ]