Islist Extension

This is a little extension that checks whether a variable is (internally) a list or not. Someone wanted something like this, so I tried it. The good thing is, it works :) but since I'm not really into C, I did not know how to dynamically (?) link it. So I guess you will need to edit the tcl.h include if your folder differs.

I just thought I'd post it here, in case someone was looking for something similar, or just wanted another simple (!) extension example.


See also critcl.


/*
* islist.c --
*
*    A dirty Check for list.
*
*    (c) Gotisch
*/
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "/usr/include/tcl8.4/tcl.h" 

static int
Islist_Cmd(
     ClientData cdata,
     Tcl_Interp *interp,
     int objc,
     Tcl_Obj *CONST objv[])
{
    if (objc != 2) {
        /* Produces: wrong # args: should be "Islist value" */
        Tcl_WrongNumArgs(interp, 1, objv, "value");
        return TCL_ERROR;
    }
    if (objv[1]->typePtr == NULL) {
        Tcl_SetObjResult(interp, Tcl_NewIntObj(0));
    } else if (strcmp(objv[1]->typePtr->name,"list")==0) {
        Tcl_SetObjResult(interp, Tcl_NewIntObj(1));
    } else {
        //Tcl_SetObjResult(interp, Tcl_NewStringObj(objv[1]->typePtr->name, -1));
        Tcl_SetObjResult(interp, Tcl_NewIntObj(0));
    }
    return TCL_OK;
}

/*
* Islist_Init --
*
*    Called when Tcl [load]'s your extension.
*/
int
Islist_Init(Tcl_Interp *interp)
{
    if (Tcl_InitStubs(interp, TCL_VERSION, 0) == NULL) {
        return TCL_ERROR;
    }
    Tcl_CreateObjCommand(interp, "Islist", Islist_Cmd, NULL, NULL);
    Tcl_PkgProvide(interp, "Islist", "1.0");
    return TCL_OK;
}

Compile it like this:

  gcc -shared -fPIC -DUSE_TCL_STUBS islist.c -o islist.so -ltclstub8.4

and then use it like this:

% load ./islist.so
% Islist [list 1 2 3]
1
% Islist "this is a string"
0
% Islist [split "this is a list because we split it"]
1
This also shows how Tcl switches the internal representation of values:

% set a [list 3] 
3
% Islist $a 
1
% incr a
4
% Islist $a
0

First it's a list, then it gets (internally) converted to an integer (I think).
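For readers without a C compiler, Tcl 8.6 and later ship an unsupported introspection command, ::tcl::unsupported::representation, that reports the same kind of information. The sketch below assumes that command is available and that its description keeps the "value is a TYPE with a refcount ..." wording. reptype is a hypothetical helper name, not part of Tcl.

```tcl
# Hypothetical helper: extract the internal type name from the
# description returned by ::tcl::unsupported::representation (Tcl 8.6+).
proc reptype {value} {
    set rep [::tcl::unsupported::representation $value]
    if {[regexp {^value is a (.+?) with a refcount} $rep -> type]} {
        return $type
    }
    return unknown
}

set a [list 3]
puts [reptype $a]   ;# the value was just built by [list]
incr a
puts [reptype $a]   ;# the value was just rewritten by [incr]
```

On a stock Tcl 8.6 this should report list first and int second, matching the session above. As the command's namespace says, it is unsupported and meant for debugging only.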


male 2006-01-23:

In my eyes it is very important to program in Tcl in a way that prevents shimmering, if the application being developed has to handle many large or huge lists and do a lot of numerical work!

A kind of developer extension that gathers statistics about shimmering, to find the most critical parts of an algorithm ... that would be really fine!

Some people have already suggested extending the info command to obtain object types and/or to convert objects into specified types.

This would be helpful to learn more about those "shimmering happenings"! Even if it only serves to show developers who are less familiar with Tcl the sometimes problematic situation of shimmering and its consequences!

Best regards,

Martin


NEM: Hmm... the above extension violates Tcl's "everything is a string" semantics, and breaks referential transparency. Consider the following:

% set a "this is a string"
this is a string
% set b "this is a string"
this is a string
% llength $b
4
% expr {$a == $b}
1
% Islist $a
0
% Islist $b
1

So, here we have two values that are "equal" in Tcl's eyes, and yet we cannot substitute one for the other, as they are not really equal. We have replaced Tcl's referentially transparent value semantics (where we can substitute equal terms) with a referentially opaque reference semantics where we have to distinguish between objects which are equal values and those which are equal references. To make matters worse, you can now cause unwanted side-effects due to shared references (carrying on the session above):

% set c $a ;# now shared
this is a string
% Islist $a
0
% llength $c
4
% Islist $a
1

There may be occasional legitimate debugging reasons for finding out the internal type that a Tcl_Obj happens to have at a given time, but it should be something that is used very rarely, and you shouldn't base the logic of a program around such information.
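If program logic really needs to ask "can this value be used as a list?", a value-based test keeps Tcl's semantics intact. A minimal sketch, assuming Tcl 8.5 or later for string is list; looksLikeList is a hypothetical name:

```tcl
# Value-based check: true for every string that parses as a list,
# regardless of which internal representation it currently has.
proc looksLikeList {value} {
    string is list $value
}

set a "this is a string"
set b "this is a string"
llength $b                          ;# may change b's internal representation
puts [looksLikeList $a]             ;# 1
puts [looksLikeList $b]             ;# 1, equal values give equal answers
puts [looksLikeList "\{unbalanced"] ;# 0, unmatched brace is not a list
```

Unlike Islist, the answer here depends only on the string value, so equal values can always be substituted for each other.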


male 2006-01-24:

For me, a feature to introspect object types is only for development and debugging purposes and not for any application logic.

In former times a colleague programmed vector functions in Tcl, and instead of using the list command he built a result list as the string "$x1 $x2 $x3". Or numbers were rounded with the format command, losing their internal representation.

Type introspection would help to detect such potentially very time-consuming "mistakes" (not bugs), which can also cause inaccuracy in matrix calculations when multiplying many 4x3 matrices along a structural path.

In my sources I try to operate on data of a given type only with the Tcl commands for that type, to prevent shimmering, to avoid more runtime consumption than necessary, to guarantee more "data stability", etc.
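The two "mistakes" mentioned above can be illustrated with a small sketch. The variable names are made up for the example, and the element containing a space is an assumption chosen to make the difference visible (real vector components would be plain numbers):

```tcl
set x1 1.5
set x2 "a b"
set x3 3

# Built by string interpolation: Tcl has to re-parse it as a list later,
# and the element boundaries are lost when an element contains a space.
set asString "$x1 $x2 $x3"
# Built with [list]: element boundaries are preserved.
set asList [list $x1 $x2 $x3]

puts [llength $asString]   ;# 4
puts [llength $asList]     ;# 3

# Rounding with [format] yields a new string with fewer digits.
set third   [expr {1.0 / 3}]
set rounded [format %.3f $third]
puts $rounded                       ;# 0.333
puts [expr {$rounded == $third}]    ;# 0, precision was lost
```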

But ... I would never ask for change of ...

% set a [list hello world]
% set b "hello world"
% expr {$a eq $b} ;# if equality would first test the types
0

What about a kind of warning/log mechanism that could be introduced by tracing objects for type changes? Then a developer could detect when variable contents change their type even though they shouldn't. If a command like "info type data" existed, then we could do it with a simple trace on a variable.
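Something close to this can be sketched today with an ordinary write trace. The sketch assumes Tcl 8.6 or later for ::tcl::unsupported::representation and assumes its "value is a TYPE with a refcount ..." wording; reptype, logType and typeLog are hypothetical names. Note the limitation: a write trace fires only when the variable is written, so shimmering caused by merely reading the value (for example with llength) goes unnoticed.

```tcl
# Hypothetical helper: internal type name of a value (Tcl 8.6+, debugging only).
proc reptype {value} {
    set rep [::tcl::unsupported::representation $value]
    if {[regexp {^value is a (.+?) with a refcount} $rep -> type]} {
        return $type
    }
    return unknown
}

set typeLog {}

# Write-trace callback: record the type of the value just stored.
proc logType {name1 name2 op} {
    upvar 1 $name1 var
    lappend ::typeLog [reptype $var]
}

trace add variable n write logType
set n [list 3]    ;# logged as a list
incr n            ;# logged as an integer
puts $typeLog
```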

And ... something I would really like is an equality test for lists that does not compare "only" the string representations and that respects each nested list. OK, perhaps I want this functionality only because I'm afraid that lists are converted to strings before being tested for equality. That is something I really want to prevent.

Best regards,

Martin

NEM: It would be quite nice to be able to trace internal representation changes of Tcl_Obj's sometimes for debugging. Perhaps rather than having a command that returns the type it would be safer to have some command which directly prints out the type of an object whenever it changes type?

An lequal command is probably a good idea for efficiency. Even more important, I think a dict equal command is needed as equality (or rather equivalence) of dictionaries is not the same as string equality. However, your proposal for a list equality command that descends into sub-lists is tricky as we run into the traditional problem with Tcl of how to distinguish a sub-list with one element from an "atom"? Checking for [llength $list] > 1 doesn't work. The best solution I can remember is to check:

% proc atom? a { expr {[lindex $a 0] eq $a} }
% set l [list [list [list 1 2 3 4]]]
{{1 2 3 4}}
% atom? $l
0
% set a 12
12
% atom? $a
1
% proc recurse {l {indent 0}} {
   if {[atom? $l]} {
       puts "[string repeat { } $indent]Atom: $l"
   } else {
       foreach item $l {
           recurse $item [expr {$indent + 2}]
       }
   }
}
% recurse $l
      Atom: 1
      Atom: 2
      Atom: 3
      Atom: 4

But that is not without its own problems. Perhaps it is doable efficiently at the C level, though?
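A few cases where the atom? test gives debatable answers, as a small sketch using only the proc defined above:

```tcl
proc atom? a { expr {[lindex $a 0] eq $a} }

puts [atom? [list a]]            ;# 1: a one-element list looks like an atom
puts [atom? "hello world"]       ;# 0: text containing a space looks like a list
puts [atom? {}]                  ;# 1: the empty list looks like an atom
puts [atom? [list [list a b]]]   ;# 0: nested list is recognised as a list
```

In addition, a value that is not a well-formed list (such as a lone open brace) makes lindex raise an error instead of returning an answer.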

male: You are right ... differentiating between an atom and a list is difficult at the Tcl level, but I expect it should be completely clear at the C level! So a kind of lequal should be done in C, just like a kind of dict equal. Testing the equality of dictionaries would be just as problematic, because dictionaries look like lists when their string representations are compared.
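For flat dictionaries, an order-insensitive comparison can be sketched at the Tcl level. This assumes Tcl 8.5 or later for the dict command, and it assumes that values are compared as strings, which is only one possible choice; dictEqual is a hypothetical name:

```tcl
# Two dictionaries are treated as equal when they have the same keys
# and each key maps to a string-equal value; key order is ignored.
proc dictEqual {d1 d2} {
    if {[dict size $d1] != [dict size $d2]} {
        return 0
    }
    dict for {key value} $d1 {
        if {![dict exists $d2 $key]} {
            return 0
        }
        if {[dict get $d2 $key] ne $value} {
            return 0
        }
    }
    return 1
}

puts [dictEqual {a 1 b 2} {b 2 a 1}]   ;# 1, although the strings differ
puts [dictEqual {a 1 b 2} {a 1 b 3}]   ;# 0
```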

Is there a way to test whether something that looks like a list is a dictionary? Like I can test for the existence of an array using "array exists arrayName"? Looking at the dict manpage I see only "dict exists dictionaryValue key ?key ...?". So a test for a dictionary is not possible?
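A value-based test is possible: any well-formed list with an even number of elements can be used as a dictionary. A minimal sketch, assuming Tcl 8.5 or later; isDict is a hypothetical name. It says nothing about the internal representation, only about whether the dict commands would accept the value:

```tcl
# True if the value can be interpreted as a dictionary.
proc isDict {value} {
    expr {![catch {dict size $value}]}
}

puts [isDict {a 1 b 2}]   ;# 1
puts [isDict {a 1 b}]     ;# 0, a key without a value
puts [isDict {}]          ;# 1, the empty dictionary
```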

About the command that just prints something whenever the object type changes ... what about a new trace type "type"? Then we could create a trace on a type change like this:

trace add variable listVar type typeChangeCallBack

I would prefer the possibility to react to type changes over merely having some output that (for example) our MFC application won't show!

NEM: It isn't any easier to test for atom/list difference at the C level; you essentially have to do the same test as in the Tcl version. You could check the "type" field of the Tcl_Obj (as the extension on this page does), but that is very much the wrong thing to do in terms of Tcl's semantics (where the type/internal rep are an optimisation detail and Tcl's semantics are oblivious to them). Equality functions are fairly difficult in general as you want a notion of equality (strictly, equivalence) that depends on a notion of type, and types can be quite complicated. For instance, in our list equality test, we could define it like:

proc lequal {l1 l2} {
    if {[atom? $l1] && [atom? $l2]} {
        # What here?
        expr {$l1 eq $l2}
    } elseif {[atom? $l1]} {
        return 0
    } elseif {[atom? $l2]} {
        return 0
    } else {
        if {[llength $l1] != [llength $l2]} {
            return 0
        } else {
            foreach a $l1 b $l2 {
                if {![lequal $a $b]} {
                    return 0
                }
            }
            return 1
        }
    }
}

The question is, when we reach an atom what equality predicate do we use? String equality? Perhaps our atoms are floating point numbers and we wish to use expr's ==. We could perhaps parameterise lequal by an equality predicate:

proc lequal {l1 l2 {test "string equal"}} {
    if {[atom? $l1] && [atom? $l2]} {
        uplevel #0 [linsert $test end $l1 $l2]
    } else ...
}
proc numEqual {a b} { expr {$a == $b} }
lequal $l1 $l2 numEqual

Which would do a reasonable job. However, what if we have a heterogeneous list (where we wish to treat different elements as being of different types)? Or what if the "type" of our elements overlaps with the representation of lists (for instance, if our elements are dictionaries)? Likewise a dict equal runs into similar problems. Personally, I think the parameterised version given above (that defaults to "string equal") is probably reasonable for most uses (perhaps taking the test predicate as an option). For more complex structures, however, you're still best off writing your own equality predicate that knows the structure (or using tags).
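For reference, one way the parameterised version could be written out in full. This is only a sketch that combines the atom? test and the recursive structure shown earlier on this page; passing the predicate down the recursion is an assumption about the intended behaviour:

```tcl
proc atom? a { expr {[lindex $a 0] eq $a} }

# Recursive list equality; atoms are compared with the command prefix
# given in $test (default: string equal).
proc lequal {l1 l2 {test "string equal"}} {
    set a1 [atom? $l1]
    set a2 [atom? $l2]
    if {$a1 && $a2} {
        return [uplevel #0 [linsert $test end $l1 $l2]]
    }
    if {$a1 || $a2} {
        return 0
    }
    if {[llength $l1] != [llength $l2]} {
        return 0
    }
    foreach a $l1 b $l2 {
        if {![lequal $a $b $test]} {
            return 0
        }
    }
    return 1
}

proc numEqual {a b} { expr {$a == $b} }

puts [lequal {1.0 {2 3}} {1 {2.0 3}}]            ;# 0 with string equality
puts [lequal {1.0 {2 3}} {1 {2.0 3}} numEqual]   ;# 1 with numeric equality
```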

male: About finding the right way to test the atoms ... what about using "string is $class -strict $atom"?

proc lequal {l1 l2} {
    if {[atom? $l1] && [atom? $l2]} {
        # This here?
        if {[string is double -strict $l1] &&
            [string is double -strict $l2]} {
            expr {$l1 == $l2}
        } else {
            expr {$l1 eq $l2}
        }
    } elseif {[atom? $l1]} {
        return 0
    } elseif {[atom? $l2]} {
        return 0
    } else {
        if {[llength $l1] != [llength $l2]} {
            return 0
        } else {
            foreach a $l1 b $l2 {
                if {![lequal $a $b]} {
                    return 0
                }
            }
            return 1
        }
    }
}

Or we don't try to differentiate between numerical and textual atoms and simply use the == operator, because if one of the arguments of the == operator is not numerical, a textual comparison will be done. This will probably be quicker than determining the type of the data to be compared and reacting to it.
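One thing to keep in mind with that approach: == compares numerically whenever both operands look like numbers, so textual atoms that happen to look numeric are treated as equal even when their strings differ. A short sketch:

```tcl
set a 1e2
set b 100
puts [expr {$a == $b}]          ;# 1, compared as numbers
puts [expr {$a eq $b}]          ;# 0, compared as strings
puts [expr {"0x10" == "16"}]    ;# 1, hexadecimal 0x10 is 16
puts [expr {"abc" == "abc"}]    ;# 1, falls back to string comparison
puts [expr {"abc" == "abd"}]    ;# 0
```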