Version 7 of Fitting a new Rear-End to Arrays

Updated 2004-05-13 11:17:06

This page written by DKF


What is good about Tcl's arrays? Their compact syntax for common operations.

What is not so good about Tcl's arrays? You have to use a hash-table for the mapping from keys to values.

Would would make things extra cool? If we could take arrays and put a new back-end on them based on lists, dictionaries, or even something more exotic. There have been attempts to do this in the past (BLT's vectors spring to mind here)

Why do this to arrays instead of putting magic conversion on values? We'd have a proper named location to store the metadata describing how the array is implemented.

What is going to be hard? Well, traces are definitely tricky, as is upvar and friends, as they both (currently) require that your mapping mechanism hands out references to a Var structure, and that's not something that you can really map nicely onto a Tcl_Obj as it is an updatable structure.


What Operations are Needed?

get
Retrieve a value from the mapping given a key. This operation should succeed if the key was one returned by the list keys operation below; behaviour for other keys is undefined in general.
set
Update or insert a value into the mapping given a key
unset
Update the mapping so as to remove an element
list keys
Get the list of keys that map to data values. Note that an array might be permitted to support other keys with "magic" names, but this operation should only list the keys that map in a straight-forward fashion.
serialize
Convert entire array to a string (preferably in the order listed by the list keys operation.)
deserialize
Convert string to array

[...]


RS Don't arrays (and dicts) represent a string -> value mapping, while vectors (and lists) do (0<=int<maxint) -> value?

DKF: So what if vectors/lists have a restricted language of keys? :^)

13may04 jcw - Cool! The serialize/deserialize operations are "slightly less primitive operations" IMO, in that they can be implemented with the first four. The get/set/unset/list combo is the core. They offer all sorts of interesting new options, similar to Perl's "tie". My first goal would probably be to map this to hashed Metakit views, i.e. persistent memory-mapped arrays. Gdbm is another obvious candidate. More advanced uses may need some more machinery, but I think most of that can be done in Tcl.

I'd like to describe an idea which unifies keyed access (arrays), indexed access (lists), scalars, and more - but let me just point to [L1 ] and [L2 ] for now.