Simple Records

NEM 2008-02-16: Here is a simple way of constructing records. A record, like a dict, is simply a mapping from field names to values. In a typed language, the key difference would be that a dictionary typically has values of the same type, whereas the types of the different fields in a record (or struct) can be different. Obviously in Tcl this doesn't matter. The other difference is one of typical implementation. A dict is implemented as a hashtable, whereas a record is usually implemented as a contiguous block of memory and fields are retrieved by fixed offsets into this memory. In Tcl we can get this behaviour using lists. For instance, we might represent a "person" record by a list of {name age address} and then write custom getter procs:

 proc name p { lindex $p 0 }
 proc age p { lindex $p 1 }
 proc address p { lindex $p 2 }

So what is the advantage of this over a dict? Well, firstly it is a more compact representation, and secondly it is potentially faster as it avoids a hash lookup. However, in practice the performance is rarely going to be better (even if implemented in C), as dict caches hash lookups. Still, the compact representation is nice for some applications (e.g. if you are storing lots of these), and the method of using procs as field-getters means that they are first-class and can be easily used with higher-order functions, such as map and fold. For example:

 map name $people

would return a list of the names of every person record in $people. This is a powerful technique.

Rather than write these getter methods (not to mention setters) manually, it is easy enough to generate them automatically. Here we present one fairly efficient means of doing that using Tcl 8.5 features. In particular, for each record type we construct a namespace ensemble for each field (i.e. [person name $record], [person age $record] etc), and we construct specialised lambda expressions to act as these getter/setter methods. No namespace is actually created for these commands, they live entirely within the ensemble -map. The interface supported is the following:

Definition
record define type fields
Constructor
type -field value -field value ...
Getter
type field
Setter
type field value

The records are values just like dicts, so the setter methods return a new record with the change made. There is no support for incr or anything similar. The code is very simple though:

Code

namespace eval record {
    namespace export define construct
    namespace ensemble create

    proc define {name fields} {
        set map [dict create]
        dict set map create [namespace code [list construct $name $fields]]
        set index -1
        foreach field $fields {
            dict set map $field [lambda {record args} [format {
                if {[llength $args] == 0} {
                    lindex $record %d
                } else {
                    lset record %d [lindex $args 0]
                }
            } [incr index] $index]]
        }
        set cmd [list namespace ensemble create -command $name -map $map \
                -unknown [lambda {type op args} { list $type create $op }]]
        uplevel 1 $cmd
    }
    proc construct {type fields args} {
        set self [list]
        foreach field $fields { lappend self [dict get $args -$field] }
        return $self
    }
    proc lambda {params body} { list ::apply [list $params $body] }
}

Example

record define person {name age address}
set neil [person -age 27 -name "Neil Madden" -address "Somewhere"]
# Getters
puts "Neil = $neil"
puts "Name: [person name $neil]"
puts "Age : [person age $neil]"
puts "Addr: [person address $neil]"
# Setters
set neil [person age $neil 28]
puts "Age : [person age $neil]"

This is itself is quite nice, and pleasingly easy to construct. The performance is pretty good too -- within a gnat's crotchet of dict performance (some rough timing figures later). We can easily sugar an object-like interface on top of this, TOOT-style:

proc despatch {type self args} {
    if {[llength $args]} {
        uplevel 1 [list $type [lindex $args 0] $self {*}[lrange $args 1 end]]
    } else {
        return $self
    }
}
proc let {name = cmd args} { interp alias {} $name {} {*}$cmd {*}$args }
let person: = despatch person
let neil = person: $neil
puts "Neil = [neil]"
puts "Name: [neil name]"
puts "Age : [neil age]"
puts "Addr: [neil address]"
# Setter
let neil = person: [neil age 27]
puts "Age : [neil age]"

This interface is slower by a constant factor due to the extra despatch complexity, but still performs reasonably well, and is more convenient for long-lived records (lack of proc-local commands/aliases makes it unsuitable for temporary usage).

Performance

A quick look at comparative performance of the basic field lookup:

# Timing tests
puts "Get    : [time { person name $neil } 1000]"
puts "GetObj : [time { neil name } 1000]"
# Dict comparison
set d [dict create name "Neil Madden" age 27 address unknown]
puts "Dict   : [time { dict get $d name } 1000]"
# Results on my intel MacBook:
#Get    : 1.5724090000000002 microseconds per iteration
#GetObj : 5.086719 microseconds per iteration
#Dict   : 0.552626 microseconds per iteration