Serializing TclOO objects

In Playing with TclOO, Richard Suchenwirth provided a method to dump the class and variables of an object:

 define oo::object method dump {{pat *}} {
    set res [list class [info object class [self]]]
    foreach i [info object vars [self] $pat] {
        my variable $i
        lappend res $i [set $i]
    }
    set res
 }

This page should explore ways to serialize/unserialize TclOO objects.

My first thought would be to check if the self's variables themselves are objects:

 define oo::object method serialize {{pat *}} {
    set res [list class [info object class [self]]]
    foreach i [info object vars [self] $pat] {
        my variable $i
        if {[info object isa object [set $i]]} {
            set value [[set $i] serialize]
        } else {
            set value [set $i]
        }
        lappend res $i $value
    }
    set res
 }

RS 2009-05-08: This is however dangerous when cycles are in the object graph: say, a genealogy application, where a has father:b and b has children:a ...


Package: serialize 0.1

DKF: I've written a small package to perform serialization of objects. Note that it assumes that it does not need to serialize any classes (and can't deserialize them) that's a project for someone to write in the future.

package require Tcl 8.6

::oo::class create ::oo::serializable {
    method Serialize {{objectsToSerialize {}}} {
        set o [self]
        upvar #1 pendingObjs($o) pendingObjs
        lappend pendingObjs {*}$objectsToSerialize

        set result {}

        set methods {}
        set publics [info object methods $o]
        foreach m [info object methods $o -private] {
            if {![catch {info object definition $o $m} def]} {
                lappend methods method $m [expr {$m in $publics}] $def
            } elseif {![catch {info object forward $o $m} def]} {
                lappend methods forward $m [expr {$m in $publics}] $def
            }
        }
        if {[llength $methods]} {
            dict set result methods $methods
        }

        if {[llength [info object filters $o]]} {
            dict set result filters [info object filters $o]
        }
        if {[llength [info object mixins $o]]} {
            dict set result mixins  [info object mixins $o]
        }
        if {[llength [info object variables $o]]} {
            dict set result variables [info object variables $o]
        }

        dict set result nsvars {}
        foreach v [info vars [self namespace]::*] {
            if {[array exists $v]} {
                set val [array get $v]
                foreach {eName eVal} $val {
                    if {[info object isa object $eVal]} {
                        lappend pendingObjs $eVal
                    }
                }
                dict lappend result nsvars [list \
                    array [namespace tail $v] $val]
            } else {
                set val [set $v]
                if {[info object isa object $val]} {
                    lappend pendingObjs $val
                }
                dict lappend result nsvars [list \
                    scalar [namespace tail $v] $val]
            }
        }

        return $result
    }

    method Deserialize {dict} {
        set o [self]
        if {[dict exists $dict methods]} {
            foreach m [dict get $dict methods] {
                lassign $m t n p d
                ::oo::objdefine $o $t $n {*}$d
                if {$p} {
                    ::oo::objdefine $o export $n
                } else {
                    ::oo::objdefine $o unexport $n
                }
            }
        }

        if {[dict exists $dict filters]} {
            ::oo::objdefine $o filter {*}[dict get $dict filters]
        }
        if {[dict exists $dict mixins]} {
            ::oo::objdefine $o mixin {*}[dict get $dict mixins]
        }
        if {[dict exists $dict variables]} {
            ::oo::objdefine $o variable {*}[dict get $dict variables]
        }

        foreach v [dict get $dict nsvars] {
            lassign $v t n val
            switch $t {
                "array" {
                    array set [my varname $n] $val
                }
                "scalar" {
                    set [my varname $n] $val
                }
            }
        }
    }

    # ----------------------------------------------------------------------

    method Serialize:object {} {
        set o [self]
        upvar #1 objects objects pendingObjs($o) pendingObjs
        set pendingObjs {}
        set objects($o) working

        set result [my Serialize]

        dict set result name $o
        dict set result class [info object class $o]
        dict set result objects {}
        foreach otherObj $pendingObjs {
            if {![info object isa typeof $otherObj ::oo::serializable]} {
                # non-serializable object; skip
                continue
            }
            if {![info exists objects($otherObj)]} {
                dict lappend result objects \
                    [[info object namespace $otherObj]::my Serialize:object]
            }
        }

        set objects($o) done
        return $result
    }

    method Deserialize:object {data} {
        try {
            dict size $data
        } on error {} {
            rename [self] {}
            throw "OO SERIAL FORMAT" \
                "data does not describe serialized objects"
        }

        try {
            set name [dict get $data name]
            if {[info object isa object $name]} {
                error "object already exists"
            }
            rename [self] $name
            ::oo::objdefine [self] class [dict get $data class]
            set objs [dict get $data objects]

            my Deserialize [dict remove $data name class objects]

            foreach otherData [dict get $data objects] {
                [info object namespace [::oo::serializable new]]::my \
                    Deserialize:object $otherData
            }
        } on error {msg opts} {
            rename [self] {}
            dict set opts -message $msg
            return -level 0 -code error -options $opts \
                -errorcode "OO SERIAL FAILURE" \
                "failed to deserialize object \"$name\": $msg"
        }
        return [self]
    }
}

# ----------------------------------------------------------------------

proc ::oo::serialize {object} {
    if {![info object isa object $object]} {
        throw "OO SERIAL USAGE" "\"$object\" is not an object"
    }
    if {![info object isa typeof $object ::oo::serializable]} {
        throw "OO SERIAL USAGE" "\"$object\" is not serializable"
    }
    # Use a coroutine to make some "local global" storage
    coroutine ::oo::SerializerCoroutine apply {{object} {
        array set objects {}
        array set pendingObjs {}
        set data [list TclOO::serializer 0.1 \
                      [[info object namespace $object]::my Serialize:object]]
        lappend data [zlib crc32 $data]
    }} $object
}

proc ::oo::deserialize {data} {
    if {![string is list $data] || [llength $data] < 4} {
        throw "OO SERIAL FORMAT" "data does not describe serialized objects"
    }
    lassign $data tag version content crc
    if {
        $tag ne "TclOO::serializer"
        || ![package vsatisfies 0.1 $version]
        || [llength $data] > 4
        || [zlib crc32 [lrange $data 0 2]] != $crc
    } then {
        throw "OO SERIAL FORMAT" \
            "data does not describe serialized objects or version wrong"
    }

    return [[info object namespace [::oo::serializable new]]::my \
                Deserialize:object $content]
}

package provide TclOO::serializer 0.1

Usage

Declare the classes of objects to be serialized to be subclasses of oo::serializable or at least mix that class into the objects to serialize; multiple inheritance can help here.

Then, just call:

oo::serialize theRootObject

and it will return the serialized form. You can deserialize that string (assuming that the objects have been deleted first) using:

oo::deserialize string

and it will return the recreated root object.

If your objects need more control over the serialization/deserialization, override the Serialize and Deserialize methods; don't forget to call the superclass implementations at some point! Here's a simple example of serializing control:

oo::class create listOfObjs {
    superclass oo::serializable
    variable objlist ;# Holds a list of objects, but serializer package can't figure that out by default 
    # ...
    method Serialize {{objectsToSerialize {}}} {
        # Override so that the list of objects is understood to be part of the object graph
        next [concat $objectsToSerialize $objlist]
    }
}

The Serialize method takes an optional list of objects to serialize (allows traversing the object graph) and returns a dictionary describing the object. The Deserialize method takes a dictionary describing the object and rebuilds the object from it; it's result is unimportant (unless it is an error, in which case it aborts the deserialization of all objects that have not yet successfully deserialized).

Example

This script:

package require TclOO::serializer 0.1
oo::class create foo {
    superclass oo::object oo::serializable
    variable foo
    constructor {{init <unset>}} {
        set foo $init
    }
    method foo {} {
        puts fooooo!!-$foo
    }
    destructor {
        if {[info object isa object $foo]} {
            $foo destroy
        }
    }
}
foo create f1 [foo create f2]
set ser [oo::serialize f1]
puts $ser
f1 destroy
puts [info object isa object f1],[info object isa object f2]
oo::deserialize $ser
f1 foo
f2 foo

produces this output:

TclOO::serializer 0.1 {nsvars {{scalar foo ::f2}} name ::f1 class ::foo objects {{nsvars {{scalar foo <unset>}} name ::f2 class ::foo objects {}}}} 2840851482
0,0
fooooo!!-::f2
fooooo!!-<unset>

glennj 20090519 -- that code gives me an error:

 % set ser [oo::serialize f1]
 unknown or ambiguous subcommand "namespace": must be class, definition, filters, forward, isa, methods, mixins, variables, or vars

I can reproduce like this:

 % info pa
 8.6b1.1
 % package require TclOO
 0.6.1
 % info object namespace f1
 unknown or ambiguous subcommand "namespace": must be class, definition, filters, forward, isa, methods, mixins, variables, or vars

Is there a newer OO package?

DKF: Yes, there is. To be exact, you need the CVS HEAD of Tcl (i.e., 8.6b1.1) since I added a little extra introspection to TclOO. (I always knew that TclOO was incomplete, but I didn't know where; which things were missing but scriptable — serialization is an example of this — and which just had to be there. Introspection of the namespace will allow a number of features to be made much simpler; serialization is only one of them.)


George Petasis: I have created a modified serialisation approach, based on Donal's code. Instead of restoring the object in memory, this approach re-creates the objects (i.e. calling their constructors), and then the variable values are overwritten. This targets code that the variables hold Tk widgets, files, databases, etc. that need to be restored, but without having to write code specific for serialisation.

namespace eval ELEP { namespace eval Base {} }

package require TclOO    ;# TclOO adds object-oriented facilities to Tcl.

oo::class create ELEP::Base::serialisable {

  # constructor {args} {
  # };# constructor

  ##########################################################################
  ## Method: Internal:Serialise:ConstructorArguments
  ##########################################################################
  method Internal:Serialise:ConstructorArguments {args} {
    my variable __serialisable_constructor_arguments
    set __serialisable_constructor_arguments $args
  };# Internal:Serialise:ConstructorArguments

  ##########################################################################
  ## Method: Internal:Serialise:UnserialisableVariables
  ##########################################################################
  method Internal:Serialise:UnserialisableVariables {args} {
    my variable __serialisable_unserialisable_variables
    set __serialisable_unserialisable_variables $args
  };# Internal:Serialise:UnserialisableVariables

  ##########################################################################
  ## Method: Internal:Serialise
  ##########################################################################
  method Internal:Serialise {{objectsToSerialise {}}} {
    my variable __serialisable_unserialisable_variables
    set o [self]
    upvar #1 pendingObjs($o) pendingObjs
    lappend pendingObjs {*}$objectsToSerialise

    set result [dict create]

    ## Ensure that variable __serialisable_unserialisable_variables exists
    ## (since we don't have a constructor or other way to initialise it if the
    ## caller has not called Internal:Serialise:UnserialisableVariables...
    if {[catch {set __serialisable_unserialisable_variables}]} {
      set __serialisable_unserialisable_variables {}
    }

    dict set result nsvars {}
    foreach v [info vars [self namespace]::*] {
      set v_tail [namespace tail $v]
      if {$v_tail in $__serialisable_unserialisable_variables} {continue}
      ## Is it an array variable?
      if {[array exists $v]} {
        set val [array get $v]
        ## Examine all values, and if any value is an object,
        ## add the object to the pending for serialisation ones...
        foreach {eName eVal} $val {
          if {[info object isa object $eVal]} {
            lappend pendingObjs $eVal
          }
        }
        dict lappend result nsvars [list array $v_tail $val]
      } else {
        ## A scalar variable...
        if {[catch {set $v} val]} {
          ## A variable defined, but without a value: skip it...
          continue
        }
        ## is the value on object?
        if {[info object isa object $val]} {
          lappend pendingObjs $val
        }
        dict lappend result nsvars [list scalar $v_tail $val]
      }
    }

    return $result
  };# Internal:Serialise

  ##########################################################################
  ## Method: Internal:Deserialise
  ##########################################################################
  method Internal:Deserialise {dict} {
    foreach v [dict get $dict nsvars] {
      lassign $v t n val
      switch $t {
        "array" {
          array set [my varname $n] $val
        }
        "scalar" {
          set [my varname $n] $val
        }
      }
    }
  };# Internal:Deserialise

  ##########################################################################
  ## Method: Internal:Serialise:object
  ##########################################################################
  method Internal:Serialise:object {} {
    set o [self]
    upvar #1 objects objects pendingObjs($o) pendingObjs \
             serialisedObjects serialisedObjects
    set pendingObjs {}
    set objects($o) working
    lappend serialisedObjects $o

    set result [my Internal:Serialise]
    
    my variable __serialisable_constructor_arguments
    if {[catch {set __serialisable_constructor_arguments} v]} {
      set v {}
    }
    dict set result args $v

    dict set result name $o
    dict set result class [info object class $o]
    dict set result objects {}
    foreach otherObj $pendingObjs {
      if {![info object isa typeof $otherObj ELEP::Base::serialisable]} {
        # non-serializable object; skip
        continue
      }
      if {![info exists objects($otherObj)] &&
            $otherObj ni $serialisedObjects} {
        dict lappend result objects \
            [[info object namespace $otherObj]::my Internal:Serialise:object]
      }
    }

    set objects($o) done
    return $result
  };# Internal:Serialise:object

  ##########################################################################
  ## Method: Internal:Deserialise:object
  ##########################################################################
  method Internal:Deserialise:object {data} {
    try {
      dict size $data
    } on error {} {
      rename [self] {}
      throw "OO SERIAL FORMAT" \
          "data does not describe serialised objects"
    }

    try {
      set name [dict get $data name]
      # if {[info object isa object $name]} {
      #   error "object already exists"
      # }
      # rename [self] $name
      # ::oo::objdefine [self] class [dict get $data class]
      # set objs [dict get $data objects]

      # my Internal:Deserialise [dict remove $data name class objects]
      my Internal:Deserialise $data

      foreach otherData [dict get $data objects] {
        set n [dict get $otherData name]
        set c [dict get $otherData class]
        set a [dict get $otherData args]
        ## Create the object only if does not already exists!
        if {[info object isa object $n]} {
          if {![info object isa typeof $n $c]} {
            error "object \"$n\" not of class $c"
          }
        } else {
          $c create $n {*}$a
        }
        [info object namespace $n]::my \
              Internal:Deserialise:object $otherData
      }
    } on error {msg opts} {
        rename [self] {}
        dict set opts -message $msg
        return -level 0 -code error -options $opts \
            -errorcode "OO SERIAL FAILURE" \
            "failed to deserialise object \"$name\": $msg"
    }
    return [self]
  };# Internal:Deserialise:object

};# class ELEP::Base::serialisable

##########################################################################
## Procedure: ELEP::Base::serialise
##########################################################################
proc ELEP::Base::serialise {object} {
  if {![info object isa object $object]} {
    throw "OO SERIAL USAGE" "\"$object\" is not an object"
  }
  if {![info object isa typeof $object ::ELEP::Base::serialisable]} {
    throw "OO SERIAL USAGE" "\"$object\" is not serializable"
  }
  # Use a coroutine to make some "local global" storage
  coroutine ::ELEP::Base::SerialiserCoroutine apply {{object} {
    array set objects {}
    array set pendingObjs {}
    set serialisedObjects {}
    set data [list ELEP::Base::serialisable 1.0 \
        [[info object namespace $object]::my Internal:Serialise:object]]
    lappend data [zlib crc32 $data]
  }} $object
};# ELEP::Base::serialise

##########################################################################
## Procedure: ELEP::Base::deserialise
##########################################################################
proc ELEP::Base::deserialise {data} {
  if {![string is list $data] || [llength $data] < 4} {
    throw "OO SERIAL FORMAT" "data does not describe serialised objects"
  }
  lassign $data tag version content crc
  if { $tag ne "ELEP::Base::serialisable" ||
       ![package vsatisfies 1.0 $version] ||
       [llength $data] > 4 ||
       [zlib crc32 [lrange $data 0 2]] != $crc } {
    throw "OO SERIAL FORMAT" \
          "data does not describe serialised objects or version wrong"
  }

  set n [dict get $content name]
  set c [dict get $content class]
  set a [dict get $content args]

  $c create $n {*}$a
  return [[info object namespace $n]::my Internal:Deserialise:object $content]
};# ELEP::Base::deserialise

package provide ELEP::Base::serialisable 1.0

It has a different package name (sorry :-)), and some test code is here:

package require Tk
package require ELEP::Base::serialisable
catch {console show}

oo::class create foo {
  superclass ELEP::Base::serialisable
  
  constructor {widget_name {file_name {}}} {
    my variable name fd child
    ## Create a widget...
    set name [toplevel $widget_name]
    if {$file_name ne ""} {
      ## Open the file...
      set fd [open $file_name]
      ## Create another object...
      set child [foo create [self]:child $widget_name.child]
    }

    ## Important: save constructor arguments!
    my Internal:Serialise:ConstructorArguments $widget_name $file_name
    ## Important 2: declare variables whose serialisation is meaningless!
    my Internal:Serialise:UnserialisableVariables fd
  };# constructor

  method fd {} {
    my variable fd
    catch {set fd} result
    puts "fd: $result"
  };# fd

  destructor {
    my variable name fd child
    catch {$child destroy}
    catch {destroy $name}
    catch {close $fd}
  };# destructor

};# class foo

pack [ttk::button .deserialise -command deserialise -text Deserialise] \
  -fill both -expand 1

proc deserialise {} {
  set ser [ELEP::Base::serialise f1]
  f1 destroy
  puts $ser; update
  ELEP::Base::deserialise $ser
  f1 fd
}
foo create f1 .foo1 [info script]
f1 fd

George Petasis: I think that both approaches need to filter what is added to the list of objects, so as not to serialise the same object multiple times...

DKF: This is why in my code, you don't serialize objects directly, but rather ask the serialization library to do it for you (at some point). It ensures that object graphs are serialized once instead of over and over…

George Petasis: I am not sure I understand. I modified your example to:

oo::class create foo {
    superclass oo::object oo::serializable
    variable foo
    constructor {{init <unset>}} {
        set foo $init
    }
    method foo {} {
        puts fooooo!!-$foo
    }
    method change {init} {
      set foo $init
    }
    destructor {
        if {[info object isa object $foo]} {
            $foo destroy
        }
    }
}
foo create f1 [foo create f2]
f2 change f1
set ser [oo::serialize f1]
puts $ser
f1 destroy
puts [info object isa object f1],[info object isa object f2]
oo::deserialize $ser
f1 foo
f2 foo

And I get an error. But the serialisation is:

TclOO::serializer 0.1 {nsvars {{scalar foo ::f2}} name ::f1 class ::foo objects {{nsvars {{scalar foo f1}} name ::f2 class ::foo objects {
{nsvars {{scalar foo ::f2}} name ::f1 class ::foo objects {}}}}}} 3626470617

What I am saying, is that both approaches need a check so as not to serialise multiple times the same object (in this case f1).

DKF: That's because you've not serialized both objects; the serialization code isn't smart enough to figure that out for you magically. (It's pretty dumb really.) You need to add another method, like this:

    method Serialize {{objs ""}} {
        if {[info object isa object $foo]} {
            next [list {*}$objs $foo]
        }  else {
            next $objs
        }
    }

(You can simplify a lot if you know you're definitely holding an object or if you know for sure you're being subclassed.)

Note that you don't need a custom Deserialize method in this case; the default behaviour is fine.

George Petasis: In my approach there is no need for a custom Serialise method, you can just call UnserialisableVariables with the variable name. In general I am against to the need to write code specific for serialisation, unless it is absolutely necessary to do so. And in this case, it is just a simple check in serialise: if it is an object already serialised, don't serialise it again. I think serialisation code needs to be a little clever, if this will prevent the programmer from writing extra code...

George Petasis: I have modified my approach to serialise objects only once automatically.

George Petasis: Is there a way to add a trace on the constructor, and automatically record the arguments given to the constructor?