Updated 2011-10-06 12:13:48 by RLE

Created on 2003-06-06

Howdy y'all, it's Rohan Pall here, and here is some code I use to glue diff, patch, and md5sum with Snit.

I'm on Linux Mandrake 9.1, but it should work on any Unix that has compatible diff, patch, and md5sum executables. It should be fairly easy to get working on Windows if you have diff.exe, patch.exe, and md5sum.exe, though I have not tested it there.

'gunkata' is a namespace that has two procs that I use a lot, to read and write binary files. Those two procs, readb and writeb, are exported from the namespace and imported into the global namespace.

'cleric' is a snit type that has the useful methods md5_file, md5_data, patch_data and diff_data. Each cleric instance creates its own temporary directory, which is deleted when that instance is destroyed.

  package require Tk 8.4
  package require snit 0.81

  namespace eval gunkata {
    namespace export *
    # Read a whole file as binary data.
    proc readb {fn} {
      set f [open $fn]
      fconfigure $f -translation binary
      set content [read $f]
      close $f
      return $content
    }
    # Write binary data to a file, replacing any existing content.
    proc writeb {fn c} {
      set f [open $fn w]
      fconfigure $f -translation binary
      puts -nonewline $f $c
      close $f
    }
  }
  namespace import gunkata::*

  snit::type cleric {
    variable tmpdir
    constructor {args} {
      #puts "  $self breathes (type $type)"
      set tmpdir [file join \
                   [file dirname [info script]] \
                   cleric_tmp_[pid]_[clock seconds]_[clock clicks] ]
      file mkdir $tmpdir
      $self configurelist $args
    }
    destructor {
      #puts "  $self dies (type $type)"
      file delete -force $tmpdir
    }
    # Create a fresh scratch directory under this instance's tmpdir.
    method new_sub_tmpdir {} {
      set sub_tmpdir [file join $tmpdir "tmp_[clock seconds]_[clock clicks]"]
      file mkdir $sub_tmpdir ; return $sub_tmpdir
    }
    method md5_file {fn} {return [lindex [exec md5sum $fn] 0]}
    method md5_data {data} {
      set sub_tmpdir [$self new_sub_tmpdir]
      set data_file [file join $sub_tmpdir data]
      writeb $data_file $data
      set md5 [$self md5_file $data_file]
      file delete -force $sub_tmpdir
      return $md5
    }
    method patch_data {old_data diff_data} {
      set sub_tmpdir [$self new_sub_tmpdir]
      writeb [file join $sub_tmpdir old] $old_data
      writeb [file join $sub_tmpdir diff] $diff_data
      set working_dir [pwd]
      cd $sub_tmpdir ; catch {exec patch -o new old diff} ; cd $working_dir
      set new_data [readb [file join $sub_tmpdir new]]
      file delete -force $sub_tmpdir
      return $new_data
    }
    # You get unified diff_data back, but it is not known whether patch
    # can use it to rebuild new_data correctly.
    # This method is useless if you care about actually using the diff to
    # recreate the new data; use diff_data for that.
    method diff_data_without_check {old_data new_data} {
      set sub_tmpdir [$self new_sub_tmpdir]
      writeb [file join $sub_tmpdir old] $old_data
      writeb [file join $sub_tmpdir new] $new_data
      set working_dir [pwd]
      # diff exits non-zero when the inputs differ, hence the catch.
      cd $sub_tmpdir ; catch {exec diff -u old new > diff} ; cd $working_dir
      set diff_data [readb [file join $sub_tmpdir diff]]
      file delete -force $sub_tmpdir
      return $diff_data
    }
    # You get unified diff_data that is known to generate the correct data.
    method diff_data {old_data new_data} {
      set new_md5 [$self md5_data $new_data]
      set new_size [string length $new_data]
      #puts "$new_md5 -- $new_size"
      set diff_data [$self diff_data_without_check $old_data $new_data]
      set gen_data [$self patch_data $old_data $diff_data]
      set gen_md5 [$self md5_data $gen_data]
      set gen_size [string length $gen_data]
      #puts "$gen_md5 -- $gen_size"
      if {($new_md5 eq $gen_md5) && ($new_size == $gen_size)} {
        return $diff_data
      } else {
        return -code error "could not recreate correct data"
      }
    }
  }
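
A short usage sketch may help show how the pieces fit together. It assumes the gunkata and cleric definitions above have already been evaluated in the same interpreter and that diff, patch and md5sum are on the PATH. The instance name c1 and the sample strings are made up for illustration.

```tcl
# Assumption: gunkata and cleric (from the code above) are already defined,
# and diff, patch and md5sum are available on the PATH.
cleric c1

set old "line one\nline two\nline three\n"
set new "line one\nline 2\nline three\nline four\n"

# diff_data verifies that the diff really rebuilds $new before returning it.
set d [c1 diff_data $old $new]

# Apply the diff to the old data to get the new data back.
set rebuilt [c1 patch_data $old $d]
puts [expr {$rebuilt eq $new}]

# md5 of in-memory data, computed via a temporary file and md5sum.
puts [c1 md5_data "hello"]

# Destroying the instance removes its temporary directory.
c1 destroy
```

Destroying the instance when you are done matters here, since that is the only point at which the per-instance temporary directory is cleaned up.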

Why reinvent the md5, etc. code when Tcllib, which snit ships in, already includes md5, etc.? Just curious.
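
For what it's worth, the md5 part could probably be done without md5sum, along these lines. This is only a sketch, assuming version 2.x of the Tcllib md5 package is installed; the proc names md5_data_tcllib and md5_file_tcllib are hypothetical stand-ins for the cleric methods, not part of the original code. It does not touch diff or patch, which still run as external programs.

```tcl
package require md5 2   ;# Tcllib md5 package, 2.x API (assumed installed)

# Hypothetical pure-Tcl counterpart of cleric's md5_data.
proc md5_data_tcllib {data} {
    # -hex asks for a hex string; lower-case it to match md5sum's output.
    return [string tolower [md5::md5 -hex $data]]
}

# Hypothetical pure-Tcl counterpart of cleric's md5_file.
proc md5_file_tcllib {fn} {
    return [string tolower [md5::md5 -hex -file $fn]]
}

puts [md5_data_tcllib "hello"]   ;# 5d41402abc4b2a76b9719d911017c592
```

One trade-off to keep in mind: the pure-Tcl implementation avoids the temporary file and the external process, but it may be slower than md5sum on large inputs.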