The iterator design pattern

Marco Maggi The iterator design pattern: to initialise an opaque structure with the data required to access a sequence of items then enter a loop, processing items until the end of the sequence.

The argument named "selfvar" in the following descriptions is the name of a variable used to hold the iterator data. It can be an automatic variable (local to a procedure) or a static variable (a global or namespace variable); the iterators that act upon other iterators require the lower level iterator variables to be static. The usage of static variables for both the iterator variable and the data the iterator acts upon, allows us to hand the iterator to a script bound to an event.

Typically, the user's code will do:

 iterator_initialisation_procedure iterator ... 
while { [iterator_next iterator out] } {
    # do something with $out
}

there's no need to invoke a destructor for the iterator, but since TCL does reference counting on its internal objects: it's better to unset the iterator variable so that the resources can be freed.

The iterator constructor is specific for a type of iteration, the procedure that advances the iteration can act on any iterator.

The iterators presented here cannot fail. It's easy to write a couple of iterator procedures that store an error code in the iterator variable, so that it can be checked that way:

 iterator_initialisation_procedure iterator ... 
while { [iterator_next iterator out] } {
    # do something with $out
}
if { [iterator_specific_error iterator] } {
    # handle the error
}

when the iteration fails: [iterator_next] can return zero and we check for the error. Raising an error with [error] or [return -code error] is also good.


[iterator_next selfvar outvar ?...?] - Advances an iteration to the next element. Returns true if a new element was extracted from the sequence, false otherwise. In the former case the element is stored in the variable "outvar" in the scope of the caller. It's possible for a specific iterator to require more than a single output variable.

[iterator_list selfvar list] - Initialises an iteration over a "list". The iterator will visit all the elements from the first to the last.

[iterator_array selfvar arrayvar] - Initialises an iteration over an array. "arrayvar" is the name of the array variable; the array must be a static variable if we use this iterator as source for a higher level iterator. The iterator will visit all the keys in the array in no predefined order.

If we don't go till the end of the iteration, a resource leakage will result: this is because the builtin TCL array iterator requires the invocation of [array donesearch], this is done only when the iteration ends.

Its always possible to iterate over the element of an array extracting the keys with [array names] and treating them as a common list.

The following iteration procedures will join two iterators to provide some sort of composition between two sequences of elements. The sequences provided by the two input iterators must be ordered: the elements must be visited in such a way that:

string compare $previous $current

could return "-1". At present, no sophisticated comparison is possible (the full power of [lsort] is not used). We have to be careful with this because, for example, the string "8" is greater than the string "10", while the string "08" is lesser than the string "10".

[iterator_intersection selfvar iter1var iter2var] - Initialises an intersection iterator between the sequences of elements provided by the two already initialised iterators "iter1var" and "iter2var". The resulting iterator will visit all the elements that are common to both sequences.

[iterator_union selfvar iter1var iter2var] - Initialises a union iterator between the sequences of elements provided by the two already initialised iterators "iter1var" and "iter2var". The resulting iterator will visit all the elements from both the sequences, sorted.

[iterator_difference selfvar iter1var iter2var] - Initialises a difference iterator between the sequences of elements provided by the two already initialised iterators "iter1var" and "iter2var". The resulting iterator will visit all the elements from "iter1var" that are not present in "iter2var".

[iterator_complintersect selfvar iter1var iter2var] - Initialises a complementary intersection iterator between the sequences of elements provided by the two already initialised iterators "iter1var" and "iter2var". The resulting iterator will visit all the elements from the two sequences that are not common to both of them.

It is possible to compose iterators to realise complex set operations, for example: the following code intersects an iterator with a union iterator:

iterator_list ::iter1 { 0 2 4 6 8 }
iterator_list ::iter2 { 1 3 5 7 9 }
iterator_list ::iter3 { 4 5.5 6 }

iterator_union ::iter4 ::iter1 ::iter2
iterator_intersection ::iter5 ::iter4 ::iter3

set elms {}
while { [iterator_next ::iter5 out } {
    lappend elms $out
}
'# elms -> { 4 6 }
proc iterator_next { selfvar outvar args } {
   upvar       $selfvar self $outvar out errno errno errms errms
   foreach v $args { upvar $v $v }
   return [eval {[lindex $self 0] self out} $args]
}

proc iterator_list { selfvar lst } {
   upvar       $selfvar self
   set self [list [namespace current]::_iterator_list_next $lst -1]
   return
}

proc _iterator_list_next { selfvar outvar args } {
   upvar       $selfvar self $outvar out

   set lst     [lindex $self 1]
   set idx     [lindex $self 2]

   incr idx
   if { $idx < [llength $lst] } {
       set out [lindex $lst $idx]
       lset self 2 $idx
       return 1
   } else {
       return 0
   }
}

proc iterator_array { selfvar arrayvar } {
   upvar       $selfvar self $arrayvar array
   set self [list [namespace current]::_iterator_array_next \
           $arrayvar [array startsearch array]]
   return
}

proc _iterator_array_next { selfvar outvar args } {
   upvar       $selfvar self $outvar out


   set arrayvar        [lindex $self 1]
   set searchid        [lindex $self 2]

   upvar       $arrayvar array

   if { [set e [array anymore array $searchid]] } {
       set out [array nextelement array $searchid]
   } else {
       array donesearch array $searchid
   }
   return $e
}

proc iterator_intersection { selfvar iter1var iter2var } {
   upvar       $selfvar self
   set self [list [namespace current]::_iterator_intersection_next \
           $iter1var $iter2var]
   return
}

proc _iterator_intersection_next { selfvar outvar args } {
   upvar       $selfvar self $outvar out
   upvar       [lindex $self 1] iter1 [lindex $self 2] iter2

   set one [iterator_next iter1 val1]
   set two [iterator_next iter2 val2]

   while { $one && $two } {
       set e [string compare $val1 $val2]
       if { $e == 0 } {
           set out $val1
           return 1
       } elseif { $e < 0 } {
           set one [iterator_next iter1 val1]
       } else {
           set two [iterator_next iter2 val2]
       }
   }
   return 0
}

proc iterator_union { selfvar iter1var iter2var } {
   upvar       $selfvar self
   set self [list [namespace current]::_iterator_union_next \
           $iter1var $iter2var 0 {}]
   return
}

proc _iterator_union_next { selfvar outvar args } {
   upvar       $selfvar self $outvar out
   upvar       [lindex $self 1] iter1 [lindex $self 2] iter2


   switch [lindex $self 3] {
       1       {
           set one [iterator_next iter1 val1]
           set two 1
           set val2 [lindex $self 4]
       }
       2       {
           set one 1
           set val1 [lindex $self 4]
           set two [iterator_next iter2 val2]
       }   
       3       {
           set one [iterator_next iter1 val1]
           set two 0
       }
       4       {
           set one 0
           set two [iterator_next iter2 val2]
       }   
       default {
           set one [iterator_next iter1 val1]
           set two [iterator_next iter2 val2]
       }
   }

   if { $one && $two } {
       set e [string compare $val1 $val2]
       if { $e <= 0 } {
           lset self 3 1
           set out $val1
           lset self 4 $val2
       } else {
           lset self 3 2
           set out $val2
           lset self 4 $val1
       }
       return 1
   } elseif { $one } {
       lset self 3 3
       set out $val1
       return 1
   } elseif { $two } {
       lset self 3 4
       set out $val2
       return 1
   } else {
       return 0
   }
}

proc iterator_difference { selfvar iter1var iter2var } {
   upvar       $selfvar self
   set self [list [namespace current]::_iterator_difference_next \
           $iter1var $iter2var 0 {}]
   return
}

proc _iterator_difference_next { selfvar outvar args } {
   upvar       $selfvar self $outvar out
   upvar       [lindex $self 1] iter1 [lindex $self 2] iter2


   # If no elements left end the iteration.

   if { ! [iterator_next iter1 out] } { return 0 }

   # Check the state of the iteration.

   set val {}
   switch [lindex $self 3] {
       0 {
           # This is the first iteration. Get a value from the second
           # sequence and update the state.
           iterator_next iter2 val
           lset self 3 1
       }
       1 {
           # The iteration is going on. Retrieve the last value extracted
           # from sequence two.
           set val [lindex $self 4]
       }
       2 {
           # Sequence two ended in a previous step.
           return 1
       }
   }

   # If the new value from sequence one is less than the old value
   # from sequence two we return it, else we have to advance the
   # first iterator.

   if { $out < $val } {
       lset self 4 $val
       return 1
   }

   # If we're here the new value from sequence one is equal or greater
   # than the old value from sequence two.

   while {1} {
       while { $out == $val } {
           if { ! [iterator_next iter1 out] } {
               return 0
           }
       }

       while { $val < $out } {
           if { ! [iterator_next iter2 val] } {
               lset self 3 2
               return 1
           }
       }
       if { $out < $val } {
           lset self 4 $val
           return 1
       }
   }
}

proc iterator_complintersect { selfvar iter1var iter2var } {
   upvar       $selfvar self
   set self [list [namespace current]::_iterator_complintersect_next \
           $iter1var $iter2var 0 {}]
   return
}

proc _iterator_complintersect_next { selfvar outvar args } {
   upvar       $selfvar self $outvar out
   upvar       [lindex $self 1] iter1 [lindex $self 2] iter2


   set val1 {}
   set val2 {}
   switch [lindex $self 3] {
       0       {
           # This is the first iteration.
           set one [iterator_next iter1 val1]
           set two [iterator_next iter2 val2]
       }
       1       {
           # In the previous iteration the selected value was "val1".
           set one [iterator_next iter1 val1]
           set two 1
           set val2 [lindex $self 4]
       }
       2       {
           # In the previous iteration the selected value was "val2".
           set one 1
           set val1 [lindex $self 4]
           set two [iterator_next iter2 val2]
       }
       3       {
           # The second iteration is over.
           set one [iterator_next iter1 val1]
           set two 0
           set val2 [lindex $self 4]
       }
       4       {
           # The first iteration is over.
           set one 0
           set val1 [lindex $self 4]
           set two [iterator_next iter2 val2]
       }
   }

   if { $one && $two } {
       set e [string compare $val1 $val2]
       while { $e == 0 } {
           set one [iterator_next iter1 val1]
           set two [iterator_next iter2 val2]
           if { (! $one) && (! $two) } {
               return 0
           } elseif { (! $one) } {
               set out $val2
               lset self 3 4
               return 1
           } elseif { (! $two) } {
               set out $val1
               lset self 3 3
               return 1
           }
           set e [string compare $val1 $val2]
       }
       if { $e < 0 } {
           set out $val1
           lset self 3 1
           lset self 4 $val2
       } else {
           set out $val2
           lset self 3 2
           lset self 4 $val1
       }
       return 1
   } elseif { $one } {
       set out $val1
       lset self 3 3
       lset self 4 $val2
       return 1
   } elseif { $two } {
       set out $val2
       lset self 3 4
       lset self 4 $val1
       return 1
   } else {
       return 0
   }
}