partial evaluation

NEM 25July2005: Partial evaluation is the technique of evaluating some of an expression now, but leaving the rest for a later time. This is quite a fundamental concept in Computer Science. It separates the static (what is known now) from the dynamic (what we hope to know later). To understand what a partial evaluator is, it is necessary to have a model of what evaluation is (obviously). Rather than use Tcl's eval as a base we will use the notion of evaluation from the lambda calculus, which is even simpler than Tcl. In the lambda calculus you have lambda-terms, which are simple functions of one variable, which are written something like:

λx.2x
λy.1+y

(The λ symbol is the greek letter "lambda", if my browser doesn't mangle the encoding). The first function takes a number and multiplies it by 2, the second takes a number and adds 1 to it. Evaluation is done in the lambda calculus solely by function application (known as β-reduction):

(λx.2x) 4 = 8
(λy.1+y) 12 = 13

Here we also allow general arithmetic operations like "+", and we have introduced the natural numbers. This is for clarity only; you can build both up directly from just lambda expressions, but we won't go into that here. More complex functions can be created by nesting functions inside each other, e.g. a general adder of two numbers can be created by:

λx.(λy.x+y)

Which is a function which takes a number (x), and returns a function which takes another number and then returns the result of adding the two together. Incidentally, this arrangement of functions which return functions is called currying after Haskell Curry (see Hot Curry). In order to calculate the result of the whole expression, we need to use function application twice:

((λx.(λy.x+y)) 12) 14 = 26

You can usually leave out the brackets and just write:

λx.λy.x+y 12 14

One of the interesting things to notice about this, is that you don't have to supply all the arguments at once. You could just supply the first argument (x), and leave y for later. For example:

λx.λy.x+y 12 = λy.12+y

This is the idea of partial application, which can be generalised to the idea of supplying some of the arguments of a function now, while leaving some until later. In the lambda calculus, function application is the only form of evaluation, and that boils down to substitution of variables. In Tcl, evaluation is slightly more complicated, involving the rules of Tcl.n. Essentially, it comes down to a pretty similar idea: substitute variables and reduce. This means that we can create a partial evaluator in Tcl too, which I will call "peval". Here's a first pass at it:

 proc peval {vars script args} {
     if {[llength $args] == [llength $vars]} {
         # All variables bound, so evaluate expression
         foreach _v $vars _a $args { set $_v $_a }
         eval $script
     } else {
         # Partial eval: substitute the variables we know
         set map [list]
         for {set i 0} {$i < [llength $args]} {incr i} {
             lappend map $[lindex $vars $i] [list [lindex $args $i]]
         }
         string map $map $script
     }
 }

This new version of eval takes a script, along with a list of unknown ("free") variables in the script. It then takes a list of arguments and tries to bind them to the free variables in order. If there are enough, then we can evaluate the whole expression, otherwise we simply substitute the values of the variables that we do know and return the new script. Let's test it out:

 % set script [peval {a b} { expr {$a + $b} }]
 expr {$a + $b} 
 % set script [peval {a b} $script 1]
 expr {1 + $b} 
 % peval b $script 2
 3

Here we partially evaluate the procedure at each stage until we can evaluate the whole script and get a result. Note that we have to tell the partial evaluator what the free variables are, so that it knows when the expression is complete. It would be nice to also return a new list of variables with the script, so that we can keep track of what is left to be done. We can do this by packaging up the list of variables and the script into a single value. This packaging up of variables with a script corresponds exactly to the representation of anonymous procedures proposed in TIP 194 [L1 ], and so we will call the new command "papply", to show that it is doing partial application of a procedure, rather than evaluation of a script. Anonymous procedures are lambda terms (approximately), so this papply command somewhat mimicks the beta-reduction of the lambda calculus.

 proc papply {lambda args} {
     foreach {params body} $lambda { break }
     if {[llength $args] == [llength $params]} {
         foreach _p $params _a $args { set $_p $_a }
         eval $body
     } else {
         set map [list]
         for {set i 0} {$i < [llength $args]} {incr i} {
             lappend map $[lindex $params $i] [list [lindex $args $i]]
         }
         set body [string map $map $body]
         return [list [lrange $params $i end] $body]
     }
 }
 % set add {{a b} {expr {$a + $b}}}
 {a b} {expr {$a + $b}}
 % papply $add
 {a b} {expr {$a + $b}}
 % papply $add 1
 b {expr {1 + $b}}
 % set succ [papply $add 1]
 b {expr {1 + $b}}
 % papply $succ 3
 4

This only scratches the surface of what a partial evaluator can do. A more complete partial evaluator could do much more impressive tricks, for instance, it should be possible to take a piece of code written using ensembles, and partially evaluator the ensemble command to leave a hard-coded direct path to the command. i.e.:

 peval {arg1 arg2} { MyCmd SubCmd $arg1 $arg2 }

could return:

 {::MyCmd::SubCmd $arg1 $arg2 }

leaving the variables unsubstituted. We can almost achieve this with what we have:

 % set MyCmd {{subcmd args} { ::MyCmd::$subcmd {*}$args }}
 {subcmd args} { ::MyCmd::$subcmd {*}$args }
 % papply $MyCmd SubCmd
 args { ::MyCmd::SubCmd {*}$args }

But our toy partial evaluators don't know about commands, only variables. What would be needed would be for the partial evaluator to try and run the script in the same way as Tcl does, but only partially evaluating each command instead of completely doing so. Once you have this ability, you can use it to specialise commands in different ways. For instance, we can partially evaluate TOOT type commands, to package up a value with a type:

 % proc lambda {params body} { list $params $body }
 % set listtype [lambda {items method args} { ::l$method $items {*}$args }]
 {items method args} { ::l$method $items {*}$args }
 % set l [papply $listtype {a b c d e}]
 {method args} { ::l$method {a b c d e} {*}$args }
 % papply $l index
 args { ::lindex {a b c d e} {*}$args }
 % papply $l index 2
 c

Here we can see that by replacing Tcl's normal evaluation rules by partial evaluation, and by replacing commands by lambda terms stored in normal variables, we get a very flexible behaviour in which we can specialize commands for particular purposes, e.g. $l in the above example is a typed value similar to TOOT, but we have represented it as a partially evaluated procedure rather than a list of type and value names. Note that the dispatch mechanism used in the listtype body could be much more complicated, for instance searching a hierachy of namespace to find the right implementation (as in inheritance).


Lars H: One problem with papply is that its output to a certain extent is unpredicable; if you are missing arguments, the result is a function, but if you supplied all arguments, the result is a function value. This feels somehow wrong, although I cannot quite put my finger on why. Possibly the result should always be a function (with an empty list of variables when fully evaluated)? To have papply "unpack" the value in in a constant function it is fed as the argument somehow feels less disturbing...

NEM: You could be right there. It's an interesting choice: fully evaluate, or leave as a function? Brian Cantwell Smith talks about a similar distinction between the declarative designation mapping (called Φ) between expressions and values, and the procedural consequence of evaluating an expression (called Ψ). (The paper is Reflection and Semantics in LISP, and is a fascinating read if you have access to the locked away treasures of the ACM [L2 ]). For instance if we take the expression "expr {1+2}", then we can see that Φ("expr {1+2}") = 3 (the expression designates the number three), whereas Ψ("expr {1+2}") = "3" (i.e., the expression evaluates to the numeral "3"). Note though, that Φ("3") = 3 too, so the evaluation is designation preserving. He points out that Lisp (and Scheme) violate this designation-preserving principle on occassion, e.g. the atom if we consider an atom like 'A, then in Lisp Ψ('A) = A (i.e. evaluation does dereference), which is not designation preserving, as Φ('A) = A, but Φ(A) = ?? (whatever the variable A is bound to in the environment). I don't think this is the same issue you mention with "papply", which should be designation preserving (I think), but rather the issue is whether papply should be strongly normalising or not. Is this the distinction between strict and lazy evaluation semantics?

FCR: Another alternative for partial application is to remove arguments from the function, and to insert "set arg value;" at the beginning of the code for each applied argument. That way we will never need to evaluate the function within papply. This is an example naive implementation:

 proc papply {lambdaproc args} {
   foreach {arglist code} $lambdaproc {break}
   foreach arg $args {
     if {[llength $arglist] == 1 && [lindex $arglist 0 0] eq "args"} {
       # yes, yes, the "args" case makes all this code dirty
       set nonbracedcode $code
       while {[regsub "\\\{\[^\}\]*\\\}" $nonbracedcode "" nonbracedcode]} {
         continue
       }
       if {[lsearch $nonbracedcode "#@@@@__insert__partially_applied_args"] < 0} {
         set code2 [join [list \
           {#@@@@__insert__partially_applied_args} \
           {set args [linsert $args 0 {*}[lreverse $__partially_applied_args]]} \
           {unset __partially_applied_args}] "\n"]
         set code "$code2\n$code"
       }
       set code "lappend __partially_applied_args [list $arg]\n$code"
     } elseif {[llength $arglist] > 0} {
       # this is the usual case
       set var [lindex $arglist 0 0]
       set defcode [list set $var $arg]
       set code "$defcode\n$code"
       set arglist [lrange $arglist 1 end]
     } else {
       error "too much parameters"
     }
   }
   return [list $arglist $code]
 }

 % set fn2 {{arg1 args} {puts "msg: ($arg1, $args)"}}
 {arg1 args} {puts "msg: ($arg1, $args)"}

 % set fn3 [papply $fn2 hello "2 nd"]
 args {lappend __partially_applied_args {2 nd}
 #@@@@__insert__partially_applied_args
 set args [linsert $args 0 {*}[lreverse $__partially_applied_args]]
 unset __partially_applied_args
 set arg1 hello
 puts "msg: ($arg1, $args)"}

 % apply $fn3 foo
 msg: (hello, {2 nd} foo)