tailcall

tailcall , a built-in Tcl command, executes a command in place of the current command.

Synopsis

: tailcall command ?arg...?

Description

tailcall interprets its arguments as a command and executes the command,replacing the execution frame of the command that invoked tailcall. Unlike uplevel, it does not evaluate its arguments as a script, so double substitution does not occur.

Unlike some other languages, tailcall is not limited to executing only its caller, but can execute any command. The command to be executed is resolved in the current context before tailcall replaces the context.

tailcall is made possible by NRE. It first became available as ::tcl::unsupported::tailcall in the release of Tcl8.6a2.

Contrast the following two commands:

tailcall foo [bar] $var
return [uplevel 1 [list foo [bar] $var1]]

There are a couple of differences:

foo is resolved at the current level, not in the caller's
In contrast with uplevel, the current level is really replaced. This has positive effects on memory, and a possibly confusing effect on stack traces.

To tailcall a script:

tailcall try $script

WHD: Let me see if I understand what this does.

proc fred {} {
    george
}

proc george {} {
     tailcall harry
}

If I call fred, it's almost as though fred called harry directly, instead of george. Not so?

MS: yup - all traces of george are gone from the program stack when harry is called. Now, if harry resolves to a different command in george's current namespace than it would under fred's, the harry that is called is george's and not fred's (no diff if the commands are FQ, of course).

I think this does pretty much what delegation is supposed to do, right?

jima 2009-10-15: Perhaps this has been asked before or somewhere else...

Is this an optimization that takes place at bytecode generation time?

I mean, once fred knows that has to call harry directly the bytecodes generated would be the ones equivalent to have said:

proc fred {} {
    harry
}

I reckon I am not familiar with all the internals of Tcl but I find this would be an interesting thing. Wouldn't this be a new way to have some sort of macros?

MS: Currently, tailcall is not bytecompiled. Everything happens at runtime. That extremely simple example could indeed be bytecoded in a minute, but things get more involved as soon as fred has a bit more structure to it: arguments, local variables, namespace issues both for variable and command lookup, multiple exit points with different (or no) tailcall in them, etc.

jima: Thanks a lot Miguel for the answer. I see the point. I guess this is the same with uplevel 1, isn't it?

proc fred {} {
    uplevel 1 {
        #code here
    }
}

Would it be interesting to define a case (like a contract) saying if your procedure is simple enough then it gets bytecompiled and you get some benefits?

MS: you do not mean "bytecompiled" but rather "inlined into the caller", as all procedure bodies get bytecompiled. There are quite a few other issues with that, especially to accomodate Tcl's dynamic nature. Changing one inlined proc would cause a spoiling of all bytecodes and recompilation of the world, at least with the current approach to bytecode lifetime management.

AMG: Sounds a lot like exec in Unix shells. See execline for more information on a noninteractive Unix shell where everything is done with exec/tailcall.

PYK 2015-12-06: Combine tailcall with an identity command to emulate return:

proc p1 {} {
        tailcall lindex {Hello from p1}
}

Interaction with try

% proc foo {} {puts {I'm foo}}
% proc bar {} {puts {I'm bar}; try {tailcall foo} finally {puts exiting}}
% foo
I'm foo
% bar
I'm bar
exiting
I'm foo

HE 2015-03-31: I'm sure ;-) that I don't understood what happend there. Why "exiting" is printed before "I'm foo" when I call bar? If I change bar to

proc bar {} {puts {I'm bar}; try {puts tryBody; tailcall foo} finally {puts exiting}; puts afterwards}

and call it, I get:

I'm bar
tryBody
exiting
I'm foo

What I see is that tailcall replaces the rest of proc even inside the body of try. But then, why is the finally clause evaluated? And even, if we assume the finally clause has to be evaluated because it is documented always to be evaluated, then there would be the question, why before the execution of the tailcall command?

AMG: [foo] is invoked by replacing [bar] which implies the intervening [try] block must exit before [foo] can start.

wdb: Apparently, the tailcall closes one of the last gaps in Tcl: Tail recursion as known in Scheme.

Example: Cause Caller to Return

proc one {} {
    two
    return 8
}

proc two {} {
    tailcall return 5
}

one ;# -> 5

one returns 5, not 8, because by invoking two, which, through tailcall, is replaced by return.

Example: Factorial

NEM: As a test/demo of how to use this facility, here is a simple benchmark using the factorial function:

package require Tcl 8.6a1
namespace import ::tcl::mathop::*
interp alias {} tailcall {} tcl::unsupported::tailcall

# Naive recursive factorial function
proc fac n {
    if {$n <= 1} {
        return 1
    } else {
        * $n [fac [- $n 1]]
    }
}

# Tail-recursive factorial
proc fac-tr {n {k 1}} {
    if {$n <= 1} {
        return $k
    } else {
        tailcall fac-tr [- $n 1] [* $n $k]
    }
}

# Iterative factorial
proc fac-i n {
    for {set k 1} {$n > 1} {incr n -1} {
        set k [expr {$n*$k}]
    }
    return $k
}

proc test {} {
    set fmt {%-10s ..%-12.12s %s}
    puts [format $fmt Implementation Result Time]
    foreach n {1 5 10 100 500 1000 2500 5000 10000} {
        puts "\nfac $n:"
        foreach impl {fac fac-i fac-tr} {
            if {[catch {$impl $n} result]} { 
                set result n/a
                set time n/a
            } else {
                set time [time [list $impl $n] 10]
            }
            puts [format $fmt $impl $result $time]
        }
    }
}
test

Putting this in a table, we get (timings taken on Linux box, 2.66GHz, 1GB RAM):

N	fac Time	fac-i Time	fac-tr Time
1	3.2	3.0	2.8
5	10.1	4.7	19.4
10	18.4	6.4	37.9
100	345.5	267.4	717.8
500	3133.9	3715.6	6182.5
1000	n/a	13811.7	19764.3
2500	n/a	65121.1	84556.5
5000	n/a	241176.8	288136.1
10000	n/a	987057.8	1643480.7

As we can see, the tail-recursive version is slightly slower than the iterative version, and unlike the naive version, manages to not blow the stack.

Using Tailcall for Callbacks

Napier / Dash Automation 2015-12-28:

Someone (I forget who now) gave me this little snippet which I love and handles many cases for me. I use it with the -command switches throughout my scripts:

proc callback {args} {tailcall namespace code $args}

namespace eval foo {
     proc myProc var {puts $var}

     proc myCall {} { after 5000 [callback myProc $::myVar] }
}

set myVar "Cool!"

foo::myCall

Emulating tailcall

Lars H 2010-05-09: As of late, when writing an uplevel, I've sometimes found myself thinking "That would be slicker with tailcall, but I can't rely on 8.6 features in this project". Today it occurred to me that one can however use a proc to emulate the properties of tailcall that would be needed in these cases, and thus provide a route for forward compatibility.

The main situation I've encountered is that of delegating to another command which may make use of upvar or uplevel. That's basically taken care of by

proc utailcall args {uplevel 2 $args}

although it's safer to make it

proc utailcall args {return -code return [uplevel 2 $args]}

in case the "terminate proc early" aspect of tailcall is relied upon; this is easy to do without thinking much about it.

Another aspect of tailcall is the name resolution of the called command. This can be done as follows

proc ntailcall {cmd args} {
    return -code return [
        [uplevel 1 [list ::namespace which $cmd]] {*}$args
    ]
}

but it's almost as easy to do both at the same time

proc untailcall {cmd args} {
    return -code return [
        uplevel 2 [list [uplevel 1 [list ::namespace which $cmd]]] $args
    ]
}

A word of warning here is that this will produce a very confusing error message if the command is undefined, as namespace which returns an empty string in that case.

A third aspect is that of preserving return levels.

proc rtailcall args {
    catch $args result options
    dict incr options -level 2
    return -options $options $result
}

This leaves some extra material in the errorInfo, but one can probably live with that. Combining the "r" and "u" aspects is straightforward, but will leave even more:

proc rutailcall args {
    catch {uplevel 2 $args} result options
    dict incr options -level 2
    return -options $options $result
}

To complete the set, one might just as well write down the combination of the "r" and "n" aspects

proc rntailcall {cmd args} {
    catch {
        [uplevel 1 [list ::namespace which $cmd]] {*}$args
    } result options
    dict incr options -level 2
    return -options $options $result
}

and of all three

proc rnutailcall {cmd args} {
    catch {
        uplevel 2 [list [uplevel 1 [list ::namespace which $cmd]]] $args
    } result options
    dict incr options -level 2
    return -options $options $result
}

But note: all of the above will fail if used for tail recursion, as soon as the loops get long enough.

Replacement for uplevel

AMG: uplevel has limitations with respect to bytecode compilation and interpretation of return. If uplevel's level count is 1, and if it's the last thing being done in the proc, these limitations can be avoided by using tailcall instead. Note that uplevel takes a script whereas tailcall takes a command. If you want to pass a script to tailcall, make it be the sole argument to try.

See Possible uplevel deficiencies . Also see When to use uplevel for more on when to use or avoid uplevel. See eval vs bytecode for discussion and performance numbers regarding bytecode compilation with eval, uplevel, try, and others.

When to apply tailcall optimization

HaO 2012-12-14: Is it a good idea to replace any code:

proc proc1 {arg1 arg2} {
    # do something here which finds arg3 and arg4
    return [proc2 $arg3 $arg4]
}

proc proc1 {arg1 arg2} {
    # do something here which finds arg3 and arg4
    tailcall proc2 $arg3 $arg4
}

If proc2 is for sure found in the caller namespace?

Is this an intelligent optimization?

I came to this idea, as the TI C compiler calls this "tailcall optimization".

AMG: Yes, except in a highly unlikely situation where proc2 needs proc1 to be visible in the stack. Procedures really ought not to care who called them, but Tcl makes all sorts of things possible, including stupid things.

Misc

NEM: Many thanks to MS for his hard work making this a reality!

Make visible how tailcall works

HE 2021-10-01: I wanted to understand more how tailcall works. The below make it visible for me what happens in case tailcall is used, compared with the same procedure using return. Factorial comes from the documentation of tailcall. Factorial1 is the same procedure but, using return instead of tailcall. I also add some traces to monitor what happens.

We can see how the level in the call stack is changing in case we use return. And it doesn't change with tailcall.

proc factorial1 {n {accum 1}} {
    if {$n < 2} {
        return $accum
    }
    return [ factorial1 [expr {$n - 1}] [expr {$accum * $n}]]
}

proc factorial {n {accum 1}} {
    if {$n < 2} {
        return $accum
    }
    tailcall factorial [expr {$n - 1}] [expr {$accum * $n}]
}

proc debug {type args} {
    puts "debug type: $args"
    puts "debug level: [info level]"
}

trace add execution factorial enter [list debug enter]
trace add execution factorial leave [list debug leave]
trace add execution factorial1 enter [list debug enter]
trace add execution factorial1 leave [list debug leave]

First we execute:

factorial  5

and got

debug type: {factorial 5} enter
debug level: 1
debug type: {factorial 5} 0 {} leave
debug level: 1
debug type: {factorial 4 5} enter
debug level: 1
debug type: {factorial 4 5} 0 {} leave
debug level: 1
debug type: {factorial 3 20} enter
debug level: 1
debug type: {factorial 3 20} 0 {} leave
debug level: 1
debug type: {factorial 2 60} enter
debug level: 1
debug type: {factorial 2 60} 0 {} leave
debug level: 1
debug type: {factorial 1 120} enter
debug level: 1
debug type: {factorial 1 120} 0 120 leave
debug level: 1
120

Then we execute:

factorial1 5

and got

debug type: {factorial1 5} enter
debug level: 1
debug type: {factorial1 4 5} enter
debug level: 2
debug type: {factorial1 3 20} enter
debug level: 3
debug type: {factorial1 2 60} enter
debug level: 4
debug type: {factorial1 1 120} enter
debug level: 5
debug type: {factorial1 1 120} 0 120 leave
debug level: 5
debug type: {factorial1 2 60} 0 120 leave
debug level: 4
debug type: {factorial1 3 20} 0 120 leave
debug level: 3
debug type: {factorial1 4 5} 0 120 leave
debug level: 2
debug type: {factorial1 5} 0 120 leave
debug level: 1
120

Above shows that 'tailcall' replaces the command in the stack.

But what is with the defined local variables of the replaced command?

Below variable a is defined twice. Once in the global context and once in procedure foo1. If we call foo1, variable a is defined locally in foo1 before foo2 is called by 'tailcall'.

foo2 queries variable a first in 'uplevel 1' and then locally. 'uplevel 1' finds the global defined a. The local query of a raise an error because variable a is not defined inside foo2 and context of foo1 doesn't exists any longer.

If foo2 is called directly it does the same because foo1 doesn't do anything which changes anything outside of foo1.

set a 23
proc foo1 {} {
        set a 42
        tailcall foo2
}
proc foo2 {} {
        uplevel 1 {
                puts $a
        }
        puts $a
}
foo1 ;# prints 23 and raise then an error about missing variable a. The error is raised by foo2.
foo2 ;# prints 23 and raise then an error about missing variable a. The error is raised by foo2.

Next 'return' is used instead of 'tailcall' in foo3:

set a 23

proc foo3 {} {
        set a 42
        return [foo4]
}
proc foo4 {} {
        uplevel 1 {
                puts $a
        }
}
foo3 ;# prints 42
foo4 ;# prints 23

After foo3 defines variable a, it calls foo4. That means, foo4 is put on the stack and 'uplevel 1' points into the still existing foo3. Therefore, foo4 founds the locally defined variable a of foo3.

Category Command

Category Concept

Category Control Structure

Category Functional Programming