Sugar command macros

NOTE

 The API has changed in two ways. Keep them in mind if you want to try the examples in this tutorial
 with the current release of Sugar.

 First Change - Macros are now expanded only inside procedures defined with ::sugar::proc
 Second Change - Macros now get a list of arguments like normal procedures, but the first argument is
                 the macro name itself. All the macros in this tutorial will work if the 'argv'
                 argument is replaced with 'args'.

Section 0 - Sugar
Section 1 - Sugar command macros (what you are reading)
Section 2 - Sugar syntax macros
Section 3 - Sugar transformers

What is a Tcl macro?

A macro is an operator implemented by transformation. Macros are procedures that generate a Tcl program at compile time and substitute it wherever the programmer used their name as a command.

It's a simple concept if explained by examples.

Suppose you want to write a clear command that sets the variable named varName to the empty string. It could be implemented using upvar, like this:

proc clear varName {
    upvar 1 $varName var
    set var {}
}

Every time the user types

clear myvar

the clear procedure is called, and the myvar variable of the caller is set to the empty string.

As an alternative to calling a procedure that is able to alter the caller's execution environment, we may want to automatically substitute every occurrence of the command clear varname with set varname {}.

So basically we want that when we write

clear myvar

in a program, it is substituted with

set myvar {}

as if the programmer had really typed set myvar {} instead of clear myvar. That's the goal of the simplest form of Sugar macro.

The definition of a new macro is very similar to the creation of a procedure. The following is the implementation of [clear] as a macro:

sugar::macro clear argv {
    list set [lindex $argv 1] {{}}
}

It means: "If you encounter a command called clear inside the source code, call the following procedure putting all the parts of which the command is composed in $argv, and substitute the occurrence of the clear command and arguments, with what the procedure will return."

In other words, when a procedure is compiled, for every occurrence of the clear command inside the procedure, the above macro body is called with $argv set to a list that represents the arguments used to call the macro (including the macro name itself as first argument). The return value, which should be a list of the same form, is substituted in place of the original macro call.

To make the example more concrete, see the following code:

proc foobar {} {
    set x 10
    clear x
}

Before the procedure is compiled, the clear macro is called with $argv set to the list clear x. The macro returns set x {{}}, a list whose last element is the token {}. This return value is substituted in place of "clear x".

After the proc is defined, we can use info body to see what happened:

info body foobar

will output

set x 10
set x {}

Sugar makes it possible to use a macro like clear as if it were a Tcl procedure, while the macro is actually called at compile time to produce the code that replaces it.
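
The expansion step can be mimicked in plain Tcl without loading Sugar. The following is only a sketch: clear_expander is a hypothetical name for an ordinary proc with the same body as the clear macro, called by hand with the tokens of the command.

```tcl
# Hypothetical stand-in for the clear macro body: a normal proc that
# receives the command tokens and returns the replacement tokens.
proc clear_expander argv {
    list set [lindex $argv 1] {{}}
}

# Tokens of the source command "clear x", macro name first.
set tokens [clear_expander {clear x}]

# Joining the tokens with spaces gives the command that replaces the call.
puts [join $tokens " "]
```

This prints set x {}, the same line that info body shows in the expanded procedure.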

But Tcl has uplevel and upvar, so what are macros useful for? Fortunately they allow for many interesting things not possible at all otherwise. The following example shows the first big advantage of macros:

1) Macros make Tcl faster, without forcing the user to inline code by hand.

When clear is implemented as a macro, it runs 3 times faster in my Tcl 8.4.

Also, upvar is one of the biggest obstacles to the ability of the Tcl compiler to optimize Tcl bytecode. It's not impossible that at some point Tcl will be able to run much faster if the user can ensure that a given procedure is never the target of upvar.
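
The speed difference can be measured on any installation with the built-in time command. This is a rough sketch that does not use Sugar: with_proc and with_inline are hypothetical names, and with_inline contains by hand the code that the macro would generate. The actual ratio depends on the Tcl version and the machine.

```tcl
# The upvar-based procedure from the beginning of this page.
proc clear varName {
    upvar 1 $varName var
    set var {}
}

# Calls the clear procedure at run time.
proc with_proc {} {
    set x 10
    clear x
    return $x
}

# Contains what the clear macro would expand to.
proc with_inline {} {
    set x 10
    set x {}
    return $x
}

puts "procedure call: [time {with_proc} 100000]"
puts "inlined code:   [time {with_inline} 100000]"
```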

Simple commands that involve the use of upvar can be even simpler to write as macros. The following are four examples:

# [first $list] - expands to [lindex $list 0]
sugar::macro first argv {
    list lindex [lindex $argv 1] 0
}

# [rest $list] - expands to [lrange $list 1 end]
sugar::macro rest argv {
    list lrange [lindex $argv 1] 1 end
}

# [last $list] - expands to [lindex $list end]
sugar::macro last argv {
    list lindex [lindex $argv 1] end
}

# [drop $list] - expands to [lrange $list 0 end-1]
sugar::macro drop argv {
    list lrange [lindex $argv 1] 0 end-1
}
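
For reference, these are the commands that the four macros expand to, run directly on a sample list in plain Tcl (the variable name l is just for illustration):

```tcl
set l {a b c d}

puts [lindex $l 0]        ;# what [first $l] expands to: a
puts [lrange $l 1 end]    ;# what [rest $l] expands to:  b c d
puts [lindex $l end]      ;# what [last $l] expands to:  d
puts [lrange $l 0 end-1]  ;# what [drop $l] expands to:  a b c
```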

Sugar supports three types of macros. We are dealing with the simplest and most common type: command macros.

The other two types, syntax macros and transformers, will be covered later. For now let's create a more complex macro.

A more complex example

Good macros transform source code in a smart way: they turn a form that is easily understood by the programmer into code that is also understood by the compiler, code that would be hard to type and use in raw form without macro support, but is optimal otherwise.

Ideally a macro should expand to a single command call (possibly with many other commands nested inside it), and whenever it can be avoided it should not expand to code that magically creates variables at runtime to store intermediate results, because these may collide with variables in the function, or with variables created by other badly written macros. (By the way, a way to generate unique local variable names is on Sugar's TODO list.)

If the macro is well written, the programmer can use it like any other command without having to care much.

We will now look at a more realistic macro, one that implements a very efficient lpop operator. It accepts only one argument, the name of a variable, and returns the last element of the list stored in that variable. As a side effect, lpop removes the last element from the list (it's something like the complement of lappend).

A pure-Tcl implementation is the following:

proc lpop listVar {
    upvar 1 $listVar list
    set res [lindex $list end]
    set list [lrange $list 0 end-1]
    return $res
}

This version of lpop is really too slow. In fact, when lrange is called it creates a new list object, even though the original one stored in the list variable is about to be freed and replaced by the copy. Modifying the list in place is far better.

The lrange implementation is able to perform this optimization if the object is "not shared" (if you don't know about this stuff, read the Wiki page about the K operator before continuing).

So it's better to write the proc using the K operator. The lrange line should be changed to this:

set list [lrange [K $list [set list {}]] 0 end-1]

With K being:

proc K {x y} {
    return $x
}

But even calling K is costly in terms of performance, so why not inline it as well? Doing so requires changing the previous lrange line to this:

set list [lrange [lindex [list $list [set list {}]] 0] 0 end-1]

That's really a mess to read, but it runs at a different speed and, even more importantly, with a different time complexity!
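
Put together, the procedure with K inlined can be tried in plain Tcl. This is a sketch of the steps described so far, using lrange ... 0 end-1 so that it is the last element that gets removed; the variable name stack is just for illustration.

```tcl
proc lpop listVar {
    upvar 1 $listVar list
    set res [lindex $list end]
    # [list $list [set list {}]] reads the old value first and then empties
    # the variable, so lrange receives an unshared list object.
    set list [lrange [lindex [list $list [set list {}]] 0] 0 end-1]
    return $res
}

set stack {a b c}
puts [lpop stack]   ;# c
puts $stack         ;# a b
```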

With a macro for lpop, we can go even faster, and the code is easier to maintain and read. Macros are allowed to expand to commands containing other macros, recursively. This means that we can write a macro for every single step of lpop. We need the first, last, and drop macros already developed, and a macro for K:

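sugar::macro K argv {
    # $argv starts with the macro name, so skip it before unpacking.
    foreach {_ x y} $argv break
    # Return the inner [list ...] as literal source text, not evaluated here.
    list first "\[list $x $y\]"
}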

Note that for speed we used foreach instead of several calls to lindex. Remember, though, that macros don't have to be fast when generating the expanded code, because the expansion happens only once, at compile time.

K $x $y expands to first [list $x $y], which expands to lindex [list $x $y] 0.

We have one last problem. Even after the optimization and the inlining of K, the procedure above requires a local variable 'res' to save the last element of the list before modifying it, so that $res can be used later as the return value of the procedure. We don't want to create local variables in the code that calls the lpop macro, nor do we want to expand to more than a single command. The K operator can help us do so:

set res [lindex $list end]
set list [lrange [lindex [list $list [set list {}]] 0] 0 end-1]
return $res

leading to:

K [lindex $list end] [set list [lrange [lindex [list $list [set list {}]] 0] 0 end-1]]

That's OK, but what unreadable code! Thanks to macros we can abstract away the fact that calling procedures is slow, so we just write:

[K [last $list] [set list [drop [K $list [set list {}]]]]]

It will not win the clean-code contest this year, but it's much better than the previous version. OK... now we want a macro that, every time we type lpop list, expands to the above line:

sugar::macro lpop argv {
    set varname [lindex $argv 1]
    set argv [list K \
        {[last $%varname%]} \
        {[set %varname% [drop [K $%varname% [set %varname% {}]]]]}
    ]
    foreach i {1 2} {
        lset argv $i [string map [list %varname% $varname] [lindex $argv $i]]
    }
    return $argv
}

There are a few things to note about this code. The macro returns a list in which every element is a token of a Tcl command in the source code. This does not mean that we have to turn arguments that happen to represent a script into lists as well. Also note that the input list of the macro is just a list of tokens that are *exactly* what the user typed in the source code, verbatim. It follows that the tokens are already quoted and are valid representations of a procedure argument. We don't need to worry about whether they will be interpreted as a single argument, as we should when generating code for eval.

This allows the macro developer to use templates for macros. In fact the lpop macro just uses a three-token template, and the final foreach replaces the placeholder with the variable name in the tokens that need to refer to it. You don't have to care what that variable name is. It can be a complex string formed by several commands, variables, and so on, [like][this]$and-this. If it was a single argument in the source code, it will still be a single argument after the expansion.
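
The template step on its own is ordinary string substitution and can be tried in plain Tcl. A small sketch, where mylist is a hypothetical variable name standing for whatever token the user typed after lpop:

```tcl
# One token of the lpop template, with %varname% as the placeholder.
set template {[set %varname% [drop [K $%varname% [set %varname% {}]]]]}

# Replace every occurrence of the placeholder with the token from the source.
set token [string map [list %varname% mylist] $template]

puts $token
```

This prints [set mylist [drop [K $mylist [set mylist {}]]]]. The result is still a single token, so it can be stored as one element of the list returned by the macro.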

Another interesting thing to note is that we don't really have to return every token as a separate element of the list. In practice we can even return the whole command as a single-element list. The rule is that the macro expander takes care of putting an argument separator, like a tab or a space, between the elements of the list, and a command separator, like a newline or ;, at the end. If we put the spaces in ourselves, we can just return a single-element list.

So, the lpop macro can also be written in this way:

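sugar::macro lpop argv {
    set varname [lindex $argv 1]
    # Fill in the variable name; format cannot be used with %varname% placeholders.
    set cmd [string map [list %varname% $varname] {
        K [last $%varname%] [set %varname% [drop [K $%varname% [set %varname% {}]]]]
    }]
    return [list [string trim $cmd]]
}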

This is much simpler and cleaner, and it's perfectly possible to use this style. The difference is that returning every token as a separate element of a list allows Sugar macros to leave the indentation of the original code unaltered. This helps both to keep the line numbers in procedure errors correct and to get good-looking output from info body. But since most macros are about commands that are typed on a single line together with all their arguments, for many macros it is just a matter of taste.

If you are implementing control structures that are:

indented in {
    this way
}

then it's a different matter, and it's better to return every token as a list element.

Number of arguments and other static checks in macros

In most cases macros expand to code that will raise an error if the number of arguments is wrong, but it's possible to add this check inside the macro itself. This is actually a big advantage of macros, because they are able to signal a bad number of arguments at compile time: this can help to write applications that are more reliable. It's even possible to write a macro that expands to exactly what the user typed, but as a side effect does a static check for a bad number (or format) of arguments:

sugar::macro set argv {
   if {[llength $argv] != 3 && [llength $argv] != 2} {
       error "Bad number of arguments for set"
   }
   return $argv
}

This macro returns $argv itself, so it's an identity transformation, but it will raise an error for any set with a bad number of arguments, even in code that is never reached when the application runs. Note that this macro for set is a bit incomplete: to get it right we should add checks for arguments that start with {*}. For this reason Sugar will provide a function to automatically search for a bad number of arguments in some future version.
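
The check itself can be exercised in plain Tcl, without Sugar, by moving it into an ordinary proc. This is only a sketch: check_set is a hypothetical name, and it receives the command tokens (command name included) the way a macro would. Note that the two length tests are joined with &&, since set is valid with either one or two arguments.

```tcl
# Hypothetical stand-in for the body of the identity macro for set.
proc check_set argv {
    set n [llength $argv]
    if {$n != 2 && $n != 3} {
        error "Bad number of arguments for set"
    }
    return $argv
}

puts [check_set {set x 10}]                    ;# set x 10
puts [catch {check_set {set x 10 20}} msg]     ;# 1
puts $msg                                      ;# Bad number of arguments for set
```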

Note that {*} introduces for the first time the possibility for a command to receive a number of arguments that is not evident from reading the source code, but is computed at runtime. Actually, {*} is an advantage for static checks, because before it the way to go was eval, which totally "hides" the called command and postpones all the work to run time. With {*} it's always possible to tell from the source code that a command is called with *at least* N arguments. Still, adding new syntax to Tcl will probably not play well with macros and other forms of source code processing.

Identity macros are very powerful for performing static syntax checks: they can warn not only about a bad number of arguments, but also about the type of those arguments. See for example the following identity macro for "string is":

proc valid_string_class class {
    set classes {alnum alpha ascii control boolean digit double false graph integer lower print punct space true upper wordchar xdigit}
    set first [string index $class 0]
    if {$first eq {$}} {return 1}
    if {$first eq {[}} {return 1}
    if {[lsearch -exact $classes $class] != -1} {return 1}
    return 0
}

sugar::macro string argv {
    if {[lindex $argv 1] eq {is} && [llength $argv] > 2} {
        if {![valid_string_class [lindex $argv 2]]} {
            puts stderr "Warning: invalid string class in procedure [sugar::currentProcName]"
        }
    }
    return $argv
}

Thanks to this macro it's possible to ensure that errors like writing string is number instead of string is integer are discovered at compile time. In this respect macros can be seen as a programmable static syntax checker for Tcl. We will see how "syntax macros" are even more useful in this respect. This is the second feature that macros add to Tcl:

2) Macros are a powerful programmable static checker for Tcl scripts.

Actually I think it's worth using macros even just for this during the development process, and then throwing them away.

Conditional compilation

That's small and neat: we can write a simple macro that expands to some code only if a global variable is set to non-zero. Let's write this macro that we call [debug].

sugar::macro debug argv {
   if {$::debug_mode} {
       list if 1 [lindex $argv 1]
   } else {
       list
   }
}

Then you can use it in your application as if it were a conditional:

# Your application ...
debug {
    set c 0
}
while 1 {
    debug {
        incr c
        if {$c > 100} {
            error "Too many iterations..."
        }
    }
    .... do something ....
}

If the value of $::debug_mode is true, all the debug {something} commands are compiled as if 1 {something}. Otherwise, they will not be compiled at all.
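
Both outcomes can be inspected in plain Tcl by calling the macro body by hand. In this sketch debug_expander is a hypothetical name for an ordinary proc with the same body as the debug macro, and the script token is passed with its braces, exactly as it would appear in the source.

```tcl
# Hypothetical stand-in for the debug macro body.
proc debug_expander argv {
    if {$::debug_mode} {
        list if 1 [lindex $argv 1]
    } else {
        list
    }
}

# Tokens of the source command: debug {puts hello}
set tokens {debug {{puts hello}}}

set ::debug_mode 1
puts "on:  [join [debug_expander $tokens] { }]"   ;# on:  if 1 {puts hello}

set ::debug_mode 0
puts "off: [join [debug_expander $tokens] { }]"   ;# off: (nothing is generated)
```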

That's the simplest example. You can write similar macros like ifunix, ifwindows, ifmac, or even macros that expand to different procedure calls depending on whether a given command is called with 2, 3 or 4 arguments. The limit is the imagination.

New control structures

Not all programming languages allow you to write new control structures. Tcl is one of the better languages, one that doesn't put the programmer in jail, but not all the programming languages that allow new control structures are able to make them efficient.

Tcl macros can make new control structures as fast as byte-compiled control structures, because user-defined ones are usually syntactic glue for code transformations. Since macros are transformers that translate one form into another, they are a good fit for this.

Here is a macro for the ?: operator.

# ?: expands
#   ?: cond val1 val2
# to
#   if $cond {format val1} {format val2}
sugar::macro ?: argv {
    if {[llength $argv] != 4} {
        error "Wrong number of arguments"
    }
    foreach {_ cond val1 val2} $argv break
    list if $cond "{format $val1}" "{format $val2}"
}

The macro's comment shows the expansion performed. Since it is translated to an if command, it's as fast as a Tcl builtin.
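
The expanded form relies on the fact that if returns the result of the branch it executes. This can be checked in plain Tcl by writing the expansion by hand; the variable names a, b and max are just for illustration.

```tcl
set a 5
set b 3

# Hand-written expansion of: ?: {$a > $b} $a $b
set max [if {$a > $b} {format $a} {format $b}]

puts $max   ;# 5
```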

How do macros know what's a script?

In Tcl there are no types, nor special syntaxes for what is code and what is just a string, so you may wonder why macros are not expanded in the following code:

puts {
    set foo {1 2 3}; [first $foo]
}

But they are expanded in this:

while 1 {
    set foo {1 2 3}; [first $foo]
}

I guess this is one of the main problems developers face when designing a macro system for Tcl, and also one of the strongest arguments for the idea that a good macro system for Tcl is impossible, because you can't tell what is code and what isn't.

Sugar was designed to address this problem in the simplest possible way: because it can't tell whether an argument is a script or not, macro expansion is not performed in arguments. So in principle Sugar will not expand code that is passed as an argument, neither to puts nor to while.

But of course, in the real world, for a macro system to be usable, macros should be expanded inside the while and not inside the puts. So the idea is that for commands that you know accept a script as an argument, you write a macro that returns the same command but with the script arguments macro-expanded. It is very simple, and in practice it works well. For example, this is the macro for while:

sugar::macro while argv {
    lset argv 1 [sugar::expandExprToken [lindex $argv 1]]
    lset argv 2 [sugar::expandScriptToken [lindex $argv 2]]
}

That's the macro for if:

sugar::macro if argv {
    lappend newargv [lindex $argv 0]
    lappend newargv [sugar::expandExprToken [lindex $argv 1]]
    set argv [lrange $argv 2 end]
    foreach a $argv {
        switch -- $a {
            else - elseif {
                lappend newargv $a
            } 
            default {
                lappend newargv [sugar::expandScriptToken $a]
            }
        }
    }
    return $newargv
}

As you can see, Sugar exports an API to perform expansion in Tcl scripts and expr expressions. There are similar macros for switch, for, and so on. If you write a new conditional or loop command as a macro, you don't need any of this, because the macro will translate to code that contains some form of a well-known built-in conditional or loop command, and we already have macros for those (remember that macros can return code containing macros).

If you write any other command that accepts a Tcl script or expr expression as an argument, just write a little macro for it to do macro expansion. This has a nice side effect:

proc nomacro script {
    uplevel 1 $script
}

Don't write a macro for nomacro, and you have a ready-to-use command that works as a barrier for macro expansion.
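
At run time nomacro is a plain pass-through: the script is evaluated in the caller's scope, just without having been macro-expanded. A small plain-Tcl sketch, with demo as a hypothetical caller:

```tcl
proc nomacro script {
    uplevel 1 $script
}

proc demo {} {
    set x 1
    # The script runs in demo's own scope, so it sees and modifies x.
    nomacro {
        incr x
    }
    return $x
}

puts [demo]   ;# 2
```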

Continue with section 2 - Sugar syntax macros

WHD: This is very cool, but I have to ask--why not allow macros to have a standard Tcl argument list? That is,

sugar::macro mymacro args {...}

Gives the behavior you describe here, while

sugar::macro mymacro {a b c} {...}

explicitly creates a macro that takes three arguments and will generate a standard error message if you supply some other number?

SS: This can be a good idea, since it's always possible to use args as the only argument to get the current behaviour. I used a single list as input mainly because the same macro can have more than one name, and in order to have the same interface for both command macros and syntax macros. For example:

sugar::macro {* + - /} argv {
    list expr [list [join [lrange $argv 1 end] " [lindex $argv 0] "]]
}

Will handle * + - / with the same code. Macros with more than one name may in extreme cases even give different meanings to arguments in the same position. But there is 'args' for this case. So I can change the API to something like this:

sugar::macro {name arg1 arg2 ...} {...}

That's like a Tcl proc, but with the name that was used to call the macro as the first argument. For syntax macros, this format actually may not make a lot of sense, but there is still args. I'll include this change in the next version if I don't receive feedback against it. Thanks for the feedback WHD.

WHD: I think that on the whole I prefer the previous syntax for command macros; the macro can always have an implicit argument that is the macro name. For example,

# Identity macro
sugar::macro {+ - * /} {args} { return "$macroname $args" }

JMN: I'd just like to add my vote for removing the macroname as first argument syntax. From my hacking about, it seems easy to make it implicitly available more or less as WHD suggests. (I don't *think* I broke anything.. )

SS: For a different question about the sugar API, I wonder if Tclers interested in this macro system feel better about the current redefinition of proc, or if it's better to provide a sugar::proc that's exactly like proc but with macro expansion.

If the API remains as it is now, with proc redefined, I'll add to the wrapper a -nomacro option that will just call the original command. Please add your name with optional motivation below.

Yes, I think it's better to wrap the real proc:

Put your name here if you are for this solution.

No, I want macro expansion only using sugar::proc:

SS (avoids wasting CPU time on procs that don't use macros; this can be a big difference if you package require sugar before Tk or other big packages)
DKF: Avoiding overriding the real proc allows packages to use sugar if they want without surprising packages that don't expect it. Packages that do want it can just do [namespace import ::sugar::proc] into their own private workspace.

WHD: Since you have to override the standard control structures to make macros work, it seems to me that what you really need is a pair of commands:

sugar::configure -enabled 1

# Macros expanded in body
proc myproc {a b c} {....}

# Macros expanded in expression and body
while {$a > $b} {....}

sugar::configure -enabled 0

# Macros no longer expanded.

SS: Actually Sugar overrides nothing! (So everything is expanded at compile time, with no run-time overhead.) It does expansion inside control structures just by using macros for while and so on. This is explained better on this page, in the section on how macros know what's a script. So whether to override proc or to provide a private proc-like command is just a matter of design (or taste); everything will work in both cases.

alpha_tcler 2016-04-06: The library should make it easy to choose whether I want macros expanded inside ::sugar::proc or elsewhere. Secondly, the examples should be corrected (args replacing argv). Great work, macros make Tcl a first-class Lisp-equivalent language.

PYK 2016-04-06: Now other Lisp languages just need to grow uplevel, upvar, coroutines, explicit tailcalls, and safe interpreters, in order to become first-class Tcl-equivalent languages :p If you really want to go down the macro rabbit hole, check out procstep.
