Updated 2014-08-09 16:54:46 by SEH

Summary  edit

a Tcl package (and command defined by this package) to facilitate writing sequences of mathematical formulae in the familiar infix form.

Requirements  edit

tcllib
> 1.8

Documentation  edit

An infix math command for Tcl
includes the entire source code, commented and explained

Another location: [1]

Obtaining  edit

infix-0.2.1.tar.gz ,2010-03-30
a distro of the infix code as Tcl Modules (the two make a very nice fit). Can be downloaded from

This version includes the [infix::core::funalias] bugfix mentioned below.

Description  edit

First published on 2006-10-14.

A short example:
package require infix 0.2
::infix::core::setup base numconst expr::ops expr::fun
proc ngon_corner {num_sides radius} {
    infix {
        n <- num_sides
        r <- radius
    } {
        alpha = acos(-1) / n ;  # acos(-1) = pi
        r*cos(alpha), r*sin(alpha)
    }
}
ngon_corner 6 10 ; # Returns "8.66025403784 5.0" - a two element list

A notable feature is that the little language implemented by this package is completely configurable (setting it up for [expr]-like operations is what [::infix::core::setup] does), so you can define new operations, or define the usual ones to do something unusual. A setting that turns +, -, etc. into the operations of the math::bignum package is included with the infix package.

The code is available abel.math.umu.se

Some kind of a user's manual (incomplete)  edit

The user commands created by the package are

[::infix::core::setup]

::infix::core::setup ?module ...?

Creates [infix] in the namespace it is called from, and loads the listed modules of settings for the little language of [infix]. See below for lists of defined modules and the syntax of [infix].

[::infix::core::opalias]

::infix::core::opalias name type cmd ?arg ...?

Defines a new operation, name, of type type that gets implemented by appending the operand(s) to the command prefix cmd ?arg ...?, as specified. Any previous meaning of the token name gets overwritten.

The possible types include:
binary priority
A binary, left-associative operation with priority as specified.
binary priority associativity
A binary operation with priority and associativity as specified. Possibilities for associativity include right-associative, non-associative, and n-ary.
prefix priority
A unary prefix operation with priority as specified.
postfix priority
A unary postfix operation with priority as specified.

The priorities should be Tcl numbers (non-integers are fine). Higher priority means tighter binding to the operands. In case of equal priority, the associativity setting is used to resolve which operation acts on which operands. The standard modules uses priorities in the range -2 (for ;) to 14 (for factorial), with + at 10, * at 11, and ** (right-associative) at 12.

Example (requires Tcl 8.5):
::infix::core::opalias ++ {binary 10} ::tcl::mathfunc::hypot

(MetaFont uses ++ for "Pythagorean addition": the length of hypotenuse in right triangle where other sides have these lengths. C programmers no doubt find this very strange.)

[::infix::core::funalias]

::infix::core::funalias name numargs cmd ?arg ...?

Defines a new function, name with numargs arguments that gets implemented by appending the argument(s) to the command prefix, cmd ?arg ...?, as specified. Any previous meaning of the token name gets overwritten. numargs may be any (in which case any number of arguments are accepted) or an integer.

[infix]

infix symlinks body

body is where the actual expressions in the infix little language are written; [infix] returns the value of (the last statement in) the body. The symlinks argument links symbolic names appearing in the body to Tcl variables in the context from which infix was called.

The format of the symlinks is a list containing some multiple of three elements. The first element in each triplet is the infix body symbolic name. The last element in a triplet is the external quantity to which the symbolic name is linked. The middle element of the triplet is an "arrow" that determines how the two are linked:
<-
Input-only value; the external quantity is the name of a variable whose value is copied to the symbol.
->
Output-only value; the external quantity is the name of a variable which is set to the final value of the symbol.
<->
Input/output value; combines <- and ->.
<=
Input-only constant; the external quantity becomes the value of the symbol. Useful for constants that don't fit into the infix body syntax.
<e
The external quantity is evaluated as a script, and the symbol is set to the result of that script. (This is somewhat like using command substitution in [expr].)

To increment $a by $b:
infix {
    a <-> a
    b <-  b
} { a := a + b }

or:
infix {
    a <- a
    b <- b
    c -> a
} { c = a + b }

To Be Continued...

List of modules

Each module, module is implemented by the package infix::module, so anyone can define new modules. The following are those that come with the infix package itself.
base
Basic definitions: parentheses for grouping, = for definition, := for assignment, semicolon as separator, and comma as list constructor.
expr::ops
The unary and binary operations of expr.
expr::fun
The [expr] built-in functions.
expr::ternary
The [expr] ternary ?: operation. (See ifthen for an alternative.)
softsemicolon
A more forgiving statement seprator. The base semicolons may only appear between expressions, but this relaxes the syntax so that a semicolon is effectively ignored if there is no expression after it.
ifthen
Implements expression choices of the form if condition then expression else expression fi (also allowing elseif clauses and omitting the else clause). Unlike ?:, this can have semicolons and the like in the expressions without a need to wrap them up in parentheses.
numconst
Makes symbols that look like numeric constants be interpreted as such. (Without it, e.g. 0, 1, and 3.5 behave just like x and y.) Underscore is a substitute for minus sign in exponents: 3.2e_1 is 3.2*10**-1.
bignum
Operations as for expr::ops, but implemented using the math::bignum commands, with their representation for values. Numeric constants are supported, as are the postfix operations ! (factorial) and !! (semifactorial), and functions sqrt, powm, fromstr, and tostr. Compatible with the ifthen module.
TeX::semi
Changes the tokenizer, so that TeX-style control sequences such \alpha and \cdot count as tokens.
expr::delim
Some expr-functions written as delimiters.
listbracket
Brackets for list construction and indexing.

Example: complex numbers  edit

In want of a module for this, operations on complex numbers (as implemened using the math::complexnumbers package) are fairly easy to set up using opalias and funalias. First we need the basic package requires:
package require infix
infix::core::setup base
package require math::complexnumbers

Then we can define the operations +, -, *, /, and **:
infix::core::opalias + {binary 10} math::complexnumbers::+
infix::core::opalias - {try {binary 10} - {prefix 13} -} math::complexnumbers::-
infix::core::opalias * {binary 11} math::complexnumbers::*
infix::core::opalias / {binary 11} math::complexnumbers::/
infix::core::opalias ** {binary 12 right-associative} math::complexnumbers::pow

The try part in the definition of - is because there are two common operations denoted by minus: subtraction (binary) and negation (unary prefix). Both interpretations are tried, in that order.

Defining the functions should be equally straightforward, but as it turns out version 0.2 of infix there is a bug in [infix::core::funalias], so we need to fix that first:
proc ::infix::core::funalias {name numargs cmd args} {
    set ns [uplevel 1 {::namespace eval infix {::namespace current}}]
    set fcmd [uplevel 1 [list ::namespace which -command $cmd]]
    if {$fcmd eq ""} then {
        return -code error "Undefined command: $cmd"
    }
    set ${ns}::tokentype($name) function
    set ${ns}::function($name) [list $numargs byvalue\
      [list ::concat [linsert $args 0 $fcmd]]]
}

After that, the complex-valued functions are trivial.
foreach fun {exp log conj sqrt sin cos tan} {
    infix::core::funalias $fun 1 math::complexnumbers::$fun
}

There are however also some real-valued functions which make things trickier, as their results cannot be used as arguments to any of the operations provided by the package. A solution in this case is to define companion commands which return complex numbers (that however has imaginary part 0) and let the infix functions refer to these instead.
foreach fun {real imag mod arg} {
    proc ::math::complexnumbers::c${fun} {z} [format {
        complex [%s $z] 0
    } $fun]
}
foreach fun {real imag mod arg} {
    infix::core::funalias $fun 1 math::complexnumbers::c${fun}
}

The code above puts the companions, e.g. cimag of imag, in the math::complexnumbers namespace too. A well-behaved complexnumbers module would rather put such auxilliary commands in its own private namespace ::infix::complexnumbers.

Some examples:
% infix {z <= {2 0}} {-z*z+z}
-2.0 0.0
% infix {z <= {2 1}} {conj(-z)}
-2.0 1.0
% infix {z <= {2 1}} {exp(-z)*z+z}
2.26012464526 0.845360537469
% infix {z <= {2 1}} {log(-z)*z+z}
6.28738295702 -3.55117113296
% infix {z <= {3 4}} {mod(z)}
5.0 0
% infix {z <= {3 4}} {arg(z)}
0.927295218002 0
% infix {z <= {3 0}} {z**z**z}
7.62559748499e+12 0.0

Works well, but having to import all numbers feels a bit awkward. It is however possible to modify the parser so that complex constants can be inlined into the expressions, as follows:
proc infix::symbol {parity token} {
    if {$parity} then {
        error "Operand symbol after operand"
    } elseif {[
        regexp -- {(?xi) ^
          ( [0-9]+\.? | [0-9]+\.[0-9]+ | \.[0-9]+ ) # Mantissa
          (e-?[0-9]+)?                              # Exponent
          i?                                        # Unit
        $} $token
    ]} then {
        return [list [list dappend [
            list operand [list (constant) [
                if {[scan $token %g%c num ""]==1} then {
                    ::math::complexnumbers::complex $num 0
                } else {
                    ::math::complexnumbers::complex 0 $num
                }
            ]]
        ]] {normal 1}]
    } elseif {$token eq "i"} then {
        return [list [list dappend [
            list operand [list (constant) [
                ::math::complexnumbers::complex 0 1
            ]]
        ]] {normal 1}]
    } else {
        return [list [list dappend [
            list operand [list (symbol) $token]
        ]] {normal 1}]
    }
}

[infix::symbol] , which normally is an import of [::infix::core::symbol], is what the parser calls to handle all tokens that haven't been defined as anything special. Its default action is to say "OK, I'll assume that is a variable" (by returning the funny list of the else branch above), but it can return arbitrary operands. In particular it can return (parse tree representations of) constant values, which is what the last two then branches do. The last then branch provides an interpretation of the token i as the complex number with real part 0 and imaginary part 1. The second last then branch provides an interpretation for tokens which look like a number (decimal integer or floating-point) as a real number and tokens which look like a number followed by an i as an imaginary number. With that, it is suddenly possible to do
% infix {} {2}
2.0 0
% infix {} {2i}
0 2.0
% infix {} {2i*3}
0.0 6.0
% infix {} {2i*3i}
-6.0 0.0
% infix {z <= {1 2}} {2*z-z**2}
5.0 -8.881784197001252e-16
% infix {z <= {1 2}} {2*z-z*z}
5.0 0.0

One catch here is however that while you can use numbers like 1.2e3i (1200i), you can't say 1.2e-3i (0.0012i), because that by default tokenizes like 1.2e - 3i. Hence, we'd better change the tokenizer, [infix::scantokens], (which likewise by default is an import of [::infix::core::scantokens]) too, to allow negative exponents. Using the -all -inline mode of [regexp], that is quite easy:
proc infix::scantokens {code} {
   regsub -all {#[^\n]*} $code {} code
   regexp -all -inline {(?x)
      [[:alnum:]_.]+ | # Ordinary identifiers
      \( | \) |        # Parentheses
      (?: [0-9]+\.? | [0-9]+\.[0-9]+ | \.[0-9]+ ) (?:[Ee]-?[0-9]+)? i? |
                       # Numeric complex constants
      [^[:alnum:][:space:]_.()]* [^-[:alnum:][:space:]_.()] |
                       # Non-alphanumeric symbol not ending with -.
      -+               # Symbol consisting only of - signs.
   } $code
}

With this you get
%  infix {} {1.2e-3i}
0 0.0012

An extra bonus above is a modification of the tokenization rules for non-alphanumeric characters: - is not allowed to be the last character of such a symbol (unless it is made up entirely of minuses), since it's much more likely that the minus was intended to be unary. With that in place, you can do
% infix {z <= {1 2}} {(conj(z)-z)/-2i}
2.0 -0.0

Discussion  edit

Sarnold, 2008-01-13: math::bignum is deprecated in Tcl 8.5, isn't it? But I see a real interest for complex numbers and math::bigfloat extension, for instance, to be infix'd. Nice and interesting work you've done!

Lars H: Well, infix was written under 8.4, and (if memory serves) it wouldn't be too hard to get it running under 8.3 as well. math::bignum was done as a proof-of-concept; demonstrating that infix didn't rely on having [expr] do the parsing or calculations. An extra module for math::complexnumber is fairly straightforward, but at the time I wrote the stuff I found math::bignum more appealing for a demo (maybe it was better documented, or had more features, or something). ;-) Update, 2008-01-14: See above for implementing math::complexnumbers operations.

Larry Smith, 2008-07-01: I just noticed that [infix] sports a piece of ambiguous syntax: a<-5 could be read as either a:=5 OR as a < -5.

Lars H: Not really. The arrows belong in the symlinks argument, which is strictly a list (hence spaces are required); think of it as the arguments part of a subroutine declaration. Expressions, which might involve less-than or minus, are written in the body argument.

Larry Smith: Okay, I see that. I'm not sure what it buys, though. It's nice to have that kind of control, but not at the expense of a great deal of complexity or code. Why not something like:
infix {
    in a b c
    out d e f
    inout g
} ...

It does not require people to know your convention using arrows. It lacks the ability to specify mappings in great detail, but that seems to be a feature in search of a problem. Even the in/out lists seem a little over-specified to me.

Lars H: Yes, that's the kind of thing I'm aiming at myself, although for slightly different reasons. In the present version, the "infix variables" are simply entries in an array that the compiled body accesses, so it makes sense for the symlinks to simply specify what to copy into the array at start and what to copy out of it at, but when I had completed the system it became apparent that this simple model had its flaws. In particular, several things could be compiled much better if I could know for sure whether a particular symbol would be defined at a particular point in the program or not.

For efficiency reasons, I want to use the body argument as index into the array that stores compiled code (its hash gets cached in the Tcl_Obj). This is only possible if the compiled code depends only on the body argument, so it needs to contain all in, out, and inout declarations, pretty much as in your example above. The tricky part for which I have not yet found a satisfactory solution is how to combine the totally flexible syntax of [infix] bodies with the fixed roles of in, out, and inout declarations. Some possibilities include:

  1. Make in, out, and inout tokens of some new type, that are declared in the base module. This has the disadvantage that these names are then not available for symbols.
  2. Designate a "delimiter", such that text before it is declarations and can follow other syntax rules than the expressions below it. This has the disadvantage that it feels like a hack.

My undecidedness on this matter is the primary reason [infix] hasn't evolved further. (A close second is the many other projects I've been working on in the meantime.)

Larry Smith: And another thing: we go to a lot of trouble to support unicode and then we use "<-" instead of \u2190 (←)? As Ice-T would say, "Whut's up wid dat?"

Lars H: Even if the language (and package) have no problem with them, the same need not be true for users' Development Environment as a whole: editors may be restricted to 8-bit character sets, and even if they aren't there might not be a convenient way to type ←. Requiring users to type it (or \u2190 which would work fine as the symlinks is just a list) is far less unfriendly, especially as it is part of the fixed syntax of [infix] . For the configurable little language found in the body, Unicode symbols are sometimes obvious choices — when coding the math::complexnumbers stuff above I actually considered defining a Weierstrass ℘ or Euler Γ function just to show off this aspect, but I skipped that since I couldn't find an ready-to-use implementation.

Larry Smith: This brings up an interesting point that I will just throw out for discussion: when, I ask, will it be reasonable to expect that any user could reasonably be expected to type a unicode character as such in a program? In my opinion we are already past this point: virtually all the editors I use regularly have no problem at all with unicode. vi does, but I suspect there are few die-hard vi users that use a lot of tcl. Unicode presents a cornucopia of symbols that can be made to serve useful ends - admittedly some of those ends could be evil (there are thousands of characters to abuse). It's been in the language now for years, so when can the cobbler's children use them?

Lars H: FWIW, my particular editor of choice (Alpha) unfortunately still doesn't support Unicode in documents (although it uses Tcl as scripting language, and thus can handle Unicode for internal processing, the text buffer code is written in C and years overdue for a rewrite). Also, one shouldn't underestimate the portability problems encountered when code is not pure ASCII; the default for source is encoding system, and as long as you find systems where that is iso8859-1, there will be problems.

AM 2010-03-16: Here is a simple example of how you can apply this package. The program below uses it to implement the kind of array processing one finds in MATLAB or Fortran (90/95/2003):
# example.tcl --
#     Example of the use of the infix package
#
package require infix

# add --
#     Add the elements of two lists
#
# Arguments:
#     alist       First list
#     blist       Second list
#
# Returns:
#     List of the sum of the elements of the two lists
#
# Note:
#     The arguments may also be scalars
#
proc add {alist blist} {
    set result {}

    if { [llength $alist] == [llength $blist] } {
        foreach a $alist b $blist {
            lappend result [expr {$a + $b}]
        }
    } elseif { [llength $alist] == 1 } {
        set a $alist
        foreach b $blist {
            lappend result [expr {$a + $b}]
        }
    } elseif { [llength $blist] == 1 } {
        set b $blist
        foreach a $alist {
            lappend result [expr {$a + $b}]
        }
    } else {
        lappend result [expr {$a + $b}]
    }
    return $result
}

# mult --
#     Multiply the elements of two lists
#
# Arguments:
#     alist       First list
#     blist       Second list
#
# Returns:
#     List of the product of the elements of the two lists
#
# Note:
#     The arguments may also be scalars
#
proc mult {alist blist} {
    set result {}

    if { [llength $alist] == [llength $blist] } {
        foreach a $alist b $blist {
            lappend result [expr {$a * $b}]
        }
    } elseif { [llength $alist] == 1 } {
        set a $alist
        foreach b $blist {
            lappend result [expr {$a * $b}]
        }
    } elseif { [llength $blist] == 1 } {
        set b $blist
        foreach a $alist {
            lappend result [expr {$a * $b}]
        }
    } else {
        lappend result [expr {$a * $b}]
    }
    return $result
}

# main --
#     Test it
#
::infix::core::setup base
::infix::core::opalias + {binary 10} add
::infix::core::opalias * {binary 15} mult

set a {1 2 3 4}
set b {2 4 6 7}
set c 2

puts [infix {a <- a b <- b} { 
          a + b }]
puts [infix {a <- a b <- b c <- c} {
          a + c*b }]

#
# Timing
#
proc addmult {alist blist c} {
    set result {}
    foreach a $alist b $blist {
        lappend result [expr {$a + $c * $b}]
    }
    return $result
}

puts "Direct:    [time {addmult $a $b $c} 1000]"
puts "Via infix: [time {infix {a <- a b <- b c <- c} { a + c*b }} 1000]"      

The output:
3 6 9 11
5 10 15 18
Direct:    3.48 microseconds per iteration
Via infix: 15.999 microseconds per iteration

(Just to get an impression)

Lars H: I suspect the relative overhead would be reduced for larger expressions. That's why there are modules defining things like "statement separators", assignments, and ifthen constructions — in many cases, one could write an entire procedure body in some infix little language!

[N/A]: I think the add/mult operations contain a bug if the two input lists are mismatched in size. The last clause for addition is:
lappend result [expr {$a + $b}]

But this simply tries to add up two lists. Similarly for the multiplication one.

AM: You're quite right, but I was just experimenting with the package, so did not intend to create a foolproof program. Another bug you might say is the fact that it does not check the types - you can now pass it a list like {A B C}, causing [expr] to complain.

AM 2010-03-18: A somewhat more elaborate test:
proc addmult {alist blist c} {
    set result {}
    foreach a $alist b $blist {
        lappend result [expr {$a + $c * $b}]
    }
    return $result
}

foreach size {10 30 100 300 1000} {
    set a [lrepeat $size 2.0]
    set b [lrepeat $size 3.0]
    puts "Size: $size"
    puts "Direct:    [time {addmult $a $b $c} 1000]"
    puts "Via infix: [time {infix {a <- a b <- b c <- c} { a + c*b }} 1000]"
}
puts "Second test:"
foreach size {10 30 100 300 1000} {
    set a [lrepeat $size 2.0]
    set b [lrepeat $size 3.0]
    puts "Size: $size"
    puts "Direct:    [time {
        set r [addmult $a $b $c]
        set d [mult $a $b]
        set e [add  $c $a]
        set f [mult $c [mult $a $b]]
    } 1000]"
    puts "Via infix: [time {infix {a <- a b <- b c <- c} {
    a + c*b;
    d = a * b;
    e = c + a;
    f = c * a * b
    } } 1000]"
}       

produces this output:
3 6 9 11
5 10 15 18
Size: 10
Direct:    7.106 microseconds per iteration
Via infix: 21.66 microseconds per iteration
Size: 30
Direct:    19.151 microseconds per iteration
Via infix: 39.241 microseconds per iteration
Size: 100
Direct:    59.159 microseconds per iteration
Via infix: 98.564 microseconds per iteration
Size: 300
Direct:    181.967 microseconds per iteration
Via infix: 280.807 microseconds per iteration
Size: 1000
Direct:    574.934 microseconds per iteration
Via infix: 873.943 microseconds per iteration
Second test:
Size: 10
Direct:    32.733 microseconds per iteration
Via infix: 50.306 microseconds per iteration
Size: 30
Direct:    78.108 microseconds per iteration
Via infix: 100.438 microseconds per iteration
Size: 100
Direct:    240.574 microseconds per iteration
Via infix: 277.947 microseconds per iteration
Size: 300
Direct:    689.702 microseconds per iteration
Via infix: 780.871 microseconds per iteration
Size: 1000
Direct:    2299.788 microseconds per iteration
Via infix: 2562.072 microseconds per iteration

So the overhead is not constant, but depends on the actual work (number of operations within the infix body and the amount of work per operation).

Lars H: Such a difference in the first test isn't surprising; [addmult] has only one loop where using [add] and [mult] instead has one each. The difference in the second test is more worrying, since I think infix should compile the code to pretty much what you're comparing with — in particular there shouldn't be any dependence on the list length! The compiled code is cached in the infix::Cache array. Perhaps you could paste the contents here for comparison?

Lars H: Nevermind, the second test also compares [addmult] with [add] plus [mult], which would account for the difference. In addition, it's comparing
set f [mult $c [mult $a $b]]

with
f = c * a * b

but * is left-associative, so the latter is equivalent to
f = (c * a) * b

and would rather be compiled as
set f [mult [mult $c $a] $b]

Both should come out the same in this case, but would be different if vector-vector multiplication was defined to return a scalar.

AM: Looking at the cache was revealing by the way (not as far as the timings are concerned, but wrt to the = token):

  • Using f = c * a * b introduces an error condition - you can not reset f later
  • Using lf := c * a * b does not cause such an error condition to be inserted

Lars H: Yes, that's (IMHO) the difference between definition and assignment (but you can of course redefine = to do assignment too, If you want). I realised when I was 90% through that it would be much better to catch redefinition errors at compile-time, but the (not so good) calling convention I had started out with had the implication that you couldn't know at compile-time which variables were going to be defined, so I had to resort to run-time checks. A version 1.0 should definitely be able to report the error at compile-time.

AM: Here is a first shot at an interactive calculator using this package: Calculator using the infix package

AM 2010-04-02: I wonder how difficult it will be to use this package for rendering mathematical formulae. One difficulty would be to preserve parentheses, but I guess the parser is flexible enough to get around that.

See Also  edit

Infix Notation
Calculator using the infix package