wrapper extension for LAPACK

Arjen Markus (28 December 2009) I have been contemplating this for several years now, but I finally made a start with a wrapper extension to get the functionality of LAPACK into Tcl.

I started off with two simple BLAS routines (BLAS is the more basic library that underlies LAPACK - basic linear algebra system, if I am not mistaken) and the Fortran wrapper (wrapfort) you can find in my ftcl library (see [L1 ]). Immediately I ran into design issues:

Take the routine dapxy, which computes a*X + Y (a a scalar, X and Y vectors) and returns the result in Y. The interface is such that X and Y could also be two-dimensional arrays (matrices) so that you could use this routine to update the rows and columns of a matrix Y using a vector X o a matrix X.

Questions that arise:

  • Is that flexibility something we want in the Tcl command as well?
  • Do we want to update a vector or should the Tcl command simply return the result?
  • Do we want to expose the full interface? That is the most flexible, but it is also distinctly non-Tcl

Right now, my answers are:

  • We merely want to pass two vectors X and Y, not columns or rows of a matrix
  • We do not want to update any arguments - so the Tcl command will return the resulting vector and it is up to the user to put it into the original Y argument, if he/she wants to.
  • The interface should be as simple as:
   set result [daxpy $a $x $y]

(The Fortran interface is:

   call dapxy( n, a, dx, incx, dy, incy )

Because of incx and incy, you can pass two-dimensional arrays such that only a part is updated.)

Another question is: should we wrap all routines (some 1500!) or just the more commonly used ones? (If so, which ones?)

More direct support for Fortran and C arrays

AM (31 December 2009) A discussion in the Tcl chat room brought back an old idea: use byte arrays (binary strings) to store the vector or matrix data for a Fortran or C routine directly. Here is a small Tcl program that illustrates the basic storage mechanism (for vectors only at the moment):

# vecmat.tcl --
#     Utilities to manipulate Tcl lists that hold
#     binary data that can directly act as Fortran or C arrays
#
#     TODO: error checking
#

# Vecmat --
#     Namespace to hold these utilities
#
namespace eval ::Vecmat {
    namespace export vector matrix
}

# vector --
#     Utility to manipulate vectors
#
# Arguments:
#     method     Method in question
#     args       Any arguments for the method
#
# Returns:
#     Depends on the method
#
proc ::Vecmat::vector {method args} {

    switch -- $method {
        "create" {
             if { [llength $args] != 1 } {
                 return -code error "Invalid number of arguments - should be: vector create number"
             }
             return [::Vecmat::Vcreate {*}$args]
        }
        "set" {
             if { [llength $args] != 3 } {
                 return -code error "Invalid number of arguments - should be: vector set name index number"
             }
             return [::Vecmat::Vset {*}$args]
        }
        "tolist" {
             if { [llength $args] != 1 } {
                 return -code error "Invalid number of arguments - should be: vector tolist vector-value"
             }
             return [::Vecmat::Vtolist {*}$args]
        }
        "fromlist" {
             if { [llength $args] != 1 } {
                 return -code error "Invalid number of arguments - should be: vector fromlist list"
             }
             return [::Vecmat::Vfromlist {*}$args]
        }
        default {
             return -code error "Unknown method: $method"
        }
    }
}

# Vcreate --
#     Create a vector of doubles of a given length
#
# Arguments:
#     number     Number of elements
#
# Returns:
#     Vector with values initialised to 0.0
#
proc ::Vecmat::Vcreate {number} {

    set data [binary format d* [lrepeat $number 0.0]]
    return [list VECTOR $number $data]
}

# Vset --
#     Set a single element in a vector
#
# Arguments:
#     name       Name of the vector
#     index      Index of the element
#     value      The new value
#
# Returns:
#     Vector with new value at position index
#
# Note:
#     This updates the vector variable whose name is $name
#
proc ::Vecmat::Vset {name index value} {
    upvar 2 $name name_

    set data [lindex $name_ 2]
    set data [binary format a*@[expr {8*$index}]d $data $value]
    lset name_ 2 $data
    return $name_
}

# Vtolist --
#     Convert a vector into a regular Tcl list
#
# Arguments:
#     vector     value of the vector
#
# Returns:
#     List containing the values of the vector
#
proc ::Vecmat::Vtolist {vector} {

    binary scan [lindex $vector 2] d* data
    return $data
}

# Vfromlist --
#     Create a vector from a regular Tcl list of doubles
#
# Arguments:
#     list       list of values to be converted
#
# Returns:
#     Vector containing the values of the list
#
proc ::Vecmat::Vfromlist {list} {

    set data [binary format d* $list]
    return [list VECTOR [llength $list] $data]
}

# main --
#     Test the vector and matrix utilities
#
set v [::Vecmat::vector create 10]

foreach index {0 4} {
    ::Vecmat::vector set v $index [expr {cos($index)}]
}

puts [::Vecmat::vector tolist $v]

set w [::Vecmat::vector fromlist {1.0 .2 3.0}]
puts [::Vecmat::vector tolist $w] 

Automatically generating an interface definition

AM (26 January 2010) Here is a small program that scans the LAPACK source files and produces a first version of an interface definition that can be used by wrapfort.

# autowrap.tcl --
#     Program to automatically generate the wrapfort-style
#     wrapper definitions for LAPACK
#
#     TODO:
#     - input/output arguments
#     - functions
#     - recognizing matrix arguments (dlarrv: matrix z for instance)
#     - exposing the actual interface (workspace and output arguments
#       are not included after all!)
#

# extractInterface --
#     Extract the interface for a LAPACK/BLAS routine
#
# Arguments:
#     filename       Name of the source file to examine
#
# Returns:
#     Description of the interface as a wrapfort command
#
proc extractInterface {filename} {

    set infile [open $filename r]

    #
    # First part: the subroutine/function header
    #
    set header ""
    while {1} {
        gets $infile line
        append header $line
        if { [string first ) $line] >= 0 } {
            break
        }
    }
    set header [string tolower $header]

    #
    # Second part: the description of the arguments
    #
    while { [gets $infile line] >= 0 } {
        if { [regexp {^\* *Arguments} $line] > 0 } {
            gets $infile line
            break
        }
    }

    set arguments {}
    set inputs    {}
    set results   {}
    set count     0

    while { [gets $infile line] >= 0 } {
        if { [string first "======" $line] >= 0 } {
            break
        }

        #
        # Take care of several irregularities in the LAPACK code
        # and of interference with Tcl's list format
        #
        set line [string map {\" "" " - " " "} [string tolower $line]]

        set intent "?"
        regexp {^[^(]*(\([^)]+\))} $line ==> intent
       # puts "Intent: $intent"

        switch -- $intent {
            "(input)" {
                incr count
                set name [lindex $line 1]
                set type [extractType $line]
                lappend arguments [list $type $name input]
            }
            "(output)" {
                incr count
                set name [lindex $line 1]
                set type [extractType $line]
                set size [extractSize $line]
                if { $name != "info" } {
                    if { [string match *array $type] || [string match *matrix $type] } {
                        lappend arguments [list $type $name [list allocate ($size)]]
                    } else {
                        lappend arguments [list $type $name [list assign 0]]
                    }
                    lappend results [list $type $name]
                } else {
                    lappend arguments [list $type $name {assign 0}]
                }
            }
            "(input/ouptut)"            -
            "(input or output)"         -
            "(input/workspace/output)"  -
            "(input / output)"          -
            "(input/output)"            {
                incr count
                set name [lindex $line 1]
                set type [extractType $line]
                set size [extractSize $line]
                lappend arguments [list $type $name input]
                lappend results   [list $type $name]
            }
            "(external procedure)" {
                puts "    External procedure: [string trim $line]"
                incr count
                set name [lindex $line 1]
                set type "external"
                lappend arguments [list $type $name {}]
            }
            "(workspace)" -
            "(workspace/output)" {
                # Note: optional output is ignored
                incr count
                set name [lindex $line 1]
                set type [extractType $line]
                set size [extractSize $line]
                lappend arguments [list $type $name [list allocate $size]]
            }
            default {
                # Ignore
            }
        }
    }

    set results [convertResults $results]

    foreach {routine numberArgs call rtype} [convertHeader $header $arguments $results] {break}

    if { $count != $numberArgs } {
        puts "File: $filename -- not all arguments analysed ($count -- $numberArgs)"
    }

    lappend arguments [list code {} $call]

    return [list [string toupper $routine] [string tolower $routine] $arguments $rtype]
}

# extractType --
#     Extract the data type of an argument
#
# Arguments:
#     line             Line of text describing the argument
#
# Returns:
#     Keyword for wrapfort
#
proc extractType {line} {

    set type "?"

    set dim "?"
    regexp {dimension +(.*)} $line ==> dim
    set dims 1
    if { [string first , $dim] > 0 } {
        set dims 2
    }

    switch -glob -- $line {
        "*integer array*" {
             set type [expr {$dims == 1? "integer-array" : "integer-matrix"}]
        }
        "*integer*" {
             set type "integer"
        }
        "*double precision array*" {
             set type [expr {$dims == 1? "double-array" : "double-matrix"}]
        }
        "*double precision*" {
             set type "double"
        }
        "*real array*" {
             set type [expr {$dims == 1? "real-array" : "real-matrix"}]
        }
        "*real*" {
             set type "real"
        }
        "*character*" {
             set type "string"
        }
        "*logical array*" {
             set type [expr {$dims == 1? "logical-array" : "logical-matrix"}]
        }
        "*logical*" {
             set type "logical"
        }
    }

    return $type
}

# extractSize --
#     Extract the size of an array argument
#
# Arguments:
#     line             Line of text describing the argument
#
# Returns:
#     Size information usable by wrapfort
#
proc extractSize {line} {

    set size "?"

    regexp {dimension (.*)} $line ==> size
    regexp {\((.*)\)} $size ==> size ;# Get rid of the outer parentheses, if any

    if {  [string first , $size] >= 0 &&
         !([string first (max $size] >= 0 || [string first (min $size] >= 0) } {
        set size "sz([string map {" " ""} $size])"
    }

    return $size
}

# convertHeader --
#     Convert the subroutine/function header
#
# Arguments:
#     header           Complete Fortran header
#     arguments        Information about the arguments (types notably)
#     results          Generated code to convert the result variables
#
# Returns:
#     Size information useable by wrapfort
#
proc convertHeader {header arguments results} {

    regexp { ([a-z0-9_]+) *\(} $header ==> routine
    set count [llength [split $header ,]]

    # For now
    set rtype ""

    #
    # Check the individual arguments
    #
    set call     "$routine\("
    set wrapup   ""
    set listargs {}
    set trailer  ""
    foreach arg $arguments {
        foreach {type name size} $arg {break}

        set name [string trimleft $name _]

        switch -glob -- $type {
            "external" -
            "*-array"  -
            "*-matrix" {
                 lappend listargs "$name"
            }
            "string" {
                 lappend listargs "STRING($name)"
                 append  trailer  " STRINGLEN($name)"
            }
            default {
                 lappend listargs "&$name"
            }
        }
        if { $name == "info" } {
            set wrapup "
if ( info != 0 ) {
    Tcl_SetResult( interp, \"Error in $routine\", NULL );
    return TCL_ERROR;
}"
        }
    }

    set call "\n$call [join $listargs ", "] $trailer );\n$results\n$wrapup"

    return [list $routine $count $call $rtype]
}

# convertResults --
#     Convert the list of result variables to code
#
# Arguments:
#     results          List of result variables with their types
#
# Returns:
#     Boiler-plate code to convert the results to a list
#
proc convertResults {results} {

    set number   [llength $results]

    set prologue [string map [list NUMBER $number] "    {
        Tcl_Obj *_result_obj\[NUMBER\];"]

    set epilogue [string map [list NUMBER $number] "
        if ( NUMBER > 1 ) {
            Tcl_SetObjResult( interp, Tcl_NewList( NUMBER, _result_obj ) );
        } else {
            Tcl_SetObjResult( interp, _result_obj\[0\] );
        }
    }"]

    set body ""

    set idx 0
    foreach arg $results {
        foreach {type name} $arg {break}
        switch -glob -- $type {
            "double-array" {
                append body "
        if ( WrapCopyDoubleArrayToList( interp, $name, size__$name, &(result_obj\[$idx\]) ) != TCL_OK ) {
            Tcl_SetResult( interp, \"Can not copy array to Tcl list\", NULL );
        }"
            }
            "double-matrix" {
                append body "
        if ( WrapCopyDoubleMatrixToList( interp, $name, columns__$name, rows__$name, &(result_obj\[$idx\]) ) != TCL_OK ) {
            Tcl_SetResult( interp, \"Can not copy array to Tcl list\", NULL );
        }"
            }
            "integer-array" {
                append body "
        if ( WrapCopyDoubleArrayToList( interp, $name, size__$name, &(result_obj\[$idx\]) ) != TCL_OK ) {
            Tcl_SetResult( interp, \"Can not copy array to Tcl list\", NULL );
        }"
            }
            default {
                append body "/* Unknown type or not yet supported */"
            }
        }

        incr idx
    }

    return "$prologue\n$body\n$epilogue"
}

# main --
#     Get it all going
#
set outfile [open "autowrap_lapack.tcl" w]

foreach file [glob .../lapack/lapack-3.2.1/src/d*.f] {
    puts "File: $file ..."

    foreach {routine cname arguments rtype} [extractInterface $file] {break}

    puts $outfile "\nWrapfort::fproc ::Lapack::$routine $cname {"
    foreach arg $arguments {
        puts $outfile "    $arg"
    }
    puts $outfile "}"
}

The program creates a file with "fproc" commands for each individual routine in LAPACK based on the following assumptions:

  • Arrays are represented at the Tcl side as lists of integers or doubles
  • Matrices are represented at the Tcl side as lists of lists
  • Multiple output arguments are stored in a list result

The interface that is generated in this way is not perfect: not all arguments are documented in the way that the program expects and the program itself does not handle functions or external procedures as fully as could be done. The law of diminishing returns though has caught up with me: at this point it appears to be simpler to adjust the interface manually than to try and solve it by tweaking the above program any further.

Some statistics:

  • There are 404 files that are treated in this way, with one routine per file
  • There are 25 files that present some problem (not enough information about the arguments or an external procedure used) and a number of routines where he size of one or more arrays could not be extracted correctly. I'd say, less than ten percent.
  • The generated output file is 0.5 MB and the C source as generated by the wrapfort utility is 1.5 MB

The generated interfaces will have to be checked, but it seems doable.


AM (1 February 2010) I have experimented a bit with the performance of the (partial) wrapper:

  • Filling the vector in the form of a byte array from Tcl using the above procedures is very slow - 100 times slower than using a list.
  • Using that byte array in the LAPACK wrapper could decrease the computation time, but the gain seems fairly small. I used vectors of 10000 and 100000 elements and computed the norm using LAPACK's dnrm2 (meaning there is only a conversion from Tcl list to C/Fortran array) and the byte-array variant was some 40% faster.
  • I compared the time required to solve a system of equation by dgels to that required to solve it via Tcllib's math::linearalgebra::solveGauss procedure for various sizes of the system (matrices of 10, 50, 100 and 200 rows and columns). LAPACK is 40 times faster than the Tcl-only method (the matrix is stored as a list of lists, so some further gain is possible). (Mind you: the 200x200 system is solved by solveGauss in about 4 seconds - with LAPACK's dgels taking 0.1 second. I did not check the performance on a 1000x1000 matrix with Tcl, but LAPACK requires 12 seconds.

Wrapping the Specfunc library

(18 March 2010) While my work on LAPACK is (slowly and intermittently) progressing, I have also created a wrapper for the quite extensive Specfunc library for special mathematical functions, like the Bessel functions and orthogonal polynomials (see [L2 ]). The work is not finished yet, but the results look promising. The source code can be found in the repository of my Ftcl project at [L3 ] - see the directory examples/wrapspecfunc.

(4 May 2010) I have wrapped most of the routines in that library. The major exceptions are: Mathieu functions and prolate and oblate spherical wave functions. The use of these routines requires first that you determine certain coefficients and then use those coefficients to evaluate the functions themselves. As they are not that common, I have decided not to put any effort in them.

Other libraries

I am working on a few other libraries as well:

  • A library for numerical integration of functions (a collection of algorithms by John Burkardt)
  • The MINPACK library for least-square minimisation

lm 04/05/2010 : I'm just curious ... Will these packages contain MultiDimensional Scaling algorithms, and wavelet transformations ? Thanks ! anyway for the work !!


arjen - 2010-05-06 03:27:30

Interesting idea - do you have any libraries in mind? (For multidimensional arrays of data you may want to look at Neil McKay's Tensor package too)


AK - 2010-05-06 12:51:04

Tensor v4 is in ActiveState's TEApot repository


arjen - 2010-05-20 03:45:54

Update: I have been able to generate the raw wrapper code for the LAPACK driver routines (the double precision ones). I have now also finished the manual adjustment of that code, so that you do not have to deal with arguments that can be derived from others (size of the matrices for instance). Now I will have to check the code for (obvious) mistakes.


arjen - 2013-07-21 09:59:27

It took me a while, but I have picked up this project again. There were a few bugs to solve, but at least I am making progress.


arjen - 2014-01-17 13:27:01

I am currently trying to get the bugs out. Stay tuned.


arjen - 2014-01-17 13:27:45

Actually with the help of jima ;).